Flavour and Image defined ephemeral storage encryption¶
https://blueprints.launchpad.net/nova/+spec/ephemeral-storage-encryption
This spec outlines a new approach to ephemeral storage encryption in Nova allowing users to select how their ephemeral storage is encrypted at rest through the use of flavors with specific extra specs or images with specific properties. The aim is to bring the ephemeral storage encryption experience within Nova in line with the block storage encryption implementation provided by Cinder, where user selectable encrypted volume types are available.
Note
This spec will only cover the high level changes to the API and compute layers, implementation within specific virt drivers is left for separate specs.
Problem description¶
At present the only in-tree ephemeral storage encryption support is provided by the libvirt virt driver when using the lvm imagebackend. The current implementation provides basic operator controlled and configured host specific support for ephemeral disk encryption at rest where all instances on a given compute are forced to use encrypted ephemeral storage using the dm-crypt PLAIN encryption format.
This is not ideal and makes ephemeral storage encryption completely opaque to the end user as opposed to the block storage encryption support provided by Cinder where users are able to opt-in to using admin defined encrypted volume types to ensure their storage is encrypted at rest.
Additionally the current implementation uses a single symmetric key to encrypt all ephemeral storage associated with the instance. As the PLAIN encryption format is used there is no way to rotate this key in-place.
Use Cases¶
As a user I want to request that all of my ephemeral storage is encrypted at rest through the selection of a specific flavor or image.
As a user I want to be able to pick how my ephemeral storage is encrypted at rest through the selection of a specific flavor or image.
As an admin/operator I want to enforce ephemeral encryption either per flavor or per image.
As an admin/operator I want to provide sane choices to my end users regarding how their ephemeral storage is encrypted at rest.
As a virt driver maintainer/developer I want to indicate that my driver supports ephemeral storage encryption using a specific encryption format.
As a virt driver maintainer/developer I want to provide sane default encryption format and options for users looking to encrypt their ephemeral storage at rest. I want these associated with the encrypted storage until it is deleted.
Proposed change¶
To enable this, new flavor extra specs, image properties and host configurables will be introduced. These will control when and how ephemeral storage encryption at rest is enabled for an instance.
Note
The following hw_ephemeral_encryption image properties do not relate to whether an image is encrypted at rest within the Glance service. They only relate to how ephemeral storage will be encrypted at rest when used by a provisioned instance within Nova.
Separate image properties have been documented in the Glance image encryption and Cinder image encryption specs to cover how images can be encrypted at rest within Glance.
Allow ephemeral encryption to be configured by flavor, image or config¶
To enable ephemeral encryption per instance the following boolean based flavor extra spec and image property will be introduced:
hw:ephemeral_encryption
hw_ephemeral_encryption
The above will enable ephemeral storage encryption for an instance but does not control the encryption format used or the associated options. For this the following flavor extra specs, image properties and configurables will be introduced.
The encryption format used will be controlled by the following flavor extra specs and image properties:
hw:ephemeral_encryption_format
hw_ephemeral_encryption_format
When neither of the above is provided but ephemeral encryption is still requested, an additional host configurable will be used to provide a default format per compute. This will initially default to luks:
[ephemeral_storage_encryption]/default_format
This could lead to requests against different clouds resulting in a different ephemeral encryption format being used but as this is transparent to the end user from within the instance it shouldn’t have any real impact.
The format will be provided as a string that maps to a BlockDeviceEncryptionFormatTypeField oslo.versionedobjects field value:
plain for the plain dm-crypt format
luks for the LUKSv1 format
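The format selection described above can be sketched roughly as follows. This is a minimal illustration rather than Nova code: the EncryptionFormat enum and select_format helper are hypothetical names, flavor extra specs and image properties are modelled as plain dicts, and conflicts between flavor and image are not handled here.

```python
import enum
from typing import Mapping, Optional


class EncryptionFormat(enum.Enum):
    """Hypothetical stand-in for BlockDeviceEncryptionFormatType."""
    PLAIN = "plain"  # plain dm-crypt format
    LUKS = "luks"    # LUKSv1 format


def select_format(
    extra_specs: Mapping[str, str],
    image_props: Mapping[str, str],
    default_format: str = "luks",
) -> EncryptionFormat:
    """Pick the format from the flavor or image, else the host default."""
    requested: Optional[str] = (
        extra_specs.get("hw:ephemeral_encryption_format")
        or image_props.get("hw_ephemeral_encryption_format")
    )
    # EncryptionFormat(...) raises ValueError for an unknown format string.
    return EncryptionFormat(requested or default_format)
```

Here default_format plays the role of [ephemeral_storage_encryption]/default_format on the compute host.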
To enable snapshot and shelve of instances using ephemeral encryption, the UUID of the encryption secret stored in the key manager for the resultant image will be kept with the image as an image property:
hw_ephemeral_encryption_secret_uuid
The secret UUID is needed when creating an instance from an ephemeral encrypted snapshot or when unshelving an ephemeral encrypted instance.
Create a new key manager secret for every new encrypted disk image¶
Secrets are never shared between different disk images: each disk image has its own unique secret. This is done to address both 1) the security implications and 2) the logistics of cleaning up secrets that are no longer in use.
For example:
Let’s say Instance A has 3 disks: one root disk, one ephemeral disk, and one swap disk. Each disk will have its own secret.
This table is intended to illustrate the way secrets are handled in various scenarios.
+--------------------+-------------+--------------+------------------------------------------------------+
| Instance or Image | Disk | Secret | Notes |
| | | (passphrase) | |
+====================+=============+==============+======================================================+
| Instance A | disk (root) | Secret 1 | Secret 1, 2, and 3 will be automatically deleted |
| +-------------+--------------+ by Nova when Instance A is deleted and its disks are |
| | disk.eph0 | Secret 2 | destroyed |
| +-------------+--------------+ |
| | disk.swap | Secret 3 | |
+--------------------+-------------+--------------+------------------------------------------------------+
| Image Z (snapshot) | disk (root) | Secret 4 | Secret 4 will *not* be automatically deleted and |
| created from | | (new secret | manual deletion will be needed if/when Image Z is |
| Instance A | | is created) | deleted from Glance |
+--------------------+-------------+--------------+------------------------------------------------------+
| Instance B | disk (root) | Secret 5 | Secret 5, 6, and 7 will be automatically deleted |
| created from +-------------+--------------+ by Nova when Instance B is deleted and its disks are |
| Image Z (snapshot) | disk.eph0 | Secret 6 | destroyed |
| +-------------+--------------+ |
| | disk.swap | Secret 7 | |
+--------------------+-------------+--------------+------------------------------------------------------+
| Instance C | disk (root) | Secret 8 | Secret 8, 9, and 10 will be automatically deleted |
| +-------------+--------------+ by Nova when Instance C is deleted and its disks are |
| | disk.eph0 | Secret 9 | destroyed |
| +-------------+--------------+ |
| | disk.swap | Secret 10 | |
+--------------------+-------------+--------------+------------------------------------------------------+
| Image Y (snapshot) | disk (root) | Secret 8 | Secret 8 is *retained* when Instance C is shelved in |
| created by shelve | | | part to prevent the possibility of a change in |
| of Instance C | | | ownership of the root disk secret if, for example, |
| | | | an admin user shelves a non-admin user's instance. |
| | | | This approach could be avoided if there is some way |
| | | | we could create a new secret using the instance's |
| | | | user/project rather than the shelver's user/project |
+--------------------+-------------+--------------+------------------------------------------------------+
| Rescue disk | disk (root) | Secret 11 | Secret 11 is stashed in the instance's system |
| created by rescue | | (new secret | metadata with key |
| of Instance A | | is created) | ``rescue_disk_ephemeral_encryption_secret_uuid``. |
| | | | This is done because a BDM record for the rescue |
| | | | disk is not going to be persisted to the database. |
+--------------------+-------------+--------------+------------------------------------------------------+
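The secret lifecycle in the table can be illustrated with a small in-memory model. Everything here is hypothetical (FakeKeyManager, create_disk_secrets, snapshot_root and delete_instance are illustrative names, not Nova or key manager APIs); it only shows that each disk gets its own secret, that a snapshot gets a new one, and that instance deletion removes only the secrets of the destroyed disks.

```python
import uuid
from typing import Dict, List


class FakeKeyManager:
    """In-memory stand-in for a key manager; purely illustrative."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def create(self, owner: str) -> str:
        secret_id = str(uuid.uuid4())
        self._secrets[secret_id] = owner
        return secret_id

    def delete(self, secret_id: str) -> None:
        self._secrets.pop(secret_id, None)

    def exists(self, secret_id: str) -> bool:
        return secret_id in self._secrets


def create_disk_secrets(
    km: FakeKeyManager, instance: str, disks: List[str]
) -> Dict[str, str]:
    """Create one unique secret per disk; secrets are never shared."""
    return {disk: km.create(f"{instance}/{disk}") for disk in disks}


def snapshot_root(km: FakeKeyManager, image_name: str) -> str:
    """A snapshot image gets a new secret that is not tied to the instance."""
    return km.create(image_name)


def delete_instance(km: FakeKeyManager, disk_secrets: Dict[str, str]) -> None:
    """Delete only the secrets whose disks are being destroyed."""
    for secret_id in disk_secrets.values():
        km.delete(secret_id)
```

In this model, deleting Instance A removes Secrets 1 to 3 while the secret created for Image Z survives, matching the first two rows of the table.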
Snapshots of instances with ephemeral encryption¶
When an instance with ephemeral encryption is snapshotted, a new encryption secret is created, its key manager secret UUID is kept as the image property hw_ephemeral_encryption_secret_uuid, and the image is uploaded to Glance.
When a new instance is created from an encrypted image, the image property hw_ephemeral_encryption_secret_uuid is passed down to the lower layers by storing it in the instance’s system metadata with key image_hw_ephemeral_encryption_secret_uuid. This is done because at the lower layers (where qemu-img convert is called, for example) we no longer have access to the image metadata and refactoring to pass image metadata to several lower layer methods, or similar, would be required otherwise.
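As a rough sketch of the hand-off described above, with the image properties and system metadata modelled as plain dicts (stash_image_secret is a hypothetical helper name, not an existing Nova function):

```python
from typing import Dict, Mapping

SECRET_PROP = "hw_ephemeral_encryption_secret_uuid"


def stash_image_secret(
    image_props: Mapping[str, str], system_metadata: Dict[str, str]
) -> None:
    """Copy the image secret UUID into system metadata, if present."""
    secret_uuid = image_props.get(SECRET_PROP)
    if secret_uuid is not None:
        # Lower layers read this key instead of the image metadata.
        system_metadata["image_" + SECRET_PROP] = secret_uuid
```

An unencrypted image carries no such property, so the system metadata is left untouched in that case.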
Snapshots created by shelving instances with ephemeral encryption¶
When an instance with ephemeral encryption is shelved, the existing root disk encryption secret is retained and will be used to unshelve the instance later. This is done to prevent a potential change in ownership of the root disk encryption secret in a scenario where an admin user shelves a non-admin user’s instance, for example. If a new secret owned by the admin user were created, the non-admin user who owns the instance would be unable to unshelve the instance.
This behavior could be avoided however if there is some way we could create a new encryption secret using the instance’s user and project rather than the shelver’s user and project. If that is possible, we would not need to reuse the encryption secret.
Rescue disk images created by rescuing instances with ephemeral encryption¶
When rescuing an instance and an encrypted rescue image is specified, the rescue image secret UUID from the image property will be stashed in the instance’s system metadata with key rescue_image_hw_ephemeral_encryption_secret_uuid to pass it down to the lower layers. This is considered separate from image_hw_ephemeral_encryption_secret_uuid, which refers to the encrypted image from which the instance was created. Another reason to keep it separate is to avoid confusion for those reading or working on the code.
A new encryption secret is created when the rescue disk is created and its UUID is stashed in the instance’s system metadata with key rescue_disk_ephemeral_encryption_secret_uuid. This is done because a block device mapping record for the rescue disk is not going to be persisted to the database.
The corresponding virt driver secret name pattern is <instance UUID>_rescue_disk and any existing secrets with that name are deleted by the virt driver when a new rescue is requested.
The new encryption secret for the rescue disk is deleted from the key manager and the virt driver secret is also deleted when the instance is unrescued.
Cleanup of ephemeral encryption secrets¶
Ephemeral encryption secrets are deleted from the key manager and the virt driver when the corresponding instance is deleted and its disks are destroyed. The approach is that encryption secrets are only deleted when the disks associated with them are destroyed.
Encryption secrets that are created when a snapshot is created are never deleted by Nova. It would only be acceptable to delete the secret if and when the snapshot image is deleted. Secrets whose images have been deleted from Glance must be cleaned up manually by the user or an admin.
Note
At the time of this writing, the newest Ceph release v17 (Quincy) does not support creating a cloned image with an encryption key different from its parent. For this reason, copy-on-write cloning will not be enabled for instances which have specified ephemeral encryption.
Creating a cloned image with an encryption key different from its parent should be supported in the next release of Ceph. When we are able to require a Ceph version >= v18, copy-on-write cloning with ephemeral encryption can be enabled. See https://github.com/ceph/ceph/commit/1d3de19 for reference.
BlockDeviceMapping changes¶
The BlockDeviceMapping object will be extended to include the following fields encapsulating some of the above information per ephemeral disk within the instance:
encrypted
A simple boolean to indicate if the block device is encrypted. This will initially only be populated when ephemeral encryption is used but could easily be used for encrypted volumes as well in the future.
encryption_secret_uuid
As the name suggests this will contain the UUID of the associated encryption secret for the disk. The type of secret used here will be specific to the encryption format and virt driver used. It should not be assumed that this will always be a symmetric key as is currently the case with all encrypted volumes provided by Cinder. For example, for luks based ephemeral storage this secret will be a passphrase.
encryption_format
A new BlockDeviceEncryptionFormatType enum and associated BlockDeviceEncryptionFormatTypeField field listing the encryption format. The available options will be kept in line with the constants currently provided by os-brick and potentially merged in the future if both can share these types and fields somehow.
encryption_options
A simple unversioned dict of strings containing encryption options specific to the virt driver implementation, underlying hypervisor and format being used.
Note
The encryption_options field will be unused and not exposed to end users initially because of the security and upgrade implications around it. For the first pass, sensible defaults for the cipher algorithm, cipher mode, and initialization vector generator algorithm will be hard-coded instead.
Encryption options could be exposed to end users in the future when a proper design which addresses security and handles all upgrade scenarios is developed.
Populate ephemeral encryption BlockDeviceMapping attributes during build¶
When launching an instance with ephemeral encryption requested via either the image or flavor, the BlockDeviceMapping.encrypted attribute will be set to True for each BlockDeviceMapping record with a destination_type value of local. This will happen after the original API BDM dicts have been transformed into objects within the Compute API but before scheduling the instance(s).
The encryption_format attribute will also take its value from the image or flavor if provided. Any differences or conflicts between the image and flavor for this will result in a 409 Conflict error being raised by the API.
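A minimal sketch of this step, assuming simplified stand-ins for the real objects: FakeBDM, FormatConflict, is_true and apply_ephemeral_encryption are hypothetical names, and only format conflicts between flavor and image are detected. In the real API the exception would be translated into a 409 Conflict response.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional


class FormatConflict(Exception):
    """Hypothetical error; the API would map this to a 409 Conflict."""


@dataclass
class FakeBDM:
    """Simplified stand-in for a BlockDeviceMapping object."""
    destination_type: str
    encrypted: bool = False
    encryption_format: Optional[str] = None


def is_true(value: Optional[str]) -> bool:
    """Loose boolean parsing of an extra spec or image property string."""
    return str(value).lower() in ("1", "true", "yes", "on")


def apply_ephemeral_encryption(
    bdms: List[FakeBDM],
    extra_specs: Mapping[str, str],
    image_props: Mapping[str, str],
) -> None:
    """Mark local BDMs as encrypted when the flavor or image asks for it."""
    if not (is_true(extra_specs.get("hw:ephemeral_encryption"))
            or is_true(image_props.get("hw_ephemeral_encryption"))):
        return
    flavor_fmt = extra_specs.get("hw:ephemeral_encryption_format")
    image_fmt = image_props.get("hw_ephemeral_encryption_format")
    if flavor_fmt and image_fmt and flavor_fmt != image_fmt:
        raise FormatConflict(f"{flavor_fmt} != {image_fmt}")
    for bdm in bdms:
        if bdm.destination_type == "local":
            bdm.encrypted = True
            # May stay None; the compute host then applies its default.
            bdm.encryption_format = flavor_fmt or image_fmt
```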
Use COMPUTE_EPHEMERAL_ENCRYPTION compatibility traits¶
A COMPUTE_EPHEMERAL_ENCRYPTION compute compatibility trait was introduced during Wallaby and will be reported by virt drivers to indicate overall support for ephemeral storage encryption using this new approach. This trait will always be used by the pre-filter outlined in the following section when ephemeral encryption has been requested, regardless of any format being specified in the request, allowing the compute that eventually handles the request to select a format it supports using the [ephemeral_storage_encryption]/default_format configurable.
COMPUTE_EPHEMERAL_ENCRYPTION_$FORMAT compute compatibility traits were also added to os-traits during Wallaby and will be reported by virt drivers to indicate support for specific ephemeral storage encryption formats. For example:
COMPUTE_EPHEMERAL_ENCRYPTION_LUKS
COMPUTE_EPHEMERAL_ENCRYPTION_LUKSV2
COMPUTE_EPHEMERAL_ENCRYPTION_PLAIN
These traits will only be used alongside the COMPUTE_EPHEMERAL_ENCRYPTION trait when the hw_ephemeral_encryption_format image property or hw:ephemeral_encryption_format extra spec has been provided in the initial request.
Introduce an ephemeral encryption request pre-filter¶
A new pre-filter will be introduced that adds the above traits as required to the request spec when the aforementioned image properties or flavor extra specs are provided. As outlined above this will always include the COMPUTE_EPHEMERAL_ENCRYPTION trait when ephemeral encryption has been requested and may optionally include one of the format specific traits if a format is included in the request.
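The trait selection performed by such a pre-filter might look roughly like this. The sketch does not use the real scheduler interfaces: required_traits is a hypothetical helper that only computes the set of trait names from dict-based flavor extra specs and image properties, and it ignores conflicts between the two.

```python
from typing import Mapping, Optional, Set


def is_true(value: Optional[str]) -> bool:
    """Loose boolean parsing of an extra spec or image property string."""
    return str(value).lower() in ("1", "true", "yes", "on")


def required_traits(
    extra_specs: Mapping[str, str], image_props: Mapping[str, str]
) -> Set[str]:
    """Return the traits a request needs for ephemeral encryption."""
    if not (is_true(extra_specs.get("hw:ephemeral_encryption"))
            or is_true(image_props.get("hw_ephemeral_encryption"))):
        # Nothing requested, so the pre-filter adds no traits.
        return set()
    traits = {"COMPUTE_EPHEMERAL_ENCRYPTION"}
    fmt = (extra_specs.get("hw:ephemeral_encryption_format")
           or image_props.get("hw_ephemeral_encryption_format"))
    if fmt:
        traits.add("COMPUTE_EPHEMERAL_ENCRYPTION_" + fmt.upper())
    return traits
```

Returning early when encryption is not requested keeps the cost of the pre-filter negligible for ordinary requests.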
Expose ephemeral encryption attributes via block_device_info¶
Once the BlockDeviceMapping objects have been updated and the instance scheduled to a compute, the objects are transformed once again into a block_device_info dict understood by the virt layer that at present contains the following:
root_device_name
The root device path used by the instance.
ephemerals
A list of DriverEphemeralBlockDevice dict objects detailing the ephemeral disks attached to the instance. Note this does not include the initial image based disk used by the instance that is classified as an ephemeral disk in terms of the ephemeral encryption feature.
block_device_mapping
A list of DriverVol*BlockDevice dict objects detailing the volume based disks attached to the instance.
swap
An optional DriverSwapBlockDevice dict object detailing the swap device.
For example:
{
"root_device_name": "/dev/vda",
"ephemerals": [
{
"guest_format": null,
"device_name": "/dev/vdb",
"device_type": "disk",
"size": 1,
"disk_bus": "virtio"
}
],
"block_device_mapping": [],
"swap": {
"swap_size": 1,
"device_name": "/dev/vdc",
"disk_bus": "virtio"
}
}
As noted above block_device_info does not provide a complete overview of the storage associated with an instance. In order for it to be useful in the context of ephemeral storage encryption we would need to extend the dict to always include information relating to local image based disks.
As such a new DriverImageBlockDevice dict class will be introduced covering image based block devices and provided to the virt layer via an additional image key within the block_device_info dict when the instance uses such a disk. As with the other Driver*BlockDevice dict classes this will proxy access to the underlying BlockDeviceMapping object allowing the virt layer to lookup the previously listed encrypted and encryption_* attributes.
While outside the scope of this spec the above highlights a huge amount of complexity and technical debt still residing in the codebase around how storage configurations are handled between the different layers. In the long term we should plan to remove block_device_info and replace it with direct access to BlockDeviceMapping based objects ensuring the entire configuration is always exposed to the virt layer.
Report that a disk is encrypted at rest through the metadata API¶
The metadata API will be extended so that users can confirm, from within their instance, that their ephemeral storage is encrypted at rest.
{
"devices": [
{
"type": "nic",
"bus": "pci",
"address": "0000:00:02.0",
"mac": "00:11:22:33:44:55",
"tags": ["trusted"]
},
{
"type": "disk",
"bus": "virtio",
"address": "0:0",
"serial": "12352423",
"path": "/dev/vda",
"encrypted": "True"
},
{
"type": "disk",
"bus": "ide",
"address": "0:0",
"serial": "disk-vol-2352423",
"path": "/dev/sda",
"tags": ["baz"]
}
]
}
This should also be extended to cover disks provided by encrypted volumes but this is obviously out of scope for this implementation.
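From inside a guest, a consumer of this metadata could check for encrypted disks along these lines. This is a sketch under assumptions: encrypted_disk_paths is a hypothetical helper, it takes the JSON document as a string rather than fetching it, and it treats the encrypted value as the string shown in the example above.

```python
import json
from typing import List


def encrypted_disk_paths(metadata_json: str) -> List[str]:
    """Return the paths of disks that the metadata reports as encrypted."""
    devices = json.loads(metadata_json).get("devices", [])
    return [
        dev["path"]
        for dev in devices
        if dev.get("type") == "disk"
        and "path" in dev
        # The example reports the flag as the string "True".
        and str(dev.get("encrypted", "")).lower() == "true"
    ]
```

Devices without an encrypted key, such as the NIC and the second disk in the example, are simply skipped.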
Block resize between flavors with different hw:ephemeral_encryption settings¶
Ephemeral data is expected to persist through a resize and as such any resize between flavors that differ in their configuration of ephemeral encryption (one enabled and another disabled, different formats, etc.) would require us to convert this data in place. This isn’t trivial and so for this initial implementation resizing between flavors that differ will be blocked.
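The check that blocks such a resize could be approximated as below. The helper names are hypothetical and the comparison is deliberately simple: a flavor with no explicit format is treated as different from one that names a format, even if the host default would resolve to the same value.

```python
from typing import Mapping, Optional, Tuple

TRUE_STRINGS = {"1", "true", "yes", "on"}


def encryption_settings(
    extra_specs: Mapping[str, str]
) -> Tuple[bool, Optional[str]]:
    """Reduce a flavor's extra specs to an (enabled, format) pair."""
    enabled = (
        str(extra_specs.get("hw:ephemeral_encryption", "")).lower()
        in TRUE_STRINGS
    )
    # The format only matters when encryption is actually enabled.
    fmt = extra_specs.get("hw:ephemeral_encryption_format") if enabled else None
    return enabled, (fmt.lower() if fmt else None)


def resize_allowed(
    old_specs: Mapping[str, str], new_specs: Mapping[str, str]
) -> bool:
    """Allow a resize only when both flavors agree on encryption."""
    return encryption_settings(old_specs) == encryption_settings(new_specs)
```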
Provide a migration path from the legacy implementation¶
New nova-manage and nova-status commands will be introduced to migrate any instances using the legacy libvirt virt driver implementation ahead of the removal of this in a future release.
The nova-manage command will ensure that any existing instances with ephemeral_key_uuid set will have their associated BlockDeviceMapping records updated to reference said secret key, the plain encryption format and configured options on the host before clearing ephemeral_key_uuid.
Additionally the libvirt virt driver will also attempt to migrate instances with ephemeral_key_uuid set during spawn. This should allow at least some of the instances to be moved during the W release ahead of X.
The nova-status command will simply report on the existence of any instances with ephemeral_key_uuid set that do not have the corresponding BlockDeviceMapping attributes enabled etc.
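The core of the migration and status check might be sketched as follows. FakeBDM, FakeInstance, migrate_legacy and unmigrated are hypothetical names, the host-configured encryption options are omitted, and restricting the update to BDMs with a destination_type of local is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeBDM:
    """Simplified stand-in for a BlockDeviceMapping object."""
    destination_type: str
    encrypted: bool = False
    encryption_secret_uuid: Optional[str] = None
    encryption_format: Optional[str] = None


@dataclass
class FakeInstance:
    """Simplified stand-in for an instance and its BDM records."""
    uuid: str
    ephemeral_key_uuid: Optional[str] = None
    bdms: List[FakeBDM] = field(default_factory=list)


def migrate_legacy(instance: FakeInstance) -> bool:
    """Move the legacy key reference onto the instance's local BDMs."""
    if instance.ephemeral_key_uuid is None:
        return False
    for bdm in instance.bdms:
        if bdm.destination_type == "local":
            bdm.encrypted = True
            bdm.encryption_secret_uuid = instance.ephemeral_key_uuid
            bdm.encryption_format = "plain"  # legacy format is always plain
    # Clear the legacy field only after the BDMs reference the secret.
    instance.ephemeral_key_uuid = None
    return True


def unmigrated(instances: List[FakeInstance]) -> List[str]:
    """Report instances that still rely on the legacy field."""
    return [i.uuid for i in instances if i.ephemeral_key_uuid is not None]
```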
Deprecate the now legacy implementation¶
The legacy implementation within the libvirt virt driver will be deprecated for removal in a future release once the ability to migrate is in place.
Alternatives¶
Continue to use the transparent host configurables and expand support to other encryption formats such as LUKS.
Data model impact¶
See above for the various flavor extra spec, image property, BlockDeviceMapping and DriverBlockDevice object changes.
REST API impact¶
Flavor extra spec and image property validation will be introduced for all of the provided ephemeral encryption options.
Attempts to resize between flavors that differ in their ephemeral encryption options will be rejected.
Attempts to rebuild between images that differ in their ephemeral encryption options will be allowed.
The metadata API will be changed to allow users to determine if their ephemeral storage is encrypted as discussed above.
Security impact¶
This should hopefully be positive given the unique secret per disk and user visible choice regarding how their ephemeral storage is encrypted at rest.
Additionally this should allow additional virt drivers to support ephemeral storage encryption while also allowing the libvirt virt driver to increase coverage of the feature across more imagebackends such as qcow2 and rbd.
Note
Internal base images stored locally in Nova will not be encrypted at rest.
Notifications impact¶
N/A
Other end user impact¶
Users will now need to opt-in to ephemeral storage encryption being used by their instances through their choice of image or flavors.
Performance Impact¶
The additional pre-filter will add a small amount of overhead when scheduling instances but this should fail fast if ephemeral encryption is not requested through the image or flavor.
The performance impact of increased use of ephemeral storage encryption by instances is left to be discussed in the virt driver specific specs as this will vary between hypervisors.
Other deployer impact¶
N/A
Developer impact¶
Virt driver developers will be able to indicate support for specific ephemeral storage encryption formats using the newly introduced compute compatibility traits.
Upgrade impact¶
The compute traits should ensure that requests to schedule instances using ephemeral storage encryption with mixed computes (N-1 and N) will work during a rolling upgrade.
As discussed earlier in the spec future upgrades will need to provide a path for existing ephemeral storage encryption users to migrate from the legacy implementation. This should be trivial but may require an additional grenade based job in CI during the W cycle to prove out the migration path.
Implementation¶
Assignee(s)¶
- Primary assignee:
melwitt
- Other contributors:
lyarwood
Feature Liaison¶
- Feature liaison:
melwitt
Work Items¶
Introduce hw_ephemeral_encryption* image properties and hw:ephemeral_encryption flavor extra specs.
Introduce new encrypted, encryption_secret_uuid, encryption_format and encryption_options attributes to the BlockDeviceMapping object.
Wire up the new BlockDeviceMapping object attributes through the Driver*BlockDevice layer and block_device_info dict.
Report ephemeral storage encryption through the metadata API.
Introduce new nova-manage and nova-status commands to allow existing users to migrate to this new implementation. This should however be blocked outside of testing until a virt driver implementation is landed.
Validate all of the above in functional tests ahead of any virt driver implementation landing.
Dependencies¶
None
Testing¶
At present without a virt driver implementation this will be tested entirely within our unit and functional test suites.
Once a virt driver implementation is available additional integration tests in Tempest and whitebox tests can be written.
Testing of the migration path from the legacy implementation will require an additional grenade job but this will require the libvirt virt driver implementation to be completed first.
Documentation Impact¶
The new host configurables, flavor extra specs and image properties should be documented.
New user documentation should be written covering the overall use of the feature from a Nova point of view.
Reference documentation around BlockDeviceMapping objects etc should be updated to make note of the new encryption attributes.
References¶
History¶
Optional section intended to be used each time the spec is updated to describe new design, API or database schema changes. It helps the reader understand how the spec has evolved over time.
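+-----------------+-------------+
| Release Name    | Description |
+=================+=============+
| Wallaby         | Introduced  |
+-----------------+-------------+
| Xena            | Reproposed  |
+-----------------+-------------+
| Yoga            | Reproposed  |
+-----------------+-------------+
| Zed             | Reproposed  |
+-----------------+-------------+
| 2023.1 Antelope | Reproposed  |
+-----------------+-------------+
| 2023.2 Bobcat   | Reproposed  |
+-----------------+-------------+
| 2024.1 Caracal  | Reproposed  |
+-----------------+-------------+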