Encrypted Data at Rest

Problem Description

OpenStack clouds provide several types of storage to cloud users, including instance ephemeral disks (typically attached to hypervisors), Cinder block devices (typically backed by a storage solution such as Ceph) and Swift object storage.

By default the data residing on such virtual devices is unencrypted; regulations such as PCI DSS and GDPR require that data at rest be stored encrypted, so that if devices are removed from the data center, the data on them cannot be recovered without access to the appropriate encryption keys.

Proposed Change

Underlying storage devices will be protected using dm-crypt/LUKS with encryption keys stored directly in HashiCorp Vault. No local copy of the key is made during encryption, or during decryption on boot.

A new tool, vaultlocker, will be used to LUKS format block devices, storing encryption keys directly in Vault. Keys are referenced using the UUID of the underlying block device (generated as the disk is prepared for use).

On (re)boot, a vaultlocker-decrypt systemd unit will execute for each encrypted block device, retrieving the encryption key from Vault and opening the LUKS formatted block device ready for use.
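
As an illustrative sketch only (the exact unit shipped with vaultlocker may differ), such a systemd template unit would be along the lines of:

[Unit]
Description=vaultlocker decrypt %i
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
# %i is the UUID of the encrypted block device; vaultlocker retrieves
# the key from Vault and opens /dev/mapper/crypt-<UUID>
ExecStart=/usr/bin/vaultlocker decrypt %i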

vaultlocker will access Vault over HTTPS using an approle issued as part of the deployment process; the approle will be passed from the vault charm to the consuming service via a charm relation and will be scoped so that it is only usable from units participating in that relation.

The approle will be specific to each unit participating in the relation, with a policy that only permits read/write/delete/update/list to:

<kv-backend>/<hostname>/*

from the provided network address of the unit. Approles for other units will be visible in the relation data, but will not be usable, as the CIDR ACL will not permit access.
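
As an illustration, the corresponding Vault policy might look like the following (a sketch only; Vault expresses "write" via the create and update capabilities, and the CIDR restriction is enforced on the approle itself via its bound CIDR list rather than in the policy):

path "<kv-backend>/<hostname>/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}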

In addition to the unit-specific approle and the limitation of access to the /32 of the unit, a secret_id will also be used to authenticate use of the approle.

The secret_id will not be passed over the relation from the vault charm to the consuming application. Instead the vault charm will generate a secret_id and wrap it using Vault’s response wrapping feature. The resulting one-shot token will be passed over the relation to the consuming application unit, which can then use the token to pull the secret_id directly from Vault. This ensures that the secret_id is only known to Vault and the consuming application unit.

The one-shot token has a TTL of 1 hour (allowing time for complex deployments with large numbers of hook executions to complete on converged hypervisor/storage machines).
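
For illustration, the equivalent flow using the Vault CLI would be roughly as follows (a sketch; the charms use the Vault HTTP API rather than the CLI):

# On the vault charm: generate a secret_id, returned only as a
# response-wrapped one-shot token with a 1h TTL
vault write -wrap-ttl=1h -f auth/approle/role/<role-name>/secret-id

# On the consuming unit: redeem the one-shot token for the secret_id
VAULT_TOKEN=<one-shot-token> vault unwrap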

The initial scope of support will include:

  • ceph-osd: OSD device encryption.

  • swift-storage: Block device encryption.

  • nova-compute: Ephemeral storage block device encryption; note that this requires hypervisors to be configured with a dedicated set of storage devices for Nova to use for instance ephemeral block devices.

Block device preparation

The encrypted block device will be labelled with a UUID generated by the charmhelpers block device encryption helper. This UUID is used both when the device is first encrypted and when it is decrypted on subsequent server reboots.

The device will be encrypted, with the key stored in Vault, using:

vaultlocker encrypt --uuid $UUID $BLOCK_DEVICE

The resulting dm-crypt block device will be opened ready for use.
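
A charm helper wrapping this flow might look like the following (a hypothetical sketch; the function name and eventual charmhelpers API are illustrative, not final):

import subprocess
import uuid


def encrypt_device(device):
    """LUKS format a block device, storing the key in Vault.

    Returns the generated UUID, which labels the device, keys the
    secret in Vault and names the /dev/mapper/crypt-<UUID> device.
    """
    dev_uuid = str(uuid.uuid4())
    subprocess.check_call(['vaultlocker', 'encrypt',
                           '--uuid', dev_uuid, device])
    return dev_uuid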

swift-storage and nova-compute

Block devices will be prepared in line with "Block device preparation"; existing fstab management by the charm will be updated to use /dev/mapper/crypt-<UUID> entries with an x-systemd.requires option - for example:

/dev/mapper/crypt-$UUID  /mnt auto defaults,x-systemd.requires=vaultlocker-decrypt@$UUID.service,comment=vaultlocker 0 2

This ensures that the vaultlocker-decrypt task has completed prior to the mount of the mapper device being attempted.

ceph-osd/ceph-volume

Integration into the ceph-osd charm requires the charm to switch to using the new ceph-volume tool to manage the creation and activation of OSDs. This requires that the block device be prepared with LVM volumes before being passed to ceph-volume; to mirror existing ceph-disk functionality:

filestore

Use the block device for both journal and data; journal lv (osd-journal-<OSD-FSID>) created on vg ceph-<OSD-FSID> using the configured journal size, data lv (osd-data-<OSD-FSID>) created on vg ceph-<OSD-FSID> using the remaining capacity.

pv /dev/sdb
  vg /dev/ceph-<OSD-FSID>
    lv /dev/ceph-<OSD-FSID>/osd-journal-<OSD-FSID>
    lv /dev/ceph-<OSD-FSID>/osd-data-<OSD-FSID>

Use a separate device for the journal; journal lv (osd-journal-<OSD-FSID>) created on vg ceph-journal-<UUID> of the journal device using the configured journal size; data lv (osd-data-<OSD-FSID>) created on vg ceph-<OSD-FSID> of the data device using 100% of capacity.

pv /dev/sdb
  vg /dev/ceph-<OSD-FSID>
    lv /dev/ceph-<OSD-FSID>/osd-data-<OSD-FSID>
pv /dev/sdg
  vg /dev/ceph-journal-<UUID>
    lv /dev/ceph-journal-<UUID>/osd-journal-<OSD-FSID>
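
These layouts map onto standard LVM and ceph-volume commands roughly as follows (a sketch for the separate-journal case; actual sizes and names are determined by the charm):

pvcreate /dev/sdb
vgcreate ceph-<OSD-FSID> /dev/sdb
lvcreate -l 100%FREE -n osd-data-<OSD-FSID> ceph-<OSD-FSID>

pvcreate /dev/sdg
vgcreate ceph-journal-<UUID> /dev/sdg
lvcreate -L <journal-size> -n osd-journal-<OSD-FSID> ceph-journal-<UUID>

ceph-volume lvm create --filestore \
    --data ceph-<OSD-FSID>/osd-data-<OSD-FSID> \
    --journal ceph-journal-<UUID>/osd-journal-<OSD-FSID>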

bluestore

Bluestore is simpler in that there is no journal, so a single logical volume will be created on vg ceph-<OSD-FSID> of the provided disk:

pv /dev/sdb
  vg /dev/ceph-<OSD-FSID>
    lv /dev/ceph-<OSD-FSID>/osd-block-<OSD-FSID>

The Bluestore DB and WAL volumes may optionally be stored on separate devices, again using a logical volume of the configured/default size on vg ceph-{db,wal}-<UUID>:

pv /dev/sdb
  vg /dev/ceph-<OSD-FSID>
    lv /dev/ceph-<OSD-FSID>/osd-block-<OSD-FSID>
pv /dev/sdg
  vg /dev/ceph-db-<UUID>
    lv /dev/ceph-db-<UUID>/osd-db-<OSD-FSID>
pv /dev/sdh
  vg /dev/ceph-wal-<UUID>
    lv /dev/ceph-wal-<UUID>/osd-wal-<OSD-FSID>
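
As a sketch, OSD creation via ceph-volume for the separate DB/WAL layout would then be along the lines of:

ceph-volume lvm create --bluestore \
    --data ceph-<OSD-FSID>/osd-block-<OSD-FSID> \
    --block.db ceph-db-<UUID>/osd-db-<OSD-FSID> \
    --block.wal ceph-wal-<UUID>/osd-wal-<OSD-FSID>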

Note that ceph-volume is only provided with Ceph Luminous or later releases; as a result, encryption under Ceph Jewel is explicitly excluded from the scope of this specification.

Alternatives

ceph

Use of native support in Ceph for OSD encryption; discounted as it makes use of the ceph-mon cluster for key storage - keys are not sharded, and deployments typically place ceph-mon units alongside ceph-osd units, so it is possible that the encryption keys might reside directly on the same server as the encrypted Ceph OSD block devices.

Note

The ceph-osd charm already supports native Ceph block device encryption using ceph-disk/ceph-volume via the osd-encrypt option.

Support for use of Vault could be added to ceph-volume; however, due to the requirement to support existing Ceph releases (>= Luminous), this option is discounted in the short term but may be considered in the long term if support lands in upstream Ceph.

cinder

Cinder has native support for block device encryption using LUKS; keys are stored using Barbican, which relies on HSMs implementing PKCS#11 or KMIP to be considered secure. This would provide the required level of encryption support for Cinder block devices; however, it does require use of a hardware security module (Barbican does not have Vault support).

nova

Nova has native support for encryption of ephemeral disks if using an LVM backend for storage; again, keys are stored in Barbican, requiring use of an HSM or the implementation of a suitable software security module in Barbican. This option is also limited to LVM storage.

swift

Swift has no native encryption support, so no alternatives were considered for this part of the problem domain.

Implementation

Assignee(s)

Primary assignee:

james-page

Gerrit Topic

Use Gerrit topic “vaultlocker” for all patches related to this spec.

git-review -t vaultlocker

Work Items

vaultlocker

  • base codebase (support for encrypt/decrypt)

  • unit tests

  • functional tests

QA

  • mojo specification to validate encryption-at-rest support

Docs

  • example bundle + documentation for encryption-at-rest

  • appendix for deployment guide on usage and security considerations

charmhelpers

  • block device encryption helper

ceph-osd

  • add support for use of ceph-volume >= Luminous

  • enable support for block device encryption using vaultlocker

  • add relation to vault

swift-storage

  • enable support for block device encryption using vaultlocker

  • add relation to vault

nova-compute

  • enable support for block device encryption using vaultlocker

  • add relation to vault

Repositories

A new repository will be required for vaultlocker.

Documentation

Documentation will be provided as part of the ceph-osd, swift-storage and nova-compute charms.

An additional appendix will be added to the charm deployment guide to cover encryption at rest.

Security

As this solution covers the security of encryption keys used to protect block devices against unauthorized removal, there are multiple security concerns to address.

Communication with Vault will be done over a TLS encrypted connection, authenticated using an AppRole. The AppRole (without its secret_id) will be delivered to the consuming charm over a charm relation; connectivity within Juju is also TLS encrypted, so the potential for interception of the AppRole is limited.

The secret_id for the unit to use with the AppRole is passed out-of-band of Juju - a one-shot token is passed over the vault-kv relation, which can only be used by the consuming unit to retrieve the generated secret_id for the AppRole. The token has a 1hr TTL and is CIDR limited in the same way as the AppRole.

Encryption keys will be stored under a Vault path specific to the AppRole. The Vault AppRole will limit access to the secrets backend based on the CIDR of the accessing servers.

Testing

Functionality will be validated by unit and functional tests within each component.

Overall solution function will be validated using a Mojo spec.

Dependencies

  • Production grade vault charm.

  • AppRole interface to vault charm.