Glance as first-line defense for image format attacks¶
https://blueprints.launchpad.net/glance/+spec/glance-as-defender
Glance is the point of entry for images into the cloud. It is the orifice through which an untrusted (but authenticated) user brings an image into the system, after which it will be processed by backend routines either for format conversion or preparation for boot.
It is at this stage in the pipeline that we are best positioned to sanity-check the images we accept, and to validate some of the metadata provided by the user so that downstream services (such as nova, cinder, ironic and others) can reasonably assume that metadata to be correct.
Problem description¶
Glance will (for the most part) allow a user to upload literally any content and declare its format to be any of the valid values we have for disk_format and container_format. This is certainly surprising to downstream services, external consumers, and humans who expect the format stated on the image to be in line with the actual content.
A further problem is in the use of the raw value for disk_format. In general, we use raw to mean “a byte-for-byte image of a block device”, usually with a partition table and often a bootloader (in the case of a root disk). However, in reality raw has come to mean “anything in a format for which we do not have another name”. This catch-all behavior means that if we want to support images of full-disk formats, we also need to support “anything else we don’t know about” to some degree.
Proposed change¶
This spec proposes two major changes to glance:
First, we will start enforcing that the format of the uploaded content matches the disk_format declared on the image. Since our format_inspector module is already in the data pipeline for both upload and import, we simply need to remove the “never fail” behavior we currently have and abort the process if we determine that the format does not match what was claimed. Consider the following two examples:
- An image is declared to be of format qcow2 but the content uploaded is something else (either another complex format such as vmdk or something we do not recognize).
- An image is declared to be of format raw but the content uploaded is detected as qcow2.
Ideally, we would reject both of these cases. However, the second is more complex: a service that handles a disk image without assuming any particular format would be prevented from storing it in glance if the user of that image has written content in a specific format to it. Thus, for the first iteration of this work, we will only enforce the first case, which means that an image declared as raw could contain qcow2 content, but an image declared as vmdk could not.
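The first-iteration rule can be sketched as a small decision function. This is only an illustration under stated assumptions: the name should_reject, the ENFORCED_FORMATS set, and the use of None for "no inspector recognized the content" are hypothetical and not part of glance or format_inspector.

```python
from typing import Optional

# Hypothetical: the declared formats for which an inspector exists and
# enforcement applies. The real list would come from format_inspector.
ENFORCED_FORMATS = frozenset({"qcow2", "vmdk"})


def should_reject(declared: str, detected: Optional[str]) -> bool:
    """Return True if the upload should be aborted under first-iteration rules.

    `detected` is the format reported by content inspection, or None when
    nothing was recognized (an assumption of this sketch).
    """
    if declared == "raw":
        # Second case from the text: not enforced yet, raw may hold anything.
        return False
    if declared not in ENFORCED_FORMATS:
        # No inspector for this declared format, so nothing to compare against.
        return False
    # First case: a declared complex format must match the detected content.
    return detected != declared
```

With this rule, should_reject("qcow2", "vmdk") and should_reject("qcow2", None) both abort the upload, while should_reject("raw", "qcow2") lets it through.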
The use of [image_format]/disk_formats will effectively allow an admin to limit the types of disks they accept. Today, that only limits “honest” users, but this change will make it enforced on content as well.
The second major change is significant in terms of the model and the behavior existing users are accustomed to, but is small in absolute terms and in its impact on glance itself. A new disk_format option of gpt will be added, which will henceforth signify that “this is an image of a raw block device with a partition table”, thus removing our need to overlap that definition with “this is something we do not recognize” for raw. The definition of gpt is actually a superset of the legacy PC MBR (Master Boot Record) format, which means an inspector for this format should have no problem detecting and allowing very old (read: Windows XP/2003 vintage) disk images. Thus, many of the images we now legitimately use raw for will be (and will need to be converted to) a disk_format of gpt going forward.
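To illustrate why one inspector can cover both GPT and legacy MBR disks, here is a minimal sketch that classifies the first two sectors of an image. It is not the format_inspector implementation; the function names are hypothetical, and it assumes 512-byte logical sectors (on 4K-sector disks the GPT header sits at a different offset).

```python
SECTOR = 512  # assumption: 512-byte logical sectors


def has_mbr_signature(head: bytes) -> bool:
    """The MBR boot signature is the byte pair 0x55 0xAA at offsets 510-511."""
    return len(head) >= SECTOR and head[510:512] == b"\x55\xaa"


def partition_types(head: bytes) -> list:
    """Type bytes of the four 16-byte MBR partition entries (table at offset 446)."""
    return [head[446 + 16 * i + 4] for i in range(4)]


def classify(head: bytes) -> str:
    """Return 'gpt', 'mbr', or 'unknown' for the first 1024 bytes of an image."""
    if not has_mbr_signature(head):
        return "unknown"
    types = partition_types(head)
    # A GPT disk carries a protective MBR entry of type 0xEE and the
    # "EFI PART" header signature at the start of the second sector.
    if 0xEE in types and head[SECTOR:SECTOR + 8] == b"EFI PART":
        return "gpt"
    if any(t != 0 for t in types):
        return "mbr"
    return "unknown"
```

Under the proposal, both the 'gpt' and 'mbr' results would map to the single disk_format value gpt.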
We will add a configuration option to disable this enforcement, as a relief valve to support migration to this stricter model and/or to account for false-positive detections:
- require_image_format_match: Defaults to true. Setting it to false avoids aborting the upload/import when the format does not match the content.
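As a rough illustration of how the two settings interact, the sketch below parses a sample configuration fragment. Glance itself reads configuration through oslo.config; the standard-library configparser is used here only to keep the example self-contained. Placing require_image_format_match in the [image_format] section is an assumption of this sketch, as is the helper name load_format_policy.

```python
import configparser

# Sample fragment. The section chosen for require_image_format_match is an
# assumption; the spec only names [image_format]/disk_formats explicitly.
SAMPLE = """
[image_format]
disk_formats = qcow2,raw,gpt
require_image_format_match = false
"""


def load_format_policy(text: str):
    """Return (allowed_disk_formats, enforce_match) from an INI-style string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw_list = parser.get("image_format", "disk_formats", fallback="")
    allowed = [f.strip() for f in raw_list.split(",") if f.strip()]
    # Enforcement is on unless the operator explicitly opens the relief valve.
    enforce = parser.getboolean(
        "image_format", "require_image_format_match", fallback=True
    )
    return allowed, enforce
```

Leaving the option out keeps enforcement on, which matches the proposed default of true.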
Alternatives¶
Glance could continue to be ambivalent about the content uploaded and the mismatch between that and the metadata it stores. Services like nova and cinder will have to continue treating glance as untrustworthy and remain highly suspicious of its metadata.
Data model impact¶
The only data model change that should be required here is one to allow the new disk_format value of gpt to be specified in the API and stored in the database.
There will, of course, be a need to convert existing raw images in the database to gpt, and thus some tooling will be required. Options for that could be a glance-manage command to automatically (or manually) do this, or allow it to be done through the API. Alternately, we could also annotate existing images and have the API report them as gpt if the client is determined to be new enough.
REST API impact¶
The main REST API impact comes from allowing gpt as one of the valid options for disk_format. Additional impact could come if we decide to provide format conversion (or reporting) through the API.
Security impact¶
In general, this will improve security for the entire cloud by allowing nova, cinder, and other users of glance some amount of trust in the image content and associated metadata. It is important, however, that the other services do not conclude they no longer need to inspect image content at all. Security for this sort of thing is best provided in layers, and services need to continue to be vigilant about images they download from glance, certainly applying context-specific checks before using them.
Notifications impact¶
None (aside from the new format).
Other end user impact¶
Users will definitely be impacted, as the muscle memory of (over)using raw as both a catch-all and as meaning “an image of a whole disk” will take some time to unlearn.
Performance Impact¶
We are already using format_inspector in the data pipeline. We will need to run all the inspectors in parallel instead of just the declared-format one we are currently using. However, these are designed to be as memory-efficient as possible, and thus the overhead should be minimal. Actual performance of the upload itself should not be impacted.
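The idea of running several inspectors over the same stream with bounded memory can be sketched as follows. This is a toy model, not the format_inspector API: the MagicInspector class, its eat_chunk method, and inspect_while_copying are hypothetical names, and "in parallel" here means that each chunk is offered to every inspector in turn, not that threads are used.

```python
class MagicInspector:
    """Toy inspector: matches a magic prefix and buffers only those bytes."""

    def __init__(self, name: str, magic: bytes):
        self.name = name
        self.magic = magic
        self._buf = b""

    def eat_chunk(self, chunk: bytes) -> None:
        # Keep only as many bytes as the magic needs, so memory use is
        # bounded by len(magic) regardless of image size.
        need = len(self.magic) - len(self._buf)
        if need > 0:
            self._buf += chunk[:need]

    @property
    def matched(self) -> bool:
        return self._buf == self.magic


def inspect_while_copying(chunks, inspectors, write):
    """Feed each chunk to every inspector, then pass it on unchanged."""
    for chunk in chunks:
        for inspector in inspectors:
            inspector.eat_chunk(chunk)
        write(chunk)
    return [i.name for i in inspectors if i.matched]
```

Because every inspector sees the data once as it flows past, the upload is read a single time, and the added cost is a small per-chunk loop over the inspectors.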
Other deployer impact¶
Deployers will have work to do here for sure, specifically as existing disks marked as raw will (mostly) need to be converted to gpt. We cannot just convert any raw, as many of those may be kernel images, or other formats that we do not (but probably need to) support identifying.
Options for that would be:
- Tell operators to just do it themselves and provide some way to change the disk_format of an image in the database.
- Provide a tool to detect and convert the images based on content. We can, in many cases, do this without seeing the whole image, as things that should be gpt will be identified within the first sector or two of the content. This could be used only for converting raw to gpt, but could be written generically in a way that allows operators to audit all their images to make sure they are in the format they claim to be.
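A generic audit along the lines of the second option might look like the sketch below. It is an assumption-laden illustration, not proposed tooling: the function names and the (image_id, declared, head) tuple layout are hypothetical, and detection uses only well-known leading magic bytes plus the boot-sector signature. A boot signature alone is a weak signal (FAT filesystem boot sectors carry it too), so real tooling would need the fuller checks of a proper inspector.

```python
# Leading magic bytes for two complex formats (qcow2 and sparse VMDK).
MAGICS = {"qcow2": b"QFI\xfb", "vmdk": b"KDMV"}


def detect_from_head(head: bytes):
    """Guess a format from the first bytes of an image, or return None."""
    for name, magic in MAGICS.items():
        if head.startswith(magic):
            return name
    # 0x55 0xAA at offsets 510-511 marks an MBR (or protective MBR) sector.
    if len(head) >= 512 and head[510:512] == b"\x55\xaa":
        return "gpt"
    return None


def audit(images):
    """Report (image_id, declared, detected) for every detectable mismatch.

    `images` is an iterable of (image_id, declared_format, head_bytes).
    Images whose content is not recognized are left alone.
    """
    findings = []
    for image_id, declared, head in images:
        detected = detect_from_head(head)
        if detected is not None and detected != declared:
            findings.append((image_id, declared, detected))
    return findings
```

An operator could review the findings before changing any disk_format, which keeps kernel images and other unrecognized raw content untouched.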
Developer impact¶
No specific impact, although as more formats need to be supported, additional inspector modules will need to be written.
Implementation¶
Assignee(s)¶
- Primary assignee: danms
Work Items¶
- Make the format_inspector in the data pipeline detect all formats and abort the upload if the format is determined and does not match
- Make glance depend on oslo.utils for format_inspector
- Add gpt as a valid disk_format
- Write tooling for converting raw disks to gpt in the database
- Add config options for new/fallback behaviors
Dependencies¶
The gpt part of this depends on the oslo port of the inspector code.
Testing¶
We will need negative tempest tests for format mismatches, which can be written without much drama. Most of the formats require just a few 512-byte sectors of data to be detected and we can generate those inline in tempest tests to make sure that glance rejects mismatches.
Documentation Impact¶
We will need documentation about the raw-to-gpt behavior change, and we could use better documentation about when to use raw, which will be easier to explain in the context of gpt.