API Evolution: etag ID

As time has gone on, the API surface has evolved, and the community has given rise to more use-cases and more users. This has resulted in the identification of several shortcomings in the current REST API: it is not as user-friendly as one might expect, and it does not service all of the use-cases we wish that it would.

This specification describes the implementation of etag identifiers on API responses, allowing for better conflict resolution between concurrent API clients.

https://bugs.launchpad.net/ironic/+bug/1605728

Problem description

Multiple clients attempting to update the same resource can overwrite each other’s changes accidentally and without notice. This is referred to as the lost update problem, and it affects the Bare Metal service. The lost update problem is commonly addressed through the use of “etag” identifiers, as described in the API Working Group specification on etags. This specification proposes that the Bare Metal service provide etag support to address this problem.

Proposed change

The Bare Metal service shall begin to store, internally, a unique etag identifier for every API-modifiable resource (Node, Port, Portgroup, Chassis). Beginning with a new microversion, this identifier shall be returned in the body and headers of successful responses to GET requests for the requested resource. There are two main cases:

  • GET one resource (or subresource). The ETag will be returned in the response headers and body. Python clients may then operate on the resource using a fresh etag (see the client section). Ironic shell users see the etag in the response.

  • GET a list of resources (or subresources). The etag of each resource will be available in the response body, as a field of that resource:

    {
      "ports": [
        {
          "uuid": "11111111-2222-3333-4444-555555555555",
          <all other fields>,
          "etag": "W/eeeeeeeeeeeeee"
        },
        {
          "uuid": "66666666-7777-8888-9999-222222222222",
          <all other fields>,
          "etag": "W/tttttttttttttt"
        }
      ]
    }

    Note that putting these etags into headers would not make sense, because the client would have no standard, simple way to tell which etag belongs to which resource.
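As an illustration of the list case, the following sketch shows how a client could map each resource in a list response body to its etag. The payload is a made-up example shaped like the one above, and `etags_by_uuid` is a hypothetical helper, not part of ironicclient.

```python
import json

# Made-up list response body, shaped like the example above.
body = json.loads("""
{
  "ports": [
    {"uuid": "11111111-2222-3333-4444-555555555555",
     "etag": "W/eeeeeeeeeeeeee"},
    {"uuid": "66666666-7777-8888-9999-222222222222",
     "etag": "W/tttttttttttttt"}
  ]
}
""")


def etags_by_uuid(body, collection):
    """Map each resource UUID in a list response to its etag.

    Hypothetical helper: resources without an etag field map to None.
    """
    return {item["uuid"]: item.get("etag") for item in body[collection]}


print(etags_by_uuid(body, "ports"))
```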

All requests to modify a resource or subresource, whether through a PUT, PATCH, or DELETE request, SHOULD supply an If-Match header with an appropriate etag identifier (for a subresource, the etag of that subresource). Note that the If-Match header will not be mandatory, in line with the RFC specification of the If-Match header.

The SHOULD keyword is therefore used here, meaning that the If-Match header will be an optional parameter in the client (see keywords to indicate requirement levels).

It is worth noting that, in the original usage described by the RFC section on entity-tags, clients MUST provide If-Match for a request to succeed, which is useful when ETags are used for cache validation. In our use case, however, making the If-Match header mandatory would make the ironic client less user-friendly. It is up to users to decide whether they want to be aware of what they change.

If-None-Match or any other ‘Precondition Header Fields’ will not be supported.

To keep request handling efficient, the If-Match header SHALL be validated on both the API and the conductor side by comparing the supplied value against the current etag string. If the supplied value does not match the actual value, a 412 Precondition Failed SHALL be returned.
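A minimal sketch of the comparison described above, assuming the stored etag is available as a string. `PreconditionFailed` and `check_if_match` are hypothetical names used for illustration; they are not taken from the ironic code base. Because the header is optional in this proposal, a missing If-Match skips the check.

```python
class PreconditionFailed(Exception):
    """Hypothetical exception that the API layer would map to HTTP 412."""
    code = 412


def check_if_match(if_match, current_etag):
    """Raise PreconditionFailed when a supplied If-Match value is stale.

    if_match is the raw header value, or None when the client did not
    send the header.
    """
    if if_match is None:
        return  # client opted out of conflict detection
    if if_match != current_etag:
        raise PreconditionFailed(
            "etag %s does not match the current etag" % if_match)


check_if_match(None, "W/abc")     # no header: request proceeds
check_if_match("W/abc", "W/abc")  # matching etag: request proceeds
try:
    check_if_match("W/old", "W/abc")
except PreconditionFailed as exc:
    print(exc.code)
```

The same function could be called once in the API layer and again in the conductor, which is the double validation the paragraph above describes.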

On the successful receipt of any request to modify a resource or subresource, a new etag identifier SHALL be generated by the server (it MUST NOT depend on which ironic-api-version came in the request). Where the modification is synchronous and the response already includes a representation of the resource, the new etag identifier SHALL be included in the response body and header.

The etag is a SHA-512 hash of a compactly encoded JSON string (dictionary ordering is not required) built from the dictionary of oslo versioned object fields. An etag will be generated for ALL create and modify requests, taking into account every field except the ignored ones:

For a node, the ignored fields are:

driver_internal_info, etag, updated_at

For port, portgroup and chassis, only etag and updated_at are ignored.

The generated etag will carry the “W/” prefix to indicate that a weak validator is used. Strong validators are more useful for comparison, but weak validators are applied in our use case because the etag does not change on every update, nor when metadata (e.g. Content-Type) changes. See weak vs strong validators.
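A minimal sketch of the generation scheme described above, assuming field values are already JSON-serializable (real object fields such as datetimes would need converting first). `generate_etag` and `IGNORED_FIELDS` are hypothetical names, and the sample node is made up.

```python
import hashlib
import json

# Ignored fields per resource type, as listed above.
IGNORED_FIELDS = {
    "node": {"driver_internal_info", "etag", "updated_at"},
    "port": {"etag", "updated_at"},
    "portgroup": {"etag", "updated_at"},
    "chassis": {"etag", "updated_at"},
}


def generate_etag(resource_type, fields):
    """Return a weak etag for a dict of object fields (hypothetical helper)."""
    ignored = IGNORED_FIELDS[resource_type]
    relevant = {k: v for k, v in fields.items() if k not in ignored}
    # Compact encoding: no whitespace after the separators.
    encoded = json.dumps(relevant, separators=(",", ":")).encode("utf-8")
    return "W/" + hashlib.sha512(encoded).hexdigest()


# Made-up node fields, for illustration only.
node = {
    "uuid": "1be26c0b-03f2-4d2e-ae87-c02d7f33c123",
    "driver": "fake",
    "updated_at": "2016-07-22T12:00:00",
    "driver_internal_info": {"x": 1},
    "etag": None,
}
etag = generate_etag("node", node)
print(len(etag))  # 2-character prefix + 128 hex digits
```

Because updated_at and driver_internal_info are excluded, changing only those fields leaves the etag unchanged in this sketch, while changing any other field produces a new one.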

Alternatives

If we do not implement etag support, we will not have any means to prevent races between clients’ PATCH requests, and we will not be able to implement support for PUT as a means to update resources or subresources.

Data model impact

Objects are fetched from the database on every REST API request, so as long as a resource is not being changed, its etag shall be read from the object returned from the database. This is more efficient than regenerating the hash on every request.

For that purpose, a new field etag SHALL be added to each resource table to store the etag identifier. The etag identifier will be a String field limited to 130 characters (the “W/” prefix plus a SHA-512 digest of 128 hexadecimal digits).

The new etag field SHALL also be added to the Object model to stay consistent with the db layer. The Object layer will also be where the etag is regenerated from the current Object fields.

The etag will also be included in notification payloads to make them more flexible and usable.

State Machine Impact

None

REST API impact

A new etag header MUST be sent in response to all GET and POST requests, as well as synchronous PATCH and PUT requests, starting from a specific API microversion.

An If-Match header carrying the same etag SHOULD be accepted in all PUT, PATCH and DELETE requests. This means that each endpoint offering any update capability should have logic to optionally validate the etag.

A new error status, 412 Precondition Failed, SHALL be introduced and used to signal to clients that their version of a resource is stale, when the supplied etag header does not match the server’s version.

Client (CLI) impact

With the new microversion, clients gain the ability to update a resource while being aware of exactly what they change. To do so, they SHOULD send an If-Match header with an etag identifier, together with a supported Ironic API version header, in requests that modify any resource. There are two options: the CLI or the Python client API. The latter is available to any Python script used in clouds (useful for production).

The workflow of etag usage in the ironicclient shell:

  • The client does a GET request.

  • Starting from a specific API microversion, the response SHALL include an etag for the requested resource(s) in the headers. The ETag SHALL also be included within the returned resources in the body, both for a GET of an individual resource and of a resource collection.

  • Users may specify an etag, if needed, on any operation that changes a resource by adding the --etag flag to the command. The etag can be obtained from the body or headers of a response:

    ironic --ironic-api-version 1.40 node-update \
    --etag <etag_string> driver_info/foo=value
    

    This etag string is sent as the If-Match header of the request.

  • The Ironic API pops the If-Match header and checks it against rpc_node’s etag. If they match, the entity tag is passed on over RPC, where the conductor validates it again. If the etags do not match at either point, a 412 PreconditionFailed error is raised. If the requested X-Openstack-Ironic-Api-Version does not support etags yet, a NotAcceptable error is raised.

To make the Python client API usable without the shell, resources will be stored as full-featured objects (not just bags of attributes), including the etag identifier. To do this, the ironicclient API will be rewritten so that the Resource class is able to update itself and call its manager to send the requests available in NodeManager. The Resource will be held in memory like any other Python object during process execution.

For the Python API, all appropriate actions of a Resource object will accept an optional etag parameter. The workflow can be the following:

  • In a Python shell or a script, clients do a GET request for the resource. The etag returned in the response will be stored in the resource representation, e.g.:

    node = node_manager.get(node_ident)

  • Afterwards, at any time, user scripts can perform an action on the resource itself:

    new_node = node.update(patch, etag=True)

    They will then have the up-to-date resource representation, provided the request is validated on the server.

    Note that, for one standard deprecation period, If-Match will not be sent to the server by default. Clients will be warned that in the next release the etag parameter will be True by default.

If an etag is requested, it will be retrieved from the current resource representation (as node.etag or getattr(node, 'etag')) and sent as the If-Match header, which means that the user cares about up-to-date information. If the etag is not present on the resource and the client did not turn off the etag option, the request will fail when using an API version greater than or equal to the etag API version.
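A small sketch of that header logic, under the assumption that the resource is any object exposing an etag attribute. `build_update_headers` is a hypothetical helper, not an ironicclient function, and the API version check is left out for brevity.

```python
from types import SimpleNamespace


def build_update_headers(resource, etag=True):
    """Return the extra request headers for an update (hypothetical helper)."""
    if not etag:
        return {}  # caller opted out: no If-Match is sent
    value = getattr(resource, "etag", None)
    if value is None:
        # etag was requested but the stored representation has none
        raise ValueError("etag requested but the resource has none")
    return {"If-Match": value}


# Made-up resource representation, for illustration only.
node = SimpleNamespace(uuid="made-up-uuid", etag="W/abc")
print(build_update_headers(node))
print(build_update_headers(node, etag=False))
```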

Depending on the situation, the client may choose to transparently retry, or display a diff to the user of the stored vs. server-side resource. Clients should also begin alerting the user when an update request fails due to a resource conflict.
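The transparent-retry option could look roughly like the sketch below. `FakeServer` is an in-memory stand-in that exists only to make the example runnable, and `PreconditionFailed` and `update_with_retry` are hypothetical names; none of them are part of ironicclient.

```python
class PreconditionFailed(Exception):
    """Hypothetical client-side error for an HTTP 412 response."""


class FakeServer:
    """In-memory stand-in for the API, used only for this sketch."""

    def __init__(self):
        self.node = {"name": "n1", "etag": "W/1"}
        self._version = 1

    def get(self):
        return dict(self.node)

    def update(self, changes, if_match=None):
        if if_match is not None and if_match != self.node["etag"]:
            raise PreconditionFailed("stale etag")
        self._version += 1
        self.node.update(changes)
        self.node["etag"] = "W/%d" % self._version
        return dict(self.node)


def update_with_retry(server, node, changes, attempts=3):
    """Send an update with If-Match, re-fetching the resource on conflict."""
    for _ in range(attempts):
        try:
            return server.update(changes, if_match=node["etag"])
        except PreconditionFailed:
            node = server.get()  # refresh the stored etag, then retry
    raise PreconditionFailed("gave up after %d attempts" % attempts)


server = FakeServer()
stale = server.get()
server.update({"name": "n2"})  # another client modifies the node first
result = update_with_retry(server, stale, {"name": "n3"})
print(result["name"], result["etag"])
```

A blind retry like this reapplies the change on top of whatever the other client wrote, so a client that wants to protect the user would compare the refreshed representation with the stored one (or show a diff) before retrying.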

“ironic” CLI

See above.

“openstack baremetal” CLI

The same workflow as described in Client (CLI) impact.

RPC API impact

The RPC API version needs to be bumped to accept an etag parameter for actions on the resource. The etag parameter, defaulting to None, should be passed to the appropriate methods.

Driver API impact

None

Nova driver impact

The Nova ironic driver may use the new Ironic API microversion, so the ironic API version used by the nova virt driver needs to be bumped. Until the etag option in the python ironicclient API is True by default, the nova driver should explicitly specify etag=False in the appropriate methods of the Node resource object.

Ramdisk impact

None

Security impact

None

Other end user impact

Requests that send an If-Match header may fail with a 412 Precondition Failed error. A client may retry with a fresh etag and/or display a diff between the two representations of the resource.

Scalability impact

None

Performance Impact

Generating a new etag may increase response time, depending on the resource size.

Other deployer impact

Some services (e.g. Nova) change baremetal resources through the API, so they may upgrade the Ironic API version they use in order to use etags. If services do not upgrade, deployers should be warned that skipping these upgrades may violate some strong recommendations and that information consistency is not guaranteed on the ironic side.

Developer impact

Python developers can work with Resource objects as full-featured objects and perform modifying operations on them. They can also implement scripts that use the etag option in parallel in an efficient way.

Implementation

Assignee(s)

Primary assignee:
galyna
Other contributors:
vdrok

Work Items

  • Implement database migration, adding an internal etag field to all top-level resources.
  • Implement generation and validation utility functions in common code.
  • Implement changes within the ironic.api.controllers.v1 modules to accept and return etag identifiers when fetching or changing resources as appropriate.
  • Implement unit tests and tempest tests.
  • Update api-ref documentation.
  • Implement changes in the python client library and openstack CLI to begin caching etags on GET requests and sending etags on PUT/PATCH/DELETE requests.

Dependencies

None

Testing

Unit and tempest tests shall be added that ensure etag identifiers are returned, that they are validated by requests to modify resources, and that proper errors are returned when an invalid (or merely not current) etag is supplied.

Upgrades and Backwards Compatibility

Backwards compatibility is retained because etags SHOULD only be returned and required in new microversions.

This change does not include substantive changes to any resource.

Documentation Impact

The proper use of etag identifiers shall be documented in our API reference.