Log Request ID Mappings

https://blueprints.launchpad.net/nova/+spec/log-request-id-mappings

Tracking requests across multiple OpenStack services is difficult.

Problem description

Each OpenStack service sends a request ID header with HTTP responses. This value can be useful for tracking down problems in the logs. However, when operations cross service boundaries, this tracking becomes difficult, as services generate a new ID for each inbound request; a nova request ID cannot help users find debug info for other services that were called by nova while a request to nova was being fulfilled. This becomes especially problematic when many requests are coming at once, especially at the gate, where tempest is now running tests in parallel. Simply matching timestamps between service logs is no longer a feasible solution. Therefore, another method of request tracing is needed to aid debugging.

Proposed change

A request ID is generated at the start of request processing. This is the ID that a user will see when their request returns. Other OpenStack services that nova calls out to (such as glance) will send responses with their request ID as a header. By logging the mapping of the two request IDs (nova->glance), a user will be able to easily lookup the request ID returned by glance in the n-api log. With the glance request ID in hand, a user can then check the glance logs for debug info which corresponds to the request made to nova.

There are two request IDs required to form the log message: one generated by nova, and one included in the response from another service. The request ID generated by nova is in the context that is passed in to the python client wrappers. The request ID of the other service does not yet reach nova’s client wrappers; this value being returned depends on some client blueprints being implemented (see Dependencies). Once both request IDs are available, the logging will be done using link_request_ids() from request_utils.

Alternatives

This idea surfaced in previous cycles[1], with the proposed solution being a unified request ID that would be passed between services. While work was started on this, the approach was eventually deemed unsatisfactory, as it would allow for meddling with the request ID value on the client side. A client reusing the same request ID would eliminate the benefit of request tracing. With a carefully chosen request ID, one user could even interfere with the debugging effort of another user. So having the request ID be generated by the server is essential.

Data model impact

None.

REST API impact

None.

Security impact

None.

Notifications impact

The request_utils module used will generate an INFO notification for each request ID mapping that is logged.

Other end user impact

The only places an end user will see this is in the nova logs or in the generated notifications.

Performance Impact

None.

Other deployer impact

In order for nova to log the request ID value of the other service, the python client for that service needs to be passing back the request ID it gets from the HTTP response from the service. New releases of python-glanceclient, python-cinderclient, etc. will be necessary for this to occur. As such, deployers will need to be running newer versions of the client in order for the request ID mappings to be logged. If the client is not returning the request ID, no logging will occur.

Developer impact

None.

Implementation

Assignee(s)

Primary assignee:

chris-buccella

Work Items

  1. Sync request_utils module from oslo

  2. As various service’s python clients are updated to return the request ID, log the request ID mapping using request_utils.link_request_ids(): 1. python-glanceclient 2. python-cinderclient 3. python-neutronclient

  3. Update requirements.txt with the new required versions of the clients

Dependencies

Testing

A tempest test could be added to ensure that notifications are being generated in the correct format.

Documentation Impact

The Operations Guide should be updated particularly:

  • Chapter 11, under “Determining Which Component Is Broken”

  • Chapter 13, under “Tracing Instance Requests”

References

[0] Proposed change for nova<->glance logging from Icehouse cycle: https://review.openstack.org/#/c/68518

[1] Discussion from the HK Summit: https://etherpad.openstack.org/p/icehouse-summit-nova-cross-project-request-ids

[2] Refinements from the ML: http://lists.openstack.org/pipermail/openstack-dev/2013-December/020774.html