The TripleO GUI currently has no way to persist logging information.
The TripleO GUI is a web application without its own dedicated backend. As such, any and all client-side errors are lost when the End User reloads the page or navigates away from the application. When things go wrong, the End User is unable to retrieve client-side logs because this information is not persisted.
I propose that we use Zaqar as a persistence backend for client-side logging. At present, the web application is already communicating with Zaqar using websockets. We can use this connection to publish new messages to a dedicated logging queue.
Zaqar messages have a TTL of one hour. So once every thirty minutes, Mistral
will query Zaqar using crontrigger, and retrieve all messages from the
tripleo-ui-logging queue. Mistral will then look for a file called
tripleo-ui-log in Swift. If this file exists, Mistral will check its size.
If the size exceeds a predetermined size (e.g. 10MB), Mistral will rename it to
tripleo-ui-log-<timestamp>, and create a new file in its place. The file
will then receive the messages from Zaqar, one per line. Once we reach, let’s
say, a hundred archives (about 1GB) we can start removing dropping data in order
to prevent unnecessary data accumulation.
To view the logging data, we can ask Swift for 10 latest messages with a prefix
tripleo-ui-log. These files can be presented in the GUI for download.
Should the user require, we can present a “View more” link that will display the
rest of the collected files.
None at this time
There is a chance of logging sensitive data. I propose that we apply some common scrubbing mechanism to the messages before they are stored in Swift.
Other End User Impact¶
Sending additional messages over an existing websocket connection should have a negligible performance impact on the web application. Likewise, running hourly cron tasks in Mistral shouldn’t impose a significant burden on the undercloud machine.
Other Deployer Impact¶
Developers should also benefit from having a centralized logging system in place as a means of improving productivity when debugging.
- Primary assignee:
Introduce a central logging system (already in progress, see blueprint)
Introduce a global error handler
Convert all logging messages to JSON using a standard format
Configuration: the name for the Zaqar queue to carry the logging data
Introduce a Mistral workflow to drain a Zaqar queue and publish the acquired data to a file in Swift
Introduce GUI elements to download the log files
We can write unit tests for the code that handles sending messages over the websocket connection. We might be able to write an integration smoke test that will ensure that a message is received by the undercloud. We can also add some testing code to tripleo-common to cover the logic that drains the queue, and publishes the log data to Swift.
We need to document the default name of the Zaqar queue, the maximum size of each log file, and how many log files can be stored at most. On the End User side, we should document the fact that a GUI-oriented log is available, and the way to get it.