In-flight Validations for the overcloud
Currently, we don’t have any way to run validations inside a deploy run. This spec aims to provide the necessary information on how to implement such in-flight validations for an overcloud deploy.
Operators and developers have to wait a long time before getting an error when a service isn’t running as expected, which wastes time and resources.
After each container/service is started, a new step is added to run one or more validations on the deployed host in order to ensure the service is actually working as expected at said step.
These validations must not use Mistral Workflow, in order to provide support for the undercloud/standalone case.
The best way to push those validations would be through the already existing
deploy_steps_tasks keyword. A validation should run either at the start
of the next step, or at the end of the step we want to check.
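The ordering described above (start the services for a step, then validate before moving on) can be sketched in a few lines. This is only an illustration of the control flow, not how deploy_steps_tasks is implemented: `run_deploy`, `ValidationError`, and the mapping of step numbers to checks are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, List

# A validation is any callable that returns True when the service is healthy.
Validation = Callable[[], bool]


class ValidationError(RuntimeError):
    """Raised when an in-flight validation fails (hypothetical name)."""


def run_deploy(
    steps: List[Callable[[], None]],
    validations: Dict[int, List[Validation]],
) -> List[int]:
    """Run each deploy step, then its validations, stopping at the first failure.

    `validations` maps a 1-based step number to the checks that run at the
    end of that step. Returns the numbers of the steps that completed.
    """
    completed = []
    for number, deploy_step in enumerate(steps, start=1):
        deploy_step()
        for check in validations.get(number, []):
            if not check():
                # Fail early so that later steps never run on a broken state.
                raise ValidationError(
                    f"validation {check.__name__} failed at end of step {number}"
                )
        completed.append(number)
    return completed
```

With this shape, a failing check at step 2 prevents step 3 from starting, which is the early-failure behaviour the spec is after.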
The validations should point to an external playbook, for instance one hosted in
tripleo-validations. If a dedicated playbook isn’t warranted, the validation
may be inline, but it must be short: for example, a single test
for an open port.
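To give an idea of how small such an inline check is meant to be, here is a minimal sketch of an open-port test. In practice the check would be written as an Ansible task; the Python version below only illustrates the logic, and `port_is_open` and its default timeout are assumptions made for the example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection raises OSError on refusal, timeout, or bad address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A validation this short has little to gain from a separate playbook, which is the case the inline form is intended for.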
There is no real alternative. Running the validation Ansible playbook directly might seem like a good idea, but it would break the desired convergence with the UI.
No such validations exist yet, so we can start fresh.
No security impact.
If a service isn’t starting properly, the upgrade might fail. This is also true for a fresh deploy.
We might want different validation tasks/workflows if we’re in an upgrade state.
Other End User Impact
End users will get an early failure when the validations detect an issue. This is an improvement: for now the deploy might fail at a later step, and might break things due to the lack of a valid state.
Running in-flight validation WILL slow the overall deploy/upgrade process, but on the other hand, it will ensure we have a clean state before each step.
Other Deployer Impact
No other deployer impact.
Validations will need to be created and documented in order to get proper runs.
Who is leading the writing of the code? Or is this a blueprint where you’re throwing it out there to see who picks it up?
If more than one person is working on the implementation, please designate the primary author and contact.
- Primary assignee:
- Other contributors:
<launchpad-id or None>
Add new hook for the
Provide proper documentation on its use
Please keep in mind the Validation Framework spec when implementing things: https://review.openstack.org/589169
What is the impact on the docs? Don’t repeat details discussed above, but please reference them here.