Validate each step of the installation so that a deployment can stop quickly when errors occur and report which components are in error.
During deployment, problems are often spotted only at the end of the configuration and can accumulate on top of each other, making it difficult to find the root cause.
Deployers and developers will benefit from an installation process that fails fast and identifies the lowest-level component causing the problem.
Leverage the steps already defined in TripleO to run a validation tool at the end of each step.
During each step, collect assertions about which components are configured on each host. At the end of the step, run a validation tool that consumes these assertions and reports every failed assertion.
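The collect-then-validate flow described above can be sketched as follows. This is a minimal illustration, not the proposed tool: the `Assertion`, `StepValidator` and `deploy` names, and the idea of representing a check as a Python callable, are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Assertion:
    """One expected fact about a host (hypothetical structure)."""
    host: str
    component: str
    description: str
    check: Callable[[], bool]


class StepValidator:
    """Collects assertions per step and validates them at the end of the step."""

    def __init__(self):
        self._by_step: Dict[int, List[Assertion]] = {}

    def collect(self, step, assertion):
        self._by_step.setdefault(step, []).append(assertion)

    def validate(self, step):
        """Run every assertion of the step and return all failures, not just the first."""
        failures = []
        for assertion in self._by_step.get(step, []):
            try:
                ok = assertion.check()
            except Exception:
                # A check that raises is treated as a failed assertion.
                ok = False
            if not ok:
                failures.append(assertion)
        return failures


def deploy(steps, validator):
    """Run steps in order and stop at the first step that has failed assertions."""
    for number, configure in steps:
        configure(validator)
        failures = validator.validate(number)
        if failures:
            return number, failures
    return None, []
```

Within a step, all failed assertions are reported together; between steps, the deployment stops as soon as one step fails, so later steps do not pile new errors on top of the original one.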
We could use Puppet to add assertions in the code to validate what has been configured. The drawback of this approach is that it is difficult to get good reporting on the issues found, compared to a specialized tool that can also be run outside of the installer if needed.
The other drawback of this approach is that it cannot be reused in the future if or when we support non-Puppet configuration, and it probably cannot be used either when we use Puppet to generate an external config file for containers.
This feature will be activated automatically in the installer.
If needed, the deployer or developer will be able to launch the tool by hand to validate a set of assertions.
We expect the validations to take less than one minute per step.
The objective is a faster iterative process, achieved by failing fast.
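As an illustration of launching the tool by hand, the sketch below reads a set of assertions from a file and exits non-zero if any of them fail. The JSON-lines file format, the `kind` names and the `main` entry point are hypothetical, chosen only for this example; the spec does not define them.

```python
import argparse
import json
import os

# Hypothetical check types; a real tool would support many more.
CHECKS = {
    "file_exists": os.path.exists,
    "dir_exists": os.path.isdir,
}


def run_assertions(path):
    """Return the list of assertions in the file that do not hold."""
    failures = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            item = json.loads(line)
            check = CHECKS.get(item["kind"])
            # An unknown check type is reported as a failure rather than ignored.
            if check is None or not check(item["target"]):
                failures.append(item)
    return failures


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Validate a set of recorded assertions.")
    parser.add_argument("assertions_file",
                        help="JSON-lines file, one assertion per line")
    args = parser.parse_args(argv)
    failures = run_assertions(args.assertions_file)
    for item in failures:
        print("FAILED {component}: {kind} {target}".format(**item))
    return 1 if failures else 0
```

Calling `sys.exit(main())` from a small script would make this usable from a shell, with the exit status indicating whether the set of assertions passed.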
Each configuration module will need to generate assertions to be consumed by the validation tool.
Note that this approach (multi-step application of Ansible in localhost mode via Heat) has been prototyped for upgrades, and it will work well for validations too.
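One way a configuration module might generate its assertions is to append them to a per-step file that the validation tool reads later. The sketch below assumes a JSON-lines file named `step_<n>.jsonl`; the file layout, the `emit_assertion` helper, the `configure_ntp` module and the check kinds are all hypothetical.

```python
import json
import os


def emit_assertion(directory, step, component, kind, target):
    """Append one assertion to the file for the given step and return its path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "step_{}.jsonl".format(step))
    record = {"step": step, "component": component,
              "kind": kind, "target": target}
    with open(path, "a") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return path


def configure_ntp(directory, step):
    """Hypothetical configuration module: configure, then declare what should be true."""
    # ... the actual configuration work would happen here ...
    emit_assertion(directory, step, "ntp", "file_exists", "/etc/ntp.conf")
    emit_assertion(directory, step, "ntp", "service_running", "ntpd")
```

Keeping the assertions in a plain data file, rather than inside the configuration code, is what lets the same validation tool run both from the installer and by hand.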
Primary assignee: <firstname.lastname@example.org>
To be added.
The change will be used automatically in the CI so it will always be tested.
We’ll need to document how to integrate with whichever validation tool is used, so that those adding new services (or, in the future, out-of-tree additional services) know how to hook into the validation.
A similar approach was used in SpinalStack using serverspec. See https://github.com/redhat-cip/config-tools/blob/master/verify-servers.sh
A collection of Ansible playbooks to detect and report potential issues during TripleO deployments: https://github.com/openstack/tripleo-validations
Prototype of composable upgrades with Heat+Ansible: https://review.openstack.org/#/c/393448/