Step by step validation

Include the URL of your launchpad blueprint:

https://blueprints.launchpad.net/tripleo/+spec/step-by-step-validation

Validate each step during the installation to be able to stop fast in case of errors and provide feedback on which components are in error.

Problem Description

During deployment, problems are often spotted at the end of the configuration and can accumulate on top of each other making it difficult to find the root cause.

Deployers and developers will benefit by having the installation process fails fast and spotting the lowest level possible components causing the problem.

Proposed Change

Overview

Leverage the steps already defined in Tripleo to run a validation tool at the end of each step.

During each step, collect assertions about what components are configured on each host then at the end of the step, run a validation tool consumming the assertions to report all the failed assertions.

Alternatives

We could use Puppet to add assertions in the code to validate what has been configured. The drawback of this approach is the difficulty to have a good reporting on what are the issues compared to a specialized tool that can be run outside of the installer if needed.

The other drawback to this approach is that it can’t be reused in future if/when we support non-puppet configuration and it probably also can’t be used when we use puppet to generate an external config file for containers.

Security Impact

  • some validations may require access to sensitive data like passwords or keys to access the components.

Other End User Impact

This feature will be activated automatically in the installer.

If needed, the deployer or developper will be able to launch the tool by hand to validate a set of assertions.

Performance Impact

We expect the validations to take less than one minute by step.

Other Deployer Impact

The objective is to have a fastest iterative process by failing fast.

Developer Impact

Each configuration module will need to generate assertions to be consummed by the validation tool.

Implementation

Note that this approach (multiple step application of ansible in localhost mode via heat) for upgrades and it will work well for validations too.

https://review.openstack.org/#/c/393448/

Assignee(s)

Primary assignee: <shardy@redhat.com>

Other contributors to help validate services:
<launchpad-id or None>

Work Items

  • generate assertions about the configured components on the server being configured in yaml files.
  • implement the validation tool leveraging the work that has already been done in tripleo-validations that will do the following steps:
    1. collect yaml files from the servers on the undercloud.
    2. run validations in parallel on each server from the undercloud.
    3. report all issues and exit with 0 if no error or 1 if there is at least one error.

Dependencies

To be added.

Testing

The change will be used automatically in the CI so it will always be tested.

Documentation Impact

We’ll need to document integration with whatever validation tool is used, e.g so that those integrating new services (or in future out-of-tree additional services) can know how to integrate with the validation.

References

A similar approach was used in SpinalStack using serverspec. See https://github.com/redhat-cip/config-tools/blob/master/verify-servers.sh

A collection of Ansible playbooks to detect and report potential issues during TripleO deployments: https://github.com/openstack/tripleo-validations

Prototype of composable upgrades with Heat+Ansible: https://review.openstack.org/#/c/393448/