OpenStack Extreme Testing

Cross Project Spec - None

User Story Tracker - None

Problem description

Problem Definition

To provide competitive service to their customers, OpenStack operators frequently upgrade components, integrate new hardware, scale up, and make configuration changes. However, not all of these variations are tested in the current OpenStack test systems. Most OpenStack cloud service providers run their own tests before introducing changes to production. Those tests include integration testing, component interface testing, operational acceptance testing, destructive testing, concurrent testing, performance testing, etc. Currently the OpenStack ecosystem has unit, functional, and integration testing, and most of the tests listed above are missing or only partially implemented in the ecosystem.

Opportunity/Justification

These extended tests can significantly improve the overall quality of OpenStack and dramatically reduce the time needed to deliver a new release or new changes to a production environment. The tests would be run before a stable release by the QA team, or more collaboratively through the third-party CI interface, spreading the cost of pre-stable testing and increasing the number of issues reported and fixed before release. However, testing upstream code with all possible combinations of hardware and configurations is not practical. One possible solution is for the QA team to run these extended tests on a few pre-selected reference architectures, with other architectures added as third-party CIs. After release, the tests can be used by each distributor in its stabilization process and finally by each operator as they stabilize their configuration and each deployment. Currently, operators run these extended tests by themselves, without collaborating or taking advantage of each other's work.

Requirements Specification

Use Cases

This section utilizes the OpenStack UX Personas.

  • Destructive testing

    As Rey the Cloud Operator, I would like all OpenStack projects to be tested for destructive scenarios on an OpenStack cloud system with High Availability configurations, such as controller node high availability and Networking, Storage, and Compute service high availability, so that as we deploy OpenStack into production we have fewer situations in which OpenStack functions themselves fail (bugs are fixed beforehand), and for the remaining ones we can avoid them or plan to mitigate them with our specific configurations.

  • Concurrent testing

    As Rey, I would like the following OpenStack projects to be tested for concurrency before stable release, so that as we deploy OpenStack into production environments we are confident that real-world simultaneous function calls do not fail.

    OpenStack projects for extended testing
    • Nova
    • Cinder
    • Glance
    • Keystone
    • Neutron
    • Swift

Usage Scenario Examples

Destructive testing

Destructive testing simulates the failure or restart of part of the underlying OpenStack infrastructure (hardware or software) or of an OpenStack component itself, and verifies that the system continues to operate properly under such conditions:

  • Shut down a control node where API services are running and verify that API requests are processed as expected
  • Restart network switches and verify that services recover automatically
  • Restart some OpenStack services and verify that each service recovers within the expected downtime
  • Generate DB/RabbitMQ downtime and verify that there is no request loss and no non-recoverable error in the system
  • Shut off a hardware blade
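
Most of the scenarios above share one shape: inject a fault, then check that the system recovers within an expected downtime. A minimal Python sketch of that shape follows. It is an illustration under stated assumptions, not part of any existing OpenStack test tool: `measure_recovery` and `FakeService` are hypothetical names, and the fake service stands in for a real fault injector and health check (for example, stopping a service and polling its API).

```python
import time
from typing import Callable, Optional


def measure_recovery(inject_fault: Callable[[], None],
                     is_healthy: Callable[[], bool],
                     max_downtime: float,
                     poll_interval: float = 0.01) -> float:
    """Inject a fault, then poll until the service is healthy again.

    Returns the observed downtime in seconds. Raises AssertionError if
    the service has not recovered within max_downtime seconds.
    """
    inject_fault()
    start = time.monotonic()
    while True:
        if is_healthy():
            return time.monotonic() - start
        if time.monotonic() - start > max_downtime:
            raise AssertionError(
                f"service did not recover within {max_downtime}s")
        time.sleep(poll_interval)


class FakeService:
    """Hypothetical stand-in for a service that restarts by itself."""

    def __init__(self, restart_time: float) -> None:
        self.restart_time = restart_time
        self.down_since: Optional[float] = None

    def kill(self) -> None:
        # Record when the simulated fault was injected.
        self.down_since = time.monotonic()

    def healthy(self) -> bool:
        # Healthy if never killed, or once restart_time has elapsed.
        if self.down_since is None:
            return True
        return time.monotonic() - self.down_since >= self.restart_time


if __name__ == "__main__":
    service = FakeService(restart_time=0.05)
    downtime = measure_recovery(service.kill, service.healthy,
                                max_downtime=1.0)
    print(f"recovered after {downtime:.3f}s")
```

In a real deployment the two callables would wrap whatever mechanism the operator uses to stop a node or service and to probe its availability; the sketch only fixes the pass/fail criterion.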

Concurrent testing

Concurrent testing issues more than one request at a time to a functioning OpenStack cloud. These can be the same functional request issued for two different users, or different functional requests that access the same resource. The expected result is that the functions complete in the same manner as they do when not issued simultaneously. OpenStack Rally can be used to conduct these concurrent tests.

  • Tenants added at the same moment
  • Networks requested at the same moment
  • In a constrained storage environment, a release of storage and a request for that storage happen at the same time
  • Simultaneously shelve and migrate an instance, and then unshelve the instance
  • Simultaneously create multiple snapshots from an instance
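
The pass criterion for these scenarios is that simultaneous requests produce the same outcome as sequential ones. The Python sketch below illustrates that check for the constrained-storage case. It does not use Rally or any OpenStack API: `FakeStoragePool` and `run_concurrently` are hypothetical names, and the in-memory pool stands in for a real storage backend.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeStoragePool:
    """Hypothetical in-memory stand-in for a constrained storage backend."""

    def __init__(self, capacity_gb: int) -> None:
        self.free_gb = capacity_gb
        self._lock = threading.Lock()

    def allocate(self, size_gb: int) -> bool:
        # The lock makes check-and-decrement atomic; without it, two
        # simultaneous requests could both pass the capacity check.
        with self._lock:
            if self.free_gb >= size_gb:
                self.free_gb -= size_gb
                return True
            return False


def run_concurrently(func, args_list):
    """Call func once per argument tuple, releasing all calls together."""
    barrier = threading.Barrier(len(args_list))

    def worker(args):
        barrier.wait()  # hold every thread until all of them are ready
        return func(*args)

    # One worker thread per call, so every call can reach the barrier.
    with ThreadPoolExecutor(max_workers=len(args_list)) as pool:
        return list(pool.map(worker, args_list))


if __name__ == "__main__":
    storage = FakeStoragePool(capacity_gb=10)
    # Five simultaneous 4 GB requests against 10 GB of free space.
    results = run_concurrently(storage.allocate, [(4,)] * 5)
    # Sequentially, exactly two requests would succeed and 2 GB remain;
    # the concurrent run must match that outcome.
    assert results.count(True) == 2
    assert storage.free_gb == 2
```

The barrier is a design choice to make the requests start as close together as possible; a load-generation tool would normally take over that role in a real test run.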

Requirements

None.

Rejected User Stories / Usage Scenarios

None.

Glossary

None.