..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=============================================================
Cinder Volume Active/Active support - General description
=============================================================

https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support

Right now the cinder-volume service can only run in an Active/Passive HA
fashion. This spec proposes a possible path to supporting Active/Active
configurations in Cinder Volume.

This spec only provides a general description of the problem, enumerating the
different issues we have to resolve without going into too much detail.  It is
more of a bird's-eye view of the problem.

Each specific issue will have its own spec that gives a detailed description of
the problem and a proposed solution.


Problem description
===================

Right now the cinder-volume service only supports Active/Passive High
Availability configurations, and a number of things need to change for it to
support Active/Active configurations.

API Races
---------

With the current code, API nodes are open to races that affect resources in the
database, and this will be exacerbated when working with Active/Active
configurations.

Local Manager Locks
-------------------

We have multiple local locks in the manager code of the volume nodes to prevent
multiple green threads from accessing the same resource on specific operations.

This locking is local to the nodes and doesn't extend to other nodes, so we
need to solve mutual exclusion among volume nodes of the same cluster.

Job distribution
----------------

Cinder has no concept of clusters; it only has the concept of hosts, and each
host implements a specific backend/service.  A mechanism is needed to group
hosts from the same cluster under the same conceptual unit while retaining the
individual identities of the nodes belonging to the cluster, so that they can
be told apart when cleaning up after crashed nodes.

Cleanup
-------

Right now only one node can work on a specific backend, and therefore on the
resources that it contains, so the cleanup is done by the node itself on
startup.  If the node does not come up and the resources are left in a stale
state, it is not a big deal.

It is different with an Active/Active deployment, since multiple nodes are
sharing the same storage back-end and a node can only do cleanup for the
resources it was working on when it died/failed.

It is also important to do proper cleanup even when a specific node does not
come back to life, since other nodes from the same cluster can still manage
those resources.

Data Corruption Prevention
--------------------------

Since multiple nodes will be accessing the same storage back-end, we have to be
extra careful not to access resources that are being accessed by other nodes.

The most relevant case is when we lose connection to the DB and can no longer
send Service Heartbeats, since the Scheduler's cleanup process (explained in
the Cleanup proposed changes) will come into play and we could have 2 different
nodes accessing the same resource, one because it's still working on it and the
other because it is trying to do the cleanup.

Drivers' Locks
--------------

Some drivers require mutual exclusion for certain operations or when accessing
the same resources.

This mutual exclusion is currently implemented with local locks, in the same
way the manager does it, and it needs to keep working when multiple nodes are
accessing the same storage back-end.


Use Cases
=========

Operators that have hard requirements (SLAs or other reasons) to keep their
cloud operational at all times, or that have higher throughput requirements,
will want the possibility to configure their deployments with an Active/Active
configuration.


Proposed change
===============

API Races
---------

Races on the API nodes will be removed using compare-and-swap updates to the
DB.

- Specs: https://review.openstack.org/207101/
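
As an illustration only, a compare-and-swap update can be thought of as an
``UPDATE`` that is conditional on the value previously read.  The sketch below
uses an in-memory SQLite database; the ``volumes`` table, its columns and the
``conditional_update`` helper are assumptions made for this example and are not
taken from the Cinder code or from the referenced spec.

```python
import sqlite3


def conditional_update(conn, volume_id, expected_status, new_status):
    """Change the status only if it still equals expected_status."""
    cursor = conn.execute(
        "UPDATE volumes SET status = ? WHERE id = ? AND status = ?",
        (new_status, volume_id, expected_status),
    )
    conn.commit()
    # rowcount is 1 when the row matched and was updated, and 0 when another
    # request changed the status first.
    return cursor.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO volumes VALUES ('vol-1', 'available')")

# Two API requests both saw the volume as 'available'; only one may win.
first = conditional_update(conn, "vol-1", "available", "deleting")
second = conditional_update(conn, "vol-1", "available", "attaching")
final_status = conn.execute(
    "SELECT status FROM volumes WHERE id = 'vol-1'").fetchone()[0]
```

The check and the write happen in a single statement, so the losing request
gets a clear failure instead of silently overwriting the winner's change.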

Job distribution
----------------

Job distribution will add the concept of cluster to Cinder and send jobs
through a message queue topic that is based on the cluster instead of on the
host, as we are doing now.

- Specs: https://review.openstack.org/232595
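
A rough sketch of the idea, assuming an in-memory stand-in for the message
broker: nodes of the same cluster consume from one shared cluster topic while
keeping their own host identity.  The topic naming, the ``VolumeNode`` class
and the ``topic_for`` helper are hypothetical and only meant to illustrate the
routing.

```python
import queue
from collections import defaultdict

# Hypothetical stand-in for the message broker: one FIFO queue per topic.
topics = defaultdict(queue.Queue)


def topic_for(service, name):
    """Build a topic name from the service and a cluster (or host) name."""
    return "%s.%s" % (service, name)


class VolumeNode:
    def __init__(self, host, cluster):
        self.host = host        # individual identity, still needed for cleanup
        self.cluster = cluster  # shared identity, used for job routing
        self.handled = []

    def consume_one(self):
        """Take the next job from the cluster topic, if there is one."""
        topic = topic_for("cinder-volume", self.cluster)
        try:
            job = topics[topic].get_nowait()
        except queue.Empty:
            return None
        self.handled.append(job)
        return job


node_a = VolumeNode(host="node-a", cluster="cluster-1")
node_b = VolumeNode(host="node-b", cluster="cluster-1")

# The sender addresses the cluster, not a specific host.
for job in ("create vol-1", "delete vol-2"):
    topics[topic_for("cinder-volume", "cluster-1")].put(job)

node_a.consume_one()
node_b.consume_one()
```

Either node can pick up a job sent to the cluster, which is what allows the
work to be shared between them.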

Cleanup
-------

Cleanup will keep track of resources that have ongoing operations and will
have cleanup mechanisms on the Scheduler as well as the Volume nodes.

Cleanup on the nodes will happen on initialization as it is doing now but we'll
also have an automatic cleanup job on the scheduler for the cases where a node
with the same host name is not brought up.

The automatic cleanup mechanism will be disabled by default, and it will be
possible to trigger it manually.

- Specs: https://review.openstack.org/236977
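
One possible way to picture the tracking is a table that records which node is
working on which resource, combined with the service heartbeats to find work
left behind by a dead node.  The schema, the function names and the integer
timestamps below are assumptions for the sake of the example; the actual design
is described in the referenced spec.

```python
import sqlite3

# Hypothetical schema: the real tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE services (host TEXT PRIMARY KEY, last_heartbeat INTEGER);
    CREATE TABLE workers (resource_id TEXT PRIMARY KEY, host TEXT,
                          operation TEXT);
""")


def start_operation(resource_id, host, operation):
    """Record that a node started working on a resource."""
    conn.execute("INSERT INTO workers VALUES (?, ?, ?)",
                 (resource_id, host, operation))


def finish_operation(resource_id):
    """Forget the resource once its operation completed normally."""
    conn.execute("DELETE FROM workers WHERE resource_id = ?", (resource_id,))


def stale_operations(now, timeout):
    """Return operations owned by hosts whose heartbeat has expired."""
    rows = conn.execute(
        "SELECT w.resource_id, w.host, w.operation FROM workers w "
        "JOIN services s ON s.host = w.host "
        "WHERE ? - s.last_heartbeat > ? ORDER BY w.resource_id",
        (now, timeout))
    return rows.fetchall()


conn.execute("INSERT INTO services VALUES ('node-a', 100)")
conn.execute("INSERT INTO services VALUES ('node-b', 40)")  # stopped reporting

start_operation("vol-1", "node-a", "creating")
start_operation("vol-2", "node-b", "deleting")
start_operation("vol-3", "node-b", "creating")
finish_operation("vol-3")  # completed before node-b died

to_clean = stale_operations(now=105, timeout=60)
```

Only the operation that ``node-b`` left unfinished is reported, so another
member of the cluster (or the scheduler) knows exactly what needs cleaning up.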

Data Corruption Prevention
--------------------------

A node that loses its connection to the DB will stop listening for new jobs
from the Message Broker and halt all ongoing operations, so that it is no
longer accessing resources on the Storage Backend.

- Specs: https://review.openstack.org/237076
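
The behaviour can be sketched as a node that fences itself off when its
heartbeat to the DB fails.  Everything in this example (the ``VolumeNode``
class, the ``DBConnectionError`` exception and the way operations are halted)
is hypothetical and deliberately simplified; it does not reflect how the
referenced spec implements it.

```python
class DBConnectionError(Exception):
    """Hypothetical error raised when the database is unreachable."""


class VolumeNode:
    def __init__(self, send_heartbeat):
        self._send_heartbeat = send_heartbeat  # callable that may raise
        self.accepting_jobs = True
        self.ongoing = {"vol-1": "running", "vol-2": "running"}

    def heartbeat(self):
        """Report liveness; fence the node off if the DB is unreachable."""
        try:
            self._send_heartbeat()
        except DBConnectionError:
            self.fence()

    def fence(self):
        # Stop taking new work first, then halt what is in flight, so this
        # node no longer touches the storage back-end.
        self.accepting_jobs = False
        for resource_id in self.ongoing:
            self.ongoing[resource_id] = "halted"

    def accept(self, job):
        """Start a new job unless the node has been fenced off."""
        if not self.accepting_jobs:
            return False
        self.ongoing[job] = "running"
        return True


def failing_heartbeat():
    raise DBConnectionError("lost connection to the DB")


node = VolumeNode(failing_heartbeat)
node.heartbeat()
accepted = node.accept("vol-3")
```

Once fenced, the node refuses new jobs and no longer works on the resources it
held, which leaves them free for whoever performs the cleanup.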

Manager Local Locks
-------------------

The default solution will use a DLM with TooZ as the abstraction layer:

- Specs: https://review.openstack.org/202615

An alternative solution, initially left as a nice-to-have, will be available
for systems that don't want to install a DLM and are using drivers that don't
require distributed locking for Active/Active configurations.  This solution
replaces the local file locks in c-vol's manager with a DB locking mechanism
that uses the ``workers`` DB table (introduced by the Cleanup changes).

- Specs: https://review.openstack.org/237602
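
A minimal sketch of how a DB table can provide mutual exclusion, assuming a
unique key on the resource: inserting the row acquires the lock and deleting it
releases it.  The ``workers`` schema shown here and the ``acquire`` and
``release`` helpers are assumptions for illustration; the real table layout is
defined in the Cleanup spec.

```python
import sqlite3

# Hypothetical, reduced version of the table; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workers (resource_id TEXT PRIMARY KEY, host TEXT)")


def acquire(resource_id, host):
    """Try to take the lock; the primary key rejects a second owner."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO workers VALUES (?, ?)",
                         (resource_id, host))
    except sqlite3.IntegrityError:
        return False
    return True


def release(resource_id, host):
    """Release the lock, but only if this host is the one holding it."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM workers WHERE resource_id = ? AND host = ?",
            (resource_id, host))
    return cursor.rowcount == 1


got_a = acquire("vol-1", "node-a")
got_b = acquire("vol-1", "node-b")          # node-a still holds the lock
wrong_release = release("vol-1", "node-b")  # node-b is not the owner
right_release = release("vol-1", "node-a")
got_b_later = acquire("vol-1", "node-b")
```

Because the row also names the owning host, the same record can tell the
cleanup process which node was working on the resource.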

Drivers' Locks
--------------

We will be using a DLM solution with TooZ as the abstraction layer:

- Specs: https://review.openstack.org/202615

Alternatives
------------

There are quite a number of alternatives for each of the issues we need to fix,
and they are discussed in the respective specs.  The exception is the Drivers'
Locks alternative, which creates a generic locking mechanism by extending the
one implemented to remove `Manager Local Locks`_.

- Specs: https://review.openstack.org/237604


Data model impact
-----------------

Discussed in the respective specs.

REST API impact
---------------

Discussed in the respective specs.

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

Discussed in the respective specs.

Other deployer impact
---------------------

Discussed in the respective specs.

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Discussed in the respective specs.

Work Items
----------

- API Races
- Job distribution
- Cleanup
- Data Corruption Prevention
- Manager Local Locks
- Drivers' Locks

Dependencies
============

None


Testing
=======

Discussed in the respective specs.


Documentation Impact
====================

Discussed in the respective specs.


References
==========

None