Update pacemaker and corosync infrastructure (Corosync 2.x)

https://blueprints.launchpad.net/fuel/+spec/corosync-2

The next iteration of Corosync and Pacemaker improvements, driven by scaling requirements, better Pacemaker management, and support for new operating systems.

Problem description

The current Pacemaker implementation has some limitations:

It does not allow deploying a large number of OpenStack Controllers
CIB operations use almost 100% of the CPU on the Controller
The Corosync shutdown process takes a long time
No support for newer OSes such as CentOS 7 or Ubuntu 14.04
The current Fuel architecture is limited to Corosync 1.x and Pacemaker 1.x
The Pacemaker service can run only as a plugin of the Corosync service. Pacemaker cannot be restarted separately from Corosync, and vice versa.
The Fuel fork of the corosync module contains many tunings for parallel deployment of controllers. These cannot be contributed upstream yet because the code bases have diverged too far.

Proposed change

Support Fuel Controllers with Corosync 2.3.3 and Pacemaker 1.1.12 packages on CentOS 6.5 and Ubuntu 14.04
Run the Pacemaker service separately from Corosync (ver:1)
Take the puppet corosync module from puppetlabs and integrate it. This allows installing and configuring a Corosync cluster with Pacemaker without spending additional resources on code maintenance.
Move all custom Fuel changes to the corosync and pacemaker providers into a separate pacemaker module, so that custom changes do not interfere with the upstream code.
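
As an illustration of what a Corosync 2.x configuration for a set of controllers might look like, the sketch below renders a minimal corosync.conf with an explicit nodelist. The function name, the cluster name, the addresses, and the choice of the udpu transport are assumptions made for this example; they are not taken from the Fuel manifests or the puppetlabs module. The output contains no Pacemaker plugin section, on the understanding that with Corosync 2.x Pacemaker runs as its own daemon.

```python
def render_corosync_conf(cluster_name, ring0_addrs):
    """Render a minimal Corosync 2.x corosync.conf as a string.

    Hypothetical helper: cluster_name and ring0_addrs are illustrative
    inputs, and udpu (unicast UDP) is an assumed transport choice.
    """
    lines = [
        "totem {",
        "  version: 2",
        "  cluster_name: %s" % cluster_name,
        "  transport: udpu",
        "}",
        "",
        "nodelist {",
    ]
    # One node block per controller; node IDs are assigned from 1 upwards.
    for nodeid, addr in enumerate(ring0_addrs, start=1):
        lines += [
            "  node {",
            "    ring0_addr: %s" % addr,
            "    nodeid: %d" % nodeid,
            "  }",
        ]
    lines += [
        "}",
        "",
        "quorum {",
        "  provider: corosync_votequorum",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses only.
    print(render_corosync_conf("fuel-ha", ["10.0.0.1", "10.0.0.2", "10.0.0.3"]))
```

Generating the nodelist from one list of addresses keeps node IDs consistent across controllers, provided every controller renders the file from the same ordered list.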

Alternatives

Continue to develop and support the Fuel fork of the corosync module to make it compatible with Corosync 2, without help from the Puppet community
Leave the Corosync 1.x infrastructure as is

Data model impact

None

REST API impact

None

Upgrade impact

Corosync 2.x is NOT compatible with previous versions of Corosync [0]. All nodes must be upgraded at once (full-downtime patching).
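
Because a cluster must not mix Corosync 1.x and 2.x nodes, a pre-flight check along the following lines could be run before bringing the cluster back up. This is a minimal sketch: the function names and the node-to-version mapping are hypothetical, and collecting the installed version from each node is left out.

```python
def corosync_major_versions(node_versions):
    """Group node names by Corosync major version.

    node_versions maps a node name to a version string such as "2.3.3".
    The mapping itself is an assumed input for this example.
    """
    groups = {}
    for node, version in sorted(node_versions.items()):
        major = int(version.split(".")[0])
        groups.setdefault(major, []).append(node)
    return groups


def is_mixed_cluster(node_versions):
    """Return True if nodes report more than one Corosync major version."""
    return len(corosync_major_versions(node_versions)) > 1


if __name__ == "__main__":
    # Example data only: one controller was left on Corosync 1.x.
    versions = {"node-1": "2.3.3", "node-2": "2.3.3", "node-3": "1.4.6"}
    if is_mixed_cluster(versions):
        print("Mixed Corosync versions:", corosync_major_versions(versions))
```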

Security impact

None

Notifications impact

None

Other end user impact

If the Corosync service is started or restarted, the Pacemaker service should be (re)started right after it. Otherwise, the inter-service communication layer will be broken.
The Corosync service cannot be stopped gracefully before the Pacemaker service. When shutting down, stop the Pacemaker service first.
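
The ordering rules above can be sketched as a small helper that returns the commands in a safe sequence: Corosync before Pacemaker on start, Pacemaker before Corosync on stop. The service names and the "service <name> <action>" command form are assumptions for this example and may differ per distribution.

```python
# Start order: Corosync first, then Pacemaker. Stop order is the reverse.
# The service names below are assumed, not taken from the Fuel manifests.
START_ORDER = ("corosync", "pacemaker")


def service_commands(action):
    """Return the commands for `action` in an order that respects the rules."""
    if action == "start":
        order = START_ORDER
    elif action == "stop":
        order = tuple(reversed(START_ORDER))
    elif action == "restart":
        # Restarting Corosync implies stopping and starting both services.
        return service_commands("stop") + service_commands("start")
    else:
        raise ValueError("unsupported action: %r" % (action,))
    return ["service %s %s" % (name, action) for name in order]


if __name__ == "__main__":
    for command in service_commands("restart"):
        print(command)
```

The helper only builds the command list; running the commands and checking their exit codes is left to the caller.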

Performance Impact

The deployment process will take less time because CIB operations will no longer consume 100% of CPU time
Corosync 2 has many improvements that allow clusters of up to 100 Controllers. Corosync 1.0 scales up to 10-16 nodes

Other deployer impact

None

Developer impact

All changes to the custom pacemaker providers should go to the separate pacemaker module.
Any changes not related to the providers should be made in the corosync module and contributed upstream as well

Implementation

Assignee(s)

Primary assignees: sgolovatiuk@mirantis.com, bdobrelya@mirantis.com

Other contributors: dilyin@mirantis.com

Work Items

Replace the Corosync 1.x infrastructure with Corosync 2.3.3 and Pacemaker 1.1.12 on the staging mirrors
Adapt the puppet modules for corosync and pacemaker to Corosync 2.x
Synchronize the corosync manifests with puppetlabs as well
Push the staging mirrors to the public ones once the manifests are ready

Dependencies

Corosync 2.3.3 and Pacemaker 1.1.12 packages with their dependency libraries

Testing

Standard swarm testing is required.
Manual HA testing is required.
Rally testing is preferred but not mandatory.

Acceptance criteria

OpenStack clouds deployed by Fuel pass OSTF tests with Corosync 2.

Documentation Impact

The High Availability guide should be reviewed. For Ubuntu, the crm tool stays as is, but the documentation should also be extended with pcs equivalents for CentOS
The upgrade/patching impact should be described: upgrading to Corosync 2.x assumes full downtime for the cloud

References

[0] http://lists.corosync.org/pipermail/discuss/2012-April/001456.html
