https://blueprints.launchpad.net/fuel/+spec/mos-node-reinstallation
Node reinstallation allows full or partial recovery of failed nodes using the standard Fuel processes ‘provision’ and ‘deploy’.
- Full reinstallation: all data is purged from the reinstalled node.
- Partial reinstallation: the OS is reinstalled, but some data can be preserved.
When only the system should be reinstalled from scratch (partial reinstallation), the Partition Preservation feature should be enabled.
(https://blueprints.launchpad.net/fuel/+spec/partition-preservation)
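The difference between the two modes can be illustrated with a minimal sketch. It is not Fuel code: the `Volume` class, the `keep_data` flag, and the volume names are hypothetical and stand in for whatever mechanism Partition Preservation uses to mark data that must survive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    """Hypothetical model of a node volume (not a Fuel data structure)."""
    name: str
    keep_data: bool = False  # assumed marker for data that should be preserved


def volumes_to_wipe(volumes: List[Volume], partial: bool) -> List[str]:
    """Return the names of volumes whose data is purged on reinstallation."""
    if not partial:
        # Full reinstallation: everything on the node is purged.
        return [v.name for v in volumes]
    # Partial reinstallation: only volumes not marked for preservation are purged.
    return [v.name for v in volumes if not v.keep_data]


volumes = [
    Volume("os"),                    # the OS is always reinstalled
    Volume("mysql", keep_data=True),
    Volume("vm", keep_data=True),
]
print(volumes_to_wipe(volumes, partial=False))
print(volumes_to_wipe(volumes, partial=True))
```

In this sketch the OS volume is wiped in both modes, which matches the statement that the OS is reinstalled even when some data is preserved.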
Currently Fuel does not fully support node reinstallation. Slave nodes can’t be restored after a failure, including but not limited to MongoDB failures, Galera failures, update failures, and upgrade failures.
The reinstallation feature depends on several changes that should be implemented:
- Partition Preservation, which will be implemented separately (https://blueprints.launchpad.net/fuel/+spec/partition-preservation).
- Node renaming (https://blueprints.launchpad.net/fuel/+spec/node-naming).
- MongoDB recovery in case of failure (assumed to be fixed in 7.0).
- Swift ring sync during redeploy (assumed to be fixed in 7.0).
Reinstallation process:
None
None
The API will not change. The reinstallation process will use the standard API calls: provision and deploy.
API changes will be made as part of partition preservation (https://blueprints.launchpad.net/fuel/+spec/partition-preservation).
Node renaming (https://blueprints.launchpad.net/fuel/+spec/node-naming).
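As an illustration of reusing the standard calls, the sketch below builds the two requests in the order reinstallation would issue them: provision first, then deploy. The endpoint paths, the `nodes` query parameter, the base URL, and the function name `reinstall_requests` are assumptions made for this example and should be checked against the Nailgun API of the target Fuel release. No request is actually sent.

```python
from typing import List, Tuple


def reinstall_requests(base_url: str, cluster_id: int,
                       node_ids: List[int]) -> List[Tuple[str, str]]:
    """Build (method, url) pairs for reinstalling the selected nodes.

    The URL layout is an assumption for illustration, not a confirmed API.
    """
    nodes = ",".join(str(n) for n in node_ids)
    prefix = f"{base_url.rstrip('/')}/api/clusters/{cluster_id}"
    return [
        ("PUT", f"{prefix}/provision?nodes={nodes}"),  # step 1: reinstall the OS
        ("PUT", f"{prefix}/deploy?nodes={nodes}"),     # step 2: redeploy the roles
    ]


# Hypothetical master node address and IDs, used only to show the output shape.
for method, url in reinstall_requests("http://10.20.0.2:8000", 1, [3, 4]):
    print(method, url)
```

Keeping the two steps as separate requests mirrors the spec's intent that reinstallation introduces no new API and only sequences the existing provision and deploy calls.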
None
None
None
None
Reinstallation with partition preservation should improve the deployment stage: synchronization time for the Swift, MySQL, and MongoDB services should be shorter. When a compute node is reinstalled using the partition preservation method, migration of VM images is not required.
None
None
None
None
| Primary Assignee: | Ivan Ponomarev |
|---|---|
| QA: | Dmitriy Kruglov |
| Mandatory design review: | Vladimir Kuklin |
No strict dependencies
- It is possible to perform a full reinstallation (all data is purged) of a failed slave node to return it to its previous working state.
- It is possible to perform a partial reinstallation (some data is preserved) of a failed slave node to return it to its previous working state.
Scenarios to automate
Reinstall single compute:
Reinstall single controller:
Reinstallation of full cluster:
Reinstallation documentation will be added to the User Guide.