Improve the storage of the ActionPlan¶
https://blueprints.launchpad.net/watcher/+spec/planner-storage-action-plan
Problem description¶
The action plan is built by the default planner of the decision-engine service. Each action is linked to the next one, so the default planner produces a linked list of actions, which the applier service then executes. Running actions strictly in sequence hurts performance, especially when the plan contains many actions: every action must complete before the next one starts, so launching a series of migration actions takes a long time.
Use Cases¶
As an administrator, I would like to be able to run an action plan that will launch actions for every node (compute, network, storage) in parallel.
Proposed change¶
Actions that share the same action weight can be launched in parallel.
This specification defines two planners with the same input/output dataflow: the weight planner and the workload stabilization planner. Each planner should use a new parents attribute that stores the list of actions the action is linked to, but the two planners fill this attribute in different ways.
The weight planner provides a uniform way to parallelize the execution of actions. It knows nothing about the structure of the actions or how they relate to each other. Its main goal is to build sets of actions ordered by weight: high-weight actions are planned before low-weight ones. Parallelization is the ability to execute several actions of the same action type in parallel. The parallelization factor is also limited by another taskflow parameter named max_worker. The administrator can specify the weight and the parallelization of each action type in the Watcher configuration file. The following directed acyclic graph shows an example for this planner:
  ---sleep2    ---migration1    ---resize1
 /         \  /             \  /          \
I----sleep1-------migration2------resize2----resize3----E
              \             /
               ---migration3
This graph is built with the following settings:
| Action type | Weight | Parallelization |
|---|---|---|
| Sleep | 70 | 2 |
| Migration | 60 | 3 |
| Resize | 50 | 2 |
Since there are three resize actions and only two are allowed to be executed at the same time, the third resize action will be executed only when the previous actions are done.
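The batching described above can be sketched in a few lines of Python. This is only an illustration, not the planner's implementation: the function name `plan_by_weight`, the `(name, type)` tuples, and the rule that every action in a batch takes the whole previous batch as its parents are assumptions made for the example. The weights and parallelization values come from the table.

```python
from itertools import groupby

# Settings from the table above: action type -> weight / parallelization.
WEIGHTS = {"sleep": 70, "migration": 60, "resize": 50}
PARALLELIZATION = {"sleep": 2, "migration": 3, "resize": 2}


def plan_by_weight(actions):
    """Return {action name: parent names} for a list of (name, type) pairs.

    Hypothetical helper: each batch depends on the whole previous batch.
    """
    # Highest weight first. The sort is stable, so the input order is kept
    # among actions of the same type.
    ordered = sorted(
        actions, key=lambda a: (WEIGHTS[a[1]], a[1]), reverse=True)
    parents = {}
    previous_batch = []
    for a_type, group in groupby(ordered, key=lambda a: a[1]):
        names = [name for name, _ in group]
        limit = PARALLELIZATION[a_type]
        # Split the actions of one type into batches of at most `limit`.
        for start in range(0, len(names), limit):
            batch = names[start:start + limit]
            for name in batch:
                parents[name] = list(previous_batch)
            previous_batch = batch
    return parents


actions = [
    ("sleep1", "sleep"), ("sleep2", "sleep"),
    ("migration1", "migration"), ("migration2", "migration"),
    ("migration3", "migration"),
    ("resize1", "resize"), ("resize2", "resize"), ("resize3", "resize"),
]
plan = plan_by_weight(actions)
print(plan["resize3"])  # the third resize waits for the first two
```

With the example settings, the two sleep actions have no parents, the three migrations depend on both sleeps, and `resize3` depends on `resize1` and `resize2`, which matches the graph.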
The weight planner is easy to configure and should become the new default planner.
The workload stabilization planner takes the characteristics of each action type into account by letting the action type influence how parents is built. Some action types, such as resizing an instance, should run only before or after a migration of the same instance, not in parallel with it. The following constraints apply:
Unlinked actions are performed in parallel.
Only one action can be performed per instance at a time.
If there is more than one action per instance, the linked actions must be executed in sequence with respect to their action weights.
To support this, a new parents attribute stores the list of actions that the current action is linked to. The result can be represented as a directed acyclic graph:
    ----------------a1(t=dis)-----------
   /                                    \
  |                                      \
  | ----------------a2(t=mig)-------------\
  |/                                       \
I-----a3(t=mig) -- a31(t=res)---------------E
  |\                                       /
  | ----------a4(t=resize)----------------/
  |                                      /
   \                                    /
    a5(resource=123, -- a51(resource=123,
       t=mig)              t=resize)
Here we can see the following links:
a1 disables the compute node and is not linked to any other action. It can run in parallel with other actions (there is no constraint).
a2 migrates an instance and is not linked to any other action. It can run in parallel with other actions (there is no constraint).
a3 migrates an instance and is the parent of a31. It can run in parallel with other actions.
a31 resizes an instance after that instance has been migrated by a3.
a4 resizes an instance and is not linked to any other action. It can run in parallel with other actions (there is no constraint).
a5 migrates an instance and is the parent of a51. It can run in parallel with other actions.
a51 resizes the same instance after it has been migrated by a5.
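To make the effect of these links concrete, the sketch below groups the actions of the graph into "waves" that could run in parallel once their parents have finished. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) as a stand-in scheduler; it is not the applier's taskflow-based engine, and the helper name `execution_waves` is hypothetical.

```python
from graphlib import TopologicalSorter

# parents[x] lists the actions that must finish before x may start,
# taken from the links described above.
parents = {
    "a1": [], "a2": [], "a3": [], "a4": [], "a5": [],
    "a31": ["a3"],
    "a51": ["a5"],
}


def execution_waves(parents):
    """Group actions into waves; all actions in a wave can run in parallel."""
    sorter = TopologicalSorter(parents)
    sorter.prepare()  # raises graphlib.CycleError if the links form a cycle
    waves = []
    while sorter.is_active():
        # Every action whose parents are all done is ready to start.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(parents)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

For this graph the first wave contains a1, a2, a3, a4 and a5, and the second wave contains a31 and a51. A real executor would not wait for a whole wave: a31 can start as soon as a3 is done.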
As the graph shows, all actions that are independent of each other can be performed in parallel. Actions that are linked to the same resource, however, are performed in order of their action weights. Currently, the following weights can be defined:
End of graph - 0
Disable Compute Node - 1
Resize instance - 2
Migrate instance - 3
Initial point - 4
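Under these weights, the per-resource chaining can be sketched as follows. This is a simplified illustration rather than the planner's code: the helper `link_by_resource`, the short type names, and every resource identifier except 123 are assumptions introduced for the example.

```python
from collections import defaultdict

# Weights listed above (the initial point and end of graph are omitted).
WEIGHTS = {"disable": 1, "resize": 2, "migrate": 3}


def link_by_resource(actions):
    """Return {uuid: [parent uuids]}, chaining actions that share a resource."""
    by_resource = defaultdict(list)
    for action in actions:
        by_resource[action["resource_id"]].append(action)

    parents = {}
    for chain in by_resource.values():
        # Higher weight runs first, so a migration precedes a resize.
        chain.sort(key=lambda a: WEIGHTS[a["type"]], reverse=True)
        previous = None
        for action in chain:
            parents[action["uuid"]] = [previous] if previous else []
            previous = action["uuid"]
    return parents


# Resource ids other than "123" are hypothetical placeholders.
actions = [
    {"uuid": "a1", "type": "disable", "resource_id": "node-1"},
    {"uuid": "a2", "type": "migrate", "resource_id": "vm-2"},
    {"uuid": "a31", "type": "resize", "resource_id": "vm-3"},
    {"uuid": "a3", "type": "migrate", "resource_id": "vm-3"},
    {"uuid": "a4", "type": "resize", "resource_id": "vm-4"},
    {"uuid": "a5", "type": "migrate", "resource_id": "123"},
    {"uuid": "a51", "type": "resize", "resource_id": "123"},
]
parents = link_by_resource(actions)
print(parents["a31"], parents["a51"])
```

Actions on different resources end up with no parents and stay parallel, while a31 and a51 each get the migration of the same instance as their only parent.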
The following pseudocode sketches part of the workload stabilization planner’s workflow:
# Each action is a dict with the keys: uuid, action_type, resource_id,
# metadata, parents.
init = Flow()
action_weights = {
    'turn_host_to_acpi_s3_state': 0,
    'resize': 1,
    'migrate': 2,
    'sleep': 3,
    'change_nova_service_state': 4,
    'nop': 5,
}
# Plan the actions in descending order of weight.
actions = sorted(actions,
                 key=lambda a: action_weights[a['action_type']],
                 reverse=True)
for action in actions:
    a_type = action['action_type']
    if a_type != 'turn_host_to_acpi_s3_state':
        db_action = self._create_action(context, action)
        plugin_action = self.load_child_class(db_action.action_type)
        # Let the action plugin decide which existing actions it depends on.
        parents = plugin_action.validate_parents(resource_action_map, action)
        if parents:
            db_action.parents = parents
            db_action.save()
    else:
        # An action that makes the host unreachable must wait until all
        # resize and migration actions related to that host are complete.
        parent_actions = get_actions(metadata=action['metadata']['host'])
        resize_actions = [x for x in parent_actions
                          if x['action_type'] == 'resize']
        migration_actions = [x for x in parent_actions
                             if x['action_type'] == 'migrate']
        resize_migration_parents = [x['parents'] for x in resize_actions]
        # Resize actions weigh less than migrations, so they may already have
        # migration actions as parents; connect them to the
        # turn_host_to_acpi_s3_state action first.
        action_parents = [x['uuid'] for x in resize_actions]
        # Add migrations that are not already linked to a resize action.
        action_parents.extend(
            x['uuid'] for x in migration_actions
            if [x['uuid']] not in resize_migration_parents)
        db_action = create_action(action, parents=action_parents)
This spec is limited to a simple chained list of actions as the action plan. The second part of the work, which modifies how action plans are executed, will contain the graph for parallel execution of action plans.
Alternatives¶
None
Data model impact¶
next column should be removed from the Action table.
parents column should be added to the Action table. Type: JSON.
first_action_id column should be removed from the Action Plan table.
ActionPlan object major version should probably be updated to 2.0.
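As an illustration of how existing data could map onto the new column, the sketch below walks an old next-linked chain starting at first_action_id and produces a JSON parents list for each action. It is not the project's migration script: the function `chain_to_parents`, the in-memory dict standing in for the Action table, and storing the list as a JSON-encoded string are all assumptions.

```python
import json


def chain_to_parents(first_action_id, next_by_id):
    """Walk the old linked list and return {action id: JSON parents list}.

    `next_by_id` is a hypothetical stand-in for the Action table's old
    `next` column: it maps an action id to the id of the following action.
    """
    parents = {}
    previous = None
    current = first_action_id
    seen = set()
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at action {current!r}")
        seen.add(current)
        # In a plain chain, each action has at most one parent.
        parents[current] = json.dumps(
            [previous] if previous is not None else [])
        previous, current = current, next_by_id.get(current)
    return parents


old_chain = {"a1": "a2", "a2": "a3", "a3": None}
converted = chain_to_parents("a1", old_chain)
print(converted)
```

A converted chain is still strictly sequential; the parallelism only appears once one of the new planners builds the parents lists.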
REST API impact¶
None
Security impact¶
None
Notifications impact¶
None
Other end user impact¶
New configuration parameters in watcher.conf:
[watcher_planner]
planner = weight
#planner = workload_stabilization

[watcher_planners.weight]
#weights = turn_host_to_acpi_s3_state:10,resize:20,migrate:30,sleep:40,change_nova_service_state:50,nop:60
#parallelization = turn_host_to_acpi_s3_state:2,resize:2,migrate:2,sleep:1,change_nova_service_state:1,nop:1

[watcher_planners.workload_stabilization]
#weights = turn_host_to_acpi_s3_state:10,resize:20,migrate:30,sleep:40,change_nova_service_state:50,nop:60
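The weights and parallelization values are comma-separated name:value pairs. The sketch below shows one way such a string could be turned into a dictionary. It is a plain-Python illustration with a hypothetical helper name, `parse_pairs`; it does not describe how Watcher or oslo.config actually load these options.

```python
def parse_pairs(raw):
    """Parse 'name:int,name:int' into a dict, ignoring surrounding spaces."""
    result = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        name, _, value = item.partition(":")
        # int() raises ValueError if the value is missing or not a number.
        result[name.strip()] = int(value)
    return result


weights = parse_pairs(
    "turn_host_to_acpi_s3_state:10,resize:20,migrate:30,sleep:40, "
    "change_nova_service_state:50,nop:60"
)
print(weights["migrate"])
```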
Performance Impact¶
None
Other deployer impact¶
Two new planner extensions will be added. Watcher must be properly reinstalled by running pip install [-e] so that they become available.
Developer impact¶
None
Implementation¶
Assignee(s)¶
Primary assignee: Alexander Chadin <a.chadin@servionica.ru>
Other contributors: Vincent Francoise <Vincent.FRANCOISE@b-com.com>
Work Items¶
Update the data model in accordance with the proposed changes (in practice, the API and objects).
Remove the default planner.
Add watcher/decision_engine/planner/weight.py and watcher/decision_engine/planner/workload_stabilization.py.
Make the weight planner the default.
Update the documentation.
Add appropriate unit tests.
Dependencies¶
https://blueprints.launchpad.net/watcher/+spec/plugins-parameters
Testing¶
Unit tests will be added to validate these modifications.
Documentation Impact¶
Update the default planner documentation to reflect the new changes.
References¶
History¶
None