Check the destination host when migrating or evacuating¶
https://blueprints.launchpad.net/nova/+spec/check-destination-on-migrations
Provide a way to make sure that resource allocation is consistent for all operations, even if a destination host is provided.
Problem description¶
Live migrations and evacuations let the caller optionally specify a destination host. When a host is provided, the scheduler is completely bypassed and the destination compute's RPC API is called directly.
Unfortunately, migrating a VM this way can break the scheduler's rules and therefore potentially break future boot requests, because some constraints (like allocation ratios) are not enforced when migrating or evacuating.
We should modify that logic to explicitly call the scheduler any time a move (i.e. either a live migration or an evacuation) is requested, whether or not a destination host is provided, so that the scheduler verifies the destination host through all the enabled filters and, if successful, consumes the instance usage from its internal HostState.
That said, we also understand that there are use cases where an operator wants to move an instance manually and not call the scheduler, even when the operator knows that this explicitly breaks scheduler rules (e.g. a filter not passing, an affinity policy being violated, or an instance taking an already allocated pCPU in the context of CPU pinning).
Use Cases¶
Some of the normal use cases (verifying the destination) could be:
As an operator, I want to make sure that the destination host I'm providing when live migrating a specific instance is correct and won't break my internal cloud because of a discrepancy between how I calculate the destination host's capacity and how the scheduler takes the memory allocation ratio into account (see the References section below).
As an operator, I want to make sure that live-migrating an instance to a specific destination wouldn’t impact my existing instances running on that destination host because of some affinity that I missed.
Proposed change¶
This spec goes beyond the persist-request-spec blueprint [1] by making sure that, before each call to select_destinations(), the RequestSpec object is read for the instance being scheduled, and that after select_destinations() returns, the RequestSpec object is persisted.
That way, we will be able to retrieve the original RequestSpec created for the instance when the user booted the VM, including the scheduler hints. Given that, we propose to amend the RequestSpec object to include a new field called requested_destination, which would be a ComputeNode object (with at least the host and hypervisor_hostname fields set) and would be set by the conductor in each method accepting an optional destination host (here live_migrate and rebuild_instance respectively).
Note that this new field would have nothing in common with a Migration object or an Instance.host field, since it would just be the equivalent of a scheduler hint saying 'I want to go there' (and not the ugly force_hosts information passed as an Availability Zone hack…).
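For illustration, here is a minimal sketch of how the field could be declared, assuming the usual nova.objects conventions; the version number and the surrounding fields are placeholders and this is not the actual RequestSpec definition:

# Illustrative only: shows where requested_destination would live on the
# RequestSpec object proposed by [1]; the version number is a placeholder.
from nova.objects import base
from nova.objects import fields


@base.NovaObjectRegistry.register
class RequestSpec(base.NovaObject):
    # Version 1.x: added requested_destination
    VERSION = '1.1'

    fields = {
        # ... existing fields (flavor, scheduler_hints, ...) elided ...

        # The ComputeNode the operator asked for; nullable because most
        # scheduling requests do not target a specific host.
        'requested_destination': fields.ObjectField('ComputeNode',
                                                    nullable=True),
    }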
It will be the duty of the conductor (within the live_migrate and evacuate methods) to get the RequestSpec related to the instance, add the requested_destination field, set the related Migration object to scheduled and call the scheduler's select_destinations method. The last step would of course be to store the updated RequestSpec object. If the requested destination is unacceptable to the scheduler, the conductor will change the Migration status to conflict.
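A rough sketch of that conductor-side flow follows, assuming the RequestSpec persistence from [1] is in place; the helper name and the way the scheduler client and Migration object are obtained are simplified and not the exact ConductorManager code:

# Simplified sketch only, not the actual conductor code; the Migration
# object and scheduler client are passed in rather than looked up.
from nova import exception
from nova import objects


def _execute_move(context, instance, migration, scheduler_client,
                  destination=None):
    # Re-read the RequestSpec persisted when the instance was booted [1].
    request_spec = objects.RequestSpec.get_by_instance_uuid(
        context, instance.uuid)

    if destination is not None:
        # Record the operator-requested target so the scheduler only
        # evaluates that single host.
        request_spec.requested_destination = objects.ComputeNode(
            host=destination, hypervisor_hostname=destination)

    migration.status = 'scheduled'
    migration.save()

    try:
        dests = scheduler_client.select_destinations(context, request_spec)
    except exception.NoValidHost:
        # The requested destination did not pass the enabled filters.
        migration.status = 'conflict'
        migration.save()
        raise

    # Persist the amended RequestSpec right after the call so later moves
    # can reuse it.
    request_spec.save()
    return dests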
The idea behind that is that the scheduler would check that field in the _schedule() method of FilterScheduler and would then run the filters only against that destination.
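A hedged sketch of what that check could look like inside FilterScheduler (attribute access is simplified and the real method takes more arguments):

# Sketch only: restrict the candidate hosts before filtering whenever a
# requested_destination is present on the RequestSpec.
def _schedule(self, context, request_spec, filter_properties):
    hosts = self.host_manager.get_all_host_states(context)

    destination = getattr(request_spec, 'requested_destination', None)
    if destination is not None:
        # Only keep the operator-requested host; the enabled filters
        # still run against it, so an unsuitable destination is rejected
        # rather than silently accepted.
        hosts = [h for h in hosts if h.host == destination.host]

    return self.host_manager.get_filtered_hosts(hosts, filter_properties)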
As the RequestSpec object blueprint cares about backwards compatibility by providing the legacy request_spec and filter_properties to the old select_destinations API method, we wouldn't pass the new requested_destination field as a key for the request_spec.
Since this BP also provides a way for operators to bypass the scheduler, we will amend the API for all migrations including a destination host by adding an extra request body argument called force (accepting True or False, defaulting to False), and the corresponding CLI methods will expose that force option. If the microversion asked by the client is older than the version providing the field, then it won't be passed to the conductor (neither True nor False; the key simply won't exist), so the conductor won't call the scheduler, keeping the existing behaviour (see the REST API section below for further details).
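A hedged sketch of the API-side handling for os-migrateLive follows; '2.XX' stands for whichever microversion introduces the attribute and the controller plumbing is abbreviated:

# Sketch only, not the actual API extension code.
from nova.api.openstack import api_version_request
from nova.api.openstack import common
from nova.api.openstack import wsgi
from nova import compute
from oslo_utils import strutils


class MigrateServerController(wsgi.Controller):

    def __init__(self):
        super(MigrateServerController, self).__init__()
        self.compute_api = compute.API()

    @wsgi.action('os-migrateLive')
    def _migrate_live(self, req, id, body):
        context = req.environ['nova.context']
        params = body['os-migrateLive']
        instance = common.get_instance(self.compute_api, context, id)

        kwargs = {}
        if api_version_request.is_supported(req, min_version='2.XX'):
            # Only clients speaking the new microversion can send 'force';
            # older clients keep the existing "host bypasses the scheduler"
            # behaviour because the key is never passed down.
            kwargs['force'] = strutils.bool_from_string(
                params.get('force', False), strict=True)

        self.compute_api.live_migrate(context, instance,
                                      params['block_migration'],
                                      params['disk_over_commit'],
                                      params['host'], **kwargs)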
In order to keep track of those forced calls, we propose to log the fact that the migration has been forced as an instance action, so that the operator could potentially reschedule the instance later on if they wish. For that, we propose to add two new possible actions, called FORCED_MIGRATE (when live-migrating) and FORCED_REBUILD (when evacuating).
That way, an operator can get all the instances having either FORCED_MIGRATE or FORCED_REBUILD just by calling the /os-instance-actions API resource for each instance, and we could also later add a new blueprint (out of this spec's scope) for getting the list of instances whose last action is set to a specific value (here FORCED_something).
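As an illustration, once the new actions are recorded an operator could spot such forced moves with the existing os-instance-actions API via python-novaclient; the endpoint, credentials and server name below are placeholders:

# Example only; auth_url, credentials and the server name are placeholders.
from keystoneauth1 import loading
from keystoneauth1 import session
from novaclient import client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(auth_url='http://controller:5000/v3',
                                username='admin', password='secret',
                                project_name='admin',
                                user_domain_name='Default',
                                project_domain_name='Default')
nova = client.Client('2.latest', session=session.Session(auth=auth))

server = nova.servers.find(name='my-instance')
for action in nova.instance_action.list(server):
    # FORCED_MIGRATE / FORCED_REBUILD are the new actions proposed above.
    if action.action in ('FORCED_MIGRATE', 'FORCED_REBUILD'):
        print('%s at %s by user %s' %
              (action.action, action.start_time, action.user_id))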
Alternatives¶
We could just provide a way to ask the scheduler whether the destination host is valid or not, but that wouldn't consume the instance usage, which is from our perspective the key problem with the existing design.
Data model impact¶
None.
REST API impact¶
The proposed change just updates the POST request body for the os-migrateLive and evacuate actions to include the optional force boolean field (defaulting to False) when the request uses at least the new microversion.
Depending on whether the host and force fields are set or null, the actions and return codes are:
- If a host parameter is supplied in the request body, the scheduler will now be asked to verify that the requested target compute node is actually able to accommodate the request, including honouring all previously-used scheduler hints. If the scheduler determines the request cannot be accommodated by the requested target host node, the related Migration object will change its status field to conflict.
- If a host parameter is supplied in the request body, a force parameter may also be supplied in the request body. If present, the scheduler shall not be consulted to determine if the target compute node can accommodate the request, and no Migration object will be updated.
- If the force parameter is supplied in the request body but the host parameter is either null (for live-migrate) or not provided (for evacuate), then an HTTP 400 Bad Request will be returned to the user.
Of course, since it's a new request body attribute, it will get a new API microversion, meaning that if a client uses an older microversion the attribute cannot be provided and the conductor won't call the scheduler (keeping the existing behaviour where setting a host bypasses the scheduler).
JSON schema definition for the body data of os-migrateLive:
migrate_live = {
    'type': 'object',
    'properties': {
        'os-migrateLive': {
            'type': 'object',
            'properties': {
                'block_migration': parameter_types.boolean,
                'disk_over_commit': parameter_types.boolean,
                'host': host,
                'force': parameter_types.boolean
            },
            'required': ['block_migration', 'disk_over_commit', 'host'],
            'additionalProperties': False,
        },
    },
    'required': ['os-migrateLive'],
    'additionalProperties': False,
}
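For example, a request body matching that schema under the new microversion could look like the following (the host name and flag values are illustrative only):

body = {
    'os-migrateLive': {
        'block_migration': False,
        'disk_over_commit': False,
        'host': 'compute-02',
        # Ask the scheduler to validate compute-02 first; True would
        # bypass the scheduler entirely.
        'force': False,
    },
}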
JSON schema definition for the body data of evacuate:
evacuate = {
    'type': 'object',
    'properties': {
        'evacuate': {
            'type': 'object',
            'properties': {
                'host': parameter_types.hostname,
                'force': parameter_types.boolean,
                'onSharedStorage': parameter_types.boolean,
                'adminPass': parameter_types.admin_password,
            },
            'required': ['onSharedStorage'],
            'additionalProperties': False,
        },
    },
    'required': ['evacuate'],
    'additionalProperties': False,
}
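Similarly, an example evacuate body forcing the move to a given host (values illustrative only):

body = {
    'evacuate': {
        'host': 'compute-03',
        'onSharedStorage': False,
        'adminPass': 'MySecretPass',
        # Skip the scheduler check for this target host.
        'force': True,
    },
}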
There should be no policy change as we're not changing the action itself but rather just providing a new option.
Security impact¶
None.
Notifications impact¶
None.
Other end user impact¶
Python-novaclient will accept a force option for the following methods:
- evacuate
- live-migrate
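A hedged sketch of what the client-side calls might look like once the option is exposed; the force keyword argument is the proposed addition and the exact signatures may differ:

# 'nova' is an authenticated novaclient Client (see the earlier example);
# the force keyword argument is hypothetical until the feature lands.
server = nova.servers.find(name='my-instance')

# Validate 'compute-02' through the scheduler before live-migrating.
nova.servers.live_migrate(server, host='compute-02',
                          block_migration=False, disk_over_commit=False)

# Explicitly bypass the scheduler (recorded as FORCED_MIGRATE).
nova.servers.live_migrate(server, host='compute-02',
                          block_migration=False, disk_over_commit=False,
                          force=True)

# Same idea for an evacuation (recorded as FORCED_REBUILD).
nova.servers.evacuate(server, host='compute-03', on_shared_storage=False,
                      force=True)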
Performance Impact¶
A new RPC call will be made by default when migrating or evacuating, but it shouldn't really impact performance since it's the normal behaviour for a regular migration. In order to keep that RPC asynchronous from the API query, we won't return the result of the check within the original request, but rather modify the Migration object status (see the REST API impact section above).
Other deployer impact¶
None.
Developer impact¶
None.
Implementation¶
Assignee(s)¶
- Primary assignee:
sylvain-bauza
Work Items¶
- Read any existing RequestSpec before calling select_destinations() in all the conductor methods calling it
- Amend the RequestSpec object with a requested_destination field
- Modify the conductor methods for evacuate and live_migrate to fill in requested_destination, call scheduler_client.select_destinations() and persist the amended RequestSpec object right after the call
- Modify FilterScheduler._schedule() to introspect requested_destination and call the filters for only that host if it is set
- Extend the API (and bump a new microversion) to add a force attribute for both of the above API resources with the appropriate behaviours
- Bypass the scheduler if the flag is set and log either a FORCED_REBUILD or FORCED_MIGRATE action
- Add a new force option to python-novaclient and expose it in the CLI for both the evacuate and live-migrate commands
Dependencies¶
As said above in the proposal, since scheduler hints are part of the request and are not persisted yet, we need to depend on persisting the RequestSpec object [1] before calling select_destinations() so that a future migration would read that RequestSpec and provide it again.
Testing¶
API samples will need to be updated and unit tests will cover the behaviour. In-tree functional tests will be amended to cover that option.
Documentation Impact¶
As said, API samples will be modified to include the new attribute.
References¶
[1] http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/persist-request-spec.html
Lots of bugs mention the caveat we described above. Below are the ones I identified, which will be closed once the spec implementation lands:
https://bugs.launchpad.net/nova/+bug/1451831 Specifying a destination node with nova live_migration does not take into account overcommit setting (ram_allocation_ratio)
https://bugs.launchpad.net/nova/+bug/1214943 Live migration should use the same memory over subscription logic as instance boot
https://bugs.launchpad.net/nova/+bug/1452568 nova allows to live-migrate instance from one availability zone to another