Check the destination host when migrating or evacuating

https://blueprints.launchpad.net/nova/+spec/check-destination-on-migrations-newton

Provide a way to make sure that resource allocation is consistent for all operations, even if a destination host is provided.

Problem description

Live migrations and evacuations can be requested with or without a destination host. When a host is specified, the scheduler is bypassed entirely and the destination Compute RPC API is called directly.

Unfortunately, migrating a VM this way can break the scheduler rules, and so potentially break future boot requests, because some constraints (like allocation ratios) are not enforced when migrating or evacuating.

We should modify that logic to explicitly call the Scheduler any time a move (i.e. either a live migration or an evacuation) is requested, whether the destination host is provided or not, so that the Scheduler verifies the destination host through all the enabled filters and, if successful, consumes the instance usage from its internal HostState.

That said, we also understand that there are use cases where an operator wants to move an instance manually without calling the scheduler, even knowing that doing so explicitly breaks scheduler rules (e.g. a filter not passing, an affinity policy violated, or an instance taking an already allocated pCPU in the context of CPU pinning).

Use Cases

Some of the normal use cases (verifying the destination) could be:

As an operator, I want to make sure that the destination host I’m providing when live migrating a specific instance is correct and won’t break my internal cloud because of a discrepancy between how I calculate the destination host capacity and how the scheduler takes the memory allocation ratio into account (see the References section below).

As an operator, I want to make sure that live-migrating an instance to a specific destination wouldn’t impact my existing instances running on that destination host because of some affinity that I missed.
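The capacity discrepancy in the first use case can be illustrated with a little arithmetic. This is only a sketch: it assumes the scheduler computes usable RAM as total memory multiplied by the allocation ratio, minus the memory already claimed. The function name and all the figures are made up for illustration.

```python
def free_ram_mb(total_mb, used_mb, ram_allocation_ratio):
    """Usable RAM as assumed here: overcommit limit minus RAM already claimed."""
    return total_mb * ram_allocation_ratio - used_mb


# Hypothetical destination host: 64 GiB of RAM, 60 GiB already claimed.
total_mb, used_mb = 65536, 61440
instance_mb = 8192  # hypothetical instance to move

# The operator assumes a 1.5 overcommit ratio when picking the host...
operator_thinks_it_fits = free_ram_mb(total_mb, used_mb, 1.5) >= instance_mb
# ...but the scheduler is (hypothetically) configured with a ratio of 1.0.
scheduler_thinks_it_fits = free_ram_mb(total_mb, used_mb, 1.0) >= instance_mb
```

With these numbers the operator sees room for the instance while the scheduler would not, which is the kind of mismatch that verifying the destination is meant to catch.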

Proposed change

This spec goes beyond what the persist-request-spec blueprint [1] does by making sure that, before each call to select_destinations(), the RequestSpec object is read for the instance being scheduled and that, once select_destinations() has returned, the RequestSpec object is persisted.

That way, we will be able to get the original RequestSpec of the corresponding instance, as supplied by the user who created the VM, including the scheduler hints. Given that, we propose to amend the RequestSpec object to include a new field called requested_destination, which would be a ComputeNode object (at least having the host and hypervisor_hostname fields set) and would be set by the conductor for each method accepting an optional destination host (here live-migrate for live migration and rebuild_instance for evacuation).

Note that this new field would have nothing in common with a Migration object or an Instance.host field, since it would just be a reference to an equivalent scheduler hint saying ‘I want to go there’ (and not the ugly force_hosts information passed as an Availability Zone hack…).

It will be the duty of the conductor (within the live_migrate and evacuate methods) to get the RequestSpec related to the instance, add the requested_destination field, set the related Migration object to scheduled and call the scheduler’s select_destinations method. The last step would be of course to store the updated RequestSpec object. If the requested destination is unacceptable for the scheduler, then the conductor will change the Migration status to conflict.
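The conductor steps described above could be sketched roughly as follows. This is not Nova code: the classes are simplified stand-ins, the destination is a plain host name rather than a ComputeNode object, and the helper name, the initial Migration status and the decision to persist the RequestSpec only on success are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class NoValidHost(Exception):
    """Stand-in for the error raised when no host passes the filters."""


@dataclass
class RequestSpec:
    instance_uuid: str
    scheduler_hints: Dict[str, List[str]] = field(default_factory=dict)
    requested_destination: Optional[str] = None  # a ComputeNode in the spec


@dataclass
class Migration:
    status: str = "accepted"  # hypothetical initial status


def move_to_requested_host(spec, migration, host, select_destinations, save_spec):
    """Hypothetical conductor-side flow for a move that names a destination."""
    spec.requested_destination = host
    migration.status = "scheduled"
    try:
        selected = select_destinations(spec)
    except NoValidHost:
        # The scheduler rejected the requested destination.
        migration.status = "conflict"
        return None
    # Assumption: the updated RequestSpec is stored only after a successful call.
    save_spec(spec)
    return selected


# Usage with a fake scheduler that accepts whatever host is requested.
saved_specs = []
spec = RequestSpec("uuid-1", {"different_host": ["uuid-2"]})
migration = Migration()
selected = move_to_requested_host(
    spec, migration, "node-2",
    lambda s: [s.requested_destination], saved_specs.append)
```

Passing the scheduler call and the persistence step in as callables keeps the sketch self-contained; in the real conductor these would be the scheduler client and the RequestSpec object's own save logic.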

The idea behind that is that the Scheduler would check that field in the _schedule() method of FilterScheduler and would then just call the filters only for that destination.
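As a rough illustration of that idea, the sketch below narrows the candidate hosts to the requested destination before running the filters. It does not reproduce FilterScheduler._schedule(): host states and the request are plain dicts, and the function names and the single RAM filter are hypothetical.

```python
def hosts_to_consider(all_hosts, requested_destination=None):
    """Keep every host, or only the requested one when a destination is set."""
    if requested_destination is None:
        return list(all_hosts)
    return [h for h in all_hosts if h["host"] == requested_destination]


def ram_filter(host, spec):
    """Hypothetical filter: the host must have enough free RAM."""
    return host["free_ram_mb"] >= spec["memory_mb"]


def schedule(all_hosts, spec, filters):
    """Run every enabled filter, but only against the candidate hosts."""
    candidates = hosts_to_consider(all_hosts, spec.get("requested_destination"))
    return [h for h in candidates if all(f(h, spec) for f in filters)]


hosts = [
    {"host": "node-1", "free_ram_mb": 4096},
    {"host": "node-2", "free_ram_mb": 16384},
]

# node-1 is requested but fails the RAM filter, so nothing is returned.
rejected = schedule(hosts, {"memory_mb": 8192, "requested_destination": "node-1"}, [ram_filter])
# node-2 is requested and passes.
accepted = schedule(hosts, {"memory_mb": 8192, "requested_destination": "node-2"}, [ram_filter])
```

An empty result for a requested destination corresponds to the case where the conductor would mark the Migration as conflict.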

As the RequestSpec object blueprint cares about backwards compatibility by providing the legacy request_spec and filter_properties to the old select_destinations API method, we wouldn’t pass the new requested_destination field as a key for the request_spec.

Since this BP also provides a way for operators to bypass the Scheduler, we will amend the API for all migrations that include a destination host by adding an extra request body argument called force (accepting True or False, defaulting to False), and the corresponding CLI methods will expose that force option. If the microversion asked for by the client is older than the version providing the field, then the field won’t be passed to the conductor at all (neither True nor False; the key simply won’t exist), so the conductor won’t call the scheduler, which keeps the existing behaviour (see the REST API section below for further details).

In order to keep track of those forced calls, we propose to log as an instance action the fact that the migration has been forced, so that the operator could potentially reschedule the instance later on if desired. For that, we propose to add two new possible actions, called FORCED_MIGRATE (when live-migrating) and FORCED_REBUILD (when evacuating). That way, an operator can get all the instances having either FORCED_MIGRATE or FORCED_REBUILD just by calling the /os-instance-actions API resource for each instance, and we could also later add a new blueprint (out of the scope of this spec) for getting the list of instances having the last specific action set to something (here FORCED_something).
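The microversion handling could look something like the sketch below. The version number is a placeholder (this document does not name the microversion that introduces force), and both function names are hypothetical.

```python
# Placeholder only: the real microversion is not stated in this document.
FORCE_MIN_VERSION = (2, 99)


def conductor_kwargs(body, microversion):
    """Build the arguments passed on to the conductor for a move request."""
    kwargs = {"host": body.get("host")}
    if microversion >= FORCE_MIN_VERSION:
        # Newer clients always send a boolean, False when unspecified.
        kwargs["force"] = body.get("force", False)
    # Older clients: the 'force' key is deliberately left out.
    return kwargs


def should_call_scheduler(kwargs):
    """Decide whether the conductor asks the scheduler about the move."""
    if kwargs["host"] is None:
        return True  # no destination given: the scheduler picks one
    if "force" not in kwargs:
        return False  # old microversion: keep the legacy bypass
    return not kwargs["force"]


old_client = conductor_kwargs({"host": "node-2"}, (2, 1))
new_client = conductor_kwargs({"host": "node-2"}, (2, 99))
forced = conductor_kwargs({"host": "node-2", "force": True}, (2, 99))
```

Distinguishing an absent key from an explicit False is what lets older clients keep the existing bypass behaviour while newer clients get the verification by default.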
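Once the actions for each instance have been fetched, finding the forced moves is a simple filter. The sketch below works on already-retrieved data; the shape of the action records (a dict with an "action" key) and the function name are assumptions made for the example, not the documented API response.

```python
FORCED_ACTIONS = {"FORCED_MIGRATE", "FORCED_REBUILD"}


def instances_with_forced_moves(actions_by_instance):
    """Return the UUIDs of instances that have at least one forced move."""
    return sorted(
        uuid
        for uuid, actions in actions_by_instance.items()
        if any(a["action"] in FORCED_ACTIONS for a in actions)
    )


# Hypothetical data, one list of recorded actions per instance.
actions_by_instance = {
    "uuid-1": [{"action": "create"}, {"action": "FORCED_MIGRATE"}],
    "uuid-2": [{"action": "create"}],
    "uuid-3": [{"action": "create"}, {"action": "FORCED_REBUILD"}],
}
forced = instances_with_forced_moves(actions_by_instance)
```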

Alternatives

We could just provide a way to ask the scheduler whether the destination host is valid, but that wouldn’t consume the instance usage, which is from our perspective the key problem with the existing design.

Data model impact

None.

REST API impact

The proposed change just updates the POST request body for the os-migrateLive and evacuate actions to include the optional force boolean field, which defaults to False when the request uses at least the microversion that introduces it.

Depending on whether the host and force fields are set or null, the actions and return codes are:

  • If a host parameter is supplied in the request body, the scheduler will now be asked to verify that the requested target compute node is actually able to accommodate the request, including honouring all previously-used scheduler hints. If the scheduler determines the request cannot be accommodated by the requested target host node, the related Migration object will change the status field to conflict.

  • If a host parameter is supplied in the request body, a new force parameter may also be supplied in the request body. If it is set, the scheduler shall not be consulted to determine whether the target compute node can accommodate the request, and no Migration object will be updated.

  • If the force parameter is supplied in the request body but the host parameter is either null (for live-migrate) or not provided (for evacuate), then an HTTP 400 Bad Request will be returned to the user.

Of course, since it’s a new request body attribute, it will get a new API microversion, meaning that if the attribute is not provided, the scheduler won’t be called by the conductor (to keep the existing behaviour where setting a host bypasses the scheduler).
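For a request made at the new microversion, the combinations of host and force listed above boil down to a small decision function. The enum and function names below are hypothetical and only summarise the bullets; they are not part of the proposed API.

```python
from enum import Enum


class Outcome(Enum):
    BAD_REQUEST = "HTTP 400 Bad Request"
    SCHEDULER_SELECTS = "scheduler picks the destination"
    SCHEDULER_VERIFIES = "scheduler verifies the requested host"
    SCHEDULER_BYPASSED = "requested host is used without verification"


def resolve_move(host=None, force=False):
    """Map the host/force pair of a request body to the resulting behaviour."""
    if force and host is None:
        return Outcome.BAD_REQUEST
    if host is None:
        return Outcome.SCHEDULER_SELECTS
    if force:
        return Outcome.SCHEDULER_BYPASSED
    return Outcome.SCHEDULER_VERIFIES
```

For example, resolve_move(host="node-2") yields SCHEDULER_VERIFIES, while resolve_move(force=True) yields BAD_REQUEST because no host was given.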

  • JSON schema definition for the body data of os-migrateLive:

migrate_live = {
    'type': 'object',
    'properties': {
        'os-migrateLive': {
            'type': 'object',
            'properties': {
                'block_migration': parameter_types.boolean,
                'disk_over_commit': parameter_types.boolean,
                'host': host,
                'force': parameter_types.boolean
            },
            'required': ['block_migration', 'disk_over_commit', 'host'],
            'additionalProperties': False,
        },
    },
    'required': ['os-migrateLive'],
    'additionalProperties': False,
}
  • JSON schema definition for the body data of evacuate:

evacuate = {
    'type': 'object',
    'properties': {
        'evacuate': {
            'type': 'object',
            'properties': {
                'host': parameter_types.hostname,
                'force': parameter_types.boolean,
                'onSharedStorage': parameter_types.boolean,
                'adminPass': parameter_types.admin_password,
            },
            'required': ['onSharedStorage'],
            'additionalProperties': False,
        },
    },
    'required': ['evacuate'],
    'additionalProperties': False,
}
  • There should be no policy change as we’re not changing the action by itself but rather just providing a new option.
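For illustration, request bodies matching the two schemas above might look like the following. The host name and field values are made up; only the keys come from the schemas.

```python
import json

# Live migration to a named host, verified by the scheduler (force is False).
live_migrate_body = {
    "os-migrateLive": {
        "host": "node-2",  # hypothetical host name
        "block_migration": False,
        "disk_over_commit": False,
        "force": False,
    }
}

# Evacuation to a named host, bypassing the scheduler (force is True).
evacuate_body = {
    "evacuate": {
        "host": "node-2",  # hypothetical host name
        "onSharedStorage": False,
        "force": True,
    }
}

live_migrate_json = json.dumps(live_migrate_body, sort_keys=True)
evacuate_json = json.dumps(evacuate_body, sort_keys=True)
```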

Security impact

None.

Notifications impact

None.

Other end user impact

python-novaclient will accept a force option for the following methods:

  • evacuate

  • live-migrate

Performance Impact

A new RPC call will be made by default when migrating or evacuating, but it shouldn’t really impact performance since it’s the normal behaviour for a general migration. In order to keep that RPC call asynchronous from the API query, we won’t return the result of the check within the original request, but will rather modify the Migration object status (see the REST API impact section above).

Other deployer impact

None.

Developer impact

None.

Implementation

Assignee(s)

Primary assignee:

sylvain-bauza

Work Items

  • Read any existing RequestSpec before calling select_destinations() in all the conductor methods calling it

  • Amend RequestSpec object with requested_destination field

  • Modify conductor methods for evacuate and live_migrate to fill in requested_destination, call scheduler_client.select_destinations() and persist the amended RequestSpec object right after the call.

  • Modify FilterScheduler._schedule() to introspect requested_destination and call filters for only that host if so.

  • Extend the API (and bump a new version) to add a force attribute for both above API resources with the appropriate behaviours.

  • Bypass the scheduler if the flag is set and log either FORCED_REBUILD or FORCED_MIGRATE action.

  • Add a new force option to python-novaclient and expose it in CLI for both evacuate and live-migrate commands

Dependencies

As said above in the proposal, since scheduler hints are part of the request and are not persisted yet, we need to depend on persisting the RequestSpec object [1] before calling select_destinations() so that a future migration would read that RequestSpec and provide it again.

Testing

API samples will need to be updated and unit tests will cover the behaviour. In-tree functional tests will be amended to cover that option.

Documentation Impact

As said, API samples will be modified to include the new attribute.

References

[1] http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/persist-request-spec.html

Lots of bugs mention the caveat described above. Below are the ones I identified, which will be closed once the spec implementation lands: