Currently if we migrate a volume from one backend to another backend, especially when these two backends are handling two different vendors of arrays, we can’t attach this migrating volume to server until the migration is totally complete.
Both the dd way or driver-specific way will fail if we attach a volume when it is migrating: * If we migrate the volume through dd way, the source or target volume is not
usable to server.
Currently cinder supports four cases for volume migration:
This spec will only focus on case 1, to make an available volume usable(can be attached to a server) immediately after we issue migration, no need to wait until the migration complete. Case 2 is already well handled by most drivers, and if we go into case 2, we won’t go to case 1 again, so this spec will focus on case 1.
In brief, the async migration is to use some features of backend array and I’ve known that these features are already supported by most vendors like EMC VMAX, IBM SVC, HP XP, NetApp and so on. Note that for the backends not support these features, they still can use the existing driver specific migration or host-copy way. We won’t affect anything to the existing routine, just add a new way for developers or users to choose.
These features are:
arrays are connected with FC or iSCSI fabric. * We can migrate a remote LUN to a local LUN and meanwhile the remote(source) LUN is writable and readable with the migration task is undergoing, after the migration is completed, the local(target) LUN is exactly the same as the remote(source) LUN and no data will be lost.
To enable one array to take over other array’s volume, we should allow one driver to call other drivers’ interfaces directly or indirectly. These two drivers are from two independent backends, can be managed by one single volume node or two different volume nodes.
To enable an available volume usable(attach to a server) immediately after we issued migration from one backend to another backend, we should ALLOW the migration task to be send to the TARGET backend.
There will be a new interface for driver which will be called ‘migrate_volume_target’. We introduce a new interface instead of use the existing ‘migrate_volume’ interface because there would be lots of differences between them, a significant difference is that the drivers executing the ‘migrate_volume’ interface always take them self as the source backend, but the new interface should take itself as the target backend.
Some change will be made in the volume/manager.py for migrate_volume routine:
Let users wait a long time before the volume is usable for a server until the whole migration is totally complete.
Currently no direct notifications. But users can know that the async migration is started when the “migration_status” changed to “migrating_attachable” and is finished when the “migration_status” changed to “success”.
For other impact, like what operations are permitted and what operations are not permitted, I think it’s same like the existing migration.
After issued “cinder migrate” command, end users can check the “migration_status” by “cinder show” command. If “migration_status” is “migrating”, this volume is probably not safe to attach, and if “migration_status” is “migrating_attachable” or “success”, we can attach it safely. But one thing we should let end user know is that if “migration_status” is “migrating_attachable”, the volume is safe to attach but the performance of the volume may not as good as other volumes for the moment before the undergoing migration is done.
As we know, driver assisted migration is mostly more efficent than host copy, assuming that there is no read-through along with the migration.
If we attach the migrating volume and do read-through operations on the volume, the performance of the read/write is surelly not so good as the direct read/write. As far as I know, for the OLTP workload, the performance may decrease no more than 15 percent.
If deployers want to use the async feature, they should make sure the backend driver supports this feature, and make sure these backends are connected to each other.
Driver developers should implement two new interfaces: “migrate_volume_target” and “complete_migration_target”.
If drivers won’t support this feature, driver devlopers needn’t do anything. We won’t break down any driver or any existing function.