[Pacemaker] Problem with Xen live migration

Vladislav Bogdanov bubble at hoster-ok.com
Tue Jan 18 14:19:33 UTC 2011


18.01.2011 16:00, Vladislav Bogdanov wrote:
> 18.01.2011 15:41, Vadym Chepkov wrote:
> 
> ...
> 
>>>>
>>>> I have tried it myself, but concluded it's impossible to do it reliably with the current code.
>>>> For the live migration to work you have to remove any colocation constraints (group included) with the Xen resource.
>>>> The drbd code includes a "helper" script, /etc/xen/scripts/block-drbd, but this script can't be used in a pacemaker environment
>>>> because it is not cluster aware. And pacemaker is not handling this scenario at the moment:
>>>> when Xen on drbd is stopped, both drbd nodes are secondary, which makes pacemaker "unhappy".
>>>> You need to have both drbd nodes as primary during migration only,
>>>> but if you specify master-max="2", then both drbd nodes are primary all the time, which is a disaster waiting to happen.
>>>
>>> Unless clustered LVM locking is enabled and working:
>>> # sed -ri 's/^([ \t]+locking_type).*/    locking_type = 3/' /etc/lvm/lvm.conf
>>> # sed -ri 's/^([ \t]+fallback_to_local_locking).*/    fallback_to_local_locking = 1/' /etc/lvm/lvm.conf
>>> # vgchange -cy VG_NAME
>>> # service clvmd start
>>> # vgs|grep VG_NAME
>>>
>>> Of course, this may vary from one distro to another.
>>
>> Unfortunately, this is not available on Red Hat 5, and that is where Xen is used, since Red Hat 6 and Fedora dropped Xen support.
>> But out of curiosity, how would clvmd prevent you from starting Xen VM on both nodes accidentally?
> 
> It will not. This is a job for CRM. It will just allow you to safely
> operate on LVM VG on both nodes - create, delete, activate, deactivate
> LVs (like ocfs2 allows you to safely operate on filesystem itself on
> both nodes).
> You can also use exclusive locks (vgchange -aey), but this will
> make live migration impossible.

You can also try 'lvchange -aly' on the node where you need to start the VM
(or the node it is migrating to). That way the LV is not active
everywhere. This should probably be done inside the VM RA (but may also be
done by a two-instance clone of a custom RA which just activates/deactivates
the LV). I haven't tried it, but probably will.
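
A minimal sketch of what such an activate/deactivate step might look like,
assuming a clustered VG with clvmd running. The function names and the
LVCHANGE override are hypothetical, not part of any shipped resource agent,
and using 'lvchange -aln' for the local deactivation is my assumption:

```shell
# Hypothetical helpers for a custom RA; untested against a real cluster.
# LVCHANGE may be overridden (e.g. LVCHANGE=echo) to do a dry run.
: "${LVCHANGE:=lvchange}"

# Activate the LV on the local node only ($1 is VG/LV).
lv_activate_local() {
    "$LVCHANGE" -aly "$1"
}

# Deactivate the LV on the local node only ($1 is VG/LV).
lv_deactivate_local() {
    "$LVCHANGE" -aln "$1"
}
```

With LVCHANGE=echo, 'lv_activate_local VG_NAME/LV_NAME' only prints the
arguments it would pass to lvchange, which is handy for checking the wiring
before touching real volumes.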

Another way is to do 'lvchange -aey' on 'start', and 'lvchange -aly' on
'migrate_to' and 'migrate_from', but I'm not sure that 'aey' -> 'aly'
will work, especially in an atomic way.
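
As a rough sketch, the mapping from RA actions to activation modes could look
like the following. The function names and the LVCHANGE override are
hypothetical, the 'stop' mapping to 'lvchange -aln' is my assumption, and
whether the exclusive-to-local transition actually succeeds (let alone
atomically) remains untested:

```shell
# Hypothetical sketch; LVCHANGE may be set to "echo" for a dry run.
: "${LVCHANGE:=lvchange}"

# Print the lvchange activation flag for a given RA action.
lv_flag_for_action() {
    case "$1" in
        start)                   printf '%s\n' "-aey" ;;  # exclusive activation
        migrate_to|migrate_from) printf '%s\n' "-aly" ;;  # local, non-exclusive
        stop)                    printf '%s\n' "-aln" ;;  # assumption: local deactivation
        *)                       return 1 ;;              # unknown action
    esac
}

# Apply the activation mode for action $1 to the LV $2 (VG/LV).
lv_apply_action() {
    flag=$(lv_flag_for_action "$1") || return 1
    "$LVCHANGE" "$flag" "$2"
}
```

Keeping the mapping in one function makes it easy to change a single mode if
the 'aey' -> 'aly' transition turns out not to work.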

> 
> I also first tried to go with RHEL5, but it has a number of unresolvable
> problems with cluster components (mainly - DLM has no support for
> userspace cluster stacks), so I switched to F13 and will probably try
> RHEL6 (or CentOS6).