[Pacemaker] Problem with Xen live migration

Vadym Chepkov vchepkov at gmail.com
Tue Jan 18 12:45:49 UTC 2011


On Jan 17, 2011, at 6:44 PM, Jean-Francois Malouin wrote:

> Back again to set up an active/passive cluster for Xen with live migration
> but so far, no go. The Xen DomU is shut down and restarted when I move the
> Xen resource.
> 
> I'm using Debian Squeeze, pacemaker 1.0.9.1, corosync 1.2.1-4 with Xen 4.0.1-1
> and kernel 2.6.32-30. DRBD is at 8.3.8.
> 
> An LVM volume group 'xen_vg' sits on top of a drbd block device and an OCFS2
> filesystem is created on its logical volume to hold the disk image for the Xen guest:
> 
> [ drbd resDRBDr1 ] -> [ LVM resLVM ] -> [ OCFS2 resOCFS2 ] 
> 
> The cluster logic (timeouts etc. removed) is something along these
> lines:
> 
> primitive resDRBDr1 ocf:linbit:drbd params drbd_resource="r1" ...
> primitive resLVM ocf:heartbeat:LVM params volgrpname="xen_vg" ...
> primitive resOCFS2 ocf:heartbeat:Filesystem fstype="ocfs2" ...
> primitive resXen1 ocf:heartbeat:Xen \
>        params xmfile="/etc/xen/xen1cfg" name="xen1" \
>        meta allow-migrate="true"
> group groLVM-OCFS resLVM resOCFS2 
> ms msDRBDr1 resDRBDr1 \
>        meta notify="true" master-max="2" interleave="true"
> colocation colLVM-OCFS-on-DRBDr1Master inf: groLVM-OCFS msDRBDr1:Master
> colocation colXen-with-OcfsXen inf: resXen1 groLVM-OCFS
> order ordDRBDr1-before-LVM inf: msDRBDr1:promote groLVM-OCFS:start
> order ordLVM-OCFS-before-Xen inf: groLVM-OCFS:start resXen1:start
> 
> DRBD is configured with 'allow-two-primaries'.
> 
> When I try to live migrate the Xen guest with 'crm resource move', I get:
> 
> pengine: [11978]: notice: check_stack_element: Cannot migrate resXen1
> due to dependency on group groLVM-OCFS (coloc)
> 
> and the guest is shut down and restarted on the other node.
> 
> What am I missing? Is it something obvious, or can the cluster logic as
> it stands not permit live Xen migration?
> 
> I have verified that Xen live migration works without pacemaker in
> the picture: on the now passive node I can manually promote the drbd
> block device, run vgscan to find the logical volume, 'lvchange -ay' to
> make it available, and mount the OCFS2 filesystem. After 'xm migrate --live'
> on the active node, the DomU is available on the other node.
> 
> Any help or examples very much appreciated!
> jf
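
The manual check described in the quoted message could be sketched as a small
shell script, roughly as below. This is only an illustration under assumptions:
the DRBD resource "r1", the volume group "xen_vg" and the domain "xen1" come
from the post, but the logical volume device, the mount point and the target
node name are hypothetical placeholders. By default the commands are only
printed, not executed.

```shell
#!/bin/sh
# Sketch of the manual live-migration check from the quoted message.
# Names taken from the post: r1, xen_vg, xen1.
# Hypothetical placeholders (NOT in the post): LV_DEV, MOUNT_POINT, TARGET_NODE.

DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = execute them
DRBD_RES="r1"
VG="xen_vg"
DOMU="xen1"
LV_DEV="${LV_DEV:-/dev/xen_vg/EXAMPLE_LV}"
MOUNT_POINT="${MOUNT_POINT:-/EXAMPLE/mountpoint}"
TARGET_NODE="${TARGET_NODE:-EXAMPLE-target-node}"

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# To be called on the passive node: bring the storage stack up by hand.
prepare_passive_node() {
    run drbdadm primary "$DRBD_RES"                 # promote the drbd device
    run vgscan                                      # scan for the volume group
    run lvchange -ay "$VG"                          # activate its logical volume(s)
    run mount -t ocfs2 "$LV_DEV" "$MOUNT_POINT"     # mount the OCFS2 filesystem
}

# To be called on the active node: live migrate the DomU.
migrate_from_active_node() {
    run xm migrate --live "$DOMU" "$TARGET_NODE"
}
```

Sourcing the file and calling prepare_passive_node on the passive node, then
migrate_from_active_node on the active node, prints the sequence. Setting
DRY_RUN=0 would actually run it, which only makes sense with pacemaker out of
the picture and DRBD allowing two primaries.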


I have tried it myself, but concluded it's impossible to do it reliably with the current code.
For live migration to work, you have to remove every colocation constraint (group membership included) involving the Xen resource.
DRBD ships a "helper" script, /etc/xen/scripts/block-drbd, but it can't be used in a pacemaker environment
because it is not cluster-aware, and pacemaker does not handle this scenario at the moment:
when Xen on drbd is stopped, both drbd nodes are secondary, which makes pacemaker "unhappy".
You need both drbd nodes to be primary only during the migration,
but if you specify master-max="2", both drbd nodes are primary all the time, which is a disaster waiting to happen.

Cheers,
Vadym





