[Pacemaker] Xen, Pacemaker and live migration
Frank Meier
frank.meier at hr-group.de
Tue Jan 24 10:00:25 UTC 2012
Hi Andrew,
I adapted your config to my needs (among other things, without ocfs2/fs-xen) and
it live-migrates like a charm.
Thanks a lot.
Kind regards
Frank Meier
UNIX-Basis
Hamm Reno Group GmbH
On 23.01.2012 23:04, Daugherity, Andrew W wrote:
>> Date: Mon, 23 Jan 2012 07:24:58 +0100
>> From: Frank Meier <frank.meier at hr-group.de>
>> To: The Pacemaker cluster resource manager
>> <pacemaker at oss.clusterlabs.org>
>> Subject: Re: [Pacemaker] Xen, Pacemaker and live migration
>> Message-ID: <4F1CFD3A.7020200 at hr-group.de>
>> Content-Type: text/plain; charset="ISO-8859-1"
>>
>> Hi,
>>
>> there is a clvm:0 on the first node and a clvm:1 on the second node. So
>> it's OK, isn't it?
>
> It's OK in the sense that it's running on every node, yes, but it also hints at the problem. I encountered this too when setting up my Xen cluster.
>
> With a colocation directive like yours:
>>> colocation VM1WithLVM1forVM1 inf: VM1 LVMforVM1
>
> VM1 will get colocated with LVMforVM1:0 (for example), then the migration attempt will check for LVMforVM1:0 on the other node, and not find it (because it's LVMforVM1:1 there). Pacemaker will then bail on live migration, stop the VM, and then start it on the target node, where it will now be colocated with LVMforVM1:1.
>
> For the same reason, you don't want to exclusively open the LV or VG.
>
> The solution is to use an order constraint instead of colocation. The VM doesn't need to be running with a specific cLVM instance, just started after cLVM.
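>
> Applied to your configuration, that would look roughly like the following. This is only a sketch: the constraint ID is made up, and I'm assuming VM1 and LVMforVM1 are the resource names from your colocation line above.
> ====
> # instead of: colocation VM1WithLVM1forVM1 inf: VM1 LVMforVM1
> order o-LVMforVM1-VM1 inf: LVMforVM1 VM1
> ====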
>
> Also, as others have mentioned, using groups will simplify your configuration, and you need to make sure your migrate_to timeout is long enough (migrate_from just checks that the VM is running on the target, and should complete nearly instantly).
>
> For example, I have:
> ====
> primitive clvm ocf:lvm2:clvmd \
>         params daemon_timeout="30" \
>         op start interval="0" timeout="90" \
>         op stop interval="0" timeout="100"
> primitive clvm-xenvg ocf:heartbeat:LVM \
>         params volgrpname="xen_san"
> primitive cmirror ocf:lvm2:cmirrord \
>         params daemon_timeout="30" \
>         op start interval="0" timeout="90" \
>         op stop interval="0" timeout="100"
> primitive dlm ocf:pacemaker:controld \
>         op start interval="0" timeout="90" \
>         op stop interval="0" timeout="100"
> primitive fs-xen ocf:heartbeat:Filesystem \
>         params device="/dev/xen_san/meta" directory="/mnt/xen" fstype="ocfs2" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60" \
>         op monitor interval="20" timeout="40"
> primitive o2cb ocf:ocfs2:o2cb \
>         op start interval="0" timeout="90" \
>         op stop interval="0" timeout="100"
>
> primitive vm-webdev ocf:heartbeat:Xen \
>         params xmfile="/mnt/xen/vm/webdev" \
>         meta allow-migrate="true" target-role="Started" is-managed="true" \
>         utilization cores="2" mem="1024" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60" \
>         op migrate_to interval="0" timeout="180" \
>         op monitor interval="30" timeout="30" start-delay="60"
> (etc.)
>
> group clvm-glue dlm clvm o2cb cmirror \
>         meta target-role="started"
> group xen-vg-fs clvm-xenvg fs-xen \
>         meta target-role="started"
> clone c-clvm-glue clvm-glue \
>         meta interleave="true" ordered="true"
> clone c-xen-vg-fs xen-vg-fs \
>         meta interleave="true" ordered="true"
> colocation colo-clvmglue-xenvgfs inf: c-xen-vg-fs c-clvm-glue
>
>
> order o-clvmglue-xenvgfs inf: c-clvm-glue c-xen-vg-fs
>
> order o-webdev inf: c-xen-vg-fs vm-webdev
> (etc.)
> ====
> Each Xen VM resource has a corresponding order constraint starting it after the clvm VG is active. The only reason I split the CLVM into two groups is so I could stop my fs-xen resource (an ocfs2 filesystem, stored on clvm, where I store my Xen config files and lock files) without stopping clvmd entirely. This is important if I ever have to unmount and fsck it.
>
>
> Andrew Daugherity
> Systems Analyst
> Division of Research, Texas A&M University
>
>