[Pacemaker] migration fix for ocf:heartbeat:Xen
Daugherity, Andrew W
adaugherity at tamu.edu
Tue Aug 23 16:54:35 CET 2011
> Message: 7
> Date: Thu, 11 Aug 2011 21:07:00 +0000
> From: "Daugherity, Andrew W" <adaugherity at tamu.edu>
> To: "pacemaker at oss.clusterlabs.org" <pacemaker at oss.clusterlabs.org>
> Subject: [Pacemaker] migration fix for ocf:heartbeat:Xen
> Message-ID: <93B5E618-AD19-4993-8066-CB4F8E4EF322 at tamu.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> I have discovered that when migrating a VM, the migration itself sometimes succeeds but the migrate_from call on the target node fails, apparently because the status hasn't settled down yet. This is more likely to happen when stopping pacemaker on a node, which causes all of its VMs to migrate away: the migration succeeds, but then (sometimes) the status call in migrate_from fails, and the VM is unnecessarily stopped and restarted. Note that it is NOT a timeout problem, as the migrate_from operation (which only checks status) takes less than a second.
>
> I noticed that the VirtualDomain RA polls the status in a loop rather than checking it just once, as the Xen RA does, so I patched a similar loop into the Xen RA, and that solved my problem.
(patch/logs snipped)
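For readers without the original patch at hand, the general idea of polling the status instead of checking it once can be sketched roughly as below. This is only an illustration under stated assumptions, not the snipped patch and not code from either RA: the function name wait_for_status and its three arguments are hypothetical, and a real resource agent would call its own status function and rely on the operation timeout set in the cluster configuration.

```shell
#!/bin/sh
# Hypothetical sketch: retry a status check until it succeeds or a
# retry limit is reached. All names here are illustrative only.

wait_for_status() {
    status_cmd=$1   # command or function returning 0 once the VM is seen as running
    max_tries=$2    # upper bound on the number of status checks
    delay=$3        # seconds to sleep between checks

    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if $status_cmd; then
            return 0    # status has settled: report success
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1            # status never settled within the limit: report failure
}
```

A migrate_from handler built on this sketch would call something like `wait_for_status my_status_check 10 1` (with my_status_check standing in for the agent's real status function) and only report failure once the loop gives up, so a briefly unsettled status right after migration no longer triggers a stop/start.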
No comments? What does it take to get this patch accepted? I'd much rather use the mainline version than have to reapply my patch after every HAE update. I guess I could open an SR with Novell, but this is ultimately an upstream issue.
Andrew Daugherity
Systems Analyst
Division of Research, Texas A&M University
adaugherity at tamu.edu