[Pacemaker] VirtualDomain Shutdown Timeout
Andrew Martin
amartin at xes-inc.com
Thu Mar 29 13:25:51 UTC 2012
Hi Andrew,
Thanks, that sounds good. I am using the Ubuntu HA PPA, so I will wait for a 1.1.7 package to become available.
Andrew
----- Original Message -----
From: "Andrew Beekhof" <andrew at beekhof.net>
To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
Sent: Thursday, March 29, 2012 1:08:21 AM
Subject: Re: [Pacemaker] VirtualDomain Shutdown Timeout
On Sun, Mar 25, 2012 at 6:27 AM, Andrew Martin <amartin at xes-inc.com> wrote:
> Hello,
>
> I have configured a KVM virtual machine primitive using Pacemaker 1.1.6 and
> Heartbeat 3.0.5 on Ubuntu 10.04 Server using DRBD as the storage device (so
> there is no shared storage, no live-migration):
> primitive p_vm ocf:heartbeat:VirtualDomain \
>     params config="/vmstore/config/vm.xml" \
>     meta allow-migrate="false" \
>     op start interval="0" timeout="180s" \
>     op stop interval="0" timeout="120s" \
>     op monitor interval="10" timeout="30"
>
> I would expect the following events to happen on failover on the "from" node
> (the migration source) if the VM hangs while shutting down:
> 1. VirtualDomain issues "virsh shutdown vm" to shut the VM down gracefully
> 2. pacemaker waits up to 120 seconds, the timeout specified in the "op stop"
> line
> 3. VirtualDomain waits a bit less than 120 seconds to see whether the VM
> shuts down gracefully. Once it gets to almost 120 seconds, it issues "virsh
> destroy vm" to hard stop the VM.
> 4. pacemaker wakes up from the 120 second timeout, sees that the VM has
> stopped, and proceeds with the failover
>
> However, I observed that VirtualDomain seems to be using the timeout from
> the "op start" line, 180 seconds, yet pacemaker uses the 120 second timeout.
> Thus, the VM is still running after the pacemaker timeout is reached and so
> the node is STONITHed. Here is the relevant section of code from
> /usr/lib/ocf/resource.d/heartbeat/VirtualDomain:
> VirtualDomain_Stop() {
>     local i
>     local status
>     local shutdown_timeout
>     local out ex
>
>     VirtualDomain_Status
>     status=$?
>
>     case $status in
>         $OCF_SUCCESS)
>             if ! ocf_is_true $OCF_RESKEY_force_stop; then
>                 # Issue a graceful shutdown request
>                 ocf_log info "Issuing graceful shutdown request for domain ${DOMAIN_NAME}."
>                 virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}
>                 # The "shutdown_timeout" we use here is the operation
>                 # timeout specified in the CIB, minus 5 seconds
>                 shutdown_timeout=$(( $NOW + ($OCF_RESKEY_CRM_meta_timeout/1000) -5 ))
>                 # Loop on status until we reach $shutdown_timeout
>                 while [ $NOW -lt $shutdown_timeout ]; do
>
> Doesn't $OCF_RESKEY_CRM_meta_timeout correspond to the timeout value in the
> "op stop ..." line?
It should; however, there was a bug in 1.1.6 where this wasn't the case.
The relevant patch is:
https://github.com/beekhof/pacemaker/commit/fcfe6fe
Or you could try 1.1.7.
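
To make the arithmetic concrete, here is a minimal shell sketch of the deadline calculation quoted above. The helper names graceful_window and fits_stop_timeout are made up for illustration and are not part of the VirtualDomain agent; the only thing taken from the agent is the "timeout in milliseconds / 1000, minus 5" formula.

```shell
#!/bin/sh
# Hypothetical helper (not part of the VirtualDomain agent): number of
# seconds the agent polls for a graceful shutdown, given the operation
# timeout it sees in milliseconds.
graceful_window() {
    echo $(( $1 / 1000 - 5 ))
}

# Hypothetical check: does the agent's polling window end before
# Pacemaker's own stop timeout expires? Both arguments are in ms.
fits_stop_timeout() {
    [ "$(graceful_window "$1")" -lt $(( $2 / 1000 )) ]
}

graceful_window 120000   # agent sees the stop timeout: prints 115
graceful_window 180000   # agent sees the start timeout: prints 175
```

With the 120s stop timeout the agent would stop waiting after 115 seconds, inside Pacemaker's limit. If it is handed 180000 instead, it would keep waiting until 175 seconds, well past the 120 seconds Pacemaker allows, which matches the STONITH described above.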
>
> How can I optimize my pacemaker configuration so that the VM will attempt to
> shut down gracefully and then, in the worst case, be destroyed before the
> pacemaker timeout is reached? Moreover, is there anything I can do inside of
> the VM (another Ubuntu 10.04 install) to optimize/speed up the shutdown
> process?
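
The graceful-then-forced pattern asked about here can be sketched in plain shell as follows. This is an illustration only, not the agent's actual code: request_stop, is_running and force_stop are hypothetical callbacks standing in for whatever stops and inspects the guest, and the budget is assumed to be chosen a few seconds shorter than the "op stop" timeout.

```shell
#!/bin/sh
# Sketch only: request_stop, is_running and force_stop are hypothetical
# callbacks that the caller must define before using this function.
stop_with_deadline() {
    budget=$1                        # seconds allowed for the graceful attempt
    request_stop                     # ask the guest to shut down cleanly
    waited=0
    while [ "$waited" -lt "$budget" ]; do
        is_running || return 0       # guest stopped on its own, nothing to force
        sleep 1
        waited=$(( waited + 1 ))
    done
    force_stop                       # graceful window used up, hard stop
}
```

Called as "stop_with_deadline 115" alongside a 120 second stop timeout, the forced stop would be issued with roughly five seconds to spare, provided the callbacks themselves return quickly.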
>
> Thanks,
>
> Andrew
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>