[Pacemaker] [PATCH] change timeouts, startup behaviour ocf:heartbeat:ManageVE (OpenVZ VE cluster resource)

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Mar 13 12:18:38 EDT 2013


On Tue, Mar 12, 2013 at 12:58:44PM +0000, Tim Small wrote:
> The attached patch changes the behaviour of the OpenVZ virtual machine
> cluster resource agent, so that:
> 
> 1. The default resource stop timeout is greater than the hardcoded

Just for the record: where exactly is this hardcoded? Is it
also documented?

> timeout in "vzctl stop" (after this time, vzctl forcibly stops the
> virtual machine) (since failure to stop a resource can lead to the
> cluster node being evicted from the cluster entirely - and this is
> generally a BAD thing).

Agreed.
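
To make the constraint concrete, here is a minimal sketch of the
check, assuming the vzctl grace period is known. The function name,
the margin and the 120s figure are placeholders for illustration,
not values taken from vzctl or from the patch.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the cluster stop timeout leaves
# headroom beyond the period vzctl waits before forcibly stopping the VE.
# usage: stop_timeout_ok <vzctl_grace_secs> <stop_timeout_secs> [margin_secs]
stop_timeout_ok() {
    grace=$1
    stop_timeout=$2
    margin=${3:-10}   # assumed safety margin, not from the patch
    [ "$stop_timeout" -ge $((grace + margin)) ]
}

# Placeholder grace period of 120s (an assumption), checked against the
# old (75s) and proposed (150s) default stop timeouts.
for t in 75 150; do
    if stop_timeout_ok 120 "$t"; then
        echo "stop timeout ${t}s: ok"
    else
        echo "stop timeout ${t}s: shorter than vzctl's forced stop"
    fi
done
```

If the stop operation times out first, the cluster sees a failed stop
even though vzctl would have finished the job a little later.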

> 2. The start operation now waits for resource startup to complete i.e.
> for the VE to "boot up" (so that the cluster manager can detect VEs
> which are hanging on startup, and also throttle simultaneous startups,
> so as not to overburden the node in question).  Since the start
> operation now does a lot more, the default start operation timeout has
> been increased.

I'm not sure we can introduce this just like that. It
significantly changes the agent's behaviour.

BTW, how does vzctl know when the VE is started?
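
Whatever vzctl does internally, an agent that waits for startup needs
the wait to be bounded. A minimal sketch of such a bounded wait
follows, assuming some readiness probe exists; the helper name and the
marker-file probe are hypothetical stand-ins, not part of ManageVE or
vzctl.

```shell
#!/bin/sh
# Hypothetical helper: poll a readiness probe until it succeeds or a
# deadline passes.
# usage: wait_until <timeout_secs> <interval_secs> <probe command...>
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1          # probe never succeeded within the deadline
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Stand-in probe: wait up to 5s for a marker file that a background
# job creates after about 1s.
dir=$(mktemp -d)
( sleep 1; : > "$dir/ready" ) &
if wait_until 5 1 test -e "$dir/ready"; then
    echo "ready"
else
    echo "timed out"
fi
wait
rm -rf "$dir"
```

The cluster's own start timeout still applies on top of this; the
point is only that a hung VE turns into a clean failure rather than a
blocked agent.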

> 3. Backs off the default timeouts and intervals for various operations
> to less aggressive values.

Please make patches which are self-contained and can each be
described succinctly. If the description above matches the code
modifications, then there should be three patches instead of
one.

Please continue the discussion at linux-ha-dev; that's where RA
development discussions take place.

Cheers,

Dejan

> 
> Cheers,
> 
> Tim.
> 
> 
> n.b.  There is a bug in the Debian 6.0 (Squeeze) OpenVZ kernel such that
> "vzctl start <VEID> --wait" hangs.  The bug doesn't impact the
> OpenVZ.org kernels (and hence won't impact Debian 7.0 Wheezy either).
> 
> -- 
> South East Open Source Solutions Limited
> Registered in England and Wales with company number 06134732.  
> Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
> VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309
> 

> --- ManageVE.old	2010-10-22 05:54:50.000000000 +0000
> +++ ManageVE	2013-03-12 11:39:47.895102380 +0000
> @@ -26,12 +26,15 @@
>  #
>  #
>  # Created  07. Sep 2006
> -# Updated  18. Sep 2006
> +# Updated  12. Mar 2013
>  #
> -# rev. 1.00.3
> +# rev. 1.00.4
>  #
>  # Changelog
>  #
> +# 12/Mar/13 1.00.4 Wait for VE startup to finish, lengthen default start timeout.
> +#                  Default stop timeout to longer than the vzctl stop 'polite'
> +#                  interval.
>  # 12/Sep/06 1.00.3 more cleanup
>  # 12/Sep/06 1.00.2 fixed some logic in start_ve
>  #                  general cleanup all over the place
> @@ -67,7 +70,7 @@
>  <?xml version="1.0"?>
>  <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>  <resource-agent name="ManageVE">
> -  <version>1.00.3</version>
> +  <version>1.00.4</version>
>  
>    <longdesc lang="en">
>      This OCF complaint resource agent manages OpenVZ VEs and thus requires
> @@ -87,12 +90,12 @@
>    </parameters>
>  
>    <actions>
> -    <action name="start" timeout="75" />
> -    <action name="stop" timeout="75" />
> -    <action name="status" depth="0" timeout="10" interval="10" />
> -    <action name="monitor" depth="0" timeout="10" interval="10" />
> -    <action name="validate-all" timeout="5" />
> -    <action name="meta-data" timeout="5" />
> +    <action name="start" timeout="240" />
> +    <action name="stop" timeout="150" />
> +    <action name="status" depth="0" timeout="20" interval="60" />
> +    <action name="monitor" depth="0" timeout="20" interval="60" />
> +    <action name="validate-all" timeout="10" />
> +    <action name="meta-data" timeout="10" />
>    </actions>
>  </resource-agent>
>  END
> @@ -127,7 +130,7 @@
>      return $retcode
>    fi
>  
> -  $VZCTL start $VEID >& /dev/null
> +  $VZCTL start $VEID --wait >& /dev/null
>    retcode=$?
>  
>    if [[ $retcode != 0 && $retcode != 32 ]]; then
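
The condition in the last hunk treats two vzctl exit codes as success.
A minimal sketch of that mapping follows, assuming 32 is what vzctl
returns for a VE that is already running (worth checking against the
vzctl man page). The function name is hypothetical, and the two OCF
codes are set to their usual values only so the snippet stands alone.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Hypothetical helper: translate a "vzctl start" exit code into an OCF one.
start_rc_to_ocf() {
    case "$1" in
        0|32) return $OCF_SUCCESS ;;      # started, or (assumed) already running
        *)    return $OCF_ERR_GENERIC ;;  # anything else is a failed start
    esac
}

# 9 is just an arbitrary non-zero code used for illustration.
for rc in 0 32 9; do
    if start_rc_to_ocf "$rc"; then
        echo "vzctl rc $rc: success"
    else
        echo "vzctl rc $rc: failure"
    fi
done
```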





