[Pacemaker] How to delay first monitor op upon resource start?

Andrew Beekhof andrew at beekhof.net
Mon Mar 17 19:05:06 EDT 2014


On 14 Mar 2014, at 7:14 am, David Vossel <dvossel at redhat.com> wrote:

> ----- Original Message -----
>> From: "Gianluca Cecchi" <gianluca.cecchi at gmail.com>
>> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> Sent: Thursday, March 13, 2014 12:00:16 PM
>> Subject: [Pacemaker] How to delay first monitor op upon resource start?
>> 
>> Hello,
>> I have some init based scripts that I configure as lsb resources.
>> They are java based (in this case ovirt-engine and
>> ovirt-websocket-proxy from oVirt project) and they are started through
>> the rhel "daemon" function.
>> Basically it needs a few seconds before the scripts exit and the
>> status option returns ok.
>> So most of the time, when they are used as resources in pacemaker, their start is
>> registered as FAILED because the "status" call happens too quickly.
>> In the meantime I solved the problem by putting a "sleep 5" before the
>> exit, but I would like to know if I can set a resource or cluster
>> parameter so that the first status monitor after start is delayed.
>> That way I don't need to ask the maintainer to change the script, and
>> I don't need to remember to re-modify the script after every update.
> 
> 
> This is a problem with the LSB script. No script that pacemaker manages should ever return from "start" until "status" passes. "status" passing should be a condition for "start" passing. You should add a loop at the end of the "start" function that waits for "status" to pass before returning.
> 
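The loop described above could look roughly like the following. This is only a sketch: `wait_for_status` is a hypothetical helper name, and the 30-second limit and the way "status" is invoked are assumptions that depend on the init script in question.

```shell
#!/bin/sh
# Hypothetical helper (not part of any real init script):
# run a status command once per second until it succeeds or the limit is reached.
# Usage: wait_for_status LIMIT_SECONDS COMMAND [ARGS...]
wait_for_status() {
    _limit=$1
    shift
    _elapsed=0
    while :; do
        # The status command passing is the condition for "start" passing.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Give up once the limit is reached so "start" reports a failure.
        if [ "$_elapsed" -ge "$_limit" ]; then
            return 1
        fi
        sleep 1
        _elapsed=$((_elapsed + 1))
    done
}

# Assumed use at the end of an init script's start function:
#   start() {
#       ...existing start logic...
#       wait_for_status 30 "$0" status
#   }
```

With this shape, "start" returns non-zero if "status" never passes within the limit, so pacemaker sees a genuine start failure instead of a monitor that fired too early.
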
> With that said... there is a way to delay the monitor operation in pacemaker like you want. This is a terrible idea, I don't recommend it, and I don't guarantee it won't get deprecated entirely someday.

As much as I'd like to, pragmatism will probably get in the way.
There will always be broken scripts that say "I'm started" before they're actually started :-(

> The option is called 'start-delay' and you set it within the monitor operation section (the same place interval and timeout are set). Set that option to the number of milliseconds you want to delay the operation execution.

I thought it was in seconds. Write it in the form "10s" to be sure.
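
For illustration, setting it with an explicit unit might look like this. It is a sketch that assumes the cluster is managed with pcs; the resource name and the interval and timeout values are placeholders, and only 'start-delay' on the monitor operation comes from this thread.

```shell
# Assumed example: create the lsb resource with a monitor op whose first run is delayed.
# "ovirt-engine", interval=30s and timeout=30s are illustrative values only.
pcs resource create ovirt-engine lsb:ovirt-engine \
    op monitor interval=30s timeout=30s start-delay=10s
```

Writing the value as "10s" avoids depending on which unit a bare number is interpreted in.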

> 
> -- Vossel
> 
>> Another option would be to try the status after start more than once,
>> so that even if the first attempt is not ok, the second
>> one is....
>> 
>> Thanks in advance,
>> Gianluca
>> 
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>> 
> 
