[Pacemaker] Master/Slave not failing over

Thu Jun 24 18:43:45 EDT 2010

Thanks for pointing that out.

I am still having issues with the master/slave resource. When I cause one of the monitoring actions to fail, the master node gets a DEMOTE, STOP, START, PROMOTE sequence while the slave resource just sits there. What I want to see is DEMOTE on the failed master, then PROMOTE on the slave, then STOP on the failed master, followed by START on the failed master. How can I achieve this? Is there a constraint or some other setting I can put in place to make it happen?

Thanks again for any insights.

Eliot Gable
Senior Product Developer
1228 Euclid Ave, Suite 390
Cleveland, OH 44115

Direct: 216-373-4808
Fax: 216-373-4657
egable at broadvox.net

CONFIDENTIAL COMMUNICATION.  This e-mail and any files transmitted with it are confidential and are intended solely for the use of the individual or entity to whom it is addressed. If you are not the intended recipient, please call me immediately.  BROADVOX is a registered trademark of Broadvox, LLC.

-----Original Message-----
From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm]
Sent: Thursday, June 24, 2010 12:37 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Master/Slave not failing over

Hi,

On Thu, Jun 24, 2010 at 12:12:34PM -0400, Eliot Gable wrote:
> On another note, I cannot seem to get Pacemaker to monitor the master node. It monitors the slave node just fine. These are the operations I have defined:
>
>         op monitor interval="5" timeout="30s" \
>         op monitor interval="10" timeout="30s" OCF_CHECK_LEVEL="10" \
>         op monitor interval="5" role="Master" timeout="30s" \
>         op monitor interval="10" role="Master" timeout="30s" OCF_CHECK_LEVEL="10" \
>         op start interval="0" timeout="40s" \
>         op stop interval="0" timeout="20s"
>
> Did I do something wrong?

Yes, all monitor intervals on a resource have to be different. I
don't know what happened without looking at the logs, but you
should set something like this:

         op monitor interval="6" role="Master" timeout="30s" \
         op monitor interval="11" role="Master" timeout="30s" OCF_CHECK_LEVEL="10" \
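
For reference, a sketch of how the complete operation list might look
with these two lines merged into the definitions quoted above. Only the
Master intervals (6 and 11) differ from the original; every other value
is copied unchanged, and the enclosing primitive/ms definition is left
out because it was not posted in this thread:

        op monitor interval="5" timeout="30s" \
        op monitor interval="10" timeout="30s" OCF_CHECK_LEVEL="10" \
        op monitor interval="6" role="Master" timeout="30s" \
        op monitor interval="11" role="Master" timeout="30s" OCF_CHECK_LEVEL="10" \
        op start interval="0" timeout="40s" \
        op stop interval="0" timeout="20s"

With this layout no two monitor operations share an interval, so each
one can be scheduled as a distinct recurring action.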

Thanks,

Dejan

>
> From: Eliot Gable [mailto:egable at broadvox.com]
> Sent: Thursday, June 24, 2010 11:55 AM
> To: The Pacemaker cluster resource manager
> Subject: [Pacemaker] Master/Slave not failing over
>
> I am using the latest CentOS 5.5 packages for pacemaker/corosync. I have a master/slave resource up and running, and when I make the master fail, instead of immediately promoting the slave, it restarts the failed master and promotes it back to master. This takes longer than immediately promoting the slave would. I can understand it waiting for a DEMOTE action to succeed on the failed master before it promotes the slave, but that should be all it needs to wait for. Is there any way I can change this behavior? Am I missing some key point in the process?
>
> ________________________________
> CONFIDENTIAL. This e-mail and any attached files are confidential and should be destroyed and/or returned if you are not the intended and proper recipient.

> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
