[Pacemaker] Question regarding starting of master/slave resources and ELECTIONs
Bob Schatz
bschatz at yahoo.com
Wed Apr 13 17:19:50 UTC 2011
Andrew,
Thanks for responding. Comments inline, marked with <Bob>.
________________________________
From: Andrew Beekhof <andrew at beekhof.net>
To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
Cc: Bob Schatz <bschatz at yahoo.com>
Sent: Tue, April 12, 2011 11:23:14 PM
Subject: Re: [Pacemaker] Question regarding starting of master/slave resources
and ELECTIONs
On Wed, Apr 13, 2011 at 4:54 AM, Bob Schatz <bschatz at yahoo.com> wrote:
> Hi,
> I am running Pacemaker 1.0.9 with Heartbeat 3.0.3.
> I create 5 master/slave resources in /etc/ha.d/resource.d/startstop during
> post-start.
I had no idea this was possible. Why would you do this?
<Bob> We, and a couple of other companies I know of, bundle Linux-HA/Pacemaker
into an appliance. In our case, when the appliance boots, it creates HA resources
based on the hardware it discovers. I assumed that once POST-START was called
in the startstop script and we have a DC, the cluster is up and running. I
then use "crm" commands to create the configuration, etc. I further assumed
that, since there is one DC in the cluster, all "crm" commands which modify
the configuration would be ordered, even if the DC fails over to a different
node. Is this incorrect?
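For illustration, a boot-time provisioning step along these lines might look like the sketch below. The readiness check and the discovered-device variable are assumptions, not Bob's actual script; the resource definition mirrors the configuration quoted later in the message.

```shell
#!/bin/sh
# Hypothetical sketch of appliance boot-time provisioning.
# Assumptions: crmadmin/crm from Pacemaker 1.0 are on PATH, and
# DEV_ID stands in for whatever hardware discovery actually finds.

# Wait until the local crmd reports a settled state (DC elected).
while ! crmadmin -S "$(uname -n)" 2>/dev/null | grep -q 'S_IDLE\|S_NOT_DC'; do
    sleep 2
done

DEV_ID="J000030312"   # discovered from hardware in the real script

# Create the primitive and its master/slave wrapper via the crm shell.
crm configure primitive "SS${DEV_ID}" ocf:omneon:ss \
    params ss_resource="SS${DEV_ID}" \
           ssconf="/var/omneon/config/config.${DEV_ID}" \
    op monitor interval="3s" role="Master" timeout="7s" \
    op monitor interval="10s" role="Slave" timeout="7s" \
    op stop interval="0" timeout="20" \
    op start interval="0" timeout="300"

crm configure ms "ms-SS${DEV_ID}" "SS${DEV_ID}" \
    meta clone-max="2" notify="true" globally-unique="false"
```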
> I noticed that 4 of the master/slave resources start right away, but the
> 5th master/slave resource seems to take a minute or so, and I am only running
> with one node.
> Is this expected?
Probably, if the other 4 take around a minute each to start.
There is an lrmd config variable that controls how much parallelism it
allows (but I forget the name).
<Bob> It's max-children, and I set it to 40 for this test to see if it would
change the behavior (/sbin/lrmadmin -p max-children 40).
> My configuration is below and I have also attached ha-debug.
> Also, what triggers a crmd election?
Node up/down events and whenever someone replaces the cib (which the
shell used to do a lot).
<Bob> For my test, I only started one node so that I could avoid node up/down
events. I believe the log shows the CIB being replaced. Since I am using crm,
I assume it must be due to crm. Do the crm_resource, etc. commands also
replace the CIB? Would using them avoid the elections caused by CIB replacement?
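As a sketch of the distinction being asked about (assuming the Pacemaker 1.0 command-line tools): targeted updates are shipped to the CIB as diffs, whereas a full replace is what Andrew describes the shell doing and what can trigger a new election.

```shell
# Targeted update: change a single instance attribute of one resource.
# Only the matched element is modified; no full CIB replacement occurs.
crm_resource --resource SSJ000030312 \
    --set-parameter ssconf \
    --parameter-value /var/omneon/config/config.J000030312

# By contrast, this replaces the entire CIB (the operation the crm shell
# used to perform internally, and a trigger for re-election):
cibadmin --replace --xml-file /tmp/new-cib.xml
```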
Thanks,
Bob
> I seemed to have a lot of elections in
> the attached log. I was assuming that on a single node I would only run the
> election once in the beginning and then there would not be another one until
> a new node joined.
>
> Thanks,
> Bob
>
> My configuration is:
> node $id="856c1f72-7cd1-4906-8183-8be87eef96f2" mgraid-s000030311-1
> primitive SSJ000030312 ocf:omneon:ss \
> params ss_resource="SSJ000030312" ssconf="/var/omneon/config/config.J000030312" \
> op monitor interval="3s" role="Master" timeout="7s" \
> op monitor interval="10s" role="Slave" timeout="7" \
> op stop interval="0" timeout="20" \
> op start interval="0" timeout="300"
> primitive SSJ000030313 ocf:omneon:ss \
> params ss_resource="SSJ000030313" ssconf="/var/omneon/config/config.J000030313" \
> op monitor interval="3s" role="Master" timeout="7s" \
> op monitor interval="10s" role="Slave" timeout="7" \
> op stop interval="0" timeout="20" \
> op start interval="0" timeout="300"
> primitive SSJ000030314 ocf:omneon:ss \
> params ss_resource="SSJ000030314" ssconf="/var/omneon/config/config.J000030314" \
> op monitor interval="3s" role="Master" timeout="7s" \
> op monitor interval="10s" role="Slave" timeout="7" \
> op stop interval="0" timeout="20" \
> op start interval="0" timeout="300"
> primitive SSJ000030315 ocf:omneon:ss \
> params ss_resource="SSJ000030315" ssconf="/var/omneon/config/config.J000030315" \
> op monitor interval="3s" role="Master" timeout="7s" \
> op monitor interval="10s" role="Slave" timeout="7" \
> op stop interval="0" timeout="20" \
> op start interval="0" timeout="300"
> primitive SSS000030311 ocf:omneon:ss \
> params ss_resource="SSS000030311" ssconf="/var/omneon/config/config.S000030311" \
> op monitor interval="3s" role="Master" timeout="7s" \
> op monitor interval="10s" role="Slave" timeout="7" \
> op stop interval="0" timeout="20" \
> op start interval="0" timeout="300"
> primitive icms lsb:S53icms \
> op monitor interval="5s" timeout="7" \
> op start interval="0" timeout="5"
> primitive mgraid-stonith stonith:external/mgpstonith \
> params hostlist="mgraid-canister" \
> op monitor interval="0" timeout="20s"
> primitive omserver lsb:S49omserver \
> op monitor interval="5s" timeout="7" \
> op start interval="0" timeout="5"
> ms ms-SSJ000030312 SSJ000030312 \
> meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030313 SSJ000030313 \
> meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030314 SSJ000030314 \
> meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSJ000030315 SSJ000030315 \
> meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> ms ms-SSS000030311 SSS000030311 \
> meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> clone Fencing mgraid-stonith
> clone cloneIcms icms
> clone cloneOmserver omserver
> location ms-SSJ000030312-master-w1 ms-SSJ000030312 \
> rule $id="ms-SSJ000030312-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030313-master-w1 ms-SSJ000030313 \
> rule $id="ms-SSJ000030313-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030314-master-w1 ms-SSJ000030314 \
> rule $id="ms-SSJ000030314-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSJ000030315-master-w1 ms-SSJ000030315 \
> rule $id="ms-SSJ000030315-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> location ms-SSS000030311-master-w1 ms-SSS000030311 \
> rule $id="ms-SSS000030311-master-w1-rule" $role="master" 100: #uname eq mgraid-s000030311-0
> order orderms-SSJ000030312 0: cloneIcms ms-SSJ000030312
> order orderms-SSJ000030313 0: cloneIcms ms-SSJ000030313
> order orderms-SSJ000030314 0: cloneIcms ms-SSJ000030314
> order orderms-SSJ000030315 0: cloneIcms ms-SSJ000030315
> order orderms-SSS000030311 0: cloneIcms ms-SSS000030311
> property $id="cib-bootstrap-options" \
> dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
> cluster-infrastructure="Heartbeat" \
> dc-deadtime="5s" \
> stonith-enabled="true"
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>
>