[Pacemaker] What exactly happens when a node joins a cluster?

Thomas Baumann bt047265 at gmail.com
Sun Mar 20 15:02:20 EDT 2011


Hello,

I changed my configuration to cloned resources. There was one clone that I did
not want to start on 2 of my nodes.

To get rid of the not-so-nice output, I added a script that always returns 0:

#!/bin/ksh
# dummy resource script: report success for every action
exit 0
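
For reference, a slightly more explicit variant of such a dummy script could
look like the one below. This is only a sketch, not the script actually used
here; in particular, whether "status" should really return 0 on nodes where
nothing runs is debatable, since an LSB-compliant script would return 3 there
so that cluster probes see the resource as stopped:

#!/bin/ksh
# dummy LSB-style resource script: succeed for every known action
case "$1" in
    start|stop|restart|status)
        exit 0
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 0
        ;;
esac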

regards,

Thomas

[root@m-lab-prx-as-1 corosync]# crm configure show
node m-lab-prx-as-1
node m-lab-prx-as-2
node m-lab-prx-lb-1
node m-lab-prx-lb-2
primitive ping_gw1-primitive ocf:pacemaker:ping \
        params dampen="5s" multiplier="100" host_list="10.12.18.254" \
        op monitor interval="30s"
primitive ping_gw2-primitive ocf:pacemaker:ping \
        params dampen="5s" multiplier="100" host_list="10.12.19.254" \
        op monitor interval="30s"
primitive res_httpd ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" httpd="/usr/sbin/httpd" \
        op monitor interval="60"
primitive res_kamailio-primitive lsb:kamailio \
        op monitor interval="60s" role="Started" timeout="30s" on-fail="restart"
primitive res_mysqld-primitive lsb:mysqld \
        op monitor interval="60s" role="Started" timeout="30s" on-fail="restart"
primitive res_npupdate-primitive lsb:npupdate \
        op monitor interval="60s" role="Started" timeout="30s" on-fail="restart"
primitive sysinfo-primitive ocf:heartbeat:SysInfo \
        op monitor interval="60s" timeout="20s"
clone httpd res_httpd
clone kamailio res_kamailio-primitive
clone mysqld res_mysqld-primitive
clone npupdate res_npupdate-primitive
clone ping_gw1 ping_gw1-primitive
clone ping_gw2 ping_gw2-primitive
clone sysinfo sysinfo-primitive
location loc_1 npupdate -inf: m-lab-prx-as-1
location loc_2 npupdate -inf: m-lab-prx-as-2
order order_1 : ( mysqld ) ( kamailio npupdate )
property $id="cib-bootstrap-options" \
        dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="4" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        symmetric-cluster="true" \
        resource-stickiness="1"
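
To double-check that the npupdate clone really stays off m-lab-prx-as-1 and
m-lab-prx-as-2 with the -inf location constraints above, the standard status
tools should be enough (exact output differs between Pacemaker versions):

# one-shot cluster overview, shows where each clone instance is running
crm_mon -1

# ask Pacemaker directly on which nodes a given resource is active
crm_resource --resource npupdate --locate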


On Thu, Mar 10, 2011 at 12:15 PM, Andrew Beekhof <andrew at beekhof.net> wrote:
> On Tue, Mar 1, 2011 at 11:22 PM, Thomas Baumann <bt047265 at gmail.com> wrote:
>> Hello,
>>
>> There are 2 groups with identical resources, and each group is assigned to
>> its own node. Everything is fine on the first startup, but as soon as a
>> node stops or starts, the trouble begins.
>>
>> I guess my problem is the more or less identical resources; it looks
>> like when a node joins, the resources are started and then, to be sure,
>> stopped on the other nodes.
>
> They might look started, but Pacemaker didn't do it.
>
> When a node joins, we ask it the status of _every_ configured resource
> - to be sure nothing was already running.
> You're using lsb scripts though, so starting group_lb1 on a node makes
> it look like group_lb2 is running too.
> There's also a fair chance you've got these services being started
> outside of the cluster when the node boots - also very bad.
>
> Remove group_lb2 and clone group_lb1 instead.
>
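Both of those points are easy to check by hand: because group_lb1 and
group_lb2 wrap the same init scripts, an LSB "status" call cannot tell the
two instances apart, and any service still enabled at boot will be found
running by the probes. On a chkconfig-based system (which the paths in the
config at the top suggest), the check and the fix would look roughly like
this:

# the same lsb script answers for group_lb1 and group_lb2, so "status"
# reports success as soon as either instance has started the service
service kamailio status; echo $?
service mysqld status; echo $?

# make sure the init scripts are not started outside the cluster at boot
chkconfig --list | egrep 'kamailio|mysqld'
chkconfig kamailio off
chkconfig mysqld off
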
>>
>> There are 2 load balancers with 2 application servers connected. If
>> everything is running, application server 1 uses the database of
>> load balancer 1 (as2 uses lb2). If load balancer 1 is not running, I need
>> to start application server 1 with a different configuration, so that it
>> uses the database of load balancer 2. I can't use virtual IPs; I need to
>> reconfigure the application servers.
>>
>> regards,
>>
>> Thomas
>>
>> Attached my configuration:
>>
>> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="1"
>>      dc-uuid="m-lab-prx-lb-1" admin_epoch="1" epoch="2900" num_updates="0"
>>      cib-last-written="Fri Feb 25 16:04:35 2011">
>>   <configuration>
>>     <crm_config>
>>       <cluster_property_set id="cib-bootstrap-options">
>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3"/>
>>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="openais"/>
>>         <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="4"/>
>>         <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
>>         <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
>>       </cluster_property_set>
>>     </crm_config>
>>     <nodes>
>>       <node id="m-lab-prx-lb-1" uname="m-lab-prx-lb-1" type="normal"/>
>>       <node id="m-lab-prx-lb-2" uname="m-lab-prx-lb-2" type="normal"/>
>>       <node id="m-lab-prx-as-1" uname="m-lab-prx-as-1" type="normal"/>
>>       <node id="m-lab-prx-as-2" uname="m-lab-prx-as-2" type="normal"/>
>>     </nodes>
>>     <resources>
>>       <group id="group_lb1">
>>         <primitive id="res_mysqld_lb1-primitive" class="lsb" type="mysqld">
>>           <operations>
>>             <op id="op_mysqld_lb1" name="monitor" interval="60s" timeout="30s" on-fail="restart"/>
>>           </operations>
>>         </primitive>
>>         <primitive id="res_ser_lb1-primitive" class="lsb" type="kamailio">
>>           <operations>
>>             <op id="op_ser_lb1" name="monitor" interval="60s" timeout="30s" on-fail="restart"/>
>>           </operations>
>>         </primitive>
>>       </group>
>>       <group id="group_lb2">
>>         <primitive id="res_mysqld_lb2-primitive" class="lsb" type="mysqld">
>>           <operations>
>>             <op id="op_mysqld_lb2" name="monitor" interval="60s" timeout="30s" on-fail="restart"/>
>>           </operations>
>>         </primitive>
>>         <primitive id="res_ser_lb2-primitive" class="lsb" type="kamailio">
>>           <operations>
>>             <op id="op_ser_lb2" name="monitor" interval="60s" timeout="30s" on-fail="restart"/>
>>           </operations>
>>         </primitive>
>>       </group>
>>     </resources>
>>     <constraints>
>>       <rsc_location id="loc_lb1" rsc="group_lb1">
>>         <rule id="rule_loc_lb1" score="-INFINITY">
>>           <expression attribute="#uname" id="expression_loc_lb1" operation="ne" value="m-lab-prx-lb-1"/>
>>         </rule>
>>       </rsc_location>
>>       <rsc_location id="loc_lb2" rsc="group_lb2">
>>         <rule id="rule_loc_lb2" score="-INFINITY">
>>           <expression attribute="#uname" id="expression_loc_lb2" operation="ne" value="m-lab-prx-lb-2"/>
>>         </rule>
>>       </rsc_location>
>>     </constraints>
>>     <rsc_defaults/>
>>     <op_defaults/>
>>   </configuration>
>> </cib>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>
>



