[Pacemaker] Error: cluster is not currently running on this node
Miha
miha at softnet.si
Wed Aug 20 17:16:54 CEST 2014
tnx!
will do that and let you know.
miha
On 8/20/2014 5:03 PM, emmanuel segura wrote:
> Hi,
>
> You need to pass every cluster parameter to fence_bladecenter_snmp, so
> from sip2 you need to use the same attributes as in your fence_sip1
> config: "action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8
> passwd=soft1234". The command to run from sip2 to test your fencing is
> "fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -o
> status", and if the status is OK, then once you have scheduled downtime
> for your system you can try a reboot with "fence_bladecenter_snmp -a
> 172.30.0.2 -l snmp8 -p soft1234 -c test -o reboot".
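>
> Putting that together, a by-hand test sequence would look roughly like
> this (same address and credentials as above; the "-n 8" plug number is
> an addition taken from port=8 in the fence_sip1 attributes, since the
> status and reboot actions act on a specific blade):
>
>   # from sip2: check the agent can reach the BladeCenter and read
>   # the power state of blade 8 (sip1)
>   fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o status
>
>   # only in a scheduled downtime window: actually fence (reboot) sip1
>   fence_bladecenter_snmp -a 172.30.0.2 -l snmp8 -p soft1234 -c test -n 8 -o reboot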
>
> 2014-08-20 16:22 GMT+02:00 Miha <miha at softnet.si>:
>> ok, will do that. This will not affect sip2?
>>
>> sorry for my noob question but I must be careful as this is in production ;)
>>
>> So, "fence_bladecenter_snmp reboot" right?
>>
>> br
>> miha
>>
>> On 8/19/2014 11:53 AM, emmanuel segura wrote:
>>
>>> sorry,
>>>
>>> That was a typo; I meant: "try to power off sip1 by hand, using
>>> fence_bladecenter_snmp in your shell"
>>>
>>> 2014-08-19 11:17 GMT+02:00 Miha <miha at softnet.si>:
>>>> hi,
>>>>
>>>> What do you mean by "by had for poweroff sp1"? Do you mean power off server sip1?
>>>>
>>>> One thing also bothers me: why is the cluster service not running on
>>>> sip2 if the virtual IP and the other resources are still running properly?
>>>>
>>>> tnx
>>>> miha
>>>>
>>>>
>>>> On 8/19/2014 9:08 AM, emmanuel segura wrote:
>>>>
>>>>> Your config look ok, have you tried to use fence_bladecenter_snmp by
>>>>> had for poweroff sp1?
>>>>>
>>>>> http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/
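>>>>>
>>>>> (A quick by-hand check along those lines, as a sketch assuming the
>>>>> standard fence-agent interface: the agent also accepts its options as
>>>>> key=value pairs on stdin, the same way the cluster itself invokes it:)
>>>>>
>>>>>   # feed the same attributes used in the stonith resource to the agent
>>>>>   printf 'ipaddr=172.30.0.2\nlogin=snmp8\npasswd=soft1234\ncommunity=test\nport=8\naction=status\n' | fence_bladecenter_snmp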
>>>>>
>>>>> 2014-08-19 8:05 GMT+02:00 Miha <miha at softnet.si>:
>>>>>> sorry, here it is:
>>>>>>
>>>>>> <cluster config_version="9" name="sipproxy">
>>>>>>   <fence_daemon/>
>>>>>>   <clusternodes>
>>>>>>     <clusternode name="sip1" nodeid="1">
>>>>>>       <fence>
>>>>>>         <method name="pcmk-method">
>>>>>>           <device name="pcmk-redirect" port="sip1"/>
>>>>>>         </method>
>>>>>>       </fence>
>>>>>>     </clusternode>
>>>>>>     <clusternode name="sip2" nodeid="2">
>>>>>>       <fence>
>>>>>>         <method name="pcmk-method">
>>>>>>           <device name="pcmk-redirect" port="sip2"/>
>>>>>>         </method>
>>>>>>       </fence>
>>>>>>     </clusternode>
>>>>>>   </clusternodes>
>>>>>>   <cman expected_votes="1" two_node="1"/>
>>>>>>   <fencedevices>
>>>>>>     <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
>>>>>>   </fencedevices>
>>>>>>   <rm>
>>>>>>     <failoverdomains/>
>>>>>>     <resources/>
>>>>>>   </rm>
>>>>>> </cluster>
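>>>>>>
>>>>>> (A quick sanity check of that file, as a sketch assuming the usual
>>>>>> cman tooling on this stack:)
>>>>>>
>>>>>>   # validate cluster.conf against the cluster schema
>>>>>>   ccs_config_validate
>>>>>>
>>>>>>   # confirm both nodes have joined the fence domain
>>>>>>   fence_tool ls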
>>>>>>
>>>>>>
>>>>>> br
>>>>>> miha
>>>>>>
>>>>>> On 8/18/2014 11:33 AM, emmanuel segura wrote:
>>>>>>> Can you show your cman /etc/cluster/cluster.conf?
>>>>>>>
>>>>>>> 2014-08-18 7:08 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>> Hi Emmanuel,
>>>>>>>>
>>>>>>>> this is my config:
>>>>>>>>
>>>>>>>>
>>>>>>>> Pacemaker Nodes:
>>>>>>>> sip1 sip2
>>>>>>>>
>>>>>>>> Resources:
>>>>>>>> Master: ms_drbd_mysql
>>>>>>>>  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>>>>>>>  Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>>>>>>>>   Attributes: drbd_resource=clusterdb_res
>>>>>>>>   Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
>>>>>>>>               monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
>>>>>>>> Group: g_mysql
>>>>>>>>  Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>>>>>>>>   Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
>>>>>>>>   Meta Attrs: target-role=Started
>>>>>>>>  Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>>>>>>>>   Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>>>>>>>>  Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>>>>>>>>   Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe additional_parameters="--bind-address=212.13.249.55 --user=root"
>>>>>>>>   Meta Attrs: target-role=Started
>>>>>>>>   Operations: start interval=0 timeout=120s (p_mysql-start-0)
>>>>>>>>               stop interval=0 timeout=120s (p_mysql-stop-0)
>>>>>>>>               monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>>>>>>>> Clone: cl_ping
>>>>>>>>  Meta Attrs: interleave=true
>>>>>>>>  Resource: p_ping (class=ocf provider=pacemaker type=ping)
>>>>>>>>   Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>>>>>>>>   Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>>>>>>>>               start interval=0s timeout=60s (p_ping-start-0s)
>>>>>>>>               stop interval=0s (p_ping-stop-0s)
>>>>>>>> Resource: opensips (class=lsb type=opensips)
>>>>>>>>  Meta Attrs: target-role=Started
>>>>>>>>  Operations: start interval=0 timeout=120 (opensips-start-0)
>>>>>>>>              stop interval=0 timeout=120 (opensips-stop-0)
>>>>>>>>
>>>>>>>> Stonith Devices:
>>>>>>>> Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>>>>>>>>  Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
>>>>>>>>  Meta Attrs: target-role=Started
>>>>>>>> Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>>>>>>>>  Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
>>>>>>>>  Meta Attrs: target-role=Started
>>>>>>>> Fencing Levels:
>>>>>>>>
>>>>>>>> Location Constraints:
>>>>>>>>  Resource: ms_drbd_mysql
>>>>>>>>   Constraint: l_drbd_master_on_ping
>>>>>>>>    Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
>>>>>>>>     Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
>>>>>>>>     Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
>>>>>>>> Ordering Constraints:
>>>>>>>>  promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
>>>>>>>>  g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>>>>>>>> Colocation Constraints:
>>>>>>>>  g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
>>>>>>>>  opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>>>>>>>
>>>>>>>> Cluster Properties:
>>>>>>>> cluster-infrastructure: cman
>>>>>>>> dc-version: 1.1.10-14.el6-368c726
>>>>>>>> no-quorum-policy: ignore
>>>>>>>> stonith-enabled: true
>>>>>>>> Node Attributes:
>>>>>>>> sip1: standby=off
>>>>>>>> sip2: standby=off
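>>>>>>>>
>>>>>>>> (For reference, a stonith device like fence_sip1 would be created
>>>>>>>> with pcs roughly as follows. This is a sketch only; pcmk_host_list
>>>>>>>> is an extra parameter that is not in the config above, and tells
>>>>>>>> stonith-ng which node this particular device is able to fence:)
>>>>>>>>
>>>>>>>>   pcs stonith create fence_sip1 fence_bladecenter_snmp \
>>>>>>>>     ipaddr=172.30.0.2 login=snmp8 passwd=soft1234 community=test \
>>>>>>>>     port=8 action=off pcmk_host_list=sip1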
>>>>>>>>
>>>>>>>>
>>>>>>>> br
>>>>>>>> miha
>>>>>>>>
>>>>>>>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>>>>>>>
>>>>>>>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>>>>>>>>> Jul 03 14:10:51 [2701] sip2 crmd: notice: too_many_st_failures: No devices found in cluster to fence sip1, giving up
>>>>>>>>>
>>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command: Processed st_query reply from sip2: OK (0)
>>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done: Operation reboot of sip1 by sip2 for stonith_admin.cman.28299 at sip2.94474607: No such device
>>>>>>>>>
>>>>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command: Processed st_notify reply from sip2: OK (0)
>>>>>>>>> Jul 03 14:10:54 [2701] sip2 crmd: notice: tengine_stonith_notify: Peer sip1 was not terminated (reboot) by sip2 for sip2: No such device (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client stonith_admin.cman.28299
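>>>>>>>>>
>>>>>>>>> (When stonith-ng reports "No devices found in cluster to fence
>>>>>>>>> sip1" as above, two quick things to compare, assuming the usual
>>>>>>>>> Pacemaker/pcs tools, are what is configured versus what is
>>>>>>>>> actually registered:)
>>>>>>>>>
>>>>>>>>>   # stonith resources as configured in the CIB
>>>>>>>>>   pcs stonith show
>>>>>>>>>
>>>>>>>>>   # devices currently registered with stonith-ng on this node
>>>>>>>>>   stonith_admin --list-registered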
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>>>>>>>
>>>>>>>>> Sorry for the short answer. Have you tested your cluster fencing?
>>>>>>>>> Can you show your cluster.conf XML?
>>>>>>>>>
>>>>>>>>> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>> emmanuel,
>>>>>>>>>>
>>>>>>>>>> tnx. But how can I find out why fencing stopped working?
>>>>>>>>>>
>>>>>>>>>> br
>>>>>>>>>> miha
>>>>>>>>>>
>>>>>>>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>>>>>>>
>>>>>>>>>>> Node sip2 is shown as UNCLEAN (offline) because the cluster
>>>>>>>>>>> fencing failed to complete the operation.
>>>>>>>>>>>
>>>>>>>>>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>>>> hi.
>>>>>>>>>>>>
>>>>>>>>>>>> another thing.
>>>>>>>>>>>>
>>>>>>>>>>>> On node 1 (sip1), pcs is running:
>>>>>>>>>>>> [root at sip1 ~]# pcs status
>>>>>>>>>>>> Cluster name: sipproxy
>>>>>>>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>>>>>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>>>>>>>>>> Stack: cman
>>>>>>>>>>>> Current DC: sip1 - partition with quorum
>>>>>>>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>>>>>>>> 2 Nodes configured
>>>>>>>>>>>> 10 Resources configured
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Node sip2: UNCLEAN (offline)
>>>>>>>>>>>> Online: [ sip1 ]
>>>>>>>>>>>>
>>>>>>>>>>>> Full list of resources:
>>>>>>>>>>>>
>>>>>>>>>>>> Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>>>>>>>> Masters: [ sip2 ]
>>>>>>>>>>>> Slaves: [ sip1 ]
>>>>>>>>>>>> Resource Group: g_mysql
>>>>>>>>>>>> p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>>>>>>>>> p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>>>>>>>>> p_mysql (ocf::heartbeat:mysql): Started sip2
>>>>>>>>>>>> Clone Set: cl_ping [p_ping]
>>>>>>>>>>>> Started: [ sip1 sip2 ]
>>>>>>>>>>>> opensips (lsb:opensips): Stopped
>>>>>>>>>>>> fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>>>>>> fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [root at sip1 ~]#
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi emmanuel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think so. What is the best way to check?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for my noob question; I configured this 6 months ago and
>>>>>>>>>>>>> everything was working fine till now. Now I need to find out
>>>>>>>>>>>>> what really happened before I do something stupid.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> tnx
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>>>>>>>> are you sure your cluster fencing is working?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I noticed today that I am having a problem with the cluster:
>>>>>>>>>>>>>>> the master server shows as offline, but the virtual IP is
>>>>>>>>>>>>>>> still assigned to it and all services are running properly
>>>>>>>>>>>>>>> (in production).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I check, I get these notifications:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>>>>> [root at sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>>>>>>>> corosync dead but pid file exists
>>>>>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>>>>>> [root at sip2 cluster]# tailf fenced.log
>>>>>>>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The main question is what to do now. Should I just run "pcs
>>>>>>>>>>>>>>> cluster start" and hope for the best, or what?
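>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (For what it's worth, on a cman/pacemaker stack the services
>>>>>>>>>>>>>>> on the dead node would normally be brought back with something
>>>>>>>>>>>>>>> like the following. This is a sketch only, not a recommendation
>>>>>>>>>>>>>>> for this particular situation while fencing is broken:)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   service cman start
>>>>>>>>>>>>>>>   service pacemaker start
>>>>>>>>>>>>>>>   pcs status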
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> tnx!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> miha
>>>>>>>>>>>>>>>