[Pacemaker] Error: cluster is not currently running on this node
Miha
miha at softnet.si
Tue Aug 19 06:05:14 UTC 2014
Sorry, here it is:
<cluster config_version="9" name="sipproxy">
<fence_daemon/>
<clusternodes>
<clusternode name="sip1" nodeid="1">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="sip1"/>
</method>
</fence>
</clusternode>
<clusternode name="sip2" nodeid="2">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="sip2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1"/>
<fencedevices>
<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
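
By the way, as I understand it the fence_pcmk agent just redirects CMAN
fence requests to Pacemaker, so the redirect itself can be tested from the
CMAN side. A rough sketch (I have not run these here):

[root@sip1 ~]# ccs_config_validate   # sanity-check cluster.conf syntax
[root@sip1 ~]# fence_node sip2       # CMAN fence request, redirected to Pacemaker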
br
miha
On 8/18/2014 11:33 AM, emmanuel segura wrote:
> Can you show your cman /etc/cluster/cluster.conf?
>
> 2014-08-18 7:08 GMT+02:00 Miha <miha at softnet.si>:
>> Hi Emmanuel,
>>
>> this is my config:
>>
>>
>> Pacemaker Nodes:
>> sip1 sip2
>>
>> Resources:
>> Master: ms_drbd_mysql
>> Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
>> notify=true
>> Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>> Attributes: drbd_resource=clusterdb_res
>> Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
>> monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
>> Group: g_mysql
>> Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>> Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
>> Meta Attrs: target-role=Started
>> Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>> Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>> Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>> Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>> config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>> socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>> additional_parameters="--bind-address=212.13.249.55 --user=root"
>> Meta Attrs: target-role=Started
>> Operations: start interval=0 timeout=120s (p_mysql-start-0)
>> stop interval=0 timeout=120s (p_mysql-stop-0)
>> monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>> Clone: cl_ping
>> Meta Attrs: interleave=true
>> Resource: p_ping (class=ocf provider=pacemaker type=ping)
>> Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>> Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>> start interval=0s timeout=60s (p_ping-start-0s)
>> stop interval=0s (p_ping-stop-0s)
>> Resource: opensips (class=lsb type=opensips)
>> Meta Attrs: target-role=Started
>> Operations: start interval=0 timeout=120 (opensips-start-0)
>> stop interval=0 timeout=120 (opensips-stop-0)
>>
>> Stonith Devices:
>> Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>> Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8
>> passwd=soft1234
>> Meta Attrs: target-role=Started
>> Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>> Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1
>> login=snmp8 passwd=soft1234
>> Meta Attrs: target-role=Started
>> Fencing Levels:
>>
>> Location Constraints:
>> Resource: ms_drbd_mysql
>> Constraint: l_drbd_master_on_ping
>> Rule: score=-INFINITY role=Master boolean-op=or
>> (id:l_drbd_master_on_ping-rule)
>> Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
>> Expression: ping lte 0 type=number
>> (id:l_drbd_master_on_ping-expression-0)
>> Ordering Constraints:
>> promote ms_drbd_mysql then start g_mysql (INFINITY)
>> (id:o_drbd_before_mysql)
>> g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>> Colocation Constraints:
>> g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master)
>> (id:c_mysql_on_drbd)
>> opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>
>> Cluster Properties:
>> cluster-infrastructure: cman
>> dc-version: 1.1.10-14.el6-368c726
>> no-quorum-policy: ignore
>> stonith-enabled: true
>> Node Attributes:
>> sip1: standby=off
>> sip2: standby=off
>>
>>
>> br
>> miha
>>
>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>
>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>>> Jul 03 14:10:51 [2701] sip2 crmd: notice:
>>> too_many_st_failures: No devices found in cluster to fence
>>> sip1, giving up
>>>
>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>> Processed st_query reply from sip2: OK (0)
>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done:
>>> Operation reboot of sip1 by sip2 for
>>> stonith_admin.cman.28299 at sip2.94474607: No such device
>>>
>>> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
>>> Processed st_notify reply from sip2: OK (0)
>>> Jul 03 14:10:54 [2701] sip2 crmd: notice:
>>> tengine_stonith_notify: Peer sip1 was not terminated (reboot) by
>>> sip2 for sip2: No such device
>>> (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
>>> stonith_admin.cman.28299
>>>
>>>
>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>
>>> Sorry for the short answer, have you tested your cluster fencing ? can
>>> you show your cluster.conf xml?
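>>>
>>> For example, something like this exercises the whole fencing path (a
>>> sketch only; note that --reboot will really fence the node, so test
>>> with care):
>>>
>>> stonith_admin --list-registered    # stonith devices pacemaker knows about
>>> stonith_admin --list sip1          # devices that can fence sip1
>>> stonith_admin --reboot sip1        # trigger an actual fence of sip1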
>>>
>>> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>>>> emmanuel,
>>>>
>>>> Tnx. But how can I find out why fencing stopped working?
>>>>
>>>> br
>>>> miha
>>>>
>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>
>>>>> Node sip2 shows UNCLEAN (offline) because the cluster fencing
>>>>> failed to complete the operation.
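>>>>>
>>>>> You can also ask the fence agent itself whether it can reach the
>>>>> blade center. A rough sketch (placeholders for your device's values;
>>>>> check the exact option names with fence_bladecenter_snmp --help):
>>>>>
>>>>> fence_bladecenter_snmp -a <blade-center-ip> -c <community> -n <plug> -o status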
>>>>>
>>>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>>>> hi.
>>>>>>
>>>>>> another thing.
>>>>>>
>>>>>> On node 1, pcs is running:
>>>>>> [root@sip1 ~]# pcs status
>>>>>> Cluster name: sipproxy
>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>>>> Stack: cman
>>>>>> Current DC: sip1 - partition with quorum
>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>> 2 Nodes configured
>>>>>> 10 Resources configured
>>>>>>
>>>>>>
>>>>>> Node sip2: UNCLEAN (offline)
>>>>>> Online: [ sip1 ]
>>>>>>
>>>>>> Full list of resources:
>>>>>>
>>>>>> Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>> Masters: [ sip2 ]
>>>>>> Slaves: [ sip1 ]
>>>>>> Resource Group: g_mysql
>>>>>> p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>>> p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>>> p_mysql (ocf::heartbeat:mysql): Started sip2
>>>>>> Clone Set: cl_ping [p_ping]
>>>>>> Started: [ sip1 sip2 ]
>>>>>> opensips (lsb:opensips): Stopped
>>>>>> fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>> fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>>>
>>>>>>
>>>>>> [root@sip1 ~]#
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>
>>>>>>> Hi emmanuel,
>>>>>>>
>>>>>>> I think so. What is the best way to check?
>>>>>>>
>>>>>>> Sorry for my noob question; I configured this 6 months ago and
>>>>>>> everything was working fine till now. Now I need to find out what
>>>>>>> really happened before I do something stupid.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> tnx
>>>>>>>
>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>> are you sure your cluster fencing is working?
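>>>>>>>>
>>>>>>>> A quick way to check the cman/fenced side (sketch only):
>>>>>>>>
>>>>>>>> cman_tool status   # membership and quorum state
>>>>>>>> fence_tool ls      # fence domain members and state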
>>>>>>>>
>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I noticed today that I am having a problem with the cluster: the
>>>>>>>>> master server is offline, but the virtual IP is still assigned to it
>>>>>>>>> and all services are running properly (in production).
>>>>>>>>>
>>>>>>>>> If I run the following commands, I get these notifications:
>>>>>>>>>
>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>> [root@sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>> corosync dead but pid file exists
>>>>>>>>> [root@sip2 cluster]# pcs status
>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>> [root@sip2 cluster]#
>>>>>>>>> [root@sip2 cluster]#
>>>>>>>>> [root@sip2 cluster]# tailf fenced.log
>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The main question is: what should I do now? Run "pcs start" and hope
>>>>>>>>> for the best, or what?
>>>>>>>>>
>>>>>>>>> I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>
>>>>>>>>> tnx!
>>>>>>>>>
>>>>>>>>> miha
>>>>>>>>>