[Pacemaker] Error: cluster is not currently running on this node
Miha
miha at softnet.si
Mon Aug 18 05:08:33 UTC 2014
Hi Emmanuel,
this is my config:
Pacemaker Nodes:
 sip1 sip2
Resources:
 Master: ms_drbd_mysql
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=clusterdb_res
   Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
               monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
 Group: g_mysql
  Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
   Meta Attrs: target-role=Started
  Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
  Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
   Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
               config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
               socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
               additional_parameters="--bind-address=212.13.249.55 --user=root"
   Meta Attrs: target-role=Started
   Operations: start interval=0 timeout=120s (p_mysql-start-0)
               stop interval=0 timeout=120s (p_mysql-stop-0)
               monitor interval=20s timeout=30s (p_mysql-monitor-20s)
 Clone: cl_ping
  Meta Attrs: interleave=true
  Resource: p_ping (class=ocf provider=pacemaker type=ping)
   Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
   Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
               start interval=0s timeout=60s (p_ping-start-0s)
               stop interval=0s (p_ping-stop-0s)
 Resource: opensips (class=lsb type=opensips)
  Meta Attrs: target-role=Started
  Operations: start interval=0 timeout=120 (opensips-start-0)
              stop interval=0 timeout=120 (opensips-stop-0)
Stonith Devices:
 Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
  Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
  Meta Attrs: target-role=Started
 Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
  Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
  Meta Attrs: target-role=Started
Fencing Levels:
Location Constraints:
 Resource: ms_drbd_mysql
  Constraint: l_drbd_master_on_ping
   Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
    Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
    Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
Ordering Constraints:
 promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
 g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
Colocation Constraints:
 g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
 opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6-368c726
 no-quorum-policy: ignore
 stonith-enabled: true
Node Attributes:
 sip1: standby=off
 sip2: standby=off
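As an aside: the "corosync dead but pid file exists" status reported on sip2 further down in this thread just means the init script found a pid file whose process is no longer alive. A minimal sketch of what that check amounts to (a temp file stands in for /var/run/corosync.pid here, and the dead pid is arbitrary):

```shell
# Sketch of the stale-pid check behind "corosync dead but pid file exists".
# Assumption: the pid file exists, but the process it names has died.
pid_file=$(mktemp)            # stand-in for /var/run/corosync.pid
echo 4999999 > "$pid_file"    # a pid that should not belong to a live process
if [ -f "$pid_file" ] && ! kill -0 "$(cat "$pid_file")" 2>/dev/null; then
    status_msg="corosync dead but pid file exists"
else
    status_msg="corosync is running"
fi
echo "$status_msg"
rm -f "$pid_file"
```

If that is the state, the usual recovery on a cman stack is to remove the stale pid file and restart cman and pacemaker on that node, after confirming fencing actually works.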
br
miha
On 8/14/2014 3:05 PM, emmanuel segura wrote:
> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
> Jul 03 14:10:51 [2701] sip2 crmd: notice:
> too_many_st_failures: No devices found in cluster to fence
> sip1, giving up
>
> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
> Processed st_query reply from sip2: OK (0)
> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done:
> Operation reboot of sip1 by sip2 for
> stonith_admin.cman.28299 at sip2.94474607: No such device
>
> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
> Processed st_notify reply from sip2: OK (0)
> Jul 03 14:10:54 [2701] sip2 crmd: notice:
> tengine_stonith_notify: Peer sip1 was not terminated (reboot) by
> sip2 for sip2: No such device
> (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
> stonith_admin.cman.28299
>
> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>
> Sorry for the short answer. Have you tested your cluster fencing? Can
> you show your cluster.conf XML?
>
> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>> emmanuel,
>>
>> Thanks. But how can I find out why fencing stopped working?
>>
>> br
>> miha
>>
>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>
>>> Node sip2: UNCLEAN (offline) is unclean because the cluster fencing
>>> failed to complete the operation
>>>
>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>> hi.
>>>>
>>>> another thing.
>>>>
>>>> On node I pcs is running:
>>>> [root at sip1 ~]# pcs status
>>>> Cluster name: sipproxy
>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>> Stack: cman
>>>> Current DC: sip1 - partition with quorum
>>>> Version: 1.1.10-14.el6-368c726
>>>> 2 Nodes configured
>>>> 10 Resources configured
>>>>
>>>>
>>>> Node sip2: UNCLEAN (offline)
>>>> Online: [ sip1 ]
>>>>
>>>> Full list of resources:
>>>>
>>>> Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>> Masters: [ sip2 ]
>>>> Slaves: [ sip1 ]
>>>> Resource Group: g_mysql
>>>> p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>> p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>> p_mysql (ocf::heartbeat:mysql): Started sip2
>>>> Clone Set: cl_ping [p_ping]
>>>> Started: [ sip1 sip2 ]
>>>> opensips (lsb:opensips): Stopped
>>>> fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>> fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>
>>>>
>>>> [root at sip1 ~]#
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>
>>>>> Hi emmanuel,
>>>>>
>>>>> I think so; what is the best way to check?
>>>>>
>>>>> Sorry for my noob question. I configured this 6 months ago and
>>>>> everything was working fine until now. Now I need to find out what
>>>>> really happened before I do something stupid.
>>>>>
>>>>>
>>>>>
>>>>> tnx
>>>>>
>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>> Are you sure your cluster fencing is working?
>>>>>>
>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I noticed today that I am having a problem with the cluster. The
>>>>>>> master server is offline, but the virtual IP is still assigned to
>>>>>>> it and all services are running properly (for production).
>>>>>>>
>>>>>>> When I check, I get these notifications:
>>>>>>>
>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>> Error: cluster is not currently running on this node
>>>>>>> [root at sip2 cluster]# /etc/init.d/corosync status
>>>>>>> corosync dead but pid file exists
>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>> Error: cluster is not currently running on this node
>>>>>>> [root at sip2 cluster]#
>>>>>>> [root at sip2 cluster]#
>>>>>>> [root at sip2 cluster]# tailf fenced.log
>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>
>>>>>>>
>>>>>>> The main thing is: what should I do now? Run "pcs start" and hope
>>>>>>> for the best, or what?
>>>>>>>
>>>>>>> I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>
>>>>>>> tnx!
>>>>>>>
>>>>>>> miha
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>
>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>> Getting started:
>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>>
>>>>>>