[Pacemaker] Error: cluster is not currently running on this node
Miha
miha at softnet.si
Mon Aug 18 07:08:33 CEST 2014
Hi Emmanuel,
this is my config:
Pacemaker Nodes:
 sip1 sip2
Resources:
 Master: ms_drbd_mysql
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=clusterdb_res
   Operations: monitor interval=29s role=Master (p_drbd_mysql-monitor-29s)
               monitor interval=31s role=Slave (p_drbd_mysql-monitor-31s)
 Group: g_mysql
  Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd fstype=ext4
   Meta Attrs: target-role=Started
  Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
  Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
   Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
               config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
               socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
               additional_parameters="--bind-address=212.13.249.55 --user=root"
   Meta Attrs: target-role=Started
   Operations: start interval=0 timeout=120s (p_mysql-start-0)
               stop interval=0 timeout=120s (p_mysql-stop-0)
               monitor interval=20s timeout=30s (p_mysql-monitor-20s)
 Clone: cl_ping
  Meta Attrs: interleave=true
  Resource: p_ping (class=ocf provider=pacemaker type=ping)
   Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
   Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
               start interval=0s timeout=60s (p_ping-start-0s)
               stop interval=0s (p_ping-stop-0s)
 Resource: opensips (class=lsb type=opensips)
  Meta Attrs: target-role=Started
  Operations: start interval=0 timeout=120 (opensips-start-0)
              stop interval=0 timeout=120 (opensips-stop-0)
Stonith Devices:
 Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
  Attributes: action=off ipaddr=172.30.0.2 port=8 community=test login=snmp8 passwd=soft1234
  Meta Attrs: target-role=Started
 Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
  Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1 login=snmp8 passwd=soft1234
  Meta Attrs: target-role=Started
Fencing Levels:
Location Constraints:
  Resource: ms_drbd_mysql
    Constraint: l_drbd_master_on_ping
      Rule: score=-INFINITY role=Master boolean-op=or (id:l_drbd_master_on_ping-rule)
        Expression: not_defined ping (id:l_drbd_master_on_ping-expression)
        Expression: ping lte 0 type=number (id:l_drbd_master_on_ping-expression-0)
Ordering Constraints:
  promote ms_drbd_mysql then start g_mysql (INFINITY) (id:o_drbd_before_mysql)
  g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
Colocation Constraints:
  g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master) (id:c_mysql_on_drbd)
  opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6-368c726
 no-quorum-policy: ignore
 stonith-enabled: true
Node Attributes:
 sip1: standby=off
 sip2: standby=off
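
Also, about your question on testing fencing: would something like this
be a sane way to exercise the stonith devices above? Only a sketch on my
side, reusing fence_sip1's attributes from the config but with the
harmless "status" action instead of "off":

    # which fence devices does pacemaker know about, and who can fence sip1?
    stonith_admin --list-registered
    stonith_admin --list sip1

    # call the BladeCenter SNMP agent directly with fence_sip1's attributes
    fence_bladecenter_snmp -a 172.30.0.2 -n 8 -c test -l snmp8 \
        -p soft1234 -o status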
br
miha
On 8/14/2014 3:05 PM, emmanuel segura wrote:
> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
> Jul 03 14:10:51 [2701] sip2 crmd: notice:
> too_many_st_failures: No devices found in cluster to fence
> sip1, giving up
>
> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
> Processed st_query reply from sip2: OK (0)
> Jul 03 14:10:54 [2697] sip2 stonith-ng: error: remote_op_done:
> Operation reboot of sip1 by sip2 for
> stonith_admin.cman.28299 at sip2.94474607: No such device
>
> Jul 03 14:10:54 [2697] sip2 stonith-ng: info: stonith_command:
> Processed st_notify reply from sip2: OK (0)
> Jul 03 14:10:54 [2701] sip2 crmd: notice:
> tengine_stonith_notify: Peer sip1 was not terminated (reboot) by
> sip2 for sip2: No such device
> (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
> stonith_admin.cman.28299
>
> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>
> Sorry for the short answer. Have you tested your cluster fencing? Can
> you show your cluster.conf XML?
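>
> For example, something like this from the healthy node would show
> whether a real fencing operation can complete (just a sketch; note
> that --reboot really reboots the target, so only run it when that is
> acceptable):
>
>     stonith_admin --reboot sip2    # ask pacemaker/stonith-ng to fence sip2
>     fence_node sip2                # the cman-level equivalent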
>
> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>> emmanuel,
>>
>> tnx. But how can I find out why fencing stopped working?
>>
>> br
>> miha
>>
>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>
>>> Node sip2 is marked UNCLEAN (offline) because the cluster fencing
>>> operation failed to complete.
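>>>
>>> To see what the failed attempts looked like, something like this on
>>> the surviving node might help (a sketch; -H/--history shows the
>>> stonith history in pacemaker 1.1.x):
>>>
>>>     stonith_admin --history sip2        # past fencing attempts for sip2
>>>     grep -i stonith /var/log/messages   # agent errors in the syslog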
>>>
>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>> Hi,
>>>>
>>>> another thing:
>>>>
>>>> on node 1 (sip1), pcs is running:
>>>> [root at sip1 ~]# pcs status
>>>> Cluster name: sipproxy
>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>> Last change: Sat Feb 1 20:10:48 2014 via crm_attribute on sip1
>>>> Stack: cman
>>>> Current DC: sip1 - partition with quorum
>>>> Version: 1.1.10-14.el6-368c726
>>>> 2 Nodes configured
>>>> 10 Resources configured
>>>>
>>>>
>>>> Node sip2: UNCLEAN (offline)
>>>> Online: [ sip1 ]
>>>>
>>>> Full list of resources:
>>>>
>>>> Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>     Masters: [ sip2 ]
>>>>     Slaves: [ sip1 ]
>>>> Resource Group: g_mysql
>>>>     p_fs_mysql (ocf::heartbeat:Filesystem): Started sip2
>>>>     p_ip_mysql (ocf::heartbeat:IPaddr2): Started sip2
>>>>     p_mysql (ocf::heartbeat:mysql): Started sip2
>>>> Clone Set: cl_ping [p_ping]
>>>>     Started: [ sip1 sip2 ]
>>>> opensips (lsb:opensips): Stopped
>>>> fence_sip1 (stonith:fence_bladecenter_snmp): Started sip2
>>>> fence_sip2 (stonith:fence_bladecenter_snmp): Started sip2
>>>>
>>>>
>>>> [root at sip1 ~]#
>>>>
>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>
>>>>> Hi Emmanuel,
>>>>>
>>>>> I think so; what is the best way to check?
>>>>>
>>>>> Sorry for the noob question. I configured this 6 months ago and
>>>>> everything was working fine until now. Now I need to find out what
>>>>> really happened before I do something stupid.
>>>>>
>>>>> tnx
>>>>>
>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>> are you sure your cluster fencing is working?
>>>>>>
>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I noticed today that I am having a problem with the cluster. The
>>>>>>> master server is offline, but the virtual IP is still assigned to
>>>>>>> it and all services are running properly (in production).
>>>>>>>
>>>>>>> When I check, I get these notifications:
>>>>>>>
>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>> Error: cluster is not currently running on this node
>>>>>>> [root at sip2 cluster]# /etc/init.d/corosync status
>>>>>>> corosync dead but pid file exists
>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>> Error: cluster is not currently running on this node
>>>>>>> [root at sip2 cluster]#
>>>>>>> [root at sip2 cluster]#
>>>>>>> [root at sip2 cluster]# tailf fenced.log
>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>
>>>>>>>
>>>>>>> The main question is what to do now. Should I run "pcs cluster
>>>>>>> start" and hope for the best, or what?
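>>>>>>>
>>>>>>> Would a sequence like this be sane? Only my guess, since corosync
>>>>>>> is dead but its pid file is still around (paths and service names
>>>>>>> may differ):
>>>>>>>
>>>>>>>     rm -f /var/run/corosync.pid   # clear the stale pid file
>>>>>>>     service cman start            # cman brings corosync up on this stack
>>>>>>>     service pacemaker start
>>>>>>>     pcs status                    # confirm the node rejoins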
>>>>>>>
>>>>>>> I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>
>>>>>>> tnx!
>>>>>>>
>>>>>>> miha
>>>>>>>