[Pacemaker] Simple installation Pacemaker + CMAN + fence-agents

Andrew Beekhof andrew at beekhof.net
Fri Nov 8 00:31:40 UTC 2013


Something seems very wrong at the corosync level.
It is not just pacemakerd: fenced and dlm_controld are also failing to join CPG.

Jan: Could this be firewall related?
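
The pattern in the quoted logs can be tallied mechanically. What follows is a minimal sketch, not part of the original exchange, that counts CPG join failures per daemon in log excerpts like the ones quoted further down. The function name `count_cpg_failures` and the sample lines are illustrative assumptions, and the two regular expressions only cover the message shapes that appear in this thread; other corosync, cman or Pacemaker versions may word these messages differently.

```python
import re
from collections import Counter

# Message shapes seen in this thread (assumption: other versions may differ):
#   "<date> fenced daemon cpg_join error retrying"
#   "... ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6"
RETRY_RE = re.compile(r"^\w{3} +\d+ [\d:]+ (\S+) daemon cpg_join error retrying")
JOIN_RE = re.compile(r"Could not join the CPG group '([^']+)': (\d+)")


def count_cpg_failures(lines):
    """Return a Counter mapping daemon name to its number of CPG join failures."""
    failures = Counter()
    for line in lines:
        # Drop mail quoting ("> ") so pasted excerpts can be fed in directly.
        line = line.lstrip("> ").rstrip()
        match = RETRY_RE.search(line) or JOIN_RE.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures


if __name__ == "__main__":
    # Hypothetical sample assembled from lines quoted in this thread.
    sample = [
        "> Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying",
        "> Sep 26 11:15:04 fenced daemon cpg_join error retrying",
        "> Sep 26 11:15:14 fenced daemon cpg_join error retrying",
        "> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: "
        "cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6",
        "> Sep 26 11:14:54 fenced fenced 3.1.8 started",
    ]
    for daemon, count in sorted(count_cpg_failures(sample).items()):
        print(f"{daemon}: {count} failed CPG join attempt(s)")
```

If every CPG client on a node shows up in the tally, the problem is more likely in corosync itself or the network beneath it than in any single daemon. The trailing `6` in the pacemakerd error should correspond to `CS_ERR_TRY_AGAIN` in corosync's `cs_error_t`, which fits the "retrying" messages from fenced and dlm_controld.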

On 27 Sep 2013, at 10:44 pm, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com> wrote:

> On 2013-09-27 04:26, Andrew Beekhof wrote:
>> On 26/09/2013, at 8:35 PM, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com> wrote:
>> 
>> 
>>> Hello,
>>> 
>>> I installed Pacemaker on two Ubuntu 13.04 nodes following
>>> http://clusterlabs.org/quickstart-ubuntu.html
>>> and changed only the IP addresses.
>>> 
>>> /etc/cluster/cluster.conf:
>>> 
>>> <?xml version="1.0"?>
>>> <cluster config_version="1" name="pacemaker1">
>>>   <logging debug="off"/>
>>>   <clusternodes>
>>>     <clusternode name="fmpgpool4" nodeid="1">
>>>       <fence>
>>>         <method name="pcmk-redirect">
>>>           <device name="pcmk" port="fmpgpool4"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>     <clusternode name="fmpgpool5" nodeid="2">
>>>       <fence>
>>>         <method name="pcmk-redirect">
>>>           <device name="pcmk" port="fmpgpool5"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>   </clusternodes>
>>>   <fencedevices>
>>>     <fencedevice name="pcmk" agent="fence_pcmk"/>
>>>   </fencedevices>
>>> </cluster>
>>>     
>>> 
>>> On the server, the only process I see is:
>>> 	ps -ef|grep pacemaker
>>> 	
>>> 	pacemakerd
>>> 
>> What do the logs from pacemakerd say?
>> 
>> 
>>> 
>>> 	and nothing more
>>> 
>>> 
>>> I tried to run:
>>> 	crm configure property stonith-enabled=false
>>> 	
>>> and got:
>>> 	Signon to CIB failed: connection failed
>>> 	Init failed, could not perform requested operations
>>> 	ERROR: cannot parse xml: no element found: line 1, column 0
>>> 	ERROR: No CIB!
>>> 
>>> 
>>> I don't know what could be wrong.
>>> 
>>> 
>>> Regards!
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Pacemaker mailing list: 
>>> Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>> 
>>> Project Home: 
>>> http://www.clusterlabs.org
>>> 
>>> Getting started: 
>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> 
>>> Bugs: 
>>> http://bugs.clusterlabs.org
> 
> Hello,
> 
> corosync.log:
> 
> Sep 26 11:14:50 corosync [MAIN  ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
> Sep 26 11:14:50 corosync [MAIN  ] Corosync built-in features: nss
> Sep 26 11:14:50 corosync [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
> Sep 26 11:14:50 corosync [MAIN  ] Successfully parsed cman config
> Sep 26 11:14:50 corosync [MAIN  ] Successfully configured openais services to load
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Sep 26 11:14:50 corosync [TOTEM ] The network interface [10.0.0.34] is now up.
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:50 corosync [CMAN  ] CMAN 3.1.8 (built Jan 17 2013 06:24:33) started
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync CMAN membership service 2.90
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais cluster membership service B.01.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais event service B.01.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais checkpoint service B.01.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais message service B.03.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais distributed locking service B.03.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais timer service A.01.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync configuration service
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync profile loading service
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:56 corosync [CLM   ] Members Left:
> Sep 26 11:14:56 corosync [CLM   ] Members Joined:
> Sep 26 11:14:56 corosync [CLM   ]       r(0) ip(10.0.0.35)
> Sep 26 11:14:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff):  generated-manpages agent-manpages ncurses  heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff):  generated-manpages agent-manpages ncurses  heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> 
> dlm_controld.log:
> 
> Sep 26 11:14:54 dlm_controld dlm_controld 3.1.8 started
> Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:24 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:34 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:44 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:54 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:24 dlm_controld daemon cpg_join error retrying
> and so on...
> 
> fenced.log:
> 
> Sep 26 11:14:54 fenced fenced 3.1.8 started
> Sep 26 11:15:04 fenced daemon cpg_join error retrying
> Sep 26 11:15:14 fenced daemon cpg_join error retrying
> Sep 26 11:15:24 fenced daemon cpg_join error retrying
> Sep 26 11:15:34 fenced daemon cpg_join error retrying
> Sep 26 11:15:44 fenced daemon cpg_join error retrying
> Sep 26 11:15:54 fenced daemon cpg_join error retrying
> Sep 26 11:16:04 fenced daemon cpg_join error retrying
> Sep 26 11:16:14 fenced daemon cpg_join error retrying
> and so on...
> 
> 
> Regards!
> 
> 




