[Pacemaker] Simple installation Pacemaker + CMAN + fence-agents

Jan Friesse jfriesse at redhat.com
Mon Nov 11 02:47:26 EST 2013


Andrew Beekhof wrote:
> Something seems very wrong with this at the corosync level.
> Even fenced and the dlm are having issues.
> 
> Jan: Could this be firewall related?

Yes. This can be either a firewall or a multicast issue. I would recommend
turning off the firewall completely (for testing). If that doesn't help, try
omping to test multicast.
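
If omping is not at hand, a rough equivalent can be put together with plain UDP sockets. The following is only a minimal sketch under stated assumptions: the group address 239.192.0.1 is a placeholder (use whatever address corosync really uses on your cluster), 5405 is corosync's default port, and the function names are made up for this example. Stop the cluster stack first or pick a free port, otherwise the bind may fail.

```python
import ipaddress
import socket
import time

# Placeholder values (assumptions): substitute the multicast address and
# port your cluster actually uses. 5405 is corosync's default port.
GROUP = "239.192.0.1"
PORT = 5405


def membership_request(group: str, local_ip: str) -> bytes:
    """Build the ip_mreq bytes (group + local interface) for IP_ADD_MEMBERSHIP."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def listen(group: str, port: int, local_ip: str, timeout: float = 30.0) -> None:
    """Join the group on local_ip and print each probe until none arrives in time."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, local_ip))
    sock.settimeout(timeout)
    try:
        while True:
            data, sender = sock.recvfrom(1024)
            print(f"received {data!r} from {sender[0]}")
    except socket.timeout:
        print("no probe received within the timeout")
    finally:
        sock.close()


def send(group: str, port: int, local_ip: str, count: int = 5) -> None:
    """Send a few probes to the group through the interface owning local_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton(local_ip))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    try:
        for i in range(count):
            sock.sendto(f"probe {i}".encode(), (group, port))
            time.sleep(1)
    finally:
        sock.close()
```

For example, run listen(GROUP, PORT, "10.0.0.34") on one node and send(GROUP, PORT, "10.0.0.35") on the other, then swap roles. If probes arrive only with the firewall off, the firewall is the likely culprit; if they never arrive, look at multicast handling on the switch.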

Honza

> 
> On 27 Sep 2013, at 10:44 pm, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com> wrote:
> 
>> On 2013-09-27 04:26, Andrew Beekhof wrote:
>>> On 26/09/2013, at 8:35 PM, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com>
>>>  wrote:
>>>
>>>
>>>> Hello,
>>>>
>>>> I installed Pacemaker following
>>>> http://clusterlabs.org/quickstart-ubuntu.html
>>>> on two Ubuntu 13.04 nodes, changing only the IP addresses.
>>>>
>>>> /etc/cluster/cluster.conf:
>>>>
>>>> <?xml version="1.0"?>
>>>> <cluster config_version="1" name="pacemaker1">
>>>> <logging debug="off"/>
>>>> <clusternodes>
>>>> <clusternode name="fmpgpool4" nodeid="1">
>>>> <fence>
>>>> <method name="pcmk-redirect">
>>>> <device name="pcmk" port="fmpgpool4"/>
>>>> </method>
>>>> </fence>
>>>> </clusternode>
>>>> <clusternode name="fmpgpool5" nodeid="2">
>>>> <fence>
>>>> <method name="pcmk-redirect">
>>>> <device name="pcmk" port="fmpgpool5"/>
>>>> </method>
>>>> </fence>
>>>> </clusternode>
>>>> </clusternodes>
>>>> <fencedevices>
>>>> <fencedevice name="pcmk" agent="fence_pcmk"/>
>>>> </fencedevices>
>>>> </cluster>
>>>>     
>>>>
>>>> After that, the server shows only one Pacemaker process:
>>>> 	ps -ef|grep pacemaker
>>>> 	
>>>> 	
>>>> pacemakerd 
>>>>
>>> What do the logs from pacemakerd say?
>>>
>>>
>>>>
>>>> 	and nothing more
>>>>
>>>>
>>>> When I try to run:
>>>> 	crm configure property stonith-enabled=false
>>>> 	
>>>> I get:
>>>> 	Signon to CIB failed: connection failed
>>>> 	Init failed, could not perform requested operations
>>>> 	ERROR: cannot parse xml: no element found: line 1, column 0
>>>> 	ERROR: No CIB!
>>>>
>>>>
>>>> I don't know what could be wrong.
>>>>
>>>>
>>>> Regards!
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: 
>>>> Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>>
>>>> Project Home: 
>>>> http://www.clusterlabs.org
>>>>
>>>> Getting started: 
>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>
>>>> Bugs: 
>>>> http://bugs.clusterlabs.org
>>>
>>>
>>
>> Hello,
>>
>> corosync.log:
>>
>> Sep 26 11:14:50 corosync [MAIN  ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
>> Sep 26 11:14:50 corosync [MAIN  ] Corosync built-in features: nss
>> Sep 26 11:14:50 corosync [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
>> Sep 26 11:14:50 corosync [MAIN  ] Successfully parsed cman config
>> Sep 26 11:14:50 corosync [MAIN  ] Successfully configured openais services to load
>> Sep 26 11:14:50 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
>> Sep 26 11:14:50 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>> Sep 26 11:14:50 corosync [TOTEM ] The network interface [10.0.0.34] is now up.
>> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
>> Sep 26 11:14:50 corosync [CMAN  ] CMAN 3.1.8 (built Jan 17 2013 06:24:33) started
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync CMAN membership service 2.90
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais cluster membership service B.01.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais event service B.01.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais checkpoint service B.01.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais message service B.03.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais distributed locking service B.03.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: openais timer service A.01.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync configuration service
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync profile loading service
>> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
>> Sep 26 11:14:50 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
>> Sep 26 11:14:56 corosync [CLM   ] Members Left:
>> Sep 26 11:14:56 corosync [CLM   ] Members Joined:
>> Sep 26 11:14:56 corosync [CLM   ]       r(0) ip(10.0.0.35)
>> Sep 26 11:14:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
>> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
>> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
>> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff):  generated-manpages agent-manpages ncurses  heartbeat corosync-plugin cman snmp libesmtp
>> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: main: Maximum core file size is: 18446744073709551615
>> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
>> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: main: Couldn't connect to Corosync's CPG service
>> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
>> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
>> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff):  generated-manpages agent-manpages ncurses  heartbeat corosync-plugin cman snmp libesmtp
>> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: main: Maximum core file size is: 18446744073709551615
>> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
>> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: main: Couldn't connect to Corosync's CPG service
>> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
>>
>> dlm_controld.log:
>>
>> Sep 26 11:14:54 dlm_controld dlm_controld 3.1.8 started
>> Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:15:14 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:15:24 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:15:34 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:15:44 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:15:54 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:16:04 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:16:14 dlm_controld daemon cpg_join error retrying
>> Sep 26 11:16:24 dlm_controld daemon cpg_join error retrying
>> and so on...
>>
>> fenced.log:
>>
>> Sep 26 11:14:54 fenced fenced 3.1.8 started
>> Sep 26 11:15:04 fenced daemon cpg_join error retrying
>> Sep 26 11:15:14 fenced daemon cpg_join error retrying
>> Sep 26 11:15:24 fenced daemon cpg_join error retrying
>> Sep 26 11:15:34 fenced daemon cpg_join error retrying
>> Sep 26 11:15:44 fenced daemon cpg_join error retrying
>> Sep 26 11:15:54 fenced daemon cpg_join error retrying
>> Sep 26 11:16:04 fenced daemon cpg_join error retrying
>> Sep 26 11:16:14 fenced daemon cpg_join error retrying
>> and so on...
>>
>>
>> Regards!
>>
>>
> 
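
For reading the pacemakerd entries quoted above, a small script can pair each process's first log line with its CPG failure and report how long it kept retrying. This is only a sketch under assumptions: the year is supplied by hand because the log lines omit it, the helper name cpg_failures is made up for this example, and the mapping of code 6 to CS_ERR_TRY_AGAIN is my reading of corosync's cs_error_t enum, so check it against your installed headers.

```python
import re
from datetime import datetime

# Two lines copied from the quoted corosync.log (first pacemakerd attempt).
LOG = """\
Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
"""

# Small subset of corosync's cs_error_t values (assumption: verify locally).
CS_ERRORS = {1: "CS_OK", 6: "CS_ERR_TRY_AGAIN"}

LINE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ pacemakerd: \[(\d+)\]: (.*)$")
CPG_FAIL = re.compile(r"Could not join the CPG group '\w+': (\d+)$")


def cpg_failures(text, year=2013):
    """Return (pid, error name, seconds since that pid's first log line)."""
    first_seen = {}
    results = []
    for raw in text.splitlines():
        match = LINE.match(raw)
        if not match:
            continue  # not a pacemakerd line
        when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        pid = int(match.group(2))
        first_seen.setdefault(pid, when)
        failure = CPG_FAIL.search(match.group(3))
        if failure:
            code = int(failure.group(1))
            elapsed = (when - first_seen[pid]).total_seconds()
            results.append((pid, CS_ERRORS.get(code, str(code)), elapsed))
    return results


for pid, name, seconds in cpg_failures(LOG):
    print(f"pid {pid}: {name} after {seconds:.0f} s")
```

A long wait that ends in a "try again" style error is consistent with corosync never settling its membership, which fits the firewall or multicast explanation, though it does not prove it.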




