[Pacemaker] Simple installation Pacemaker + CMAN + fence-agents
Andrew Beekhof
andrew at beekhof.net
Fri Nov 8 00:31:40 UTC 2013
Something seems very wrong with this at the corosync level.
Even fenced and the dlm are having issues.
Jan: Could this be firewall related?
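
The logs quoted below show fenced, dlm_controld and pacemakerd all failing at the same step (joining a CPG group), which is why this looks like a corosync-level problem rather than a Pacemaker one. A rough sketch for counting those failures per log file is below. The function name count_cpg_failures is made up for illustration, and the fenced.log and dlm_controld.log paths in the example call are assumptions (only /var/log/cluster/corosync.log appears in the output below).

```shell
#!/bin/sh
# Count CPG join failures in each given log file.
# Matches the two messages seen in this thread:
#   "cpg_join error retrying"       (fenced, dlm_controld)
#   "Could not join the CPG group"  (pacemakerd)
# count_cpg_failures is a hypothetical helper, not part of any cluster tool.
count_cpg_failures() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            # grep -c prints the number of matching lines (0 if none);
            # "|| true" keeps a zero-match exit status from aborting callers
            n=$(grep -c -E 'cpg_join error|Could not join the CPG group' "$log" || true)
            printf '%s: %s\n' "$log" "$n"
        else
            printf '%s: not readable\n' "$log"
        fi
    done
}

# Example call; the last two paths are assumptions, adjust to your system:
# count_cpg_failures /var/log/cluster/corosync.log \
#     /var/log/cluster/fenced.log /var/log/cluster/dlm_controld.log
```

A non-zero count for every daemon, while corosync itself reports that a membership was formed, would be consistent with the problem sitting below Pacemaker.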
On 27 Sep 2013, at 10:44 pm, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com> wrote:
> On 2013-09-27 04:26, Andrew Beekhof wrote:
>> On 26/09/2013, at 8:35 PM, Bartłomiej Wójcik <bartlomiej.wojcik at turbineam.com>
>> wrote:
>>
>>
>>> Hello,
>>>
>>> I installed Pacemaker following
>>> http://clusterlabs.org/quickstart-ubuntu.html
>>> on two Ubuntu 13.04 nodes, changing only the IP addresses.
>>>
>>> /etc/cluster/cluster.conf:
>>>
>>> <?xml version="1.0"?>
>>> <cluster config_version="1" name="pacemaker1">
>>> <logging debug="off"/>
>>> <clusternodes>
>>> <clusternode name="fmpgpool4" nodeid="1">
>>> <fence>
>>> <method name="pcmk-redirect">
>>> <device name="pcmk" port="fmpgpool4"/>
>>> </method>
>>> </fence>
>>> </clusternode>
>>> <clusternode name="fmpgpool5" nodeid="2">
>>> <fence>
>>> <method name="pcmk-redirect">
>>> <device name="pcmk" port="fmpgpool5"/>
>>> </method>
>>> </fence>
>>> </clusternode>
>>> </clusternodes>
>>> <fencedevices>
>>> <fencedevice name="pcmk" agent="fence_pcmk"/>
>>> </fencedevices>
>>> </cluster>
>>>
>>>
>>> Only one Pacemaker process is running on the server:
>>> ps -ef|grep pacemaker
>>>
>>>
>>> pacemakerd
>>>
>> What do the logs from pacemakerd say?
>>
>>
>>>
>>> and nothing more
>>>
>>>
>>> When I try to run:
>>> crm configure property stonith-enabled=false
>>>
>>> I get:
>>> Signon to CIB failed: connection failed
>>> Init failed, could not perform requested operations
>>> ERROR: cannot parse xml: no element found: line 1, column 0
>>> ERROR: No CIB!
>>>
>>>
>>> I don't know what could be wrong.
>>>
>>>
>>> Regards!
>>>
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list:
>>> Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>>
>>> Project Home:
>>> http://www.clusterlabs.org
>>>
>>> Getting started:
>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>
>>> Bugs:
>>> http://bugs.clusterlabs.org
>>
>>
>
> Hello,
>
> corosync.log:
>
> Sep 26 11:14:50 corosync [MAIN ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
> Sep 26 11:14:50 corosync [MAIN ] Corosync built-in features: nss
> Sep 26 11:14:50 corosync [MAIN ] Successfully read config from /etc/cluster/cluster.conf
> Sep 26 11:14:50 corosync [MAIN ] Successfully parsed cman config
> Sep 26 11:14:50 corosync [MAIN ] Successfully configured openais services to load
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Sep 26 11:14:50 corosync [TOTEM ] The network interface [10.0.0.34] is now up.
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:50 corosync [CMAN ] CMAN 3.1.8 (built Jan 17 2013 06:24:33) started
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync CMAN membership service 2.90
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais cluster membership service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais event service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais checkpoint service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais message service B.03.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais distributed locking service B.03.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais timer service A.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync extended virtual synchrony service
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync configuration service
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster config database access v1.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync profile loading service
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:56 corosync [CLM ] Members Left:
> Sep 26 11:14:56 corosync [CLM ] Members Joined:
> Sep 26 11:14:56 corosync [CLM ] r(0) ip(10.0.0.35)
> Sep 26 11:14:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff): generated-manpages agent-manpages ncurses heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff): generated-manpages agent-manpages ncurses heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
>
> dlm_controld.log:
>
> Sep 26 11:14:54 dlm_controld dlm_controld 3.1.8 started
> Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:24 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:34 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:44 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:54 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:24 dlm_controld daemon cpg_join error retrying
> and so on...
>
> fenced.log:
>
> Sep 26 11:14:54 fenced fenced 3.1.8 started
> Sep 26 11:15:04 fenced daemon cpg_join error retrying
> Sep 26 11:15:14 fenced daemon cpg_join error retrying
> Sep 26 11:15:24 fenced daemon cpg_join error retrying
> Sep 26 11:15:34 fenced daemon cpg_join error retrying
> Sep 26 11:15:44 fenced daemon cpg_join error retrying
> Sep 26 11:15:54 fenced daemon cpg_join error retrying
> Sep 26 11:16:04 fenced daemon cpg_join error retrying
> Sep 26 11:16:14 fenced daemon cpg_join error retrying
> and so on...
>
>
> Regards!
>
>