[Pacemaker] ERROR: Wrong stack o2cb

Denis Witt denis.witt at concepts-and-training.de
Wed Jun 26 04:47:45 EDT 2013


On Tue, 25 Jun 2013 13:34:30 -0400 (EDT)
Jake Smith <jsmith at argotec.com> wrote:

> I'm guessing if you run:
> grep user /sys/fs/ocfs2/loaded_cluster_plugins 2>&1; rc=$?
> you're going to return a 1 or something other than 0.

Hi Jake,

True, as long as ocfs2_stack_user isn't loaded. I loaded the module,
and now the command returns 0.
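
That is, roughly:

  # load the userspace cluster stack plugin
  modprobe ocfs2_stack_user
  # re-run the check from above; it now exits 0 because "user" is listed
  grep user /sys/fs/ocfs2/loaded_cluster_plugins; echo $?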

But now I'm no longer able to mount the cluster drive. Corosync fails
to start o2cb/ocfs2, and if I start them by hand the stacks mismatch.
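
As I understand it, the mismatch is between the cluster stack stamped
into the filesystem's superblock and the stack currently running in the
kernel. A rough way to compare the two (assuming reasonably recent
ocfs2-tools; older mounted.ocfs2 versions may not print a Stack column):

  # stack the kernel is currently using
  cat /sys/fs/ocfs2/cluster_stack
  # stack recorded on disk for this volume
  mounted.ocfs2 -d /dev/drbd0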

Below you will find the corosync logs:

Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Corosync built-in features: nss
Jun 26 10:32:13 test4 corosync[3101]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 26 10:32:13 test4 corosync[3101]:   [TOTEM ] The network interface [10.0.2.18] is now up.
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: process_ais_conf: Reading configure
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local handle: 5650605097994944515 for logging
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: Processing additional logging options...
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 'off' for option: debug
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 'no' for option: to_logfile
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 'yes' for option: to_syslog
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found 'daemon' for option: syslog_facility
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local handle: 2730409743423111172 for quorum
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: No additional configuration supplied for: quorum
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: No default for option: provider
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_init: Local handle: 5880381755227111429 for service
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: config_find_next: Processing additional service options...
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Found '0' for option: ver
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: CRM: Initialized
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] Logging: Initialized pcmk_startup
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Service: 9
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_startup: Local hostname: test4-node1
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: pcmk_update_nodeid: Local node id: 302120970
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Creating entry for node 302120970 born on 0
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe28c00 Node 302120970 now known as test4-node1 (was: (null))
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4-node1 now has 1 quorum votes (was 0)
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: update_member: Node 302120970/test4-node1 is now: member
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3132 for process cib
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3133 for process stonith-ng
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3134 for process lrmd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3135 for process attrd
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3136 for process pengine
Jun 26 10:32:13 test4 corosync[3101]:   [pcmk  ] info: spawn_child: Forked child 3137 for process crmd
Jun 26 10:32:13 test4 corosync[3101]:   [SERV  ] Service engine loaded: Pacemaker Cluster Manager 1.1.7
Jun 26 10:32:14 test4 pengine: [3136]: info: Invoked: /usr/lib/pacemaker/pengine 
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: openais checkpoint service B.01.01
Jun 26 10:32:14 test4 attrd: [3135]: info: Invoked: /usr/lib/pacemaker/attrd 
Jun 26 10:32:14 test4 attrd: [3135]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
Jun 26 10:32:14 test4 crmd: [3137]: info: Invoked: /usr/lib/pacemaker/crmd 
Jun 26 10:32:14 test4 cib: [3132]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 26 10:32:14 test4 crmd: [3137]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Jun 26 10:32:14 test4 crmd: [3137]: notice: main: CRM Git Version: ee0730e13d124c3d58f00016c3376a1de5323cff
Jun 26 10:32:14 test4 cib: [3132]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync configuration service
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: Invoked: /usr/lib/pacemaker/stonithd 
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: get_cluster_type: Cluster type is: 'openais'
Jun 26 10:32:14 test4 stonith-ng: [3133]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync profile loading service
Jun 26 10:32:14 test4 corosync[3101]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Jun 26 10:32:14 test4 corosync[3101]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_classic: AIS connection established
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Recorded connection 0xe32e10 for attrd/3135
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Recorded connection 0xe37170 for stonith-ng/3133
Jun 26 10:32:14 test4 attrd: [3135]: notice: main: Starting mainloop...
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4-node1 now has id: 302120970
Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302120970 is now known as test4-node1
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 56: memb=0, new=0, lost=0
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 56: memb=3, new=3, lost=0
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: NEW:  test4-node1 302120970
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Creating entry for node 302186506 born on 56
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node 302186506/unknown is now: member
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: NEW:  .pending. 302186506
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Creating entry for node 302252042 born on 56
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node 302252042/unknown is now: member
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: NEW:  .pending. 302252042
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: MEMB: test4-node1 302120970
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: MEMB: .pending. 302186506
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: pcmk_peer_update: MEMB: .pending. 302252042
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4-node1 now has process list: 00000000000000000000000000111312 (1118994)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: send_member_notification: Sending membership update 56 to 0 children
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe28c00 Node 302120970 ((null)) born on: 56
Jun 26 10:32:14 test4 corosync[3101]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe3ddb0 Node 302186506 (test4-node2) born on: 52
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe3ddb0 Node 302186506 now known as test4-node2 (was: (null))
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4-node2 now has process list: 00000000000000000000000000111312 (1118994)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4-node2 now has 1 quorum votes (was 0)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: send_member_notification: Sending membership update 56 to 0 children
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe3dfe0 Node 302252042 (test4) born on: 8
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: 0xe3dfe0 Node 302252042 now known as test4 (was: (null))
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4 now has process list: 00000000000000000000000000111312 (1118994)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_member: Node test4 now has 1 quorum votes (was 0)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: update_expected_votes: Expected quorum votes 2 -> 3
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] info: send_member_notification: Sending membership update 56 to 0 children
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 cib: [3132]: info: validate_with_relaxng: Creating RNG parser context
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Jun 26 10:32:14 test4 corosync[3101]:   [CPG   ] chosen downlist: sender r(0) ip(10.0.3.18) ; members(old:2 left:0)
Jun 26 10:32:14 test4 corosync[3101]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun 26 10:32:15 test4 lrmd: [3134]: info: enabling coredumps
Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Core dumps could be lost if multiple dumps occur.
Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jun 26 10:32:15 test4 lrmd: [3134]: info: Started.
Jun 26 10:32:15 test4 cib: [3132]: info: startCib: CIB Initialization completed successfully
Jun 26 10:32:15 test4 cib: [3132]: info: get_cluster_type: Cluster type is: 'openais'
Jun 26 10:32:15 test4 cib: [3132]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_classic: AIS connection established
Jun 26 10:32:15 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Recorded connection 0xe3e670 for cib/3132
Jun 26 10:32:15 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Sending membership update 56 to cib
Jun 26 10:32:15 test4 cib: [3132]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk
Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4-node1 now has id: 302120970
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302120970 is now known as test4-node1
Jun 26 10:32:15 test4 cib: [3132]: info: cib_init: Starting cib mainloop
Jun 26 10:32:15 test4 cib: [3132]: notice: ais_dispatch_message: Membership 56: quorum acquired
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4 now has id: 302252042
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302252042 is now known as test4
Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18)  votes=1 born=8 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18)  (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new)
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4-node2 now has id: 302186506
Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302186506 is now known as test4-node2
Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18)  votes=1 born=52 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:15 test4 crmd: [3137]: info: do_cib_control: CIB connection established
Jun 26 10:32:15 test4 crmd: [3137]: info: get_cluster_type: Cluster type is: 'openais'
Jun 26 10:32:15 test4 crmd: [3137]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_classic: AIS connection established
Jun 26 10:32:15 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Recorded connection 0xe43070 for crmd/3137
Jun 26 10:32:15 test4 corosync[3101]:   [pcmk  ] info: pcmk_ipc: Sending membership update 56 to crmd
Jun 26 10:32:15 test4 crmd: [3137]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk
Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4-node1 now has id: 302120970
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302120970 is now known as test4-node1
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node1 is now unknown
Jun 26 10:32:15 test4 crmd: [3137]: info: do_ha_control: Connected to the cluster
Jun 26 10:32:15 test4 crmd: [3137]: info: do_started: Delaying start, no membership data (0000000000100000)
Jun 26 10:32:15 test4 crmd: [3137]: notice: ais_dispatch_message: Membership 56: quorum acquired
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4 now has id: 302252042
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302252042 is now known as test4
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4 is now unknown
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4 is now member (was unknown)
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18)  votes=1 born=8 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:15 test4 crmd: [3137]: notice: crmd_peer_update: Status update: Client test4-node1/crmd now has status [online] (DC=<null>)
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node1 is now member (was unknown)
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18)  (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new)
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4-node2 now has id: 302186506
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302186506 is now known as test4-node2
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node2 is now unknown
Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node2 is now member (was unknown)
Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18)  votes=1 born=52 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:15 test4 crmd: [3137]: info: do_started: Delaying start, Config not read (0000000000000040)
Jun 26 10:32:15 test4 crmd: [3137]: notice: do_started: The local CRM is operational
Jun 26 10:32:15 test4 crmd: [3137]: notice: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
Jun 26 10:32:15 test4 stonith-ng: [3133]: notice: setup_cib: Watching for stonith topology changes
Jun 26 10:32:15 test4 stonith-ng: [3133]: info: main: Starting stonith-ng mainloop
Jun 26 10:32:16 test4 crmd: [3137]: info: ais_dispatch_message: Membership 56: quorum retained
Jun 26 10:32:17 test4 crmd: [3137]: info: update_dc: Set DC to test4 (3.0.6)
Jun 26 10:32:18 test4 crmd: [3137]: info: erase_status_tag: Deleting xpath: //node_state[@uname='test4-node1']/transient_attributes
Jun 26 10:32:18 test4 crmd: [3137]: info: update_attrd: Connecting to attrd...
Jun 26 10:32:18 test4 crmd: [3137]: notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
Jun 26 10:32:18 test4 attrd: [3135]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Jun 26 10:32:18 test4 cib: [3132]: info: cib_replace_notify: Replaced: -1.-1.-1 -> 0.104.145 from test4
Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:drbd:0 probe[2] (pid 3168)
Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:pingtest:0 probe[3] (pid 3169)
Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:sip probe[4] (pid 3170)
Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:apache probe[5] (pid 3171)
Jun 26 10:32:18 test4 lrmd: [3134]: info: operation monitor[3] on pingtest:0 for client 3137: pid 3169 exited with return code 7
Jun 26 10:32:18 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_monitor_0 (call=3, rc=7, cib-update=8, confirmed=true) not running
Jun 26 10:32:19 test4 lrmd: [3134]: info: rsc:p_controld:0 probe[6] (pid 3294)
Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[4] on sip for client 3137: pid 3170 exited with return code 7
Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation sip_monitor_0 (call=4, rc=7, cib-update=9, confirmed=true) not running
Jun 26 10:32:19 test4 apache[3171]: INFO: apache not running
Jun 26 10:32:19 test4 lrmd: [3134]: info: RA output: (p_controld:0:probe:stderr) dlm_controld.pcmk: no process found
Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[6] on p_controld:0 for client 3137: pid 3294 exited with return code 7
Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_monitor_0 (call=6, rc=7, cib-update=10, confirmed=true) not running
Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[5] on apache for client 3137: pid 3171 exited with return code 7
Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation apache_monitor_0 (call=5, rc=7, cib-update=11, confirmed=true) not running
Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000)
Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[2] on drbd:0 for client 3137: pid 3168 exited with return code 8
Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation drbd:0_monitor_0 (call=2, rc=8, cib-update=12, confirmed=true) master
Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000)
Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 4: master-drbd:0=10000
Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:p_o2cb:0 probe[7] (pid 3330)
Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:fs_drbd:0 probe[8] (pid 3331)
Jun 26 10:32:20 test4 o2cb[3330]: INFO: configfs not mounted
Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[7] on p_o2cb:0 for client 3137: pid 3330 exited with return code 7
Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_monitor_0 (call=7, rc=7, cib-update=13, confirmed=true) not running
Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[8] on fs_drbd:0 for client 3137: pid 3331 exited with return code 7
Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:0_monitor_0 (call=8, rc=7, cib-update=14, confirmed=true) not running
Jun 26 10:32:20 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Jun 26 10:32:20 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 9: probe_complete=true
Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:pingtest:0 start[9] (pid 3392)
Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:p_controld:0 start[10] (pid 3393)
Jun 26 10:32:20 test4 lrmd: [3134]: info: RA output: (p_controld:0:start:stderr) dlm_controld.pcmk: no process found
Jun 26 10:32:20 test4 lrmd: [3134]: info: operation start[9] on pingtest:0 for client 3137: pid 3392 exited with return code 0
Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_start_0 (call=9, rc=0, cib-update=15, confirmed=true) ok
Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:pingtest:0 monitor[11] (pid 3428)
Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[11] on pingtest:0 for client 3137: pid 3428 exited with return code 0
Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_monitor_5000 (call=11, rc=0, cib-update=16, confirmed=false) ok
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: get_cluster_type: Cluster type is: 'openais'
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_classic: AIS connection established
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4-node1 now has id: 302120970
Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302120970 is now known as test4-node1
Jun 26 10:32:20 test4 corosync[3101]:   [pcmk  ] info: pcmk_notify: Enabling node notifications for child 3452 (0xe47ef0)
Jun 26 10:32:20 test4 kernel: [   37.842184] dlm_controld.pc (3452): /proc/3452/oom_adj is deprecated, please use /proc/3452/oom_score_adj instead.
Jun 26 10:32:21 test4 cluster-dlm: [3452]: notice: ais_dispatch_message: Membership 56: quorum acquired
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4 now has id: 302252042
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302252042 is now known as test4
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18)  votes=1 born=8 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18)  (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new)
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4-node2 now has id: 302186506
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302186506 is now known as test4-node2
Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18)  votes=1 born=52 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:22 test4 lrmd: [3134]: info: operation start[10] on p_controld:0 for client 3137: pid 3393 exited with return code 0
Jun 26 10:32:22 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_start_0 (call=10, rc=0, cib-update=17, confirmed=true) ok
Jun 26 10:32:22 test4 lrmd: [3134]: info: rsc:p_o2cb:0 start[12] (pid 3460)
Jun 26 10:32:22 test4 o2cb[3460]: INFO: ocfs2 not loaded
Jun 26 10:32:22 test4 o2cb[3460]: INFO: Starting p_o2cb:0
Jun 26 10:32:22 test4 kernel: [   39.321729] OCFS2 Node Manager 1.5.0
Jun 26 10:32:22 test4 kernel: [   39.363557] OCFS2 1.5.0
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: get_cluster_type: Cluster type is: 'openais'
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_classic: AIS connection established
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4-node1 now has id: 302120970
Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302120970 is now known as test4-node1
Jun 26 10:32:22 test4 corosync[3101]:   [pcmk  ] info: pcmk_notify: Enabling node notifications for child 3495 (0xe56a30)
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: notice: ais_dispatch_message: Membership 56: quorum acquired
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4 now has id: 302252042
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302252042 is now known as test4
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18)  votes=1 born=8 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18)  (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new)
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4-node2 now has id: 302186506
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302186506 is now known as test4-node2
Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18)  votes=1 born=52 seen=56 proc=00000000000000000000000000111312
Jun 26 10:32:24 test4 lrmd: [3134]: info: operation start[12] on p_o2cb:0 for client 3137: pid 3460 exited with return code 0
Jun 26 10:32:24 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_start_0 (call=12, rc=0, cib-update=18, confirmed=true) ok
Jun 26 10:32:25 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (1000)
Jun 26 10:32:25 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 14: pingd=1000
Jun 26 10:32:27 test4 lrmd: [3134]: info: rsc:fs_drbd:1 start[13] (pid 3538)
Jun 26 10:32:27 test4 Filesystem[3538]: INFO: Running start for /dev/drbd0 on /var/www
Jun 26 10:32:27 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) FATAL: Module scsi_hostadapter not found.
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) mount.ocfs2
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) : 
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) Cluster stack specified does not match the one currently running
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr)  
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) while trying to join the group
Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) 
Jun 26 10:32:28 test4 Filesystem[3538]: ERROR: Couldn't mount filesystem /dev/drbd0 on /var/www
Jun 26 10:32:28 test4 lrmd: [3134]: info: operation start[13] on fs_drbd:1 for client 3137: pid 3538 exited with return code 1
Jun 26 10:32:28 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:1_start_0 (call=13, rc=1, cib-update=19, confirmed=true) unknown error
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_ais_dispatch: Update relayed from test4
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-fs_drbd:1 (INFINITY)
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 17: fail-count-fs_drbd:1=INFINITY
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_ais_dispatch: Update relayed from test4
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-fs_drbd:1 (1372235547)
Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 20: last-failure-fs_drbd:1=1372235547
Jun 26 10:32:28 test4 lrmd: [3134]: info: rsc:fs_drbd:1 stop[14] (pid 3610)
Jun 26 10:32:28 test4 Filesystem[3610]: INFO: Running stop for /dev/drbd0 on /var/www
Jun 26 10:32:29 test4 lrmd: [3134]: info: operation stop[14] on fs_drbd:1 for client 3137: pid 3610 exited with return code 0
Jun 26 10:32:29 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:1_stop_0 (call=14, rc=0, cib-update=20, confirmed=true) ok
Jun 26 10:32:29 test4 lrmd: [3134]: info: rsc:p_o2cb:0 stop[15] (pid 3651)
Jun 26 10:32:29 test4 ocfs2_controld: kill node 302186506 - ocfs2_controld PROCDOWN
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: initiate_remote_stonith_op: Initiating remote operation off for 302186506: 1be401d4-547d-4f59-b380-a5e996c70a31
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: stonith_command: Processed st_query from test4-node1: rc=0
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4 now has id: 302252042
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302252042 is now known as test4
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4-node2 now has id: 302186506
Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302186506 is now known as test4-node2
Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping p_o2cb:0
Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping ocfs2_controld.pcmk
Jun 26 10:32:29 test4 lrmd: [3134]: info: RA output: (p_o2cb:0:stop:stderr) /usr/lib/ocf/resource.d//pacemaker/o2cb: line 171: killproc: command not found
Jun 26 10:32:30 test4 lrmd: [3134]: info: operation stop[15] on p_o2cb:0 for client 3137: pid 3651 exited with return code 0
Jun 26 10:32:30 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_stop_0 (call=15, rc=0, cib-update=21, confirmed=true) ok
Jun 26 10:32:30 test4 lrmd: [3134]: info: rsc:p_controld:0 stop[16] (pid 3700)
Jun 26 10:32:30 test4 dlm_controld.pcmk: [3452]: notice: terminate_ais_connection: Disconnecting from Corosync
Jun 26 10:32:30 test4 kernel: [   47.319346] dlm: closing connection to node 302186506
Jun 26 10:32:30 test4 kernel: [   47.319361] dlm: closing connection to node 302120970
Jun 26 10:32:30 test4 kernel: [   47.319371] dlm: closing connection to node 302252042
Jun 26 10:32:31 test4 lrmd: [3134]: info: RA output: (p_controld:0:stop:stderr) dlm_controld.pcmk: no process found
Jun 26 10:32:32 test4 lrmd: [3134]: info: operation stop[16] on p_controld:0 for client 3137: pid 3700 exited with return code 0
Jun 26 10:32:32 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_stop_0 (call=16, rc=0, cib-update=22, confirmed=true) ok
Jun 26 10:32:41 test4 stonith-ng: [3133]: ERROR: remote_op_done: Operation off of 302186506 by <no-one> for test4-node1[54fdc547-e40b-45ba-be79-2bdf1d021e7d]: Operation timed out
Jun 26 10:32:41 test4 ocfs2_controld: Unknown result when kicking node 302186506 from the cluster
Jun 26 10:32:41 test4 crmd: [3137]: notice: tengine_stonith_notify: Peer 302186506 was not terminated (off) by <anyone> for test4-node1: Operation timed out (ref=1be401d4-547d-4f59-b380-a5e996c70a31)
Jun 26 10:32:41 test4 ocfs2_controld: Error from dlmc_fs_result: Invalid argument
Jun 26 10:32:41 test4 ocfs2_controld: dlmcontrol connection died
Jun 26 10:32:41 test4 ocfs2_controld: [3495]: notice: terminate_ais_connection: Disconnecting from Corosync
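
The relevant line seems to be "Cluster stack specified does not match
the one currently running". If the volume was formatted under the
classic o2cb stack, my understanding is that the on-disk stamp can be
rewritten for the userspace/Pacemaker stack with something like the
following (with the volume unmounted on all nodes; untested here):

  # rewrite the cluster stack recorded in the superblock
  tunefs.ocfs2 --update-cluster-stack /dev/drbd0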

> Also, in the above thread, the 4th-to-last post from Andreas Kurz
> @Hastexo mentions this:
>
>      > This message was immediately followed by "Wrong stack" errors,
>      > and
>
>      check the content of /sys/fs/ocfs2/loaded_cluster_plugins ... and
>      if you have that config file and it contains the value "user",
>      this is a good sign you have started ocfs2/o2cb via init

It says "user", but the init scripts are still disabled and not used.
Also, it has only said "user" since I loaded the ocfs2_stack_user
module; before that change it was "ocfs2".
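
For completeness, this is how I'm checking both points (the rc?.d
layout is Debian-style and may differ elsewhere; S* links would mean
the init scripts are still enabled in some runlevel):

  # nothing should be printed if the o2cb/ocfs2 init scripts are disabled
  ls /etc/rc?.d/S*o2cb* /etc/rc?.d/S*ocfs2* 2>/dev/null
  # plugins currently loaded into the kernel
  cat /sys/fs/ocfs2/loaded_cluster_plugins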

I also tried reformatting the cluster drive; I can now mount it by
hand, but corosync still fails.
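
For anyone following along, a reformat for the userspace stack would
look something like this (device and cluster name are from my setup;
the --cluster-stack/--cluster-name options need a recent mkfs.ocfs2):

  # create the filesystem stamped for the userspace/Pacemaker stack
  mkfs.ocfs2 --cluster-stack=pcmk --cluster-name=pcmk /dev/drbd0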

Best regards
Denis Witt