[Pacemaker] Corosync and Pacemaker Hangs

Fri Sep 12 04:06:15 UTC 2014

12.09.2014 05:00, Norbert Kiam Maclang wrote:
> Hi,
> 
> After adding resource level fencing on drbd, I still ended up having
> problems with timeouts on drbd. Are there recommended settings for
> this? I followed what is written in the drbd documentation -
> http://www.drbd.org/users-guide-emb/s-pacemaker-crm-drbd-backed-service.html
> 
> Another thing I can't understand is why during initial tests, even if I
> reboot the vms several times, failover works. But after I soak it for a
> couple of hours (say, 8 hours or more) and continue with the
> tests, it will not fail over and ends up in split brain. I did confirm
> that everything is healthy before performing a reboot: disk
> health and network are good, drbd is synced, and time between servers is
> in sync.

I recall seeing something similar a year ago (around the time your
pacemaker version is dated). I do not remember the exact cause of the
problem, but I saw the drbd RA time out because it was waiting for
something (fencing) to complete in kernel space. drbd calls userspace
scripts from within kernel space, and you'll see them in the process list
with the drbd kernel thread as their parent.

I'd also upgrade your corosync configuration from the "member" to the
"nodelist" syntax, specifying the "name" parameter together with ring0_addr
for each node (that parameter is not referenced in the corosync docs but
should be somewhere in Pacemaker Explained - it is used only by pacemaker).
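
A minimal sketch of what the nodelist form might look like, assuming the
addresses and node names from your quoted configuration (10.2.136.56 for
node01, 10.2.136.57 for node02). I have not tested this against your
versions, so treat it as a starting point only. The "member" blocks would be
dropped from the interface section, and the rest of totem stays as it is:

```
totem {
        version: 2
        transport: udpu
        # ... other totem options unchanged ...
        interface {
                ringnumber: 0
                bindnetaddr: 10.2.136.0
                mcastport: 5405
        }
}

nodelist {
        node {
                ring0_addr: 10.2.136.56
                name: node01
        }
        node {
                ring0_addr: 10.2.136.57
                name: node02
        }
}
```

I left out nodeid on purpose so that the auto-generated IDs stay as they
are now; whether you want to set it explicitly is up to you.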

There is also trace_ra support in both pacemaker and crmsh (I cannot say
whether it is supported in the versions you have, though probably yes), so
you may want to play with that to get the exact picture from the
resource agent.

Anyway, upgrading pacemaker to 1.1.12 and crmsh to a more recent version
would be worthwhile, because you may just be hitting a bug/issue that was
solved and forgotten long ago.

Concerning your
> 	expected-quorum-votes="1"

You need to configure votequorum in corosync with two_node: 1 instead of
that line.
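
Roughly, the quorum section would then look like the sketch below. This
follows the example in votequorum(5). Note that two_node: 1 also turns on
wait_for_all by default, so after a full cluster shutdown both nodes have to
be seen once before the cluster becomes quorate. Please check it against the
man page shipped with your corosync before relying on it:

```
quorum {
        provider: corosync_votequorum
        expected_votes: 2
        two_node: 1
}
```

This replaces the expected_votes: 1 you have in corosync.conf now.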

> 
> # Logs:
> node01 lrmd[1036]:  warning: child_timeout_callback: drbd_pg_monitor_29000 process (PID 27744) timed out
> node01 lrmd[1036]:  warning: operation_finished: drbd_pg_monitor_29000:27744 - timed out after 20000ms
> node01 crmd[1039]:    error: process_lrm_event: LRM operation drbd_pg_monitor_29000 (69) Timed Out (timeout=20000ms)
> node01 crmd[1039]:  warning: update_failcount: Updating failcount for drbd_pg on tyo1mqdb01p after failed monitor: rc=1 (update=value++, time=1410486352)
> 
> Thanks,
> Kiam
> 
> On Thu, Sep 11, 2014 at 6:58 PM, Norbert Kiam Maclang
> <norbert.kiam.maclang at gmail.com> wrote:
> 
>     Thank you Vladislav.
> 
>     I have configured resource level fencing on drbd and removed
>     wfc-timeout and degr-wfc-timeout (is this required?). My drbd
>     configuration is now:
> 
>     resource pg {
>       device /dev/drbd0;
>       disk /dev/vdb;
>       meta-disk internal;
>       disk {
>         fencing resource-only;
>         on-io-error detach;
>         resync-rate 40M;
>       }
>       handlers {
>         fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>         after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>         split-brain "/usr/lib/drbd/notify-split-brain.sh nkbm";
>       }
>       on node01 {
>         address 10.2.136.52:7789;
>       }
>       on node02 {
>         address 10.2.136.55:7789;
>       }
>       net {
>         verify-alg md5;
>         after-sb-0pri discard-zero-changes;
>         after-sb-1pri discard-secondary;
>         after-sb-2pri disconnect;
>       }
>     }
> 
>     Failover works on my initial test (restarting both nodes alternately
>     - this always works). I will wait a couple of hours before doing a
>     failover test again (which always failed on my previous setup).
> 
>     Thank you!
>     Kiam
> 
>     On Thu, Sep 11, 2014 at 2:14 PM, Vladislav Bogdanov
>     <bubble at hoster-ok.com> wrote:
> 
>         11.09.2014 05:57, Norbert Kiam Maclang wrote:
>         > Is this something to do with quorum? But I already set
> 
>         You'd need to configure fencing at the drbd resources level.
> 
>         http://www.drbd.org/users-guide-emb/s-pacemaker-fencing.html#s-pacemaker-fencing-cib
> 
> 
>         >
>         > property no-quorum-policy="ignore" \
>         > expected-quorum-votes="1"
>         >
>         > Thanks in advance,
>         > Kiam
>         >
>         > On Thu, Sep 11, 2014 at 10:09 AM, Norbert Kiam Maclang
>         > <norbert.kiam.maclang at gmail.com> wrote:
>         >
>         >     Hi,
>         >
>         >     Please help me understand what is causing the problem. I have a
>         >     2 node cluster running on vms using KVM. Each vm (I am using
>         >     Ubuntu 14.04) runs on a separate hypervisor on separate machines.
>         >     All are working well during testing (I restarted the vms
>         >     alternately), but after a day, when I kill the other node, I
>         >     always end up with corosync and pacemaker hanging on the
>         >     surviving node. Date and time on the vms are in sync, I use
>         >     unicast, tcpdump shows both nodes exchanging traffic, I
>         >     confirmed that DRBD is healthy, and crm_mon shows good status
>         >     before I kill the other node. Below are my configurations and
>         >     the versions I used:
>         >
>         >     corosync             2.3.3-1ubuntu1
>         >     crmsh                1.2.5+hg1034-1ubuntu3
>         >     drbd8-utils          2:8.4.4-1ubuntu1
>         >     libcorosync-common4  2.3.3-1ubuntu1
>         >     libcrmcluster4       1.1.10+git20130802-1ubuntu2
>         >     libcrmcommon3        1.1.10+git20130802-1ubuntu2
>         >     libcrmservice1       1.1.10+git20130802-1ubuntu2
>         >     pacemaker            1.1.10+git20130802-1ubuntu2
>         >     pacemaker-cli-utils  1.1.10+git20130802-1ubuntu2
>         >     postgresql-9.3       9.3.5-0ubuntu0.14.04.1
>         >
>         >     # /etc/corosync/corosync:
>         >     totem {
>         >             version: 2
>         >             token: 3000
>         >             token_retransmits_before_loss_const: 10
>         >             join: 60
>         >             consensus: 3600
>         >             vsftype: none
>         >             max_messages: 20
>         >             clear_node_high_bit: yes
>         >             secauth: off
>         >             threads: 0
>         >             rrp_mode: none
>         >             interface {
>         >                     member {
>         >                             memberaddr: 10.2.136.56
>         >                     }
>         >                     member {
>         >                             memberaddr: 10.2.136.57
>         >                     }
>         >                     ringnumber: 0
>         >                     bindnetaddr: 10.2.136.0
>         >                     mcastport: 5405
>         >             }
>         >             transport: udpu
>         >     }
>         >     amf {
>         >             mode: disabled
>         >     }
>         >     quorum {
>         >             provider: corosync_votequorum
>         >             expected_votes: 1
>         >     }
>         >     aisexec {
>         >             user:   root
>         >             group:  root
>         >     }
>         >     logging {
>         >             fileline: off
>         >             to_stderr: yes
>         >             to_logfile: no
>         >             to_syslog: yes
>         >             syslog_facility: daemon
>         >             debug: off
>         >             timestamp: on
>         >             logger_subsys {
>         >                     subsys: AMF
>         >                     debug: off
>         >                     tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>         >             }
>         >     }
>         >
>         >     # /etc/corosync/service.d/pcmk:
>         >     service {
>         >       name: pacemaker
>         >       ver: 1
>         >     }
>         >
>         >     # /etc/drbd.d/global_common.conf:
>         >     global {
>         >       usage-count no;
>         >     }
>         >
>         >     common {
>         >       net {
>         >         protocol C;
>         >       }
>         >     }
>         >
>         >     # /etc/drbd.d/pg.res:
>         >     resource pg {
>         >       device /dev/drbd0;
>         >       disk /dev/vdb;
>         >       meta-disk internal;
>         >       startup {
>         >         wfc-timeout 15;
>         >         degr-wfc-timeout 60;
>         >       }
>         >       disk {
>         >         on-io-error detach;
>         >         resync-rate 40M;
>         >       }
>         >       on node01 {
>         >         address 10.2.136.56:7789;
>         >       }
>         >       on node02 {
>         >         address 10.2.136.57:7789;
>         >       }
>         >       net {
>         >         verify-alg md5;
>         >         after-sb-0pri discard-zero-changes;
>         >         after-sb-1pri discard-secondary;
>         >         after-sb-2pri disconnect;
>         >       }
>         >     }
>         >
>         >     # Pacemaker configuration:
>         >     node $id="167938104" node01
>         >     node $id="167938105" node02
>         >     primitive drbd_pg ocf:linbit:drbd \
>         >             params drbd_resource="pg" \
>         >             op monitor interval="29s" role="Master" \
>         >             op monitor interval="31s" role="Slave"
>         >     primitive fs_pg ocf:heartbeat:Filesystem \
>         >             params device="/dev/drbd0" directory="/var/lib/postgresql/9.3/main" fstype="ext4"
>         >     primitive ip_pg ocf:heartbeat:IPaddr2 \
>         >             params ip="10.2.136.59" cidr_netmask="24" nic="eth0"
>         >     primitive lsb_pg lsb:postgresql
>         >     group PGServer fs_pg lsb_pg ip_pg
>         >     ms ms_drbd_pg drbd_pg \
>         >             meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>         >     colocation pg_on_drbd inf: PGServer ms_drbd_pg:Master
>         >     order pg_after_drbd inf: ms_drbd_pg:promote PGServer:start
>         >     property $id="cib-bootstrap-options" \
>         >             dc-version="1.1.10-42f2063" \
>         >             cluster-infrastructure="corosync" \
>         >             stonith-enabled="false" \
>         >             no-quorum-policy="ignore"
>         >     rsc_defaults $id="rsc-options" \
>         >             resource-stickiness="100"
>         >
>         >     # Logs on node01
>         >     Sep 10 10:25:33 node01 crmd[1019]:   notice: peer_update_callback: Our peer on the DC is dead
>         >     Sep 10 10:25:33 node01 crmd[1019]:   notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK origin=peer_update_callback ]
>         >     Sep 10 10:25:33 node01 crmd[1019]:   notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
>         >     Sep 10 10:25:33 node01 corosync[940]:   [TOTEM ] A new membership (10.2.136.56:52) was formed. Members left: 167938105
>         >     Sep 10 10:25:45 node01 kernel: [74452.740024] d-con pg: PingAck did not arrive in time.
>         >     Sep 10 10:25:45 node01 kernel: [74452.740169] d-con pg: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>         >     Sep 10 10:25:45 node01 kernel: [74452.740987] d-con pg: asender terminated
>         >     Sep 10 10:25:45 node01 kernel: [74452.740999] d-con pg: Terminating drbd_a_pg
>         >     Sep 10 10:25:45 node01 kernel: [74452.741235] d-con pg: Connection closed
>         >     Sep 10 10:25:45 node01 kernel: [74452.741259] d-con pg: conn( NetworkFailure -> Unconnected )
>         >     Sep 10 10:25:45 node01 kernel: [74452.741260] d-con pg: receiver terminated
>         >     Sep 10 10:25:45 node01 kernel: [74452.741261] d-con pg: Restarting receiver thread
>         >     Sep 10 10:25:45 node01 kernel: [74452.741262] d-con pg: receiver (re)started
>         >     Sep 10 10:25:45 node01 kernel: [74452.741269] d-con pg: conn( Unconnected -> WFConnection )
>         >     Sep 10 10:26:12 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 8445) timed out
>         >     Sep 10 10:26:12 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:8445 - timed out after 20000ms
>         >     Sep 10 10:26:12 node01 crmd[1019]:    error: process_lrm_event: LRM operation drbd_pg_monitor_31000 (30) Timed Out (timeout=20000ms)
>         >     Sep 10 10:26:32 node01 crmd[1019]:  warning: cib_rsc_callback: Resource update 23 failed: (rc=-62) Timer expired
>         >     Sep 10 10:27:03 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 8693) timed out
>         >     Sep 10 10:27:03 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:8693 - timed out after 20000ms
>         >     Sep 10 10:27:54 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 8938) timed out
>         >     Sep 10 10:27:54 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:8938 - timed out after 20000ms
>         >     Sep 10 10:28:33 node01 crmd[1019]:    error: crm_timer_popped: Integration Timer (I_INTEGRATED) just popped in state S_INTEGRATION! (180000ms)
>         >     Sep 10 10:28:33 node01 crmd[1019]:  warning: do_state_transition: Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
>         >     Sep 10 10:28:33 node01 crmd[1019]:  warning: do_state_transition: 1 cluster nodes failed to respond to the join offer.
>         >     Sep 10 10:28:33 node01 crmd[1019]:   notice: crmd_join_phase_log: join-1: node02=none
>         >     Sep 10 10:28:33 node01 crmd[1019]:   notice: crmd_join_phase_log: join-1: node01=welcomed
>         >     Sep 10 10:28:45 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 9185) timed out
>         >     Sep 10 10:28:45 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:9185 - timed out after 20000ms
>         >     Sep 10 10:29:36 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 9432) timed out
>         >     Sep 10 10:29:36 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:9432 - timed out after 20000ms
>         >     Sep 10 10:30:27 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 9680) timed out
>         >     Sep 10 10:30:27 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:9680 - timed out after 20000ms
>         >     Sep 10 10:31:18 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 9927) timed out
>         >     Sep 10 10:31:18 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:9927 - timed out after 20000ms
>         >     Sep 10 10:32:09 node01 lrmd[1016]:  warning: child_timeout_callback: drbd_pg_monitor_31000 process (PID 10174) timed out
>         >     Sep 10 10:32:09 node01 lrmd[1016]:  warning: operation_finished: drbd_pg_monitor_31000:10174 - timed out after 20000ms
>         >
>         >     #crm_mon on node01 before I kill the other vm:
>         >     Stack: corosync
>         >     Current DC: node02 (167938104) - partition with quorum
>         >     Version: 1.1.10-42f2063
>         >     2 Nodes configured
>         >     5 Resources configured
>         >
>         >     Online: [ node01 node02 ]
>         >
>         >      Resource Group: PGServer
>         >          fs_pg      (ocf::heartbeat:Filesystem):    Started node02
>         >          lsb_pg     (lsb:postgresql):       Started node02
>         >          ip_pg      (ocf::heartbeat:IPaddr2):       Started node02
>         >      Master/Slave Set: ms_drbd_pg [drbd_pg]
>         >          Masters: [ node02 ]
>         >          Slaves: [ node01 ]
>         >
>         >     Thank you,
>         >     Kiam
>         >
>         >
>         >
>         >
>         > _______________________________________________
>         > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>         > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>         >
>         > Project Home: http://www.clusterlabs.org
>         > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>         > Bugs: http://bugs.clusterlabs.org
>         >
> 
> 