[Pacemaker] node status does not change even if pacemakerd dies

Wed Jan 9 09:57:02 UTC 2013

Hi Andrew,

I have another question about this subject.
Even if pengine, stonithd, and attrd crash after pacemakerd is killed
(for example, killed by OOM_Killer), node status does not change.

* pseudo testcase

  [dev1 ~]$ crm configure show
  node $id="2472913088" dev2
  node $id="2506467520" dev1
  primitive prmDummy ocf:pacemaker:Dummy \
          op monitor on-fail="restart" interval="10s"
  property $id="cib-bootstrap-options" \
          dc-version="1.1.8-d20d06f" \
          cluster-infrastructure="corosync" \
          no-quorum-policy="ignore" \
          stonith-enabled="false" \
          startup-fencing="false"
  rsc_defaults $id="rsc-options" \
          resource-stickiness="INFINITY" \
          migration-threshold="1"

  [dev1 ~]$ pkill -9 pacemakerd
  [dev1 ~]$ pkill -9 pengine
  [dev1 ~]$ pkill -9 stonithd
  [dev1 ~]$ pkill -9 attrd

  [dev1 ~]$ ps -ef|egrep 'corosync|pacemaker'
  root   19124    1  0 14:27 ?     00:00:01 corosync
  496    19144    1  0 14:27 ?     00:00:00 /usr/libexec/pacemaker/cib
  root   19146    1  0 14:27 ?     00:00:00 /usr/libexec/pacemaker/lrmd
  496    19149    1  0 14:27 ?     00:00:00 /usr/libexec/pacemaker/crmd

  [dev1 ~]$ crm_mon -1
   :
  Stack: corosync
  Current DC: dev2 (2472913088) - partition with quorum
  Version: 1.1.8-d20d06f
  2 Nodes configured, unknown expected votes
  1 Resources configured.

  Online: [ dev1 dev2 ]

   prmDummy       (ocf::pacemaker:Dummy): Started dev1

Node (dev1) remains Online.
When other processes such as lrmd crash, the node becomes "UNCLEAN (offline)".
Is this a bug, or is it the intended behavior?
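
The check in the test case above can be automated. The following is only a
minimal shell sketch and not part of Pacemaker: the function name
missing_daemons and the variable EXPECTED_DAEMONS are hypothetical, and the
daemon names are simply the ones that appear in the ps output in this thread
(1.1.8 on corosync).

```shell
#!/bin/sh
# Assumption: daemon names as seen in the ps output of this thread.
EXPECTED_DAEMONS="pacemakerd cib stonithd lrmd attrd pengine crmd"

# Reads one process name per line on stdin and prints the expected
# daemons that are absent, space-separated, in EXPECTED_DAEMONS order.
missing_daemons() {
    running=$(cat)
    missing=""
    for d in $EXPECTED_DAEMONS; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$running" | grep -qxF "$d"; then
            missing="${missing:+$missing }$d"
        fi
    done
    printf '%s\n' "$missing"
}

# Usage on a live node ("comm=" prints bare command names, no header):
#   ps -e -o comm= | missing_daemons
```

Fed the process names from the ps output above (corosync, cib, lrmd, crmd),
it prints "pacemakerd stonithd attrd pengine", although crm_mon still
reports dev1 as Online.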

Best Regards,
Kazunori INOUE

(13.01.08 09:16), Andrew Beekhof wrote:
> On Wed, Dec 19, 2012 at 8:15 PM, Kazunori INOUE
> <inouekazu at intellilink.co.jp> wrote:
>> (12.12.13 08:26), Andrew Beekhof wrote:
>>>
>>> On Wed, Dec 12, 2012 at 8:02 PM, Kazunori INOUE
>>> <inouekazu at intellilink.co.jp> wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I recognize that pacemakerd is much less likely to crash.
>>>> However, the possibility of it being killed by OOM_Killer etc. is not zero.
>>>
>>>
>>> True.  Although we just established in another thread that we don't
>>> have any leaks :)
>>>
>>>> So I think that a user will get confused, since the behavior when a
>>>> process dies differs even while pacemakerd is running.
>>>>
>>>> case A)
>>>>    When pacemakerd and the other processes (crmd etc.) are in a
>>>>    parent-child relation.
>>>>
>>>
>>> [snip]
>>>
>>>>
>>>>    For example, crmd died.
>>>>    However, since it is relaunched, the state of the cluster is not affected.
>>>
>>>
>>> Right.
>>>
>>> [snip]
>>>
>>>>
>>>> case B)
>>>>    When pacemakerd and the other processes are NOT in a parent-child relation.
>>>>    This assumes that pacemakerd was killed and then respawned by Upstart.
>>>>
>>>>     $ service corosync start ; service pacemaker start
>>>>     $ pkill -9 pacemakerd
>>>>     $ ps -ef|egrep 'corosync|pacemaker|UID'
>>>>     UID      PID  PPID  C STIME TTY       TIME CMD
>>>>     root   21091     1  1 14:52 ?     00:00:00 corosync
>>>>     496    21099     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/cib
>>>>     root   21100     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/stonithd
>>>>     root   21101     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/lrmd
>>>>     496    21102     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/attrd
>>>>     496    21103     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/pengine
>>>>     496    21104     1  0 14:52 ?     00:00:00 /usr/libexec/pacemaker/crmd
>>>>     root   21128     1  1 14:53 ?     00:00:00 /usr/sbin/pacemakerd
>>>
>>>
>>> Yep, looks right.
>>>
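
The difference between case A and case B is visible in the PPID column of the
ps listings in this thread: in case A the daemons have pacemakerd as parent,
in case B they have PID 1. A rough shell sketch that picks out the orphaned
daemons from `ps -ef` output follows. The function name orphaned_daemons is
hypothetical, and the /usr/libexec/pacemaker path is an assumption taken from
the listings here; other builds may install the daemons elsewhere.

```shell
#!/bin/sh
# Reads `ps -ef` output on stdin (UID PID PPID C STIME TTY TIME CMD) and
# prints the name of every daemon under /usr/libexec/pacemaker/ whose
# parent PID is 1, i.e. one that is no longer a child of pacemakerd.
# pacemakerd itself is not matched: it lives in /usr/sbin and its parent
# is init/Upstart in both cases.
orphaned_daemons() {
    awk '$3 == 1 && $8 ~ /^\/usr\/libexec\/pacemaker\// {
        n = split($8, parts, "/")   # keep only the last path component
        print parts[n]
    }'
}

# Usage on a live node:
#   ps -ef | orphaned_daemons
```

No output means every daemon found still has a parent other than PID 1.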
>>
>> Hi Andrew,
>>
>> We discussed this behavior.
>> We concluded that there is room for improvement in the behavior when
>> pacemakerd and the other processes are not in a parent-child relation
>> (case B).
>>
>> Since not all users are experts, they may kill pacemakerd accidentally.
>> Such a user will get confused if the behavior after crmd dies changes
>> depending on the following conditions.
>> case A: pacemakerd and the others (crmd etc.) are in a parent-child relation.
>> case B: pacemakerd and the others are not in a parent-child relation.
>>
>> So, we want to *always* obtain the same behavior as the case where
>> there is a parent-child relation.
>> That is, when crmd etc. die, we want pacemaker to always relaunch
>> the process immediately.
>
> No. Sorry.
> Writing features to satisfy an artificial test case is not a good practice.
>
> We can speed up the failure detection for case B (I'll agree that 60s
> is way too long, 5s or 2s might be better depending on the load it
> creates), but causing downtime now to _maybe_ avoid downtime in the
> future makes no sense.
> Especially when you consider that the node will likely be fenced if
> the crmd fails anyway.
>
> Take a look at the logs from some ComponentFail test runs and you'll
> see that the parent-child relationship regularly _fails_ to prevent
> downtime.
>
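
A process is only notified by SIGCHLD when one of its own children exits, so
in case B a respawned pacemakerd presumably has to poll to notice that a
daemon it did not start has died; the polling interval then bounds the
detection delay mentioned above. The following shell sketch illustrates the
idea only. It is not how pacemakerd is implemented, and watch_pid and its
arguments are hypothetical.

```shell
#!/bin/sh
# watch_pid PID INTERVAL MAX_CHECKS
# Polls PID with `kill -0`, which sends no signal and only tests whether
# the process exists and may be signalled by the caller. Run it as root
# (or as the owner of PID): a permission error is treated as "gone".
# Returns 0 as soon as the process is gone, 1 if it is still present
# after MAX_CHECKS polls spaced INTERVAL seconds apart.
watch_pid() {
    pid=$1
    interval=$2
    max_checks=$3
    i=0
    while [ "$i" -lt "$max_checks" ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Usage: poll PID 4708 every 2 seconds, at most 30 times.
#   watch_pid 4708 2 30 && echo "process 4708 is gone"
```

With a 2 second interval the exit is noticed at most about 2 seconds late, at
the cost of one kill(2) call per watched process per interval.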
>>
>> Regards,
>> Kazunori INOUE
>>
>>
>>>>    In this case, the node will be set to UNCLEAN if crmd dies.
>>>>    That is, the node will be fenced if there is a stonith resource.
>>>
>>>
>>> Which is exactly what happens if only pacemakerd is killed with your
>>> proposal.
>>> Except now you have time to do a graceful pacemaker restart to
>>> re-establish the parent-child relationship.
>>>
>>> If you want to compare B with something, it needs to be with the old
>>> "children terminate if pacemakerd dies" strategy.
>>> Which is:
>>>
>>>>     $ service corosync start ; service pacemaker start
>>>>     $ pkill -9 pacemakerd
>>>>    ... the node will be set to UNCLEAN
>>>
>>>
>>> Old way: always downtime because children terminate which triggers fencing
>>> Our way: no downtime unless there is an additional failure (to the cib or
>>> crmd)
>>>
>>> Given that we're trying for HA, the second seems preferable.
>>>
>>>>
>>>>     $ pkill -9 crmd
>>>>     $ crm_mon -1
>>>>     Last updated: Wed Dec 12 14:53:48 2012
>>>>     Last change: Wed Dec 12 14:53:10 2012 via crmd on dev2
>>>>
>>>>     Stack: corosync
>>>>     Current DC: dev2 (2472913088) - partition with quorum
>>>>     Version: 1.1.8-3035414
>>>>
>>>>     2 Nodes configured, unknown expected votes
>>>>     0 Resources configured.
>>>>
>>>>     Node dev1 (2506467520): UNCLEAN (online)
>>>>     Online: [ dev2 ]
>>>>
>>>>
>>>> How about making behavior selectable with an option?
>>>
>>>
>>> MORE_DOWNTIME_PLEASE=(true|false) ?
>>>
>>>>
>>>> When pacemakerd dies,
>>>> mode A) behaves in the existing way. (default)
>>>> mode B) makes the node UNCLEAN.
>>>>
>>>> Best Regards,
>>>> Kazunori INOUE
>>>>
>>>>
>>>>
>>>>> Making stop work when there is no pacemakerd process is a different
>>>>> matter. We can make that work.
>>>>>
>>>>>>
>>>>>> Though the best solution is to relaunch pacemakerd, if it is difficult,
>>>>>> I think that a shortcut method is to make a node unclean.
>>>>>>
>>>>>>
>>>>>> And now, I tried Upstart a little bit.
>>>>>>
>>>>>> 1) started the corosync and pacemaker.
>>>>>>
>>>>>>     $ cat /etc/init/pacemaker.conf
>>>>>>     respawn
>>>>>>     script
>>>>>>         [ -f /etc/sysconfig/pacemaker ] && {
>>>>>>             . /etc/sysconfig/pacemaker
>>>>>>         }
>>>>>>         exec /usr/sbin/pacemakerd
>>>>>>     end script
>>>>>>
>>>>>>     $ service corosync start
>>>>>>     Starting Corosync Cluster Engine (corosync):               [  OK  ]
>>>>>>     $ initctl start pacemaker
>>>>>>     pacemaker start/running, process 4702
>>>>>>
>>>>>>
>>>>>>     $ ps -ef|egrep 'corosync|pacemaker'
>>>>>>     root   4695     1  0 17:21 ?    00:00:00 corosync
>>>>>>     root   4702     1  0 17:21 ?    00:00:00 /usr/sbin/pacemakerd
>>>>>>     496    4703  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/cib
>>>>>>     root   4704  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/stonithd
>>>>>>     root   4705  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/lrmd
>>>>>>     496    4706  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/attrd
>>>>>>     496    4707  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/pengine
>>>>>>     496    4708  4702  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/crmd
>>>>>>
>>>>>> 2) killed pacemakerd.
>>>>>>
>>>>>>     $ pkill -9 pacemakerd
>>>>>>
>>>>>>     $ ps -ef|egrep 'corosync|pacemaker'
>>>>>>     root   4695     1  0 17:21 ?    00:00:01 corosync
>>>>>>     496    4703     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/cib
>>>>>>     root   4704     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/stonithd
>>>>>>     root   4705     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/lrmd
>>>>>>     496    4706     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/attrd
>>>>>>     496    4707     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/pengine
>>>>>>     496    4708     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/crmd
>>>>>>     root   4760     1  1 17:24 ?    00:00:00 /usr/sbin/pacemakerd
>>>>>>
>>>>>> 3) Then I stopped pacemakerd. However, some processes did not stop.
>>>>>>
>>>>>>     $ initctl stop pacemaker
>>>>>>     pacemaker stop/waiting
>>>>>>
>>>>>>
>>>>>>     $ ps -ef|egrep 'corosync|pacemaker'
>>>>>>     root   4695     1  0 17:21 ?    00:00:01 corosync
>>>>>>     496    4703     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/cib
>>>>>>     root   4704     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/stonithd
>>>>>>     root   4705     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/lrmd
>>>>>>     496    4706     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/attrd
>>>>>>     496    4707     1  0 17:21 ?    00:00:00 /usr/libexec/pacemaker/pengine
>>>>>>
>>>>>> Best Regards,
>>>>>> Kazunori INOUE
>>>>>>
>>>>>>
>>>>>>>>> This isn't the case when the plugin is in use though, but then I'd
>>>>>>>>> also have expected most of the processes to die.
>>>>>>>>>
>>>>>>>> Since the node status would also change in that case,
>>>>>>>> that is the behavior we want.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ----
>>>>>>>>>> $ cat /etc/redhat-release
>>>>>>>>>> Red Hat Enterprise Linux Server release 6.3 (Santiago)
>>>>>>>>>>
>>>>>>>>>> $ ./configure --sysconfdir=/etc --localstatedir=/var --without-cman --without-heartbeat
>>>>>>>>>> -snip-
>>>>>>>>>> pacemaker configuration:
>>>>>>>>>>        Version                  = 1.1.8 (Build: 9c13d14)
>>>>>>>>>>        Features                 = generated-manpages agent-manpages
>>>>>>>>>>                                   ascii-docs publican-docs ncurses
>>>>>>>>>>                                   libqb-logging libqb-ipc lha-fencing
>>>>>>>>>>                                   corosync-native snmp
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> $ cat config.log
>>>>>>>>>> -snip-
>>>>>>>>>> | #define BUILD_VERSION "9c13d14"
>>>>>>>>>> | /* end confdefs.h.  */
>>>>>>>>>> | #include <gio/gio.h>
>>>>>>>>>> |
>>>>>>>>>> | int
>>>>>>>>>> | main ()
>>>>>>>>>> | {
>>>>>>>>>> | if (sizeof (GDBusProxy))
>>>>>>>>>> |        return 0;
>>>>>>>>>> |   ;
>>>>>>>>>> |   return 0;
>>>>>>>>>> | }
>>>>>>>>>> configure:32411: result: no
>>>>>>>>>> configure:32417: WARNING: Unable to support systemd/upstart. You need to use glib >= 2.26
>>>>>>>>>> -snip-
>>>>>>>>>> | #define BUILD_VERSION "9c13d14"
>>>>>>>>>> | #define SUPPORT_UPSTART 0
>>>>>>>>>> | #define SUPPORT_SYSTEMD 0
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>> Kazunori INOUE
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> related bugzilla:
>>>>>>>>>>>> http://bugs.clusterlabs.org/show_bug.cgi?id=5064
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Kazunori INOUE
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>>>
>>>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>>>> Getting started:
>>>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org