[Pacemaker] node status does not change even if pacemakerd dies
Andrew Beekhof
andrew at beekhof.net
Tue Jan 8 00:16:10 UTC 2013
On Wed, Dec 19, 2012 at 8:15 PM, Kazunori INOUE
<inouekazu at intellilink.co.jp> wrote:
> (12.12.13 08:26), Andrew Beekhof wrote:
>>
>> On Wed, Dec 12, 2012 at 8:02 PM, Kazunori INOUE
>> <inouekazu at intellilink.co.jp> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I recognize that pacemakerd is much less likely to crash.
>>> However, the possibility of it being killed by the OOM killer etc. is
>>> not 0%.
>>
>>
>> True. Although we just established in another thread that we don't
>> have any leaks :)
>>
>>> So I think that a user may get confused, since the behavior at the
>>> time of process death differs even while pacemakerd is running.
>>>
>>> case A)
>>> When pacemakerd and the other processes (crmd etc.) are in a
>>> parent-child relationship.
>>>
>>
>> [snip]
>>
>>>
>>> For example, if crmd dies it is relaunched, so the state of the
>>> cluster is not affected.
>>
>>
>> Right.
>>
>> [snip]
>>
>>>
>>> case B)
>>> When pacemakerd and the other processes are NOT in a parent-child
>>> relationship. Here pacemakerd was killed and then respawned by
>>> Upstart.
>>>
>>> $ service corosync start ; service pacemaker start
>>> $ pkill -9 pacemakerd
>>> $ ps -ef|egrep 'corosync|pacemaker|UID'
>>> UID        PID  PPID  C STIME TTY          TIME CMD
>>> root     21091     1  1 14:52 ?        00:00:00 corosync
>>> 496      21099     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/cib
>>> root     21100     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/stonithd
>>> root     21101     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/lrmd
>>> 496      21102     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/attrd
>>> 496      21103     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/pengine
>>> 496      21104     1  0 14:52 ?        00:00:00 /usr/libexec/pacemaker/crmd
>>> root     21128     1  1 14:53 ?        00:00:00 /usr/sbin/pacemakerd
>>
>>
>> Yep, looks right.
>>
>
> Hi Andrew,
>
> We discussed this behavior, and concluded that the behavior when
> pacemakerd and the other processes are not in a parent-child
> relationship (case B) has room for improvement.
>
> Since not all users are experts, they may kill pacemakerd accidentally.
> Such a user will be confused if the behavior after crmd death depends
> on the following conditions:
> case A: pacemakerd and the others (crmd etc.) are in a parent-child relationship.
> case B: pacemakerd and the others are not in a parent-child relationship.
>
> So, we want to *always* obtain the same behavior as the case where
> there is a parent-child relationship.
> That is, when crmd etc. die, we want pacemaker to always relaunch
> the process immediately.
No. Sorry.
Writing features to satisfy an artificial test case is not a good practice.
We can speed up the failure detection for case B (I'll agree that 60s
is way too long; 5s or 2s might be better, depending on the load it
creates), but causing downtime now to _maybe_ avoid downtime in the
future makes no sense.
Especially when you consider that the node will likely be fenced if
the crmd fails anyway.
Take a look at the logs from some ComponentFail test runs and you'll
see that the parent-child relationship regularly _fails_ to prevent
downtime.
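For what it's worth, the kind of liveness check being discussed can be
sketched coarsely in shell (illustrative only; the real detection would
happen in the cluster membership layer, and the function name and the
2s interval are examples, not anything pacemaker provides):

```shell
# Illustrative sketch only: poll for a named daemon at a short interval.
alive() {
    # pgrep -o -x: oldest process whose name matches $1 exactly;
    # kill -0 tests for existence/permission without sending a signal.
    pid=$(pgrep -o -x "$1") || return 1
    kill -0 "$pid" 2>/dev/null
}

if alive pacemakerd; then
    echo "pacemakerd is running"
else
    echo "pacemakerd is gone - a 2s poll would notice within 2s"
fi
```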
>
> Regards,
> Kazunori INOUE
>
>
>>> In this case, the node will be set to UNCLEAN if crmd dies.
>>> That is, the node will be fenced if there is a stonith resource.
>>
>>
>> Which is exactly what happens if only pacemakerd is killed with your
>> proposal.
>> Except now you have time to do a graceful pacemaker restart to
>> re-establish the parent-child relationship.
>>
>> If you want to compare B with something, it needs to be with the old
>> "children terminate if pacemakerd dies" strategy.
>> Which is:
>>
>>> $ service corosync start ; service pacemaker start
>>> $ pkill -9 pacemakerd
>>> ... the node will be set to UNCLEAN
>>
>>
>> Old way: always downtime, because children terminate, which triggers fencing.
>> Our way: no downtime unless there is an additional failure (of the cib
>> or crmd).
>>
>> Given that we're trying for HA, the second seems preferable.
>>
>>>
>>> $ pkill -9 crmd
>>> $ crm_mon -1
>>> Last updated: Wed Dec 12 14:53:48 2012
>>> Last change: Wed Dec 12 14:53:10 2012 via crmd on dev2
>>>
>>> Stack: corosync
>>> Current DC: dev2 (2472913088) - partition with quorum
>>> Version: 1.1.8-3035414
>>>
>>> 2 Nodes configured, unknown expected votes
>>> 0 Resources configured.
>>>
>>> Node dev1 (2506467520): UNCLEAN (online)
>>> Online: [ dev2 ]
>>>
>>>
>>> How about making behavior selectable with an option?
>>
>>
>> MORE_DOWNTIME_PLEASE=(true|false) ?
>>
>>>
>>> When pacemakerd dies,
>>> mode A) behave in the existing way (default).
>>> mode B) make the node UNCLEAN.
>>>
>>> Best Regards,
>>> Kazunori INOUE
>>>
>>>
>>>
>>>> Making stop work when there is no pacemakerd process is a different
>>>> matter. We can make that work.
>>>>
>>>>>
>>>>> Though the best solution is to relaunch pacemakerd, if that is
>>>>> difficult, I think a shortcut is to make the node unclean.
>>>>>
>>>>>
>>>>> And now, I tried Upstart a little bit.
>>>>>
>>>>> 1) Started corosync and pacemaker.
>>>>>
>>>>> $ cat /etc/init/pacemaker.conf
>>>>> respawn
>>>>> script
>>>>>     [ -f /etc/sysconfig/pacemaker ] && {
>>>>>         . /etc/sysconfig/pacemaker
>>>>>     }
>>>>>     exec /usr/sbin/pacemakerd
>>>>> end script
>>>>>
>>>>> $ service co start
>>>>> Starting Corosync Cluster Engine (corosync): [ OK ]
>>>>> $ initctl start pacemaker
>>>>> pacemaker start/running, process 4702
>>>>>
>>>>>
>>>>> $ ps -ef|egrep 'corosync|pacemaker'
>>>>> root      4695     1  0 17:21 ?        00:00:00 corosync
>>>>> root      4702     1  0 17:21 ?        00:00:00 /usr/sbin/pacemakerd
>>>>> 496       4703  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/cib
>>>>> root      4704  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/stonithd
>>>>> root      4705  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/lrmd
>>>>> 496       4706  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/attrd
>>>>> 496       4707  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/pengine
>>>>> 496       4708  4702  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/crmd
>>>>>
>>>>> 2) Killed pacemakerd.
>>>>>
>>>>> $ pkill -9 pacemakerd
>>>>>
>>>>> $ ps -ef|egrep 'corosync|pacemaker'
>>>>> root      4695     1  0 17:21 ?        00:00:01 corosync
>>>>> 496       4703     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/cib
>>>>> root      4704     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/stonithd
>>>>> root      4705     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/lrmd
>>>>> 496       4706     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/attrd
>>>>> 496       4707     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/pengine
>>>>> 496       4708     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/crmd
>>>>> root      4760     1  1 17:24 ?        00:00:00 /usr/sbin/pacemakerd
>>>>>
>>>>> 3) Then I stopped pacemakerd; however, some processes did not stop.
>>>>>
>>>>> $ initctl stop pacemaker
>>>>> pacemaker stop/waiting
>>>>>
>>>>>
>>>>> $ ps -ef|egrep 'corosync|pacemaker'
>>>>> root      4695     1  0 17:21 ?        00:00:01 corosync
>>>>> 496       4703     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/cib
>>>>> root      4704     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/stonithd
>>>>> root      4705     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/lrmd
>>>>> 496       4706     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/attrd
>>>>> 496       4707     1  0 17:21 ?        00:00:00 /usr/libexec/pacemaker/pengine
>>>>>
>>>>> Best Regards,
>>>>> Kazunori INOUE
>>>>>
>>>>>
>>>>>>>> This isn't the case when the plugin is in use, though; but then
>>>>>>>> I'd also have expected most of the processes to die.
>>>>>>>>
>>>>>>> Since the node status would also change if that were the result,
>>>>>>> that is the behavior we would like.
>>>>>>>
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> $ cat /etc/redhat-release
>>>>>>>>> Red Hat Enterprise Linux Server release 6.3 (Santiago)
>>>>>>>>>
>>>>>>>>> $ ./configure --sysconfdir=/etc --localstatedir=/var --without-cman --without-heartbeat
>>>>>>>>> -snip-
>>>>>>>>> pacemaker configuration:
>>>>>>>>>   Version  = 1.1.8 (Build: 9c13d14)
>>>>>>>>>   Features = generated-manpages agent-manpages ascii-docs publican-docs ncurses libqb-logging libqb-ipc lha-fencing corosync-native snmp
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> $ cat config.log
>>>>>>>>> -snip-
>>>>>>>>> 6000 | #define BUILD_VERSION "9c13d14"
>>>>>>>>> 6001 | /* end confdefs.h. */
>>>>>>>>> 6002 | #include <gio/gio.h>
>>>>>>>>> 6003 |
>>>>>>>>> 6004 | int
>>>>>>>>> 6005 | main ()
>>>>>>>>> 6006 | {
>>>>>>>>> 6007 | if (sizeof (GDBusProxy))
>>>>>>>>> 6008 | return 0;
>>>>>>>>> 6009 | ;
>>>>>>>>> 6010 | return 0;
>>>>>>>>> 6011 | }
>>>>>>>>> 6012 configure:32411: result: no
>>>>>>>>> 6013 configure:32417: WARNING: Unable to support systemd/upstart. You need to use glib >= 2.26
>>>>>>>>> -snip-
>>>>>>>>> 6286 | #define BUILD_VERSION "9c13d14"
>>>>>>>>> 6287 | #define SUPPORT_UPSTART 0
>>>>>>>>> 6288 | #define SUPPORT_SYSTEMD 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Kazunori INOUE
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> related bugzilla:
>>>>>>>>>>> http://bugs.clusterlabs.org/show_bug.cgi?id=5064
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Kazunori INOUE
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>>
>>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>>> Getting started:
>>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org
>
>