[Pacemaker] booth is in the "started" state on pacemaker before booth writes ticket info in cib.

Jiaju Zhang jjzhang at suse.de
Thu Feb 21 15:55:05 UTC 2013


On Wed, 2013-02-20 at 16:26 +0900, Yuichi SEINO wrote:
> Hi Jiaju,
> 
> I am testing this patch.
> When a lockfile was removed, it seems that the stop action of the RA
> does not behave as intended.

I'm just curious how the lockfile was removed. Basically, the existence
of the lockfile shows that one boothd is started, and prevents another
one from being wrongly started. So the lockfile should not be removed
intentionally by the admin.

Thanks,
Jiaju

> Currently, if "pidnum" is empty, the RA runs "cat
> /proc//cmdline", which reads /proc/cmdline, the kernel boot parameter
> file. So, I added a check for the existence of the lockfile.
> 
> diff --git a/script/ocf/booth-site b/script/ocf/booth-site
> index 2575643..7c775dc 100755
> --- a/script/ocf/booth-site
> +++ b/script/ocf/booth-site
> @@ -116,6 +116,10 @@ booth_check_daemon_state(){
> 
>         case $rc in
>         $OCF_SUCCESS)
> +               if [ ! -f $lockfile ]; then
> +                       ocf_log err "lockfile not exists.(${lockfile})"
> +                       return $BOOTH_DAEMON_EXIST;
> +               fi
>                 pidnum=$(cat $lockfile |awk '{print $1}')
>                 daemonstate=$(cat $lockfile |awk '{print $2}')
>                 if cat /proc/$pidnum/cmdline |grep $OCF_RESKEY_type >/dev/null 2>&1; then
> 
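
The guard discussed above can also be sketched as a standalone helper. This is only an illustration, assuming the lockfile holds the PID and the daemon state as the first two whitespace-separated fields (as the awk calls in the diff suggest). The function name read_lock_info is hypothetical and not part of booth-site.

```shell
#!/bin/sh
# Hypothetical helper (not part of booth-site): print "<pid> <state>" read
# from a lockfile, or fail before anything under /proc is touched.
read_lock_info() {
    lockfile=$1

    # A missing lockfile would otherwise leave pidnum empty.
    [ -f "$lockfile" ] || return 1

    # Assumed layout: first field is the PID, second field is the state.
    pidnum=$(awk 'NR==1 {print $1}' "$lockfile")
    daemonstate=$(awk 'NR==1 {print $2}' "$lockfile")

    # Reject an empty or non-numeric PID so that /proc/$pidnum/cmdline can
    # never collapse to /proc//cmdline.
    case $pidnum in
        ''|*[!0-9]*) return 1 ;;
    esac

    echo "$pidnum $daemonstate"
}
```

A caller could treat a non-zero return as "daemon state unknown". Which OCF return code fits that case best is left open here.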
> When this happened, I got the following output from "crm resource trace booth":
> 
> + 21:09:48: 223: '[' '!' ']'
> + 21:09:48: 224: OCF_RESKEY_daemon=boothd
> + 21:09:48: 227: '[' '!' ']'
> + 21:09:48: 228: OCF_RESKEY_type=site
> + 21:09:48: 231: case $__OCF_ACTION in
> + 21:09:48: 236: booth_stop
> + 21:09:48: booth_stop:166: booth_check_daemon_state
> + 21:09:48: booth_check_daemon_state:115: booth_check_daemon_exist
> + 21:09:48: booth_check_daemon_exist:105: killall -0 boothd
> + 21:09:48: booth_check_daemon_exist:105: rc=0
> + 21:09:48: booth_check_daemon_exist:107: case $rc in
> + 21:09:48: booth_check_daemon_exist:108: return 0
> + 21:09:48: booth_check_daemon_state:115: rc=0
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> ++ 21:09:48: booth_check_daemon_state:119: awk '{print $1}'
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> ++ 21:09:48: booth_check_daemon_state:119: cat /var/run/booth.pid
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:119: pidnum=
> ++ 21:09:48: booth_check_daemon_state:120: awk '{print $2}'
> ++ 21:09:48: booth_check_daemon_state:120: cat /var/run/booth.pid
> + 21:09:48: booth_check_daemon_state:120: daemonstate=
> + 21:09:48: booth_check_daemon_state:121: grep site
> + 21:09:48: booth_check_daemon_state:121: cat /proc//cmdline
> + 21:09:48: booth_check_daemon_state:122: case $daemonstate in
> + 21:09:48: booth_check_daemon_state:125: return 4
> + 21:09:48: booth_stop:166: rc=4
> + 21:09:48: booth_stop:168: case $rc in
> + 21:09:48: booth_stop:173: return 1
> + 21:09:48: 246: rc=1
> + 21:09:48: 248: exit 1
> 
> 
> 
> 
> 2013/2/19 Jiaju Zhang <jjzhang at suse.de>:
> > Hi Yuichi,
> >
> > On Tue, 2013-02-19 at 10:27 +0900, Yuichi SEINO wrote:
> >> Hi Xia,
> >>
> >> I have a question about the following part. The write man page
> >> explains that "errno" is set appropriately if write returns -1. So,
> >> if "rv" is equal to 0, strerror(errno) may not output the correct
> >> message. What do you think about it?
> >
> > Good catch, I think we should differentiate the cases of rv == -1 or rv
> > == 0. Maybe setting errno to ENOSPC when rv == 0.
> >
> > BTW, apart from that, does this patch fix your original issue?
> >
> > Thanks,
> > Jiaju
> >
> 
> 
> 
> --
> Yuichi SEINO
> METROSYSTEMS CORPORATION
> E-mail:seino.cluster2 at gmail.com





