[Pacemaker] [Enhancement] When attrd reboots, the attribute disappears.

Sun Jun 8 22:01:30 EDT 2014

Hi All,

I previously reported a problem in the following Bugzilla entry.
 * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2501

A similar problem occurs with attrd in the latest Pacemaker.

Step 1) Configure the cluster environment as follows.
 export PCMK_fail_fast=no

Step 2) Start a cluster.

Step 3) Cause a resource to fail so that its fail-count is incremented.
--------------------------------
[root@srv01 ~]# crm_mon -1 -Af
(snip)
Online: [ srv01 ]

 before-dummy   (ocf::heartbeat:Dummy): Started srv01 
 vip-master     (ocf::heartbeat:Dummy2):        Started srv01 

Migration summary:
* Node srv01: 
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 19:21:07 2014'

Failed actions:
    before-dummy_monitor_10000 on srv01 'not running' (7): call=11, status=complete, last-rc-change='Mon Jun  9 19:21:07 2014', queued=0ms, exec=0ms
--------------------------------

Step 4) Kill attrd so that it is restarted. (This simulates attrd failing and being respawned.)

Step 5) Cause the same resource to fail again, as in step 3.
 * The fail-count goes back to 1 instead of increasing to 2.

--------------------------------
[root@srv01 ~]# crm_mon -1 -Af
(snip)
Online: [ srv01 ]

 before-dummy   (ocf::heartbeat:Dummy): Started srv01 
 vip-master     (ocf::heartbeat:Dummy2):        Started srv01 

Migration summary:
* Node srv01: 
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 19:22:47 2014'

Failed actions:
    before-dummy_monitor_10000 on srv01 'not running' (7): call=17, status=complete, last-rc-change='Mon Jun  9 19:22:47 2014', queued=0ms, exec=0ms
--------------------------------
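
The two "Migration summary" lines above can also be compared mechanically. The following is only a sketch: the helper name parse_fail_counts is hypothetical, and it assumes the crm_mon -1 -Af output format shown in this mail.

```python
import re

# Matches lines such as:
#    before-dummy: migration-threshold=10 fail-count=1 last-failure='...'
FAIL_COUNT_RE = re.compile(r"^\s*(?P<rsc>[\w.-]+):.*\bfail-count=(?P<count>\d+)")


def parse_fail_counts(crm_mon_output):
    """Return {resource: fail-count} from text printed by crm_mon -1 -Af."""
    counts = {}
    for line in crm_mon_output.splitlines():
        match = FAIL_COUNT_RE.match(line)
        if match:
            counts[match.group("rsc")] = int(match.group("count"))
    return counts


# Migration summary lines copied from step 3 and step 5 of this report.
step3 = "   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 19:21:07 2014'"
step5 = "   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 19:22:47 2014'"

before = parse_fail_counts(step3)["before-dummy"]
after = parse_fail_counts(step5)["before-dummy"]

# A second failure should raise the count; an unchanged count shows the loss.
if after <= before:
    print("fail-count did not increase: %d -> %d" % (before, after))
```

With the values from this report, the script reports that the count stayed at 1 even though a second failure occurred.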

I think attrd needs to be improved so that attributes are reliably preserved even when attrd itself is restarted.
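
To illustrate the difference between the behaviour seen in steps 3 to 5 and the behaviour requested here, below is a toy model. It is not Pacemaker code: the class ToyAttrd, the restore_on_start flag, and the dict used as a durable store are all hypothetical, and the model simply assumes that a counter kept only in the daemon's memory starts from nothing after a restart.

```python
class ToyAttrd:
    """Toy stand-in for an attribute daemon; NOT Pacemaker's implementation."""

    def __init__(self, persisted, restore_on_start):
        # 'persisted' plays the role of state that outlives the daemon
        # process (a plain dict here, purely for illustration).
        self.persisted = persisted
        # In-memory attribute table; it starts empty unless restored.
        self.values = dict(persisted) if restore_on_start else {}

    def increment(self, name):
        # The increment is evaluated against the in-memory value only.
        self.values[name] = self.values.get(name, 0) + 1
        self.persisted[name] = self.values[name]
        return self.values[name]


def fail_twice_with_restart(restore_on_start):
    """Replay steps 3 to 5 and return the fail-count after the second failure."""
    store = {}
    name = "fail-count-before-dummy"  # illustrative attribute name
    daemon = ToyAttrd(store, restore_on_start)
    daemon.increment(name)                       # step 3: first failure
    daemon = ToyAttrd(store, restore_on_start)   # step 4: daemon restarted
    return daemon.increment(name)                # step 5: second failure


print(fail_twice_with_restart(restore_on_start=False))  # behaviour reported here
print(fail_twice_with_restart(restore_on_start=True))   # behaviour requested
```

In this model the first call returns 1, matching the fail-count seen in step 5, and the second returns 2, which is the value I would expect after two failures.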

Best Regards,
Hideo Yamauch.