[Pacemaker] [Problem]A monitor of Master stops when crm command repeat the movement of the resource.

Tue Nov 8 00:27:06 EST 2011

Hi All,

We tested moving a resource in a Master/Slave configuration.

============
Last updated: Tue Nov  8 14:12:23 2011
Stack: Heartbeat
Current DC: bl460g1b (1b34eec8-1d62-488b-a7fb-8e4b38f95ec3) - partition with quorum
Version: 1.0.11-9af47ddebcad19e35a61b2a20301dc038018e8e8
2 Nodes configured, unknown expected votes
3 Resources configured.
============

Online: [ bl460g1a bl460g1b ]

 Resource Group: master-group
     vip-master (ocf::heartbeat:Dummy): Started bl460g1a
     vip-rep    (ocf::heartbeat:Dummy): Started bl460g1a
 Master/Slave Set: msPostgresql
     Masters: [ bl460g1a ]
     Slaves: [ bl460g1b ]
 Clone Set: clnPingCheck
     Started: [ bl460g1a bl460g1b ]

Migration summary:
* Node bl460g1b: 
* Node bl460g1a: 

I changed the monitor handler of the Stateful RA so that it logs every call.
(snip)
stateful_monitor() {
    echo "Stateful monitor" >> /tmp/test.log
    stateful_check_state "master"
(snip)
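
If it helps to see exactly when the monitor went quiet, the trace line could also carry a timestamp. The following is only a minimal sketch: the helper name log_monitor_call and its optional log-file argument are hypothetical and are not part of the Stateful RA.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Stateful RA): append a timestamped
# trace line so that gaps between monitor calls can be measured later.
# $1 is an optional log file; it defaults to the /tmp/test.log used above.
log_monitor_call() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') Stateful monitor" >> "${1:-/tmp/test.log}"
}
```

Calling log_monitor_call at the top of stateful_monitor() in place of the plain echo would keep the same marker text at the end of each line.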

I repeated the move with the following script.

#!/bin/sh
# Move vip-rep, wait 60 seconds, clear the move constraint, and repeat.
i=1
while true; do
  echo "##############################" >> /tmp/test.log
  echo "move $i"
  crm resource move vip-rep
  echo "sleep"
  sleep 60
  crm resource unmove vip-rep
  i=$((i + 1))
done

After the script has repeated the move for a while, the monitor operation of the Master is no longer executed.
(The problem reproduces quite frequently if the script is left running.)
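
One way to spot the cycle in which the monitor stopped is to count the "Stateful monitor" lines between the "#####" separators that the loop writes to /tmp/test.log. The following is only a sketch: the function name count_monitors_per_cycle is hypothetical, and it assumes the log is read on the node where the loop script runs and where the Master instance is active during that cycle.

```shell
#!/bin/sh
# Hypothetical helper: print the number of "Stateful monitor" lines logged
# in each move cycle. A cycle starts at a line made up only of '#' characters.
# $1 is the log file to read.
count_monitors_per_cycle() {
    awk '
        /^#+$/ {
            # A separator closes the previous cycle (if any) and opens a new one.
            if (cycle > 0) printf "cycle %d: %d monitor calls\n", cycle, n
            cycle++; n = 0; next
        }
        # Count trace lines only once the first separator has been seen.
        /Stateful monitor$/ { if (cycle > 0) n++ }
        END { if (cycle > 0) printf "cycle %d: %d monitor calls\n", cycle, n }
    ' "$1"
}
```

Running count_monitors_per_cycle /tmp/test.log prints one line per move cycle. A cycle with zero calls, or far fewer than its neighbours, would point to the moment the monitor stopped.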

This problem seems to occur in both the most recent 1.0 version and version 1.0.11.

 * Pacemaker-1-0-9af47ddebcad
 * Pacemaker-1-0-6e010d6b0d49 

A monitor that stops running is a serious problem.
I would like to request a fix.

I have registered these details and an hb_report in Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5010

Best Regards,
Hideo Yamauchi.