[Pacemaker] Pacemaker 1.1.6 order possible bug ?

Mon Sep 3 01:41:25 EDT 2012

Hello,

Sorry if this question arrives twice, but my first message did not appear
on the mailing list.

I have a problem with orders in pacemaker 1.1.6 and corosync 1.4.1.

The order below works for failover, but it does not work when one
cluster node starts up (DRBD stays in Slave state and ms_toponet is
started before DRBD gets promoted).

order o_start inf: ms_drbd_postgres:promote postgres:start \
         ms_toponet:promote monitor_cluster:start

The order below does not work for failover (it kills the slave toponet app
and starts it again), but it works correctly when the cluster starts up.

order o_start inf: ms_drbd_postgres:promote postgres:start \
         ms_toponet:start ms_toponet:promote monitor_cluster:start

I want pacemaker to behave as it did in version 1.0.12:
* when the toponet master app is killed, move the postgres resource to the
other node and promote ms_toponet and ms_drbd_postgres to Master
* when one node is starting, promote DRBD to Master if it is UpToDate

Am I doing something wrong?

It looks to me as if pacemaker ignores some orders (pacemaker should wait
for DRBD promotion before starting the toponet app, but the toponet app is
started right after DRBD starts as Slave). I tried to solve this with
different orders combined with symmetrical=false, with split orders, and
with separate orders for start and stop, but with no success at all (it
seems to me that the symmetrical=false directive is completely ignored).
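
To illustrate the kind of split ordering meant here, a sketch using the
resource names from the configuration below. The constraint IDs
(o_drbd_pg, o_pg_toponet, o_toponet_mon) are made up for this example, and
this is only an illustration of the form, not a confirmed fix:

```
# Hypothetical split of the single o_start chain into pairwise constraints.
# The IDs are illustrative; symmetrical=false asks pacemaker not to imply
# the reverse ordering for the opposite (stop/demote) actions.
order o_drbd_pg inf: ms_drbd_postgres:promote postgres:start symmetrical=false
order o_pg_toponet inf: postgres:start ms_toponet:promote symmetrical=false
order o_toponet_mon inf: ms_toponet:promote monitor_cluster:start symmetrical=false
```

With constraints written this way, each pair can be tested on its own,
which makes it easier to see which ordering is not being honoured.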

Pacemaker 1.1.7 does not work for me, because its on-fail directive is
broken.

crm_mon output:

============
Last updated: Fri Aug 31 14:51:11 2012
Last change: Fri Aug 31 14:50:27 2012 by hacluster via crmd on toponet30
Stack: openais
Current DC: toponet30 - partition WITHOUT quorum
Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
2 Nodes configured, 2 expected votes
10 Resources configured.
============

Online: [ toponet30 toponet31 ]

st_primary      (stonith:external/xen0):        Started toponet30
st_secondary    (stonith:external/xen0):        Started toponet31
  Master/Slave Set: ms_drbd_postgres
      Masters: [ toponet30 ]
      Slaves: [ toponet31 ]
  Resource Group: postgres
      pg_fs      (ocf::heartbeat:Filesystem):    Started toponet30
      PGIP       (ocf::heartbeat:IPaddr2):       Started toponet30
      postgresql (ocf::heartbeat:pgsql): Started toponet30
monitor_cluster (ocf::heartbeat:monitor_cluster):       Started toponet30
  Master/Slave Set: ms_toponet
      Masters: [ toponet30 ]
      Slaves: [ toponet31 ]

configuration:

node toponet30
node toponet31
primitive PGIP ocf:heartbeat:IPaddr2 \
         params ip="192.168.100.3" cidr_netmask="29" \
         op monitor interval="5s"
primitive drbd_postgres ocf:linbit:drbd \
         params drbd_resource="postgres" \
         op start interval="0" timeout="240s" \
         op stop interval="0" timeout="120s" \
         op monitor interval="5s" role="Master" timeout="10s" \
         op monitor interval="10s" role="Slave" timeout="20s"
primitive monitor_cluster ocf:heartbeat:monitor_cluster \
         op monitor interval="30s" \
         op start interval="0" timeout="30s" \
         meta target-role="Started"
primitive pg_fs ocf:heartbeat:Filesystem \
         params device="/dev/drbd0" directory="/var/lib/pgsql" fstype="ext3"
primitive postgresql ocf:heartbeat:pgsql \
         op start interval="0" timeout="80s" \
         op stop interval="0" timeout="60s" \
         op monitor interval="10s" timeout="10s" depth="0"
primitive st_primary stonith:external/xen0 \
         op start interval="0" timeout="60s" \
         params hostlist="toponet31:/etc/xen/vm/toponet31" \
         dom0="172.16.103.54"
primitive st_secondary stonith:external/xen0 \
         op start interval="0" timeout="60s" \
         params hostlist="toponet30:/etc/xen/vm/toponet30" \
         dom0="172.16.103.54"
primitive toponet ocf:heartbeat:toponet \
         op start interval="0" timeout="180s" \
         op stop interval="0" timeout="60s" \
         op monitor interval="10s" role="Master" timeout="20s" \
         on-fail="standby" \
         op monitor interval="20s" role="Slave" timeout="40s" \
         op promote interval="0" timeout="120s" \
         op demote interval="0" timeout="120s"
group postgres pg_fs PGIP postgresql
ms ms_drbd_postgres drbd_postgres \
         meta master-max="1" master-node-max="1" clone-max="2" \
         clone-node-max="1" notify="true" target-role="Master"
ms ms_toponet toponet \
         meta master-max="1" master-node-max="1" clone-max="2" \
         clone-node-max="1" target-role="Master"
location loc_st_pri st_primary -inf: toponet31
location loc_st_sec st_secondary -inf: toponet30
location master-prefer-node1 postgres 100: toponet30
colocation pg_on_drbd inf: monitor_cluster ms_toponet:Master postgres \
         ms_drbd_postgres:Master
order o_start inf: ms_drbd_postgres:start ms_drbd_postgres:promote \
         postgres:start ms_toponet:start ms_toponet:promote monitor_cluster:start
property $id="cib-bootstrap-options" \
         dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         no-quorum-policy="ignore" \
         stonith-enabled="true"
rsc_defaults $id="rsc-options" \
         resource-stickiness="5000"