[Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)

Tue Sep 10 12:51:32 UTC 2013

Not sure whether I'm doing this the right way but here goes..

With resources started on node #1:

# crm_simulate -L -s -d pgdbsrv01.cl1.local

Current cluster status:
Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv01.cl1.local ]
     Slaves: [ pgdbsrv02.cl1.local ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv01.cl1.local ]
     Slaves: [ pgdbsrv02.cl1.local ]
 Resource Group: GRP_data01
     LVM_vgdata01	(ocf::heartbeat:LVM):	Started pgdbsrv01.cl1.local
     FS_data01	(ocf::heartbeat:Filesystem):	Started pgdbsrv01.cl1.local
 Resource Group: GRP_data02
     LVM_vgdata02	(ocf::heartbeat:LVM):	Started pgdbsrv01.cl1.local
     FS_data02	(ocf::heartbeat:Filesystem):	Started pgdbsrv01.cl1.local
 fusion-fencing	(stonith:fence_fusion):	Started pgdbsrv02.cl1.local

Performing requested modifications
 + Taking node pgdbsrv01.cl1.local offline
Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:1 promotion score on none: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:1 promotion score on none: 0
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:
 * Promote DRBD_data01:0	(Slave -> Master pgdbsrv02.cl1.local)
 * Demote  DRBD_data01:1	(Master -> Stopped pgdbsrv01.cl1.local)
 * Promote DRBD_data02:0	(Slave -> Master pgdbsrv02.cl1.local)
 * Demote  DRBD_data02:1	(Master -> Stopped pgdbsrv01.cl1.local)
 * Move    LVM_vgdata01	(Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    FS_data01	(Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    LVM_vgdata02	(Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Move    FS_data02	(Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
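
Side note for anyone trying to read score dumps like the one above: a small script can group them per resource. This is only a rough sketch and not part of Pacemaker. The names SCORE_RE, parse_scores and unplaceable are made up for illustration, and the line format is assumed from the output pasted above, so it may need adjusting for other versions.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the crm_simulate -s output above:
#   native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
SCORE_RE = re.compile(
    r"^(?P<stage>\w+): (?P<rsc>\S+) allocation score on (?P<node>\S+): (?P<score>\S+)$"
)


def parse_scores(text, stage="native_color"):
    """Return {resource: {node: score}} for the given scoring stage."""
    scores = defaultdict(dict)
    for line in text.splitlines():
        match = SCORE_RE.match(line.strip())
        if match and match.group("stage") == stage:
            scores[match.group("rsc")][match.group("node")] = match.group("score")
    return dict(scores)


def unplaceable(scores):
    """Resources scored -INFINITY on every node they are listed for."""
    return sorted(
        rsc for rsc, by_node in scores.items()
        if all(score == "-INFINITY" for score in by_node.values())
    )


# A few lines copied from the dump above (hypothetical variable name).
SAMPLE = """\
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:0 promotion score on pgdbsrv02.cl1.local: 10000
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
"""

if __name__ == "__main__":
    print(unplaceable(parse_scores(SAMPLE)))
```

With the sample lines it reports DRBD_data01:1 as having no node left, which matches the "Master -> Stopped" demote in the transition summary.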

..taking node #1 offline (standby) for real, resources running on node #2, then:

# crm_simulate -L -s -u pgdbsrv01.cl1.local

Current cluster status:
Node pgdbsrv01.cl1.local: standby
Online: [ pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv02.cl1.local ]
     Stopped: [ DRBD_data01:1 ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv02.cl1.local ]
     Stopped: [ DRBD_data02:1 ]
 Resource Group: GRP_data01
     LVM_vgdata01	(ocf::heartbeat:LVM):	Started pgdbsrv02.cl1.local
     FS_data01	(ocf::heartbeat:Filesystem):	Started pgdbsrv02.cl1.local
 Resource Group: GRP_data02
     LVM_vgdata02	(ocf::heartbeat:LVM):	Started pgdbsrv02.cl1.local
     FS_data02	(ocf::heartbeat:Filesystem):	Started pgdbsrv02.cl1.local
 fusion-fencing	(stonith:fence_fusion):	Started pgdbsrv02.cl1.local

Performing requested modifications
 + Bringing node pgdbsrv01.cl1.local online
Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 0
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:1 promotion score on none: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 0
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:0 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:1 promotion score on none: 0
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:

And that's it. Once I unstandby node #1 and rerun the same simulate:

# crm_simulate -L -s -u pgdbsrv01.cl1.local

Current cluster status:
Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]

 Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
     Masters: [ pgdbsrv02.cl1.local ]
     Slaves: [ pgdbsrv01.cl1.local ]
 Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
     Masters: [ pgdbsrv02.cl1.local ]
     Slaves: [ pgdbsrv01.cl1.local ]
 Resource Group: GRP_data01
     LVM_vgdata01	(ocf::heartbeat:LVM):	Stopped
     FS_data01	(ocf::heartbeat:Filesystem):	Stopped
 Resource Group: GRP_data02
     LVM_vgdata02	(ocf::heartbeat:LVM):	Stopped
     FS_data02	(ocf::heartbeat:Filesystem):	Stopped
 fusion-fencing	(stonith:fence_fusion):	Started pgdbsrv01.cl1.local

Performing requested modifications
 + Bringing node pgdbsrv01.cl1.local online
Allocation scores:
clone_color: DRBD_ms_data01 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data01 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data01:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data01:0 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data01:1 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data01:0 promotion score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_ms_data02 allocation score on pgdbsrv01.cl1.local: 0
clone_color: DRBD_ms_data02 allocation score on pgdbsrv02.cl1.local: 0
clone_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 10000
clone_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data02:1 allocation score on pgdbsrv02.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv01.cl1.local: 10000
native_color: DRBD_data02:0 allocation score on pgdbsrv02.cl1.local: -INFINITY
DRBD_data02:1 promotion score on pgdbsrv02.cl1.local: 10000
DRBD_data02:0 promotion score on pgdbsrv01.cl1.local: 10000
group_color: GRP_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata01 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data01 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data01 allocation score on pgdbsrv02.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: GRP_data02 allocation score on pgdbsrv02.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: 0
group_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv01.cl1.local: 0
group_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: LVM_vgdata02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: LVM_vgdata02 allocation score on pgdbsrv02.cl1.local: 10000
native_color: FS_data02 allocation score on pgdbsrv01.cl1.local: -INFINITY
native_color: FS_data02 allocation score on pgdbsrv02.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv01.cl1.local: 0
native_color: fusion-fencing allocation score on pgdbsrv02.cl1.local: 0

Transition Summary:
 * Start   LVM_vgdata01	(pgdbsrv02.cl1.local - blocked)
 * Start   FS_data01	(pgdbsrv02.cl1.local - blocked)
 * Start   LVM_vgdata02	(pgdbsrv02.cl1.local - blocked)
 * Start   FS_data02	(pgdbsrv02.cl1.local - blocked)
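
To pull just the blocked actions out of a summary like this one, something along these lines might do. Again only a sketch: ACTION_RE and blocked_actions are hypothetical names, the line format is assumed from the summaries pasted in this mail, and it only lists the blocked actions, it does not explain why they are blocked.

```python
import re

# Assumed line format, taken from the "Transition Summary" sections above:
#   " * Start   LVM_vgdata01<TAB>(pgdbsrv02.cl1.local - blocked)"
ACTION_RE = re.compile(
    r"^\s*\*\s+(?P<action>\w+)\s+(?P<rsc>\S+)\s+\((?P<detail>.*)\)\s*$"
)

BLOCKED_SUFFIX = " - blocked"


def blocked_actions(summary):
    """Return (action, resource, node) for every action marked as blocked."""
    found = []
    for line in summary.splitlines():
        match = ACTION_RE.match(line)
        if match and match.group("detail").endswith(BLOCKED_SUFFIX):
            # Everything before the suffix is the target node name.
            node = match.group("detail")[: -len(BLOCKED_SUFFIX)]
            found.append((match.group("action"), match.group("rsc"), node))
    return found


# Lines copied from the summaries in this mail (hypothetical variable name).
SAMPLE = """\
 * Move    FS_data02\t(Started pgdbsrv01.cl1.local -> pgdbsrv02.cl1.local)
 * Start   LVM_vgdata01\t(pgdbsrv02.cl1.local - blocked)
 * Start   FS_data01\t(pgdbsrv02.cl1.local - blocked)
"""

if __name__ == "__main__":
    for action, rsc, node in blocked_actions(SAMPLE):
        print(action, rsc, node)
```

An empty summary, like the one from the second run, simply gives an empty list.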

-- 
Heikki M

On 9.9.2013, at 12.02, Andreas Mock <andreas.mock at web.de> wrote:

> Hi Heikki,
> 
> it has to be crm_simulate -L -s. Sorry for the wrong command line
> parameters.
> 
> Best regards
> Andreas
> 
> 
> -----Original Message-----
> From: Heikki Manninen [mailto:hma at iki.fi] 
> Sent: Monday, 9 September 2013 10:46
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)
> 
> Hello Andreas, thanks for your input, much appreciated.
> 
> On 5.9.2013, at 16.39, "Andreas Mock" <andreas.mock at web.de> wrote:
> 
>> 1) The second output of crm_mon shows a resource IP_database
>> which is not shown in the initial crm_mon output and also
>> not in the config. => Reduce your problem/config to the
>> minimum that still reproduces it.
> 
> True. I edited out the resource from the e-mail that did not have anything
> to do with the problem as such (works ok all the time). Just forgot to
> remove it from the second copy-paste also. And yes, no more IP resource in
> the configuration.
> 
>> 2) Enable logging and find out which node is the DC.
>> Its logs contain a lot of information showing
>> what is going on. Hint: Open a terminal session running
>> tail -f on the logfile and watch it while entering commands.
>> You'll get used to it.
> 
> Seems that node #2 was the DC (also visible in the pcs status output). I
> have been watching the logs all along, but I am not yet too familiar with the
> contents of pacemaker logging. Here's the thing that keeps repeating
> every time those LVM and FS resources stay in the stopped state:
> 
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start LVM_vgdata01#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start FS_data01#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start LVM_vgdata02#011(pgdbsrv01.cl1.local - blocked)
> Sep  3 20:01:23 pgdbsrv02 pengine[1667]:   notice: LogActions: Start FS_data02#011(pgdbsrv01.cl1.local - blocked)
> 
> So what does blocked mean here? Is it that node #1 in this case is in
> need of fencing/stonithing and is therefore blocked, or something else? (I have
> a background in the RHCS/HACMP/LifeKeeper etc. world.) no-quorum-policy is
> set to ignore.
> 
>> 3) The status of a drbd resource as shown by crm_mon doesn't tell
>> you everything about the drbd devices. Have a look at
>> drbd-overview on both nodes (e.g. for the syncing status).
> 
> True, DRBD is working fine on these occasions. Connected, Synced etc.
> 
>> 4) This setup CRIES for stonithing. Even in a test environment.
>> When stonith happens (this is what you see immediately) you
>> know something went wrong. This is a good indicator for
>> errors in agents or in the config. Believe me, as tedious as stonithing
>> is, it is valuable for getting hints about a bad cluster state.
>> On virtual machines stonithing is not as painful as on real
>> servers.
> 
> Very much true. I have implemented some custom fencing/stonithing agents
> before on physical and virtual cluster environments. The problem here is
> that I'm not aware of a reasonably simple way to implement stonith with
> VMware Fusion, which I'm bound to use for this test setup. Have to dig more
> into this though. So fencing from cman cluster.conf is chained to pacemaker
> fencing, pacemaker stonithing is disabled, and no-quorum-policy is ignore.
> 
>> 5) Is the drbd fencing script enabled? If yes, in certain circumstances
>> -INF rules are inserted to deny promoting of "wrong" nodes.
>> You should grep for them 'cibadmin -Q | grep <resname>'
> 
> No, DRBD fencing is not enabled and split-brain recovery is done manually.
> 
>> 6) crm_simulate -L -v gives you an output of the scores of
>> the resources on each node. I really don't know how to read it
>> exactly (Is there a documentation of that anywhere?), but it
>> gives you a hint where to look at, when resources don't start.
>> Especially the aggregation of stickiness values in groups is
>> sometimes misleading.
> 
> Maybe I have a different version, because -v is an unknown
> option, and:
> 
> # crm_simulate -L -V
> 
> Current cluster status:
> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
> 
> Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>    Masters: [ pgdbsrv01.cl1.local ]
>    Slaves: [ pgdbsrv02.cl1.local ]
> Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>    Masters: [ pgdbsrv01.cl1.local ]
>    Slaves: [ pgdbsrv02.cl1.local ]
> Resource Group: GRP_data01
>    LVM_vgdata01	(ocf::heartbeat:LVM):	Stopped
>    FS_data01	(ocf::heartbeat:Filesystem):	Stopped
> Resource Group: GRP_data02
>    LVM_vgdata02	(ocf::heartbeat:LVM):	Stopped
>    FS_data02	(ocf::heartbeat:Filesystem):	Stopped
> 
> 
> Only shows that much.
> 
> Original problem description left quoted below.
> 
> 
> -- 
> Heikki M
> 
> 
>> -----Original Message-----
>> From: Heikki Manninen [mailto:hma at iki.fi] 
>> Sent: Thursday, 5 September 2013 14:08
>> To: pacemaker at oss.clusterlabs.org
>> Subject: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)
>> 
>> Hello,
>> 
>> I'm having a bit of a problem understanding what's going on with my simple
>> two-node demo cluster here. My resources come up correctly after restarting
>> the whole cluster but the LVM and Filesystem resources fail to start after a
>> single node restart or standby/unstandby (after node comes back online - why
>> do they even stop/start after the second node comes back?).
>> 
>> OS: CentOS 6.4 (cman stack)
>> Pacemaker: pacemaker-1.1.8-7.el6.x86_64
>> DRBD: drbd84-utils-8.4.3-1.el6.elrepo.x86_64
>> 
>> Everything is configured using: pcs-0.9.26-10.el6_4.1.noarch
>> 
>> Two DRBD resources configured and working: data01 & data02
>> Two nodes: pgdbsrv01.cl1.local & pgdbsrv02.cl1.local
>> 
>> Configuration:
>> 
>> node pgdbsrv01.cl1.local
>> node pgdbsrv02.cl1.local
>> primitive DRBD_data01 ocf:linbit:drbd \
>>   params drbd_resource="data01" \
>>   op monitor interval="30s"
>> primitive DRBD_data02 ocf:linbit:drbd \
>>   params drbd_resource="data02" \
>>   op monitor interval="30s"
>> primitive FS_data01 ocf:heartbeat:Filesystem \
>>   params device="/dev/mapper/vgdata01-lvdata01" directory="/data01" fstype="ext4" \
>>   op monitor interval="30s"
>> primitive FS_data02 ocf:heartbeat:Filesystem \
>>   params device="/dev/mapper/vgdata02-lvdata02" directory="/data02" fstype="ext4" \
>>   op monitor interval="30s"
>> primitive LVM_vgdata01 ocf:heartbeat:LVM \
>>   params volgrpname="vgdata01" exclusive="true" \
>>   op monitor interval="30s"
>> primitive LVM_vgdata02 ocf:heartbeat:LVM \
>>   params volgrpname="vgdata02" exclusive="true" \
>>   op monitor interval="30s"
>> group GRP_data01 LVM_vgdata01 FS_data01
>> group GRP_data02 LVM_vgdata02 FS_data02
>> ms DRBD_ms_data01 DRBD_data01 \
>>   meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> ms DRBD_ms_data02 DRBD_data02 \
>>   meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> colocation colocation-GRP_data01-DRBD_ms_data01-INFINITY inf: GRP_data01 DRBD_ms_data01:Master
>> colocation colocation-GRP_data02-DRBD_ms_data02-INFINITY inf: GRP_data02 DRBD_ms_data02:Master
>> order order-DRBD_data01-GRP_data01-mandatory : DRBD_data01:promote GRP_data01:start
>> order order-DRBD_data02-GRP_data02-mandatory : DRBD_data02:promote GRP_data02:start
>> property $id="cib-bootstrap-options" \
>>   dc-version="1.1.8-7.el6-394e906" \
>>   cluster-infrastructure="cman" \
>>   stonith-enabled="false" \
>>   no-quorum-policy="ignore" \
>>   migration-threshold="1"
>> rsc_defaults $id="rsc_defaults-options" \
>>   resource-stickiness="100"
>> 
>> 
>> 1) After starting the cluster, everything runs happily:
>> 
>> Last updated: Tue Sep  3 00:11:13 2013
>> Last change: Tue Sep  3 00:05:15 2013 via cibadmin on pgdbsrv01.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>> 
>> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
>> 
>> Full list of resources:
>> 
>> Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>   Masters: [ pgdbsrv01.cl1.local ]
>>   Slaves: [ pgdbsrv02.cl1.local ]
>> Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>   Masters: [ pgdbsrv01.cl1.local ]
>>   Slaves: [ pgdbsrv02.cl1.local ]
>> Resource Group: GRP_data01
>>   LVM_vgdata01 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
>>   FS_data01 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
>> Resource Group: GRP_data02
>>   LVM_vgdata02 (ocf::heartbeat:LVM): Started pgdbsrv01.cl1.local
>>   FS_data02 (ocf::heartbeat:Filesystem): Started pgdbsrv01.cl1.local
>> 
>> 2) Putting node #1 to standby mode - after which everything runs happily on
>> node pgdbsrv02.cl1.local
>> 
>> # pcs cluster standby pgdbsrv01.cl1.local
>> # pcs status
>> Last updated: Tue Sep  3 00:16:01 2013
>> Last change: Tue Sep  3 00:15:55 2013 via crm_attribute on pgdbsrv02.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>> 
>> 
>> Node pgdbsrv01.cl1.local: standby
>> Online: [ pgdbsrv02.cl1.local ]
>> 
>> Full list of resources:
>> 
>> IP_database     (ocf::heartbeat:IPaddr2):     Started pgdbsrv02.cl1.local
>> Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>   Masters: [ pgdbsrv02.cl1.local ]
>>   Stopped: [ DRBD_data01:1 ]
>> Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>   Masters: [ pgdbsrv02.cl1.local ]
>>   Stopped: [ DRBD_data02:1 ]
>> Resource Group: GRP_data01
>>   LVM_vgdata01     (ocf::heartbeat:LVM):     Started pgdbsrv02.cl1.local
>>   FS_data01     (ocf::heartbeat:Filesystem):     Started pgdbsrv02.cl1.local
>> Resource Group: GRP_data02
>>   LVM_vgdata02     (ocf::heartbeat:LVM):     Started pgdbsrv02.cl1.local
>>   FS_data02     (ocf::heartbeat:Filesystem):     Started pgdbsrv02.cl1.local
>> 
>> 3) Putting node #1 back online - it seems that all the resources stop (?)
>> and then DRBD gets promoted successfully on node #2 but LVM and FS resources
>> never start
>> 
>> # pcs cluster unstandby pgdbsrv01.cl1.local
>> # pcs status
>> Last updated: Tue Sep  3 00:17:00 2013
>> Last change: Tue Sep  3 00:16:56 2013 via crm_attribute on pgdbsrv02.cl1.local
>> Stack: cman
>> Current DC: pgdbsrv02.cl1.local - partition with quorum
>> Version: 1.1.8-7.el6-394e906
>> 2 Nodes configured, unknown expected votes
>> 9 Resources configured.
>> 
>> 
>> Online: [ pgdbsrv01.cl1.local pgdbsrv02.cl1.local ]
>> 
>> Full list of resources:
>> 
>> IP_database     (ocf::heartbeat:IPaddr2):     Started pgdbsrv02.cl1.local
>> Master/Slave Set: DRBD_ms_data01 [DRBD_data01]
>>   Masters: [ pgdbsrv02.cl1.local ]
>>   Slaves: [ pgdbsrv01.cl1.local ]
>> Master/Slave Set: DRBD_ms_data02 [DRBD_data02]
>>   Masters: [ pgdbsrv02.cl1.local ]
>>   Slaves: [ pgdbsrv01.cl1.local ]
>> Resource Group: GRP_data01
>>   LVM_vgdata01     (ocf::heartbeat:LVM):     Stopped
>>   FS_data01     (ocf::heartbeat:Filesystem):     Stopped
>> Resource Group: GRP_data02
>>   LVM_vgdata02     (ocf::heartbeat:LVM):     Stopped
>>   FS_data02     (ocf::heartbeat:Filesystem):     Stopped
>> 
>> 
>> 
>> Any ideas why this is happening/what could be wrong in the resource
>> configuration? The same thing happens when the resources start out on
>> the other node. Also, if I stop & start one of the nodes, the same thing
>> happens once the node gets back online.
>> 
>> 
>> -- 
>> Heikki Manninen <hma at iki.fi>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org