[Pacemaker] [PACEMAKER] Why can't I migrate a group resource with colocation on a drbd resource

and k not4mad at gmail.com
Wed Jan 23 10:23:17 EST 2013


2013/1/23 Kashif Jawed Siddiqui <kashifjs at huawei.com>

>  There is a Pacemaker bug which cannot be fixed due to legacy tracking
> and backward compatibility
>
>
> colocation FS_WITH_DRBD inf: IP-AND-FS ms_drbd:Master
> order DRBD_BEF_FS inf: IP-AND-FS:start ms_drbd:promote
>
> If colocation and order are specified between 2 resources, it means that
> the 2nd one comes first and the 1st one comes next
>
> For example,
> colocation FS_WITH_DRBD inf: IP-AND-FS ms_drbd:Master
> //Actually means: first start IP-AND-FS, then make ms_drbd the Master
> // But in behavior, it first makes ms_drbd the Master and then starts
> IP-AND-FS
>

For colocation that is true. But.....



> Also for order, it is the same behavior...
>  order DRBD_BEF_FS inf: IP-AND-FS:start ms_drbd:promote
>  //Actually means: first start IP-AND-FS, then promote ms_drbd
> // But in behavior, it first promotes ms_drbd and then starts IP-AND-FS
>

For order it is working as I expected: your example would first run
IP-AND-FS:start and then ms_drbd:promote, which is wrong in my case, because
the filesystem is on top of drbd.
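
For reference, these are the two constraints I actually have in my configuration
(the full config is quoted at the bottom of this thread): the colocation keeps the
IP-AND-FS group on the node where ms_drbd is Master, and the order promotes DRBD
before the group is started:

colocation FS_WITH_DRBD inf: IP-AND-FS ms_drbd:Master
order DRBD_BEF_FS inf: ms_drbd:promote IP-AND-FS:start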




> But when you define order and colocation between 3 or more resources, then
> the interpretation is fine...
>
>
> Therefore in your scenario, you must configure (as I said)
>  colocation FS_WITH_DRBD inf: IP-AND-FS ms_drbd:Master
> order DRBD_BEF_FS inf: IP-AND-FS:start ms_drbd:promote
>
> Though it means the opposite, the behavior is as you expect (owing to
> Pacemaker backward compatibility).
>
>
>
When I changed it, I could not shut down the cluster properly. I got a lot
of errors in the syslog:

[776658.542263] block drbd1:   state = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown r--- }
[776658.542352] block drbd1:  wanted = { cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown r--- }
[776659.561595] block drbd1: State change failed: Device is held open by someone
[776659.561653] block drbd1:   state = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown r--- }
[776659.561742] block drbd1:  wanted = { cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown r--- }

because Pacemaker wanted to demote drbd before unmounting the filesystem.
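
As far as I understand, an order constraint in Pacemaker is symmetrical by default,
so the stop/demote sequence is the reverse of the start/promote sequence. With my
original order constraint the filesystem is therefore unmounted before DRBD is
demoted, which is what should happen on shutdown. A minimal sketch, assuming the
crm shell accepts symmetrical= at the end of the order line (true should be the
default anyway, so this only spells it out):

order DRBD_BEF_FS inf: ms_drbd:promote IP-AND-FS:start symmetrical=true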


To sum up, the colocation and order rules in my config are correct.


I wonder why I can't migrate the group resource without the force parameter. I
use Pacemaker version 1.0.9; maybe it is some kind of bug, so I'll try a newer
version.
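
For completeness, these are the commands I have been testing with (the crm shell
migrate plus the equivalent crm_resource calls); only the forced variant actually
moves the group on my 1.0.9 setup. The unmigrate at the end is my assumption of
the cleanest way to remove the cli-prefer location constraint again:

# migrate via the crm shell (this is what adds the cli-prefer-IP-AND-FS constraint)
crm resource migrate IP-AND-FS drbd02

# the same via crm_resource, without and with force
crm_resource -M -r IP-AND-FS -H drbd02
crm_resource -M -f -r fs_r1 -H drbd02

# remove the cli-prefer location constraint afterwards
crm resource unmigrate IP-AND-FS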





>
>  Regards,
> Kashif Jawed Siddiqui
>
>
>
>   ------------------------------
> *From:* and k [not4mad at gmail.com]
> *Sent:* Wednesday, January 23, 2013 7:31 PM
> *To:* The Pacemaker cluster resource manager
> *Subject:* Re: [Pacemaker] [PACEMAKER] Why can't I migrate a group resource
> with colocation on a drbd resource
>
>
>
>
>
>
> 2013/1/23 emmanuel segura <emi2fast at gmail.com>
>
>> Ummm
>>
>> First the IP-AND-FS? But what happens if the FS is on drbd?
>>
>
>  Emmanuel, you are right, the filesystem is on top of the drbd device, so I
> can't run that group before drbd is promoted.
>
>  I also noticed that I can migrate it by force using:
>
> crm_resource -M -f -r fs_r1 -H drbd02
>
>  The master is promoted correctly, and the group is successfully migrated.
>
>
>  But I wonder why it doesn't work without the force (-f) parameter. Any
> ideas?
>
>
>
>
>> Thanks
>>
>>  2013/1/23 Kashif Jawed Siddiqui <kashifjs at huawei.com>
>>
>>>   You must change the order
>>>
>>>
>>> #order DRBD_BEF_FS inf: ms_drbd:promote IP-AND-FS:start
>>>
>>>  order DRBD_BEF_FS inf: IP-AND-FS:start ms_drbd:promote
>>>
>>> //First start IP-AND-FS, only then promote ms_drbd
>>>
>>>
>>>  Regards,
>>> Kashif Jawed Siddiqui
>>>
>>>
>>>
>>>   ------------------------------
>>> *From:* and k [not4mad at gmail.com]
>>> *Sent:* Wednesday, January 23, 2013 4:34 PM
>>> *To:* pacemaker at oss.clusterlabs.org
>>> *Subject:* [Pacemaker] [PACEMAKER] Why can't I migrate a group resource
>>> with colocation on a drbd resource
>>>
>>>   Hello Everybody,
>>>
>>>  I've got a problem (but I am not quite sure whether it isn't in fact a
>>> feature of Pacemaker), which is why I decided to write to this mailing list.
>>>
>>>  It concerns migrating a resource that is colocated with a drbd resource.
>>>
>>>  I've got a group containing a virtual IP and a filesystem, which is colocated
>>> with the ms drbd resource in a master/slave configuration.
>>>
>>>  ============
>>> Last updated: Wed Jan 23 03:40:55 2013
>>> Stack: openais
>>> Current DC: drbd01 - partition with quorum
>>> Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
>>> ============
>>>
>>>  Online: [ drbd01 drbd02 ]
>>>
>>>   Master/Slave Set: ms_drbd
>>>      Masters: [ drbd01 ]
>>>      Slaves: [ drbd02 ]
>>>  Resource Group: IP-AND-FS
>>>      fs_r1      (ocf::heartbeat:Filesystem):    Started drbd01
>>>      VIRTUAL-IP (ocf::heartbeat:IPaddr):        Started drbd01
>>>
>>>  I would like to migrate that group manually to the other node, which is the
>>> slave. So I type in: crm resource migrate IP-AND-FS drbd02
>>>
>>>  After that, the configuration includes an additional line:
>>>
>>>  location cli-prefer-IP-AND-FS IP-AND-FS \
>>>         rule $id="cli-prefer-rule-IP-AND-FS" inf: #uname eq drbd02
>>>
>>>  and in the logs I see:
>>>
>>> Jan 23 11:30:12 drbd02 cibadmin: [1126]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:12 drbd02 cibadmin: [1129]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:12 drbd02 cibadmin: [1131]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:12 drbd02 cibadmin: [1133]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:12 drbd02 cibadmin: [1135]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:12 drbd02 cibadmin: [1137]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:14 drbd02 cibadmin: [1166]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:14 drbd02 cibadmin: [1168]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:14 drbd02 cibadmin: [1170]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:14 drbd02 cibadmin: [1172]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:16 drbd02 cibadmin: [1174]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:16 drbd02 cibadmin: [1176]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:16 drbd02 cibadmin: [1178]: info: Invoked: cibadmin -Ql -o
>>> resources
>>> Jan 23 11:30:16 drbd02 cibadmin: [1180]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:40 drbd02 cibadmin: [1211]: info: Invoked: cibadmin -Ql -o
>>> nodes
>>> Jan 23 11:30:40 drbd02 crm_resource: [1213]: info: Invoked: crm_resource
>>> -M -r IP-AND-FS --node=drbd02
>>> Jan 23 11:30:40 drbd02 cib: [1214]: info: write_cib_contents: Archived
>>> previous version as /var/lib/heartbeat/crm/cib-73.raw
>>> Jan 23 11:30:40 drbd02 cib: [1214]: info: write_cib_contents: Wrote
>>> version 0.225.0 of the CIB to disk (digest:
>>> 166251193cbe1e0b9314ab07358accca)
>>> Jan 23 11:30:40 drbd02 cib: [1214]: info: retrieveCib: Reading cluster
>>> configuration from: /var/lib/heartbeat/crm/cib.tk72Ft (digest:
>>> /var/lib/heartbeat/crm/cib.hF2UsS)
>>> Jan 23 11:30:44 drbd02 cib: [30098]: info: cib_stats: Processed 153
>>> operations (1437.00us average, 0% utilization) in the last 10min
>>>
>>>  but nothing happened: the resource group is still active on the drbd01 node,
>>> and there was no new master promotion.
>>>
>>>  Shouldn't Pacemaker automatically promote the second node to master and
>>> move my resource group?
>>>
>>>
>>>  Below is my test configuration; I would appreciate any help:
>>>
>>>  crm(live)# configure show
>>> node drbd01 \
>>>         attributes standby="off"
>>> node drbd02 \
>>>         attributes standby="off"
>>> primitive VIRTUAL-IP ocf:heartbeat:IPaddr \
>>>         params ip="10.11.11.111"
>>> primitive drbd ocf:linbit:drbd \
>>>         params drbd_resource="r1" \
>>>         op start interval="0" timeout="240" \
>>>         op stop interval="0" timeout="100" \
>>>         op monitor interval="59s" role="Master" timeout="30s" \
>>>         op monitor interval="60s" role="Slave" timeout="30s"
>>> primitive fs_r1 ocf:heartbeat:Filesystem \
>>>         params device="/dev/drbd1" directory="/mnt" fstype="ext3" \
>>>         op start interval="0" timeout="60" \
>>>         op stop interval="0" timeout="120" \
>>>         meta allow-migrate="true"
>>> group IP-AND-FS fs_r1 VIRTUAL-IP \
>>>         meta target-role="Started"
>>> ms ms_drbd drbd \
>>>         meta master-node-max="1" clone-max="2" clone-node-max="1"
>>> globally-unique="false" notify="true" target-role="Master"
>>> location cli-prefer-IP-AND-FS IP-AND-FS \
>>>         rule $id="cli-prefer-rule-IP-AND-FS" inf: #uname eq drbd02
>>> colocation FS_WITH_DRBD inf: IP-AND-FS ms_drbd:Master
>>> order DRBD_BEF_FS inf: ms_drbd:promote IP-AND-FS:start
>>> property $id="cib-bootstrap-options" \
>>>         dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>>>         cluster-infrastructure="openais" \
>>>         expected-quorum-votes="2" \
>>>         stonith-enabled="false" \
>>>         no-quorum-policy="ignore" \
>>>         last-lrm-refresh="1358868655" \
>>>         default-resource-stickiness="1"
>>>
>>>  Regards
>>> Andrew
>>>
>>>
>>>
>>>
>>
>>
>> --
>> this is my life and I live it as long as God wills
>>
>>
>
>
>