[Pacemaker] stickiness weirdness please explain

Wed Feb 23 08:42:46 UTC 2011

Hi,

On Tue, Feb 22, 2011 at 7:21 PM, Jelle de Jong <jelledejong at powercraft.nl> wrote:

> Hello everybody,
>
> I got the following setup: http://debian.pastebin.com/Sife0hTz
>
> The problem is that when I crm node standby the godfrey node2 everything
> nicely migrates to finley node1 and continues to run. (as expected) when
> godfrey comes back online and finished synchronising the drbd disks it
> tries to take over the resources of finley and fails crashing the iscsi
> and drbd systems....
>

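The following constraints are something you should remove from the config.
As I understand it, all resources should run together on the same node and
migrate together to the other node, yet these rules pin individual resources
to different nodes with an infinite score: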

    location cli-prefer-ip_virtual01 ip_virtual01 \
            rule $id="cli-prefer-rule-ip_virtual01" inf: #uname eq finley
    location cli-prefer-iscsi02_lun1 iscsi02_lun1 \
            rule $id="cli-prefer-rule-iscsi02_lun1" inf: #uname eq godfrey
    location cli-prefer-iscsi02_target iscsi02_target \
            rule $id="cli-prefer-rule-iscsi02_target" inf: #uname eq finley
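
As a rough sketch, removing them from within crm configure could look like
the lines below. The IDs are the ones from your config. Constraints named
cli-prefer-* are normally left behind by crm resource move/migrate, so
assuming that is how they got there, running crm resource unmigrate for each
affected resource should clear them as well.

```
# entered inside "crm configure"
delete cli-prefer-ip_virtual01
delete cli-prefer-iscsi02_lun1
delete cli-prefer-iscsi02_target
commit
```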

Try removing the property default-resource-stickiness="200" and adding this
section instead:

    rsc_defaults $id="rsc-options" \
            resource-stickiness="200"

You could also try increasing the value from 200 to 1000.

I see that both groups, rg_iscsi01 and rg_iscsi02, start on the same node, and
the general order would be: promote DRBD, start the virtual IP, start the
targets and then the LUNs. I would suggest a single group instead of the two
groups:

    group rg_iscsi ip_virtual01 iscsi01_target iscsi01_lun1 iscsi02_target \
            iscsi02_lun1 iscsi02_lun2 iscsi02_lun3 iscsi02_lun4

Remove all lines of the form:

    colocation drbd_rx-master-with-ip inf: ms_drbd_rx:Master ip_virtual01

Change all lines of the form:

    colocation iscsi0x-with-drbd-master inf: rg_iscsi0x ms_drbd_rx:Master

to (replacing x with the appropriate values):

    colocation iscsi0x-with-drbd-master inf: rg_iscsi ms_drbd_rx:Master

Remove all lines of the form:

    order ip-after-drbd_rx inf: ms_drbd_rx:promote ip_virtual01:start

Change all lines of the form:

    order iscsi0x-after-drbd-promote inf: ms_drbd_rx:promote rg_iscsi0x:start

to (again replacing x):

    order iscsi0x-after-drbd-promote inf: ms_drbd_rx:promote rg_iscsi:start

This simplifies the resource design and keeps the CIB smaller, while
achieving the same functional goal.
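
Put together, the suggested changes would give something like the fragment
below. It is only a sketch: x is still the placeholder used above and has to
be replaced for each DRBD resource, the constraint IDs simply follow your
existing naming, and the stickiness value of 1000 is the optional increase
mentioned earlier.

```
# one group replacing rg_iscsi01 and rg_iscsi02
group rg_iscsi ip_virtual01 iscsi01_target iscsi01_lun1 iscsi02_target \
        iscsi02_lun1 iscsi02_lun2 iscsi02_lun3 iscsi02_lun4
# repeat the next two lines for each DRBD master/slave resource (replace x)
colocation iscsi0x-with-drbd-master inf: rg_iscsi ms_drbd_rx:Master
order iscsi0x-after-drbd-promote inf: ms_drbd_rx:promote rg_iscsi:start
# default stickiness for all resources, replacing the old property
rsc_defaults $id="rsc-options" \
        resource-stickiness="1000"
```

Since the IP, targets and LUNs are all members of the one group, the separate
IP colocation and order constraints are no longer needed: the group already
keeps its members together and starts them in the listed order.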

> I have to stop corosync on both nodes and start them again make both
> nodes standby and then online godfrey and then online finley to get it
> all working again.
>
> Why doesn't it stay running on the finley node when godfrey comes back
> online?
>
> I am also unable to move the iscsi luns with all his depending resources
> to finley and back and forwards by using crm resource move finely.
>
> Why can't I manually move resources around?
>
>
Output of ptest -LsVVV and some logs in a pastebin might help.

Regards,
Dan

> There is probably something I am not doing right but please help me out
> i read the Cluster_from_Scratch.pdf,
> Pacemaker-1.0-Pacemaker_Explained-en-US and ha-iscsi.pdf.
>
> Thanks in advance,
>
> With kind regards,
>
> Jelle de Jong
>

-- 
Dan Frincu
CCNA, RHCE