[Pacemaker] Issue with ordering
Vladislav Bogdanov
bubble at hoster-ok.com
Thu Mar 29 08:07:40 UTC 2012
Hi Andrew, all,
I'm continuing my experiments with Lustre on stacked DRBD, and I see
the following problem:
I have one drbd resource (ms-drbd-testfs-mdt0000) stacked on top of
another (ms-drbd-testfs-mdt0000-left), with the following constraints
between them:
colocation drbd-testfs-mdt0000-with-drbd-testfs-mdt0000-left inf: \
    ms-drbd-testfs-mdt0000 ms-drbd-testfs-mdt0000-left:Master
order drbd-testfs-mdt0000-after-drbd-testfs-mdt0000-left inf: \
    ms-drbd-testfs-mdt0000-left:promote ms-drbd-testfs-mdt0000:start
Then I have a filesystem mounted on top of ms-drbd-testfs-mdt0000
(the testfs-mdt0000 resource):
colocation testfs-mdt0000-with-drbd-testfs-mdt0000 inf: \
    testfs-mdt0000 ms-drbd-testfs-mdt0000:Master
order testfs-mdt0000-after-drbd-testfs-mdt0000 inf: \
    ms-drbd-testfs-mdt0000:promote testfs-mdt0000:start
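
To spell out the teardown order these two order constraints should imply, here is a minimal Python sketch. It is an illustration only, not Pacemaker code. It assumes the constraints are symmetrical (the default), so each "first X, then Y" also implies the inverse actions in the reverse direction. INVERSE, ORDERS and invert() are names made up for this example.

```python
# Illustration only: models symmetrical ordering, not Pacemaker's implementation.
INVERSE = {"start": "stop", "stop": "start", "promote": "demote", "demote": "promote"}

# (first_resource, first_action, then_resource, then_action),
# copied from the two order constraints in this mail.
ORDERS = [
    ("ms-drbd-testfs-mdt0000-left", "promote", "ms-drbd-testfs-mdt0000", "start"),
    ("ms-drbd-testfs-mdt0000", "promote", "testfs-mdt0000", "start"),
]

def invert(order):
    """Return the implied teardown ordering: inverse actions, reversed direction."""
    first_rsc, first_act, then_rsc, then_act = order
    return (then_rsc, INVERSE[then_act], first_rsc, INVERSE[first_act])

for first_rsc, first_act, then_rsc, then_act in map(invert, ORDERS):
    print(f"{first_rsc}:{first_act} before {then_rsc}:{then_act}")
```

Under that assumption, the filesystem has to stop before the stacked drbd is demoted, and the stacked drbd has to stop before the lower one is demoted.
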
When I trigger an event that causes many resources to stop (including
these three), the LogActions output looks like this:
LogActions: Stop drbd-local#011(lustre01-left)
LogActions: Stop drbd-stacked#011(Started lustre02-left)
LogActions: Stop drbd-testfs-local#011(Started lustre03-left)
LogActions: Stop drbd-testfs-stacked#011(Started lustre04-left)
LogActions: Stop lustre#011(Started lustre04-left)
LogActions: Stop mgs#011(Started lustre01-left)
LogActions: Stop testfs#011(Started lustre03-left)
LogActions: Stop testfs-mdt0000#011(Started lustre01-left)
LogActions: Stop testfs-ost0000#011(Started lustre01-left)
LogActions: Stop testfs-ost0001#011(Started lustre02-left)
LogActions: Stop testfs-ost0002#011(Started lustre03-left)
LogActions: Stop testfs-ost0003#011(Started lustre04-left)
LogActions: Stop drbd-mgs:0#011(Master lustre01-left)
LogActions: Stop drbd-mgs:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-mdt0000:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt0000-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt0000-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost0000:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost0000-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost0000-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost0001:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:1#011(Slave lustre01-left)
LogActions: Stop drbd-testfs-ost0002:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:1#011(Slave lustre04-left)
LogActions: Stop drbd-testfs-ost0003:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:1#011(Slave lustre03-left)
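
For anyone who wants to sort or filter this output, here is a small Python sketch that splits such lines into action, resource, role and node. It assumes that "#011" is syslog's escaped form of a tab character and that the role may be missing, as in the first line above. parse_logactions() is a hypothetical helper, not part of Pacemaker.

```python
import re

# Assumed line shape: "LogActions: <Action> <resource>#011(<optional role> <node>)"
LINE_RE = re.compile(
    r"LogActions:\s+(?P<action>\S+)\s+(?P<resource>\S+?)(?:#011|\t)"
    r"\((?:(?P<role>\S+)\s+)?(?P<node>\S+)\)"
)

def parse_logactions(line):
    """Return a dict with action, resource, role (may be None) and node, or None."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None

sample = [
    "LogActions: Stop drbd-local#011(lustre01-left)",
    "LogActions: Stop drbd-testfs-mdt0000-left:0#011(Master lustre01-left)",
]
for line in sample:
    print(parse_logactions(line))
```
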
For some reason demote is not run on either of the mdt drbd resources
(should it be?), so the drbd RA prints a warning about that.
What I see then is that Pacemaker tries to stop ms-drbd-testfs-mdt0000-left
before ms-drbd-testfs-mdt0000.
Moreover, the testfs-mdt0000 filesystem resource is not stopped before
drbd-testfs-mdt0000 is stopped.
I have advisory ordering constraints between the mdt and ost filesystem
resources, so all OSTs are stopped before the mdt. Thus the mdt stop is
delayed a bit. Maybe this influences what happens.
I'm pretty sure I have correct constraints for at least these three
resources, so it looks like a bug, because the mandatory ordering is not
preserved.
I can produce a report for this.
Best,
Vladislav