[Pacemaker] continue starting chain with failed group resources
Patrick H.
pacemaker at feystorm.net
Wed Dec 15 01:18:16 UTC 2010
Sent: Tue Dec 14 2010 11:37:06 GMT-0700 (Mountain Standard Time)
From: Dejan Muhamedagic <dejanmm at fastmail.fm>
To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
Subject: Re: [Pacemaker] continue starting chain with failed group
resources
> Hi,
>
> On Mon, Dec 13, 2010 at 10:43:36PM -0700, Patrick H. wrote:
>
>> After tinkering with this for a few hours I finally have something working.
>>
>> colocation co-raid inf: ( md_raid iscsi_1 iscsi_2 iscsi_3 )
>>
>
> This should be noop. You'd want something like this, I think:
>
> colocation co-raid inf: md_raid ( iscsi_1 iscsi_2 iscsi_3 )
>
>
No, that makes the md_raid service depend on all the iscsi services
being started, which I don't want.
>> order or-raid 0: ( iscsi_1 iscsi_2 iscsi_3 ) md_raid
>>
>> Got rid of the group, changed the score on the order to 0, and
>> changed the grouping of both the colocation and order. This
>> *appears* to function as intended, but if anyone can point out any
>> pitfalls I'd appreciate it
>>
>> -Patrick
>>
>> Sent: Mon Dec 13 2010 21:12:04 GMT-0700 (Mountain Standard Time)
>> From: Patrick H. <pacemaker at feystorm.net>
>> To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
>> Subject: [Pacemaker] continue starting chain with failed group resources
>>
>>> Is there a way to continue down a chain of starting resources once
>>> a previous resource has tried to start, no matter if the try was
>>> successful or not?
>>>
>
> No, that's currently not possible to express. I think that you
> should take the iSCSI resources out of the cluster and let them
> start on boot _before_ the cluster manager. If there are not
> enough disks, then the md_raid resource is going to fail.
>
Can't do that either. When the node that is currently using the iscsi
services fails, they have to be migrated over to another host so it can
assemble them into a raid array. If they're not being managed by
pacemaker, they won't migrate.
I made a few more tweaks to the configuration I posted earlier and it
seems to work pretty well, with only one exception.
colocation co-raid inf: ( md_raid iscsi_1 iscsi_2 iscsi_3 )
order or-raid_start 0: ( iscsi_1:start iscsi_2:start iscsi_3:start ) md_raid:start
order or-raid_stop inf: md_raid:stop ( iscsi_1:stop iscsi_2:stop iscsi_3:stop )
That makes it so that when they start up, they start in order, but it
isn't required that every iscsi service start before md_raid, just that
they try to start.
Then when they stop, it's mandatory that they stop in that order, so that
no iscsi service will stop while md_raid is still running.
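For the other resources further down the chain (like the filesystem
mount from my original note), a mandatory order plus a colocation
against md_raid should keep them from starting unless the array
actually came up. A rough sketch only, assuming a hypothetical
filesystem resource named fs_data that is not part of my real
configuration:

```
# fs_data is a hypothetical Filesystem resource, named only for illustration
colocation co-fs inf: fs_data md_raid
order or-fs inf: md_raid fs_data
```

Unlike the score of 0 on or-raid_start, the inf score here is meant to
make the ordering mandatory, so a failed md_raid should block fs_data.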
The exception I mentioned is a bug in the policy engine (bug 2435). The
policy engine allows resources within a colocation set to start on other
nodes. So if I were to stop one of the iscsi services and then start it
again, it might start on a different node. Unless this bug gets fixed
soon, I'll probably modify the iscsi script so that all the iscsi
devices are under one resource.
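If it comes to that, the constraints would probably collapse to a plain
two-resource pair. A sketch under assumptions: iscsi_all is a
hypothetical name for the combined resource, and its script would have
to report a successful start whenever enough targets for a degraded
array are logged in, which is something I have not written yet:

```
# iscsi_all is a hypothetical single resource wrapping all three iscsi devices
colocation co-raid inf: md_raid iscsi_all
order or-raid inf: iscsi_all md_raid
```

With a single mandatory order, the stop sequence should simply be the
reverse of the start sequence, so md_raid would stop before iscsi_all.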
> Thanks,
>
> Dejan
>
>
>>> I've got 3 iSCSI resources which are in a group, and then an md
>>> raid-5 array as another resource. I have the raid array resource
>>> set to start after the group with a colocation rule, but it will
>>> only start if the whole group comes up. Since this is raid-5, we
>>> can obviously handle some disk failure and start up anyway. So how
>>> do I get it to try to start it up once all the iSCSI resources
>>> have tried to start? Went looking through the docs and didn't find
>>> anything.
>>>
>>> Note: there will be other resources in the chain (like mounting
>>> the filesystem) that I don't want to try and start if the raid
>>> array resource didn't start.
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>