[Pacemaker] DRBD Recovery Policies
Menno Luiten
mluiten at artifix.net
Fri Mar 12 10:05:19 UTC 2010
Are you absolutely sure you set the resource-fencing parameters
correctly in your drbd.conf (you can post your drbd.conf if unsure) and
reloaded the configuration?
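
For reference, the fencing setup from the users guide looks roughly like
this in drbd.conf (the resource name "r0" and the handler script paths are
examples and may differ on your installation):

  resource r0 {
    disk {
      fencing resource-only;
    }
    handlers {
      # called when the peer becomes unreachable; places a Pacemaker
      # constraint that keeps an outdated node from being promoted
      fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
      # called on the sync target once the resync has finished;
      # removes that constraint again
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
  }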
On 12-03-10 10:48, Darren.Mansell at opengi.co.uk wrote:
> The odd thing is - it didn't. From my test, it failed back, re-promoted
> NodeA to be the DRBD master and failed all grouped resources back too.
>
> Everything was working, and the ~7GB of data I had put onto NodeB while
> NodeA was down was now available on NodeA...
>
> /proc/drbd on the slave said Secondary/Primary UpToDate/Inconsistent
> while it was syncing data back - so it was able to mount the
> inconsistent data on the primary node and access files that hadn't
> yet sync'd over?! I mounted a 4GB ISO that shouldn't have been there
> yet and was able to access data inside it...
>
> Is my understanding of DRBD limited, and is it actually able to provide
> access to files that aren't fully sync'd yet over the network link?
>
> If so - wow.
>
> I'm confused ;)
>
>
> -----Original Message-----
> From: Menno Luiten [mailto:mluiten at artifix.net]
> Sent: 11 March 2010 19:35
> To: pacemaker at oss.clusterlabs.org
> Subject: Re: [Pacemaker] DRBD Recovery Policies
>
> Hi Darren,
>
> I believe DRBD handles this by fencing the Master/Slave resource in
> Pacemaker during the resync. See
> http://www.drbd.org/users-guide/s-pacemaker-fencing.html. This would
> prevent Node A from promoting/starting services with outdated data
> (fence-peer), and it would be forced to wait to take over until the
> resync is completed (after-resync-target).
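
For reference, the constraint that crm-fence-peer.sh places in the CIB
looks roughly like this in crm shell syntax (the master/slave resource
name and node name here are only illustrative); crm-unfence-peer.sh
removes it again once the resync has finished:

  location drbd-fence-by-handler-ms_drbd ms_drbd \
          rule $role="Master" -inf: #uname ne nodeb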
>
> Regards,
> Menno
>
> On 11-3-2010 15:52, Darren.Mansell at opengi.co.uk wrote:
>> I've been reading the DRBD Pacemaker guide on the DRBD.org site and I'm
>> not sure I can find the answer to my question.
>>
>> Imagine a scenario:
>>
>> - Two nodes: NodeA and NodeB
>> - Ordered and grouped resources: M/S DRBD promote/demote, FS mount,
>>   and another resource that depends on the FS mount
>> - DRBD master location score of 100 on NodeA
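
A setup like that might be sketched in crm shell syntax roughly as follows
(all resource names, the device/mount point, and the Dummy placeholder for
the dependent service are illustrative, not taken from the poster's config):

  primitive drbd0 ocf:linbit:drbd \
          params drbd_resource="r0" \
          op monitor interval="29s" role="Master" \
          op monitor interval="31s" role="Slave"
  ms ms_drbd0 drbd0 \
          meta master-max="1" clone-max="2" notify="true"
  primitive fs0 ocf:heartbeat:Filesystem \
          params device="/dev/drbd0" directory="/srv/data" fstype="ext3"
  # stand-in for whatever resource depends on the mounted filesystem
  primitive app0 ocf:heartbeat:Dummy
  group grp_app fs0 app0
  colocation app_on_drbd_master inf: grp_app ms_drbd0:Master
  order app_after_drbd inf: ms_drbd0:promote grp_app:start
  location prefer_nodea ms_drbd0 rule $role="Master" 100: #uname eq NodeA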
>>
>> NodeA is down, resources fail over to NodeB, and everything happily runs
>> for days. When NodeA is brought back online it isn't treated as
>> split-brain, since a normal demote/promote would happen. But the data on
>> NodeA would be very old and possibly take a long time to sync from
>> NodeB.
>>
>> What would happen in this scenario? Would the RA defer the promote until
>> the sync is completed? Would the inability to promote cause the failback
>> not to happen, so that a resource cleanup is required once the sync has
>> completed?
>>
>> I guess this is really down to how advanced the Linbit DRBD RA is?
>>
>> Thanks
>>
>> Darren
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list
>> Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker