[Pacemaker] Not unmoving colocated resources can provoke DRBD split-brain

Robert Dahlem Robert.Dahlem at gmx.net
Fri May 30 10:43:47 EDT 2014


Hi,

On 30.05.2014 13:20, Robert Dahlem wrote:

>> run crm_report for the period covered by these commands and attach the result:
>>
>> # crm node standby korfwf01 ; sleep 10
>> # crm node standby korfwf02 ; sleep 10
>> # crm node online korfwf02 ; sleep 10
>> # crm node online korfwf01 ; sleep 10
>> # crm status
> 
> I filed a bug
> 	http://bugs.clusterlabs.org/show_bug.cgi?id=5217
> and attached the crm_report.

This looks like a race condition: I added
	sleep 3
at a central point in /usr/lib/ocf/resource.d/linbit/drbd, and from then
on I could no longer reproduce the split-brain.

Then I added some logging and repeated
	crm resource move ALL-ffm korfwf01
	crm node standby korfwf01
	crm node standby korfwf02
	crm node online korfwf02
	crm node online korfwf01
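
For anyone who wants to repeat the sequence above, it could be wrapped in a
small script. This is only a sketch: the "run" and "reproduce" helpers and
the DRY_RUN switch are hypothetical names made up for illustration, and the
10 second pause between steps is borrowed from the commands quoted earlier
in this thread, not something known to be required. By default it only
prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical helper: replay the command sequence from this mail.
# DRY_RUN defaults to 1, so commands are printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"          # show what would be run
    else
        "$@"               # actually run it
    fi
}

reproduce() {
    first=$1               # node that initially holds the resources
    second=$2              # the peer node
    pause=${3:-10}         # assumed settle time between steps, in seconds

    run crm resource move ALL-ffm "$first"
    run sleep "$pause"
    run crm node standby "$first"
    run sleep "$pause"
    run crm node standby "$second"
    run sleep "$pause"
    run crm node online "$second"
    run sleep "$pause"
    run crm node online "$first"
    run sleep "$pause"
}

reproduce korfwf01 korfwf02 10
```

Running it with DRY_RUN=0 set in the environment would execute the commands
against the cluster, so that should only be done on a test setup.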

The following is what happens after
	crm node online korfwf01

Timestamp                       korfwf01        korfwf02
========================================================
Fri May 30 15:46:35 CEST 2014                   notify
Fri May 30 15:46:38 CEST 2014                   notify
Fri May 30 15:46:41 CEST 2014                   demote
Fri May 30 15:46:44 CEST 2014                   notify
Fri May 30 15:46:47 CEST 2014                   notify
Fri May 30 15:46:50 CEST 2014                   stop
Fri May 30 15:46:53 CEST 2014    start           start
Fri May 30 15:46:58 CEST 2014    notify          notify
Fri May 30 15:47:01 CEST 2014    notify          notify
Fri May 30 15:47:04 CEST 2014    promote
Fri May 30 15:47:07 CEST 2014    notify          notify
Fri May 30 15:47:10 CEST 2014    monitor         monitor

1.) Note the parallel "start" on both nodes at 15:46:53. Without
"sleep 3" this could very well turn into a race.

2.) Why is Pacemaker doing "stop/start" at all on korfwf02?

Now again, but without "sleep 3":

Timestamp                       korfwf01        korfwf02
========================================================
Fri May 30 16:23:24 CEST 2014                   notify
Fri May 30 16:23:26 CEST 2014                   notify
Fri May 30 16:23:26 CEST 2014    start           demote
Fri May 30 16:23:26 CEST 2014                   notify
Fri May 30 16:23:26 CEST 2014                   notify
Fri May 30 16:23:26 CEST 2014                   notify
Fri May 30 16:23:26 CEST 2014                   stop
Fri May 30 16:23:26 CEST 2014                   start
Fri May 30 16:23:27 CEST 2014    notify          notify
Fri May 30 16:23:28 CEST 2014    notify          notify
Fri May 30 16:23:28 CEST 2014    promote
Fri May 30 16:23:28 CEST 2014    notify          notify
Fri May 30 16:23:28 CEST 2014    monitor         monitor

Look at this excerpt from /var/log/messages:

16:23:27 korfwf01 block drbd7: disk( Diskless -> Attaching )
16:23:27 korfwf01 block drbd7: disk( Attaching -> UpToDate )
16:23:27 korfwf01 drbd ffm: conn( StandAlone -> Unconnected )
16:23:27 korfwf01 drbd ffm: conn( Unconnected -> WFConnection )
16:23:28 korfwf01 block drbd7: role( Secondary -> Primary )
16:23:28 korfwf01 drbd ffm: conn( WFConnection -> WFReportParams )
16:23:28 korfwf01 drbd ffm: conn( WFReportParams -> NetworkFailure )
16:23:28 korfwf01 drbd ffm: conn( NetworkFailure -> Unconnected )
16:23:28 korfwf01 drbd ffm: conn( Unconnected -> WFConnection )

korfwf01 was not waiting for a connection before 16:23:27. At that time
korfwf02 was stopped but had the latest data. So korfwf01 came up, and
was promoted at 16:23:28, before korfwf02 was up again -> split-brain!
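
The pattern in the excerpt (a promotion while the connection is still in
WFConnection) can be searched for mechanically. The following is a rough
sketch, not a DRBD tool: "check_promotions" is a hypothetical name, it
assumes log lines shaped like the ones above, and it assumes the input
covers a single DRBD resource on one host (e.g. pre-filtered with grep).
It is deliberately strict and treats every connection state other than
"Connected" as suspicious.

```shell
#!/bin/sh
# Hypothetical helper: report promotions to Primary that happen while
# the last seen DRBD connection state is not "Connected".
# Reads the files given as arguments, or stdin if there are none.

check_promotions() {
    awk '
        # Remember the target state of the latest "conn( A -> B )" line.
        /conn\( / && match($0, /-> [A-Za-z]+ \)/) {
            conn = substr($0, RSTART + 3, RLENGTH - 5)
        }
        # Flag a promotion if the peer was not connected at that point.
        /role\( Secondary -> Primary \)/ && conn != "Connected" {
            printf "promoted while conn=%s: %s\n", \
                   (conn == "" ? "unknown" : conn), $0
        }
    ' "$@"
}
```

It could be called as "check_promotions /var/log/messages" or fed from a
pipe. For the excerpt above it would flag the role change at 16:23:28.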

Again: why is korfwf02 doing "stop/start" in this situation? Without
this, korfwf02 would just do "demote" while korfwf01 would do
"start/promote" and everything would be fine.

Kind regards,
Robert



