[Pacemaker] Managing DRBD Dual Primary with Pacemaker always initial Split Brains
Felix Zachlod
fz.lists at sis-gmbh.info
Thu Oct 2 08:44:37 CEST 2014
On 01.10.2014 20:46, Digimer wrote:
>
> At some point along the way, both nodes were Primary while not
> connected, even if for just a moment. Your log snippet above shows the
> results of this break, they do not appear to speak to the break itself.
The problem is even easier to reproduce when I stop a DRBD resource and
restart it later; this always leads to a split brain.
This is the log from one side:
Oct 2 08:11:46 storage-test-d kernel: [44936.343453] drbd testdata2:
asender terminated
Oct 2 08:11:46 storage-test-d kernel: [44936.343457] drbd testdata2:
Terminating drbd_a_testdata
Oct 2 08:11:46 storage-test-d kernel: [44936.362103] drbd testdata2:
conn( TearDown -> Disconnecting )
Oct 2 08:11:47 storage-test-d kernel: [44936.450052] drbd testdata2:
Connection closed
Oct 2 08:11:47 storage-test-d kernel: [44936.450070] drbd testdata2:
conn( Disconnecting -> StandAlone )
Oct 2 08:11:47 storage-test-d kernel: [44936.450074] drbd testdata2:
receiver terminated
Oct 2 08:11:47 storage-test-d kernel: [44936.450081] drbd testdata2:
Terminating drbd_r_testdata
Oct 2 08:11:47 storage-test-d kernel: [44936.450104] block drbd11:
disk( UpToDate -> Failed )
Oct 2 08:11:47 storage-test-d kernel: [44936.514071] block drbd11:
bitmap WRITE of 0 pages took 0 jiffies
Oct 2 08:11:47 storage-test-d kernel: [44936.514078] block drbd11: 0 KB
(0 bits) marked out-of-sync by on disk bit-map.
Oct 2 08:11:47 storage-test-d kernel: [44936.514088] block drbd11:
disk( Failed -> Diskless )
Oct 2 08:11:47 storage-test-d kernel: [44936.514793] block drbd11:
drbd_bm_resize called with capacity == 0
Oct 2 08:11:47 storage-test-d kernel: [44936.515461] drbd testdata2:
Terminating drbd_w_testdata
Oct 2 08:12:16 storage-test-d rsyslogd-2177: imuxsock lost 124 messages
from pid 2748 due to rate-limiting
Oct 2 08:13:06 storage-test-d kernel: [45016.120378] drbd testdata2:
Starting worker thread (from drbdsetup-84 [10353])
Oct 2 08:13:06 storage-test-d kernel: [45016.121012] block drbd11:
disk( Diskless -> Attaching )
Oct 2 08:13:06 storage-test-d kernel: [45016.121812] drbd testdata2:
Method to ensure write ordering: drain
Oct 2 08:13:06 storage-test-d kernel: [45016.121817] block drbd11: max
BIO size = 1048576
Oct 2 08:13:06 storage-test-d kernel: [45016.121825] block drbd11:
drbd_bm_resize called with capacity == 838835128
Oct 2 08:13:06 storage-test-d kernel: [45016.127192] block drbd11:
resync bitmap: bits=104854391 words=1638350 pages=3200
Oct 2 08:13:06 storage-test-d kernel: [45016.127199] block drbd11: size
= 400 GB (419417564 KB)
Oct 2 08:13:06 storage-test-d kernel: [45016.321361] block drbd11:
recounting of set bits took additional 2 jiffies
Oct 2 08:13:06 storage-test-d kernel: [45016.321369] block drbd11: 0 KB
(0 bits) marked out-of-sync by on disk bit-map.
Oct 2 08:13:06 storage-test-d kernel: [45016.321382] block drbd11:
disk( Attaching -> UpToDate )
Oct 2 08:13:06 storage-test-d kernel: [45016.321388] block drbd11:
attached to UUIDs
28A688FAC06E2662:0000000000000000:0EABC2724124755C:0EAAC2724124755C
Oct 2 08:13:06 storage-test-d kernel: [45016.376555] drbd testdata2:
conn( StandAlone -> Unconnected )
Oct 2 08:13:06 storage-test-d kernel: [45016.376634] drbd testdata2:
Starting receiver thread (from drbd_w_testdata [10355])
Oct 2 08:13:06 storage-test-d kernel: [45016.376876] drbd testdata2:
receiver (re)started
Oct 2 08:13:06 storage-test-d kernel: [45016.376897] drbd testdata2:
conn( Unconnected -> WFConnection )
Oct 2 08:13:07 storage-test-d rsyslogd-2177: imuxsock begins to drop
messages from pid 2748 due to rate-limiting
Oct 2 08:13:07 storage-test-d kernel: [45016.707045] block drbd11:
role( Secondary -> Primary )
Oct 2 08:13:07 storage-test-d kernel: [45016.729180] block drbd11: new
current UUID
C58090DF57933525:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
Oct 2 08:13:07 storage-test-d kernel: [45016.876920] drbd testdata2:
Handshake successful: Agreed network protocol version 101
Oct 2 08:13:07 storage-test-d kernel: [45016.876926] drbd testdata2:
Agreed to support TRIM on protocol level
Oct 2 08:13:07 storage-test-d kernel: [45016.876999] drbd testdata2:
conn( WFConnection -> WFReportParams )
Oct 2 08:13:07 storage-test-d kernel: [45016.877013] drbd testdata2:
Starting asender thread (from drbd_r_testdata [10376])
Oct 2 08:13:07 storage-test-d kernel: [45017.015220] block drbd11:
drbd_sync_handshake:
Oct 2 08:13:07 storage-test-d kernel: [45017.015228] block drbd11: self
C58090DF57933525:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
bits:0 flags:0
Oct 2 08:13:07 storage-test-d kernel: [45017.015234] block drbd11: peer
7F282664519D49A1:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
bits:0 flags:0
Oct 2 08:13:07 storage-test-d kernel: [45017.015239] block drbd11:
uuid_compare()=100 by rule 90
Oct 2 08:13:07 storage-test-d kernel: [45017.015247] block drbd11:
helper command: /sbin/drbdadm initial-split-brain minor-11
Oct 2 08:13:07 storage-test-d kernel: [45017.018192] block drbd11:
helper command: /sbin/drbdadm initial-split-brain minor-11 exit code 0 (0x0)
Oct 2 08:13:07 storage-test-d kernel: [45017.018218] block drbd11:
Split-Brain detected but unresolved, dropping connection!
Oct 2 08:13:07 storage-test-d kernel: [45017.018244] block drbd11:
helper command: /sbin/drbdadm split-brain minor-11
Oct 2 08:13:07 storage-test-d kernel: [45017.021201] block drbd11:
helper command: /sbin/drbdadm split-brain minor-11 exit code 0 (0x0)
Oct 2 08:13:07 storage-test-d kernel: [45017.021242] drbd testdata2:
conn( WFReportParams -> Disconnecting )
Oct 2 08:13:07 storage-test-d kernel: [45017.021248] drbd testdata2:
error receiving ReportState, e: -5 l: 0!
Oct 2 08:13:07 storage-test-d kernel: [45017.021277] drbd testdata2:
asender terminated
Oct 2 08:13:07 storage-test-d kernel: [45017.021280] drbd testdata2:
Terminating drbd_a_testdata
Oct 2 08:13:07 storage-test-d kernel: [45017.143260] drbd testdata2:
Connection closed
Oct 2 08:13:07 storage-test-d kernel: [45017.143440] drbd testdata2:
conn( Disconnecting -> StandAlone )
Oct 2 08:13:07 storage-test-d kernel: [45017.143446] drbd testdata2:
receiver terminated
Oct 2 08:13:07 storage-test-d kernel: [45017.143449] drbd testdata2:
Terminating drbd_r_testdata
This is the log from the other side. As you can see, both sides promote
even before the connection is ready.
Oct 2 08:11:46 storage-test-c kernel: [75654.680647] block drbd11:
role( Primary -> Secondary )
Oct 2 08:11:46 storage-test-c kernel: [75654.680764] block drbd11:
bitmap WRITE of 0 pages took 0 jiffies
Oct 2 08:11:46 storage-test-c kernel: [75654.680769] block drbd11: 0 KB
(0 bits) marked out-of-sync by on disk bit-map.
Oct 2 08:11:46 storage-test-c kernel: [75654.725842] block drbd11:
peer( Primary -> Secondary )
Oct 2 08:11:46 storage-test-c rsyslogd-2177: imuxsock begins to drop
messages from pid 2472 due to rate-limiting
Oct 2 08:11:46 storage-test-c kernel: [75654.947972] drbd testdata2:
peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk(
UpToDate -> DUnknown )
Oct 2 08:11:46 storage-test-c kernel: [75654.948085] drbd testdata2:
asender terminated
Oct 2 08:11:46 storage-test-c kernel: [75654.948089] drbd testdata2:
Terminating drbd_a_testdata
Oct 2 08:11:47 storage-test-c kernel: [75655.052087] drbd testdata2:
Connection closed
Oct 2 08:11:47 storage-test-c kernel: [75655.052096] drbd testdata2:
conn( Disconnecting -> StandAlone )
Oct 2 08:11:47 storage-test-c kernel: [75655.052099] drbd testdata2:
receiver terminated
Oct 2 08:11:47 storage-test-c kernel: [75655.052102] drbd testdata2:
Terminating drbd_r_testdata
Oct 2 08:11:47 storage-test-c kernel: [75655.052119] block drbd11:
disk( UpToDate -> Failed )
Oct 2 08:11:47 storage-test-c kernel: [75655.120108] block drbd11:
bitmap WRITE of 0 pages took 0 jiffies
Oct 2 08:11:47 storage-test-c kernel: [75655.120112] block drbd11: 0 KB
(0 bits) marked out-of-sync by on disk bit-map.
Oct 2 08:11:47 storage-test-c kernel: [75655.120117] block drbd11:
disk( Failed -> Diskless )
Oct 2 08:11:47 storage-test-c kernel: [75655.120419] block drbd11:
drbd_bm_resize called with capacity == 0
Oct 2 08:11:47 storage-test-c kernel: [75655.120900] drbd testdata2:
Terminating drbd_w_testdata
Oct 2 08:12:16 storage-test-c rsyslogd-2177: imuxsock lost 125 messages
from pid 2472 due to rate-limiting
Oct 2 08:13:06 storage-test-c kernel: [75734.646310] drbd testdata2:
Starting worker thread (from drbdsetup-84 [7942])
Oct 2 08:13:06 storage-test-c kernel: [75734.646588] block drbd11:
disk( Diskless -> Attaching )
Oct 2 08:13:06 storage-test-c kernel: [75734.647120] drbd testdata2:
Method to ensure write ordering: drain
Oct 2 08:13:06 storage-test-c kernel: [75734.647126] block drbd11: max
BIO size = 1048576
Oct 2 08:13:06 storage-test-c kernel: [75734.647133] block drbd11:
drbd_bm_resize called with capacity == 838835128
Oct 2 08:13:06 storage-test-c kernel: [75734.652037] block drbd11:
resync bitmap: bits=104854391 words=1638350 pages=3200
Oct 2 08:13:06 storage-test-c kernel: [75734.652050] block drbd11: size
= 400 GB (419417564 KB)
Oct 2 08:13:06 storage-test-c kernel: [75734.796781] block drbd11:
recounting of set bits took additional 3 jiffies
Oct 2 08:13:06 storage-test-c kernel: [75734.796786] block drbd11: 0 KB
(0 bits) marked out-of-sync by on disk bit-map.
Oct 2 08:13:06 storage-test-c kernel: [75734.796800] block drbd11:
disk( Attaching -> UpToDate )
Oct 2 08:13:06 storage-test-c kernel: [75734.796804] block drbd11:
attached to UUIDs
28A688FAC06E2662:0000000000000000:0EABC2724124755C:0EAAC2724124755C
Oct 2 08:13:06 storage-test-c kernel: [75734.821025] drbd testdata2:
conn( StandAlone -> Unconnected )
Oct 2 08:13:06 storage-test-c kernel: [75734.821051] drbd testdata2:
Starting receiver thread (from drbd_w_testdata [7944])
Oct 2 08:13:06 storage-test-c kernel: [75734.821108] drbd testdata2:
receiver (re)started
Oct 2 08:13:06 storage-test-c kernel: [75734.821127] drbd testdata2:
conn( Unconnected -> WFConnection )
Oct 2 08:13:07 storage-test-c rsyslogd-2177: imuxsock begins to drop
messages from pid 2472 due to rate-limiting
Oct 2 08:13:07 storage-test-c kernel: [75735.288894] block drbd11:
role( Secondary -> Primary )
Oct 2 08:13:07 storage-test-c kernel: [75735.289121] block drbd11: new
current UUID
7F282664519D49A1:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
Oct 2 08:13:07 storage-test-c kernel: [75735.480079] drbd testdata2:
Handshake successful: Agreed network protocol version 101
Oct 2 08:13:07 storage-test-c kernel: [75735.480083] drbd testdata2:
Agreed to support TRIM on protocol level
Oct 2 08:13:07 storage-test-c kernel: [75735.480158] drbd testdata2:
conn( WFConnection -> WFReportParams )
Oct 2 08:13:07 storage-test-c kernel: [75735.480164] drbd testdata2:
Starting asender thread (from drbd_r_testdata [7965])
Oct 2 08:13:07 storage-test-c kernel: [75735.568077] block drbd11:
drbd_sync_handshake:
Oct 2 08:13:07 storage-test-c kernel: [75735.568082] block drbd11: self
7F282664519D49A1:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
bits:0 flags:0
Oct 2 08:13:07 storage-test-c kernel: [75735.568087] block drbd11: peer
C58090DF57933525:28A688FAC06E2662:0EABC2724124755C:0EAAC2724124755C
bits:0 flags:0
Oct 2 08:13:07 storage-test-c kernel: [75735.568091] block drbd11:
uuid_compare()=100 by rule 90
Oct 2 08:13:07 storage-test-c kernel: [75735.568096] block drbd11:
helper command: /sbin/drbdadm initial-split-brain minor-11
Oct 2 08:13:07 storage-test-c kernel: [75735.570098] block drbd11:
helper command: /sbin/drbdadm initial-split-brain minor-11 exit code 0 (0x0)
Oct 2 08:13:07 storage-test-c kernel: [75735.570111] block drbd11:
Split-Brain detected but unresolved, dropping connection!
Oct 2 08:13:07 storage-test-c kernel: [75735.570168] block drbd11:
helper command: /sbin/drbdadm split-brain minor-11
Oct 2 08:13:07 storage-test-c kernel: [75735.571935] block drbd11:
helper command: /sbin/drbdadm split-brain minor-11 exit code 0 (0x0)
Oct 2 08:13:07 storage-test-c kernel: [75735.571960] drbd testdata2:
conn( WFReportParams -> Disconnecting )
Oct 2 08:13:07 storage-test-c kernel: [75735.571964] drbd testdata2:
error receiving ReportState, e: -5 l: 0!
Oct 2 08:13:07 storage-test-c kernel: [75735.572015] drbd testdata2:
asender terminated
Oct 2 08:13:07 storage-test-c kernel: [75735.572018] drbd testdata2:
Terminating drbd_a_testdata
Oct 2 08:13:07 storage-test-c kernel: [75735.672070] drbd testdata2:
Connection closed
Oct 2 08:13:07 storage-test-c kernel: [75735.672124] drbd testdata2:
conn( Disconnecting -> StandAlone )
Oct 2 08:13:07 storage-test-c kernel: [75735.672127] drbd testdata2:
receiver terminated
Oct 2 08:13:07 storage-test-c kernel: [75735.672130] drbd testdata2:
Terminating drbd_r_testdata
And yes, I can easily configure stonith/fencing - stonith is in fact already
working, I would just have to add the resource-level side in DRBD. But in
my opinion this adds unnecessary complexity to the problem at THIS point of
the investigation. I definitely reach a split-brain situation here where I
don't expect one. Stopping a resource and restarting it in a normal fashion
should not cause a split brain. And if the split brain is there, the fencing
would kick in and I would have a node turning off, possibly in a loop, since
on every restart the resource would split-brain again and shoot a node. I
think it would be best to first get this cluster running in a way that does
NOT cause split brains under normal circumstances, and afterwards get it
working for the odd circumstances where fencing is needed. I will definitely
try them all.
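For reference, the resource-level side I mean is the DRBD fencing policy plus
the fence-peer handlers shipped with DRBD 8.4 - roughly like this (just a
sketch, the resource name is an example):

resource testdata2 {
  disk {
    # invoke the fence-peer handler when the peer's disk state becomes unknown
    fencing resource-only;
  }
  handlers {
    # the stock scripts add/remove a constraint in the CIB so the stale peer
    # cannot be promoted, instead of shooting the node
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}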
And again: I still believe that Pacemaker is causing the problem here by
promoting the nodes too fast.
I tried the second part manually:
1. stop the resource in Pacemaker via crm resource stop $resource
2. set the cluster into maintenance mode
3. start the DRBD resource via the shell
Both sides come up as Secondary, sync, and are ready to be promoted (a rough
sketch of the commands follows below).
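Roughly, that manual sequence is (the ms resource name is a placeholder;
testdata2 is the DRBD resource from the logs above):

# 1. stop the DRBD master/slave resource in Pacemaker
crm resource stop ms_drbd_testdata2
# 2. put the cluster into maintenance mode so Pacemaker stops managing it
crm configure property maintenance-mode=true
# 3. bring the DRBD resource up by hand on both nodes
drbdadm up testdata2
# both sides should now show Secondary/Secondary and Connected
cat /proc/drbd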
If you do this via Pacemaker as above, you can see that it promotes the
resource to Primary although it is not connected at this point
(WFConnection), and naturally this means a split brain afterwards.
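For context, the master/slave resource is defined along the lines of the
usual crm dual-primary example - this is a sketch with placeholder names,
not my exact configuration:

primitive p_drbd_testdata2 ocf:linbit:drbd \
    params drbd_resource="testdata2" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
ms ms_drbd_testdata2 p_drbd_testdata2 \
    meta master-max="2" clone-max="2" notify="true" interleave="true"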
> I'd take an errant fence over a split-brain any day.
Not if your good node with the good data is being fenced and the one with
the (consistent but old) data comes up. This leads to a time warp in the FS
of any VM running on this storage and will most probably cause them all to
crash, possibly damaging their file systems too. And finally you end up in
a situation where you don't know whether both nodes contain relevant data.
> That said, that you
> can recover without invalidating one of the nodes is very odd.
As I saw from the logs, it isn't too odd; it seems it recovers
automatically. I have configured:
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
The after-sb-1pri policy lets the secondary sync from the primary.
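Those lines live in the net section of the resource, roughly like this
(sketch only, resource name as in the logs):

resource testdata2 {
  net {
    allow-two-primaries yes;
    # automatic split-brain recovery policies
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
}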
> What
> version of DRBD are you using?
I am currently running 8.4.5 on top of Debian Wheezy with Pacemaker 1.1.7.
> If you stop DRBD on a node, the other should stay in WFConnection state.
> Otherwise, it won't hear the connection request when you restart the peer.
Yes, this is true, and this is what happens... it's just that the other
node seems to come up as Primary, which is unexpected, and
after-sb-2pri disconnect;
disconnects both hosts at that point.
> You are right though, if a node goes Primary *before* connecting, you
> have a split-brain. The start order should be attach -> connect ->
> primary. Why it's not doing that is to be determined, but it would
> certainly be a problem if you're seeing attach -> primary -> connect.
regards, Felix