[Pacemaker] Problems with SBD

Lars Marowsky-Bree lmb at suse.com
Wed Jan 7 06:09:36 EST 2015


On 2015-01-04T19:49:58, Oriol Mula-Valls <omv.lists at gmail.com> wrote:

> I have a two-node system with SLES 11 SP3 (pacemaker-1.1.9-0.19.102,
> corosync-1.4.5-0.18.15, sbd-1.1-0.13.153). Since December we have had
> several reboots of the system due to SBD: on the 22nd, 24th and 26th.
> The last reboot happened yesterday, January 3rd. The message is the
> same every time.
> /var/log/messages:Jan  3 11:55:08 kernighan sbd: [7879]: info: Cancelling IO request due to timeout (rw=0)
> /var/log/messages:Jan  3 11:55:08 kernighan sbd: [7879]: ERROR: mbox read failed in servant.
> /var/log/messages:Jan  3 11:55:08 kernighan sbd: [7878]: WARN: Servant for /dev/sdc1 (pid: 7879) has terminated
> /var/log/messages:Jan  3 11:55:08 kernighan sbd: [7878]: WARN: Servant for /dev/sdc1 outdated (age: 4)
> /var/log/messages:Jan  3 11:55:08 kernighan sbd: [8183]: info: Servant starting for device /dev/sdc1
> /var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: info: Cancelling IO request due to timeout (rw=0)
> /var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: ERROR: Unable to read header from device 5
> /var/log/messages:Jan  3 11:55:11 kernighan sbd: [8183]: ERROR: Not a valid header on /dev/sdc1
> /var/log/messages:Jan  3 11:55:11 kernighan sbd: [7878]: WARN: Servant for /dev/sdc1 (pid: 8183) has terminated
> /var/log/messages:Jan  3 11:55:11 kernighan sbd: [7878]: WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
> 
> The SBD device is an iSCSI drive shared by a Synology box.
> 
> Could anyone provide me some guidance on what's happening, please?

Those are pretty clearly IO errors due to high latency. You may need to
increase the IO timeout, and/or figure out why the IO to your Synology
box sometimes stalls for multiple seconds. The sbd manpage describes the
timeout options; you can add the required flag to SBD_OPTS in
/etc/sysconfig/sbd.
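
A rough sketch of what that entry could look like, assuming the flag
meant here is -I (which the sbd manpage, as far as can be told,
describes as the async IO timeout, with a default of 3 seconds). The
value 5 is purely illustrative and not a recommendation; check the
manpage for how it relates to the other sbd timeouts before choosing
one.

```shell
# /etc/sysconfig/sbd (fragment)
# Assumption: -I sets the async IO timeout in seconds; 5 is only an example.
# Add the flag alongside whatever SBD_OPTS already contains.
SBD_OPTS="-I 5"
```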

You should also use a stable name (/dev/disk/by-id/...) rather than
/dev/sdc1; /dev/sdX names may not be stable across reboots or iSCSI
restarts.
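
One way to look up the stable name for a device is to list the symlinks
under /dev/disk/by-id that resolve to it. The helper below is a minimal
sketch; stable_names is a hypothetical name, and it assumes a readlink
that supports -f (as GNU coreutils does).

```shell
# Print every symlink in a directory (default: /dev/disk/by-id) that
# resolves to the same device node as the first argument.
stable_names() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (or an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Usage on the affected node (device name taken from the log above):
#   stable_names /dev/sdc1
```

Whichever link it prints would then replace /dev/sdc1 in the sbd
configuration.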

Further, you can avoid the reboots by enabling the Pacemaker
integration (the -P flag); see the manpage for details on what that
flag does. It will be the default in later sbd versions, in releases
after SLE HA 11.
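
As a minimal sketch, enabling the integration would mean adding -P to
SBD_OPTS in /etc/sysconfig/sbd. The line below shows only that flag;
any options already set there would need to be kept next to it.

```shell
# /etc/sysconfig/sbd (fragment)
# -P enables the Pacemaker integration. Keep any other flags that
# SBD_OPTS already carries in the same quoted string.
SBD_OPTS="-P"
```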



Regards,
    Lars

-- 
Architect Storage/HA
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




