[Pacemaker] 2 sbd devices and stonith-ng is showing (1 active devices)

Lars Marowsky-Bree lmb at suse.com
Fri Mar 16 10:13:28 UTC 2012


On 2012-03-16T07:49:10, "Janec, Jozef" <jozef.janec at hp.com> wrote:

> "The sbd agent does not need to and should not be cloned. If all of your nodes run SBD, as is most likely, not even a monitor action provides a real benefit, since the daemon would suicide the node if there was a problem."
> 
> That doesn't explicitly say that there should be only one resource. If there is only one resource and something happens to the storage holding the sbd device, only one node will detect it, because monitoring runs only on the node where the resource is.

That is incorrect.

So yes, the wiki page needs to state clearly that only one instance of
external/sbd needs to be configured - per (group of) sbd device(s), but
since all clusters I've ever seen only had one, I'll not include this
addendum to avoid further confusion ;-)

But all the nodes run the sbd daemon already - so they will *definitely*
notice if something is wrong with (the majority of) the devices, because
they'll self-fence.
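
To illustrate why every node notices a device problem on its own, here is a
minimal sketch of the kind of freshness check the daemon's main loop makes,
loosely modelled on the second attached patch. The names `struct device_state`
and `count_fresh_devices` are made up for this example and are not part of
sbd; the real code keeps its watchers in a linked list and reads the clock
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-device record; not an sbd type. */
struct device_state {
	const char *devname;
	time_t last_report;	/* 0 = the watcher has never reported */
};

/* Count devices whose watcher reported within the last
 * timeout_watchdog seconds. Devices that never reported, or whose
 * last report is too old, do not count as healthy. */
int count_fresh_devices(const struct device_state *devs, size_t n,
			time_t now, time_t timeout_watchdog)
{
	size_t i;
	int fresh = 0;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (devs[i].last_report == 0)
			continue;
		if (now - devs[i].last_report < timeout_watchdog)
			fresh++;
	}
	return fresh;
}
```

The daemon only keeps tickling the watchdog while enough devices are fresh.
Once they are not, the watchdog expires and the node resets itself, without
any cluster resource being involved.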

So there is *really* no need to run multiple sbd resources. Trust me.

> This behavior is important to us because we had a situation where we lost one of the sbd devices and the whole cluster was rebooted, even though a second one was still available.

I've only seen one such example; that was two devices on two fabrics,
one device/fabric was lost, but due to some yet-to-be-analyzed
hardware/firmware/kernel problem *both* fabrics went down. Strangely
enough, instead of demanding that either the hardware or the kernel/MPIO
get fixed, they pointed the finger at SBD ;-)

I've made a few adjustments to sbd that I've not yet pushed upstream
pending further testing. I'm attaching them here, perhaps this helps.

> If a node has no stonith resource agent for sbd, how will that node
> send a request to another node?

Pacemaker/stonith-ng internally forwards the requests to the node where
the fencing "device"/controller resource is running. If no fencing
resource is running, Pacemaker will start one.

Trust me.

> Or are all nodes sharing one?

Yes.

> Or, if there is some issue with this shared resource, will there be
> no fence action because the nodes cannot send requests to the
> shared device?

If there's a problem, the resource will be restarted elsewhere (as per
the regular rules). If it fails permanently on all nodes, yes, then
there'd be a problem, but that is to be expected.

> If there are two nodes with no network connection between them, a split-brain situation arises; if this one shared resource then has a problem, the situation will not be handled correctly. From my point of view, a single sbd resource is a single point of failure, so giving each node its own would avoid all of these situations.

No. Please. Don't clone it. Run just one instance.

> Configure one or two sbd devices in the cluster on a device-mapper multipath target. Then use echo 1 > /sys/block/sdXXX/device/delete to delete all devices from the multipath target. Now all IO operations on that multipath device will hang. In this scenario the server will never realize that the sbd resource is not working on the affected node. When the second node then needs to perform an sbd operation, how will you manage that?

First, with just one SBD device, this will "immediately" fence the
affected node anyway.

Second, with two devices, the documentation says that fencing is
impossible if one device is down. That is a documented/expected
behaviour of 2 devices; it can't withstand a further fault.

Third, if you had three devices, losing one device is unproblematic. If
you lose two, the node will not only be unable to fence, but will suicide
(similar to the first case), which will *definitely* trigger moving the
resource to another node.

Fourth, don't run SBD on queue_if_no_path MPIO.

Fifth, try the attached patches to make SBD more resilient even in the
face of unrecoverable IO hangs. (You can also request a PTF via SUSE's
bugzilla/technical partner management, if you prefer.)
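
The device-count cases above are what a majority rule produces. A minimal
sketch, assuming a strict majority of devices is required before a fencing
message can be delivered. The helper name `can_fence` is hypothetical; the
bodies of sbd's own quorum functions are not shown in the attached diffs and
may differ in detail.

```c
#include <assert.h>

/* Hypothetical helper: can a node still deliver a fencing message,
 * given how many of its configured SBD devices are reachable?
 * Assumes a strict majority is required. */
int can_fence(int good_devices, int total_devices)
{
	assert(total_devices > 0);
	assert(good_devices >= 0 && good_devices <= total_devices);

	return good_devices > total_devices / 2;
}
```

Under this assumption `can_fence(1, 2)` is 0 while `can_fence(2, 3)` is 1,
which is why it takes a third device to tolerate the loss of one.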


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

-------------- next part --------------
# HG changeset patch
# Parent b25b1fc2d8e073e75f69768213f7fe8b5128e378
sbd: Explicitly inform the master process about IO problems on the child (bnc#738295)

Processes can get stuck on exit handling if async IO cannot be
cancelled, alas.

diff -r b25b1fc2d8e0 lib/stonith/sbd-md.c
--- a/lib/stonith/sbd-md.c	Tue Feb 14 11:18:35 2012 +0100
+++ b/lib/stonith/sbd-md.c	Thu Mar 01 13:24:50 2012 +0100
@@ -23,12 +23,14 @@ struct servants_list_item *servants_lead
 static int	servant_count	= 0;
 static int	servant_restart_interval = 60;
 static int	servant_restart_count = 10;
+static int	servant_inform_parent = 0;
 
 /* signals reserved for multi-disk sbd */
 #define SIG_LIVENESS (SIGRTMIN + 1)	/* report liveness of the disk */
 #define SIG_EXITREQ  (SIGRTMIN + 2)	/* exit request to inquisitor */
 #define SIG_TEST     (SIGRTMIN + 3)	/* trigger self test */
 #define SIG_RESTART  (SIGRTMIN + 4)	/* trigger restart of all failed disk */
+#define SIG_IO_FAIL  (SIGRTMIN + 5)	/* the IO child requests to be considered failed */
 /* FIXME: should add dynamic check of SIG_XX >= SIGRTMAX */
 
 /* Debug Helper */
@@ -216,6 +218,20 @@ int ping_via_slots(const char *name)
 	return 0;
 }
 
+/* This is a bit hackish, but the easiest way to rewire all process
+ * exits to send the desired signal to the parent. */
+void servant_exit(void)
+{
+	pid_t ppid;
+	union sigval signal_value;
+
+	ppid = getppid();
+	if (servant_inform_parent) {
+		memset(&signal_value, 0, sizeof(signal_value));
+		sigqueue(ppid, SIG_IO_FAIL, signal_value);
+	}
+}
+
 int servant(const char *diskname, const void* argp)
 {
 	struct sector_mbox_s *s_mbox = NULL;
@@ -260,6 +276,8 @@ int servant(const char *diskname, const 
 	}
 	cl_log(LOG_INFO, "Monitoring slot %d on disk %s", mbox, diskname);
 	set_proc_title("sbd: watcher: %s - slot: %d", diskname, mbox);
+	atexit(servant_exit);
+	servant_inform_parent = 1;
 
 	s_mbox = sector_alloc();
 	if (mbox_write(st, mbox, s_mbox) < 0) {
@@ -339,6 +357,9 @@ int servant(const char *diskname, const 
  out:
 	free(s_mbox);
 	close_device(st);
+	if (rc == 0) {
+		servant_inform_parent = 0;
+	}
 	return rc;
 }
 
@@ -496,9 +517,10 @@ inline void cleanup_servant_by_pid(pid_t
 				s->devname, s->pid);
 		s->pid = 0;
 	} else {
-		/* TODO: This points to an inconsistency in our internal
-		 * data - how to recover? */
-		cl_log(LOG_ERR, "Cannot cleanup after unknown pid %i",
+		/* This most likely is a stray signal from somewhere, or
+		 * a SIGCHLD for a process that has previously
+		 * explicitly disconnected. */
+		cl_log(LOG_INFO, "cleanup_servant: Nothing known about pid %i",
 				pid);
 	}
 }
@@ -543,6 +565,7 @@ void inquisitor_child(void)
 	sigaddset(&procmask, SIG_LIVENESS);
 	sigaddset(&procmask, SIG_EXITREQ);
 	sigaddset(&procmask, SIG_TEST);
+	sigaddset(&procmask, SIG_IO_FAIL);
 	sigaddset(&procmask, SIGUSR1);
 	sigaddset(&procmask, SIGUSR2);
 	sigprocmask(SIG_BLOCK, &procmask, NULL);
@@ -572,6 +595,13 @@ void inquisitor_child(void)
 					cleanup_servant_by_pid(pid);
 				}
 			}
+		} else if (sig == SIG_IO_FAIL) {
+			s = lookup_servant_by_pid(sinfo.si_pid);
+			if (s) {
+				cl_log(LOG_WARNING, "Servant for %s requests to be disowned",
+						s->devname);
+				cleanup_servant_by_pid(sinfo.si_pid);
+			}
 		} else if (sig == SIG_LIVENESS) {
 			s = lookup_servant_by_pid(sinfo.si_pid);
 			if (s) {
diff -r b25b1fc2d8e0 lib/stonith/sbd.h
--- a/lib/stonith/sbd.h	Tue Feb 14 11:18:35 2012 +0100
+++ b/lib/stonith/sbd.h	Thu Mar 01 13:24:50 2012 +0100
@@ -169,6 +169,7 @@ int ping_via_slots(const char *name);
 int dump_headers(void);
 
 int check_all_dead(void);
+void servant_exit(void);
 int servant(const char *diskname, const void* argp);
 void recruit_servant(const char *devname, pid_t pid);
 struct servants_list_item *lookup_servant_by_dev(const char *devname);
-------------- next part --------------
# HG changeset patch
# Parent 3d9f81a0d95d8d277628bf36281f79be155f5a01
sbd: Make servant restart logic more robust and verbose (bnc#738295)

diff -r 3d9f81a0d95d lib/stonith/sbd-md.c
--- a/lib/stonith/sbd-md.c	Thu Feb 09 21:34:43 2012 +0100
+++ b/lib/stonith/sbd-md.c	Tue Feb 14 11:18:35 2012 +0100
@@ -232,6 +232,8 @@ int servant(const char *diskname, const 
 		return -1;
 	}
 
+	cl_log(LOG_INFO, "Servant starting for device %s", diskname);
+
 	/* Block most of the signals */
 	sigfillset(&servant_masks);
 	sigdelset(&servant_masks, SIGKILL);
@@ -405,20 +407,31 @@ int check_all_dead(void)
 }
 
 
+void servant_start(struct servants_list_item *s)
+{
+	int r = 0;
+	union sigval svalue;
+
+	if (s->pid != 0) {
+		r = sigqueue(s->pid, 0, svalue);
+		if ((r != -1 || errno != ESRCH))
+			return;
+	}
+	cl_log(LOG_INFO, "Starting servant for device %s",
+			s->devname);
+	s->restarts++;
+	s->pid = assign_servant(s->devname, servant, NULL);
+	clock_gettime(CLOCK_MONOTONIC, &s->t_started);
+	return;
+}
+
 void servants_start(void)
 {
 	struct servants_list_item *s;
-	int r = 0;
-	union sigval svalue;
 
 	for (s = servants_leader; s; s = s->next) {
-		if (s->pid != 0) {
-			r = sigqueue(s->pid, 0, svalue);
-			if ((r != -1 || errno != ESRCH))
-				continue;
-		}
 		s->restarts = 0;
-		s->pid = assign_servant(s->devname, servant, NULL);
+		servant_start(s);
 	}
 }
 
@@ -479,6 +492,8 @@ inline void cleanup_servant_by_pid(pid_t
 
 	s = lookup_servant_by_pid(pid);
 	if (s) {
+		cl_log(LOG_WARNING, "Servant for %s (pid: %i) has terminated",
+				s->devname, s->pid);
 		s->pid = 0;
 	} else {
 		/* TODO: This points to an inconsistency in our internal
@@ -488,28 +503,6 @@ inline void cleanup_servant_by_pid(pid_t
 	}
 }
 
-void restart_servant_by_pid(pid_t pid)
-{
-	struct servants_list_item* s;
-
-	s = lookup_servant_by_pid(pid);
-	if (s) {
-		if ((servant_restart_count == 0) || s->restarts < servant_restart_count) {
-			s->pid = assign_servant(s->devname, servant, NULL);
-			s->restarts++;
-		} else {
-			cl_log(LOG_WARNING, "Max retry count reached: not restarting servant for %s",
-					s->devname);
-		}
-
-	} else {
-		/* TODO: This points to an inconsistency in our internal
-		 * data - how to recover? */
-		cl_log(LOG_ERR, "Cannot restart unknown pid %i",
-				pid);
-	}
-}
-
 int inquisitor_decouple(void)
 {
 	pid_t ppid = getppid();
@@ -531,27 +524,20 @@ int inquisitor_decouple(void)
 
 void inquisitor_child(void)
 {
-	int sig, pid, i;
+	int sig, pid;
 	sigset_t procmask;
 	siginfo_t sinfo;
-	int *reports;
 	int status;
 	struct timespec timeout;
 	int good_servants = 0;
 	int exiting = 0;
 	int decoupled = 0;
 	time_t latency;
-	struct timespec t_last_tickle, t_now, t_last_restarted;
+	struct timespec t_last_tickle, t_now;
+	struct servants_list_item* s;
 
 	set_proc_title("sbd: inquisitor");
 
-	reports = malloc(sizeof(int) * servant_count);
-	if (!reports) {
-		cl_log(LOG_ERR, "malloc failed");
-		exit(1);
-	}
-	memset(reports, 0, sizeof(int) * servant_count);
-
 	sigemptyset(&procmask);
 	sigaddset(&procmask, SIGCHLD);
 	sigaddset(&procmask, SIG_LIVENESS);
@@ -567,12 +553,13 @@ void inquisitor_child(void)
 	timeout.tv_nsec = 0;
 	good_servants = 0;
 	clock_gettime(CLOCK_MONOTONIC, &t_last_tickle);
-	clock_gettime(CLOCK_MONOTONIC, &t_last_restarted);
 
 	while (1) {
 		sig = sigtimedwait(&procmask, &sinfo, &timeout);
 		DBGPRINT("got signal %d\n", sig);
 
+		clock_gettime(CLOCK_MONOTONIC, &t_now);
+
 		if (sig == SIG_EXITREQ) {
 			servants_kill();
 			watchdog_close();
@@ -581,27 +568,19 @@ void inquisitor_child(void)
 			while ((pid = waitpid(-1, &status, WNOHANG))) {
 				if (pid == -1 && errno == ECHILD) {
 					break;
-				} else if (exiting) {
+				} else {
 					cleanup_servant_by_pid(pid);
-				} else {
-					restart_servant_by_pid(pid);
 				}
 			}
 		} else if (sig == SIG_LIVENESS) {
-			for (i = 0; i < servant_count; i++) {
-				if (reports[i] == sinfo.si_pid) {
-					break;
-				} else if (reports[i] == 0) {
-					reports[i] = sinfo.si_pid;
-					good_servants++;
-					break;
-				}
+			s = lookup_servant_by_pid(sinfo.si_pid);
+			if (s) {
+				clock_gettime(CLOCK_MONOTONIC, &s->t_last);
 			}
 		} else if (sig == SIG_TEST) {
 		} else if (sig == SIGUSR1) {
 			if (exiting)
 				continue;
-			clock_gettime(CLOCK_MONOTONIC, &t_last_restarted);
 			servants_start();
 		}
 
@@ -612,8 +591,22 @@ void inquisitor_child(void)
 				continue;
 		}
 
+		good_servants = 0;
+		for (s = servants_leader; s; s = s->next) {
+			int age = t_now.tv_sec - s->t_last.tv_sec;
+
+			if (!s->t_last.tv_sec)
+				continue;
+
+			if (age < timeout_watchdog) {
+				good_servants++;
+			} else {
+				cl_log(LOG_WARNING, "Servant for %s outdated (age: %d)",
+						s->devname, age);
+			}
+		}
+
 		if (quorum_read(good_servants)) {
-			DBGPRINT("Enough liveness messages\n");
 			if (!decoupled) {
 				if (inquisitor_decouple() < 0) {
 					servants_kill();
@@ -626,11 +619,8 @@ void inquisitor_child(void)
 
 			watchdog_tickle();
 			clock_gettime(CLOCK_MONOTONIC, &t_last_tickle);
-			memset(reports, 0, sizeof(int) * servant_count);
-			good_servants = 0;
 		}
 
-		clock_gettime(CLOCK_MONOTONIC, &t_now);
 		latency = t_now.tv_sec - t_last_tickle.tv_sec;
 		if (timeout_watchdog && (latency > timeout_watchdog)) {
 			if (!decoupled) {
@@ -649,12 +639,19 @@ void inquisitor_child(void)
 			       (int)latency, (int)timeout_watchdog_warn, good_servants);
 		}
 		
-		latency = t_now.tv_sec - t_last_restarted.tv_sec;
-		if (servant_restart_interval > 0 
-				&& latency > servant_restart_interval) {
-			/* Restart all children every hour */
-			clock_gettime(CLOCK_MONOTONIC, &t_last_restarted);
-			servants_start();
+		for (s = servants_leader; s; s = s->next) {
+			int age = t_now.tv_sec - s->t_started.tv_sec;
+
+			if (age > servant_restart_interval) {
+				s->restarts = 0;
+			}
+
+			if (s->restarts > servant_restart_count) {
+				cl_log(LOG_WARNING, "Max retry count reached: not restarting servant for %s",
+						s->devname);
+				continue;
+			}
+			servant_start(s);
 		}
 	}
 	/* not reached */
diff -r 3d9f81a0d95d lib/stonith/sbd.h
--- a/lib/stonith/sbd.h	Thu Feb 09 21:34:43 2012 +0100
+++ b/lib/stonith/sbd.h	Tue Feb 14 11:18:35 2012 +0100
@@ -73,6 +73,7 @@ struct servants_list_item {
 	const char* devname;
 	pid_t pid;
 	int restarts;
+	struct timespec t_last, t_started;
 	struct servants_list_item *next;
 };
 
@@ -174,12 +175,12 @@ struct servants_list_item *lookup_servan
 struct servants_list_item *lookup_servant_by_pid(pid_t pid);
 void servants_kill(void);
 void servants_start(void);
+void servant_start(struct servants_list_item *s);
 void inquisitor_child(void);
 int inquisitor(void);
 int inquisitor_decouple(void);
 int messenger(const char *name, const char *msg);
 int check_timeout_inconsistent(void);
-void restart_servant_by_pid(pid_t pid);
 void cleanup_servant_by_pid(pid_t pid);
 int quorum_write(int good_servants);
 int quorum_read(int good_servants);
-------------- next part --------------
# HG changeset patch
# Parent d8c154589a16cb99ab16f36a27756ba94eefdbee
Medium: sbd: Use async IO for disk reads to increase resilience against hung IO (bnc#738295)

Also adjusts the servant restart interval tunables, since these are now
more likely to trigger. Any late IO will translate to a servant restart.

diff -r d8c154589a16 lib/stonith/Makefile.am
--- a/lib/stonith/Makefile.am	Tue Jan 31 17:19:00 2012 +0100
+++ b/lib/stonith/Makefile.am	Thu Feb 09 21:34:43 2012 +0100
@@ -43,7 +43,7 @@ meatclient_LDADD	= $(GLIBLIB)
 
 sbd_SOURCES		= sbd-md.c sbd-common.c
 sbd_CFLAGS		= -D_GNU_SOURCE
-sbd_LDADD		= $(GLIBLIB)					\
+sbd_LDADD		= $(GLIBLIB) -laio				\
 			$(top_builddir)/lib/clplumbing/libplumb.la	\
 			$(top_builddir)/lib/clplumbing/libplumbgpl.la
 
diff -r d8c154589a16 lib/stonith/sbd-common.c
--- a/lib/stonith/sbd-common.c	Tue Jan 31 17:19:00 2012 +0100
+++ b/lib/stonith/sbd-common.c	Thu Feb 09 21:34:43 2012 +0100
@@ -1,25 +1,3 @@
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <asm/unistd.h>
-#include <ctype.h>
-#include <string.h>
-#include <syslog.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/ptrace.h>
-#include <fcntl.h>
-#include <time.h>
-#include <clplumbing/cl_log.h>
-#include <clplumbing/coredumps.h>
-#include <clplumbing/realtime.h>
-#include <clplumbing/cl_reboot.h>
-#include <malloc.h>
-#include <sys/utsname.h>
-#include <sys/ioctl.h>
-#include <linux/types.h>
-#include <linux/watchdog.h>
-#include <linux/fs.h>
 
 #include "sbd.h"
 
@@ -33,6 +11,7 @@ unsigned long	timeout_watchdog_warn 	= 3
 int		timeout_allocate 	= 2;
 int		timeout_loop	    	= 1;
 int		timeout_msgwait		= 10;
+int		timeout_io		= 3;
 
 int	watchdog_use		= 0;
 int	watchdog_set_timeout	= 1;
@@ -72,8 +51,11 @@ usage(void)
 "-4 <N>		Set msgwait timeout to N seconds (optional, create only)\n"
 "-5 <N>		Warn if loop latency exceeds threshold (optional, watch only)\n"
 "			(default is 3, set to 0 to disable)\n"
-"-t <N>		Interval in seconds for automatic child restarts (optional)\n"
-"			(default is 3600, set to 0 to disable)\n"
+"-I <N>		Async IO read timeout (defaults to 3 * loop timeout, optional)\n"
+"-t <N>		Dampening delay before faulty servants are restarted (optional)\n"
+"			(default is 60, set to 0 to disable)\n"
+"-F <N>		# of failures before a servant is considered faulty (optional)\n"
+"			(default is 10, set to 0 to disable)\n"
 "Commands:\n"
 "create		initialize N slots on <dev> - OVERWRITES DEVICE!\n"
 "list		List all allocated slots on device, and messages.\n"
@@ -212,27 +194,49 @@ maximize_priority(void)
 	}
 }
 
-int
+void
+close_device(struct sbd_context *st)
+{
+	close(st->devfd);
+	free(st);
+}
+
+struct sbd_context *
 open_device(const char* devname)
 {
-	int devfd;
+	struct sbd_context *st;
+
 	if (!devname)
-		return -1;
+		return NULL;
 
-	devfd = open(devname, O_SYNC|O_RDWR|O_DIRECT);
+	st = malloc(sizeof(struct sbd_context));
+	if (!st)
+		return NULL;
+	memset(st, 0, sizeof(struct sbd_context));
 
-	if (devfd == -1) {
+	if (io_setup(1, &st->ioctx) != 0) {
+		cl_perror("io_setup failed");
+		free(st);
+		return NULL;
+	}
+	
+	st->devfd = open(devname, O_SYNC|O_RDWR|O_DIRECT);
+
+	if (st->devfd == -1) {
 		cl_perror("Opening device %s failed.", devname);
-		return -1;
+		free(st);
+		return NULL;
 	}
 
-	ioctl(devfd, BLKSSZGET, &sector_size);
+	ioctl(st->devfd, BLKSSZGET, &sector_size);
 
 	if (sector_size == 0) {
 		cl_perror("Get sector size failed.\n");
-		return -1;
+		close_device(st);
+		return NULL;
 	}
-	return devfd;
+
+	return st;
 }
 
 signed char
@@ -297,14 +301,14 @@ char2cmd(const char cmd)
 }
 
 int
-sector_write(int devfd, int sector, const void *data)
+sector_write(struct sbd_context *st, int sector, const void *data)
 {
-	if (lseek(devfd, sector_size*sector, 0) < 0) {
+	if (lseek(st->devfd, sector_size*sector, 0) < 0) {
 		cl_perror("sector_write: lseek() failed");
 		return -1;
 	}
 
-	if (write(devfd, data, sector_size) <= 0) {
+	if (write(st->devfd, data, sector_size) <= 0) {
 		cl_perror("sector_write: write_sector() failed");
 		return -1;
 	}
@@ -312,55 +316,83 @@ sector_write(int devfd, int sector, cons
 }
 
 int
-sector_read(int devfd, int sector, void *data)
+sector_read(struct sbd_context *st, int sector, void *data)
 {
-	if (lseek(devfd, sector_size*sector, 0) < 0) {
-		cl_perror("sector_read: lseek() failed");
+	struct timespec	timeout;
+	struct io_event event;
+	struct iocb	*ios[1] = { &st->io };
+	long		r;
+
+	timeout.tv_sec  = timeout_io;
+	timeout.tv_nsec = 0;
+
+	memset(&st->io, 0, sizeof(struct iocb));
+	io_prep_pread(&st->io, st->devfd, data, sector_size, sector_size * sector);
+	if (io_submit(st->ioctx, 1, ios) != 1) {
+		cl_log(LOG_ERR, "Failed to submit IO request!");
 		return -1;
 	}
 
-	if (read(devfd, data, sector_size) < sector_size) {
-		cl_perror("sector_read: read() failed");
+	errno = 0;
+	r = io_getevents(st->ioctx, 1L, 1L, &event, &timeout);
+
+	if (r < 0 ) {
+		cl_log(LOG_ERR, "Failed to retrieve IO events");
+		return -1;
+	} else if (r < 1L) {
+		cl_log(LOG_WARNING, "Cancelling IO request due to timeout");
+		r = io_cancel(st->ioctx, ios[0], &event);
+		if (r) {
+			cl_log(LOG_ERR, "Could not cancel IO request!");
+			/* TODO: Couldn't cancel the IO */
+		}
 		return -1;
 	}
-	return(0);
+	
+	/* IO is happy */
+	if (event.res == sector_size) {
+		return 0;
+	} else {
+		cl_log(LOG_ERR, "Short read");
+		return -1;
+	}
 }
 
 int
-slot_read(int devfd, int slot, struct sector_node_s *s_node)
+slot_read(struct sbd_context *st, int slot, struct sector_node_s *s_node)
 {
-	return sector_read(devfd, SLOT_TO_SECTOR(slot), s_node);
+	return sector_read(st, SLOT_TO_SECTOR(slot), s_node);
 }
 
 int
-slot_write(int devfd, int slot, const struct sector_node_s *s_node)
+slot_write(struct sbd_context *st, int slot, const struct sector_node_s *s_node)
 {
-	return sector_write(devfd, SLOT_TO_SECTOR(slot), s_node);
+	return sector_write(st, SLOT_TO_SECTOR(slot), s_node);
 }
 
 int
-mbox_write(int devfd, int mbox, const struct sector_mbox_s *s_mbox)
+mbox_write(struct sbd_context *st, int mbox, const struct sector_mbox_s *s_mbox)
 {
-	return sector_write(devfd, MBOX_TO_SECTOR(mbox), s_mbox);
+	return sector_write(st, MBOX_TO_SECTOR(mbox), s_mbox);
 }
 
 int
-mbox_read(int devfd, int mbox, struct sector_mbox_s *s_mbox)
+mbox_read(struct sbd_context *st, int mbox, struct sector_mbox_s *s_mbox)
 {
-	return sector_read(devfd, MBOX_TO_SECTOR(mbox), s_mbox);
+	return sector_read(st, MBOX_TO_SECTOR(mbox), s_mbox);
 }
 
 int
-mbox_write_verify(int devfd, int mbox, const struct sector_mbox_s *s_mbox)
+mbox_write_verify(struct sbd_context *st, int mbox, const struct sector_mbox_s *s_mbox)
 {
 	void *data;
 	int rc = 0;
 
-	if (sector_write(devfd, MBOX_TO_SECTOR(mbox), s_mbox) < 0)
+	if (sector_write(st, MBOX_TO_SECTOR(mbox), s_mbox) < 0)
 		return -1;
 
 	data = sector_alloc();
-	if (sector_read(devfd, MBOX_TO_SECTOR(mbox), data) < 0) {
+	if (sector_read(st, MBOX_TO_SECTOR(mbox), data) < 0) {
 		rc = -1;
 		goto out;
 	}
@@ -377,20 +409,20 @@ out:
 	return rc;
 }
 
-int header_write(int devfd, struct sector_header_s *s_header)
+int header_write(struct sbd_context *st, struct sector_header_s *s_header)
 {
 	s_header->sector_size = htonl(s_header->sector_size);
 	s_header->timeout_watchdog = htonl(s_header->timeout_watchdog);
 	s_header->timeout_allocate = htonl(s_header->timeout_allocate);
 	s_header->timeout_loop = htonl(s_header->timeout_loop);
 	s_header->timeout_msgwait = htonl(s_header->timeout_msgwait);
-	return sector_write(devfd, 0, s_header);
+	return sector_write(st, 0, s_header);
 }
 
 int
-header_read(int devfd, struct sector_header_s *s_header)
+header_read(struct sbd_context *st, struct sector_header_s *s_header)
 {
-	if (sector_read(devfd, 0, s_header) < 0)
+	if (sector_read(st, 0, s_header) < 0)
 		return -1;
 
 	s_header->sector_size = ntohl(s_header->sector_size);
@@ -426,18 +458,18 @@ valid_header(const struct sector_header_
 }
 
 struct sector_header_s *
-header_get(int devfd)
+header_get(struct sbd_context *st)
 {
 	struct sector_header_s *s_header;
 	s_header = sector_alloc();
 
-	if (header_read(devfd, s_header) < 0) {
-		cl_log(LOG_ERR, "Unable to read header from device %d", devfd);
+	if (header_read(st, s_header) < 0) {
+		cl_log(LOG_ERR, "Unable to read header from device %d", st->devfd);
 		return NULL;
 	}
 
 	if (valid_header(s_header) < 0) {
-		cl_log(LOG_ERR, "header on device %d is not valid.", devfd);
+		cl_log(LOG_ERR, "header on device %d is not valid.", st->devfd);
 		return NULL;
 	}
 
@@ -448,7 +480,7 @@ header_get(int devfd)
 }
 
 int
-init_device(int devfd)
+init_device(struct sbd_context *st)
 {
 	struct sector_header_s	*s_header;
 	struct sector_node_s	*s_node;
@@ -469,30 +501,30 @@ init_device(int devfd)
 	s_header->timeout_loop = timeout_loop;
 	s_header->timeout_msgwait = timeout_msgwait;
 
-	fstat(devfd, &s);
+	fstat(st->devfd, &s);
 	/* printf("st_size = %ld, st_blksize = %ld, st_blocks = %ld\n",
 			s.st_size, s.st_blksize, s.st_blocks); */
 
 	cl_log(LOG_INFO, "Creating version %d header on device %d",
 			s_header->version,
-			devfd);
+			st->devfd);
 	fprintf(stdout, "Creating version %d header on device %d\n",
 			s_header->version,
-			devfd);
-	if (header_write(devfd, s_header) < 0) {
+			st->devfd);
+	if (header_write(st, s_header) < 0) {
 		rc = -1; goto out;
 	}
 	cl_log(LOG_INFO, "Initializing %d slots on device %d",
 			s_header->slots,
-			devfd);
+			st->devfd);
 	fprintf(stdout, "Initializing %d slots on device %d\n",
 			s_header->slots,
-			devfd);
+			st->devfd);
 	for (i=0;i < s_header->slots;i++) {
-		if (slot_write(devfd, i, s_node) < 0) {
+		if (slot_write(st, i, s_node) < 0) {
 			rc = -1; goto out;
 		}
-		if (mbox_write(devfd, i, s_mbox) < 0) {
+		if (mbox_write(st, i, s_mbox) < 0) {
 			rc = -1; goto out;
 		}
 	}
@@ -507,7 +539,7 @@ out:	free(s_node);
  * slot number. If not found, returns -1.
  * This is necessary because slots might not be continuous. */
 int
-slot_lookup(int devfd, const struct sector_header_s *s_header, const char *name)
+slot_lookup(struct sbd_context *st, const struct sector_header_s *s_header, const char *name)
 {
 	struct sector_node_s	*s_node = NULL;
 	int 			i;
@@ -521,7 +553,7 @@ slot_lookup(int devfd, const struct sect
 	s_node = sector_alloc();
 
 	for (i=0; i < s_header->slots; i++) {
-		if (slot_read(devfd, i, s_node) < 0) {
+		if (slot_read(st, i, s_node) < 0) {
 			rc = -1; goto out;
 		}
 		if (s_node->in_use != 0) {
@@ -538,7 +570,7 @@ out:	free(s_node);
 }
 
 int
-slot_unused(int devfd, const struct sector_header_s *s_header)
+slot_unused(struct sbd_context *st, const struct sector_header_s *s_header)
 {
 	struct sector_node_s	*s_node;
 	int 			i;
@@ -547,7 +579,7 @@ slot_unused(int devfd, const struct sect
 	s_node = sector_alloc();
 
 	for (i=0; i < s_header->slots; i++) {
-		if (slot_read(devfd, i, s_node) < 0) {
+		if (slot_read(st, i, s_node) < 0) {
 			rc = -1; goto out;
 		}
 		if (s_node->in_use == 0) {
@@ -561,7 +593,7 @@ out:	free(s_node);
 
 
 int
-slot_allocate(int devfd, const char *name)
+slot_allocate(struct sbd_context *st, const char *name)
 {
 	struct sector_header_s	*s_header = NULL;
 	struct sector_node_s	*s_node = NULL;
@@ -575,7 +607,7 @@ slot_allocate(int devfd, const char *nam
 		rc = -1; goto out;
 	}
 
-	s_header = header_get(devfd);
+	s_header = header_get(st);
 	if (!s_header) {
 		rc = -1; goto out;
 	}
@@ -584,19 +616,19 @@ slot_allocate(int devfd, const char *nam
 	s_mbox = sector_alloc();
 
 	while (1) {
-		i = slot_lookup(devfd, s_header, name);
+		i = slot_lookup(st, s_header, name);
 		if (i >= 0) {
 			rc = i; goto out;
 		}
 
-		i = slot_unused(devfd, s_header);
+		i = slot_unused(st, s_header);
 		if (i >= 0) {
 			cl_log(LOG_INFO, "slot %d is unused - trying to own", i);
 			fprintf(stdout, "slot %d is unused - trying to own\n", i);
 			memset(s_node, 0, sizeof(*s_node));
 			s_node->in_use = 1;
 			strncpy(s_node->name, name, sizeof(s_node->name));
-			if (slot_write(devfd, i, s_node) < 0) {
+			if (slot_write(st, i, s_node) < 0) {
 				rc = -1; goto out;
 			}
 			sleep(timeout_allocate);
@@ -614,7 +646,7 @@ out:	free(s_node);
 }
 
 int
-slot_list(int devfd)
+slot_list(struct sbd_context *st)
 {
 	struct sector_header_s	*s_header = NULL;
 	struct sector_node_s	*s_node = NULL;
@@ -622,7 +654,7 @@ slot_list(int devfd)
 	int 			i;
 	int			rc = 0;
 
-	s_header = header_get(devfd);
+	s_header = header_get(st);
 	if (!s_header) {
 		rc = -1; goto out;
 	}
@@ -631,11 +663,11 @@ slot_list(int devfd)
 	s_mbox = sector_alloc();
 
 	for (i=0; i < s_header->slots; i++) {
-		if (slot_read(devfd, i, s_node) < 0) {
+		if (slot_read(st, i, s_node) < 0) {
 			rc = -1; goto out;
 		}
 		if (s_node->in_use > 0) {
-			if (mbox_read(devfd, i, s_mbox) < 0) {
+			if (mbox_read(st, i, s_mbox) < 0) {
 				rc = -1; goto out;
 			}
 			printf("%d\t%s\t%s\t%s\n",
@@ -651,7 +683,7 @@ out:	free(s_node);
 }
 
 int
-slot_msg(int devfd, const char *name, const char *cmd)
+slot_msg(struct sbd_context *st, const char *name, const char *cmd)
 {
 	struct sector_header_s	*s_header = NULL;
 	struct sector_mbox_s	*s_mbox = NULL;
@@ -663,7 +695,7 @@ slot_msg(int devfd, const char *name, co
 		rc = -1; goto out;
 	}
 
-	s_header = header_get(devfd);
+	s_header = header_get(st);
 	if (!s_header) {
 		rc = -1; goto out;
 	}
@@ -672,7 +704,7 @@ slot_msg(int devfd, const char *name, co
 		name = local_uname;
 	}
 
-	mbox = slot_lookup(devfd, s_header, name);
+	mbox = slot_lookup(st, s_header, name);
 	if (mbox < 0) {
 		cl_log(LOG_ERR, "slot_msg(): No slot found for %s.", name);
 		rc = -1; goto out;
@@ -690,7 +722,7 @@ slot_msg(int devfd, const char *name, co
 
 	cl_log(LOG_INFO, "Writing %s to node slot %s",
 			cmd, name);
-	if (mbox_write_verify(devfd, mbox, s_mbox) < -1) {
+	if (mbox_write_verify(st, mbox, s_mbox) < -1) {
 		rc = -1; goto out;
 	}
 	if (strcasecmp(cmd, "exit") != 0) {
@@ -705,7 +737,7 @@ out:	free(s_mbox);
 }
 
 int
-slot_ping(int devfd, const char *name)
+slot_ping(struct sbd_context *st, const char *name)
 {
 	struct sector_header_s	*s_header = NULL;
 	struct sector_mbox_s	*s_mbox = NULL;
@@ -718,7 +750,7 @@ slot_ping(int devfd, const char *name)
 		rc = -1; goto out;
 	}
 
-	s_header = header_get(devfd);
+	s_header = header_get(st);
 	if (!s_header) {
 		rc = -1; goto out;
 	}
@@ -727,7 +759,7 @@ slot_ping(int devfd, const char *name)
 		name = local_uname;
 	}
 
-	mbox = slot_lookup(devfd, s_header, name);
+	mbox = slot_lookup(st, s_header, name);
 	if (mbox < 0) {
 		cl_log(LOG_ERR, "slot_msg(): No slot found for %s.", name);
 		rc = -1; goto out;
@@ -739,13 +771,13 @@ slot_ping(int devfd, const char *name)
 	strncpy(s_mbox->from, local_uname, sizeof(s_mbox->from)-1);
 
 	cl_log(LOG_DEBUG, "Pinging node %s", name);
-	if (mbox_write(devfd, mbox, s_mbox) < -1) {
+	if (mbox_write(st, mbox, s_mbox) < -1) {
 		rc = -1; goto out;
 	}
 
 	rc = -1;
 	while (waited <= timeout_msgwait) {
-		if (mbox_read(devfd, mbox, s_mbox) < 0)
+		if (mbox_read(st, mbox, s_mbox) < 0)
 			break;
 		if (s_mbox->cmd != SBD_MSG_TEST) {
 			rc = 0;
@@ -869,10 +901,10 @@ make_daemon(void)
 }
 
 int
-header_dump(int devfd)
+header_dump(struct sbd_context *st)
 {
 	struct sector_header_s *s_header;
-	s_header = header_get(devfd);
+	s_header = header_get(st);
 	if (s_header == NULL)
 		return -1;
 
diff -r d8c154589a16 lib/stonith/sbd-md.c
--- a/lib/stonith/sbd-md.c	Tue Jan 31 17:19:00 2012 +0100
+++ b/lib/stonith/sbd-md.c	Thu Feb 09 21:34:43 2012 +0100
@@ -16,41 +16,13 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
-#include <signal.h>
-#include <errno.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <asm/unistd.h>
-#include <ctype.h>
-#include <string.h>
-#include <syslog.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/ptrace.h>
-#include <fcntl.h>
-#include <time.h>
-#include <clplumbing/cl_log.h>
-#include <clplumbing/coredumps.h>
-#include <clplumbing/realtime.h>
-#include <clplumbing/cl_reboot.h>
-#include <clplumbing/setproctitle.h>
-#include <malloc.h>
-#include <time.h>
-#include <sys/utsname.h>
-#include <sys/ioctl.h>
-#include <linux/types.h>
-#include <linux/watchdog.h>
-#include <linux/fs.h>
-
 #include "sbd.h"
 
 struct servants_list_item *servants_leader = NULL;
 
 static int	servant_count	= 0;
-static int	servant_restart_interval = 3600;
+static int	servant_restart_interval = 60;
+static int	servant_restart_count = 10;
 
 /* signals reserved for multi-disk sbd */
 #define SIG_LIVENESS (SIGRTMIN + 1)	/* report liveness of the disk */
@@ -104,18 +76,18 @@ int assign_servant(const char* devname, 
 int init_devices()
 {
 	int rc = 0;
-	int devfd;
+	struct sbd_context *st;
 	struct servants_list_item *s;
 
 	for (s = servants_leader; s; s = s->next) {
 		fprintf(stdout, "Initializing device %s\n",
 				s->devname);
-		devfd = open_device(s->devname);
-		if (devfd == -1) {
+		st = open_device(s->devname);
+		if (!st) {
 			return -1;
 		}
-		rc = init_device(devfd);
-		close(devfd);
+		rc = init_device(st);
+		close_device(st);
 		if (rc == -1) {
 			fprintf(stderr, "Failed to init device %s\n", s->devname);
 			return rc;
@@ -128,14 +100,14 @@ int init_devices()
 int slot_msg_wrapper(const char* devname, const void* argp)
 {
 	int rc = 0;
-	int devfd;
+	struct sbd_context *st;
 	const struct slot_msg_arg_t* arg = (const struct slot_msg_arg_t*)argp;
 
-        devfd = open_device(devname);
-        if (devfd == -1) 
+	st = open_device(devname);
+	if (!st)
 		return -1;
-	rc = slot_msg(devfd, arg->name, arg->msg);
-	close(devfd);
+	rc = slot_msg(st, arg->name, arg->msg);
+	close_device(st);
 	return rc;
 }
 
@@ -143,32 +115,32 @@ int slot_ping_wrapper(const char* devnam
 {
 	int rc = 0;
 	const char* name = (const char*)argp;
-	int devfd;
+	struct sbd_context *st;
 
-	devfd = open_device(devname);
-	if (devfd == -1)
+	st = open_device(devname);
+	if (!st)
 		return -1;
-	rc = slot_ping(devfd, name);
-	close(devfd);
+	rc = slot_ping(st, name);
+	close_device(st);
 	return rc;
 }
 
 int allocate_slots(const char *name)
 {
 	int rc = 0;
-	int devfd;
+	struct sbd_context *st;
 	struct servants_list_item *s;
 
 	for (s = servants_leader; s; s = s->next) {
 		fprintf(stdout, "Trying to allocate slot for %s on device %s.\n", 
 				name,
 				s->devname);
-		devfd = open_device(s->devname);
-		if (devfd == -1) {
+		st = open_device(s->devname);
+		if (!st) {
 			return -1;
 		}
-		rc = slot_allocate(devfd, name);
-		close(devfd);
+		rc = slot_allocate(st, name);
+		close_device(st);
 		if (rc == -1)
 			return rc;
 		fprintf(stdout, "Slot for %s has been allocated on %s.\n",
@@ -182,15 +154,15 @@ int list_slots()
 {
 	int rc = 0;
 	struct servants_list_item *s;
-	int devfd;
+	struct sbd_context *st;
 
 	for (s = servants_leader; s; s = s->next) {
 		DBGPRINT("list slots on device %s\n", s->devname);
-		devfd = open_device(s->devname);
-		if (devfd == -1)
+		st = open_device(s->devname);
+		if (!st)
 			return -1;
-		rc = slot_list(devfd);
-		close(devfd);
+		rc = slot_list(st);
+		close_device(st);
 		if (rc == -1)
 			return rc;
 	}
@@ -207,7 +179,6 @@ int ping_via_slots(const char *name)
 	siginfo_t sinfo;
 	struct servants_list_item *s;
 
-	DBGPRINT("you shall know no fear\n");
 	sigemptyset(&procmask);
 	sigaddset(&procmask, SIGCHLD);
 	sigprocmask(SIG_BLOCK, &procmask, NULL);
@@ -253,7 +224,7 @@ int servant(const char *diskname, const 
 	time_t t0, t1, latency;
 	union sigval signal_value;
 	sigset_t servant_masks;
-	int devfd;
+	struct sbd_context *st;
 	pid_t ppid;
 
 	if (!diskname) {
@@ -272,12 +243,12 @@ int servant(const char *diskname, const 
 	/* FIXME: check error */
 	sigprocmask(SIG_SETMASK, &servant_masks, NULL);
 
-	devfd = open_device(diskname);
-	if (devfd == -1) {
+	st = open_device(diskname);
+	if (!st) {
 		return -1;
 	}
 
-	mbox = slot_allocate(devfd, local_uname);
+	mbox = slot_allocate(st, local_uname);
 	if (mbox < 0) {
 		cl_log(LOG_ERR,
 		       "No slot allocated, and automatic allocation failed for disk %s.",
@@ -289,7 +260,7 @@ int servant(const char *diskname, const 
 	set_proc_title("sbd: watcher: %s - slot: %d", diskname, mbox);
 
 	s_mbox = sector_alloc();
-	if (mbox_write(devfd, mbox, s_mbox) < 0) {
+	if (mbox_write(st, mbox, s_mbox) < 0) {
 		rc = -1;
 		goto out;
 	}
@@ -308,7 +279,7 @@ int servant(const char *diskname, const 
 			do_reset();
 		}
 
-		if (mbox_read(devfd, mbox, s_mbox) < 0) {
+		if (mbox_read(st, mbox, s_mbox) < 0) {
 			cl_log(LOG_ERR, "mbox read failed in servant.");
 			exit(1);
 		}
@@ -321,7 +292,7 @@ int servant(const char *diskname, const 
 			switch (s_mbox->cmd) {
 			case SBD_MSG_TEST:
 				memset(s_mbox, 0, sizeof(*s_mbox));
-				mbox_write(devfd, mbox, s_mbox);
+				mbox_write(st, mbox, s_mbox);
 				sigqueue(ppid, SIG_TEST, signal_value);
 				break;
 			case SBD_MSG_RESET:
@@ -345,7 +316,7 @@ int servant(const char *diskname, const 
 				cl_log(LOG_ERR, "Unknown message on disk %s",
 				       diskname);
 				memset(s_mbox, 0, sizeof(*s_mbox));
-				mbox_write(devfd, mbox, s_mbox);
+				mbox_write(st, mbox, s_mbox);
 				break;
 			}
 		}
@@ -365,8 +336,7 @@ int servant(const char *diskname, const 
 	}
  out:
 	free(s_mbox);
-	close(devfd);
-	devfd = -1;
+	close_device(st);
 	return rc;
 }
 
@@ -465,17 +435,17 @@ void servants_kill(void)
 
 int check_timeout_inconsistent(void)
 {
-	int devfd;
+	struct sbd_context *st;
 	struct sector_header_s *hdr_cur = 0, *hdr_last = 0;
 	struct servants_list_item* s;
 	int inconsistent = 0;
 
 	for (s = servants_leader; s; s = s->next) {
-		devfd = open_device(s->devname);
-		if (devfd < 0)
+		st = open_device(s->devname);
+		if (!st)
 			continue;
-		hdr_cur = header_get(devfd);
-		close(devfd);
+		hdr_cur = header_get(st);
+		close_device(st);
 		if (!hdr_cur)
 			continue;
 		if (hdr_last) {
@@ -524,7 +494,7 @@ void restart_servant_by_pid(pid_t pid)
 
 	s = lookup_servant_by_pid(pid);
 	if (s) {
-		if (s->restarts < 10) {
+		if ((servant_restart_count == 0) || s->restarts < servant_restart_count) {
 			s->pid = assign_servant(s->devname, servant, NULL);
 			s->restarts++;
 		} else {
@@ -802,15 +772,15 @@ int dump_headers(void)
 {
 	int rc = 0;
 	struct servants_list_item *s = servants_leader;
-	int devfd;
+	struct sbd_context *st;
 
 	for (s = servants_leader; s; s = s->next) {
 		fprintf(stdout, "==Dumping header on disk %s\n", s->devname);
-		devfd = open_device(s->devname);
-		if (devfd == -1)
+		st = open_device(s->devname);
+		if (!st)
 			return -1;
-		rc = header_dump(devfd);
-		close(devfd);
+		rc = header_dump(st);
+		close_device(st);
 		if (rc == -1)
 			return rc;
 		fprintf(stdout, "==Header on disk %s is dumped\n", s->devname);
@@ -835,7 +805,7 @@ int main(int argc, char **argv, char **e
 
 	get_uname();
 
-	while ((c = getopt(argc, argv, "DRWhvw:d:n:1:2:3:4:5:t:")) != -1) {
+	while ((c = getopt(argc, argv, "DRWhvw:d:n:1:2:3:4:5:t:I:F:")) != -1) {
 		switch (c) {
 		case 'D':
 			/* Ignore for historical reasons */
@@ -879,6 +849,12 @@ int main(int argc, char **argv, char **e
 		case 't':
 			servant_restart_interval = atoi(optarg);
 			break;
+		case 'I':
+			timeout_io = atoi(optarg);
+			break;
+		case 'F':
+			servant_restart_count = atoi(optarg);
+			break;
 		case 'h':
 			usage();
 			return (0);
diff -r d8c154589a16 lib/stonith/sbd.h
--- a/lib/stonith/sbd.h	Tue Jan 31 17:19:00 2012 +0100
+++ b/lib/stonith/sbd.h	Thu Feb 09 21:34:43 2012 +0100
@@ -15,8 +15,35 @@
  * License along with this library; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
+
 #include <arpa/inet.h>
+#include <asm/unistd.h>
+#include <clplumbing/cl_log.h>
+#include <clplumbing/cl_reboot.h>
+#include <clplumbing/coredumps.h>
+#include <clplumbing/realtime.h>
+#include <clplumbing/setproctitle.h>
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <libaio.h>
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/watchdog.h>
+#include <malloc.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/ptrace.h>
+#include <sys/stat.h>
 #include <sys/types.h>
+#include <sys/utsname.h>
+#include <sys/wait.h>
+#include <syslog.h>
+#include <time.h>
+#include <unistd.h>
 
 /* Sector data types */
 struct sector_header_s {
@@ -49,6 +76,12 @@ struct servants_list_item {
 	struct servants_list_item *next;
 };
 
+struct sbd_context {
+	int	devfd;
+	io_context_t	ioctx;
+	struct iocb	io;
+};
+
 #define SBD_MSG_EMPTY	0x00
 #define SBD_MSG_TEST	0x01
 #define SBD_MSG_RESET	0x02
@@ -65,32 +98,33 @@ int watchdog_tickle(void);
 int watchdog_init(void);
 void sysrq_init(void);
 void watchdog_close(void);
-int open_device(const char* devname);
+struct sbd_context *open_device(const char* devname);
+void close_device(struct sbd_context *st);
 signed char cmd2char(const char *cmd);
 void * sector_alloc(void);
 const char* char2cmd(const char cmd);
-int sector_write(int devfd, int sector, const void *data);
-int sector_read(int devfd, int sector, void *data);
-int slot_read(int devfd, int slot, struct sector_node_s *s_node);
-int slot_write(int devfd, int slot, const struct sector_node_s *s_node);
-int mbox_write(int devfd, int mbox, const struct sector_mbox_s *s_mbox);
-int mbox_read(int devfd, int mbox, struct sector_mbox_s *s_mbox);
-int mbox_write_verify(int devfd, int mbox, const struct sector_mbox_s *s_mbox);
+int sector_write(struct sbd_context *st, int sector, const void *data);
+int sector_read(struct sbd_context *st, int sector, void *data);
+int slot_read(struct sbd_context *st, int slot, struct sector_node_s *s_node);
+int slot_write(struct sbd_context *st, int slot, const struct sector_node_s *s_node);
+int mbox_write(struct sbd_context *st, int mbox, const struct sector_mbox_s *s_mbox);
+int mbox_read(struct sbd_context *st, int mbox, struct sector_mbox_s *s_mbox);
+int mbox_write_verify(struct sbd_context *st, int mbox, const struct sector_mbox_s *s_mbox);
 /* After a call to header_write(), certain data fields will have been
  * converted to on-disk byte-order; the header should not be accessed
  * afterwards anymore! */
-int header_write(int devfd, struct sector_header_s *s_header);
-int header_read(int devfd, struct sector_header_s *s_header);
+int header_write(struct sbd_context *st, struct sector_header_s *s_header);
+int header_read(struct sbd_context *st, struct sector_header_s *s_header);
 int valid_header(const struct sector_header_s *s_header);
-struct sector_header_s * header_get(int devfd);
-int init_device(int devfd);
-int slot_lookup(int devfd, const struct sector_header_s *s_header, const char *name);
-int slot_unused(int devfd, const struct sector_header_s *s_header);
-int slot_allocate(int devfd, const char *name);
-int slot_list(int devfd);
-int slot_ping(int devfd, const char *name);
-int slot_msg(int devfd, const char *name, const char *cmd);
-int header_dump(int devfd);
+struct sector_header_s * header_get(struct sbd_context *st);
+int init_device(struct sbd_context *st);
+int slot_lookup(struct sbd_context *st, const struct sector_header_s *s_header, const char *name);
+int slot_unused(struct sbd_context *st, const struct sector_header_s *s_header);
+int slot_allocate(struct sbd_context *st, const char *name);
+int slot_list(struct sbd_context *st);
+int slot_ping(struct sbd_context *st, const char *name);
+int slot_msg(struct sbd_context *st, const char *name, const char *cmd);
+int header_dump(struct sbd_context *st);
 void sysrq_trigger(char t);
 void do_crashdump(void);
 void do_reset(void);
@@ -105,6 +139,7 @@ extern unsigned long    timeout_watchdog
 extern int      timeout_allocate;
 extern int      timeout_loop;
 extern int      timeout_msgwait;
+extern int      timeout_io;
 extern int  watchdog_use;
 extern int  watchdog_set_timeout;
 extern int  skip_rt;
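
For anyone reading the patch without the rest of the tree at hand, here is a
rough, self-contained sketch of the two patterns it introduces: an opaque
context pointer replacing the bare file descriptor (so callers test "!st"
rather than "== -1"), and a restart limit where 0 means "unlimited". The names
demo_context, demo_open, demo_close and may_restart are made up for this
illustration and are not part of sbd. The real struct sbd_context also carries
an io_context_t and a struct iocb from libaio, which are left out here, and
whether the real close_device() tolerates NULL is an assumption of the sketch
only.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the patch's struct sbd_context. The real
 * structure additionally holds libaio state (io_context_t, struct iocb). */
struct demo_context {
	int devfd;
};

/* Open a device and wrap the descriptor in a heap-allocated context.
 * Returns NULL on any failure, which is why the callers in the patch
 * switch from "devfd == -1" to "!st". */
static struct demo_context *demo_open(const char *path)
{
	struct demo_context *st;

	assert(path != NULL);

	st = calloc(1, sizeof(*st));
	if (!st)
		return NULL;

	st->devfd = open(path, O_RDONLY);
	if (st->devfd == -1) {
		free(st);
		return NULL;
	}
	return st;
}

/* Release the descriptor and the context together. Accepting NULL is a
 * choice made for this sketch, not a statement about sbd's close_device(). */
static void demo_close(struct demo_context *st)
{
	if (!st)
		return;
	close(st->devfd);
	free(st);
}

/* Mirrors the condition added to restart_servant_by_pid():
 * a limit of 0 disables the cap, otherwise restarts must stay below it. */
static int may_restart(int restarts, int limit)
{
	return limit == 0 || restarts < limit;
}
```

A caller would follow the same shape as the wrappers in the diff: call
demo_open(), return -1 if it yields NULL, do the work, then call demo_close().
Keeping allocation and teardown in one pair of functions means any extra
per-device state only has to be set up and torn down in one place.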