[Pacemaker] Pacemaker resource management
Dan Frincu
dfrincu at streamwide.ro
Sun Jun 6 14:47:35 UTC 2010
Hello all,
I have a couple of questions for which I haven't found any relevant
documentation, so I would appreciate any answers on the matter.
I'm using drbd 8.3.2-6 with pacemaker 1.0.5-4.2, openais 0.80.5-15.2 and
heartbeat 3.0.0-33.3 in a high-availability two-node cluster running
mysql and apache on drbd partitions.
When a resource such as apache fails, pacemaker tries to restart the
service. From what I can see in the logs, this has to do with
"common_apply_stickiness".
1. How many times does pacemaker try to restart a resource before
declaring it "down" and migrating the resource (and dependencies) to the
other node?
2. How can I alter this behavior, that is, set the number of restart
attempts made for a resource before it is migrated to the other
available node?
I've noticed that sometimes, if there is a problem with the block device
(drbd), the cluster goes into a state where it migrates all resources
in a group from A to B. However, when it tries to start the resources on
B, there is a synchronization issue: one block device is still in the
process of being updated from drbd0 on node A to drbd0 on node B. In this
case the group resources don't start until the synchronization is complete.
3. Can I "force" a group of resources to migrate to another node if any
of its resources fails to start within a number of retries or after a
timeout? This includes the case where the group is being migrated from A
to B and one resource fails to start on B, so that the group should be
migrated back to A. If so, how?
4. Is there a Resource Agent out there that can be configured to send
SNMP traps?
Thank you in advance for your replies.
Best regards.