[Pacemaker] notifications for cloned resources

Steve Feehan feehans at ncbi.nlm.nih.gov
Wed Aug 13 16:33:21 CEST 2014


On Tue, Aug 12, 2014 at 04:56:06PM +1000, Andrew Beekhof wrote:
> 
> What is ganeti doing with the information though?
> Like GFS2, OCFS2 and the dlm, it might be more appropriate for it to get membership information directly from corosync.

ganeti wants the ganeti-node-role resource to run on all nodes, but it
only performs any action on the master node.

It expects to receive notifications when a node is down and then sets
that node's internal state to offline. The only OCF action that it
implements is 'notify'. When it's invoked in this mode it does this:

notify_action() {
  is_master || exit 0
  [[ -f $NORUNFILE ]] && exit 0
  # TODO: also implement the "start" operation for readding a node
  [[ $OCF_RESKEY_CRM_meta_notify_operation == "stop" ]] || exit 0
  [[ $OCF_RESKEY_CRM_meta_notify_type == "post" ]] || exit 0
  local -r target=$OCF_RESKEY_CRM_meta_notify_stop_uname
  local -r node=$(gnt-node list --no-headers -o name $target)
  # TODO: use drain_node when we can
  offline_node $node
  exit 0
}
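To make the guards easier to reason about, here is a minimal sketch that
splits them into two helpers. This is not ganeti code: should_offline and
notify_targets are hypothetical names, and the only inputs are the
OCF_RESKEY_CRM_meta_notify_* variables used in the snippet above. It also
assumes the stop_uname variable may carry several space-separated node
names when more than one instance stops.

```shell
#!/usr/bin/env bash
# Sketch only: should_offline and notify_targets are hypothetical helpers.

# Succeeds only for a notification sent after a stop has completed.
should_offline() {
  [[ ${OCF_RESKEY_CRM_meta_notify_type:-} == "post" ]] &&
    [[ ${OCF_RESKEY_CRM_meta_notify_operation:-} == "stop" ]]
}

# Prints one stopped node name per line.
# Assumption: the variable holds a space-separated list of node names,
# so it is deliberately left unquoted to split on whitespace.
notify_targets() {
  local name
  for name in ${OCF_RESKEY_CRM_meta_notify_stop_uname:-}; do
    printf '%s\n' "$name"
  done
}
```

With helpers like these, the notify handler would only need to check
should_offline and then call offline_node once per name printed by
notify_targets, after the is_master and $NORUNFILE checks.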

ganeti provides a harep utility that will perform actions to heal a
cluster when a node is marked offline.

All I need is some way to mark a node offline when it is down/fenced.
This has to be done from the master. Master failover works, so if the
master is down, Pacemaker will promote one of the other nodes to master.

There are two cases:

 1. A node that is not the master is down. On the master mark the node
    as offline and harep will do the rest.

 2. A node that is the master is down. Pacemaker will start the master
    on another node (this works), the new master will mark the old master
    as offline, and then harep will do the rest.

-- 
Steve Feehan [Contractor]


