[Pacemaker] Breaking pacemaker
jimbob palmer
jimbobpalmer at gmail.com
Tue Feb 16 12:59:17 UTC 2010
Hello,
I have a cluster that is all working perfectly. Time to break it.
This is a two-node master/slave cluster with DRBD. Failover between
the nodes works backwards and forwards. Everything is happier than a
well-fed cat.
I wanted to see what would happen if the DRBD device couldn't be
mounted, so on the slave node I deleted the mountpoint, then failed
over.
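For anyone wanting to reproduce this, the gist was something like the following (the path is only a placeholder for my real mountpoint, and putting the master into standby is just one way to force the failover):

# on the slave: remove the directory the Filesystem resource mounts on
rmdir /mnt/blah
# on the current master: push the resources across to the slave
crm node standby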
Oh dear. I broke things so badly that I had to fail back, shut down
corosync on the slave, delete the config files, and start it again.
Since that's not the right way to do it, I thought I should ask the
list for the right way.
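(By "delete the config files" I mean I wiped the slave's local copy of the CIB, roughly as below. The paths may differ on other installs, so treat them as a guess at where the CIB lives:

/etc/init.d/corosync stop
rm -f /var/lib/heartbeat/crm/cib*
/etc/init.d/corosync start

Obviously a sledgehammer, hence this mail.)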
Here are the errors I get on the slave when the fs_BLAH resource tries
to start with a missing mountpoint:
Failed actions:
fs_BLAH_start_0 (node=X, call=X, status=complete): not installed
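(That "Failed actions" section is from the one-shot status output; I'm checking it with something like:

crm_mon -1 -f
# -1 = run once and exit, -f = include fail counts

so the failed start is clearly being recorded against the slave.)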
The logs tell me that the mountpoint doesn't exist, so I create it
and try to tell Pacemaker:
crm(live)resource# start gr_GROUPNAME
Multiple attributes match name=target-role
(group members listed here)
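My guess is that the group and its members have each ended up with their own target-role meta attribute, so crm doesn't know which one to update. I assume the cleanup would be something like the following (syntax from memory, so it may not be exact):

# show the group and any meta attributes set on it or its members
crm configure show gr_GROUPNAME
# drop the stray target-role from an individual member
crm_resource --resource fs_BLAH --delete-parameter target-role --meta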
Okay, so starting the group doesn't work. I try to start the filesystem member directly:
crm(live)resource# start fs_BLAH
I expected that to work, but it didn't.
Failover and failback don't work either.
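My best guess at "telling Pacemaker" is that the failed start is still on record and needs to be cleared before the cluster will try again, something like:

# clear the failure history for the filesystem resource
crm resource cleanup fs_BLAH
# or, outside the crm shell:
crm_resource -C -r fs_BLAH

but I don't know if that's the whole story, hence asking here.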
What am I doing wrong here? This seems like a fairly sensible thing to try...
Thanks
J