[Pacemaker] Understanding the LRM

Jesse W. Hathaway jesse at mbuki-mvuki.org
Tue Sep 9 11:43:42 EDT 2008


I am using a modified version of this nagios check_crm script
to monitor my cluster:

  http://article.gmane.org/gmane.linux.highavailability.user/21849

It works fairly well by just parsing the output of crm_mon.

However, it does not report the failcount of resources, since
crm_mon does not show it.

I wrote a Ruby script that processes the cib.xml directly and
retrieves the failcounts. This works well, but I would also like to
reproduce the information from crm_mon, namely which services are
started and on which node they are running.
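
A minimal sketch of this kind of failcount lookup (not the actual
script) is shown here. It assumes that failcounts appear in the status
section as nvpair elements named fail-count-<resource> under each
node_state; the sample XML, the node name "node1", and the resource
name "apache" are made up for illustration, and the exact layout may
differ between versions.

```ruby
require 'rexml/document'

# Hypothetical excerpt of a CIB; ids, node and resource names are invented.
SAMPLE_CIB = <<~XML
  <cib>
    <status>
      <node_state id="node1-uuid" uname="node1">
        <transient_attributes id="node1-uuid">
          <instance_attributes id="status-node1-uuid">
            <nvpair id="status-node1-fc" name="fail-count-apache" value="2"/>
            <nvpair id="status-node1-probe" name="probe_complete" value="true"/>
          </instance_attributes>
        </transient_attributes>
      </node_state>
    </status>
  </cib>
XML

# Returns { node_name => { resource_name => failcount } }.
# Assumption: failcounts are nvpairs named "fail-count-<resource>"
# somewhere below each node_state element in the status section.
def failcounts(cib_xml)
  doc = REXML::Document.new(cib_xml)
  result = Hash.new { |hash, node| hash[node] = {} }
  REXML::XPath.each(doc, '//status/node_state') do |node_state|
    node = node_state.attributes['uname']
    REXML::XPath.each(node_state, './/nvpair') do |nvpair|
      name = nvpair.attributes['name'].to_s
      next unless name.start_with?('fail-count-')
      resource = name.sub('fail-count-', '')
      result[node][resource] = nvpair.attributes['value'].to_i
    end
  end
  result
end

p failcounts(SAMPLE_CIB)
```

On a real cluster the XML string would come from the live CIB or from
the cib.xml file instead of the embedded sample.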

How do I get that information from the cib.xml? I looked at the 
crm.dtd but it didn't help me too much. I also tried grokking the
source for crm_mon.c but I didn't get too far.

Can anyone provide pointers to code or documentation about how to 
extract the information that crm_mon displays from the cib.xml?

thanks, Jesse



