[Pacemaker] How to send email-notification on failure of resource in cluster frame work

Tue Mar 29 08:19:21 EDT 2011

On Mar 29, 2011 6:12 AM, "Michael Schwartzkopff" <misch at clusterbau.com> wrote:
>
> > On Tue, Mar 29, 2011 at 3:29 AM, Vadym Chepkov <vchepkov at gmail.com> wrote:
> > > On Mar 24, 2011, at 12:46 AM, Rakesh K wrote:
> > >> Hi ALL
> > >> Is there any way to send email notifications when a resource fails
> > >> in the cluster framework?
> > >>
> > >> While going through the Pacemaker Explained document provided on
> > >> the website www.clusterlabs.org, I found that chapter 7, which covers
> > >> sending email notification events, has no content.
> > >>
> > >> Can anybody help me with this?
> > >>
> > >> For now I am using crm_mon --daemonize --as-html <path to file>
> > >> to maintain the HA status in an HTML file.
> > >>
> > >> Is there any other approach for sending email notifications?
> > >
> > > Last time I checked, crm_mon is not well suited for this purpose.
> > >
> > > crm_mon has the following option
> > >
> > >       -T, --mail-to=value
> > >              Send mail alerts to this user. See also
> > >              --mail-from, --mail-host, --mail-prefix
> > >
> > > But you will end up with an obscene amount of e-mails; I was blocked
> > > from Gmail when I tried to use it once :) For one resource failure you
> > > will get 4 e-mails: monitor, stop, start, monitor. Now imagine if it was
> > > the most significant member of a group or, worse, a node failure...
> > >
> > > Nagios would be better suited for this purpose but, unfortunately,
> > > crm_mon has been broken
> > > (http://developerbugs.linux-foundation.org/show_bug.cgi?id=2344) for
> > > quite a while.
> >
> > The fix is going to have to come from the community; I don't have any
> > knowledge of Nagios.
> >
> > > I have yet to find a good monitoring solution for Pacemaker; hopefully
> > > somebody has had more success and will share.
>
> Use SNMP. It is the standard protocol for monitoring. Add an "extend" line to
> your snmpd.conf to call a script that returns the number of failcounts. You
> can easily monitor this with any NMS. For Nagios use check_snmp.
>

I'm afraid it won't be able to tell more than "stuff happened" :(
Would it?

Vadym
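
For illustration only, here is a rough sketch of the kind of script the
"extend" suggestion above implies. It is not taken from the thread. It
assumes the script is handed text containing lines of the form
"<resource>: ... fail-count=<N>", as in the fail-count summary crm_mon can
print; that input format, the SAMPLE text, and the names parse_failcounts
and summarize are all assumptions made for this example. Besides the total,
it reports which resources failed, which may go some way toward saying more
than "stuff happened".

```python
import re

# Assumed line shape: "<resource>: ... fail-count=<N>". This is a guess at the
# fail-count summary format; adjust the pattern to what your crm_mon prints.
FAILCOUNT_RE = re.compile(
    r"^\s*\*?\s*(?P<rsc>[\w.:-]+):.*?\bfail-count=(?P<count>\d+)"
)

# Hypothetical sample input, used only to demonstrate the parsing.
SAMPLE = """\
Migration summary:
* Node node1:
   p_ip: migration-threshold=1000000 fail-count=2 last-failure='Tue Mar 29 08:00:00 2011'
   p_db: migration-threshold=1000000 fail-count=0
* Node node2:
   p_ip: migration-threshold=1000000 fail-count=1
"""


def parse_failcounts(text):
    """Return {resource: failcount}, summed over all nodes in the text."""
    counts = {}
    for line in text.splitlines():
        match = FAILCOUNT_RE.match(line)
        if match:
            rsc = match.group("rsc")
            counts[rsc] = counts.get(rsc, 0) + int(match.group("count"))
    return counts


def summarize(counts):
    """Return (total failcount, 'rsc=N, ...' for resources with failures)."""
    total = sum(counts.values())
    failed = ", ".join(
        f"{rsc}={count}" for rsc, count in sorted(counts.items()) if count > 0
    )
    return total, failed or "none"


if __name__ == "__main__":
    total, detail = summarize(parse_failcounts(SAMPLE))
    print(total)   # first line: a single integer an NMS can threshold on
    print(detail)  # second line: which resources contributed
```

To use something like this with snmpd, the script would read the real
crm_mon output instead of SAMPLE and be referenced from an snmpd.conf line of
the form "extend <name> <path to script>", where both the name and the path
are yours to choose.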