[Pacemaker] operative tasks for a pacemaker cluster

Wed Apr 13 06:29:05 UTC 2011

On Wed, Apr 13, 2011 at 8:23 AM, Tim Serong <tserong at novell.com> wrote:
> On 4/13/2011 at 02:04 AM, mark - pacemaker list <m+pacemaker at nerdish.us> wrote:
>> Hello,
>>
>> On Mon, Apr 11, 2011 at 11:11 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
>> > On Mon, Apr 11, 2011 at 2:48 PM, Klaus Darilion
>> > <klaus.mailinglists at pernau.at> wrote:
>> >>
>> >> Recently I got hit by running out of inodes due to too many files in
>> >> /var/lib/pengine.
>> >
>> > man pengine
>> >
>> > look for "-series-max"
>>
>> There is no pengine man page in the packages (pacemaker, heartbeat, or
>> corosync) from the EPEL repo, nor online with the other online
>> manpages at clusterlabs.  Am I missing it someplace?  I want to read
>> about this as I have just under 7000 files in /var/lib/pengine on a
>> node that has 7 days of uptime.  Will this grow unchecked, or do older
>> files eventually get cleaned up?
>
> Not sure what's up with the EPEL packaging, sorry.  The relevant bit of
> that manpage is:
>
>       pe-error-series-max = integer [-1]
>           The number of PE inputs resulting in ERRORs to save
>
>           Zero to disable, -1 to store unlimited.
>
>       pe-warn-series-max = integer [-1]
>           The number of PE inputs resulting in WARNINGs to save
>
>           Zero to disable, -1 to store unlimited.
>
>       pe-input-series-max = integer [-1]
>           The number of other PE inputs to save

This seems excessive.  I might change this to 5000 - that's enough to
get us through 500 iterations of CTS and so should suffice IRL.
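
Until the default changes, the limit can be set explicitly as a cluster option. Below is a minimal sketch that only builds the command line and does not run it. It assumes crm_attribute accepts -t (configuration section), -n (option name) and -v (value) on your version, and that these options live in crm_config. The helper name series_limit_command is made up for illustration, so check your local crm_attribute help before using any of it.

```python
# Option names as quoted from the manpage excerpt above.
SERIES_OPTIONS = (
    "pe-error-series-max",
    "pe-warn-series-max",
    "pe-input-series-max",
)


def series_limit_command(option, limit):
    """Build (but do not execute) a command that sets one series limit.

    Hypothetical helper: the crm_attribute flags used here are an
    assumption about the installed version, not taken from this thread.
    """
    if option not in SERIES_OPTIONS:
        raise ValueError(f"unknown option: {option}")
    if limit < -1:
        raise ValueError("limit must be -1 (unlimited), 0 (disabled) or positive")
    return ["crm_attribute", "-t", "crm_config", "-n", option, "-v", str(limit)]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(" ".join(series_limit_command("pe-input-series-max", 5000)))
```

Printing rather than executing keeps the sketch harmless on a live cluster; paste the output into a shell once you have confirmed the flags.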

>
>           Zero to disable, -1 to store unlimited.
>
> So, yeah, by default unless you specifically limit it, it'll just keep
> saving 'em.  They're invaluable for debugging failures, BTW.
>
> Were those 7000 pe-inputs all created over that 7 day period?  Because
> that's a transition every 1.44 minutes.  Is it just me, or does that
> sound like a rather busy cluster?

Indeed.
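
For anyone checking the numbers, here is a rough sketch of the arithmetic behind the 1.44 minute figure, plus how much history a 5000-file cap would retain at the same rate. It assumes every file in /var/lib/pengine corresponds to one transition and that all of them belong to a single series, which the thread does not confirm. The function names are illustrative only.

```python
def minutes_per_transition(days, file_count):
    """Average minutes between saved PE inputs over the given period."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return days * 24 * 60 / file_count


def days_of_history(limit, days, file_count):
    """Days of PE inputs a cap of `limit` files retains at the observed rate."""
    if days <= 0 or file_count <= 0:
        raise ValueError("days and file_count must be positive")
    files_per_day = file_count / days
    return limit / files_per_day


if __name__ == "__main__":
    # 7 days of uptime and roughly 7000 files, as reported above.
    print(f"{minutes_per_transition(7, 7000):.2f} minutes per transition")
    print(f"{days_of_history(5000, 7, 7000):.1f} days retained with a cap of 5000")
```

At roughly 1000 files per day, a cap of 5000 would keep about five days of inputs on a cluster this busy.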