[Pacemaker] Help With Cluster Failure
Andrew Beekhof
andrew at beekhof.net
Mon Apr 11 08:24:38 UTC 2011
On Fri, Apr 8, 2011 at 2:50 PM, <Darren.Mansell at opengi.co.uk> wrote:
> -----Original Message-----
> From: Andrew Beekhof [mailto:andrew at beekhof.net]
> Sent: 08 April 2011 08:15
> To: The Pacemaker cluster resource manager
> Cc: Darren Mansell
> Subject: Re: [Pacemaker] Help With Cluster Failure
>
> On Thu, Apr 7, 2011 at 12:12 PM, <Darren.Mansell at opengi.co.uk> wrote:
>> Hi all.
>>
>> One of my clusters had a STONITH shoot-out last night and then refused
>> to do anything but sit there from 0400 until 0735 after I'd been woken
>> up to fix it.
>>
>> In the end, just a resource cleanup fixed it, which I don't think
>> should be the case.
>>
>> I have an 8MB hb_report file. Is that too big to attach to send here?
>> Should I upload it somewhere?
>
> Is there somewhere you can put it and send us a URL?
>
> Absolutely. Thanks Andrew.
>
> www.mysqlsimplecluster.com/HB_report/DM_report_1.tar.bz2
>
> Darren
>
One big problem: there are no logs in there.
But this looks ominous:
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: STONITH-1 (stonith:external/ibmrsa-telnet) Started
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 0 : OGG-ACTIVEQUOTE-02
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 1 : OGG-ACTIVEQUOTE-03
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: STONITH-2 (stonith:external/ibmrsa-telnet): Started OGG-ACTIVEQUOTE-01
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: STONITH-3 (stonith:external/ibmrsa-telnet) Started
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 0 : OGG-ACTIVEQUOTE-01
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 1 : OGG-ACTIVEQUOTE-02
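
For anyone reading along: the worrying part of that output appears to be that STONITH-1 and STONITH-3 are each followed by two numbered node lines, which suggests each is active on two nodes at once, while STONITH-2 is started on a single node. Here is a rough Python sketch for spotting that pattern in a longer log. It is not part of Pacemaker; the function names and regular expressions are my own, and it assumes the lines are shaped exactly like the excerpt above.

```python
import re
from collections import defaultdict

# Sample taken from the ptest output quoted above (wrapped as in the mail).
SAMPLE = """\
ptest[4891]: 2011/04/11_10:17:45 notice: native_print:
STONITH-1 (stonith:external/ibmrsa-telnet) Started
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 0 : OGG-ACTIVEQUOTE-02
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 1 : OGG-ACTIVEQUOTE-03
ptest[4891]: 2011/04/11_10:17:45 notice: native_print:
STONITH-2 (stonith:external/ibmrsa-telnet): Started OGG-ACTIVEQUOTE-01
ptest[4891]: 2011/04/11_10:17:45 notice: native_print:
STONITH-3 (stonith:external/ibmrsa-telnet) Started
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 0 : OGG-ACTIVEQUOTE-01
ptest[4891]: 2011/04/11_10:17:45 notice: native_print: 1 : OGG-ACTIVEQUOTE-02
"""

# Assumed line shapes, inferred only from the excerpt above.
PREFIX = re.compile(r"^ptest\[\d+\]: \S+ notice: native_print:\s*")
RESOURCE = re.compile(
    r"^(?P<name>\S+)\s+\((?P<agent>[^)]+)\):?\s+Started(?:\s+(?P<node>\S+))?$"
)
NODE = re.compile(r"^\d+ : (?P<node>\S+)$")


def unwrap(text):
    """Rejoin mail-wrapped lines: anything without a ptest prefix continues the previous line."""
    joined = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("ptest[") or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line
    return joined


def nodes_per_resource(text):
    """Map each resource name to the list of nodes it is reported as started on."""
    placement = defaultdict(list)
    current = None
    for line in unwrap(text):
        body = PREFIX.sub("", line)
        match = RESOURCE.match(body)
        if match:
            current = match.group("name")
            placement[current]  # make sure the resource is listed even with no node yet
            if match.group("node"):
                placement[current].append(match.group("node"))
            continue
        match = NODE.match(body)
        if match and current is not None:
            # A numbered "N : node" line belongs to the most recent resource line.
            placement[current].append(match.group("node"))
    return dict(placement)


def multiply_active(text):
    """Return only the resources reported on more than one node."""
    return {name: nodes for name, nodes in nodes_per_resource(text).items() if len(nodes) > 1}


if __name__ == "__main__":
    for name, nodes in multiply_active(SAMPLE).items():
        print(f"{name} reported on {len(nodes)} nodes: {', '.join(nodes)}")
```

Run against the excerpt, this flags STONITH-1 (OGG-ACTIVEQUOTE-02, OGG-ACTIVEQUOTE-03) and STONITH-3 (OGG-ACTIVEQUOTE-01, OGG-ACTIVEQUOTE-02) and leaves STONITH-2 alone. It only reads text that is already in the report, so it is no substitute for the missing logs.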