[Pacemaker] Pacemaker/corosync freeze

Attila Megyeri amegyeri at minerva-soft.com
Tue Mar 18 03:03:48 EDT 2014


Hello,

> -----Original Message-----
> From: Andrew Beekhof [mailto:andrew at beekhof.net]
> Sent: Tuesday, March 18, 2014 2:43 AM
> To: Attila Megyeri
> Cc: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Pacemaker/corosync freeze
> 
> 
> On 13 Mar 2014, at 11:44 pm, Attila Megyeri <amegyeri at minerva-soft.com>
> wrote:
> 
> > Hello,
> >
> >> -----Original Message-----
> >> From: Jan Friesse [mailto:jfriesse at redhat.com]
> >> Sent: Thursday, March 13, 2014 10:03 AM
> >> To: The Pacemaker cluster resource manager
> >> Subject: Re: [Pacemaker] Pacemaker/corosync freeze
> >>
> >> ...
> >>
> >>>>>>
> >>>>>> Also can you please try to set debug: on in corosync.conf and
> >>>>>> paste full corosync.log then?
> >>>>>
> >>>>> I set debug to on, and did a few restarts but could not reproduce
> >>>>> the issue yet - will post the logs as soon as I manage to reproduce.
> >>>>>
> >>>>
> >>>> Perfect.
> >>>>
> >>>> Another option you can try to set is netmtu (1200 is usually safe).
> >>>
> >>> Finally I was able to reproduce the issue.
> >>> I restarted node ctsip2 at 21:10:14, and CPU went 100% immediately
> >>> (not when node was up again).
> >>>
> >>> The corosync log with debug on is available at:
> >>> http://pastebin.com/kTpDqqtm
> >>>
> >>>
> >>> To be honest, I had to wait much longer for this reproduction than
> >>> before, even though there was no change in the corosync configuration -
> >>> just potentially some system updates. But anyway, the issue is
> >>> unfortunately still there.
> >>> Previously, when this issue came, cpu was at 100% on all nodes -
> >>> this time only on ctmgr, which was the DC...
> >>>
> >>> I hope you can find some useful details in the log.
> >>>
> >>
> >> Attila,
> >> what seems to be interesting is
> >>
> >> Configuration ERRORs found during PE processing.  Please run
> >> "crm_verify -L" to identify issues.
> >>
> >> I'm unsure how much of a problem this is, but I'm really not a pacemaker expert.
> >
> > Perhaps Andrew could comment on that. Any idea?
> 
> Did you run the command?  What did it say?

Yes, all was fine. This is why I found it strange.





