[ClusterLabs] Corosync main process was not scheduled for 115935.2266 ms (threshold is 800.0000 ms). Consider token timeout increase.
Jan Friesse
jfriesse at redhat.com
Fri Feb 19 10:11:30 UTC 2016
> Jan, thank you for the answer. I still have a few questions.
>
> Also, FYI, this happened on a cluster with just one node.
> The cluster is designed as a two-node cluster, with the possibility to work
> even with only one node.
>
>>> and if corosync was not scheduled for more than token timeout "Process
>>> pause detected for ..." message is displayed and new membership is formed.
> That means the other nodes will call STONITH to fence that node. Right?
> Although it didn't happen that way here, because there was no other
> "online" node in the cluster.
> And because I didn't configure my fencing device with a node list, it was
> still called by the cluster.
> And the call was successful because of the logic: the other node was not
> connected to the physical STONITH device, which equals a "successful
> STONITH".
> And then Pacemaker's logic kicked in and it shut itself down (the "crit:"
> message in the log).
>
>>> There is really no help. It's best to make sure corosync is scheduled
> regularly.
> I may sound silly, but how can I do it?
It's actually very hard to say. Pauses like 30 seconds are really unusual and
shouldn't happen (especially with RT scheduling). They usually happen on a VM
whose host is overcommitted, which I don't think is your case. Another
possible cause is thrashing (which in theory may be what happened here, given
the "Out of memory: Kill process 8674 ..." message).
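If you want to rule out the scheduling side, a couple of quick checks with
standard tools (nothing corosync-specific; adjust to your setup):

    # Is corosync running with a realtime policy (SCHED_RR) and at what priority?
    chrt -p $(pidof corosync)

    # Is the box swapping / under memory pressure while the pauses happen?
    free -m
    vmstat 1 10

If chrt reports SCHED_OTHER, corosync competes with everything else for CPU
under load; see the note about COROSYNC_OPTIONS="-r" further down in the
thread.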
>
> I did:
> # grep -n --color "Process pause detected for\|Corosync main process was
> not scheduled for\|invoked oom-killer\|kernel: Out of memory: Kill
> process\|kernel: Killed process\|RA:\|error:\|fence\|STONITH"
> B5-2U-205-LS.log >B5-2U-205-LS.log.cluster_excerpt
>
> and attached it to this email.
>
>
> I see a complete disaster in the syslog.
> Correct me if I am wrong, but here I will try to analyze what happened to
> the cluster:
>
> At some point, when the system had already been working under really high
> load for about 16 hours, Corosync started to report "Corosync main process
> was not scheduled for ...",
> which means that Corosync wasn't scheduled by the OS often enough to detect
> membership changes (token timeout).
> Then, after a few such messages appeared almost in a row, the monitor
> operations of a few resources failed.
> Question: in the log, the Resource Agent entry is shown first, and then
> "lrmd" reports a timeout problem, like this:
>
> Jan 29 07:00:19 B5-2U-205-LS diskHelper(sm0dh)[18835]: WARNING: RA:
> [monitor] : got rc=1
> Jan 29 07:00:32 B5-2U-205-LS lrmd[3012]: notice: operation_finished:
> sm0dh_monitor_30000:18803:stderr [ Failed to get properties: Connection
> timed out ]
>
> Does it mean that the monitor failed because of timeout?
>
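On the monitor timeouts: if you just want the monitors to be more tolerant
while you investigate the load itself, you can raise the operation timeouts.
A rough crmsh sketch, assuming you manage the CIB with crmsh; "sm0dh" is only
the example resource taken from your log:

    # raise the default operation timeout cluster-wide
    crm configure op_defaults timeout=120s
    # or adjust just this resource's monitor op interactively
    crm configure edit sm0dh

This only papers over the symptom; the underlying problem is still the
starved node.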
> Two minutes later Corosync started to report another message, "Process pause
> detected for ...",
> which means that Corosync was not scheduled for more than a token timeout.
> Then, five minutes later, I have this line in the log:
>
> Jan 29 07:05:54 B5-2U-205-LS kernel: stonithd invoked oom-killer:
> gfp_mask=0x201da, order=0, oom_score_adj=0
>
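On the "Consider token timeout increase" hint from the log: raising the token
timeout only makes corosync more tolerant of not being scheduled, it does not
fix the starvation itself. If you still want to try it, the value lives in
the totem section of corosync.conf; a sketch, with 5000 ms as an arbitrary
example value:

    totem {
        # keep your existing totem settings; only add/adjust the token value
        # token: time in ms to wait for the token before declaring it lost
        token: 5000
    }

The same value has to be set on all nodes and corosync restarted for it to
take effect.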
> Which I assume means "stonithd tried to allocate some memory, and the kernel
> decided to run the oom-killer because there was not enough available
> memory". Am I right here?
> Why was stonithd active at that time?
> Was it because of Corosync's "Process pause detected for ..." message?
> What was stonithd actually trying to do?
>
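On the oom-killer question: "stonithd invoked oom-killer" only means that
stonithd happened to be the process asking for memory at the moment the
kernel ran out; it says nothing about what stonithd was doing, and the victim
is the process named in the subsequent "Killed process ..." line. To see the
full picture for each OOM event, look at the kernel log around those
timestamps, for example:

    # Debian keeps kernel messages in kern.log; dmesg also works while the
    # messages are still in the ring buffer
    grep -E "invoked oom-killer|Out of memory: Kill process|Killed process" /var/log/kern.log

That shows which allocations triggered the killer and which processes were
chosen as victims.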
> Then the oom-killer kills one of the heavy processes.
> Then systemd-journal requests memory (?) and another heavy resource goes
> down, killed by the oom-killer.
>
> Both killed resources were under Pacemaker's control.
> Other processes managed by Pacemaker report monitor timeouts (?).
> Then one of them times out on its "stop" operation, so Pacemaker requests
> that the node be STONITHed.
> There is only one node in the cluster, and the only running fence resource
> is not configured properly (it doesn't have a "node-list") to kill the node
> on which it runs.
> And because there is no other node physically connected to the fencing
> device, STONITH reports success.
> Pacemaker's internal check kicks in (thank you guys!) and Pacemaker shuts
> itself down.
>
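On the "node-list" part: if you tell Pacemaker explicitly which nodes a fence
device is able to fence, stonithd will not consider that device for any other
node. A rough crmsh sketch; the agent and node names are placeholders, adjust
to your real device:

    crm configure primitive my-fence stonith:fence_XYZ \
        params pcmk_host_list="node-a node-b" pcmk_host_check="static-list" \
        op monitor interval=60s

With pcmk_host_list/pcmk_host_check set, a fence request for a node that is
not in the list cannot be "satisfied" by this device.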
>
> Please correct me if I am wrong in this log analysis. I just want to level
> up my understanding of what is happening here.
> As a general question, did this all happen because of the following?
>
> For some reason Corosync started to experience a lack of processor time
> (scheduling).
> That is why monitor operations started to time out.
> Then, after the "Process pause detected for ..." message, I assume the node
> should be STONITHed by the other node, but there is no other node, so what
> should happen in that case?
> Then, for some reason, "stonithd" triggered the "oom-killer", which killed
> one of the managed resources. Why?
> The monitor operations of all resources kept timing out continuously.
> Eventually the "stop" operation timed out for one of the resources, which is
> why Pacemaker eventually shut down.
>
> Please correct me in case I am wrong anywhere in my assumptions.
>
> Thank you for spending your precious time reading all this =)
> Hope for some help here =)
>
>
> Thank you,
> Kostia
>
> On Wed, Feb 17, 2016 at 6:47 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>
>> Kostiantyn Ponomarenko wrote:
>>
>>> Thank you for the suggestion.
>>> The OS is Debian 8. All packages were built by myself.
>>> libqb-0.17.2
>>> corosync-2.3.5
>>> cluster-glue-1.0.12
>>> pacemaker-1.1.13
>>>
>>> It is really important for me to understand what is happening with the
>>> cluster under high load.
>>>
>>
>> For Corosync it's really simple. Corosync has to be scheduled by the OS
>> regularly (more often than its current token timeout) to be able to detect
>> membership changes and send/receive messages (cpg). If it's not scheduled,
>> the membership is not up to date, and eventually, when it is finally
>> scheduled, it logs the "process was not scheduled for ... ms" message (a
>> warning for the user). If corosync was not scheduled for more than the
>> token timeout, the "Process pause detected for ..." message is displayed
>> and a new membership is formed. Other nodes (if scheduled regularly) see
>> the non-regularly-scheduled node as dead.
>>
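If you want to see the token timeout a running corosync actually uses (the
threshold in the "not scheduled for" warning is derived from it), dumping the
cmap should show it, e.g.:

    corosync-cmapctl | grep -i token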
>>> So I would appreciate any help here =)
>>>
>>
>> There is really no help. It's best to make sure corosync is scheduled
>> regularly.
>>
>>
>>>
>>> Thank you,
>>> Kostia
>>>
>>> On Wed, Feb 17, 2016 at 5:02 PM, Greg Woods <woods at ucar.edu> wrote:
>>>
>>>
>>>> On Wed, Feb 17, 2016 at 3:30 AM, Kostiantyn Ponomarenko <
>>>> konstantin.ponomarenko at gmail.com> wrote:
>>>>
>>>> Jan 29 07:00:43 B5-2U-205-LS corosync[2742]: [MAIN ] Corosync main
>>>>> process was not scheduled for 12483.7363 ms (threshold is 800.0000 ms).
>>>>> Consider token timeout increase.
>>>>>
>>>>
>>>>
>>>> I was having this problem as well. You don't say which version of
>>>> corosync you are running or on what OS, but on CentOS 7, there is an
>>>> available update that looks like it might address this (it has to do
>>>> with scheduling). We haven't gotten around to actually applying it yet
>>>> because it will require some downtime on production services (we do
>>>> have a few node-locked VMs in our cluster), and it only happens when
>>>> the system is under very high load, so I can't say for sure the update
>>>> will fix the issue, but it might be worth looking into.
>>>>
>>>> --Greg
>>>
>> This update sets round-robin realtime scheduling for corosync by default.
>> The same can be achieved without the update by editing
>> /etc/sysconfig/corosync and changing the COROSYNC_OPTIONS line to
>> something like COROSYNC_OPTIONS="-r".
>>
>> Regards,
>> Honza
>>
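For your Debian 8 build, the equivalent of /etc/sysconfig/corosync is usually
/etc/default/corosync, assuming your init script or systemd unit actually
reads it - check that before relying on it:

    # /etc/default/corosync (whether this file is read depends on your packaging)
    COROSYNC_OPTIONS="-r"

After restarting corosync, "chrt -p $(pidof corosync)" should then report a
realtime policy.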