[ClusterLabs] Antw: Antw: notice: throttle_handle_load: High CPU load detected

Kostiantyn Ponomarenko konstantin.ponomarenko at gmail.com
Mon Feb 29 14:00:22 CET 2016


I am back to this question =)

I am still trying to understand the impact of the "High CPU load detected"
messages in the log.
Looking at the code, I figured out that setting the "load-threshold"
parameter to something higher than 100% solves the problem.
In fact, for 8 cores (12 with Hyper-Threading), load-threshold=400% kind
of works.
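
As a rough illustration of that arithmetic, here is a minimal Python sketch
of how I read the check. It is not Pacemaker code. It assumes that the load
average is divided by the number of CPUs counted (12 here) and that the
"High" notice fires once that per-CPU load exceeds the threshold times a
multiplier. The multiplier value (HIGH_FACTOR), the function names, and the
choice of 12 CPUs are my assumptions and should be checked against the
throttle code in your Pacemaker version.

```python
# Illustrative model only; not taken verbatim from Pacemaker.
HIGH_FACTOR = 2.0  # assumed multiplier for the "High CPU load" notice


def parse_threshold(value: str) -> float:
    """Turn a percentage string such as '80%' into a fraction (0.8)."""
    return float(value.strip().rstrip("%")) / 100.0


def high_load_trigger(threshold: str, cores: int) -> float:
    """Load average above which the 'High' notice would fire in this model."""
    return HIGH_FACTOR * parse_threshold(threshold) * cores


def is_high(load_avg: float, threshold: str, cores: int) -> bool:
    """True if the per-CPU load exceeds HIGH_FACTOR times the threshold."""
    return load_avg / cores > HIGH_FACTOR * parse_threshold(threshold)


if __name__ == "__main__":
    # Compare the default threshold with the two values discussed here.
    for threshold in ("80%", "100%", "400%"):
        print(threshold, high_load_trigger(threshold, cores=12))
```

Under these assumptions, raising the threshold only moves the load average
at which the notice appears; it does not reduce the load itself.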

I also noticed that this parameter may affect "the maximum number of jobs
that can be scheduled per node", since there is a formula that limits
F_CRM_THROTTLE_MAX based on F_CRM_THROTTLE_MODE.
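
A rough Python sketch of what such a formula could look like follows. The
mode names, the divisors (a quarter and a half of the maximum), and the
function name are assumptions based on my reading of the code, not
confirmed values. The only point is that a more severe throttle mode lowers
the number of jobs allowed per node, but never below one.

```python
from enum import Enum


class ThrottleMode(Enum):
    # Hypothetical mode names standing in for the F_CRM_THROTTLE_MODE values.
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    EXTREME = 4


def job_limit(mode: ThrottleMode, job_max: int) -> int:
    """Jobs allowed on a node for a given throttle mode (assumed divisors)."""
    if mode in (ThrottleMode.HIGH, ThrottleMode.EXTREME):
        return 1  # at least one job is always allowed
    if mode is ThrottleMode.MEDIUM:
        return max(1, job_max // 4)
    if mode is ThrottleMode.LOW:
        return max(1, job_max // 2)
    return max(1, job_max)


if __name__ == "__main__":
    # Show how the limit shrinks as the mode becomes more severe.
    for mode in ThrottleMode:
        print(mode.name, job_limit(mode, job_max=16))
```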

Is my understanding correct that setting "load-threshold" high enough (so
that there are no noisy messages) will affect only "throttle_job_max" and
nothing more?
Also, if I got it right, "throttle_job_max" is the number of parallel
actions allowed per node in lrmd, and a child of the lrmd is actually an RA
process running some action (monitor, start, etc.).

So there is no impact on how many RAs (resources) can run on a node, only
on how many of them Pacemaker will operate on in parallel (I am not sure I
understand this part correctly).

Thank you,
Kostia

On Wed, Jun 3, 2015 at 12:17 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> > On 27 May 2015, at 10:09 pm, Kostiantyn Ponomarenko <
> konstantin.ponomarenko at gmail.com> wrote:
> >
> > I think I wasn't precise in my questions.
> > So I will try to ask more precise questions.
> > 1. Why is the default value for "load-threshold" 80%?
>
> Experimentation showed it better to begin throttling before the node
> became saturated.
>
> > 2. What would be the impact on the cluster in case of
> > "load-threshold=100%"?
>
> Your nodes will be busier.  Will they be able to handle your load or will
> it result in additional recovery actions (creating more load and more
> failures)?  Only you will know when you try.
>
> >
> > Thank you,
> > Kostya
> >
> > On Mon, May 25, 2015 at 4:11 PM, Kostiantyn Ponomarenko <
> konstantin.ponomarenko at gmail.com> wrote:
> > Guys, please, if anyone can help me understand this parameter better,
> > I would appreciate it.
> >
> >
> > Thank you,
> > Kostya
> >
> > On Fri, May 22, 2015 at 4:15 PM, Kostiantyn Ponomarenko <
> konstantin.ponomarenko at gmail.com> wrote:
> > Another question: is measuring CPU usage by "I/O wait" specific to crmd?
> > And if I need to get the most performance out of the resources running
> > in the cluster, should I set "load-threshold=95%" (or even 100%)?
> > Will it impact the cluster behavior in any way?
> > The man page for crmd says: "The cluster will slow down its recovery
> > process when the amount of system resources used (currently CPU)
> > approaches this limit".
> > Does it mean there will be delays in the cluster when moving resources
> > in case a node goes down, or something else?
> > I just want to understand it better.
> >
> > Thank you in advance for the help =)
> >
> > P.S.: The main resource does a lot of disk I/O.
> >
> >
> > Thank you,
> > Kostya
> >
> > On Fri, May 22, 2015 at 3:30 PM, Kostiantyn Ponomarenko <
> konstantin.ponomarenko at gmail.com> wrote:
> > I didn't know that.
> > You mentioned "as opposed to other Linuxes", but I am using Debian Linux.
> > Does it also measure CPU usage by I/O waits?
> > You are right about "I/O waits" (a screenshot of "top" is attached).
> > But why does it show 50% CPU usage for a single process (the main one)
> > while "I/O waits" shows a bigger number?
> >
> >
> > Thank you,
> > Kostya
> >
> > On Fri, May 22, 2015 at 9:40 AM, Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> > >>> "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de> wrote on
> > 22.05.2015 at 08:36 in message
> > <555EEA72020000A10001A71D at gwsmtp1.uni-regensburg.de>:
> > > Hi!
> > >
> > > I Linux I/O waits are considered for load (as opposed to other
> Linuxes) Thus
> > ^^ "In"
>                             s/Linux/UNIX/
> >
> > (I should have my coffee now to awake ;-) Sorry.
> >
> >
> >
> >
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>