[Pacemaker] The larger cluster is tested.
Andrew Beekhof
andrew at beekhof.net
Mon Nov 11 23:03:20 UTC 2013
On 11 Nov 2013, at 11:48 pm, yusuke iida <yusk.iida at gmail.com> wrote:
> I also checked execution of the graph.
> Partway through, the number of pending jobs is capped at 16, so I
> conclude that batch-limit is taking effect.
> However, even while jobs are restricted by batch-limit, two or more
> jobs are always fired within one second.
> As these jobs complete and return results, they generate CIB
> synchronization messages.
> A node that keeps receiving synchronization messages processes them
> preferentially and postpones internal IPC messages.
> I think that is what caused the timeout.
What load-threshold were you running this with?
I see this in the logs:
"Host vm10 supports a maximum of 4 jobs and throttle mode 0100. New job limit is 1"
Have you set LRMD_MAX_CHILDREN=4 on these nodes?
I wouldn't recommend that for a single-core VM. I'd let the default of 2*cores be used.
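
To make the relationship between the maximum job count and the reported "New job limit" concrete, here is a minimal sketch. The enum values, the function names, and the quarter/half scaling for the medium and low modes are assumptions made for illustration, not the actual crmd code. The only points taken from this thread are the 2*cores default and the log line in which a maximum of 4 jobs at mode 0100 yields a limit of 1.

```c
#include <assert.h>

/* Assumed throttle mode flags; 0x0100 is taken to be the mode
 * printed as "0100" in the log line quoted above. */
enum sketch_throttle_mode {
    SKETCH_THROTTLE_NONE    = 0x0000,
    SKETCH_THROTTLE_LOW     = 0x0001,
    SKETCH_THROTTLE_MED     = 0x0010,
    SKETCH_THROTTLE_HIGH    = 0x0100,
    SKETCH_THROTTLE_EXTREME = 0x1000
};

/* Default maximum when LRMD_MAX_CHILDREN is not set: 2 * cores. */
int sketch_default_max_jobs(int cores)
{
    assert(cores >= 1);
    return 2 * cores;
}

/* Hypothetical mapping from throttle mode to the allowed job count. */
int sketch_job_limit(enum sketch_throttle_mode mode, int max_jobs)
{
    int limit;

    assert(max_jobs >= 1);
    switch (mode) {
    case SKETCH_THROTTLE_EXTREME:
    case SKETCH_THROTTLE_HIGH:
        limit = 1;              /* matches "New job limit is 1" */
        break;
    case SKETCH_THROTTLE_MED:
        limit = max_jobs / 4;   /* assumed scaling */
        break;
    case SKETCH_THROTTLE_LOW:
        limit = max_jobs / 2;   /* assumed scaling */
        break;
    default:
        limit = max_jobs;       /* not throttled */
        break;
    }
    return (limit < 1) ? 1 : limit;  /* always allow at least one job */
}
```

Under these assumptions, a single-core VM left at the default would advertise a maximum of 2 jobs rather than 4, so an unthrottled node would run fewer jobs at once in the first place.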
Also, I'm not seeing "Extreme CIB load detected". Are these still single-core machines?
If so, it would suggest that something about:
    if(cores == 1) {
        cib_max_cpu = 0.4;
    }
    if(throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
        cib_max_cpu = throttle_load_target;
    }
    if(load > 1.5 * cib_max_cpu) {
        /* Can only happen on machines with a low number of cores */
        crm_notice("Extreme %s detected: %f", desc, load);
        mode |= throttle_extreme;
    }
is wrong.
What was load-threshold configured as?
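
For reference, the check quoted above can be exercised in isolation with a small standalone sketch. The function names and the 0.95 default ceiling for multi-core hosts are assumptions added for illustration. Only the single-core value of 0.4, the load-threshold override, and the 1.5 multiplier come from the excerpt.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed default CIB CPU ceiling on multi-core hosts (not from the excerpt). */
#define SKETCH_DEFAULT_CIB_MAX_CPU 0.95

/* Hypothetical helper: the load above which "Extreme ... detected" is logged. */
double sketch_extreme_threshold(int cores, double throttle_load_target)
{
    double cib_max_cpu = SKETCH_DEFAULT_CIB_MAX_CPU;

    assert(cores >= 1);
    if (cores == 1) {
        cib_max_cpu = 0.4;  /* single-core hosts get a lower ceiling */
    }
    /* A configured load-threshold can only lower the ceiling, never raise it. */
    if (throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
        cib_max_cpu = throttle_load_target;
    }
    return 1.5 * cib_max_cpu;
}

/* Hypothetical helper: would this load sample set the extreme flag? */
bool sketch_is_extreme(int cores, double throttle_load_target, double load)
{
    return load > sketch_extreme_threshold(cores, throttle_load_target);
}
```

With these assumptions, a single-core node with load-threshold at 0.8 would report extreme load once the CIB exceeds roughly 0.6, and lowering load-threshold to 0.2 would move that point to about 0.3. That is why the configured value matters for deciding whether the missing "Extreme CIB load detected" message points to a bug.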