[Pacemaker] The larger cluster is tested.

Wed Nov 6 21:08:52 UTC 2013

On 6 Nov 2013, at 4:48 pm, yusuke iida <yusk.iida at gmail.com> wrote:

> Hi, Andrew
> 
> I tested with the following version.
> https://github.com/ClusterLabs/pacemaker/commit/3492fec7fe58a6fd94071632df27d3fd3fc3ffe3
> 
> load-threshold was checked at 60%, 40%, and 20%.
> 
> However, the problem was not solved.
> The behavior did not change, and timeouts still occur.

That is extremely surprising.  I will have a look at your logs today.
How many cores do these machines have btw?

> 
> The limit on the number of jobs seems to be applied correctly.
> However, because CIB synchronization messages are sent continuously,
> the cib process handles them preferentially.
> As a result, internal IPC messages are kept waiting.
> 
> I think the priority of message processing needs to be changed in
> order to solve this problem.
> Alternatively, I think it would be effective for the DC to stop
> sending jobs while load is high.
> The accumulated messages could then be processed while job
> transmission is stopped.
> However, I expect the whole cluster would become slower in that case.
> 
> What kinds of problems could occur, and in what cases, if the
> priority is changed?
> And if you know, please tell me what I should test.
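
[Illustration] The starvation described above can be sketched with a toy model. This is not Pacemaker or GLib code: ipc_wait_ticks() and its parameters are hypothetical names, and the model assumes a loop that handles one message per tick and only serves the IPC source when no higher-priority CPG message is ready (lower number = higher priority, as in GLib).

```c
#include <assert.h>

/*
 * Toy model of a single-threaded main loop with two sources.
 * Each tick, new CPG messages arrive, then the loop handles one message
 * from the highest-priority ready source. Equal priorities are modelled
 * as "both sources get served in the same tick".
 *
 * Returns the tick at which the one waiting IPC request is handled,
 * or -1 if it is still waiting after max_ticks.
 */
long ipc_wait_ticks(long cpg_backlog, long cpg_arrivals_per_tick,
                    int cpg_prio, int ipc_prio, long max_ticks)
{
    long queue = cpg_backlog;

    assert(cpg_backlog >= 0 && cpg_arrivals_per_tick >= 0);

    for (long tick = 0; tick < max_ticks; tick++) {
        queue += cpg_arrivals_per_tick;      /* sync traffic keeps coming */

        if (queue > 0 && cpg_prio < ipc_prio) {
            queue--;                         /* CPG wins, IPC keeps waiting */
            continue;
        }
        return tick;                         /* IPC request finally served */
    }
    return -1;                               /* starved for the whole run */
}
```

In this model a fixed backlog only delays the IPC request, but a steady arrival rate of one CPG message per tick starves it indefinitely unless the two sources share a priority.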
> 
> load-threshold 60% test report
> https://drive.google.com/file/d/0BwMFJItoO-fVOHB5S1ROOUJrams/edit?usp=sharing
> load-threshold 40% test report
> https://drive.google.com/file/d/0BwMFJItoO-fVemlqVUU2QkhEMW8/edit?usp=sharing
> load-threshold 20% test report
> https://drive.google.com/file/d/0BwMFJItoO-fVTWFTU2pqOF9pcms/edit?usp=sharing
> 
> I am also sending a report from a test with the commit that changes the priority.
> https://github.com/yuusuke/pacemaker/commit/17a7cbe67c455f5f6d36a1e1bc255b4ab0039dd8
> 
> load-threshold 80% and CPG G_PRIORITY_DEFAULT test report
> https://drive.google.com/file/d/0BwMFJItoO-fVV1BoTjVQMk52WEU/edit?usp=sharing
> 
> 2013/11/6 Andrew Beekhof <andrew at beekhof.net>:
>> 
>> On 5 Nov 2013, at 12:48 pm, yusuke iida <yusk.iida at gmail.com> wrote:
>> 
>>> Hi, Andrew
>>> 
>>> I tested with this commit.
>>> https://github.com/beekhof/pacemaker/commit/145c782e432d8108ca865f994640cf5a62406363
>>> 
>>> However, the problem has not improved.
>>> It seems that CPG messages are processed preferentially because
>>> they are set to G_PRIORITY_MED.
>>> 
>>> I suggest lowering the priority of CPG instead.
>> 
>> I worry about this change.
>> It may allow ipc clients to read out of date information (the pending cpg messages almost certainly contain updates) and could result in updates being lost (because they're not being made to the latest config+status).
>> 
>> Could you try reducing the value of load-threshold? The default (80%) could be too high.
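
[Illustration] The stale-read concern above can be sketched as follows. This is a toy model, not the cib implementation: struct replica, apply_pending() and read_staleness() are hypothetical, and the starting version number is arbitrary. It assumes every pending CPG message carries exactly one update.

```c
#include <assert.h>

/* Hypothetical local copy of the configuration+status. */
struct replica {
    int version;          /* version currently applied locally */
    int pending_updates;  /* updates still queued as CPG messages */
};

/* Apply every queued cluster update to the local copy. */
static void apply_pending(struct replica *r)
{
    r->version += r->pending_updates;
    r->pending_updates = 0;
}

/*
 * Returns how many versions behind the cluster an IPC client's read is.
 * ipc_first != 0 models serving the IPC request before the pending
 * CPG messages have been applied.
 */
int read_staleness(int pending_updates, int ipc_first)
{
    struct replica r = { 10, pending_updates };
    int seen;

    assert(pending_updates >= 0);

    if (!ipc_first) {
        apply_pending(&r);   /* CPG first: the read sees the latest state */
    }
    seen = r.version;        /* what the IPC client reads */
    apply_pending(&r);       /* whatever was still queued lands afterwards */

    return r.version - seen;
}
```

A client that reads a copy several versions behind and then writes back a change based on it is the lost-update case described above.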
>> 
>>> How about this?
>>> https://github.com/yuusuke/pacemaker/commit/22a14318cc740b3043106609923f47039c3aa407
>>> 
>>> I could not find a way to lower the priority of only the cib
>>> process's CPG messages.
>>> 
>>> I collected reports from when the error occurred.
>>> Please note that processing of the IPC message is delayed, as
>>> shown below.
>>> 
>>> Nov 01 21:53:52 [9246] vm01       crmd: (cib_native.c:397   )   trace: cib_native_perform_op_delegate:  Async call, returning 32
>>> (snip)
>>> Nov 01 21:55:57 [9241] vm01        cib: ( callbacks.c:688   )    info: cib_process_request:     Forwarding cib_modify operation for section status to master (origin=local/crmd/32)
>>> 
>>> The file is large, so please download it from the following link.
>>> https://drive.google.com/file/d/0BwMFJItoO-fVWDg1Sjc2WXltUjQ/edit?usp=sharing
>>> 
>>> Regards,
>>> Yusuke
>>> 
>>> 2013/10/31 Andrew Beekhof <andrew at beekhof.net>:
>>>> 
>>>> On 29 Oct 2013, at 12:12 am, yusuke iida <yusk.iida at gmail.com> wrote:
>>>> 
>>>>> Hi, Andrew
>>>>> 
>>>>> I tested using the following commit.
>>>>> https://github.com/beekhof/pacemaker/commit/b6fa1e650f64b1ba73fdb143f41323aa8cb3544e
>>>>> 
>>>>> However, operation timeouts still occur.
>>>>> 
>>>>> I analyzed the log.
>>>>> 
>>>>> I noticed that the IPC message sent from the local crmd to cib is
>>>>> processed late.
>>>>> Does this happen because the cib process gives priority to CIB
>>>>> synchronization messages arriving from other nodes?
>>>>> 
>>>>> 
>>>>> I made the following changes to alter the priority of the messages
>>>>> that cib processes.
>>>>> With these changes, the timeout does not occur.
>>>>> 
>>>>> diff --git a/lib/cluster/cpg.c b/lib/cluster/cpg.c
>>>>> index 8522cbf..3a67998 100644
>>>>> --- a/lib/cluster/cpg.c
>>>>> +++ b/lib/cluster/cpg.c
>>>>> @@ -212,7 +212,7 @@ pcmk_cpg_dispatch(gpointer user_data)
>>>>>   int rc = 0;
>>>>>   crm_cluster_t *cluster = (crm_cluster_t*) user_data;
>>>>> 
>>>>> -    rc = cpg_dispatch(cluster->cpg_handle, CS_DISPATCH_ALL);
>>>>> +    rc = cpg_dispatch(cluster->cpg_handle, CS_DISPATCH_ONE);
>>>>>   if (rc != CS_OK) {
>>>>>       crm_err("Connection to the CPG API failed: %s (%d)", ais_error2text(rc), rc);
>>>>>       cluster->cpg_handle = 0;
>>>>> diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
>>>>> index 18a67e6..d605288 100644
>>>>> --- a/lib/common/mainloop.c
>>>>> +++ b/lib/common/mainloop.c
>>>>> @@ -482,7 +482,7 @@ gio_poll_dispatch_add(enum qb_loop_priority p, int32_t fd, int32_t evts,
>>>>>   adaptor->p = p;
>>>>>   adaptor->is_used = QB_TRUE;
>>>>>   adaptor->source =
>>>>> -        g_io_add_watch_full(channel, G_PRIORITY_DEFAULT, evts, gio_read_socket, adaptor,
>>>>> +        g_io_add_watch_full(channel, G_PRIORITY_MEDIUM, evts, gio_read_socket, adaptor,
>>>>>                           gio_poll_destroy);
>>>>> 
>>>>>   /* Now that mainloop now holds a reference to channel,
>>>>> 
>>>>> I do not know whether this fix is correct.
>>>>> Could you comment on this change?
>>>> 
>>>> The CS_DISPATCH_ONE change looks ok: https://github.com/beekhof/pacemaker/commit/6384053
>>>> Did you try with just that?  I'd like to avoid the mainloop priority change if possible.
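
[Illustration] A minimal sketch of what the CS_DISPATCH_ONE change alters, under stated assumptions. This is not corosync code: the enum values below are stand-ins for CS_DISPATCH_ONE and CS_DISPATCH_ALL, and cpg_callback() and iterations_to_drain() are hypothetical. It only counts how often control returns to the main loop while a backlog is drained. Whether other sources actually run at those points still depends on their relative priorities.

```c
#include <assert.h>

/* Stand-ins for corosync's CS_DISPATCH_ONE / CS_DISPATCH_ALL flags. */
enum dispatch_mode { DISPATCH_ONE, DISPATCH_ALL };

/* One invocation of the CPG source callback; returns messages consumed. */
static long cpg_callback(long *queue, enum dispatch_mode mode)
{
    long handled;

    if (mode == DISPATCH_ALL) {
        handled = *queue;                 /* drain everything in one call */
    } else {
        handled = (*queue > 0) ? 1 : 0;   /* handle a single message */
    }
    *queue -= handled;
    return handled;
}

/*
 * Number of main-loop iterations needed to drain the backlog. Each
 * iteration boundary is a point where the loop can pick another source.
 */
long iterations_to_drain(long backlog, enum dispatch_mode mode)
{
    long queue = backlog;
    long iterations = 0;

    assert(backlog >= 0);

    while (queue > 0) {
        cpg_callback(&queue, mode);
        iterations++;
    }
    return iterations;
}
```

With a backlog of 100 messages, the dispatch-all mode gives the loop one chance to schedule anything else, while dispatch-one gives it 100.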
>>>> 
>>>>> 
>>>>> Regards,
>>>>> Yusuke
>>>>> 
>>>>> 2013/10/20 Andrew Beekhof <andrew at beekhof.net>:
>>>>>> 
>>>>>> On 18/10/2013, at 10:12 PM, yusuke iida <yusk.iida at gmail.com> wrote:
>>>>>> 
>>>>>>> Hi, Andrew
>>>>>>> 
>>>>>>> I am now testing a configuration with one standby node and 15 active nodes.
>>>>>>> About 10 Dummy resources are started per node.
>>>>>>> 
>>>>>>> If all the nodes are started in this configuration, it takes about
>>>>>>> 20 minutes for all the resources to start.
>>>>>>> 
>>>>>>> Some resources also hit start timeouts.
>>>>>>> At start-up, probes are run on all the nodes at once.
>>>>>>> The results are written to the cib and synchronized to all the nodes.
>>>>>>> This processing generates a very high load.
>>>>>>> I think the timeouts are caused by it.
>>>>>> 
>>>>>> More than likely, yes.
>>>>>> 
>>>>>>> 
>>>>>>> I am very interested in whether this problem can be solved by
>>>>>>> using the throttle you are working on now.
>>>>>> 
>>>>>> I have been using it, and I have found it more effective than batch-limit for bounding CPU usage and avoiding timeouts.
>>>>>> I would be interested to hear your feedback if you have the time to do some testing.
>>>>>> 
>>>>>>> When is throttle due to be merged into the repository of ClusterLabs?
>>>>>> 
>>>>>> It is queued up behind a compatibility patch that is needed for some changes I made to the pacemaker-remote wire protocol.
>>>>>> 
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> 
>>>>>>> --
>>>>>>> ----------------------------------------
>>>>>>> METRO SYSTEMS CO., LTD
>>>>>>> 
>>>>>>> Yusuke Iida
>>>>>>> Mail: yusk.iida at gmail.com
>>>>>>> ----------------------------------------
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>> 
>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>> Bugs: http://bugs.clusterlabs.org