[Pacemaker] The larger cluster is tested.

Wed Oct 30 23:07:33 UTC 2013

On 29 Oct 2013, at 12:12 am, yusuke iida <yusk.iida at gmail.com> wrote:

> Hi, Andrew
> 
> I tested using the following commit.
> https://github.com/beekhof/pacemaker/commit/b6fa1e650f64b1ba73fdb143f41323aa8cb3544e
> 
> However, operation timeouts still occur.
> 
> I analyzed the log.
> 
> I noticed that IPC messages sent from the local crmd to the cib are
> processed late.
> Could this be because CIB synchronization messages arriving from other
> nodes take priority and are processed first?
> 
> 
> I made the following changes to alter the priority of the messages
> that the CIB processes.
> With these changes, the timeout does not occur.
> 
> diff --git a/lib/cluster/cpg.c b/lib/cluster/cpg.c
> index 8522cbf..3a67998 100644
> --- a/lib/cluster/cpg.c
> +++ b/lib/cluster/cpg.c
> @@ -212,7 +212,7 @@ pcmk_cpg_dispatch(gpointer user_data)
>     int rc = 0;
>     crm_cluster_t *cluster = (crm_cluster_t*) user_data;
> 
> -    rc = cpg_dispatch(cluster->cpg_handle, CS_DISPATCH_ALL);
> +    rc = cpg_dispatch(cluster->cpg_handle, CS_DISPATCH_ONE);
>     if (rc != CS_OK) {
>         crm_err("Connection to the CPG API failed: %s (%d)", ais_error2text(rc), rc);
>         cluster->cpg_handle = 0;
> diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
> index 18a67e6..d605288 100644
> --- a/lib/common/mainloop.c
> +++ b/lib/common/mainloop.c
> @@ -482,7 +482,7 @@ gio_poll_dispatch_add(enum qb_loop_priority p, int32_t fd, int32_t evts,
>     adaptor->p = p;
>     adaptor->is_used = QB_TRUE;
>     adaptor->source =
> -        g_io_add_watch_full(channel, G_PRIORITY_DEFAULT, evts, gio_read_socket, adaptor,
> +        g_io_add_watch_full(channel, G_PRIORITY_MEDIUM, evts, gio_read_socket, adaptor,
>                             gio_poll_destroy);
> 
>     /* Now that mainloop now holds a reference to channel,
> 
> I do not know whether this fix is correct.
> Could you comment on it?

The CS_DISPATCH_ONE change looks ok: https://github.com/beekhof/pacemaker/commit/6384053
Did you try with just that?  I'd like to avoid the mainloop priority change if possible.

> 
> Regards,
> Yusuke
> 
> 2013/10/20 Andrew Beekhof <andrew at beekhof.net>:
>> 
>> On 18/10/2013, at 10:12 PM, yusuke iida <yusk.iida at gmail.com> wrote:
>> 
>>> Hi, Andrew
>>> 
>>> I am now testing a configuration with one standby node and 15 active nodes.
>>> About 10 Dummy resources are started per node.
>>> 
>>> If all the nodes are started in this configuration, it takes about
>>> 20 minutes before all the resources have started.
>>> 
>>> Some resources also hit start timeouts.
>>> At start-up, probes are performed by all the nodes at once.
>>> The results are written to the cib and synchronized to all the nodes.
>>> This processing generates a very high load.
>>> I think the timeouts occur because of it.
>> 
>> More than likely, yes.
>> 
>>> 
>>> I am very interested in whether this problem can be solved by using
>>> the throttle you are working on now.
>> 
>> I have been using it, and I have found it more effective than batch-limit for bounding CPU usage and avoiding timeouts.
>> I would be interested to hear your feedback if you have the time to do some testing.
>> 
>>> When is throttle due to be merged into the repository of ClusterLabs?
>> 
>> It is queued up behind a compatibility patch that is needed for some changes I made to the pacemaker-remote wire protocol.
>> 
>>> 
>>> Best Regards,
>>> 
>>> --
>>> ----------------------------------------
>>> METRO SYSTEMS CO., LTD
>>> 
>>> Yusuke Iida
>>> Mail: yusk.iida at gmail.com
>>> ----------------------------------------
>>> 
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>> 
>> 
> 
> 
> 