[Pacemaker] Re: [PATCH] election trigger
Bernd Schubert
bs at q-leap.de
Wed Nov 5 14:26:20 CET 2008
Hello Andrew,
sorry for my late response.
On Sunday 02 November 2008 20:32:14 Andrew Beekhof wrote:
> On Oct 30, 2008, at 6:08 PM, Bernd Schubert wrote:
> > Heartbeat calls crmd only if all nodes are already online.
>
> Not everyone uses it on heartbeat anymore ;-)
I grepped the sources of openais and corosync for "KEY_INITDEAD", but can't
find anything. Are there any further solutions pacemaker supports?
>
> > So introducing another possibly huge deadtime here will at least delay
> > the DC selection, and so resource startup, by heartbeat's initial
> > deadtime. If one node, e.g. after a global power failure, doesn't come
> > up at all, the DC selection was even delayed by 2 x initial hb deadtime.
> > Simply remove the usage of heartbeat's initial deadtime and only use
> > our own.
>
> I don't understand.
> The logic below is only triggered for people who haven't set a value
> for dc_deadtime... why not just set a value in the cib?
Well, firstly, the logs didn't tell me:
"Look here, you didn't set dc_deadtime, so crm is going to use a huge useless
timeout".
Instead, on each startup of heartbeat I get hundreds of lines in syslog,
and none of them look as if they are meant for the common admin; IMHO
99% of it is developer information.
Then, after I found the code in pacemaker, I already tested setting
dc_deadtime, but during my initial test that didn't change anything. While we
need a heartbeat deadtime > 10min for Lustre installations, I set it to 180s
on my test systems.
Now, after your suggestion, I tested it again with deadtime=20min but
dc_deadtime=10s and, quite oddly, crm still needs about 3min to set the nodes
online (syslog attached). With the code removed it is only 10s.
Since openais doesn't seem to support the code below at all, and since it is
wrong when used together with heartbeat, I still think removing these lines
is right. Please correct me if I'm wrong.
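For reference, the fallback that the quoted patch removes can be sketched as a standalone C function. This is only a reading of the diff, not pacemaker code: effective_dc_deadtime_ms() and DEFAULT_DC_DEADTIME_MS are hypothetical names, the 10s default is an assumed placeholder (the real built-in default is not stated in this thread), and -1 is used here to mean "not set". Note that, as written in the diff, the comparison uses initdead / 2 but the value stored is the unhalved initdead.

```c
#include <assert.h>

/* Hypothetical stand-in for crmd's built-in dc_deadtime default;
 * the real value is not given in this thread. */
#define DEFAULT_DC_DEADTIME_MS 10000L

/*
 * Models the branch removed by the patch (a sketch, not pacemaker code).
 *   cib_ms:      dc_deadtime from the CIB in ms, or -1 if unset
 *   initdead_ms: heartbeat initdead from the environment in ms, or -1 if unset
 */
static long effective_dc_deadtime_ms(long cib_ms, long initdead_ms)
{
    assert(cib_ms >= -1 && initdead_ms >= -1);

    if (cib_ms >= 0) {
        /* An explicit CIB value means the fallback is never entered. */
        return cib_ms;
    }
    if (initdead_ms >= 0 && initdead_ms / 2 > DEFAULT_DC_DEADTIME_MS) {
        /* The diff compares initdead / 2 but stores param_val unhalved. */
        return initdead_ms;
    }
    return DEFAULT_DC_DEADTIME_MS;
}
```

Under these assumptions, effective_dc_deadtime_ms(-1, 180000) returns 180000, i.e. the full 180s initdead when dc_deadtime is unset, while effective_dc_deadtime_ms(10000, 1200000) returns 10000, because an explicit CIB value bypasses the branch entirely.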
Thanks,
Bernd
PS: Sorry, the attached syslog is still with heartbeat-2.1.4. If you think you
fixed it in pacemaker already, please point me to the commit.
>
> > Signed-off-by: Bernd Schubert <bs at q-leap.de>
> >
> > diff --git a/crmd/control.c b/crmd/control.c
> > --- a/crmd/control.c
> > +++ b/crmd/control.c
> > @@ -747,23 +747,6 @@ config_query_callback(xmlNode *msg, int
> >          output, XML_CIB_TAG_PROPSET, NULL, config_hash,
> >          CIB_OPTIONS_FIRST, FALSE, now);
> >
> > -    value = g_hash_table_lookup(config_hash, XML_CONFIG_ATTR_DC_DEADTIME);
> > -    if(value == NULL) {
> > -        /* apparently we're not allowed to free the result of getenv */
> > -        char *param_val = getenv(ENV_PREFIX "initdead");
> > -
> > -        value = crmd_pref(config_hash, XML_CONFIG_ATTR_DC_DEADTIME);
> > -        if(param_val != NULL) {
> > -            int from_env = crm_get_msec(param_val) / 2;
> > -            int from_defaults = crm_get_msec(value);
> > -            if(from_env > from_defaults) {
> > -                g_hash_table_replace(
> > -                    config_hash, crm_strdup(XML_CONFIG_ATTR_DC_DEADTIME),
> > -                    crm_strdup(param_val));
> > -            }
> > -        }
> > -    }
> > -
> >      verify_crmd_options(config_hash);
> >
> >      value = crmd_pref(config_hash, XML_CONFIG_ATTR_DC_DEADTIME);
> >
> >
> > --
> > Bernd Schubert
> > Q-Leap Networks GmbH
> >
> > _______________________________________________
> > Pacemaker mailing list
> > Pacemaker at clusterlabs.org
> > http://list.clusterlabs.org/mailman/listinfo/pacemaker
>
-- 
Bernd Schubert
Q-Leap Networks GmbH
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog.bak.gz
Type: application/x-gzip
Size: 146568 bytes
Desc: not available
Url : http://list.clusterlabs.org/pipermail/pacemaker/attachments/20081105/10ba2504/syslog.bak-0001.bin