[Pacemaker] Interval-origin in monitor operations does not work
Andrew Beekhof
andrew at beekhof.net
Fri May 2 06:55:16 UTC 2014
On 15 Apr 2014, at 4:12 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:
> Of course, I can.
> <primitive class="ocf" id="resD" provider="heartbeat" type="Dummy">
> <operations>
> <op id="resD-start-0" interval="0" name="start" timeout="20"/>
> <op id="resD-stop-0" interval="0" name="stop" timeout="20"/>
> <op id="resD-monitor-1h" interval="1h" interval-origin="00:34" name="monitor" timeout="60"/>
> </operations>
> <meta_attributes id="resD-meta_attributes">
> <nvpair id="resD-meta_attributes-failure-timeout" name="failure-timeout" value="15m"/>
> <nvpair id="resD-meta_attributes-migration-threshold" name="migration-threshold" value="3"/>
> </meta_attributes>
> </primitive>
>
> Yes, the origin is in the future, but consider the monitor configuration above.
> The monitor operation shall run every hour at 34 minutes past the hour.
> If I specified a full date in the past, pengine would have to run a number of while(rc<0) iterations in unpack_operation.
> One year after that date it would be exactly 8760 iterations, and that on every call of unpack_operation.
> That's why I specified the first possible run time of each day, so there are at most 23 iterations of the while loop.
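For scale, the catch-up cost described here can be sketched in a few lines of C. This is a simplified model under stated assumptions, not the pengine code: catch_up_steps is a hypothetical helper, all times are plain seconds on one timeline, and the comparison origin < now stands in for the rc < 0 test.

```c
#include <assert.h>

/* Hypothetical model of the catch-up loop: count how often `origin`
 * must be advanced by `interval_s` seconds until it is no longer
 * earlier than `now`. All arguments are plain seconds. */
long catch_up_steps(long long origin, long long now, long long interval_s)
{
    long steps = 0;

    assert(interval_s > 0);     /* a zero interval would never terminate */
    while (origin < now) {      /* assumed equivalent of rc < 0 */
        origin += interval_s;
        steps++;
    }
    return steps;
}
```

With an origin 365 days back and a 3600 s interval, the loop runs 8760 times. With an origin of 00:34 on the same day it runs about two dozen times at most, and not at all when now is before 00:34, which leaves the origin in the future.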
>
> If unpack_operation is called between 00:00 and 00:34, the described situation happens:
> the origin is later than now.
>
> Applying this patch will help.
It will, but as I suspected it will also cause:
iso8601 -d '2014-01-01 00:00:30Z' -D P-1D -E '2013-12-31 00:00:30Z'
to fail with:
Date: 2014-01-01 00:00:30Z
Duration: 0000-01--01 00:00:00Z
Duration ends at: 2014-01-00 00:00:30Z
which isn't right :)
I'm working on a true fix now...
> diff --git a/lib/common/iso8601.c b/lib/common/iso8601.c
> index 7dc2495..742de70 100644
> --- a/lib/common/iso8601.c
> +++ b/lib/common/iso8601.c
> @@ -1137,7 +1137,7 @@ crm_time_add_days(crm_time_t * a_time, int extra)
>          ydays = crm_time_leapyear(a_time->years) ? 366 : 365;
>      }
> -    while (a_time->days <= 0) {
> +    while (a_time->days < 0) {
>          a_time->years--;
>          a_time->days += crm_time_leapyear(a_time->years) ? 366 : 365;
>      }
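A minimal standalone sketch of why this one-character change trades one failure for another. It is not the Pacemaker implementation: struct ydate, normalize_days and minus_days are hypothetical names, only the lower-bound loop is modelled, and only the loop condition is taken from the diff above. My reading is that the routine serves both dates, where day 0 is invalid and must borrow from the previous year, and durations, where 0 days is a perfectly good value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for crm_time_t: a year and a 1-based day-of-year. */
struct ydate {
    int years;
    int days;
};

bool is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

int days_in_year(int year)
{
    return is_leap(year) ? 366 : 365;
}

/* Borrow whole years while the day count is below range.
 * strict == false uses the original `<= 0` test,
 * strict == true uses the patched `< 0` test. */
struct ydate normalize_days(struct ydate d, bool strict)
{
    while (strict ? d.days < 0 : d.days <= 0) {
        d.years--;
        d.days += days_in_year(d.years);
    }
    return d;
}

/* Subtract n days, then normalize the lower bound only. */
struct ydate minus_days(struct ydate d, int n, bool strict)
{
    assert(n >= 0);
    d.days -= n;
    return normalize_days(d, strict);
}
```

In this model, 2014 day 1 minus one day gives 2013 day 365 with the original test but stays at 2014 day 0 with the patched one, matching the 2014-01-00 output above. Conversely, a duration of zero years and zero days is turned into year -1, day 365 by the original test, which matches the -0001-12-31 delay in the original report, and is left alone by the patched one.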
>
> Rainer
> Sent: Wednesday, 09 April 2014 at 08:57
> From: "Andrew Beekhof" <andrew at beekhof.net>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Subject: Re: [Pacemaker] Interval-origin in monitor operations does not work
>
> On 1 Apr 2014, at 5:10 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:
>
> > Using interval-origin in a monitor operation definition does not work any more.
> > Verified on Pacemaker 1.1.10, but we think it has not worked since 1.1.8.
> >
> > Pengine calculates the start delay in the function unpack_operation, where it calls crm_time_subtract.
> >
> > The call to crm_time_subtract with
> > origin=2014-03-31 19:20:00Z
> > date_set->now=2014-03-31 17:31:04Z
> > results in
> > delay=-0001-12-31 01:48:56Z
> > delay_s=31456136
> > start_delay=31456136000
> > which is almost a year in the future.
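The reported numbers can be checked by hand. The intended delay is 19:20:00 minus 17:31:04, that is 01:48:56 or 6536 seconds. The reported delay_s is exactly 364 days more than that, which fits a result whose year of -1 is dropped while the 12-31 day part is still counted. The helper span_seconds below is hypothetical and only does the unit conversion; it says nothing about how crm_time_get_seconds is written.

```c
#include <assert.h>

/* Hypothetical helper: convert days/hours/minutes/seconds to seconds. */
long long span_seconds(long long days, int h, int m, int s)
{
    assert(days >= 0 && h >= 0 && m >= 0 && s >= 0);
    return days * 86400LL + h * 3600LL + m * 60LL + s;
}
```

span_seconds(0, 1, 48, 56) is 6536, the expected delay, and span_seconds(364, 1, 48, 56) is 31456136, the reported delay_s. Multiplying by 1000 for milliseconds gives the reported start_delay of 31456136000.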
>
> To be fair, the origin was also in the future.
> I don't think that was expected.
>
> Can you supply your cib so I can experiment?
>
> >
> > The function crm_time_subtract calculates this using the crm_time_add_* functions.
> >
> > The buggy statement is in crm_time_add_days.
> > If the calculated number of days is zero, it subtracts one year and adds that year's number of days, in this case 365.
> > But if a_time->days is zero, it must not do anything.
> >
> > The function crm_time_get_seconds, which is called by unpack_operation, cannot handle negative years, so it ignores the year -1 but still adds the 365 days.
> >
> > There are two solutions.
> > One is to add handling of negative years to crm_time_get_seconds.
> > The other is to replace line 1140 in iso8601.c
> > while (a_time->days <= 0) {
> > with
> > while (a_time->days < 0) {
> >
> > The second solution is verified to give the expected result, a start-delay of a little less than two hours.
> > Handling negative years in crm_time_get_seconds might not be a proper solution: the function's return type is unsigned long long, so it is unclear what to report if the complete calculation yields a negative number of seconds.
> >
> > Rainer
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>