[Pacemaker] Interval-origin in monitor operations does not work

Andrew Beekhof andrew at beekhof.net
Fri May 2 02:55:16 EDT 2014


On 15 Apr 2014, at 4:12 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:

> Of course, I can.
>       <primitive class="ocf" id="resD" provider="heartbeat" type="Dummy">
>         <operations>
>           <op id="resD-start-0" interval="0" name="start" timeout="20"/>
>           <op id="resD-stop-0" interval="0" name="stop" timeout="20"/>
>           <op id="resD-monitor-1h" interval="1h" interval-origin="00:34" name="monitor" timeout="60"/>
>         </operations>
>         <meta_attributes id="resD-meta_attributes">
>           <nvpair id="resD-meta_attributes-failure-timeout" name="failure-timeout" value="15m"/>
>           <nvpair id="resD-meta_attributes-migration-threshold" name="migration-threshold" value="3"/>
>         </meta_attributes>
>       </primitive>
>  
> Yes, the origin is in the future, but consider the monitor configuration above.
> The monitor operation shall run every hour at 34 minutes past the hour.
> If I specified a full date in the past, pengine would have to run a number of while(rc<0) loop iterations in unpack_operation.
> One year after that date it would be exactly 8760 iterations (365 days x 24 hours), and this for every call of unpack_operation.
> That's why I specified the first possible run time of each day, so that there is a maximum of 23 loop iterations.
>  
> If unpack_operation is called between 00:00 and 00:34 the described situation happens.
> Origin is later than now.
>  
> Applying this patch will help.

It will, but as I suspected it will also cause:

  iso8601 -d '2014-01-01 00:00:30Z' -D P-1D -E '2013-12-31 00:00:30Z'

to fail with:

Date: 2014-01-01 00:00:30Z
Duration: 0000-01--01 00:00:00Z
Duration ends at: 2014-01-00 00:00:30Z

which isn't right :)

I'm working on a true fix now...


> diff --git a/lib/common/iso8601.c b/lib/common/iso8601.c
> index 7dc2495..742de70 100644
> --- a/lib/common/iso8601.c
> +++ b/lib/common/iso8601.c
> @@ -1137,7 +1137,7 @@ crm_time_add_days(crm_time_t * a_time, int extra)
>          ydays = crm_time_leapyear(a_time->years) ? 366 : 365;
>      }
> -    while (a_time->days <= 0) {
> +    while (a_time->days < 0) {
>          a_time->years--;
>          a_time->days += crm_time_leapyear(a_time->years) ? 366 : 365;
>      }
>  
> Rainer
> Sent: Wednesday, 09 April 2014 at 08:57
> From: "Andrew Beekhof" <andrew at beekhof.net>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Subject: Re: [Pacemaker] Interval-origin in monitor operations does not work
> 
> On 1 Apr 2014, at 5:10 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:
> 
> > Using interval-origin in a monitor operation definition does not work any more.
> > Verified on Pacemaker 1.1.10, but we think it has not worked since 1.1.8.
> >
> > Pengine calculates the start delay in the function unpack_operation, which calls crm_time_subtract.
> >
> > The call to crm_time_subtract with
> > origin=2014-03-31 19:20:00Z
> > date_set->now=2014-03-31 17:31:04Z
> > results in
> > delay=-0001-12-31 01:48:56Z
> > delay_s=31456136
> > start_delay=31456136000
> > which is almost a year in the future.
> 
> To be fair, the origin was also in the future.
> I don't think that was expected.
> 
> Can you supply your cib so I can experiment?
> 
> >
> > The function crm_time_subtract calculates this by the crm_time_add_* functions.
> >
> > The buggy statement is in crm_time_add_days.
> > If the calculated number of days is zero, it subtracts one year and adds that year's number of days, in this case 365.
> > But if a_time->days is zero, it must not do anything.
> >
> > The function crm_time_get_seconds, which is called by unpack_operation, cannot handle negative years, so it ignores the year -1 but still adds the 365 days.
> >
> > There are two solutions.
> > One is to add handling of negative years to crm_time_get_seconds.
> > The other is to replace line 1140 in iso8601.c
> > while (a_time->days <= 0) {
> > with
> > while (a_time->days < 0) {
> >
> > The second solution is verified to bring the expected result, a start-delay of a little less than two hours.
> > Handling negative years in crm_time_get_seconds might not be a proper solution, because the return value of the function is unsigned long long. It is unclear what it should report if the complete calculation gives a negative number of seconds.
> >
> > Rainer
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
