[Pacemaker] Interval-origin in monitor operations does not work

Andrew Beekhof andrew at beekhof.net
Mon May 5 00:37:20 EDT 2014


On 2 May 2014, at 4:55 pm, Andrew Beekhof <andrew at beekhof.net> wrote:

> 
> On 15 Apr 2014, at 4:12 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:
> 
>> Of course, I can.
>>      <primitive class="ocf" id="resD" provider="heartbeat" type="Dummy">
>>        <operations>
>>          <op id="resD-start-0" interval="0" name="start" timeout="20"/>
>>          <op id="resD-stop-0" interval="0" name="stop" timeout="20"/>
>>          <op id="resD-monitor-1h" interval="1h" interval-origin="00:34" name="monitor" timeout="60"/>
>>        </operations>
>>        <meta_attributes id="resD-meta_attributes">
>>          <nvpair id="resD-meta_attributes-failure-timeout" name="failure-timeout" value="15m"/>
>>          <nvpair id="resD-meta_attributes-migration-threshold" name="migration-threshold" value="3"/>
>>        </meta_attributes>
>>      </primitive>
>> 
>> Yes, the origin is in the future, but consider the monitor configuration above.
>> The monitor operation should run every hour, at 34 minutes past the hour.
>> If I specified a full date in the past, pengine would have to run a number of while(rc<0) iterations in unpack_operation.
>> One year after that date it would be exactly 8760 iterations, and that for every call of unpack_operation.
>> That's why I specified the first possible run time of each day, so there are at most 23 iterations of the while loop.
>> 
>> If unpack_operation is called between 00:00 and 00:34, the described situation occurs:
>> the origin is later than now.
>> 
>> Applying this patch will help.
> 
> It will, but as I suspected it will also cause:
> 
>  iso8601 -d '2014-01-01 00:00:30Z' -D P-1D -E '2013-12-31 00:00:30Z'
> 
> to fail with:
> 
> Date: 2014-01-01 00:00:30Z
> Duration: 0000-01--01 00:00:00Z
> Duration ends at: 2014-01-00 00:00:30Z
> 
> which isn't right :)
> 
> I'm working on a true fix now...
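
For anyone following along, here is a minimal stand-alone sketch of why the one-character change trades one bug for another. It is not the Pacemaker code: struct ydate, is_leap and borrow_years are hypothetical names, and the lower_bound parameter is only an illustration, not necessarily what the real fix does. It assumes that for a date, days is a 1-based day of the year, while for a duration 0 days is a perfectly valid value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the year/day fields of crm_time_t. */
struct ydate {
    int years;
    int days;   /* dates: 1-based day of year; durations: plain day count */
};

static bool is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/*
 * Borrow whole years until days >= lower_bound.
 * lower_bound 1 behaves like "while (days <= 0)",
 * lower_bound 0 behaves like "while (days < 0)".
 */
static struct ydate borrow_years(struct ydate t, int lower_bound)
{
    assert(lower_bound == 0 || lower_bound == 1);
    while (t.days < lower_bound) {
        t.years--;
        t.days += is_leap(t.years) ? 366 : 365;
    }
    return t;
}
```

Take 2014-01-01 minus one day, i.e. {2014, 0}: a lower bound of 1 yields year 2013, day 365 (31 December), while a lower bound of 0 leaves year 2014, day 0, which is the bogus 2014-01-00 above. For a zero-day duration {0, 0} it is the other way round: a lower bound of 1 yields year -1, day 365, consistent with the -0001-12-31 in the original report, while a lower bound of 0 leaves it untouched.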

These are the resulting patches in https://github.com/beekhof/pacemaker:

+ Andrew Beekhof (14 seconds ago) 44af669: Test: PE: Correctly handle origin offsets in the future  (HEAD, master)
+ Andrew Beekhof (27 minutes ago) 3f20485: Fix: PE: Correctly handle origin offsets in the future 
+ Andrew Beekhof (4 minutes ago) d39bad6: Test: iso8601: Improved logging of durations 
+ Andrew Beekhof (29 minutes ago) afb6c16: Fix: iso8601: Different logic is needed when logging and calculating durations 
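
What interval-origin is meant to achieve can be sketched in a few lines of stand-alone C. This is only an illustration under stated assumptions, not the code from the patches above: first_run_delay is a hypothetical helper, all times are plain seconds on a common clock, and the while(rc<0) stepping discussed in this thread is replaced by a closed-form remainder so the cost does not grow with the age of the origin.

```c
#include <assert.h>

/*
 * Hypothetical helper, not Pacemaker code: seconds to wait from `now`
 * until the first run at or after `now` on the grid
 * origin + k * interval, with k >= 0.
 * All three arguments are seconds measured from the same reference point.
 */
static long long first_run_delay(long long origin, long long now,
                                 long long interval)
{
    long long behind;
    long long rem;

    assert(interval > 0);

    if (origin >= now) {
        /* Origin still ahead: wait for it; the result is never negative. */
        return origin - now;
    }

    behind = now - origin;      /* strictly positive here */
    rem = behind % interval;    /* time elapsed since the last grid point */
    return rem == 0 ? 0 : interval - rem;
}
```

With the values from the original report (origin 19:20:00, now 17:31:04) this gives 6536 seconds, i.e. 1:48:56, the "little less than two hours" Rainer expected. With the resD configuration (origin 00:34, interval 1h), a call at 00:10 gives 1440 seconds and a call at 00:40 gives 3240 seconds.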

> 
> 
>> diff --git a/lib/common/iso8601.c b/lib/common/iso8601.c
>> index 7dc2495..742de70 100644
>> --- a/lib/common/iso8601.c
>> +++ b/lib/common/iso8601.c
>> @@ -1137,7 +1137,7 @@ crm_time_add_days(crm_time_t * a_time, int extra)
>>         ydays = crm_time_leapyear(a_time->years) ? 366 : 365;
>>     }
>> -    while (a_time->days <= 0) {
>> +    while (a_time->days < 0) {
>>         a_time->years--;
>>         a_time->days += crm_time_leapyear(a_time->years) ? 366 : 365;
>>     }
>> 
>> Rainer
>> Sent: Wednesday, 09 April 2014 at 08:57
>> From: "Andrew Beekhof" <andrew at beekhof.net>
>> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> Subject: Re: [Pacemaker] Interval-origin in monitor operations does not work
>> 
>> On 1 Apr 2014, at 5:10 am, Rainer Brestan <rainer.brestan at gmx.net> wrote:
>> 
>>> Using interval-origin in a monitor operation definition no longer works.
>>> Verified on Pacemaker 1.1.10, but we think it has been broken since 1.1.8.
>>> 
>>> Pengine calculates the start delay in the function unpack_operation, which calls crm_time_subtract.
>>> 
>>> The call to crm_time_subtract with
>>> origin=2014-03-31 19:20:00Z
>>> date_set->now=2014-03-31 17:31:04Z
>>> results in
>>> delay=-0001-12-31 01:48:56Z
>>> delay_s=31456136
>>> start_delay=31456136000
>>> which is almost a year in the future.
>> 
>> To be fair, the origin was also in the future.
>> I don't think that was expected.
>> 
>> Can you supply your cib so I can experiment?
>> 
>>> 
>>> The function crm_time_subtract calculates this using the crm_time_add_* functions.
>>> 
>>> The buggy statement is in crm_time_add_days.
>>> If the calculated number of days is zero, it subtracts one year and adds the number of days in that year, in this case 365.
>>> But if a_time->days is zero, it must not do anything.
>>> 
>>> The function crm_time_get_seconds, which is called by unpack_operation, cannot handle negative years, so it ignores the year -1 but still adds the 365 days.
>>> 
>>> There are two solutions.
>>> One is to add handling of negative years to crm_time_get_seconds.
>>> The other is to replace line 1140 in iso8601.c
>>> while (a_time->days <= 0) {
>>> with
>>> while (a_time->days < 0) {
>>> 
>>> The second solution is verified to give the expected result, a start-delay of a little less than two hours.
>>> Handling negative years in crm_time_get_seconds might not be a proper solution, because the function returns unsigned long long and it is unclear what it should report if the complete calculation yields a negative number of seconds.
>>> 
>>> Rainer
>>> 
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>> 
