[Pacemaker] warning log is outputted after pacemaker stopped
Andrew Beekhof
andrew at beekhof.net
Fri Mar 7 03:03:07 UTC 2014
On 25 Feb 2014, at 7:23 pm, Kazunori INOUE <kazunori.inoue3 at gmail.com> wrote:
> 2014-02-24 11:09 GMT+09:00 Andrew Beekhof <andrew at beekhof.net>:
>>
>> On 24 Feb 2014, at 12:59 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>>
>>>
>>> On 21 Feb 2014, at 9:36 pm, Kazunori INOUE <kazunori.inoue3 at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> WARNING of the following is outputted after pacemaker stopped in
>>>> Pacemaker-1.1.11.
>>>>
>>>> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not
>>>> update default_ping_set = 100: rc=141
>>>>
>>>>
>>>> This is because pacemaker does not wait for completion of 'monitor' of
>>>> the ping resource.
>>>>
>>>> Feb 21 18:22:52 bl460g1n6 lrmd[9100]: debug:
>>>> recurring_action_timer: Scheduling another invokation of
>>>> prmPing_monitor_30000
>>>> Feb 21 18:22:52 bl460g1n6 pacemakerd[9094]: info:
>>>> crm_signal_dispatch: Invoking handler for signal 15: Terminated
>>>> (snip)
>>>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM
>>>> operation prmPing_stop_0 (call=9, rc=0, cib-update=29, confirmed=true)
>>>> ok
>>>> (snip)
>>>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info:
>>>> services_action_cancel: Cancelling op: prmPing_monitor_30000 will
>>>> occur once operation completes
>>>> (snip)
>>>> Feb 21 18:22:54 bl460g1n6 pacemakerd[9094]: info: main: Exiting pacemakerd
>>>> (snip)
>>>> Feb 21 18:22:55 bl460g1n6 corosync[9083]: [MAIN ] Corosync Cluster
>>>> Engine exiting normally
>>>> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not
>>>> update default_ping_set = 100: rc=141
>>>>
>>>>
>>>> As with pacemaker-1.0, I think that it's better to perform a 'stop'
>>>> after completion of 'monitor'.
>>>
>>> Absolutely. If it is not then we have a problem.
>>
>> Feb 21 18:22:20 bl460g1n6 crmd[9103]: notice: te_rsc_command: Initiating action 6: monitor prmPing_monitor_30000 on bl460g1n6 (local)
>> Feb 21 18:22:20 bl460g1n6 crmd[9103]: info: do_lrm_rsc_op: Performing key=6:1:0:82c5fc6f-611d-4ba9-8a4f-13155b12536e op=prmPing_monitor_30000
>> Feb 21 18:22:22 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM operation prmPing_monitor_30000 (call=7, rc=0, cib-update=27, confirmed=false) ok
>>
>> so far so good
>>
>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: te_rsc_command: Initiating action 5: stop prmPing_stop_0 on bl460g1n6 (local)
>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: Cancelling op: prmPing_monitor_30000 will occur once operation completes
>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: do_lrm_rsc_op: Performing key=5:2:0:82c5fc6f-611d-4ba9-8a4f-13155b12536e op=prmPing_stop_0
>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: log_execute: executing - rsc:prmPing action:stop call_id:9
>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: log_finished: finished - rsc:prmPing action:stop call_id:9 pid:9207 exit-code:0 exec-time:36ms queue-time:0ms
>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: Cancelling op: prmPing_monitor_30000 will occur once operation completes
>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM operation prmPing_stop_0 (call=9, rc=0, cib-update=29, confirmed=true) ok
>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: match_graph_event: Action prmPing_stop_0 (5) confirmed on bl460g1n6 (rc=0)
>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: stop_recurring_actions: Cancelling op 7 for prmPing (prmPing:7)
>>
>> David: I think we have the order wrong here... this should happen prior to the stop not after.
>> Also, does this mean we ran the stop action while the monitor was running?
>>
>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: Cancelling op: prmPing_monitor_30000 will occur once operation completes
>> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not update default_ping_set = 100: rc=141
>>
>>
>> Inoue-san: could we get a copy of this please?
>> Feb 21 18:22:54 bl460g1n6 pengine[9102]: notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-13.bz2
>>
>
> Since the file was lost, I reproduced the same event.
Thanks. I'm reasonably certain this is an lrmd bug, I've passed it on to David.
>
>> I'd like to see if we knew the recurring operation was running
>>
>>
>>>
>>>> (Who manages 'monitor' after pacemaker stopped?)
>>>>
>>>> Regards,
>>>> Kazunori INOUE
>>>> <ha-log>_______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
> <pcmk-Tue-25-Feb-2014.tar.bz2>_______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 841 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140307/a0197931/attachment-0003.sig>
More information about the Pacemaker
mailing list