[Pacemaker] Antwort: Re: Fw: Antwort: Re: pingd process dies for no reason
Dejan Muhamedagic
dejanmm at fastmail.fm
Mon Jan 10 16:44:20 UTC 2011
Hi,
On Mon, Jan 10, 2011 at 04:50:39PM +0100, Patrik.Rapposch at knapp.com wrote:
> Thanks for trying to help me. :)
>
> Yes, I checked it, and there is no problem with it. I assume it must be a
> problem with the ping or the attrd_updater because, as I understand it,
> the crm gets a timeout from the monitor process and then kills the
> resource.
Right. It could be that your timeout is set too short (did you
run crm configure verify?). Or perhaps there is a problem with DNS?
If all else fails, you can add 'set -x' somewhere near the
top of the RA and look at the logs for the shell trace.
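
As a minimal sketch of what such a shell trace looks like, the stand-alone
script below enables 'set -x' around the same kind of expr arithmetic that
the quoted RA excerpt uses. It is not the real agent: the function name
score_demo and the fixed values are made up for illustration. With tracing
on, the shell echoes each command to stderr before running it, so a command
that hangs would be the last line in the trace.

```shell
#!/bin/sh
# Illustration only: a stand-in for the RA's score calculation, not the real agent.
score_demo() {
    (
        set -x                          # trace every command to stderr from here on
        active=0
        active=`expr $active + 1`       # same expr idiom as in the RA excerpt
        score=`expr $active \* 100`     # 100 stands in for $OCF_RESKEY_multiplier
        echo "score=$score"
    )
}

# Capture stdout and the trace (stderr) together for inspection.
trace_output=`score_demo 2>&1`
echo "$trace_output"
```

The exact trace prefix ('+' or '++') depends on the shell, so it is better
to search the trace for the command text than for the prefix.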
Thanks,
Dejan
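
To judge whether a timeout is too short, one rough approach is to time the
slow command by hand and compare it with the timeout. The sketch below is an
assumption-laden helper, not part of Pacemaker: the name within_budget is
hypothetical, it only has one-second resolution because it relies on
'date +%s', and it ignores the command's own exit status.

```shell
#!/bin/sh
# Illustration only: report whether a command finishes within a time budget.
# Usage: within_budget SECONDS COMMAND [ARGS...]
within_budget() {
    budget=$1
    shift
    start=$(date +%s)                 # whole seconds since the epoch
    "$@" >/dev/null 2>&1              # run the command; its exit status is ignored
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed=${elapsed}s budget=${budget}s"
    [ "$elapsed" -lt "$budget" ]      # success only if strictly under the budget
}

# Harmless demonstration with a command that returns immediately.
within_budget 5 true && echo "within budget"
```

On a cluster node this could be called as, for example,
within_budget 5 ping -c 1 HOST, where HOST is a placeholder for the
address in host_list and 5 matches the 5s monitor timeout from the
configuration quoted below.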
> "
> p_out=`$p_exe $p_args $OCF_RESKEY_options $host 2>&1`; rc=$?
>
> case $rc in
> 0) active=`expr $active + 1`;;
> 1) ping_conditional_log warn "$host is inactive: $p_out";;
> *) ocf_log err "Unexpected result for '$p_exe $p_args $OCF_RESKEY_options $host' $rc: $p_out";;
> esac
> done
> score=`expr $active \* $OCF_RESKEY_multiplier`
> attrd_updater -n $OCF_RESKEY_name -v $score -d $OCF_RESKEY_dampen $attrd_options
> rc=$?
> case $rc in
> 0) ping_conditional_log debug "Updated $OCF_RESKEY_name = $score"
> ;;
> *) ocf_log warn "Could not update $OCF_RESKEY_name = $score: rc=$rc";;
> esac
> return $rc
> "
>
> As there is no response from this part of the RA, the cluster reacts
> like this:
> "Jan 5 08:40:33 node2 crmd: [5993]: ERROR: process_lrm_event: LRM operation pingd:0_monitor_15000 (48559) Timed Out (timeout=5000ms)"
>
> This is what I assume.
>
> kr patrik
>
> Mit freundlichen Grüßen / Best Regards
>
> Patrik Rapposch, BSc
> System Administration
>
> KNAPP Systemintegration GmbH
> Waltenbachstraße 9
> 8700 Leoben, Austria
> Phone: +43 3842 805-915
> Fax: +43 3842 82930-500
> patrik.rapposch at knapp.com
> www.KNAPP.com
>
> Commercial register number: FN 138870x
> Commercial register court: Leoben
>
> The information in this e-mail (including any attachment) is confidential
> and intended to be for the use of the addressee(s) only. If you have
> received the e-mail by mistake, any disclosure, copy, distribution or use
> of the contents of the e-mail is prohibited, and you must delete the
> e-mail from your system. As e-mail can be changed electronically KNAPP
> assumes no responsibility for any alteration to this e-mail or its
> attachments. KNAPP has taken every reasonable precaution to ensure that
> any attachment to this e-mail has been swept for virus. However, KNAPP
> does not accept any liability for damage sustained as a result of such
> attachment being virus infected and strongly recommend that you carry out
> your own virus check before opening any attachment.
>
>
>
> Andreas Kurz <andreas.kurz at linbit.com>
> 10.01.2011 15:41
> Reply to: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> To: pacemaker at oss.clusterlabs.org
> Cc:
> Subject: Re: [Pacemaker] Fw: Antwort: Re: pingd process dies for no reason
>
> On 2011-01-10 13:35, Patrik.Rapposch at knapp.com wrote:
> > Does anyone have an idea, or has anyone had the same problem?
>
> Sorry for the question ;-) ... but of course you checked that your host
> xxx.xxx.xxx.xxx is pingable from the cluster nodes? My only idea here is a
> firewall somewhere.
>
> Regards,
> Andreas
>
> > ----- Forwarded by Patrik Rapposch/KSI on 10.01.2011 13:35 -----
> > Patrik.Rapposch at knapp.com
> > 07.01.2011 16:38
> > Reply to: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> > To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> > Cc:
> > Subject: [Pacemaker] Antwort: Re: pingd process dies for no reason
> >
> > Hello,
> >
> > Thanks for your fast reply. We use the ping resource, as you can see in
> > our config; it is just the id that is called pingd. I admit this is a
> > little confusing:
> > "<primitive class="ocf" id="pingd" provider="pacemaker" type="ping">"
> >
> > kr patrik
> >
> > Michael Schwartzkopff <misch at clusterbau.com>
> > 07.01.2011 15:02
> > Reply to: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> > To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
> > Cc:
> > Subject: Re: [Pacemaker] pingd process dies for no reason
> >
> > On Friday 07 January 2011 14:56:03 Patrik.Rapposch at knapp.com wrote:
> >> Greetings,
> >>
> >> We have a problem: the ping daemon dies for no apparent reason, and we
> >> cannot find out why.
> >>
> >> We use the following versions on SLES 11.1:
> >>
> >> libpacemaker3-1.1.2-0.6.1
> >> pacemaker-mgmt-2.0.0-0.3.10
> >> pacemaker-mgmt-client-2.0.0-0.3.10
> >> drbd-pacemaker-8.3.8.1-0.2.9
> >> libpacemaker-devel-1.1.2-0.6.1
> >> pacemaker-1.1.2-0.6.1
> >> pacemaker-mgmt-devel-2.0.0-0.3.10
> >> libcorosync4-1.2.6-0.2.2
> >> corosync-1.2.6-0.2.2
> >> libcorosync-devel-1.2.6-0.2.2
> >>
> >> here is the important part of the log trace:
> >> "
> >> Jan 5 08:40:30 node2 lrmd: [5990]: info: rsc:OSR_IP:46535: monitor
> >> Jan 5 08:40:30 node2 lrmd: [5990]: info: rsc:Cluster_IP:46533: monitor
> >> Jan 5 08:40:33 node2 lrmd: [5990]: WARN: pingd:0:monitor process (PID
> >> 23937) timed out (try 1). Killing with signal SIGTERM (15).
> >> Jan 5 08:40:33 node2 lrmd: [5990]: WARN: operation monitor[48559] on
> >> ocf::ping::pingd:0 for client 5993, its parameters: CRM_meta_clone=[0]
> >> host_list=[xxx.xxx.xxx.xxx] CRM_meta_clone_node_max=[1]
> >> CRM_meta_clone_max=[2] CRM_meta_notify=[false] dampen=[5s]
> >> CRM_meta_globally_unique=[false] crm_feature_set=[3.0.2] multiplier=[100]
> >> CRM_meta_name=[monitor] CRM_meta_interval=[15000] CRM_meta_timeout=[5000]
> >>
> >> : pid [23937] timed out
> >>
> >> Jan 5 08:40:33 node2 crmd: [5993]: ERROR: process_lrm_event: LRM
> >> operation pingd:0_monitor_15000 (48559) Timed Out (timeout=5000ms)
> >> Jan 5 08:40:33 node2 crmd: [5993]: WARN: update_failcount: Updating
> >> failcount for pingd:0 on node2 after failed monitor: rc=-2
> >> (update=value++, time=1294213233)
> >> Jan 5 08:40:35 node2 pengine: [5992]: notice: unpack_config: On loss of
> >> CCM Quorum: Ignore
> >> Jan 5 08:40:35 node2 pengine: [5992]: WARN: unpack_rsc_op: Processing
> >> failed op drbd_r0:1_promote_0 on node1: unknown exec error (-2)
> >> Jan 5 08:40:35 node2 pengine: [5992]: WARN: unpack_rsc_op: Processing
> >> failed op pingd:0_monitor_15000 on node2: unknown exec error (-2)
> >> Jan 5 08:40:35 node2 pengine: [5992]: notice: clone_print: Clone Set:
> >> pingdclone [pingd]
> >> Jan 5 08:40:35 node2 pengine: [5992]: notice: native_print: pingd:0
> >> (ocf::pacemaker:ping): Started node2 FAILED
> >> Jan 5 08:40:35 node2 pengine: [5992]: notice: short_print: Started:
> >> [ node1 ]"
> >>
> >> The resource is configured in the following way:
> >> <clone id="pingdclone">
> >>   <meta_attributes id="pingdclone-meta_attributes">
> >>     <nvpair id="pingdclone-meta_attributes-globally-unique" name="globally-unique" value="false"/>
> >>   </meta_attributes>
> >>   <primitive class="ocf" id="pingd" provider="pacemaker" type="ping">
> >>     <instance_attributes id="pingd-instance_attributes">
> >>       <nvpair id="pingd-instance_attributes-host_list" name="host_list" value="xxx.xxx.xxx.xxx"/>
> >>       <nvpair id="pingd-instance_attributes-multiplier" name="multiplier" value="100"/>
> >>       <nvpair id="nvpair-96877c9e-2825-4d7d-997b-944652f89584" name="dampen" value="5s"/>
> >>     </instance_attributes>
> >>     <operations>
> >>       <op id="pingd-monitor-15s" interval="15s" name="monitor" timeout="5s"/>
> >>     </operations>
> >>   </primitive>
> >> </clone>
> >>
> >> Thanks in advance for your help.
> >>
> >> Mit freundlichen Grüßen / Best Regards
> >>
> >> Patrik Rapposch, BSc
> >
> > Please use the "ping" resource agent instead of "pingd".
> >
> > Greetings,
> >
> > --
> > Dr. Michael Schwartzkopff
> > Guardinistr. 63
> > 81375 München
> >
> > Tel: (0163) 172 50 98
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker