[Pacemaker] Fw: Antwort: Re: pingd process dies for no reason

Patrik.Rapposch at knapp.com Patrik.Rapposch at knapp.com
Mon Jan 10 12:35:41 UTC 2011


Does anyone have an idea, or has anyone had the same problem?


Mit freundlichen Grüßen / Best Regards

Patrik Rapposch, BSc
System Administration

KNAPP Systemintegration GmbH
Waltenbachstraße 9
8700 Leoben, Austria 
Phone: +43 3842 805-915
Fax: +43 3842 82930-500
patrik.rapposch at knapp.com 
www.KNAPP.com 

Commercial register number: FN 138870x
Commercial register court: Leoben

The information in this e-mail (including any attachment) is confidential 
and intended to be for the use of the addressee(s) only. If you have 
received the e-mail by mistake, any disclosure, copy, distribution or use 
of the contents of the e-mail is prohibited, and you must delete the 
e-mail from your system. As e-mail can be changed electronically KNAPP 
assumes no responsibility for any alteration to this e-mail or its 
attachments. KNAPP has taken every reasonable precaution to ensure that 
any attachment to this e-mail has been swept for virus. However, KNAPP 
does not accept any liability for damage sustained as a result of such 
attachment being virus infected and strongly recommend that you carry out 
your own virus check before opening any attachment.
----- Forwarded by Patrik Rapposch/KSI on 10.01.2011 13:35 -----

Patrik.Rapposch at knapp.com 
07.01.2011 16:38
Reply to
The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>


To
The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
Cc

Subject
[Pacemaker] Antwort: Re: pingd process dies for no reason






Hello, 

Thanks for your fast reply. We do use the ping resource, as you can see in 
our config. Only the id is called pingd, which I admit is a little 
confusing: 
<primitive class="ocf" id="pingd" provider="pacemaker" type="ping">
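
To make the difference between the id and the agent concrete, here is a 
minimal sketch in Python (standard library only). The describe() helper is 
hypothetical and not part of Pacemaker, and the primitive is written as a 
self-closing element only so that it parses on its own:

```python
import xml.etree.ElementTree as ET

# The primitive from our config, self-closed here so it parses standalone.
PRIMITIVE = '<primitive class="ocf" id="pingd" provider="pacemaker" type="ping"/>'


def describe(primitive_xml):
    """Return (resource id, agent spec) for a single <primitive> element."""
    elem = ET.fromstring(primitive_xml)
    # class, provider and type together select the resource agent;
    # the id is only a label chosen by the administrator.
    agent = ":".join(elem.get(key, "") for key in ("class", "provider", "type"))
    return elem.get("id"), agent


resource_id, agent = describe(PRIMITIVE)
print(f"id={resource_id} agent={agent}")
```

The agent part should match the ocf::pacemaker:ping shown in the log output, 
while pingd appears only as the resource name.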

kr patrik 




Michael Schwartzkopff <misch at clusterbau.com> 
07.01.2011 15:02 

Reply to
The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>


To
The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org> 
Cc

Subject
Re: [Pacemaker] pingd process dies for no reason








On Friday 07 January 2011 14:56:03 Patrik.Rapposch at knapp.com wrote:
> Greetings,
> 
> we have a problem: the ping daemon dies for no apparent reason, and we
> can't find out why.
> 
> we use the following versions on SLES 11.1:
> 
> libpacemaker3-1.1.2-0.6.1
> pacemaker-mgmt-2.0.0-0.3.10
> pacemaker-mgmt-client-2.0.0-0.3.10
> drbd-pacemaker-8.3.8.1-0.2.9
> libpacemaker-devel-1.1.2-0.6.1
> pacemaker-1.1.2-0.6.1
> pacemaker-mgmt-devel-2.0.0-0.3.10
> libcorosync4-1.2.6-0.2.2
> corosync-1.2.6-0.2.2
> libcorosync-devel-1.2.6-0.2.2
> 
> here is the important part of the log trace:
> "
> Jan  5 08:40:30 node2 lrmd: [5990]: info: rsc:OSR_IP:46535: monitor
> Jan  5 08:40:30 node2 lrmd: [5990]: info: rsc:Cluster_IP:46533: monitor
> Jan  5 08:40:33 node2 lrmd: [5990]: WARN: pingd:0:monitor process (PID
> 23937) timed out (try 1).  Killing with signal SIGTERM (15).
> Jan  5 08:40:33 node2 lrmd: [5990]: WARN: operation monitor[48559] on
> ocf::ping::pingd:0 for client 5993, its parameters: CRM_meta_clone=[0]
> host_list=[xxx.xxx.xxx.xxx] CRM_meta_clone_node_max=[1]
> CRM_meta_clone_max=[2] CRM_meta_notify=[false] dampen=[5s]
> CRM_meta_globally_unique=[false] crm_feature_set=[3.0.2] multiplier=[100]
> CRM_meta_name=[monitor] CRM_meta_interval=[15000] CRM_meta_timeout=[5000]
> 
> : pid [23937] timed out
> 
> Jan  5 08:40:33 node2 crmd: [5993]: ERROR: process_lrm_event: LRM
> operation pingd:0_monitor_15000 (48559) Timed Out (timeout=5000ms)
> Jan  5 08:40:33 node2 crmd: [5993]: WARN: update_failcount: Updating
> failcount for pingd:0 on node2 after failed monitor: rc=-2
> (update=value++, time=1294213233)
> Jan  5 08:40:35 node2 pengine: [5992]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> Jan  5 08:40:35 node2 pengine: [5992]: WARN: unpack_rsc_op: Processing
> failed op drbd_r0:1_promote_0 on node1: unknown exec error (-2)
> Jan  5 08:40:35 node2 pengine: [5992]: WARN: unpack_rsc_op: Processing
> failed op pingd:0_monitor_15000 on node2: unknown exec error (-2)
> Jan  5 08:40:35 node2 pengine: [5992]: notice: clone_print:  Clone Set:
> pingdclone [pingd]
> Jan  5 08:40:35 node2 pengine: [5992]: notice: native_print: pingd:0
> (ocf::pacemaker:ping):  Started node2 FAILED
> Jan  5 08:40:35 node2 pengine: [5992]: notice: short_print: Started:
> [ node1 ]"
> 
> the resource is configured in the following way:
> <clone id="pingdclone">
>         <meta_attributes id="pingdclone-meta_attributes">
>           <nvpair id="pingdclone-meta_attributes-globally-unique"
> name="globally-unique" value="false"/>
>         </meta_attributes>
>         <primitive class="ocf" id="pingd" provider="pacemaker"
> type="ping">
>           <instance_attributes id="pingd-instance_attributes">
>             <nvpair id="pingd-instance_attributes-host_list"
> name="host_list" value="xxx.xxx.xxx.xxx"/>
>             <nvpair id="pingd-instance_attributes-multiplier"
> name="multiplier" value="100"/>
>             <nvpair id="nvpair-96877c9e-2825-4d7d-997b-944652f89584"
> name="dampen" value="5s"/>
>           </instance_attributes>
>           <operations>
>             <op id="pingd-monitor-15s" interval="15s" name="monitor"
> timeout="5s"/>
>           </operations>
>         </primitive>
>       </clone>
> 
> Thanks in advance for your help.
> 
> Mit freundlichen Grüßen / Best Regards
> 
> Patrik Rapposch, BSc

Please use the "ping" resource agent instead of "pingd".

Greetings,

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98
_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: 
http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker