[Pacemaker] stonith-ng message in /var/log/messages

Andrew Beekhof andrew at beekhof.net
Thu Sep 30 02:19:47 EDT 2010


On Wed, Sep 29, 2010 at 11:57 PM, Andrew Daugherity
<adaugherity at tamu.edu> wrote:
> Ron Kerry <rkerry at ...> writes:
>> I am seeing the following sequence of messages with every monitor interval for
>> my stonith resource.
>>
>> Sep 28 10:44:01 genesis stonith-ng: [9493]: ERROR: run_stonith_agent: No timeout set for stonith operation monitor with device fence_legacy
>> Sep 28 10:44:01 genesis stonith: l2network device OK.
>>
>> It is unclear to me what this ERROR means as the resource itself says
>> everything is fine. There is a monitor timeout set in the resource definition.
>>
>> Distribution is SLES11SP1  (SLE11SP1-HAE).
>> cluster-glue 1.0.6-0.3.7
>
> I'm seeing the same problem ever since the latest update rollup from Novell (the
> "sleshasp1-ha-update-201009" patch).  Example:
> Sep 29 16:28:35 imsxen3 stonith-ng: [5182]: ERROR: run_stonith_agent: No timeout set for stonith operation monitor with device fence_legacy
> Sep 29 16:28:36 imsxen3 stonith: external/ipmi device OK.

I believe it's been fixed upstream; I guess Novell needs to apply the
other half of the patch.

>
> I downgraded the cluster-glue package (and a couple others, so RPM dependencies
> were still satisfied) on one machine and the messages went away on that machine,
> while they're still there on the others.
>
> To clarify -- the "no timeout set" error is logged on the machine the stonith
> resource is currently running on, each time the monitor operation fires.  On the
> machine I downgraded cluster-glue on, there are no such errors for any stonith
> resource running on that server.
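
To check which nodes actually log the error, a rough sketch like the following could tally the "No timeout set" messages per host. It assumes the syslog line format shown in the examples above; the function name and the sample list are illustrative only, and in practice you would feed it the lines of /var/log/messages from each node.

```python
import re
from collections import Counter

# Assumed syslog layout, taken from the log excerpts quoted in this thread:
#   "Mon DD HH:MM:SS host stonith-ng: [pid]: ERROR: run_stonith_agent: No timeout set ..."
PATTERN = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+stonith-ng:\s+\[\d+\]:\s+"
    r"ERROR: run_stonith_agent: No timeout set"
)


def count_no_timeout_errors(lines):
    """Return a Counter mapping host name to the number of matching errors."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the report above (unwrapped).
    sample = [
        "Sep 29 16:28:35 imsxen3 stonith-ng: [5182]: ERROR: run_stonith_agent: "
        "No timeout set for stonith operation monitor with device fence_legacy",
        "Sep 29 16:28:36 imsxen3 stonith: external/ipmi device OK.",
    ]
    for host, n in count_no_timeout_errors(sample).items():
        print(host, n)
```

A node running the older cluster-glue should show a count of zero for the same period, which matches the behaviour described above.
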
>
> My stonith definitions (in "crm configure" format) are like this:
> primitive stonith-imsxen1 stonith:external/ipmi \
>        meta target-role="Started" \
>        operations $id="stonith-imsxen2-operations" \
>        op monitor interval="300" timeout="15" start-delay="15" \
>        params hostname="imsxen1" ipaddr="10.95.12.51" userid="stonith" passwd="XXXX" interface="lanplus"
> and similarly for stonith-imsxen2 and stonith-imsxen3.  (Node names are
> imsxen[123].)
>
> STONITH works properly, aside from the annoying messages with the latest version.
>
> Here is the RPM version comparison:
> v | SLE11-HAE-SP1-Updates | cluster-glue   | 1.0.5-0.5.1  | 1.0.6-0.3.7  | x86_64
> v | SLE11-HAE-SP1-Updates | libglue2       | 1.0.5-0.5.1  | 1.0.6-0.3.7  | x86_64
> v | SLE11-HAE-SP1-Updates | libpacemaker3  | 1.1.2-0.2.1  | 1.1.2-0.6.1  | x86_64
> v | SLE11-HAE-SP1-Updates | pacemaker      | 1.1.2-0.2.1  | 1.1.2-0.6.1  | x86_64
> v | SLE11-HAE-SP1-Updates | pacemaker-mgmt | 2.0.0-0.2.19 | 2.0.0-0.3.10 | x86_64
>
> I intentionally rolled back the cluster-glue package, and the others were rolled
> back to satisfy dependencies.  According to the RPM changelog, the "good"
> version of cluster-glue (1.0.5-0.5.1) is from Upstream version cs: 6cf2e36df9f4,
> while the newer one is from cs: a146a145a3e.
>
> While it's possible this is a problem with Novell's builds, I don't think that
> to be likely, since there are no local patches in the RPM spec file.