[Pacemaker] Occasional error running ocf scripts
Chris Picton
chris at ecntelecoms.com
Fri Aug 13 10:29:43 UTC 2010
On Fri, 13 Aug 2010 12:06:27 +0200, Dejan Muhamedagic wrote:
> Hi,
>
> On Fri, Aug 13, 2010 at 11:20:38AM +0200, Chris Picton wrote:
>> Hi all
>>
>> I have seen the following behaviour on a few occasions in the past few
>> months. It seems as if the resource script gets called, but without the
>> correct OCF_RESKEY parameters.
>>
>> Aug 13 10:58:08 chris-test-01 Filesystem[24682]: [24688]: ERROR: Please set OCF_RESKEY_device to the device to be managed
>> Aug 13 10:58:08
>>
>>
>> 99% of the time the resource stops correctly; it is only on a few
>> occasions that I see an error like this.
>>
>> Is this a known problem, or can I generate extra logging to help
>> debug it?
>
> Never heard of it. That sounds quite serious. Yes, extra logging would
> be helpful. How often did that happen? Which releases do you run?
>
I have probably seen it more than 10 times (on different resources,
versions and servers) over the past year.

It has happened on versions 2.1.4, 3.0.0 and 3.0.3, but more often on
2.1.4 (we had a server which would often get stonithed when stopping a
resource for exactly this reason).

I am currently testing a new CIB for my sql servers and it came up again,
so I thought I would mail through my results.

I will update to the latest rpm package from clusterlabs (I am currently
running pacemaker-1.0.9.1-1.el5 and heartbeat-3.0.3-2.el5 on my test
setup) and see if I can trigger it again with a higher debug level.
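
As a rough sketch of what extra logging for this could look like: the
helpers below record which OCF_* environment variables a resource agent
actually received, and flag any required OCF_RESKEY_<name> parameter that
is missing. The function names (log_ocf_env, check_reskeys) and the
OCF_DEBUG_LOG variable are hypothetical, chosen for illustration; they are
not part of Pacemaker, heartbeat or the stock Filesystem agent.

```shell
#!/bin/sh
# Hypothetical debugging helpers; none of these names exist in the stock agents.
# OCF_DEBUG_LOG is an assumed variable name, not a Pacemaker setting.
OCF_DEBUG_LOG="${OCF_DEBUG_LOG:-/tmp/ocf-env-debug.log}"

# Append a snapshot of every exported OCF_* variable to the debug log.
# $1 is the action the agent was called with (start, stop, monitor, ...).
log_ocf_env() {
    {
        echo "$(date '+%b %d %H:%M:%S') pid=$$ action=${1:-unknown}"
        env | grep '^OCF_' | sort
        echo "---"
    } >> "$OCF_DEBUG_LOG"
}

# Return 0 if every named OCF_RESKEY_<param> is set and non-empty, else 1.
# Each missing parameter is noted in the debug log.
check_reskeys() {
    missing=0
    for param in "$@"; do
        eval "value=\${OCF_RESKEY_${param}:-}"
        if [ -z "$value" ]; then
            echo "missing OCF_RESKEY_${param}" >> "$OCF_DEBUG_LOG"
            missing=1
        fi
    done
    return $missing
}
```

Sourced near the top of a copy of an agent, these could be called as
"log_ocf_env "$1"; check_reskeys device directory fstype" before the agent
does its own validation, so that a failing stop leaves a record of the
environment it was given.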