[Pacemaker] metadata (timeout) ignored?

Markus M. adrock0905 at alice.de
Wed Jan 20 15:28:49 UTC 2010


Hello,

I have a question about the metadata returned by an OCF resource agent
through the "meta-data" command and how the cluster uses it.

When checking the resource agent's metadata using crm I get this:

# crm
crm(live)# ra
crm(live)ra#  meta cluster_oracle ocf
bla (ocf:heartbeat:cluster_oracle)

Master/Slave OCF Resource Agent for Oracle (clustered)

Parameters (* denotes required, [] the default):

oracle_role* (string): Ora role
     Required to assign the Oracle role. Must be "master" or "slave"

Operations' defaults (advisory minimum):

     start    timeout=240
     promote  timeout=90
     demote   timeout=90
     notify   timeout=90
     stop     timeout=100
     monitor  timeout=20 interval=20 depth=0
     monitor  timeout=20 interval=10 depth=0

So the "stop" action appears to have a timeout of 100 seconds defined.
But at cluster shutdown I see this in the ha-debug log:

...
Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command: Initiating 
action 5: stop oracle_primary_stop_0 on node1 (local)
Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave 
resource oracle_secondary  (Stopped)
Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing 
key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc op=oracle_primary_stop_0 )
Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr) 
/usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message: 
Transition 10: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-2220.bz2
Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop process 
(PID 14386) timed out (try 1).  Killing with signal SIGTERM (15).
Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr)
Session terminated, killing shell...
Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr)  ...killed.

Apparently the stop action timed out after 20 seconds. But why, if the
resource agent's metadata defines 100 seconds?
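
The 20-second figure comes from the lrmd timestamps in the log above: the
stop is issued at 14:31:35 and the process is killed at 14:31:55. A minimal
sketch of that arithmetic in Python, with both timestamps copied from the
log. The year is an assumption taken from the mail date, since the log
lines carry none, and the helper name elapsed_seconds is made up for
illustration:

```python
from datetime import datetime

# Timestamps copied from the lrmd lines in the log excerpt.
STOP_STARTED = "Jan 18 14:31:35"   # "rsc:oracle_primary:7: stop"
STOP_KILLED = "Jan 18 14:31:55"    # "timed out (try 1)"


def elapsed_seconds(start: str, end: str, year: int = 2010) -> float:
    """Seconds between two syslog-style timestamps in the same year.

    The year is an assumption: syslog timestamps do not include one.
    Month names are parsed with %b, so an English/C locale is assumed.
    """
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    print(elapsed_seconds(STOP_STARTED, STOP_KILLED))
```

The result is 20 seconds, well short of the 100 seconds listed for "stop"
in the metadata.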

With kind regards
Markus



