[Pacemaker] default timeout for op start/stop

Fri Sep 24 16:12:05 UTC 2010

Hi,

On Fri, Sep 24, 2010 at 01:54:57PM +0200, Michael Schwartzkopff wrote:
> On Friday 24 September 2010 13:50:49 Pavlos Parissis wrote:
> > Hi,
> > 
> > When I verify my configuration I get complaints about the timeouts of
> > the start and stop operations:
> > crm(live)configure# verify
> > WARNING: drbd_01: default timeout 20s for start is smaller than the advised 240
> > WARNING: drbd_01: default timeout 20s for stop is smaller than the advised 100
> > WARNING: drbd_02: default timeout 20s for start is smaller than the advised 240
> > WARNING: drbd_02: default timeout 20s for stop is smaller than the advised 100
> > 
> > Since I don't explicitly set a timeout for these resources, I
> > thought the 20s was coming from the defaults.
> > So, I queried the defaults and got the following
> > [root@node-03 ~]# crm_attribute --type op_defaults --name timeout
> > scope=op_defaults  name=timeout value=(null)
> > 
> > So I am wondering where this 20s is coming from.
> > 
> > I had the same issue for IP and Filesystem type resources and in order to
> > get rid of the warning I specifically set it to be 60s.
> > 
> > Regards,
> > Pavlos
> > 
> > 
> > [root@node-03 ~]# crm configure show
> > node $id="b8ad13a6-8a6e-4304-a4a1-8f69fa735100" node-02
> > node $id="d5557037-cf8f-49b7-95f5-c264927a0c76" node-01
> > node $id="e5195d6b-ed14-4bb3-92d3-9105543f9251" node-03
> > primitive drbd_01 ocf:linbit:drbd \
> >         params drbd_resource="drbd_pbx_service_1" \
> >         op monitor interval="30s"
> > primitive drbd_02 ocf:linbit:drbd \
> >         params drbd_resource="drbd_pbx_service_2" \
> >         op monitor interval="30s"
> > primitive fs_01 ocf:heartbeat:Filesystem \
> >         params device="/dev/drbd1" directory="/pbx_service_01" fstype="ext3" \
> >         meta migration-threshold="3" failure-timeout="60" \
> >         op monitor interval="20s" timeout="40s" OCF_CHECK_LEVEL="20" \
> >         op start interval="0" timeout="60s" \
> >         op stop interval="0" timeout="60s"
> > primitive fs_02 ocf:heartbeat:Filesystem \
> >         params device="/dev/drbd2" directory="/pbx_service_02" fstype="ext3" \
> >         meta migration-threshold="3" failure-timeout="60" \
> >         op monitor interval="20s" timeout="40s" OCF_CHECK_LEVEL="20" \
> >         op start interval="0" timeout="60s" \
> >         op stop interval="0" timeout="60s"
> > primitive ip_01 ocf:heartbeat:IPaddr2 \
> >         params ip="10.10.10.10" cidr_netmask="25" broadcast="10.10.10.127" \
> >         meta failure-timeout="120" migration-threshold="3" \
> >         op monitor interval="5s"
> > primitive ip_02 ocf:heartbeat:IPaddr2 \
> >         params ip="10.10.10.11" cidr_netmask="25" broadcast="10.10.10.127" \
> >         op monitor interval="5s"
> > primitive pbx_01 ocf:heartbeat:Dummy \
> >         params state="/pbx_service_01/Dummy.state" \
> >         meta failure-timeout="60" migration-threshold="3" \
> >         op monitor interval="20s" timeout="40s"
> > primitive pbx_02 ocf:heartbeat:Dummy \
> >         params state="/pbx_service_02/Dummy.state" \
> >         meta failure-timeout="60" migration-threshold="3"
> > group pbx_service_01 ip_01 fs_01 pbx_01 \
> >         meta target-role="Started"
> > group pbx_service_02 ip_02 fs_02 pbx_02 \
> >         meta target-role="Started"
> > ms ms-drbd_01 drbd_01 \
> >         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> > ms ms-drbd_02 drbd_02 \
> >         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
> > location PrimaryNode-drbd_01 ms-drbd_01 100: node-01
> > location PrimaryNode-drbd_02 ms-drbd_02 100: node-02
> > location PrimaryNode-pbx_service_01 pbx_service_01 200: node-01
> > location PrimaryNode-pbx_service_02 pbx_service_02 200: node-02
> > location SecondaryNode-drbd_01 ms-drbd_01 0: node-03
> > location SecondaryNode-drbd_02 ms-drbd_02 0: node-03
> > location SecondaryNode-pbx_service_01 pbx_service_01 10: node-03
> > location SecondaryNode-pbx_service_02 pbx_service_02 10: node-03
> > colocation fs_01-on-drbd_01 inf: fs_01 ms-drbd_01:Master
> > colocation fs_02-on-drbd_02 inf: fs_02 ms-drbd_02:Master
> > colocation pbx_01-with-fs_01 inf: pbx_01 fs_01
> > colocation pbx_01-with-ip_01 inf: pbx_01 ip_01
> > colocation pbx_02-with-fs_02 inf: pbx_02 fs_02
> > colocation pbx_02-with-ip_02 inf: pbx_02 ip_02
> > order fs_01-after-drbd_01 inf: ms-drbd_01:promote fs_01:start
> > order fs_02-after-drbd_02 inf: ms-drbd_02:promote fs_02:start
> > order pbx_01-after-fs_01 inf: fs_01 pbx_01
> > order pbx_01-after-ip_01 inf: ip_01 pbx_01
> > order pbx_02-after-fs_02 inf: fs_02 pbx_02
> > order pbx_02-after-ip_02 inf: ip_02 pbx_02
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
> >         cluster-infrastructure="Heartbeat" \
> >         stonith-enabled="false" \
> >         symmetric-cluster="false" \
> >         last-lrm-refresh="1285323745"
> > rsc_defaults $id="rsc-options" \
> >         resource-stickiness="1000"
> 
> The default timeout is coded into the resource agent. You can safely ignore the
> WARNINGs. They have also been removed from more recent versions of Pacemaker.

These warnings shouldn't be ignored. The defaults coded into the
RA are what the author of the RA advises as a minimum. The CRM
does not, however, use these values automatically, so they need
to be specified in the configuration. The resources should then
be tested thoroughly to see whether the timeouts are meaningful
in the given environment.
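
As a minimal sketch of what specifying them in the configuration
could look like for drbd_01: the start and stop values below are
simply the advised minimums from the verify warnings, taken as a
starting point and not as tested values. The op_defaults entry is an
optional, cluster-wide alternative; its id and value are assumptions
for illustration, modelled on the rsc_defaults entry in the quoted
configuration. As far as I know, the 20s itself is Pacemaker's
built-in default operation timeout, which applies when neither the
operation nor op_defaults sets one.

```text
# Per-resource: set explicit timeouts on the start and stop operations
primitive drbd_01 ocf:linbit:drbd \
        params drbd_resource="drbd_pbx_service_1" \
        op monitor interval="30s" \
        op start interval="0" timeout="240s" \
        op stop interval="0" timeout="100s"

# Alternative (assumed id and value): a default for every operation
# that does not set its own timeout
op_defaults $id="op-options" \
        timeout="240s"
```

The same pair of op lines would be needed for drbd_02. Either way,
verify should then stop warning about these operations, but the
values still need testing under real load.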

Thanks,

Dejan

> 
> -- 
> Dr. Michael Schwartzkopff
> Guardinistr. 63
> 81375 München
> 
> Tel: (0163) 172 50 98