[Pacemaker] long time to start

Schaefer, Diane E diane.schaefer at unisys.com
Fri Apr 16 19:28:26 UTC 2010


Hi,
  I have a resource (my own OCF agent) that can sometimes take 10 minutes to start after a failure, because log records need to be synced. I noticed that while its start action is in progress, if other resources in my cluster report "not running", no restart is attempted until the long-running start returns. Meanwhile, crm_mon reports those resources as "started" even though they are not running, and they may stay down for many minutes.

  Is the lrm process single-threaded? Would running my resource's start action asynchronously be a better strategy? I am concerned that other critical resources will not be restarted if they fail while the slow resource is starting. Also, is the resource state (started, not running, or failed) determined by the result of start rather than monitor?
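
To make the async idea concrete, here is a rough sketch of what I have in mind, written in Python only for illustration. The function names, the state directory, and the "syncing"/"ready" marker files are all hypothetical and not taken from my actual agent; only the exit codes 0 (OCF_SUCCESS) and 7 (OCF_NOT_RUNNING) are the standard OCF values. It also assumes that monitor may report success while the sync is still in flight, and whether the cluster accepts a start that returns before the sync completes is part of my question.

```python
import os
import subprocess
import sys

# Standard OCF exit codes.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def start_async(state_dir, sync_cmd):
    """Launch the long log sync in a detached process and return at once."""
    os.makedirs(state_dir, exist_ok=True)
    # Hypothetical convention: an empty "syncing" file marks a sync in flight.
    open(os.path.join(state_dir, "syncing"), "w").close()
    # start_new_session detaches the child so it can outlive the start action.
    subprocess.Popen(sync_cmd, start_new_session=True)
    return OCF_SUCCESS


def monitor(state_dir):
    """Map the marker files to an OCF exit code."""
    syncing = os.path.exists(os.path.join(state_dir, "syncing"))
    ready = os.path.exists(os.path.join(state_dir, "ready"))
    # Assumption: a resource that is still syncing is reported as running.
    if syncing or ready:
        return OCF_SUCCESS
    return OCF_NOT_RUNNING
```

In this sketch the sync command itself would be responsible for creating the "ready" marker when it finishes, so that the start action never blocks on the 10-minute sync.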

Thanks for any information.
Diane Schaefer