[Pacemaker] Should monitor operations be stopped after a resource is unmanaged?

Ron Kerry rkerry at sgi.com
Fri Apr 1 14:36:28 UTC 2011


Folks -

Consider a running cluster with all resources managed. We want to stop and quickly restart a 
particular resource without impacting other resources. The software stack running on the system can 
deal with this sort of temporary outage. We perform the following actions:
   * unmanage the resource
   * stop the resource
   * start the resource
   * manage the resource
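
For reference, the four steps above can be sketched as a command sequence. This is only an illustration under stated assumptions: it assumes the crm shell is used for the unmanage and manage steps (`crm resource unmanage` / `crm resource manage`), and the stop and start commands are hypothetical placeholders, since the post does not say how the resource is stopped and started by hand. The function only prints the plan and does not run anything.

```shell
#!/bin/sh
# Sketch only: prints the command sequence for the four steps above.
# Assumptions (not from the original post): the crm shell handles
# unmanage/manage, and the stop/start commands are placeholders for
# whatever is used to stop and start the service manually.

restart_plan() {
    rsc="$1"        # resource id, e.g. TMF
    stop_cmd="$2"   # hypothetical manual stop command
    start_cmd="$3"  # hypothetical manual start command
    echo "crm resource unmanage $rsc"
    echo "$stop_cmd"
    echo "$start_cmd"
    echo "crm resource manage $rsc"
}

# Print the plan for review; the two paths below are made up.
restart_plan TMF "/path/to/tmf-stop" "/path/to/tmf-start"
```

Printing the sequence rather than executing it keeps the sketch safe to run on a live node. Note that it does nothing about the recurring monitor operation, which is the open question here.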

This procedure sometimes succeeds. At other times, however, we get a resource monitor failure after
stopping the resource. Evidently the monitor operation was not stopped (at least not immediately) by
unmanaging the resource.

Will the monitor operation get stopped when a resource is unmanaged?
If so, how long will it take for this to occur?
What determines this length of time?
Is there a better way to do a quick restart of a resource without impacting other resources?
(In our case the resource is a member of a resource group)

Example:
  Resource Group: dmfGroup
      CXFS	(ocf::sgi:cxfs):	Started genesis
      VirtualIP	(ocf::heartbeat:IPaddr2):	Started genesis
      TMF	(ocf::sgi:tmf):	Started genesis (unmanaged)
      DMF	(ocf::sgi:dmf):	Started genesis
      DMFMAN	(ocf::sgi:dmfman):	Started genesis
      DMFSOAP	(ocf::sgi:dmfsoap):	Started genesis

Log messages ...
genesis:~ # tail -f /var/log/messages | grep TMF
Apr  1 09:26:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:27:09 genesis root: unmanage TMF
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: +         <primitive id="TMF" >
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: +           <meta_attributes id="TMF-meta_attributes" >
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: +             <nvpair id="TMF-meta_attributes-is-managed" name="is-managed" value="false" __crm_diff_marker__="added:top" />
Apr  1 09:27:09 genesis pengine: [5743]: info: native_add_running: resource TMF isnt managed
Apr  1 09:27:09 genesis pengine: [5743]: notice: native_print:      TMF	(ocf::sgi:tmf):	Started genesis (unmanaged)
Apr  1 09:27:09 genesis pengine: [5743]: info: native_color: Unmanaged resource TMF allocated to genesis: active
Apr  1 09:27:09 genesis pengine: [5743]: notice: LogActions: Leave resource TMF	(Started unmanaged)
Apr  1 09:27:12 genesis crm_resource: [5219]: info: native_add_running: resource TMF isnt managed
Apr  1 09:27:12 genesis crm_resource: [5222]: info: native_add_running: resource TMF isnt managed
Apr  1 09:28:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:30:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:30:32 genesis crm_resource: [6428]: info: native_add_running: resource TMF isnt managed
Apr  1 09:30:32 genesis crm_resource: [6431]: info: native_add_running: resource TMF isnt managed
Apr  1 09:32:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor

-- 

Ron Kerry         rkerry at sgi.com
Global Product Support - SGI Federal



