[Pacemaker] Resource is Too Active (on both nodes)

Mon Mar 25 19:16:33 EDT 2013

On 2013-03-22 21:35, Mohica Jasha wrote:
> Hey,
> 
> I have two cluster nodes.
> 
> I have a service process which is prone to crash and takes a very long
> time to start. 
> Since the service process takes a long time to start I have the service
> process running on both nodes, but only the active node with the virtual
> IP serves the incoming requests.
> 
> On both nodes, I have a cron job which periodically checks if the
> service process is up and if not it starts the service.
> 
> I want pacemaker to periodically check if the service is down on the
> active node and if so, it switches the virtual IP to the second node
> (without starting or stopping my service).
> 
> I have the following configuration:
> 
> primitive clusterIP ocf:heartbeat:IPaddr2 \
> params ip="10.0.1.247" \
> op monitor interval="10s" timeout="20s"
> 
> primitive serviceMonitoring ocf:serviceMonitoring:serviceMonitoring \
> op monitor interval="10s" timeout="20s"
> 
> colocation HACluster inf: serviceMonitoring clusterIP
> order serviceMonitoring-after-clusterIP inf: clusterIP serviceMonitoring
> 
> My serviceMonitoring resource doesn't do anything other than checking
> the state of the service process. I get the following in the log file:
> 
> Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
> monitor found resource serviceMonitoring active on ha2
> Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
> monitor found resource serviceMonitoring active on ha1
> Mar 05 15:07:59 [1543] ha1 pengine:    error: native_create_actions:
> Resource serviceMonitoring (ocf:: serviceMonitoring) is active on 2
> nodes attempting recovery
> Mar 05 15:07:59 [1543] ha1 pengine:  warning: native_create_actions: See
> http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
> 
> So it seems that pacemaker calls the monitor method of the
> serviceMonitoring resource on both nodes.

Yes, Pacemaker probes every resource on all nodes, which is why it finds
serviceMonitoring active on both. Clone your serviceMonitoring resource
and set it to unmanaged mode; that should give you the desired behavior.
Or simply clone it, let Pacemaker do the complete management, and drop
your cron-check-restart magic.
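
A minimal, untested sketch of the first option in crm shell syntax, reusing
the names from your configuration. The clone name serviceMonitoring-clone and
the two constraint names are placeholders made up for illustration, and
reversing the colocation so that clusterIP follows a healthy clone instance
is an assumption about what you want, not something taken from your setup:

```
# Sketch only: the clone and constraint names below are hypothetical.
primitive serviceMonitoring ocf:serviceMonitoring:serviceMonitoring \
        op monitor interval="10s" timeout="20s"

# One instance per node, so finding it active on both ha1 and ha2 is expected.
# is-managed="false": Pacemaker keeps monitoring it but does not start or stop it.
clone serviceMonitoring-clone serviceMonitoring \
        meta is-managed="false"

# Assumption: run the IP only on a node where the service check succeeds.
colocation clusterIP-with-service inf: clusterIP serviceMonitoring-clone
order service-before-clusterIP inf: serviceMonitoring-clone clusterIP
```

For the second option, the same clone without the is-managed line should be
the starting point, but then the resource agent needs working start and stop
actions and the cron job has to go.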

Regards,
Andreas

> 
> Any idea how I can fix this?
> 
> Thanks,
> Mohica

-- 
Need help with Pacemaker?
http://www.hastexo.com/now