[Pacemaker] Howto handle opt-in clusters WAS: Re: resource monitor operations on wrong nodes

Fri Apr 16 13:26:02 UTC 2010

Hi,

> > I have a non-symmetric cluster (symmetric-cluster="false") with four
> > nodes.
> We still check _every_ node to be sure the resources aren't already
> running there.

OK, that is reasonable, but I have trouble with the logic of the
messages: they are listed as failed actions, yet if the resource must
not run there in the first place, it is not a failure that the resource
is not installed. For the admins of such a cluster I would like to have
a clean status view, but you get several "false alarms" when you have
different resources which can't run on all nodes.
Will such a message also trigger an SNMP trap (or an email, if
configured)?
Or do I need to install dummy scripts on these nodes to avoid such
messages?
Or is this simply something that the crm shell should handle?
Sometimes resources also try to start on the wrong nodes and increase
the fail count there. Is this correct behavior too, or a
misconfiguration?
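
To make the "clean status view" idea concrete: one possible workaround
would be to post-process the status output and set aside the probe
results. Below is a minimal sketch in Python. It assumes the "Failed
actions" entries look exactly like the ones quoted further down in this
mail (one entry per line, not wrapped by the mail client). The names
split_failed_actions and FAILED_ACTION are made up for illustration;
nothing like this ships with Pacemaker or the crm shell. It treats an
entry as a harmless probe result when the operation is monitor with
interval 0 and rc=5 ("not installed"); everything else stays a failure.

```python
import re

# One "Failed actions" entry, in the form quoted in this thread, e.g.
#   resAPP_monitor_0 (node=ux-4, call=4, rc=5, status=complete): not installed
FAILED_ACTION = re.compile(
    r"^\s*(?P<rsc>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*rc=(?P<rc>\d+),"
    r"\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.*)$"
)

# rc=5 is what the output in this thread reports as "not installed".
RC_NOT_INSTALLED = 5


def split_failed_actions(lines):
    """Return (probe_not_installed, other_failures) as lists of dicts."""
    probes, others = [], []
    for line in lines:
        match = FAILED_ACTION.match(line)
        if match is None:
            continue  # e.g. the "Failed actions:" header line
        entry = match.groupdict()
        # A probe shows up as a monitor operation with interval 0.
        is_probe = entry["op"] == "monitor" and entry["interval"] == "0"
        if is_probe and int(entry["rc"]) == RC_NOT_INSTALLED:
            probes.append(entry)
        else:
            others.append(entry)
    return probes, others


if __name__ == "__main__":
    sample = [
        "Failed actions:",
        "    resAPP_monitor_0 (node=ux-4, call=4, rc=5, status=complete): not installed",
        "    resAPP_monitor_0 (node=ux-5, call=4, rc=5, status=complete): not installed",
    ]
    probes, others = split_failed_actions(sample)
    print("probe results set aside:", [p["node"] for p in probes])
    print("remaining failures:", len(others))
```

This only hides the entries from a custom view; it does not answer
whether the cluster itself should report them as failures.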

And what about the OCF scripts: do I need to copy them to all nodes, even
to those nodes where the application is not installed? I can only
configure the resources in the crm shell when it is invoked on a node
where the application is installed; otherwise I get:
 ERROR: lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a 
reply message of metadata with funct
ERROR: ocf:icw:ocfAPP: no such resource agent 
(I understand this error message: the shell obviously looks for the OCF
script on the local node.)

I know I can use cibadmin in this case, but it would be more
user-friendly to be able to administer a cluster centrally with the crm
shell.

I am not sure if I am going in the right direction, but I want to set up
a cluster of 6-8 nodes in which pairs of nodes (DRBD) run different
applications, and I need a status view for "normal" admins so they can
see at a glance whether everything is OK (or not).

Thanks in advance,
Martin

Andrew Beekhof <andrew at beekhof.net> wrote on 09.04.2010 13:08:07:

> On Fri, Apr 9, 2010 at 12:06 PM,  <martin.braun at icw.de> wrote:
> > Hi,
> >
> > I have a non-symmetric cluster (symmetric-cluster="false") with four
> > nodes.
> 
> We still check _every_ node to be sure the resources aren't already
> running there.
> 
> > On two nodes I have allowed a resource group:
> >
> > location grpFS-pref1 grpFS 200: wdf-ux-0040
> > location grpFS-pref2 grpFS 200: wdf-ux-0041
> >
> > grpFS is configured as:
> >
> > group grpFS resFS resVIP resAPP
> >
> > the other nodes are not mentioned in any location constraints for now.
> >
> > However I get this:
> >
> > <<
> > Failed actions:
> >    resAPP_monitor_0 (node=ux-4, call=4, rc=5, status=complete): not installed
> >    resAPP_monitor_0 (node=ux-5, call=4, rc=5, status=complete): not installed
> >>>
> >
> >
> > My question is: why does pacemaker try to monitor the app on the wrong
> > nodes? I would have thought that with an opt-in cluster this should
> > not happen.
> > Or do I have to use explicit loc constraints to avoid monitoring
> > the resource on the other nodes?
> >
> > Overall Config:
> >
> >
> > node ux-0 \
> >        attributes standby="off"
> > node ux-1 \
> >        attributes standby="off"
> > node ux-4
> > node ux-5
> > primitive resDRBD ocf:linbit:drbd \
> >        operations $id="resDRBD-operations" \
> >        op monitor interval="20" role="Slave" timeout="20" start-delay="1m" \
> >        op monitor interval="10" role="Master" timeout="20" start-delay="1m" \
> >        params drbd_resource="r0" drbdconf="/usr/local/etc/drbd.conf"
> > primitive resFS ocf:heartbeat:Filesystem \
> >        operations $id="resFS-operations" \
> >        op monitor interval="20" timeout="40" start-delay="0" \
> >        params device="/dev/drbd0" directory="/opt/icw" fstype="ext3"
> > primitive resAPP ocf:icw:ocfapp2 \
> >        operations $id="resapp-operations" \
> >        op start interval="0" timeout="3m" \
> >        op monitor interval="60s" timeout="30s" start-delay="3m" \
> >        params [....]
> >        meta target-role="Started" is-managed="true"
> > primitive resVIP ocf:heartbeat:IPaddr2 \
> >        params ip="192.168.210.91" cidr_netmask="24" nic="eth3" \
> >        operations $id="resVIP-operations" \
> >        op monitor interval="10s" timeout="20s" start-delay="2s" \
> >        meta target-role="Started"
> > group grpFS resFS resVIP resapp \
> >        meta target-role="started"
> > ms msDRBD resDRBD \
> >        meta clone-max="2" notify="true" target-role="started"
> > location grpFS-pref1 grpFS 200: wdf-ux-0040
> > location grpFS-pref2 grpFS 200: wdf-ux-0041
> > location master-pref1 msDRBD 200: wdf-ux-0040
> > location master-pref2 msDRBD 200: wdf-ux-0041
> > colocation colFSDRBD inf: grpFS msDRBD:Master
> > order orderFSDRBD : msDRBD:promote grpFS:start
> > property $id="cib-bootstrap-options" \
> >        dc-version="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7" \
> >        cluster-infrastructure="openais" \
> >        expected-quorum-votes="4" \
> >        no-quorum-policy="ignore" \
> >        stonith-enabled="false" \
> >        last-lrm-refresh="1270053177" \
> >        symmetric-cluster="false"
> >
> >
> > Thanks in advance,
> > Martin
> >
> >
> >
> >
> > _______________________________________________
> > Pacemaker mailing list
> > Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> >
> 
