[Pacemaker] Differences in scoring between similar resources

Bobbie Lind blind at sms-fed.com
Mon Aug 22 13:46:53 CET 2011


I have 4 servers running 4 dummy resources for colocation purposes.  2 of
the 4 resources score properly and 2 do not, which is causing me a
significant headache.

anchorOSS1 and anchorOSS2 are showing equal scores on both of their nodes
(not what I want), while anchorOSS3 and anchorOSS4 are showing the proper
scores for their nodes.

Here is the relevant part of my configuration:

location locOSS1primary anchorOSS1 500: node1
location locOSS1secondary anchorOSS1 250: node3
location locOSS2primary anchorOSS2 500: node2
location locOSS2secondary anchorOSS2 250: node4

location locOSS3primary anchorOSS3 500: node3
location locOSS3secondary anchorOSS3 250: node1
location locOSS4primary anchorOSS4 500: node4
location locOSS4secondary anchorOSS4 250: node2
.
.
colocation colocOSS1OSS2 -inf: anchorOSS2 anchorOSS1
colocation colocOSS1OSS4 -inf: anchorOSS4 anchorOSS1
colocation colocOSS1group 300: ( resOST0000 resOST0004 resOST0008 ) anchorOSS1
colocation colocOSS2OSS3 -inf: anchorOSS3 anchorOSS2
colocation colocOSS2group 300: ( resOST0001 resOST0005 resOST0009 ) anchorOSS2
colocation colocOSS3OSS4 -inf: anchorOSS4 anchorOSS3
colocation colocOSS3group 300: ( resOST0002 resOST0006 resOST000a ) anchorOSS3
colocation colocOSS4group 300: ( resOST0003 resOST0007 resOST000b ) anchorOSS4


Here are the anchor resources' results from ptest -Ls after first starting up
corosync:

native_color: anchorOSS1 allocation score on node1: 750
native_color: anchorOSS1 allocation score on node3: 750
native_color: anchorOSS2 allocation score on node2: 750
native_color: anchorOSS2 allocation score on node4: 750

native_color: anchorOSS3 allocation score on node3: 500
native_color: anchorOSS3 allocation score on node1: 250
native_color: anchorOSS4 allocation score on node4: 500
native_color: anchorOSS4 allocation score on node2: 250

With these scores, anchorOSS3 migrates and unmigrates properly using "crm
resource migrate anchorOSS3" and "crm resource unmigrate anchorOSS3", same
with anchorOSS4.

However, migrating anchorOSS1 and anchorOSS2 results in a proper migration,
but unmigrating them doesn't produce the desired result (moving back to the
original server), presumably because of the same score on each node; see the
sequence below.
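
To make the test concrete, the failing sequence is essentially this (same
form as the anchorOSS3 commands above, with no explicit target node given):

crm resource migrate anchorOSS1     # anchorOSS1 moves off node1 as expected
crm resource unmigrate anchorOSS1   # the move constraint is removed, but
                                    # anchorOSS1 stays put instead of
                                    # returning to node1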

The configuration is "opt-in", with access explicitly denied unless granted.
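
For reference, "opt-in" here just means the usual cluster property, roughly
like this in the crm shell (the rest of my cluster options are omitted):

property symmetric-cluster="false"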

I am not sure where the 750 score is even coming from.  It looks almost as
if the primary and secondary scores were added together.
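
For what it's worth, the arithmetic that makes me suspect this, using the
location scores above:

    500 (locOSS1primary on node1) + 250 (locOSS1secondary on node3) = 750

which is exactly what ptest reports for anchorOSS1 on both node1 and node3,
and likewise for anchorOSS2.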

Some pertinent information for my setup:

I am running Red Hat 5.6
Pacemaker 1.0.11
Corosync 1.2.8
OpenAIS 1.1.4

I'm not seeing any errors in the logs but I do see the following:

Aug 18 20:17:50 node1 pengine: [12483]: debug: unpack_rsc_op:
anchorOSS1_monitor_0 on node1 returned 0 (ok) instead of the expected value:
7 (not running)
Aug 18 20:17:50 node1 pengine: [12483]: notice: unpack_rsc_op: Operation
anchorOSS1_monitor_0 found resource anchorOSS1 active on node1
Aug 18 20:17:50 node1 pengine: [12483]: debug: unpack_rsc_op:
anchorOSS2_monitor_0 on node2 returned 0 (ok) instead of the expected value:
7 (not running)
Aug 18 20:17:50 node1 pengine: [12483]: notice: unpack_rsc_op: Operation
anchorOSS2_monitor_0 found resource anchorOSS2 active on node2
Aug 18 20:17:50 node1 pengine: [12483]: debug: unpack_rsc_op:
anchorOSS4_monitor_0 on node4 returned 0 (ok) instead of the expected value:
7 (not running)
Aug 18 20:17:50 node1 pengine: [12483]: notice: unpack_rsc_op: Operation
anchorOSS4_monitor_0 found resource anchorOSS4 active on node4
Aug 18 20:17:50 node1 pengine: [12483]: debug: unpack_rsc_op:
anchorOSS3_monitor_0 on node3 returned 0 (ok) instead of the expected value:
7 (not running)
Aug 18 20:17:50 node1 pengine: [12483]: notice: unpack_rsc_op: Operation
anchorOSS3_monitor_0 found resource anchorOSS3 active on node3
Aug 18 20:17:50 node1 pengine: [12483]: notice: native_print: anchorOSS1
(ocf::heartbeat:Dummy):    Started node1
Aug 18 20:17:50 node1 pengine: [12483]: notice: native_print: anchorOSS2
(ocf::heartbeat:Dummy):    Started node4
Aug 18 20:17:50 node1 pengine: [12483]: notice: native_print: anchorOSS3
(ocf::heartbeat:Dummy):    Started node3
Aug 18 20:17:50 node1 pengine: [12483]: notice: native_print: anchorOSS4
(ocf::heartbeat:Dummy):    Started node4
...
Aug 18 20:17:51 node1 pengine: [12483]: info: rsc_merge_weights: resMDTLVM:
Rolling back scores from anchorOSS2
Aug 18 20:17:51 node1 pengine: [12483]: info: rsc_merge_weights: resMDTLVM:
Rolling back scores from anchorOSS4
Aug 18 20:17:51 node1 pengine: [12483]: info: rsc_merge_weights: resMDTLVM:
Rolling back scores from anchorOSS3
Aug 18 20:17:51 node1 pengine: [12483]: info: rsc_merge_weights: resMDTLVM:
Rolling back scores from anchorOSS4

This shows that it sees anchorOSS2 on node2 but starts it on node4.  I
believe this is because of the score values mentioned above, but I'm clueless
as to why.

I have tried to understand the "Colocation Explained" documentation, but I
can't quite wrap my head around it for my situation, since nothing I have in
a colocation constraint is a dependent resource.

I appreciate any help in setting me straight.


Bobbie Lind
Systems Engineer
*Solutions Made Simple, Inc (SMSi)*
703-296-3087 (Cell)
blind at sms-fed.com