<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body smarttemplateinserted="true" bgcolor="#FFFFFF" text="#000000">
<div id="smartTemplate4-quoteHeader">
<div><b>From: </b>Andrew Beekhof <a class="moz-txt-link-rfc2396E" href="mailto:andrew@beekhof.net"><andrew@beekhof.net></a></div>
<div><b>Sent: </b> 2014-06-10 02:25:09 EDT </div>
<div><b>To: </b>The Pacemaker cluster resource manager
<a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@oss.clusterlabs.org"><pacemaker@oss.clusterlabs.org></a></div>
<div><b>Subject: </b>Re: [Pacemaker] resources not rebalancing</div>
<br>
</div>
<blockquote
cite="mid:3B0E4D72-9C5B-4A12-812F-D2F2A2E6D5AB@beekhof.net"
type="cite">
<pre wrap="">
On 5 Jun 2014, at 10:38 am, Patrick Hemmer <a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@feystorm.net"><pacemaker@feystorm.net></a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">From: Andrew Beekhof <a class="moz-txt-link-rfc2396E" href="mailto:andrew@beekhof.net"><andrew@beekhof.net></a>
Sent: 2014-06-04 20:15:22 EDT
To: The Pacemaker cluster resource manager <a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@oss.clusterlabs.org"><pacemaker@oss.clusterlabs.org></a>
Subject: Re: [Pacemaker] resources not rebalancing
</pre>
<blockquote type="cite">
<pre wrap="">On 5 Jun 2014, at 12:57 am, Patrick Hemmer <a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@feystorm.net"><pacemaker@feystorm.net></a>
wrote:
</pre>
<blockquote type="cite">
<pre wrap="">From: Andrew Beekhof <a class="moz-txt-link-rfc2396E" href="mailto:andrew@beekhof.net"><andrew@beekhof.net></a>
Sent: 2014-06-04 04:15:48 E
To: The Pacemaker cluster resource manager
<a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@oss.clusterlabs.org"><pacemaker@oss.clusterlabs.org></a>
Subject: Re: [Pacemaker] resources not rebalancing
</pre>
<blockquote type="cite">
<pre wrap="">On 4 Jun 2014, at 4:22 pm, Patrick Hemmer <a class="moz-txt-link-rfc2396E" href="mailto:pacemaker@feystorm.net"><pacemaker@feystorm.net></a>
wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Testing some different scenarios: after bringing a node back online, none of the resources move to it unless they are restarted. However, default-resource-stickiness is set to 0, so they should be able to move around freely.
# pcs status
Cluster name: docker
Last updated: Wed Jun 4 06:09:26 2014
Last change: Wed Jun 4 06:08:40 2014 via cibadmin on i-093f1f55
Stack: corosync
Current DC: i-083f1f54 (3) - partition with quorum
Version: 1.1.11-1.fc20-9d39a6b
3 Nodes configured
8 Resources configured
Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
Full list of resources:
dummy2 (ocf::pacemaker:Dummy): Started i-083f1f54
Clone Set: dummy1-clone [dummy1] (unique)
dummy1:0 (ocf::pacemaker:Dummy): Started i-083f1f54
dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
dummy1:2 (ocf::pacemaker:Dummy): Started i-093f1f55
dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
# pcs resource show --all
Resource: dummy2 (class=ocf provider=pacemaker type=Dummy)
Clone: dummy1-clone
Meta Attrs: clone-max=5 clone-node-max=5 globally-unique=true
Resource: dummy1 (class=ocf provider=pacemaker type=Dummy)
# pcs property show --all | grep default-resource-stickiness
default-resource-stickiness: 0
Notice how i-053f1f59 isn't running anything. I feel like I'm missing something obvious, but it escapes me.
</pre>
</blockquote>
<pre wrap="">Clones are ever so slightly sticky by default; try setting resource-stickiness=0 for the clone resource
(and unset it once everything has moved back)
</pre>
</blockquote>
<pre wrap="">Thanks, that did indeed fix it. But how come dummy2 didn't move? It's not a clone, yet it didn't move either.
</pre>
</blockquote>
<pre wrap="">Do you have a location constraint that says it should prefer i-053f1f59?
</pre>
</blockquote>
<pre wrap="">No location constraint.
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">And now a separate follow-up question: the resources didn't balance as they should. I've got several utilization attributes set, and the resources aren't balanced according to the placement-strategy.
# pcs property show placement-strategy
Cluster Properties:
placement-strategy: balanced
# crm_simulate -URL
Current cluster status:
Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
dummy2 (ocf::pacemaker:Dummy): Started i-053f1f59
Clone Set: dummy1-clone [dummy1] (unique)
dummy1:0 (ocf::pacemaker:Dummy): Started i-053f1f59
dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
dummy1:2 (ocf::pacemaker:Dummy): Started i-083f1f54
dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
Utilization information:
Original: i-053f1f59 capacity: cpu=5000000 mem=3840332000
Original: i-083f1f54 capacity: cpu=5000000 mem=3840332000
Original: i-093f1f55 capacity: cpu=5000000 mem=3840332000
calculate_utilization: dummy2 utilization on i-053f1f59: cpu=10000
calculate_utilization: dummy1:2 utilization on i-083f1f54: cpu=1000
calculate_utilization: dummy1:1 utilization on i-093f1f55: cpu=1000
calculate_utilization: dummy1:0 utilization on i-053f1f59: cpu=1000
calculate_utilization: dummy1:3 utilization on i-083f1f54: cpu=1000
calculate_utilization: dummy1:4 utilization on i-093f1f55: cpu=1000
Remaining: i-053f1f59 capacity: cpu=4989000 mem=3840332000
Remaining: i-083f1f54 capacity: cpu=4998000 mem=3840332000
Remaining: i-093f1f55 capacity: cpu=4998000 mem=3840332000
The "balanced" strategy is defined as: "the node that has more free capacity gets consumed first".
Notice that dummy2 consumes cpu=10000, while dummy1 is only 1000 (10x less). After dummy2 was placed on i-053f1f59, that should have consumed enough "cpu" resource to keep dummy1 off it and on the other 2 nodes, but dummy1:0 got placed on the node.
</pre>
</blockquote>
<pre wrap="">But i-053f1f59 still has orders of magnitude more cpu capacity left to run things.
</pre>
</blockquote>
<pre wrap="">
I don't follow. They're all equal in terms of total "cpu" capacity.
</pre>
</blockquote>
<pre wrap="">
Right. But each node still has 4998000+ units with which to accommodate something that only requires 10000.
That's about 0.2% of the remaining capacity, so wherever it starts, it's hardly making a dent.</pre>
</blockquote>
You're thinking of the 'utilization' placement strategy. With
'utilization', resources are distributed among nodes so that each node
has enough capacity to run its resources, and the number of resources
per node is evenly balanced. I'm using the 'balanced' strategy, which
is supposed to distribute resources so that the amount of free capacity
on each node is evenly balanced.<br>
<br>
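To make the distinction concrete, here is a toy Python sketch (my own illustration, not Pacemaker's actual code) of how the 'balanced' strategy ought to behave with the numbers from the crm_simulate output above:

```python
# Toy sketch of the 'balanced' placement strategy (hypothetical, not
# Pacemaker code): each placement goes to the node with the most free
# capacity, using the cpu numbers from the crm_simulate output above.

nodes = {
    "i-053f1f59": {"capacity": 5_000_000, "used": 0},
    "i-083f1f54": {"capacity": 5_000_000, "used": 0},
    "i-093f1f55": {"capacity": 5_000_000, "used": 0},
}

def place_balanced(nodes, cost):
    """Place a resource on the node with the most free capacity."""
    name = max(nodes, key=lambda n: nodes[n]["capacity"] - nodes[n]["used"])
    nodes[name]["used"] += cost
    return name

# dummy2 consumes cpu=10000; each dummy1 clone instance consumes cpu=1000
first = place_balanced(nodes, 10_000)
# the node that took dummy2 now has the LEAST free capacity, so the
# next placement should prefer one of the other two nodes
second = place_balanced(nodes, 1_000)
assert second != first
```

Under this reading, dummy1:0 landing on the same node as dummy2 is exactly what 'balanced' should avoid.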
<blockquote
cite="mid:3B0E4D72-9C5B-4A12-812F-D2F2A2E6D5AB@beekhof.net"
type="cite">
<blockquote type="cite">
<pre wrap=""> And at the bottom of the simulate output, the "Remaining" even shows i-053f1f59 has less remaining than the other nodes.
However, after playing with it some more, this appears to be an issue with clones. When I created 5 separate resources instead, it works as expected: the dummy2 resource gets put on a node by itself, and the other resources get distributed among the remaining nodes (at least until the "cpu" used balances out).
Since this smells like a bug, I can enter it on the bug tracker you mention below.
</pre>
</blockquote>
<pre wrap="">
It's probably a result of clone stickiness (they have a default of 1) and the hoops we have to jump through to avoid them needlessly shuffling around.</pre>
</blockquote>
This was mentioned earlier in the email thread. You advised me to
explicitly set the stickiness to 0, which I did.<br>
<br>
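For illustration, here is a toy model (my own sketch, not Pacemaker's real scoring, and with deliberately simplified units) of how a nonzero stickiness bonus can keep a resource pinned even when another node has more free capacity:

```python
# Toy model (not Pacemaker's actual scoring): a node's score is its free
# capacity plus a stickiness bonus for the node currently running the
# resource. Units are simplified for illustration only.

def choose_node(free_by_node, current, stickiness):
    """Pick the highest-scoring node for a resource currently on `current`."""
    def score(node):
        s = free_by_node[node]
        if node == current:
            s += stickiness
        return s
    return max(free_by_node, key=score)

# free capacities taken from the "Remaining" lines above
free = {"i-053f1f59": 4_989_000, "i-083f1f54": 4_998_000}

# with stickiness 0, the resource is free to move to the less-loaded node
assert choose_node(free, "i-053f1f59", 0) == "i-083f1f54"
# a large enough stickiness bonus keeps it where it already runs
assert choose_node(free, "i-053f1f59", 10_000) == "i-053f1f59"
```

The point being: if some stickiness is still being applied despite being set to 0, the balanced scores could be silently overridden.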
<blockquote
cite="mid:3B0E4D72-9C5B-4A12-812F-D2F2A2E6D5AB@beekhof.net"
type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">Also how difficult is it to add a strategy?
</pre>
</blockquote>
<pre wrap="">It might be challenging; the policy engine is deep voodoo :)
Can you create an entry at bugs.clusterlabs.org and include the result of 'cibadmin -Q' when the cluster is in the state you describe above?
It won't make it into 1.1.12, but we can look at it for .13
</pre>
</blockquote>
<pre wrap="">
Will ponder possible scenarios and then enter it. Another thought that occurred to me: you might want to balance based on the percentage of capacity used. So now you've got balancing on the amount of capacity used, balancing on the amount of capacity free, and balancing on the percentage of capacity used. All three are probably similar enough in logic that the same algorithm could handle them; it would just need a way to tune that algorithm (this is my guess anyway, with no clue what the code looks like).
</pre>
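The three variants could be sketched as scoring functions (again my own toy sketch, not Pacemaker code); with equal capacities they rank nodes identically, but they diverge once capacities differ:

```python
# Toy scoring functions for the three balancing ideas above (hypothetical
# sketch, not Pacemaker code). A higher score means a more preferred node.

def score_free_absolute(capacity, used):
    return capacity - used                # most absolute free capacity

def score_used_absolute(capacity, used):
    return -used                          # least absolute capacity used

def score_free_percent(capacity, used):
    return (capacity - used) / capacity   # most free capacity, proportionally

big, small = (10_000, 2_000), (1_000, 100)   # (capacity, used) pairs

# absolute-free always prefers the big node, even though it is 20% consumed
assert score_free_absolute(*big) > score_free_absolute(*small)
# percent-free prefers the small node, which is only 10% consumed
assert score_free_percent(*big) < score_free_percent(*small)
# least-used prefers the small node as well
assert score_used_absolute(*big) < score_used_absolute(*small)
```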
<blockquote type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<pre wrap="">I'd be interested in having a strategy which places a resource on the node with the least amount of capacity used, kind of the inverse of "balanced". The docs say "balanced" looks at how much capacity is free. The two strategies would be equivalent if all nodes had the same capacity, but if one node has 10x the capacity of the others, I want the resources to be distributed evenly (based on the capacity each uses), and not over-utilize that one node.
Thanks
-Patrick
</pre>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<br>
</body>
</html>