[Pacemaker] Breaking dependency loop && stonith

Tue Nov 26 08:26:31 EST 2013

Hi, ALL.

I would like to clarify two more questions.
First: after a node is rebooted by stonith, it hangs in the "pending" state.
I found these lines in the logs:

    info: rsc_merge_weights:    pgsql:1: Breaking dependency loop at msPostgresql
    info: rsc_merge_weights:    pgsql:2: Breaking dependency loop at msPostgresql

Does this mean that the dependency search stopped because there are no more dependencies to follow?
Or was the search interrupted because it would otherwise loop forever?

Second question.
Do I still need to clone the stonith resource (in PCMK 1.1.11)?
On the one hand, this command shows the resource on all nodes:
# cibadmin -Q|grep stonith
        <nvpair name="stonith-enabled" value="true" id="cib-bootstrap-options-stonith-enabled"/>
      <primitive id="st1" class="stonith" type="external/sshbykey">
          <lrm_resource id="st1" type="external/sshbykey" class="stonith">
          <lrm_resource id="st1" type="external/sshbykey" class="stonith">
          <lrm_resource id="st1" type="external/sshbykey" class="stonith">
(one entry per node, except for the pending node)
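
The grep output above does not show which node each lrm_resource line belongs to. A rough sketch for checking that, assuming `cibadmin -Q` prints one XML element per line and that each `node_state` element carries a `uname` attribute. The function name `nodes_with_lrm_entry` is made up for illustration and is not part of Pacemaker.

```shell
# Sketch: print the uname of every node_state that contains an
# lrm_resource entry with the given id. Reads CIB XML on stdin.
# Assumes one XML element per line (hypothetical helper, not a Pacemaker tool).
nodes_with_lrm_entry() {
    rsc="$1"
    awk -v rsc="$rsc" '
        /<node_state / {
            # remember the uname attribute of the enclosing node_state
            node = ""
            if (match($0, /uname="[^"]*"/))
                node = substr($0, RSTART + 7, RLENGTH - 8)
        }
        # an lrm_resource line for the requested id: report its node
        /<lrm_resource / && index($0, "id=\"" rsc "\"") { print node }
    '
}
```

Usage would be something like `cibadmin -Q | nodes_with_lrm_entry st1`, which should list one node name per matching entry.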

On the other hand, another command shows only one instance, running on one particular node:
# crm_verify -LVVVV
    info: main:         =#=#=#=#= Getting XML =#=#=#=#=
    info: main:         Reading XML from: live cluster
    info: validate_with_relaxng:        Creating RNG parser context
    info: determine_online_status_fencing:      Node dev-cluster2-node4 is active
    info: determine_online_status:      Node dev-cluster2-node4 is online
    info: determine_online_status_fencing:      - Node dev-cluster2-node1 is not ready to run resources
    info: determine_online_status_fencing:      Node dev-cluster2-node2 is active
    info: determine_online_status:      Node dev-cluster2-node2 is online
    info: determine_online_status_fencing:      Node dev-cluster2-node3 is active
    info: determine_online_status:      Node dev-cluster2-node3 is online
    info: determine_op_status:  Operation monitor found resource pingCheck:0 active on dev-cluster2-node4
    info: native_print:         VirtualIP       (ocf::heartbeat:IPaddr2):       Started dev-cluster2-node4
    info: clone_print:   Master/Slave Set: msPostgresql [pgsql]
    info: short_print:       Masters: [ dev-cluster2-node4 ]
    info: short_print:       Slaves: [ dev-cluster2-node2 dev-cluster2-node3 ]
    info: short_print:       Stopped: [ dev-cluster2-node1 ]
    info: clone_print:   Clone Set: clnPingCheck [pingCheck]
    info: short_print:       Started: [ dev-cluster2-node2 dev-cluster2-node3 dev-cluster2-node4 ]
    info: short_print:       Stopped: [ dev-cluster2-node1 ]
    info: native_print:         st1     (stonith:external/sshbykey):    Started dev-cluster2-node4
    info: native_color:         Resource pingCheck:3 cannot run anywhere
    info: native_color:         Resource pgsql:3 cannot run anywhere
    info: rsc_merge_weights:    pgsql:1: Breaking dependency loop at msPostgresql
    info: rsc_merge_weights:    pgsql:2: Breaking dependency loop at msPostgresql
    info: master_color:         Promoting pgsql:0 (Master dev-cluster2-node4)
    info: master_color:         msPostgresql: Promoted 1 instances of a possible 1 to master
    info: LogActions:   Leave   VirtualIP       (Started dev-cluster2-node4)
    info: LogActions:   Leave   pgsql:0 (Master dev-cluster2-node4)
    info: LogActions:   Leave   pgsql:1 (Slave dev-cluster2-node2)
    info: LogActions:   Leave   pgsql:2 (Slave dev-cluster2-node3)
    info: LogActions:   Leave   pgsql:3 (Stopped)
    info: LogActions:   Leave   pingCheck:0     (Started dev-cluster2-node4)
    info: LogActions:   Leave   pingCheck:1     (Started dev-cluster2-node2)
    info: LogActions:   Leave   pingCheck:2     (Started dev-cluster2-node3)
    info: LogActions:   Leave   pingCheck:3     (Stopped)
    info: LogActions:   Leave   st1     (Started dev-cluster2-node4)

However, if I do clone the resource, I get the same mess.