[Pacemaker] Getting Started

Fri Dec 7 07:34:15 UTC 2012

Hi Brett

2012/12/5 Brett Maton <brett.maton at googlemail.com>:
> Ok, almost there :)
>
>   I'm  having some trouble with VIPs either not starting or starting on the wrong node (so something isn't right :)).
>
> Lab04 should be the master (vipMaster), lab05 slave (vipSlave)
>
> (Postgres is up and running as a replication slave on lab05, although it's being reported as stopped...)
>
> Output from crm_mon -Af
>
> Last updated: Wed Dec  5 09:35:58 2012
> Last change: Wed Dec  5 09:35:57 2012 via crm_attribute on lab04
> Stack: openais
> Current DC: lab04 - partition with quorum
> Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
> 2 Nodes configured, 2 expected votes
> 6 Resources configured.
> ============
>
> Online: [ lab05 lab04 ]
>
>  Master/Slave Set: msPostgreSQL [pgsql]
>      Masters: [ lab04 ]
>      Stopped: [ pgsql:1 ]
> vipSlave        (ocf::heartbeat:IPaddr2):       Started lab04
>  Clone Set: clnPingCheck [pingCheck]
>      Started: [ lab04 ]
>      Stopped: [ pingCheck:1 ]
> vipMaster       (ocf::heartbeat:IPaddr2):       Started lab04
>
> Node Attributes:
> * Node lab05:
>     + master-pgsql:0                    : -INFINITY
>     + master-pgsql:1                    : 100
>     + pgsql-data-status                 : STREAMING|SYNC
>     + pgsql-status                      : STOP
> * Node lab04:
>     + master-pgsql:0                    : 1000
>     + pgsql-data-status                 : LATEST
>     + pgsql-master-baseline             : 000000000A000200
>     + pgsql-status                      : PRI
>     + pingNodes                         : 200
>
> Migration summary:
> * Node lab04:
> * Node lab05:
>

This doesn't look like a normal state, because pgsql:1 is stopped while
pgsql-data-status is "STREAMING|SYNC".
Did you start PostgreSQL on lab05 manually?
If so, that confuses the RA.

>  How do I migrate vipSalve to node lab05?

If you use my sample configuration, that's impossible
because of this constraint:

----
location rsc_location-1 vip-slave \
    rule  200: pgsql-status eq "HS:sync" \
    rule  100: pgsql-status eq "PRI" \
    rule  -inf: not_defined pgsql-status \
    rule  -inf: pgsql-status ne "HS:sync" and pgsql-status ne "PRI"
----

This means that vip-slave can't run on a node unless that node's pgsql-status is "HS:sync" or "PRI".
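
To make the effect of those rules concrete, here is a rough Python sketch of how they combine. It is only an illustration, not Pacemaker code: the function names are hypothetical, and it assumes a simplified model in which the scores of all matching rules are added and -inf overrides everything else.

```python
import math

NEG_INF = -math.inf  # stands in for Pacemaker's -INFINITY score


def vip_slave_score(pgsql_status):
    """Score the vip-slave location rules give a node (simplified model).

    pgsql_status is the node's pgsql-status attribute, or None if the
    attribute is not defined on that node.
    """
    # rule -inf: not_defined pgsql-status
    if pgsql_status is None:
        return NEG_INF

    score = 0
    # rule 200: pgsql-status eq "HS:sync"
    if pgsql_status == "HS:sync":
        score += 200
    # rule 100: pgsql-status eq "PRI"
    if pgsql_status == "PRI":
        score += 100
    # rule -inf: pgsql-status ne "HS:sync" and pgsql-status ne "PRI"
    if pgsql_status != "HS:sync" and pgsql_status != "PRI":
        score = NEG_INF
    return score


def can_run(pgsql_status):
    """A node with a -inf score can never host the resource."""
    return vip_slave_score(pgsql_status) != NEG_INF


if __name__ == "__main__":
    for status in ("HS:sync", "PRI", "STOP", None):
        print(status, vip_slave_score(status), can_run(status))
```

In your output lab05 has pgsql-status "STOP", so in this model its score is -inf and the VIP cannot be placed there, whatever you ask `crm resource migrate` to do.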

>   I've tried
>   # crm resource migrate vipSlave lab05
>
> I did find this in the corosync log
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:       Operation monitor found resource vipMaster active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:       Operation monitor found resource pgsql:0 active in master mode on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:       Operation monitor found resource vipSlave active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:       Operation monitor found resource pingCheck:0 active on lab04
> Dec 05 09:35:58 [2064] lab04    pengine:   notice: unpack_rsc_op:       Operation monitor found resource pgsql:1 active on lab05
> Dec 05 09:35:58 [2064] lab04    pengine:  warning: common_apply_stickiness:     Forcing clnPingCheck away from lab05 after 1 failures (max=1)
> Dec 05 09:35:58 [2064] lab04    pengine:  warning: common_apply_stickiness:     Forcing clnPingCheck away from lab05 after 1 failures (max=1)
>
> If it helps, pingCheck config:
>
> primitive pingCheck ocf:pacemaker:ping \
>         params \
>                 name="pingNodes" \
>                 host_list="192.168.0.12 192.168.0.13" \
>                 multiplier="100" \
>         op start interval="0" timeout="60s" on-fail="restart" \
>         op monitor interval="10" timeout="60s" on-fail="restart" \
>         op stop interval="0" timeout="60s" on-fail="ignore"
>
> Thanks again,
> Brett
>

Thanks,
Takatoshi MATSUO