[Pacemaker] Colocating with unmanaged resource
Andrew Beekhof
andrew at beekhof.net
Tue Jan 6 05:27:45 UTC 2015
> On 20 Dec 2014, at 6:21 am, Покотиленко Костик <casper at meteor.dp.ua> wrote:
>
> Hi,
>
> Simple scenario: several floating IPs should live on "front" nodes
> only while Nginx is working there. There are several reasons against
> Nginx being controlled by Pacemaker.
>
> So, decided to colocate FIPs with unmanaged Nginx. This worked fine in
> 1.1.6 with some exceptions.
>
> Later, on another cluster, I decided to switch to 1.1.10 and corosync 2
> because of performance improvements. Now I am also testing 1.1.12.
>
> It seems I can't reliably colocate FIPs with unmanaged Nginx on 1.1.10
> and 1.1.12.
>
> Here are behaviors of different versions of pacemaker:
>
> 1.1.6, 1.1.10, 1.1.12:
>
> - if Nginx is started on a node after the initial probe for the Nginx
> clone, Pacemaker will never see it running until a cleanup or some
> other probe trigger
you'll want a recurring monitor with role=Stopped
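For example, something like this in crm shell syntax (a sketch; the 10s interval is illustrative and just needs to be distinct from the regular monitor's interval):

```
primitive Nginx lsb:nginx \
        op monitor interval="2s" \
        op monitor interval="10s" role="Stopped"
```

With a role="Stopped" monitor, Pacemaker periodically re-probes nodes where it believes the resource is stopped, so an out-of-band Nginx start is eventually noticed without a manual cleanup.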
>
> 1.1.6:
>
> - stopping nginx on a node makes the clone instance FAIL for that node,
> and the FIP moves away from that node. This is as expected
> - starting nginx removes the FAIL state and the FIP moves back. This is
> as expected
>
> 1.1.10:
>
> - stopping nginx on a node:
> - usually makes the clone instance FAIL for that node, but the
> FIP stays running on that node regardless of the INF colocation
> - sometimes makes the clone instance FAIL for that node, and
> immediately afterwards the clone instance returns to the STARTED state;
> the FIP stays running on that node
> - sometimes makes the clone instance STOPPED for that node, and the
> FIP moves away from that node. This is as expected
> - starting nginx:
> - if it was FAIL: removes the FAIL state; the FIP remains running
> - if it was STARTED:
> - usually nothing happens: the FIP remains running
> - sometimes makes the clone instance FAIL for that node, but the
> FIP stays running on that node regardless of the INF colocation
> - if it was STOPPED: moves the FIP back. This is as expected
>
> 1.1.12:
>
> - stopping nginx on a node always makes the clone instance FAIL for
> that node, but the FIP stays running on that node regardless of the INF
> colocation
can you attach a crm_report of the above test please?
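Something along these lines should do it on one of the cluster nodes (the timestamp and destination are illustrative):

```shell
# Collect logs, CIB and cluster state from all nodes,
# starting shortly before the test was run.
crm_report -f "2015-01-06 00:00:00" /tmp/nginx-colocation-test
```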
> - starting nginx removes the FAIL state; the FIP remains running
>
> Please comment on this. Also, some questions:
>
> - are unmanaged resources meant to be used, under normal conditions,
> as colocation targets for other resources? What is the right way to
> set them up?
> - is there some kind of "recurring probe" to "see" unmanaged resources
> that were started after the initial probe?
>
> Let me know if more logs are needed; right now I can't collect logs for
> all cases, but some are attached.
>
> Config for 1.1.10 (similar configs for 1.1.6 and 1.1.12):
>
> node $id="..." pcmk10-1 \
> attributes onhv="1" front="true"
> node $id="..." pcmk10-2 \
> attributes onhv="2" front="true"
> node $id="..." pcmk10-3 \
> attributes onhv="3" front="true"
>
> primitive FIP_1 ocf:heartbeat:IPaddr2 \
> op monitor interval="2s" \
> params ip="10.1.1.1" cidr_netmask="16" \
> meta migration-threshold="2" failure-timeout="60s"
> primitive FIP_2 ocf:heartbeat:IPaddr2 \
> op monitor interval="2s" \
> params ip="10.1.2.1" cidr_netmask="16" \
> meta migration-threshold="2" failure-timeout="60s"
> primitive FIP_3 ocf:heartbeat:IPaddr2 \
> op monitor interval="2s" \
> params ip="10.1.3.1" cidr_netmask="16" \
> meta migration-threshold="2" failure-timeout="60s"
>
> primitive Nginx lsb:nginx \
> op start interval="0" enabled="false" \
> op stop interval="0" enabled="false" \
> op monitor interval="2s"
>
> clone cl_Nginx Nginx \
> meta globally-unique="false" notify="false" is-managed="false"
>
> location loc-cl_Nginx cl_Nginx \
> rule $id="loc-cl_Nginx-r1" 500: front eq true
>
> location loc-FIP_1 FIP_1 \
> rule $id="loc-FIP_1-r1" 500: onhv eq 1 and front eq true \
> rule $id="loc-FIP_1-r2" 200: defined onhv and onhv ne 1 and
> front eq true
> location loc-FIP_2 FIP_2 \
> rule $id="loc-FIP_2-r1" 500: onhv eq 2 and front eq true \
> rule $id="loc-FIP_2-r2" 200: defined onhv and onhv ne 2 and
> front eq true
> location loc-FIP_3 FIP_3 \
> rule $id="loc-FIP_3-r1" 500: onhv eq 3 and front eq true \
> rule $id="loc-FIP_3-r2" 200: defined onhv and onhv ne 3 and
> front eq true
>
> colocation coloc-FIP_1-cl_Nginx inf: FIP_1 cl_Nginx
> colocation coloc-FIP_2-cl_Nginx inf: FIP_2 cl_Nginx
> colocation coloc-FIP_3-cl_Nginx inf: FIP_3 cl_Nginx
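With a configuration like the above, one way to check whether those INF colocation scores are actually taking effect after stopping nginx is to dump the allocation scores from the live CIB (a sketch; must be run on a cluster node):

```shell
# -L: use the live cluster state, -s: show resource allocation scores.
# A FIP colocated with a failed/stopped clone instance should score
# -INFINITY on that node.
crm_simulate -L -s | grep FIP
```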
>
> property $id="cib-bootstrap-options" \
> dc-version="1.1.10-42f2063" \
> cluster-infrastructure="corosync" \
> symmetric-cluster="false" \
> stonith-enabled="false" \
> no-quorum-policy="stop" \
> cluster-recheck-interval="10s" \
> maintenance-mode="false" \
> last-lrm-refresh="1418998945"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="30"
> op_defaults $id="op_defaults-options" \
> record-pending="false"
>
> <1.1.10_fail-started.log><1.1.10_stopped-started.log>_______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org