[Pacemaker] Master won't get promoted
Andrew Beekhof
andrew at beekhof.net
Thu Sep 29 07:15:21 UTC 2011
Could you attach /var/lib/pengine/pe-input-3802.bz2 from staging1?
That would tell us why.
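
For anyone who wants to look inside such a file before attaching it, here is a minimal sketch in Python. It assumes the pe-input file is a bzip2-compressed copy of the CIB XML and that cluster-wide options are stored as nvpair elements under crm_config. The function name cluster_properties is made up for this illustration and is not part of Pacemaker.

```python
import bz2
import sys
import xml.etree.ElementTree as ET


def cluster_properties(path):
    """Return {name: value} for each nvpair found under <crm_config>.

    Assumption: `path` is a bzip2-compressed CIB XML document, such as a
    policy engine input file saved under /var/lib/pengine.
    """
    with bz2.open(path, "rb") as fh:
        root = ET.parse(fh).getroot()

    props = {}
    # Only cluster-wide options are collected; resource parameters live
    # under <resources> and are deliberately skipped.
    for crm_config in root.iter("crm_config"):
        for nvpair in crm_config.iter("nvpair"):
            props[nvpair.get("name")] = nvpair.get("value")
    return props


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python pe_props.py <pe-input-file.bz2>")
    for name, value in sorted(cluster_properties(sys.argv[1]).items()):
        print(f"{name} = {value}")
```

Run against the file named in the log, this only lists the cluster options the policy engine saw for that transition. It does not replay the transition or explain the allocation scores.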
On Mon, Sep 26, 2011 at 10:28 PM, Charles Richard
<chachi.richard at gmail.com> wrote:
> Hi,
>
> I'm finally making some headway with my Pacemaker install. crm_mon no
> longer returns errors and crm_verify is clean, but my master won't get
> promoted. I'm not sure what to do with this one. Any suggestions? Here
> are the log snippet and config files:
>
> Sep 26 04:06:12 staging1 crmd: [1686]: info: crm_timer_popped: PEngine
> Recheck Timer (I_PE_CALC) just popped!
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
> origin=crm_timer_popped ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: Progressed
> to state S_POLICY_ENGINE after C_TIMER_POPPED
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: All 2
> cluster nodes are eligible to run resources.
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_pe_invoke: Query 106:
> Requesting the current CIB: S_POLICY_ENGINE
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_pe_invoke_callback: Invoking
> the PE: query=106, ref=pe_calc-dc-1317020772-95, seq=2564, quorate=1
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_config: Startup
> probes: enabled
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_config: Node scores:
> 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_domains: Unpacking
> domains
> Sep 26 04:06:12 staging1 pengine: [1685]: info: determine_online_status:
> Node staging1.dev.applepeak.com is online
> Sep 26 04:06:12 staging1 pengine: [1685]: info: determine_online_status:
> Node staging2.dev.applepeak.com is online
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: group_print: Resource
> Group: mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print:
> fs_mysql#011(ocf::heartbeat:Filesystem):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print:
> ip_mysql#011(ocf::heartbeat:IPaddr2):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print:
> mysqld#011(lsb:mysqld):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: clone_print: Master/Slave
> Set: ms_drbd_mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: short_print: Stopped:
> [ drbd_mysql:0 drbd_mysql:1 ]
> Sep 26 04:06:12 staging1 pengine: [1685]: info: master_color: ms_drbd_mysql:
> Promoted 0 instances of a possible 1 to master
> Sep 26 04:06:12 staging1 pengine: [1685]: info: native_merge_weights:
> fs_mysql: Rolling back scores from ip_mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: info: native_merge_weights:
> ip_mysql: Rolling back scores from mysqld
> Sep 26 04:06:12 staging1 pengine: [1685]: info: master_color: ms_drbd_mysql:
> Promoted 0 instances of a possible 1 to master
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource
> fs_mysql#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource
> ip_mysql#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource
> mysqld#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource
> drbd_mysql:0#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource
> drbd_mysql:1#011(Stopped)
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State
> transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
> cause=C_IPC_MESSAGE origin=handle_response ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: unpack_graph: Unpacked
> transition 72: 0 actions in 0 synapses
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_te_invoke: Processing graph
> 72 (ref=pe_calc-dc-1317020772-95) derived from
> /var/lib/pengine/pe-input-3802.bz2
> Sep 26 04:06:12 staging1 crmd: [1686]: info: run_graph:
> ====================================================
> Sep 26 04:06:12 staging1 crmd: [1686]: notice: run_graph: Transition 72
> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pengine/pe-input-3802.bz2): Complete
> Sep 26 04:06:12 staging1 crmd: [1686]: info: te_graph_trigger: Transition 72
> is now complete
> Sep 26 04:06:12 staging1 crmd: [1686]: info: notify_crmd: Transition 72
> status: done - <null>
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State
> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
> cause=C_FSA_INTERNAL origin=notify_crmd ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: Starting
> PEngine Recheck Timer
> Sep 26 04:06:12 staging1 pengine: [1685]: info: process_pe_message:
> Transition 72: PEngine Input stored in: /var/lib/pengine/pe-input-3802.bz2
> Sep 26 04:15:09 staging1 cib: [1682]: info: cib_stats: Processed 1
> operations (0.00us average, 0% utilization) in the last 10min
>
> My drbd config file:
>
> resource mysqld {
>
> protocol C;
>
> startup { wfc-timeout 0; degr-wfc-timeout 120; }
>
> disk { on-io-error detach; }
>
>
> on staging1 {
>
> device /dev/drbd0;
>
> disk /dev/vg_staging1/lv_data;
>
> meta-disk internal;
>
> address 10.10.20.1:7788;
>
> }
>
> on staging2 {
>
> device /dev/drbd0;
>
> disk /dev/vg_staging2/lv_data;
>
> meta-disk internal;
>
> address 10.10.20.2:7788;
>
> }
>
> }
>
> corosync.conf:
>
> compatibility: whitetank
>
> aisexec {
> user: root
> group: root
> }
>
> totem {
> version: 2
> secauth: off
> threads: 0
> interface {
> ringnumber: 0
> bindnetaddr: 10.10.10.0
> mcastaddr: 226.94.1.1
> mcastport: 5405
> }
> }
>
> logging {
> fileline: off
> to_stderr: no
> to_logfile: no
> to_syslog: yes
> logfile: /var/log/cluster/corosync.log
> debug: off
> timestamp: on
> logger_subsys {
> subsys: AMF
> debug: off
> }
> }
>
> amf {
> mode: disabled
> }
>
> service {
> #Load Pacemaker
> name: pacemaker
> ver: 0
> use_mgmtd: yes
> }
>
> And my crm config:
>
> node staging1.dev.applepeak.com
> node staging2.dev.applepeak.com
> primitive drbd_mysql ocf:linbit:drbd \
> params drbd_resource="mysqld" \
> op monitor interval="15s" \
> op start interval="0" timeout="240s" \
> op stop interval="0" timeout="100s"
> primitive fs_mysql ocf:heartbeat:Filesystem \
> params device="/dev/drbd0" directory="/opt/data/mysql/data/mysql" fstype="ext4" \
> op start interval="0" timeout="60s" \
> op stop interval="0" timeout="60s"
> primitive ip_mysql ocf:heartbeat:IPaddr2 \
> params ip="10.10.10.31" nic="eth0"
> primitive mysqld lsb:mysqld
> group mysql fs_mysql ip_mysql mysqld
> ms ms_drbd_mysql drbd_mysql \
> meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
> order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
> property $id="cib-bootstrap-options" \
> dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> stonith-enabled="false" \
> last-lrm-refresh="1316961847" \
> stop-all-resources="true" \
> no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="100"
>
> Thanks,
> Charles