[Pacemaker] FS mount error
Proskurin Kirill
proskurin-kv at fxclub.org
Thu Jul 22 07:29:47 UTC 2010
Hello all.
I am really new to Pacemaker and am trying to run some tests and learn
how it all works. I am using the Clusters From Scratch PDF from
clusterlabs.org as a how-to.
What we have:
Debian Lenny 5.0.5 (with kernel 2.6.32-bpo.4-amd64 from backports)
pacemaker 1.0.8+hg15494-4~bpo50+1
openais 1.1.2-2~bpo50+1
Problem:
I am trying to add a filesystem mount resource, but it fails with an
"unknown error". If I mount the filesystem by hand, everything is OK.
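For reference, a manual mount with the same parameters as the WebFS primitive would look roughly like the command printed by the sketch below. This is only an illustration: the exact command that was run by hand is not shown in this message, the values are copied from the `crm configure show` output, and `manual_mount_cmd` is a hypothetical helper name, not part of Pacemaker or the Filesystem resource agent.

```shell
#!/bin/sh
# Sketch only: print the manual mount command built from the WebFS
# primitive's parameters. Nothing is mounted by this script; the printed
# command would have to be run as root on the node where DRBD is Primary.
device="/dev/drbd0"              # params device=
directory="/var/spool/dovecot"   # params directory=
fstype="ext4"                    # params fstype=

# Hypothetical helper, not part of Pacemaker or the Filesystem agent.
manual_mount_cmd() {
    printf 'mount -t %s %s %s\n' "$fstype" "$device" "$directory"
}

manual_mount_cmd
```

Printing the command rather than executing it keeps the sketch safe to run on any machine; it only makes explicit which device, mount point and filesystem type the cluster resource is configured with.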
crm_mon:
============
Last updated: Thu Jul 22 08:22:20 2010
Stack: openais
Current DC: node01.domain.org - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node02.domain.org node01.domain.org ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node02.domain.org
Master/Slave Set: WebData
Masters: [ node02.domain.org ]
Slaves: [ node01.domain.org ]
WebFS (ocf::heartbeat:Filesystem): Started node02.domain.org FAILED
Failed actions:
WebFS_start_0 (node=node01.domain.org, call=18, rc=1,
status=complete): unknown error
WebFS_start_0 (node=node02.domain.org, call=301, rc=1,
status=complete): unknown error
node01:~# crm_verify -VL
crm_verify[1482]: 2010/07/22_08:28:13 WARN: unpack_rsc_op: Processing
failed op WebFS_start_0 on node01.domain.org: unknown error (1)
crm_verify[1482]: 2010/07/22_08:28:13 WARN: unpack_rsc_op: Processing
failed op WebFS_start_0 on node02.domain.org: unknown error (1)
crm_verify[1482]: 2010/07/22_08:28:13 WARN: common_apply_stickiness:
Forcing WebFS away from node01.domain.org after 1000000 failures
(max=1000000)
node01:~# crm configure show
node node01.domain.org
node node02.domain.org
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.100" cidr_netmask="32" \
    op monitor interval="30s"
primitive WebFS ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/spool/dovecot" fstype="ext4" \
    op start interval="0" timeout="60s" \
    op stop interval="0" timeout="60s" \
    meta target-role="Started"
primitive WebSite ocf:heartbeat:apache \
    params configfile="/etc/apache2/apache2.conf" \
    op monitor interval="1min" \
    op start interval="0" timeout="40s" \
    op stop interval="0" timeout="60s" \
    meta target-role="Started"
primitive wwwdrbd ocf:linbit:drbd \
    params drbd_resource="drbd0" \
    op monitor interval="60s" \
    op start interval="0" timeout="240s" \
    op stop interval="0" timeout="100s"
ms WebData wwwdrbd \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebData:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebData:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
    dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    last-lrm-refresh="1279717510"
In logs:
Jul 22 08:18:39 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:39 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:39 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:39 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:40 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:40 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:40 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:40 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:41 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:41 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:41 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:41 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:42 node01 cibadmin: [1199]: info: Invoked: cibadmin -Ql -o
resources
Jul 22 08:18:42 node01 cibadmin: [1200]: info: Invoked: cibadmin -p -R
-o resources
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<cib admin_epoch="0" epoch="143" num_updates="2" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<configuration >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<resources >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<primitive id="WebFS" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<meta_attributes id="WebFS-meta_attributes" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
<nvpair value="Stopped" id="WebFS-meta_attributes-target-role" />
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
</meta_attributes>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
</primitive>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
</resources>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
</configuration>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: -
</cib>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<cib admin_epoch="0" epoch="144" num_updates="1" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<configuration >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<resources >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<primitive id="WebFS" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<meta_attributes id="WebFS-meta_attributes" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
<nvpair value="Started" id="WebFS-meta_attributes-target-role" />
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
</meta_attributes>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
</primitive>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
</resources>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
</configuration>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: +
</cib>
Jul 22 08:18:42 node01 cib: [1810]: info: cib_process_request: Operation
complete: op cib_replace for section resources (origin=local/cibadmin/2,
version=0.144.1): ok (rc=0)
Jul 22 08:18:42 node01 cib: [1201]: info: write_cib_contents: Archived
previous version as /var/lib/heartbeat/crm/cib-89.raw
Jul 22 08:18:42 node01 cib: [1201]: info: write_cib_contents: Wrote
version 0.144.0 of the CIB to disk (digest:
5f51a15c21330c7ff76862ad9a5193b1)
Jul 22 08:18:42 node01 cib: [1201]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.woPqNQ (digest:
/var/lib/heartbeat/crm/cib.bF43Zi)
Jul 22 08:18:42 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:42 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:42 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:42 node01 crmd: [1814]: info: abort_transition_graph:
need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Jul 22 08:18:42 node01 crmd: [1814]: info: need_abort: Aborting on
change to admin_epoch
Jul 22 08:18:42 node01 crmd: [1814]: info: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Jul 22 08:18:42 node01 crmd: [1814]: info: do_state_transition: All 2
cluster nodes are eligible to run resources.
Jul 22 08:18:42 node01 crmd: [1814]: info: do_pe_invoke: Query 350:
Requesting the current CIB: S_POLICY_ENGINE
Jul 22 08:18:42 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:43 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:43 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:43 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:43 node01 crmd: [1814]: info: do_pe_invoke_callback:
Invoking the PE: query=350, ref=pe_calc-dc-1279783123-729, seq=152,
quorate=1
Jul 22 08:18:43 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:43 node01 pengine: [1813]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Jul 22 08:18:43 node01 pengine: [1813]: info: determine_online_status:
Node node01.domain.org is online
Jul 22 08:18:43 node01 pengine: [1813]: notice: unpack_rsc_op: Operation
WebSite_monitor_0 found resource WebSite active on node01.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: unpack_rsc_op: Processing
failed op WebFS_start_0 on node01.domain.org: unknown error (1)
Jul 22 08:18:43 node01 pengine: [1813]: info: determine_online_status:
Node node02.domain.org is online
Jul 22 08:18:43 node01 pengine: [1813]: notice: unpack_rsc_op: Operation
WebSite_monitor_0 found resource WebSite active on node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: unpack_rsc_op: Processing
failed op WebFS_start_0 on node02.domain.org: unknown error (1)
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print:
ClusterIP#011(ocf::heartbeat:IPaddr2):#011Started node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print:
WebSite#011(ocf::heartbeat:apache):#011Stopped
Jul 22 08:18:43 node01 pengine: [1813]: notice: clone_print:
Master/Slave Set: WebData
Jul 22 08:18:43 node01 pengine: [1813]: notice: short_print:
Masters: [ node02.domain.org ]
Jul 22 08:18:43 node01 pengine: [1813]: notice: short_print:
Slaves: [ node01.domain.org ]
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print:
WebFS#011(ocf::heartbeat:Filesystem):#011Stopped
Jul 22 08:18:43 node01 pengine: [1813]: info: get_failcount: WebFS has
failed 1000000 times on node01.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: common_apply_stickiness:
Forcing WebFS away from node01.domain.org after 1000000 failures
(max=1000000)
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights:
WebData: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights:
wwwdrbd:0: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights:
WebData: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: Promoting
wwwdrbd:0 (Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: WebData:
Promoted 1 instances of a possible 1 to master
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: Promoting
wwwdrbd:0 (Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: WebData:
Promoted 1 instances of a possible 1 to master
Jul 22 08:18:43 node01 pengine: [1813]: notice: RecurringOp: Start
recurring monitor (60s) for WebSite on node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave
resource ClusterIP#011(Started node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Start
WebSite#011(node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave
resource wwwdrbd:0#011(Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave
resource wwwdrbd:1#011(Slave node01.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Start
WebFS#011(node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: process_pe_message:
Transition 199: PEngine Input stored in: /var/lib/pengine/pe-input-243.bz2
Jul 22 08:18:44 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:44 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:44 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:44 node01 crmd: [1814]: info: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]
Jul 22 08:18:44 node01 crmd: [1814]: info: unpack_graph: Unpacked
transition 199: 4 actions in 4 synapses
Jul 22 08:18:44 node01 crmd: [1814]: info: do_te_invoke: Processing
graph 199 (ref=pe_calc-dc-1279783123-729) derived from
/var/lib/pengine/pe-input-243.bz2
Jul 22 08:18:44 node01 crmd: [1814]: info: te_rsc_command: Initiating
action 42: start WebFS_start_0 on node02.domain.org
Jul 22 08:18:44 node01 crmd: [1814]: info: te_rsc_command: Initiating
action 5: probe_complete probe_complete on node02.domain.org - no waiting
Jul 22 08:18:44 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:45 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:45 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:45 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:45 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:46 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:46 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:46 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:46 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
Jul 22 08:18:47 node01 crmd: [1814]: ERROR: stonithd_signon: Can't
initiate connection to stonithd
Jul 22 08:18:47 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:47 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in
failed: triggered a retry
Jul 22 08:18:47 node01 crmd: [1814]: info: te_connect_stonith:
Attempting connection to fencing daemon...
--
Best regards,
Proskurin Kirill