[Pacemaker] unable to move resource in pacemaker 1.0.8

Piotr Jewiec piotr at jewiec.net
Wed Dec 19 03:53:00 EST 2012


Hi,

I am unable to move the NFS resource group to the second node of my
cluster. Some time ago, crmd on the node that NFS currently runs on
jammed (it had used up all file descriptors) and was killed with kill -9.
The current status is:

============
Last updated: Wed Dec 19 03:39:59 2012
Stack: openais
Current DC: filer-1 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
12 Resources configured.
============

Online: [ filer-2 filer-1 ]

  ip_10.66.16.59 (ocf::heartbeat:IPaddr2):       Started filer-1
  r_dhcpd        (ocf::arces:lsbwrapper):        Started filer-1
  r_rsyncd       (ocf::heartbeat:rsyncd):        Started filer-1
  r_tftpd        (ocf::arces:lsbwrapper):        Started filer-1
  Resource Group: grp_activemq
      r_fs_activemq      (ocf::arces:Filesystem):        Started filer-2
      ip_10.66.16.47     (ocf::heartbeat:IPaddr2):       Started filer-2
      r_activemq (lsb:arces-activemq):   Started filer-2
  Resource Group: grp_iscsi
      r_fs_iscsi-files   (ocf::heartbeat:Filesystem):    Started filer-1
      ip_10.66.16.3      (ocf::heartbeat:IPaddr2):       Started filer-1
      r_ietd     (ocf::arces:lsbwrapper):        Started filer-1
  Resource Group: grp_mysql
      r_fs_mysql (ocf::arces:Filesystem):        Started filer-1
      ip_10.66.16.15     (ocf::heartbeat:IPaddr2):       Started filer-1
      r_mysql    (ocf::heartbeat:mysql): Started filer-1
  Master/Slave Set: ms_drbd_activemq
      Masters: [ filer-2 ]
      Slaves: [ filer-1 ]
  Master/Slave Set: ms_drbd_iscsi-files
      Masters: [ filer-1 ]
      Slaves: [ filer-2 ]
  Master/Slave Set: ms_drbd_mysql
      Masters: [ filer-1 ]
      Slaves: [ filer-2 ]
  Master/Slave Set: ms_drbd_nfs
      Masters: [ filer-1 ]
      Slaves: [ filer-2 ]
  Resource Group: grp_nfsserver
      r_fs_nfs   (ocf::heartbeat:Filesystem):    Started filer-1
      ip_10.66.16.53     (ocf::heartbeat:IPaddr2):       Started filer-1
      r_statd    (lsb:statd):    Started filer-1
      r_portmap  (lsb:portmap):  Started filer-1
      r_nfs-kernel-server        (lsb:nfs-kernel-server):        Started filer-1


and here's the cluster config:

node filer-1 \
         attributes standby="off"
node filer-2 \
         attributes standby="off"
primitive ip_10.66.16.15 ocf:heartbeat:IPaddr2 \
         op monitor interval="10s" timeout="10s" \
         params ip="10.66.16.15"
primitive ip_10.66.16.3 ocf:heartbeat:IPaddr2 \
         op monitor interval="10s" timeout="20s" \
         params ip="10.66.16.3"
primitive ip_10.66.16.47 ocf:heartbeat:IPaddr2 \
         op monitor interval="10s" \
         params ip="10.66.16.47"
primitive ip_10.66.16.53 ocf:heartbeat:IPaddr2 \
         op monitor interval="10s" \
         params ip="10.66.16.53" \
         meta target-role="Started"
primitive ip_10.66.16.59 ocf:heartbeat:IPaddr2 \
         op monitor interval="60s" timeout="10s" \
         params ip="10.66.16.59" \
         meta is-managed="true"
primitive r_activemq lsb:arces-activemq \
         op monitor interval="15s" \
         meta target-role="Started" is-managed="true"
primitive r_dhcpd ocf:arces:lsbwrapper \
         op monitor interval="60s" timeout="30s" \
         params initscript="/etc/init.d/dhcp3-server" startstatus="is running" stopstatus="not running" \
         meta is-managed="true" target-role="Started"
primitive r_drbd_activemq ocf:linbit:drbd \
         params drbd_resource="activemq" \
         op monitor interval="20s" role="Slave" timeout="20s" \
         op monitor interval="10s" role="Master" timeout="20s" \
         op start interval="0" timeout="240s" \
         op stop interval="0" timeout="100s"
primitive r_drbd_iscsi-files ocf:linbit:drbd \
         params drbd_resource="iscsi-files" \
         op monitor interval="20s" role="Slave" timeout="20s" \
         op monitor interval="10s" role="Master" timeout="20s" \
         op start interval="0" timeout="240s" \
         op stop interval="0" timeout="100s"
primitive r_drbd_mysql ocf:linbit:drbd \
         params drbd_resource="mysql" \
         op monitor interval="20s" role="Slave" timeout="20s" \
         op monitor interval="10s" role="Master" timeout="20s" \
         op start interval="0" timeout="240s" \
         op stop interval="0" timeout="100s"
primitive r_drbd_nfs ocf:linbit:drbd \
         params drbd_resource="nfs" \
         op monitor interval="20s" role="Slave" timeout="20s" \
         op monitor interval="10s" role="Master" timeout="20s" \
         op start interval="0" timeout="240s" \
         op stop interval="0" timeout="100s"
primitive r_fs_activemq ocf:arces:Filesystem \
         op monitor interval="20s" timeout="40s" \
         op start interval="0" timeout="60s" \
         op stop interval="0" timeout="60s" \
         params device="/dev/drbd5" fstype="reiserfs" directory="/usr/share/activemq/data" \
         meta target-role="Started"
primitive r_fs_iscsi-files ocf:heartbeat:Filesystem \
         op monitor interval="20s" timeout="40s" \
         op start interval="0" timeout="30s" \
         op stop interval="0" timeout="30s" \
         params device="/dev/drbd2" directory="/export/iscsi" fstype="ext3"
primitive r_fs_mysql ocf:arces:Filesystem \
         op monitor interval="20s" timeout="40s" \
         op start interval="0" timeout="60s" \
         op stop interval="0" timeout="60s" \
         params device="/dev/drbd4" directory="/var/lib/mysql" fstype="reiserfs"
primitive r_fs_nfs ocf:heartbeat:Filesystem \
         op monitor interval="20s" timeout="40s" \
         op start interval="0" timeout="30s" \
         op stop interval="0" timeout="30s" \
         params device="/dev/drbd3" directory="/export/nfs" fstype="ext4"
primitive r_ietd ocf:arces:lsbwrapper \
         op monitor interval="15s" timeout="15s" \
         op start interval="0" timeout="60s" \
         op stop interval="0" timeout="60s" \
         params initscript="/etc/init.d/iscsitarget" stopstatus="not access" startstatus="is running" \
         meta is-managed="true" target-role="Started"
primitive r_mysql ocf:heartbeat:mysql \
         op monitor interval="10s" timeout="30s" \
         op start interval="0" timeout="120s" \
         op stop interval="0" timeout="120s" \
         params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" log="/var/log/mysql/mysqld.log" pid="/var/lib/mysql/mysqld.pid" socket="/var/lib/mysql/mysqld.sock" enable_creation="1" additional_parameters="--skip-external-locking" \
         meta target-role="Started"
primitive r_nfs-kernel-server lsb:nfs-kernel-server \
         op monitor interval="10s" timeout="30s"
primitive r_portmap lsb:portmap \
         op monitor interval="10s" timeout="30s"
primitive r_rsyncd ocf:heartbeat:rsyncd \
         op monitor interval="60s" timeout="30s" \
         meta is-managed="true" target-role="Started"
primitive r_statd lsb:statd \
         op monitor interval="10s" timeout="30s"
primitive r_tftpd ocf:arces:lsbwrapper \
         op monitor interval="120s" timeout="60s" \
         params initscript="/etc/init.d/tftpd-hpa" \
         meta is-managed="true" target-role="Started"
group grp_activemq r_fs_activemq ip_10.66.16.47 r_activemq \
         meta is-managed="true"
group grp_iscsi r_fs_iscsi-files ip_10.66.16.3 r_ietd \
         meta is-managed="true"
group grp_mysql r_fs_mysql ip_10.66.16.15 r_mysql \
         meta is-managed="true"
group grp_nfsserver r_fs_nfs ip_10.66.16.53 r_statd r_portmap r_nfs-kernel-server \
         meta target-role="Started"
ms ms_drbd_activemq r_drbd_activemq \
         meta interleave="true" notify="true" is-managed="true" target-role="Started"
ms ms_drbd_iscsi-files r_drbd_iscsi-files \
         meta interleave="true" notify="true" is-managed="true" target-role="Master"
ms ms_drbd_mysql r_drbd_mysql \
         meta interleave="true" notify="true" is-managed="true"
ms ms_drbd_nfs r_drbd_nfs \
         meta interleave="true" notify="true" is-managed="true" target-role="Master"
location loc_crmstatus ip_10.66.16.59 \
         rule $id="loc_crmstatus-rule" inf: #isdc eq true
colocation col_activemq_with_drbd inf: grp_activemq ms_drbd_activemq:Master
colocation col_icsi_with_drbd inf: grp_iscsi ms_drbd_iscsi-files:Master
colocation col_mysql_with_drbd inf: grp_mysql ms_drbd_mysql:Master
colocation col_nfsserver_with_drbd inf: grp_nfsserver ms_drbd_nfs:Master
colocation col_other_services inf: ( r_rsyncd r_dhcpd r_tftpd ) ip_10.66.16.53
order o_drbd_activemq inf: ms_drbd_activemq:promote grp_activemq:start
order o_drbd_iscsi inf: ms_drbd_iscsi-files:promote grp_iscsi:start
order o_drbd_mysql inf: ms_drbd_mysql:promote grp_mysql:start
order o_drbd_nfs inf: ms_drbd_nfs:promote grp_nfsserver:start
order o_other_services inf: ip_10.66.16.53:start ( r_rsyncd:start r_dhcpd:start r_tftpd:start )
property $id="cib-bootstrap-options" \
         dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         no-quorum-policy="ignore" \
         stonith-enabled="false" \
         default-resource-stickiness="1000" \
         last-lrm-refresh="1355234343"

and here are the messages logged after issuing # crm resource move grp_nfsserver filer-2

Dec 12 02:10:03 filer-1 crm_resource: [13179]: info: Invoked: crm_resource -M -r grp_nfsserver --node=filer-2

<removed cib diff messages>

Dec 12 02:10:03 filer-1 crmd: [17705]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change

Dec 12 02:10:03 filer-1 crmd: [17705]: info: need_abort: Aborting on change to admin_epoch

Dec 12 02:10:03 filer-1 crmd: [17705]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]

Dec 12 02:10:03 filer-1 crmd: [17705]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.

Dec 12 02:10:03 filer-1 crmd: [17705]: info: do_pe_invoke: Query 698: Requesting the current CIB: S_POLICY_ENGINE

Dec 12 02:10:03 filer-1 cib: [2966]: info: log_data_element: cib:diff: +   </configuration>

Dec 12 02:10:03 filer-1 cib: [2966]: info: log_data_element: cib:diff: + </cib>
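
For reference, crm_resource -M with a node name normally records the
request as a location constraint whose id starts with cli-prefer-
(cli-standby- when no node is given). Below is a minimal sketch of
filtering the configuration for such constraints. The function name
list_cli_constraints is made up for illustration and is not part of
Pacemaker; it only reads text on stdin and assumes the "crm configure
show" layout used above, where each constraint starts at column 0.

```shell
#!/bin/sh
# Hypothetical helper (not a Pacemaker command): reads the output of
# "crm configure show" on stdin and prints the first line of every
# location constraint whose id starts with cli-prefer- or cli-standby-.
list_cli_constraints() {
    # "|| true" keeps the exit status 0 when nothing matches.
    grep -E '^location cli-(prefer|standby)-' || true
}

# On a cluster node this would be used as:
#   crm configure show | list_cli_constraints
```

If nothing is printed, the move request did not leave a constraint in
the configuration; if a line is printed, the constraint is there and the
cause is more likely in how the scores add up.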

Is there anything in this configuration that could prevent
grp_nfsserver from moving to the second node? Am I missing something?

Regards

-- 
Piotr Jewiec



