[Pacemaker] Migration Fails to stop iSCSI Target
Patrick Zwahlen
paz at navixia.com
Sat Jan 4 11:53:16 UTC 2014
> -----Original Message-----
> From: Terry Miller [mailto:tmiller at tmsolutionsgroup.com]
> Sent: Saturday, 4 January 2014 01:51
> To: pacemaker at oss.clusterlabs.org
> Subject: Re: [Pacemaker] Migration Fails to stop iSCSI Target
>
> Any suggestion on what the problem might be?
We have successfully built a 2-node HA cluster using DRBD and the SCST
iSCSI target. However, we had major issues deleting targets, and we came to
the conclusion that we should only handle LUNs (one per target) with
pacemaker and leave target management to the OS at boot time.
We basically do the following:
- Boot the OS and create "disabled" targets with their respective portals.
- Use pacemaker to manage DRBD and groups of {SCST LUNs and IP addresses}.
- When starting an SCST LUN, we attach it to the existing target and then
enable the target. When stopping the LUN, we first disable the target, then
close all iSCSI sessions and finally remove the LUN from the target. This
leaves a disabled target behind, ready for later reuse.
We wrote the attached resource agent for that; a simplified sketch of its
core logic follows below.
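To give an idea, here is a minimal sketch of the start/stop logic, assuming
the SCST sysfs interface (the exact sysfs paths can differ between SCST
versions, all names below are placeholders, and the attached agent adds the
full OCF plumbing: meta-data, monitor, validation, timeouts, error handling):

#!/bin/bash
# Simplified sketch only. The target (with its portal) is assumed to
# already exist in a disabled state, created at boot outside of pacemaker.
SCST_ISCSI="/sys/kernel/scst_tgt/targets/iscsi"
IQN="iqn.2013-10.com.example:data01"   # placeholder: pre-created, disabled target
DEV="disk_data01"                      # placeholder: SCST device backing the LUN
LUN=0

lun_start() {
    # Attach the LUN to the existing target, then enable the target.
    echo "add $DEV $LUN" > "$SCST_ISCSI/$IQN/luns/mgmt" || return 1
    echo 1 > "$SCST_ISCSI/$IQN/enabled"
}

lun_stop() {
    # Disable the target first so initiators cannot reconnect,
    echo 0 > "$SCST_ISCSI/$IQN/enabled"
    # force-close any remaining iSCSI sessions,
    for s in "$SCST_ISCSI/$IQN"/sessions/*/force_close; do
        [ -e "$s" ] && echo 1 > "$s"
    done
    # and finally remove the LUN. The disabled target stays behind,
    # ready to be reused on the next start.
    echo "del $LUN" > "$SCST_ISCSI/$IQN/luns/mgmt"
}

case "$1" in
    start) lun_start ;;
    stop)  lun_stop ;;
esac

As described above, the resulting LUN resource is then simply grouped with
its IP address and colocated with / ordered after the corresponding DRBD
master; the target itself never appears as a cluster resource.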
We are now doing some tests with LIO as well, but have nothing to share so
far (too experimental).
As you can see, attaching multiple LUNs to a single target is problematic
with this scheme, because you have to disable the target when removing one
LUN, which effectively kills the other LUNs as well.
Regards.
> I'm trying to set up a 2-node HA iSCSI cluster but I am having some
> problems with the failover. I want both nodes to be active, each serving
> half of the storage via its own IP/target/LUN. It works until I set up an
> initiator and test failover. If an initiator (Proxmox) is using the
> target/LUN, then when I migrate the lun/target/drbd resources to the other
> node, I get inconsistent results.
>
> Sometimes it works fine, but it usually fails. It seems to fail when
> stopping the target after stopping the LUN. At that point the node is
> fenced (rebooted). After fencing, sometimes the resources migrate to the
> surviving node successfully, sometimes the cluster waits until the fenced
> node comes back online and then moves the resources, and sometimes it just
> remains in the failed state. Sometimes, after the fenced node boots back
> up, it tries to move the resources back to itself; this fails and the two
> nodes keep fencing each other until there is some manual intervention.
>
> Any assistance in getting this set up correctly would be appreciated.
> Relevant details below.
>
> Setup would look like:
> On Storage1001A
> Lun:vol01 - Target:data01 - IP1/IP11 - LV:lv_vol01 - VG:vg_data01 - DRBD:data01
>
> On Storage1001B
> Lun:vol01 - Target:data02 - IP2/IP22 - LV:lv_vol01 - VG:vg_data02 - DRBD:data02
>
> Versions:
> Linux Storage1001A.xdomain.com 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64 GNU/Linux
> targetcli 2.0rc1-2
> lio-utils 3.1+git2.fd0b34fd-2
> drbd8-utils 2:8.3.13-2
> corosync 1.4.2-3
> pacemaker 1.1.7-1
>
> Network is setup with 3 bonded pairs
> - Bond0: Client interface
> - Bond1: Crossover
> - Bond2: Client interface (for future multicast)
>
>
> Log file shows
> - Working fine
> - Migrate from Storage1001A to Storage1001B
> - Storage1001A hangs/crashes after stopping LUN before stopping Target
> - Storage1001A is fenced
> - Resources migrate to Storage1001B (not 100% sure on this)
> - Storage1001A boots back up and tries to take back resources
> - Storage1001B hangs after stopping LUN before stopping Target
> - Manually cycle power on Storage1001B
> - Resources remain "stuck" on Storage1001B until Storage1001B is back online
> (stonith currently disabled for Storage1001B, but sometimes it remains
> "stuck" when enabled).
>
>
>
>
> ************************************************************
> *** Normal Status
> ************************************************************
> # crm status
> ============
> Last updated: Sat Dec 14 14:58:04 2013
> Last change: Sat Dec 14 14:45:43 2013 via crm_resource on
> Storage1001B.xdomain.com
> Stack: openais
> Current DC: Storage1001A.xdomain.com - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 22 Resources configured.
> ============
>
> Online: [ Storage1001B.xdomain.com Storage1001A.xdomain.com ]
>
> st_node_a (stonith:external/synaccess): Started Storage1001B.xdomain.com
> Resource Group: rg_data01
> p_lvm01 (ocf::heartbeat:LVM): Started Storage1001A.xdomain.com
> p_ip1 (ocf::heartbeat:IPaddr): Started Storage1001A.xdomain.com
> p_ip11 (ocf::heartbeat:IPaddr): Started Storage1001A.xdomain.com
> p_target_data01 (ocf::heartbeat:iSCSITarget): Started Storage1001A.xdomain.com
> p_lu_data01_vol01 (ocf::heartbeat:iSCSILogicalUnit): Started Storage1001A.xdomain.com
> p_email_admin1 (ocf::heartbeat:MailTo): Started Storage1001A.xdomain.com
> Master/Slave Set: ms_drbd1 [p_drbd1]
> Masters: [ Storage1001A.xdomain.com ]
> Slaves: [ Storage1001B.xdomain.com ]
> Master/Slave Set: ms_drbd2 [p_drbd2]
> Masters: [ Storage1001B.xdomain.com ]
> Slaves: [ Storage1001A.xdomain.com ]
> Clone Set: c_lsb_target [p_target]
> Started: [ Storage1001A.xdomain.com Storage1001B.xdomain.com ]
> Clone Set: c_ping [p_ping]
> Started: [ Storage1001B.xdomain.com Storage1001A.xdomain.com ]
>
>
> ************************************************************
> *** Status after migrate
> ************************************************************
> # crm resource migrate rg_data01 Storage1001B.xdomain.com
> # crm status
> ============
> Last updated: Sat Dec 14 16:40:55 2013
> Last change: Sat Dec 14 16:40:48 2013 via crm_resource on
> Storage1001B.xdomain.com
> Stack: openais
> Current DC: Storage1001A.xdomain.com - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 22 Resources configured.
> ============
>
> Online: [ Storage1001B.xdomain.com Storage1001A.xdomain.com ]
>
> st_node_a (stonith:external/synaccess): Started Storage1001B.xdomain.com
> Resource Group: rg_data01
> p_lvm01 (ocf::heartbeat:LVM): Started Storage1001A.xdomain.com
> p_ip1 (ocf::heartbeat:IPaddr): Started Storage1001A.xdomain.com
> p_ip11 (ocf::heartbeat:IPaddr): Started Storage1001A.xdomain.com
> p_target_data01 (ocf::heartbeat:iSCSITarget): Started Storage1001A.xdomain.com
> p_lu_data01_vol01 (ocf::heartbeat:iSCSILogicalUnit): Stopped
> p_email_admin1 (ocf::heartbeat:MailTo): Stopped
> Master/Slave Set: ms_drbd1 [p_drbd1]
> Masters: [ Storage1001A.xdomain.com ]
> Slaves: [ Storage1001B.xdomain.com ]
> Master/Slave Set: ms_drbd2 [p_drbd2]
> Masters: [ Storage1001B.xdomain.com ]
> Slaves: [ Storage1001A.xdomain.com ]
> Clone Set: c_lsb_target [p_target]
> Started: [ Storage1001A.xdomain.com Storage1001B.xdomain.com ]
> Clone Set: c_ping [p_ping]
> Started: [ Storage1001B.xdomain.com Storage1001A.xdomain.com ]
>
>
> ************************************************************
> *** Status after first node boots back up and tries to take back resource
> ************************************************************
> # crm status
> ============
> Last updated: Tue Dec 17 14:06:15 2013
> Last change: Tue Dec 17 14:05:44 2013 via cibadmin on
> Storage1001A.xdomain.com
> Stack: openais
> Current DC: Storage1001B.xdomain.com - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 22 Resources configured.
> ============
>
> Node Storage1001B.xdomain.com: UNCLEAN (online)
> Online: [ Storage1001A.xdomain.com ]
>
> st_node_a (stonith:external/synaccess): Started Storage1001B.xdomain.com
> Resource Group: rg_data01
> p_lvm01 (ocf::heartbeat:LVM): Started Storage1001B.xdomain.com
> p_ip1 (ocf::heartbeat:IPaddr): Started Storage1001B.xdomain.com
> p_ip11 (ocf::heartbeat:IPaddr): Started Storage1001B.xdomain.com
> p_target_data01 (ocf::heartbeat:iSCSITarget): Started Storage1001B.xdomain.com FAILED
> p_lu_data01_vol01 (ocf::heartbeat:iSCSILogicalUnit): Stopped
> p_email_admin1 (ocf::heartbeat:MailTo): Stopped
> Master/Slave Set: ms_drbd1 [p_drbd1]
> Masters: [ Storage1001B.xdomain.com ]
> Slaves: [ Storage1001A.xdomain.com ]
> Master/Slave Set: ms_drbd2 [p_drbd2]
> Masters: [ Storage1001B.xdomain.com ]
> Slaves: [ Storage1001A.xdomain.com ]
> Clone Set: c_lsb_target [p_target]
> Started: [ Storage1001A.xdomain.com Storage1001B.xdomain.com ]
> Clone Set: c_ping [p_ping]
> Started: [ Storage1001A.xdomain.com Storage1001B.xdomain.com ]
>
> Failed actions:
> p_target_data01_stop_0 (node=Storage1001B.xdomain.com, call=111, rc=-2, status=Timed Out): unknown exec error
>
>
>
>
> ************************************************************
> *** Configuration
> *** Note: Some resources are stopped to try and get one
> *** resource group working properly
> ************************************************************
> node Storage1001A.xdomain.com
> node Storage1001B.xdomain.com
> primitive p_drbd1 ocf:linbit:drbd \
> params drbd_resource="Data01" \
> op monitor interval="3" role="Master" timeout="9" \
> op monitor interval="4" role="Slave" timeout="12"
> primitive p_drbd2 ocf:linbit:drbd \
> params drbd_resource="Data02" \
> op monitor interval="3" role="Master" timeout="9" \
> op monitor interval="4" role="Slave" timeout="12"
> primitive p_email_admin1 ocf:heartbeat:MailTo \
> params email="admin at xdomain.com" subject="Cluster Failover"
> primitive p_email_admin2 ocf:heartbeat:MailTo \
> params email="admin at xdomain.com" subject="Cluster Failover"
> primitive p_ip1 ocf:heartbeat:IPaddr \
> params ip="10.11.2.13" nic="bond0" cidr_netmask="21" \
> op monitor interval="5s"
> primitive p_ip11 ocf:heartbeat:IPaddr \
> params ip="10.11.10.13" nic="bond2" cidr_netmask="21" \
> op monitor interval="5s"
> primitive p_ip2 ocf:heartbeat:IPaddr \
> params ip="10.11.2.14" nic="bond0" cidr_netmask="21" \
> op monitor interval="5s"
> primitive p_ip22 ocf:heartbeat:IPaddr \
> params ip="10.11.10.14" nic="bond2" cidr_netmask="21" \
> op monitor interval="5s"
> primitive p_lu_data01_vol01 ocf:heartbeat:iSCSILogicalUnit \
> params target_iqn="iqn.2013-10.com.xdomain.storage1001.data01" lun="1" path="/dev/vg_data01/lv_vol01" implementation="lio" \
> op monitor interval="10"
> primitive p_lu_data02_vol01 ocf:heartbeat:iSCSILogicalUnit \
> params target_iqn="iqn.2013-10.com.xdomain.storage1001.data02" lun="1" path="/dev/vg_data02/lv_vol01" implementation="lio" \
> op monitor interval="10"
> primitive p_lvm01 ocf:heartbeat:LVM \
> params volgrpname="vg_data01" \
> op monitor interval="4" timeout="8"
> primitive p_lvm02 ocf:heartbeat:LVM \
> params volgrpname="vg_data02" \
> op monitor interval="4" timeout="8"
> primitive p_ping ocf:pacemaker:ping \
> op monitor interval="5s" timeout="15s" \
> params host_list="10.11.10.1" multiplier="200" name="p_ping"
> primitive p_target lsb:target \
> op monitor interval="30" timeout="30"
> primitive p_target_data01 ocf:heartbeat:iSCSITarget \
> params iqn="iqn.2013-10.com.xdomain.storage1001.data01" implementation="lio" \
> op monitor interval="10s" timeout="20s"
> primitive p_target_data02 ocf:heartbeat:iSCSITarget \
> params iqn="iqn.2013-10.com.xdomain.storage1001.data02" implementation="lio" \
> op monitor interval="10s" timeout="20s"
> primitive st_node_a stonith:external/synaccess \
> params synaccessip="reboot11.xdomain.com" community="*******" port="Storage1001A" pcmk_host_list="Storage1001A.xdomain.com" \
> meta target-role="Started"
> primitive st_node_b stonith:external/synaccess \
> params synaccessip="reboot10.xdomain.com" community="******" port="Storage1001B" pcmk_host_list="Storage1001B.xdomain.com" \
> meta target-role="Stopped"
> group rg_data01 p_lvm01 p_ip1 p_ip11 p_target_data01 p_lu_data01_vol01 p_email_admin1 \
> meta target-role="Started"
> group rg_data02 p_lvm02 p_ip2 p_ip22 p_target_data02 p_lu_data02_vol01 p_email_admin2 \
> meta target-role="Stopped"
> ms ms_drbd1 p_drbd1 \
> meta notify="true" master-max="1" clone-max="2" clone-node-max="1" target-role="Started" \
> meta resource-stickiness="101"
> ms ms_drbd2 p_drbd2 \
> meta notify="true" master-max="1" clone-max="2" clone-node-max="1" target-role="Started" \
> meta resource-stickiness="101"
> clone c_lsb_target p_target \
> meta target-role="Started"
> clone c_ping p_ping \
> meta globally-unique="false" target-role="Started"
> location data01_prefer_a ms_drbd1 \
> rule $id="data01_prefer_a_rule" $role="Master" 100: #uname eq Storage1001A.xdomain.com
> location data02_prefer_b ms_drbd2 \
> rule $id="data02_prefer_b_rule" $role="Master" 100: #uname eq Storage1001B.xdomain.com
> location st_node_a-loc st_node_a \
> rule $id="st_node_a-loc-id" -inf: #uname eq Storage1001A.xdomain.com
> location st_node_b-loc st_node_b \
> rule $id="st_node_b-loc-id" -inf: #uname eq Storage1001B.xdomain.com
> colocation c_drbd1 inf: rg_data01 ms_drbd1:Master
> colocation c_drbd2 inf: rg_data02 ms_drbd2:Master
> order o_data01_start inf: ms_drbd1:promote rg_data01:start
> order o_data01_stop inf: rg_data01:stop ms_drbd1:demote
> order o_data02_start inf: ms_drbd2:promote rg_data02:start
> order o_data02_stop inf: rg_data02:stop ms_drbd2:demote
> order o_target_before_data01 inf: c_lsb_target:start rg_data01
> order o_target_before_data02 inf: c_lsb_target:start rg_data02
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> stonith-enabled="true" \
> no-quorum-policy="ignore" \
> default-resource-stickiness="1" \
> last-lrm-refresh="1387064240"
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scstlun.pacemaker
Type: application/octet-stream
Size: 11160 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140104/3f95d055/attachment-0004.obj>