[Pacemaker] Configuring LVM and Filesystem resources on top of DRBD
Dejan Muhamedagic
dejanmm at fastmail.fm
Tue Feb 9 10:57:36 UTC 2010
Hi,
On Mon, Feb 08, 2010 at 03:45:25PM -0600, D. J. Draper wrote:
>
> On Mon, Feb 8 13:36:47 EST 2010, Dejan Muhamedagic wrote:
> The logs don't contain the period when CRM probes for running
> resources. But I can imagine what is actually going on. This is a
> deficiency in handling probes in the LVM and, perhaps, the
> Filesystem resource agents. Can you please post the logs from the
> time when the cluster is starting. Actually, best to open a
> bugzilla and attach a hb_report report.
>
> Thanks,
>
> Dejan
> Thanks for the reply Dejan. I attached a zip file with several
> log files covering two reboots on each server. To generate
According to Node01Reboot1500ha-log.log, CRM first starts LVM
then drbd:
Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:lvm_data0:6: start
Feb 08 15:03:36 node01.houseofdraper.org crmd: [1774]: info: do_lrm_rsc_op: Performing key=7:1:0:1fcb0ada-cc5d-463b-ab2d-e046fee580ed op=drbd_data0:1_start_0 )
Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:drbd_data0:1:7: start
Feb 08 15:03:36 node01.houseofdraper.org crmd: [1774]: info: do_lrm_rsc_op: Performing key=35:1:0:1fcb0ada-cc5d-463b-ab2d-e046fee580ed op=drbd_data1:1_start_0 )
Feb 08 15:03:36 node01.houseofdraper.org lrmd: [1771]: info: rsc:drbd_data1:1:8: start
That's obviously a configuration problem: LVM should not be
started before drbd has been promoted. The other logs show the
same thing; it's as if there are no constraints.
There are also numerous drbd errors:
Node01Reboot1400messages.log:Feb 8 14:00:56 node01 drbd[6124]: ERROR: data0: Called drbdadm -c /etc/drbd.conf secondary data0
Node01Reboot1400messages.log:Feb 8 14:00:56 node01 drbd[6124]: ERROR: data0: Exit code 11
Node01Reboot1400messages.log:Feb 8 14:00:56 node01 drbd[6124]: ERROR: data0: Command output:
Node01Reboot1400messages.log:Feb 8 14:00:56 node01 drbd[6124]: ERROR: data0: Called drbdadm -c /etc/drbd.conf secondary data0
etc.
Looking again at your configuration, there are some strange
resource relations:
> order ord_data00 inf: ms_drbd_data0:promote ms_drbd_data1:promote
How are these two dependent on each other?
> order ord_data01 inf: ms_drbd_data0:promote lvm_data0:start
> order ord_data02 inf: lvm_data0:start fs_data0:start
> order ord_data03 inf: ms_drbd_data1:promote lvm_data1:start
> order ord_data04 inf: lvm_data1:start fs_data1:start
> order ord_data05 inf: fs_data0:start fs_data1:start
And these two?
> order ord_data06 inf: fs_data1:start ip_data:start
> order ord_data07 inf: ip_data:start svc_nfs:start
> order ord_data08 inf: ip_data:start svc_samba:start
Perhaps you could use groups to reduce the configuration size a
bit. It's quite hard to follow all the constraints.
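For illustration only, a rough sketch of how groups might
shrink this. The primitive and ms names are the ones from your
constraints; the group names and constraint IDs are made up, and
I'm assuming the services need both filesystems on the same
node. Only order constraints were quoted, so the colocations are
an assumption too; drop them if you already have equivalents.
Note that a group also starts its members in sequence, so
svc_samba would start after svc_nfs here, which your original
constraints did not require.

```
# Hypothetical groups: members start in the order listed and
# stay on the same node.
group grp_data0 lvm_data0 fs_data0
group grp_data1 lvm_data1 fs_data1
group grp_services ip_data svc_nfs svc_samba

# Each storage group runs where its drbd device is Master, and
# starts only after the promote.
colocation col_data0 inf: grp_data0 ms_drbd_data0:Master
order ord_data0 inf: ms_drbd_data0:promote grp_data0:start
colocation col_data1 inf: grp_data1 ms_drbd_data1:Master
order ord_data1 inf: ms_drbd_data1:promote grp_data1:start

# Services follow both storage groups.
colocation col_svc0 inf: grp_services grp_data0
colocation col_svc1 inf: grp_services grp_data1
order ord_svc0 inf: grp_data0:start grp_services:start
order ord_svc1 inf: grp_data1:start grp_services:start
```

That would leave no direct ordering between the two drbd
promotes or between fs_data0 and fs_data1.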
Please use hb_report; it is the only way to correlate events
with the logs and the configuration. And you'll find it a tad
easier than collecting stuff by hand.
The bugzilla is at http://developerbugs.linux-foundation.org/
Thanks,
Dejan
> these, I started with all the resources running on Node01. I
> issued the first reboot at 14:00, after which all the resources
> except fs_data0 started successfully on Node02. I issued a
> second reboot at 15:00, after which only the drbd resources
> successfully restarted on Node01:
>
> -bash-4.0# crm status
> ============
> Last updated: Mon Feb 8 15:42:25 2010
> Stack: Heartbeat
> Current DC: node02.houseofdraper.org (a91b7362-448e-4437-a543-19e0067a5d2e) - partition with quorum
> Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> ============
>
> Online: [ node01.houseofdraper.org node02.houseofdraper.org ]
>
> Master/Slave Set: ms_drbd_data0
> Masters: [ node01.houseofdraper.org ]
> Slaves: [ node02.houseofdraper.org ]
> Master/Slave Set: ms_drbd_data1
> Masters: [ node01.houseofdraper.org ]
> Slaves: [ node02.houseofdraper.org ]
>
> Failed actions:
> lvm_data0_start_0 (node=node02.houseofdraper.org, call=14, rc=1, status=complete): unknown error
> fs_data0_start_0 (node=node02.houseofdraper.org, call=6, rc=5, status=complete): not installed
> lvm_data0_start_0 (node=node01.houseofdraper.org, call=6, rc=1, status=complete): unknown error
> fs_data0_start_0 (node=node01.houseofdraper.org, call=14, rc=5, status=complete): not installed
> -bash-4.0#
>
> As for the bugzilla report, if you would kindly point me to a
> FAQ or HOWTO covering the proper submission of a bugzilla
> report for this group, I would be happy to initiate one.
> Thanks in advance,
>
> DJ