[Pacemaker] kernel BUG at fs/dlm/lock.c:242! after sync of GFS2 (2 node - active/active)
Vladislav Bogdanov
bubble at hoster-ok.com
Wed Sep 8 06:33:25 UTC 2010
08.09.2010 09:25, Alisson Landim wrote:
> I am updating this post with new information.
> After stopping the WebFS resource I could create the GFS2 filesystem on
> the second node, so the only difference now from the Clusters from Scratch
> guide is:
>
> 1 - Changed "/dev/drbd/by-res/wwwdata" to "/dev/drbd1" on WebFS.
You probably do not have the drbd-udev package installed.
Installing it should create the /dev/drbd/by-res/ symlinks, so the
original path from the guide works again.
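To check whether the udev symlink is present before changing the resource definition, something like the following could be used. This is only a sketch: the helper name `resolve_drbd_resource` is made up for illustration, and the default directory is simply the `/dev/drbd/by-res` path quoted above.

```python
import os


def resolve_drbd_resource(resource, by_res_dir="/dev/drbd/by-res"):
    """Return the device node a by-res symlink points to, or None if absent."""
    link = os.path.join(by_res_dir, resource)
    # The by-res entries are symlinks; a missing entry or a plain file
    # means the symlink for this resource was not created.
    if not os.path.islink(link):
        return None
    return os.path.realpath(link)


if __name__ == "__main__":
    target = resolve_drbd_resource("wwwdata")
    if target is None:
        print("no by-res symlink for wwwdata; udev rules may be missing")
    else:
        print("wwwdata ->", target)
```

If this reports no symlink, the WebFS resource has to keep using the raw device node until the udev rules are in place.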
>
> And I still get the error and GFS2 doesn't work on F13_x86_64.
See my previous message. Does it help if you start openais instead of
corosync?
Best,
Vladislav
>
> ------------------------------------------------------------------------
> From: landim4 at hotmail.com
> To: pacemaker at oss.clusterlabs.org
> Date: Tue, 7 Sep 2010 23:19:49 -0300
> Subject: [Pacemaker] kernel BUG at fs/dlm/lock.c:242! after sync of GFS2
> (2 node - active/active)
>
> After setting up a 2-node cluster following the Clusters from Scratch
> guide for Fedora 13, I have to say that the GFS2 filesystem (active/active)
> doesn't work!
>
> If the kernel bug described below is not caused by one of the modifications
> I HAD to make to the guide in order to continue, then Fedora 13 has no
> working GFS2 cluster system!
> The modifications were:
>
> 1 - Changed "/dev/drbd/by-res/wwwdata" to "/dev/drbd1" on WebFS.
> 2 - Did not execute the command "mkfs.gfs2 -p lock_dlm -j 2 -t pcmk:web
> /dev/drbd1" on the second node, because when drbd is not loaded it says
> "Could not stat device" and when it is loaded it says "Device is
> busy/Read only file system", as I described in my previous post.
>
> But no matter what I did, it does not justify a kernel bug.
>
> The problem occurs after the sync of the filesystem, after
> "cat /proc/drbd" reaches 100%.
>
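For reference, the sync state can also be checked programmatically rather than by eye. A minimal sketch, assuming the DRBD 8.3 style `/proc/drbd` status line with `cs:`, `ro:` and `ds:` fields; the sample text and the function names are illustrative and are not output from this cluster.

```python
import re

# One status line per DRBD minor, e.g.
#  1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
STATUS_RE = re.compile(
    r"^[ \t]*(\d+):[ \t]+cs:(\S+)[ \t]+ro:(\S+)[ \t]+ds:(\S+)", re.MULTILINE
)

# Illustrative sample only; not taken from the nodes discussed here.
SAMPLE = """\
version: 8.3.x (illustrative)
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
 2: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
"""


def parse_drbd_status(text):
    """Map each DRBD minor number to its connection, role and disk state."""
    devices = {}
    for minor, cs, ro, ds in STATUS_RE.findall(text):
        devices[int(minor)] = {"cs": cs, "ro": ro, "ds": ds}
    return devices


def is_synced(dev):
    """True when the peers are connected and both disks are up to date."""
    return dev["cs"] == "Connected" and dev["ds"] == "UpToDate/UpToDate"


if __name__ == "__main__":
    for minor, dev in sorted(parse_drbd_status(SAMPLE).items()):
        print(minor, dev["ro"], "synced" if is_synced(dev) else "not synced")
```

On a real node the text would come from reading `/proc/drbd` instead of `SAMPLE`.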
> Any hint to make it work?
>
>
> Clusters from Scratch guide followed:
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch02.html
>
> Kernel Bug Description from "dmesg":
>
> ------------[ cut here ]------------
> kernel BUG at fs/dlm/lock.c:242!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/kernel/dlm/web/event_done
> CPU 3
> Modules linked in: gfs2 drbd lru_cache ipt_CLUSTERIP dlm configfs sunrpc
> ipv6 cpufreq_ondemand acpi_cpufreq freq_table uinput iTCO_wdt
> iTCO_vendor_support e1000 i2c_i801 shpchp microcode sky2 i2c_core
> e752x_edac i6300esb edac_core raid1 [last unloaded: scsi_wait_scan]
>
> Pid: 2866, comm: mount.gfs2 Not tainted 2.6.34.6-47.fc13.x86_64 #1
> SEP7320VP2D2 /
> RIP: 0010:[<ffffffffa01244d6>] [<ffffffffa01244d6>] is_remote+0x73/0x81
> [dlm]
> RSP: 0018:ffff8801371a1938 EFLAGS: 00010296
> RAX: 0000000000000004 RBX: ffff8801380f9900 RCX: 0000000000005192
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff8801371a1948 R08: 00000000ffffffff R09: 0000000000000073
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801380490e8
> R13: ffff880138b7b000 R14: 0000000000000000 R15: ffff8801380f99e0
> FS: 00007fddfa6ac700(0000) GS:ffff880002180000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fe4193da000 CR3: 0000000138053000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process mount.gfs2 (pid: 2866, threadinfo ffff8801371a0000, task
> ffff880137119770)
> Stack:
> 00000000c700a8c0 ffff8801380f9900 ffff8801371a19b8 ffffffffa0129111
> <0> ffff880138b7b604 c700a8c000000c30 ffff8801371a19e0 ffff88013a218200
> <0> 0000000000000000 aa00a8c000000246 ffff8801371a19b8 ffff8801380490e8
> Call Trace:
> [<ffffffffa0129111>] _request_lock+0x22e/0x274 [dlm]
> [<ffffffffa01291d5>] request_lock+0x7e/0xa7 [dlm]
> [<ffffffffa012676f>] ? create_lkb+0x126/0x14e [dlm]
> [<ffffffffa0129b89>] dlm_lock+0xf7/0x14d [dlm]
> [<ffffffffa01998bd>] ? gdlm_bast+0x0/0x43 [gfs2]
> [<ffffffffa019998a>] ? gdlm_ast+0x0/0x116 [gfs2]
> [<ffffffffa01998bd>] ? gdlm_bast+0x0/0x43 [gfs2]
> [<ffffffffa01998a5>] gdlm_lock+0xef/0x107 [gfs2]
> [<ffffffffa019998a>] ? gdlm_ast+0x0/0x116 [gfs2]
> [<ffffffffa01998bd>] ? gdlm_bast+0x0/0x43 [gfs2]
> [<ffffffffa01816e5>] do_xmote+0xed/0x14f [gfs2]
> [<ffffffffa0181853>] run_queue+0x10c/0x14a [gfs2]
> [<ffffffffa0182782>] gfs2_glock_nq+0x282/0x2a6 [gfs2]
> [<ffffffffa01827f1>] gfs2_glock_nq_num+0x4b/0x73 [gfs2]
> [<ffffffffa018c814>] init_locking+0x85/0x162 [gfs2]
> [<ffffffffa018deb6>] gfs2_get_sb+0x6e3/0x9ad [gfs2]
> [<ffffffffa01827e9>] ? gfs2_glock_nq_num+0x43/0x73 [gfs2]
> [<ffffffff811d8b40>] ? selinux_sb_copy_data+0x196/0x1af
> [<ffffffff8110fe29>] vfs_kern_mount+0xbd/0x19b
> [<ffffffff8110ff6f>] do_kern_mount+0x4d/0xed
> [<ffffffff811256c3>] do_mount+0x753/0x7c9
> [<ffffffff810f8373>] ? alloc_pages_current+0x95/0x9e
> [<ffffffff811257c1>] sys_mount+0x88/0xc2
> [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
> Code: 8b e0 00 00 00 8b 73 3c 44 8b 83 d0 00 00 00 48 c7 c7 38 57 13 a0
> 31 c0 e8 85 67 32 e1 48 c7 c7 6d 57 13 a0 31 c0 e8 77 67 32 e1 <0f> 0b
> eb fe 59 0f 95 c0 0f b6 c0 5b c9 c3 55 48 89 e5 53 48 83
> RIP [<ffffffffa01244d6>] is_remote+0x73/0x81 [dlm]
> RSP <ffff8801371a1938>
> ---[ end trace 7c9ca33705dbca8d ]---
>
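When comparing traces like the one above across reports, it can help to reduce them to the BUG location and the reliable call-chain entries. The following is a rough sketch, not a general oops parser: the function name `summarize_oops` is hypothetical, and it assumes the frame layout seen in this dump, where entries prefixed with `?` are addresses found on the stack that are not confirmed to be part of the call chain.

```python
import re

BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)

# A few lines copied from the trace above, including the mail quote markers.
SAMPLE = """\
> kernel BUG at fs/dlm/lock.c:242!
> Call Trace:
> [<ffffffffa0129111>] _request_lock+0x22e/0x274 [dlm]
> [<ffffffffa012676f>] ? create_lkb+0x126/0x14e [dlm]
> [<ffffffffa01998a5>] gdlm_lock+0xef/0x107 [gfs2]
> [<ffffffff811257c1>] sys_mount+0x88/0xc2
> Code: 8b e0 00
"""


def summarize_oops(text):
    """Return ((file, line) or None, [(symbol, module), ...]) for an oops."""
    bug = None
    frames = []
    in_trace = False
    for raw in text.splitlines():
        # Drop leading "> " quote markers left over from the mail.
        line = re.sub(r"^(?:>\s?)+", "", raw).strip()
        m = BUG_RE.search(line)
        if m and bug is None:
            bug = (m.group(1), int(m.group(2)))
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            f = FRAME_RE.match(line)
            if f is None:
                in_trace = False  # first non-frame line ends the trace
            elif f.group(1) is None:  # skip entries marked with '?'
                frames.append((f.group(2), f.group(3) or "kernel"))
    return bug, frames


if __name__ == "__main__":
    location, chain = summarize_oops(SAMPLE)
    print(location)
    for symbol, module in chain:
        print(f"  {symbol} [{module}]")
```

Frames without a bracketed module name are reported as "kernel" here, which is a labeling choice of this sketch.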
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker