[Pacemaker] stonithd segfault
Andrew Beekhof
andrew at beekhof.net
Wed May 8 22:19:22 UTC 2013
On 08/05/2013, at 10:33 PM, Pavel <free.lan.c2.718r at gmail.com> wrote:
> Hello everyone
>
> Can anyone please assist me with the following problem? In syslog I get the following messages:
>
> kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]
We need the full stack trace. We can't use your core file ourselves, so you'll have to open it with gdb and type "where".
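As a rough illustration of the step requested above, the sketch below runs gdb non-interactively against a core file and captures the output of "where". The helper names build_gdb_command and stack_trace are hypothetical, and both paths in the usage comment are placeholders: where stonithd is installed and where the core file is written depend on the build and the system, so they must be located first.

```python
import subprocess


def build_gdb_command(binary, core):
    """Return the argv for a non-interactive gdb run that prints the backtrace.

    --batch makes gdb exit after running its commands; -ex runs one command.
    "where" is the gdb command named in the reply above.
    """
    return ["gdb", "--batch", "-ex", "where", binary, core]


def stack_trace(binary, core):
    """Run gdb on the core file and return everything it printed."""
    result = subprocess.run(
        build_gdb_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the output is still worth reading
    )
    return result.stdout + result.stderr


# Usage (placeholder paths, adjust to the local installation):
# print(stack_trace("/path/to/stonithd", "/path/to/core"))
```

The backtrace is only readable if the binary was built with debugging symbols; without them gdb prints bare addresses instead of function names and line numbers.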
> pacemakerd[2025]: notice: pcmk_child_exit: Child process stonith-ng terminated with signal 11 (pid=2029, core=128)
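For readers who want to know what the kernel line quoted above already says, here is a minimal sketch that decodes it. It assumes the usual x86 message layout (faulting address, instruction pointer, stack pointer, page-fault error code in hex, then the mapped object with its base and size); the helper name decode_segfault is hypothetical.

```python
import re

# Sample line copied from the report above.
LINE = ("kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed "
        "sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]")

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split an x86 kernel segfault line into its parts."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    addr, ip, err, base, size = (
        int(m[k], 16) for k in ("addr", "ip", "err", "base", "size")
    )
    return {
        "object": m["obj"],
        "fault_addr": addr,
        # Offset of the faulting instruction inside the mapped object.
        "ip_offset": ip - base,
        "ip_in_mapping": base <= ip < base + size,
        # x86 page-fault error code: bit 0 = protection violation,
        # bit 1 = write access, bit 2 = fault raised in user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


info = decode_segfault(LINE)
print(info["access"], "of address", hex(info["fault_addr"]),
      "at offset", hex(info["ip_offset"]), "in", info["object"])
```

For this report the result is a user-mode read of address 0, which points to a NULL pointer dereference, at offset 0x47ed into stonithd. That narrows the fault down but does not replace the gdb backtrace, which shows the calling functions.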
>
> Then pacemakerd tries to respawn stonith-ng, but it fails again, and this repeats indefinitely.
>
> I have found a very similar problem in the mailing list archives, but it was already fixed and was related to Heartbeat only, while I'm using Corosync.
>
> What I have noticed is that this is somehow related to the DRBD resource that I configure. With an empty configuration (no RAs) or with other RAs (IPaddr2, ...), stonithd runs without any problem.
> At the same time, despite the issue, the DRBD Master/Slave resource seems to work correctly.
>
> Here is my configuration:
>
>> node $id="1" fio-node1 \
>>     attributes standby="off"
>> node $id="2" fio-node2 \
>>     attributes standby="off"
>> rsc_template drbd-r ocf:linbit:drbd \
>>     op start interval="0" timeout="240" \
>>     op promote interval="0" timeout="90" \
>>     op demote interval="0" timeout="90" \
>>     op notify interval="0" timeout="90" \
>>     op stop interval="0" timeout="100" \
>>     op monitor interval="20" role="Slave" timeout="20" \
>>     op monitor interval="10" role="Master" timeout="20"
>> primitive drbd-r1 @drbd-r \
>>     params drbd_resource="r1"
>> ms ms-r1 drbd-r1 \
>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.9-2a917dd" \
>>     cluster-infrastructure="corosync" \
>>     stonith-enabled="false" \
>>     last-lrm-refresh="1366018562"
>
> and here is drbd.conf:
>
>> include "drbd.d/global_common.conf";
>> include "drbd.d/*.res";
>>
>> resource r1 {
>>     device /dev/drbd1;
>>     disk /dev/vg-bio/lv1;
>>     meta-disk internal;
>>     on fio-node1 {
>>         address 172.17.68.128:7789;
>>     }
>>     on fio-node2 {
>>         address 172.17.68.129:7789;
>>     }
>> }
>
> You can download the full configuration (cib, corosync.conf, drbd.conf, drbd.d/global_common.conf) here - http://up.iteam.ua/download/152101/50aa518a439747e72/.
>
> I'm using Pacemaker 1.1.9 with Corosync 2.3.0 and crmsh 1.2.5 all built from source on Ubuntu Server 12.10 x64.
> Build options for the above are:
> pacemaker: ./configure --with-corosync --with-cs-quorum --without-ais --without-heartbeat --without-cman --with-snmp
> corosync: ./configure --disable-rdma --disable-testagents --disable-dbus --enable-snmp --enable-qdevices
> crmsh: ./configure
>
> Any help or guidance is highly appreciated. Thanks!