[Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster
Andreas Kurz
andreas at hastexo.com
Fri Jun 22 09:58:29 UTC 2012
On 06/22/2012 11:14 AM, David Guyot wrote:
> Hello.
>
> Concerning dlm-pcmk, it's not available from backports, so I installed
> it from stable; only ocfs2-tools-pacemaker is available from backports,
> and I installed it from there.
that's ok
>
> I checked whether /etc/init.d/ocfs2 and /etc/init.d/o2cb are removed from
> /etc/rcX.d/*, and they are, so the system cannot start them by itself.
Did you also explicitly stop them (on both nodes), or did you reboot the
systems anyway?
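
If in doubt, something like this on both nodes makes sure they are really
down (just a sketch; the ps line is only a rough check for leftover
native-stack processes and kernel threads):

/etc/init.d/ocfs2 stop
/etc/init.d/o2cb stop
ps ax | grep -E 'o2cb|ocfs2' | grep -v grep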
> I also reconfigured DRBD resources using notify=true in each DRBD
> master, then I reconfigured OCFS2 resources using these crm commands
>
> primitive p_controld ocf:pacemaker:controld
> primitive p_o2cb ocf:ocfs2:o2cb
interesting ... should be ocf:pacemaker:o2cb
> group g_ocfs2mgmt p_controld p_o2cb
> clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave=true
>
looks ok for testing o2cb and controld ... you will need colocation and
order constraints later when starting the filesystem
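
Something along these lines would be a starting point for one of the volumes
(only a sketch: p_fs_www/cl_fs_www, the device and the mount point are
placeholders you will have to adapt):

primitive p_fs_www ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/media/ocfs" fstype="ocfs2" \
        op monitor interval="20s"
clone cl_fs_www p_fs_www meta interleave="true"
colocation col_fs_www_with_ocfs2mgmt inf: cl_fs_www cl_ocfs2mgmt
order o_ocfs2mgmt_before_fs_www inf: cl_ocfs2mgmt cl_fs_www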
> root at Malastare:/home/david# crm configure show
> node Malastare
> node Vindemiatrix
> primitive p_controld ocf:pacemaker:controld
> primitive p_drbd_backupvi ocf:linbit:drbd \
> params drbd_resource="backupvi"
> primitive p_drbd_pgsql ocf:linbit:drbd \
> params drbd_resource="postgresql"
> primitive p_drbd_svn ocf:linbit:drbd \
> params drbd_resource="svn"
> primitive p_drbd_www ocf:linbit:drbd \
> params drbd_resource="www"
> primitive p_o2cb ocf:pacemaker:o2cb
> primitive soapi-fencing-malastare stonith:external/ovh \
> params reversedns="ns208812.ovh.net"
> primitive soapi-fencing-vindemiatrix stonith:external/ovh \
> params reversedns="ns235795.ovh.net"
> group g_ocfs2mgmt p_controld p_o2cb
> ms ms_drbd_backupvi p_drbd_backupvi \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_pgsql p_drbd_pgsql \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_svn p_drbd_svn \
> meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_www p_drbd_www \
> meta master-max="2" clone-max="2" notify="true"
> clone cl_ocfs2mgmt g_ocfs2mgmt \
> meta interleave="true"
> location stonith-malastare soapi-fencing-malastare -inf: Malastare
> location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
> property $id="cib-bootstrap-options" \
> dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2"
>
> Unfortunately, the problem is still there :
>
> root at Malastare:/home/david# crm_mon --one-shot -VroA
> ============
> Last updated: Fri Jun 22 10:54:31 2012
> Last change: Fri Jun 22 10:54:27 2012 via crm_shadow on Malastare
> Stack: openais
> Current DC: Malastare - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 14 Resources configured.
> ============
>
> Online: [ Malastare Vindemiatrix ]
>
> Full list of resources:
>
> soapi-fencing-malastare (stonith:external/ovh): Started Vindemiatrix
> soapi-fencing-vindemiatrix (stonith:external/ovh): Started Malastare
> Master/Slave Set: ms_drbd_pgsql [p_drbd_pgsql]
> Masters: [ Malastare Vindemiatrix ]
> Master/Slave Set: ms_drbd_svn [p_drbd_svn]
> Masters: [ Malastare Vindemiatrix ]
> Master/Slave Set: ms_drbd_www [p_drbd_www]
> Masters: [ Malastare Vindemiatrix ]
> Master/Slave Set: ms_drbd_backupvi [p_drbd_backupvi]
> Masters: [ Malastare Vindemiatrix ]
> Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
> Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
>
> Node Attributes:
> * Node Malastare:
> + master-p_drbd_backupvi:0 : 10000
> + master-p_drbd_pgsql:0 : 10000
> + master-p_drbd_svn:0 : 10000
> + master-p_drbd_www:0 : 10000
> * Node Vindemiatrix:
> + master-p_drbd_backupvi:1 : 10000
> + master-p_drbd_pgsql:1 : 10000
> + master-p_drbd_svn:1 : 10000
> + master-p_drbd_www:1 : 10000
>
> Operations:
> * Node Vindemiatrix:
> soapi-fencing-malastare: migration-threshold=1000000
> + (4) start: rc=0 (ok)
> p_drbd_pgsql:1: migration-threshold=1000000
> + (5) probe: rc=8 (master)
> p_drbd_svn:1: migration-threshold=1000000
> + (6) probe: rc=8 (master)
> p_drbd_www:1: migration-threshold=1000000
> + (7) probe: rc=8 (master)
> p_drbd_backupvi:1: migration-threshold=1000000
> + (8) probe: rc=8 (master)
> p_o2cb:1: migration-threshold=1000000
> + (10) probe: rc=5 (not installed)
> * Node Malastare:
> soapi-fencing-vindemiatrix: migration-threshold=1000000
> + (4) start: rc=0 (ok)
> p_drbd_pgsql:0: migration-threshold=1000000
> + (5) probe: rc=8 (master)
> p_drbd_svn:0: migration-threshold=1000000
> + (6) probe: rc=8 (master)
> p_drbd_www:0: migration-threshold=1000000
> + (7) probe: rc=8 (master)
> p_drbd_backupvi:0: migration-threshold=1000000
> + (8) probe: rc=8 (master)
> p_o2cb:0: migration-threshold=1000000
> + (10) probe: rc=5 (not installed)
>
> Failed actions:
> p_o2cb:1_monitor_0 (node=Vindemiatrix, call=10, rc=5,
> status=complete): not installed
> p_o2cb:0_monitor_0 (node=Malastare, call=10, rc=5, status=complete):
> not installed
>
> Nevertheless, I noticed a strange error message in the Corosync/Pacemaker logs:
> Jun 22 10:54:25 Vindemiatrix lrmd: [24580]: info: RA output:
> (p_controld:1:probe:stderr) dlm_controld.pcmk: no process found
this looks like the initial probing, so it is expected that no controld is
running yet
>
> This message was immediately followed by "Wrong stack" errors, and
check the content of /sys/fs/ocfs2/loaded_cluster_plugins ... if that file
exists and contains the value "user", that is a good sign you have started
ocfs2/o2cb via init ;-)
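
For example (just a sketch; the exact sysfs layout may differ on your
kernel, but these files are provided by the ocfs2 stack glue):

cat /sys/fs/ocfs2/cluster_stack
cat /sys/fs/ocfs2/loaded_cluster_plugins

If that still points at the native o2cb stack, stop it via the init scripts
on both nodes and then do a cleanup of cl_ocfs2mgmt.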
Regards,
Andreas
> because dlm_controld.pcmk seems to be the Pacemaker DLM dæmon, I strongly
> think these messages are related. Strangely, even though I have this dæmon's
> executable in /usr/sbin, it is not started by Pacemaker:
> root at Vindemiatrix:/home/david# ls /usr/sbin/dlm_controld.pcmk
> /usr/sbin/dlm_controld.pcmk
> root at Vindemiatrix:/home/david# ps fax | grep pcmk
> 26360 pts/1 S+ 0:00 \_ grep pcmk
>
> But, if I understood correctly, such a process should be launched by the
> DLM resource, and as I have no error messages about launching it even
> though its executable is present, do you know where this problem could
> come from?
>
> Thank you in advance.
>
> Kind regards.
>
> PS: I'll have next week off, so I won't be able to answer you
> between this evening and the 2nd of July.
>
> On 20/06/2012 17:39, Andreas Kurz wrote:
>> On 06/20/2012 03:49 PM, David Guyot wrote:
>>> Actually, yes, I start DRBD manually, because this is currently a test
>>> configuration which relies on OpenVPN for the communication between
>>> these 2 nodes. I have no order and colocation constraints because I'm
>>> discovering this software and trying to configure it step by step, making
>>> resources work before ordering them (nevertheless, I just tried to
>>> configure DLM/O2CB constraints, but they fail, apparently because they
>>> rely on O2CB, which causes the problem I wrote you about). And I have no
>>> OCFS2 mounts because I was under the assumption that OCFS2 wouldn't
>>> mount partitions without O2CB and DLM, which seems to be right:
>> In fact it won't work without constraints, even if you are only testing:
>> e.g. controld and o2cb must run on the same node (in fact on both nodes,
>> of course), and controld must start before o2cb.
>>
>> And the error message you showed in a previous mail:
>>
>> 2012/06/20_09:04:35 ERROR: Wrong stack o2cb
>>
>> ... implies that you are already running the native ocfs2 cluster stack
>> outside of Pacemaker. Did you do an "/etc/init.d/ocfs2 stop" before
>> starting your cluster tests, and is it still stopped? If it is stopped, a
>> cleanup of the cl_ocfs2mgmt resource should start that resource ... if
>> there are no other errors.
>>
>> Did you install the dlm-pcmk and ocfs2-tools-pacemaker packages from backports?
>>
>>> root at Malastare:/home/david# crm_mon --one-shot -VroA
>>> ============
>>> Last updated: Wed Jun 20 15:32:50 2012
>>> Last change: Wed Jun 20 15:28:34 2012 via crm_shadow on Malastare
>>> Stack: openais
>>> Current DC: Vindemiatrix - partition with quorum
>>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>>> 2 Nodes configured, 2 expected votes
>>> 14 Resources configured.
>>> ============
>>>
>>> Online: [ Vindemiatrix Malastare ]
>>>
>>> Full list of resources:
>>>
>>> soapi-fencing-malastare (stonith:external/ovh): Started Vindemiatrix
>>> soapi-fencing-vindemiatrix (stonith:external/ovh): Started Malastare
>>> Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
>>> Masters: [ Malastare Vindemiatrix ]
>>> Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
>>> Masters: [ Malastare Vindemiatrix ]
>>> Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
>>> Masters: [ Malastare Vindemiatrix ]
>>> Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
>>> Masters: [ Malastare Vindemiatrix ]
>>> Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>>> Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
>>>
>>> Node Attributes:
>>> * Node Vindemiatrix:
>>> + master-p_drbd_ocfs2_backupvi:1 : 10000
>>> + master-p_drbd_ocfs2_pgsql:1 : 10000
>>> + master-p_drbd_ocfs2_svn:1 : 10000
>>> + master-p_drbd_ocfs2_www:1 : 10000
>>> * Node Malastare:
>>> + master-p_drbd_ocfs2_backupvi:0 : 10000
>>> + master-p_drbd_ocfs2_pgsql:0 : 10000
>>> + master-p_drbd_ocfs2_svn:0 : 10000
>>> + master-p_drbd_ocfs2_www:0 : 10000
>>>
>>> Operations:
>>> * Node Vindemiatrix:
>>> p_drbd_ocfs2_pgsql:1: migration-threshold=1000000
>>> + (4) probe: rc=8 (master)
>>> p_drbd_ocfs2_backupvi:1: migration-threshold=1000000
>>> + (5) probe: rc=8 (master)
>>> p_drbd_ocfs2_svn:1: migration-threshold=1000000
>>> + (6) probe: rc=8 (master)
>>> p_drbd_ocfs2_www:1: migration-threshold=1000000
>>> + (7) probe: rc=8 (master)
>>> soapi-fencing-malastare: migration-threshold=1000000
>>> + (10) start: rc=0 (ok)
>>> p_o2cb:1: migration-threshold=1000000
>>> + (9) probe: rc=5 (not installed)
>>> * Node Malastare:
>>> p_drbd_ocfs2_pgsql:0: migration-threshold=1000000
>>> + (4) probe: rc=8 (master)
>>> p_drbd_ocfs2_backupvi:0: migration-threshold=1000000
>>> + (5) probe: rc=8 (master)
>>> p_drbd_ocfs2_svn:0: migration-threshold=1000000
>>> + (6) probe: rc=8 (master)
>>> soapi-fencing-vindemiatrix: migration-threshold=1000000
>>> + (10) start: rc=0 (ok)
>>> p_drbd_ocfs2_www:0: migration-threshold=1000000
>>> + (7) probe: rc=8 (master)
>>> p_o2cb:0: migration-threshold=1000000
>>> + (9) probe: rc=5 (not installed)
>>>
>>> Failed actions:
>>> p_o2cb:1_monitor_0 (node=Vindemiatrix, call=9, rc=5,
>>> status=complete): not installed
>>> p_o2cb:0_monitor_0 (node=Malastare, call=9, rc=5, status=complete):
>>> not installed
>>> root at Malastare:/home/david# mount -t ocfs2 /dev/drbd1 /media/ocfs/
>>> mount.ocfs2: Cluster stack specified does not match the one currently
>>> running while trying to join the group
>>>
>>> Concerning the notify meta-attribute, I didn't configure it because it
>>> wasn't even referred to in the DRBD official guide (
>>> http://www.drbd.org/users-guide-8.3/s-ocfs2-pacemaker.html), and I don't
>>> know what it does, so, by default, I stupidly followed the official
>>> guide. What does this meta-attribute set? If you know a better guide,
>>> could you please tell me about it, so I can check my config against that
>>> other guide?
>> Well, then this is a documentation bug ... you will find the correct
>> configuration in the same guide, where Pacemaker integration is
>> described ... "notify" sends out notification messages before and after
>> an instance of the DRBD OCF RA executes an action (like start, stop,
>> promote, demote) ... that allows the other instances to react.
>>
>> Regards,
>> Andreas
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
--
Need help with Pacemaker?
http://www.hastexo.com/now