[Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

Andreas Kurz andreas at hastexo.com
Fri Jun 22 05:58:29 EDT 2012


On 06/22/2012 11:14 AM, David Guyot wrote:
> Hello.
> 
> Concerning dlm-pcmk, it's not available from backports, so I installed
> it from stable; only ocfs2-tools-pacemaker is available in backports and
> was installed from there.

That's OK.

> 
> I checked if /etc/init.d/ocfs2 and /etc/init.d/o2cb are removed from
> /etc/rcX.d/*, and they are, so the system cannot boot them up by itself.

Did you also explicitly stop them (on both nodes), or did you reboot the
systems anyway?
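
Removing the rcX.d links only prevents a start at the next boot; it does not
unload anything that is already running. A minimal sketch for checking this on
each node, assuming the usual OCFS2 module naming (ocfs2_stack_o2cb for the
native stack, ocfs2_stack_user for the userspace/Pacemaker one). The function
name and its optional file argument are my own additions for illustration:

```shell
#!/bin/sh
# Print the names of all currently loaded kernel modules that start with "ocfs2".
# The optional argument (hypothetical, added so the function can be pointed at a
# test file) defaults to /proc/modules, which has the module name in column 1.
loaded_ocfs2_modules() {
    modules_file="${1:-/proc/modules}"
    # Nothing to report if the file is unreadable (e.g. inside some containers).
    [ -r "$modules_file" ] || return 0
    awk '$1 ~ /^ocfs2/ { print $1 }' "$modules_file"
}

loaded_ocfs2_modules
```

If ocfs2_stack_o2cb shows up in the output, the native stack has probably been
loaded at some point since the last boot.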

> I also reconfigured DRBD resources using notify=true in each DRBD
> master, then I reconfigured OCFS2 resources using these crm commands
> 
> primitive p_controld ocf:pacemaker:controld
> primitive p_o2cb ocf:ocfs2:o2cb

interesting ... should be ocf:pacemaker:o2cb

> group g_ocfs2mgmt p_controld p_o2cb
> clone cl_ocfs2mgmt g_ocfs2mgmt meta interleave=true
> 

Looks OK for testing o2cb and controld ... you will need colocation and
order constraints later when starting the filesystem.
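
A rough sketch of what those later constraints could look like for one mount.
The names p_fs_www, cl_fs_www and the constraint IDs are hypothetical, and I am
only assuming that /dev/drbd1 (from your mount attempt) belongs to ms_drbd_www;
adjust to your setup. The script just writes the fragment to a file so it can be
reviewed before loading:

```shell
#!/bin/sh
# Write a crm configuration fragment for one OCFS2 mount to a file for review.
# Hypothetical names (not from the thread): p_fs_www, cl_fs_www and the four
# constraint IDs. That /dev/drbd1 belongs to ms_drbd_www is also an assumption.
out="ocfs2-fs.crm"
cat > "$out" <<'EOF'
primitive p_fs_www ocf:heartbeat:Filesystem \
    params device="/dev/drbd1" directory="/media/ocfs" fstype="ocfs2"
clone cl_fs_www p_fs_www meta interleave="true"
colocation c_fs_www_with_drbd inf: cl_fs_www ms_drbd_www:Master
colocation c_fs_www_with_ocfs2mgmt inf: cl_fs_www cl_ocfs2mgmt
order o_drbd_www_before_fs inf: ms_drbd_www:promote cl_fs_www:start
order o_ocfs2mgmt_before_fs inf: cl_ocfs2mgmt:start cl_fs_www:start
EOF
echo "Review $out, then load it with: crm configure load update $out"
```

The idea is that the filesystem clone only runs where DRBD is Master and where
the controld/o2cb group is already up, and only starts after both.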

> root@Malastare:/home/david# crm configure show
> node Malastare
> node Vindemiatrix
> primitive p_controld ocf:pacemaker:controld
> primitive p_drbd_backupvi ocf:linbit:drbd \
>     params drbd_resource="backupvi"
> primitive p_drbd_pgsql ocf:linbit:drbd \
>     params drbd_resource="postgresql"
> primitive p_drbd_svn ocf:linbit:drbd \
>     params drbd_resource="svn"
> primitive p_drbd_www ocf:linbit:drbd \
>     params drbd_resource="www"
> primitive p_o2cb ocf:pacemaker:o2cb
> primitive soapi-fencing-malastare stonith:external/ovh \
>     params reversedns="ns208812.ovh.net"
> primitive soapi-fencing-vindemiatrix stonith:external/ovh \
>     params reversedns="ns235795.ovh.net"
> group g_ocfs2mgmt p_controld p_o2cb
> ms ms_drbd_backupvi p_drbd_backupvi \
>     meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_pgsql p_drbd_pgsql \
>     meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_svn p_drbd_svn \
>     meta master-max="2" clone-max="2" notify="true"
> ms ms_drbd_www p_drbd_www \
>     meta master-max="2" clone-max="2" notify="true"
> clone cl_ocfs2mgmt g_ocfs2mgmt \
>     meta interleave="true"
> location stonith-malastare soapi-fencing-malastare -inf: Malastare
> location stonith-vindemiatrix soapi-fencing-vindemiatrix -inf: Vindemiatrix
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2"
> 
> Unfortunately, the problem is still there :
> 
> root@Malastare:/home/david# crm_mon --one-shot -VroA
> ============
> Last updated: Fri Jun 22 10:54:31 2012
> Last change: Fri Jun 22 10:54:27 2012 via crm_shadow on Malastare
> Stack: openais
> Current DC: Malastare - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 14 Resources configured.
> ============
> 
> Online: [ Malastare Vindemiatrix ]
> 
> Full list of resources:
> 
>  soapi-fencing-malastare    (stonith:external/ovh):    Started Vindemiatrix
>  soapi-fencing-vindemiatrix    (stonith:external/ovh):    Started Malastare
>  Master/Slave Set: ms_drbd_pgsql [p_drbd_pgsql]
>      Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_svn [p_drbd_svn]
>      Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_www [p_drbd_www]
>      Masters: [ Malastare Vindemiatrix ]
>  Master/Slave Set: ms_drbd_backupvi [p_drbd_backupvi]
>      Masters: [ Malastare Vindemiatrix ]
>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>      Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
> 
> Node Attributes:
> * Node Malastare:
>     + master-p_drbd_backupvi:0            : 10000    
>     + master-p_drbd_pgsql:0               : 10000    
>     + master-p_drbd_svn:0                 : 10000    
>     + master-p_drbd_www:0                 : 10000    
> * Node Vindemiatrix:
>     + master-p_drbd_backupvi:1            : 10000    
>     + master-p_drbd_pgsql:1               : 10000    
>     + master-p_drbd_svn:1                 : 10000    
>     + master-p_drbd_www:1                 : 10000    
> 
> Operations:
> * Node Vindemiatrix:
>    soapi-fencing-malastare: migration-threshold=1000000
>     + (4) start: rc=0 (ok)
>    p_drbd_pgsql:1: migration-threshold=1000000
>     + (5) probe: rc=8 (master)
>    p_drbd_svn:1: migration-threshold=1000000
>     + (6) probe: rc=8 (master)
>    p_drbd_www:1: migration-threshold=1000000
>     + (7) probe: rc=8 (master)
>    p_drbd_backupvi:1: migration-threshold=1000000
>     + (8) probe: rc=8 (master)
>    p_o2cb:1: migration-threshold=1000000
>     + (10) probe: rc=5 (not installed)
> * Node Malastare:
>    soapi-fencing-vindemiatrix: migration-threshold=1000000
>     + (4) start: rc=0 (ok)
>    p_drbd_pgsql:0: migration-threshold=1000000
>     + (5) probe: rc=8 (master)
>    p_drbd_svn:0: migration-threshold=1000000
>     + (6) probe: rc=8 (master)
>    p_drbd_www:0: migration-threshold=1000000
>     + (7) probe: rc=8 (master)
>    p_drbd_backupvi:0: migration-threshold=1000000
>     + (8) probe: rc=8 (master)
>    p_o2cb:0: migration-threshold=1000000
>     + (10) probe: rc=5 (not installed)
> 
> Failed actions:
>     p_o2cb:1_monitor_0 (node=Vindemiatrix, call=10, rc=5,
> status=complete): not installed
>     p_o2cb:0_monitor_0 (node=Malastare, call=10, rc=5, status=complete):
> not installed
> 
> Nevertheless, I noticed a strange error message in Corosync/Pacemaker logs :
> Jun 22 10:54:25 Vindemiatrix lrmd: [24580]: info: RA output:
> (p_controld:1:probe:stderr) dlm_controld.pcmk: no process found

This looks like the initial probe, so it is expected that no controld is
running yet.

> 
> This message was immediately followed by "Wrong stack" errors, and

Check the content of /sys/fs/ocfs2/loaded_cluster_plugins ... if you have
that file and it contains the value "user", this is a good sign that you
have started ocfs2/o2cb via init ;-)

Regards,
Andreas

> because dlm_controld.pcmk seems to be the Pacemaker DLM dæmon, I strongly
> think these messages are related. Strangely, even though I have this dæmon
> executable in /usr/sbin, it's not loaded by Pacemaker:
> root@Vindemiatrix:/home/david# ls /usr/sbin/dlm_controld.pcmk
> /usr/sbin/dlm_controld.pcmk
> root@Vindemiatrix:/home/david# ps fax | grep pcmk
> 26360 pts/1    S+     0:00                          \_ grep pcmk
> 
> But, if I understood correctly, such a process should be launched by the DLM
> resource, and as I have no error messages about launching such a
> process even though its executable is present, do you know where this
> problem could come from?
> 
> Thank you in advance.
> 
> Kind regards.
> 
> PS: I'll have next week off, so I won't be able to answer you
> between this evening and the 2nd of July.
> 
> On 20/06/2012 17:39, Andreas Kurz wrote:
>> On 06/20/2012 03:49 PM, David Guyot wrote:
>>> Actually, yes, I start DRBD manually, because this is currently a test
>>> configuration which relies on OpenVPN for the communications between
>>> these 2 nodes. I have no order and colocation constraints because I'm
>>> discovering this software and trying to configure it step by step, making
>>> resources work before ordering them (nevertheless, I just tried to
>>> configure DLM/O2CB constraints, but they fail, apparently because they
>>> rely on O2CB, which causes the problem I wrote to you about). And I
>>> have no OCFS2 mounts because I was under the assumption that OCFS2 wouldn't
>>> mount partitions without O2CB and DLM, which seems to be right:
>> In fact it won't work without constraints, even if you are only testing:
>> e.g. controld and o2cb must run on the same node (in fact on both nodes,
>> of course) and controld must start before o2cb.
>>
>> And the error message you showed in a previous mail:
>>
>> 2012/06/20_09:04:35 ERROR: Wrong stack o2cb
>>
>> ... implies that you are already running the native ocfs2 cluster stack
>> outside of pacemaker. Did you run "/etc/init.d/ocfs2 stop" before starting
>> your cluster tests, and is it still stopped? If it is stopped, a
>> cleanup of the cl_ocfs2mgmt resource should start that resource ... if there
>> are no other errors.
>>
>> Did you install the dlm-pcmk and ocfs2-tools-pacemaker packages from backports?
>>
>>> root@Malastare:/home/david# crm_mon --one-shot -VroA
>>> ============
>>> Last updated: Wed Jun 20 15:32:50 2012
>>> Last change: Wed Jun 20 15:28:34 2012 via crm_shadow on Malastare
>>> Stack: openais
>>> Current DC: Vindemiatrix - partition with quorum
>>> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
>>> 2 Nodes configured, 2 expected votes
>>> 14 Resources configured.
>>> ============
>>>
>>> Online: [ Vindemiatrix Malastare ]
>>>
>>> Full list of resources:
>>>
>>>  soapi-fencing-malastare    (stonith:external/ovh):    Started Vindemiatrix
>>>  soapi-fencing-vindemiatrix    (stonith:external/ovh):    Started Malastare
>>>  Master/Slave Set: ms_drbd_ocfs2_pgsql [p_drbd_ocfs2_pgsql]
>>>      Masters: [ Malastare Vindemiatrix ]
>>>  Master/Slave Set: ms_drbd_ocfs2_backupvi [p_drbd_ocfs2_backupvi]
>>>      Masters: [ Malastare Vindemiatrix ]
>>>  Master/Slave Set: ms_drbd_ocfs2_svn [p_drbd_ocfs2_svn]
>>>      Masters: [ Malastare Vindemiatrix ]
>>>  Master/Slave Set: ms_drbd_ocfs2_www [p_drbd_ocfs2_www]
>>>      Masters: [ Malastare Vindemiatrix ]
>>>  Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
>>>      Stopped: [ g_ocfs2mgmt:0 g_ocfs2mgmt:1 ]
>>>
>>> Node Attributes:
>>> * Node Vindemiatrix:
>>>     + master-p_drbd_ocfs2_backupvi:1      : 10000    
>>>     + master-p_drbd_ocfs2_pgsql:1         : 10000    
>>>     + master-p_drbd_ocfs2_svn:1           : 10000    
>>>     + master-p_drbd_ocfs2_www:1           : 10000    
>>> * Node Malastare:
>>>     + master-p_drbd_ocfs2_backupvi:0      : 10000    
>>>     + master-p_drbd_ocfs2_pgsql:0         : 10000    
>>>     + master-p_drbd_ocfs2_svn:0           : 10000    
>>>     + master-p_drbd_ocfs2_www:0           : 10000    
>>>
>>> Operations:
>>> * Node Vindemiatrix:
>>>    p_drbd_ocfs2_pgsql:1: migration-threshold=1000000
>>>     + (4) probe: rc=8 (master)
>>>    p_drbd_ocfs2_backupvi:1: migration-threshold=1000000
>>>     + (5) probe: rc=8 (master)
>>>    p_drbd_ocfs2_svn:1: migration-threshold=1000000
>>>     + (6) probe: rc=8 (master)
>>>    p_drbd_ocfs2_www:1: migration-threshold=1000000
>>>     + (7) probe: rc=8 (master)
>>>    soapi-fencing-malastare: migration-threshold=1000000
>>>     + (10) start: rc=0 (ok)
>>>    p_o2cb:1: migration-threshold=1000000
>>>     + (9) probe: rc=5 (not installed)
>>> * Node Malastare:
>>>    p_drbd_ocfs2_pgsql:0: migration-threshold=1000000
>>>     + (4) probe: rc=8 (master)
>>>    p_drbd_ocfs2_backupvi:0: migration-threshold=1000000
>>>     + (5) probe: rc=8 (master)
>>>    p_drbd_ocfs2_svn:0: migration-threshold=1000000
>>>     + (6) probe: rc=8 (master)
>>>    soapi-fencing-vindemiatrix: migration-threshold=1000000
>>>     + (10) start: rc=0 (ok)
>>>    p_drbd_ocfs2_www:0: migration-threshold=1000000
>>>     + (7) probe: rc=8 (master)
>>>    p_o2cb:0: migration-threshold=1000000
>>>     + (9) probe: rc=5 (not installed)
>>>
>>> Failed actions:
>>>     p_o2cb:1_monitor_0 (node=Vindemiatrix, call=9, rc=5,
>>> status=complete): not installed
>>>     p_o2cb:0_monitor_0 (node=Malastare, call=9, rc=5, status=complete):
>>> not installed
>>> root@Malastare:/home/david# mount -t ocfs2 /dev/drbd1 /media/ocfs/
>>> mount.ocfs2: Cluster stack specified does not match the one currently
>>> running while trying to join the group
>>>
>>> Concerning the notify meta-attribute, I didn't configure it because it
>>> wasn't even referred to in the official DRBD guide (
>>> http://www.drbd.org/users-guide-8.3/s-ocfs2-pacemaker.html), and I don't
>>> know what it does, so, by default, I stupidly followed the official
>>> guide. What does this meta-attribute set? If you know a better guide,
>>> could you please tell me about it, so I can check my config against
>>> it?
>> Well, then this is a documentation bug ... you will find the correct
>> configuration in the same guide, where pacemaker integration is
>> described ... "notify" sends out notification messages before and after
>> an instance of the DRBD OCF RA executes an action (like start, stop,
>> promote, demote) ... that allows the other instances to react.
>>
>> Regards,
>> Andreas
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org



-- 
Need help with Pacemaker?
http://www.hastexo.com/now

