[Pacemaker] How to tell pacemaker to start exportfs after filesystem resource

Tue Jun 21 16:00:30 UTC 2011

Hi Dejan,

21.06.2011 17:46, Dejan Muhamedagic wrote:
> Hi Vladislav,
> 
> On Tue, Jun 21, 2011 at 05:38:21PM +0300, Vladislav Bogdanov wrote:
>> 21.06.2011 17:23, Dejan Muhamedagic wrote:
>>> On Tue, Jun 21, 2011 at 06:10:16PM +0400, Aleksander Malaev wrote:
>>>> How can I check this?
>>>> If I don't add this exportfs resource, the cluster becomes fully
>>>> operational: all mounts are accessible and fail-over between nodes
>>>> works as it should. Maybe I need to add some sort of delay between these
>>>> resources?
>>>
>>> If you need to do so (there's actually start-delay, but it
>>> should be deprecated), then some RA doesn't implement the start
>>> action correctly. In this case, it looks like it's Filesystem,
>>> right? Since the filesystem is ocfs2 it may be that the cluster
>>> services supporting ocfs2 are not fast enough. At any rate,
>>> the Filesystem start action shouldn't return before the
>>> filesystem is really mounted.
>>
>> If I recall correctly from my totally failed experiments with ocfs2
>> (simultaneous kernel panic on all nodes running f13-x86_64 ;), this is
>> an ocfs2-specific problem.
>>
>> Although the mount call returns success, the ocfs2 filesystem may not be
>> ready for consumption for at least several seconds.
> 
> That sounds like a plausible explanation. Before trying to fix
> ocfs2, which may take time or be impossible, we can make
> Filesystem use monitor internally to exit only once the
> filesystem has really been mounted. But could somebody please
> open a bugzilla first? This needs to be tracked.
> 
> BTW, interestingly I cannot recall that anybody complained about
> this before. It obviously depends on the network, but still...

I was very surprised by how ocfs2 behaves (at least on Fedora), given
that I have not seen any complaints about it over the last few years, so
it is a complete no-go for me for a long, long time.
gfs2 seems to be much more stable, and it should be even more so after
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6d3117b41295150d4ac70622055dd8f5529d86b2
(although I have gotten rid of clustered filesystems altogether for now
and haven't tried this change yet).

For me it is much better to have a slow but predictable filesystem than
that famous ocfs2.
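
To illustrate the "exit only once the filesystem has really been
mounted" idea from above, here is a rough Python sketch. It is only an
illustration and not the Filesystem RA's code (the real agent is a shell
script). The names is_mounted and wait_until_ready are made up for this
example, and it assumes a Linux-style /proc/mounts. Whether an entry in
/proc/mounts is enough to call ocfs2 "ready" is itself an assumption, so
the readiness check is passed in as a callable and a stronger probe can
be swapped in.

```python
import os
import time


def is_mounted(mountpoint, mounts_file="/proc/mounts"):
    """Return True if mountpoint is listed as a mount target in mounts_file.

    Assumption: the path contains no characters that /proc/mounts escapes
    (for example spaces, which appear there as \\040).
    """
    target = os.path.normpath(mountpoint)
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            # Second field of each /proc/mounts line is the mount target.
            if len(fields) >= 2 and os.path.normpath(fields[1]) == target:
                return True
    return False


def wait_until_ready(check, timeout=40.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout seconds have passed.

    Returns True on success and False on timeout, so a start action built
    on it reports success only once check() has passed.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A start action built this way would run mount and then call something
like wait_until_ready(lambda: is_mounted("/media/media0")), reporting
success only if that returns True. The 40-second default is just the
start timeout used for res-share in the config quoted below, not a
recommended value.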

Best,
Vladislav

> 
> Cheers,
> 
> Dejan
> 
>> Best,
>> Vladislav
>>
>>>
>>> If so, please file a bugzilla for it and attach hb_report of the
>>> incident.
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>> 2011/6/21 Dejan Muhamedagic <dejanmm at fastmail.fm>
>>>>
>>>>> On Tue, Jun 21, 2011 at 05:56:40PM +0400, Aleksander Malaev wrote:
>>>>>> Sure, I'm using an order constraint.
>>>>>> But it seems that it doesn't check the monitor of the previously
>>>>>> started resource.
>>>>>
>>>>> It doesn't need to check monitor. The previous resource, if
>>>>> started, must be fully operational. If it's not, then the RA is
>>>>> broken.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dejan
>>>>>
>>>>>> 2011/6/21 Dejan Muhamedagic <dejanmm at fastmail.fm>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Mon, Jun 20, 2011 at 11:40:04PM +0400, Александр Малаев wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have configured a pacemaker+ocfs2 cluster with shared storage
>>>>>>>> connected by FC.
>>>>>>>> Now I need to set up an NFS export in Active/Active mode, so I added
>>>>>>>> all the needed resources and defined the start order.
>>>>>>>> But when a node starts after a reboot, I get a race condition between
>>>>>>>> the Filesystem resource and exportfs.
>>>>>>>> Exportfs can't start because the ocfs2 mountpoint isn't mounted yet.
>>>>>>>>
>>>>>>>> How can I tell the exportfs resource to start only once the filesystem
>>>>>>>> resource is ready?
>>>>>>>
>>>>>>> Use the order constraint? Or did I miss something? You already
>>>>>>> have some order constraints defined, so you should be able to
>>>>>>> manage.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dejan
>>>>>>>
>>>>>>>> crm config is the following:
>>>>>>>> node msk-nfs-gw01
>>>>>>>> node msk-nfs-gw02
>>>>>>>> primitive nfs-kernel-server lsb:nfs-kernel-server \
>>>>>>>>         op monitor interval="10s" timeout="30s"
>>>>>>>> primitive ping ocf:pacemaker:ping \
>>>>>>>>         params host_list="10.236.22.35" multiplier="100" name="ping" \
>>>>>>>>         op monitor interval="20s" timeout="60s" \
>>>>>>>>         op start interval="0" timeout="60s"
>>>>>>>> primitive portmap upstart:portmap \
>>>>>>>>         op monitor interval="10s" timeout="30s"
>>>>>>>> primitive res-dlm ocf:pacemaker:controld \
>>>>>>>>         op monitor interval="120s"
>>>>>>>> primitive res-fs ocf:heartbeat:Filesystem \
>>>>>>>>         params device="/dev/mapper/mpath0" directory="/media/media0" fstype="ocfs2" \
>>>>>>>>         op monitor interval="120s"
>>>>>>>> primitive res-nfs1-ip ocf:heartbeat:IPaddr2 \
>>>>>>>>         params ip="10.236.22.38" cidr_netmask="27" nic="bond0" \
>>>>>>>>         op monitor interval="30s"
>>>>>>>> primitive res-nfs2-ip ocf:heartbeat:IPaddr2 \
>>>>>>>>         params ip="10.236.22.39" cidr_netmask="27" nic="bond0" \
>>>>>>>>         op monitor interval="30s"
>>>>>>>> primitive res-o2cb ocf:pacemaker:o2cb \
>>>>>>>>         op monitor interval="120s"
>>>>>>>> primitive res-share ocf:heartbeat:exportfs \
>>>>>>>>         params directory="/media/media0/nfsroot/export1" clientspec="10.236.22.0/24" options="rw,async,no_subtree_check,no_root_squash" fsid="1" \
>>>>>>>>         op monitor interval="10s" timeout="30s" \
>>>>>>>>         op start interval="10" timeout="40s" \
>>>>>>>>         op stop interval="0" timeout="40s"
>>>>>>>> primitive st-null stonith:null \
>>>>>>>>         params hostlist="msk-nfs-gw01 msk-nfs-gw02"
>>>>>>>> group nfs portmap nfs-kernel-server
>>>>>>>> clone clone-dlm res-dlm \
>>>>>>>>         meta globally-unique="false" interleave="true"
>>>>>>>> clone clone-fs res-fs \
>>>>>>>>         meta globally-unique="false" interleave="true"
>>>>>>>> clone clone-nfs nfs \
>>>>>>>>         meta globally-unique="false" interleave="true"
>>>>>>>> clone clone-o2cb res-o2cb \
>>>>>>>>         meta globally-unique="false" interleave="true"
>>>>>>>> clone clone-share res-share \
>>>>>>>>         meta globally-unique="false" interleave="true"
>>>>>>>> clone fencing st-null
>>>>>>>> clone ping_clone ping \
>>>>>>>>         meta globally-unique="false"
>>>>>>>> location nfs1-ip-on-nfs1 res-nfs1-ip 50: msk-nfs-gw01
>>>>>>>> location nfs2-ip-on-nfs2 res-nfs2-ip 50: msk-nfs-gw02
>>>>>>>> colocation col-fs-o2cb inf: clone-fs clone-o2cb
>>>>>>>> colocation col-nfs-fs inf: clone-nfs clone-fs
>>>>>>>> colocation col-o2cb-dlm inf: clone-o2cb clone-dlm
>>>>>>>> colocation col-share-nfs inf: clone-share clone-nfs
>>>>>>>> order ord-dlm-o2cb 0: clone-dlm clone-o2cb
>>>>>>>> order ord-nfs-share 0: clone-nfs clone-share
>>>>>>>> order ord-o2cb-fs 0: clone-o2cb clone-fs
>>>>>>>> order ord-o2cb-nfs 0: clone-fs clone-nfs
>>>>>>>> order ord-share-nfs1 0: clone-share res-nfs1-ip
>>>>>>>> order ord-share-nfs2 0: clone-share res-nfs2-ip
>>>>>>>> property $id="cib-bootstrap-options" \
>>>>>>>>         dc-version="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
>>>>>>>>         cluster-infrastructure="openais" \
>>>>>>>>         expected-quorum-votes="2" \
>>>>>>>>         stonith-enabled="true" \
>>>>>>>>         no-quorum-policy="ignore" \
>>>>>>>>         last-lrm-refresh="1308040111"
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards
>>>>>>>> Alexander Malaev
>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Александр Малаев
>>>>>> +7-962-938-9323
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Best regards,
>>>> Александр Малаев
>>>> +7-962-938-9323
>>>