[Pacemaker] Lustre and Multiple Mount Protection

Bernd Schubert bs_lists at aakef.fastmail.fm
Wed Dec 30 12:31:27 UTC 2009


Hello Dejan,

On Thursday 24 December 2009, Dejan Muhamedagic wrote:

[...]

> > > > In a pacemaker cluster with correctly enabled STONITH, the cluster
> > > > manager makes sure that the resource is only mounted on one node,
> > > > doesn't it? At least in my understanding it should. Only after getting
> > > > positive feedback that the resource was stopped on the other node,
> > > > or that the other node was fenced, does pacemaker start the resource
> > > > on the second node, right?
> > >
> > > Right. But I guess you knew that.
> >
> > The problem is those annoying bugs that tell you the device is unmounted
> > although it is not.
> 
> If the RA lies all bets are off, of course.
> 
> > My Lustre server agent, which I will submit here once I
> > find some time to review it again, will protect you from this. At least I
> > hope I caught all Lustre bugs...
> > And besides, pacemaker does not protect you from mounting a filesystem
> > on which e2fsck is currently running.
> 
> I guess that the start action would fail in this case. Please

No, without Multiple Mount Protection (MMP) the start action would *not* fail 
on the fail-over node, so data corruption would be possible.
Lustre internally uses a modified ext3/ext4, and neither ext3 nor ext4 would 
protect you against that. That is why Sun wrote the MMP extension...
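For reference, MMP is an on-disk feature flag, so on an ldiskfs/ext4 device it
can be toggled with the usual e2fsprogs tools. A rough sketch (the device path
is only an example; on a real Lustre target you would normally pass the feature
through the Lustre formatting tools rather than touch the device directly):

```shell
# Enable the mmp feature on an unmounted ext4/ldiskfs device (example path).
# With MMP active, a heartbeat block is updated by the mounting node and a
# second mount attempt from another node is refused while the first is alive.
tune2fs -O mmp /dev/sdb1

# Verify the feature is listed in the superblock
dumpe2fs -h /dev/sdb1 | grep -i mmp
```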

> file a bugzilla if the RA does something unexpected.

The Filesystem agent behaves correctly; the problem is that Lustre must not 
claim the device is unmounted when it is not. One of these bugs will be fixed 
in the next Lustre release, and another one I still need to analyze. 
That is why one should use a Lustre-specific agent, which performs specific 
Lustre checks to verify that the filesystem is really unmounted. 
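To illustrate the idea: instead of trusting the return code of umount, the stop
and monitor paths of such an agent can re-check the kernel's own view of the
mount table. This is only a sketch, not the actual agent; the function name is
made up, and a real Lustre agent would additionally inspect Lustre's own state
(e.g. under /proc/fs/lustre), not just /proc/mounts:

```shell
# Sketch: succeed only if the device no longer appears as a mount source.
# The optional second argument overrides the mount table (for testing only).
is_really_unmounted() {
    dev="$1"
    mtab="${2:-/proc/mounts}"
    # grep succeeds if the device is still listed, so we negate it
    ! grep -q "^${dev} " "$mtab"
}
```

An agent's stop action would call this in a loop after umount and refuse to
report "stopped" until it returns success, which is exactly the guarantee
pacemaker needs before starting the resource elsewhere.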


Cheers,
Bernd


--
Bernd Schubert
DataDirect Networks

