[Pacemaker] killproc not found? o2cb shutdown via resource agent

Matthew O'Connor matt at ecsorl.com
Fri Nov 9 17:41:01 EST 2012


On 11/09/2012 04:26 PM, Andrew Beekhof wrote:
> On Fri, Nov 9, 2012 at 4:43 PM, Matthew O'Connor <matt at ecsorl.com> wrote:
>> On 11/08/2012 08:15 PM, Andrew Beekhof wrote:
>>> You're not starting it as a pacemaker resource are you?
>>> CMAN should be doing that as part of the init script (which explains
>>> why it's still there until after pacemaker is gone).
>> I thought that was the dlm_controld, not ocfs2_controld?
> I know it starts gfs_controld when using GFS... I assume it's the same for OCFS2
Yes, I saw that in the cman script, but I can't seem to find the
magic combination of modules and/or configfs writes to make cman
actually configure ocfs2/o2cb, though there are times it detects o2cb's
presence on init (shortly before dying horribly).

I can configure ocfs2 manually (via /etc/ocfs2/cluster.conf), though
this has no effect on cman (and vice versa), except that cman does not
crash on shutdown and pacemaker then has no involvement with o2cb.  The
presence of the "cman" option for cluster stack in the o2cb RA is a
little bewildering.

I will do more research and reading, perhaps trying GFS out just to get
my head around how it interacts with CMAN.  Perhaps there is a
parallel, or something simple missing from cluster.conf.  Perhaps GFS
is the way to go, obviating these problems with OCFS2?

Thank you for your help!

>
>> dlm_controld
>> is certainly managed by CMAN, but it hasn't been starting ocfs2_controld
>> for me...and without it, the OCFS2 shares won't mount.  For reference:
>>
>> primitive p_iscsiclient-store0-sandbox ocf:heartbeat:iscsi \
>>         params portal="10.16.16.5:3260" target="..." \
>>         ...
>> primitive p_mount-store0-sandbox ocf:heartbeat:Filesystem \
>>         params device="-U 443d287f-b98f-45e4-bd6e-d64dd7af0169" \
>>                directory="/opt/store3" fstype="ocfs2" \
>>         ...
>> primitive p_o2cb ocf:pacemaker:o2cb \
>>         params stack="cman" \
>>         ...
>>
>> (ordering and colocation constraints omitted, along with uninteresting
>> arguments.)  I'll feel quite dumb if there was just some additional
>> configuration required for CMAN and OCFS2 and I somehow missed it.  I
>> guess that would explain why CMAN would try to restart the
>> ocfs2_controld if the ocfs2 modules were still loaded and configfs was
>> still alive and well...though technically it failed every time it tried.
>>
>>> On Fri, Nov 9, 2012 at 11:14 AM, Matthew O'Connor <matt at ecsorl.com> wrote:
>>>> I'm honestly beginning to wonder what exactly that killproc does for the
>>>> ocfs2_controld.cman process... For kicks, I created a script in /sbin
>>>> and /usr/sbin for killproc, which simply sources the lsb include and
>>>> calls the function with whatever was passed via the command-line.
>>>> This is perhaps equivalent to modifying the RA or the included shell
>>>> extensions file, but still not as friendly as installing a .deb. ;-)
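
For reference, a minimal sketch of the kind of wrapper described in the
quoted paragraph above. It assumes the LSB function library lives at
/lib/lsb/init-functions (as on Ubuntu 12.04) and defines killproc as a
shell function. The names run_lsb_killproc and LSB_FUNCTIONS are
hypothetical, added only for illustration; this is not the exact script
from the thread.

```shell
#!/bin/sh
# Sketch of a standalone killproc wrapper (e.g. installed as /sbin/killproc).
# Assumption: the LSB function library defines killproc() as a shell function.
# LSB_FUNCTIONS and run_lsb_killproc are hypothetical names for illustration.

run_lsb_killproc() {
    lsb_functions="${LSB_FUNCTIONS:-/lib/lsb/init-functions}"
    if [ ! -r "$lsb_functions" ]; then
        echo "killproc wrapper: cannot read $lsb_functions" >&2
        return 5    # LSB exit status 5: program is not installed
    fi
    (
        # Source the library in a subshell so the caller's shell is untouched.
        . "$lsb_functions" || exit 5
        # "command -v" prints the bare name for a shell function. Requiring
        # that keeps the wrapper from calling itself through $PATH.
        [ "$(command -v killproc)" = "killproc" ] || exit 5
        # Pass every argument through unchanged.
        killproc "$@"
    )
}

# Only act when invoked under the name "killproc", so the file can also
# be sourced without side effects.
case "${0##*/}" in
    killproc) run_lsb_killproc "$@"; exit $? ;;
esac
```

Installed under the name killproc, the script simply forwards its
arguments to the LSB function; sourced from anywhere else it only
defines run_lsb_killproc.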
>>>>
>>>> However, I'm not sure if it's doing anything useful, even though I can
>>>> see (via echos) that it's being called.  The ocfs2_controld.cman process
>>>> doesn't go away till pacemaker is stopped (and isn't started until
>>>> pacemaker is running and the node is online), which blunders into
>>>> another problem: the o2cb RA appears to be in charge of unloading any
>>>> modules it loaded, but it fails to unload the ocfs2_stack_user module.
>>>> This causes CMAN to fail when shutting down; manually running 'service
>>>> o2cb stop' before 'service cman stop' resolves the problem, but I
>>>> believe the RA should be doing this.  Even when the ocfs2_controld.cman
>>>> process dies with pacemaker, the module remains.  :-/
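
The manual workaround in the quoted paragraph above (stop o2cb before
cman, then see whether ocfs2_stack_user is still loaded) could be
scripted roughly as follows. This is a sketch only: the function names
and the MODULES_FILE override are hypothetical, and it assumes the o2cb
and cman init scripts are reachable through 'service', as on the system
described.

```shell
#!/bin/sh
# Sketch: stop o2cb before cman and report a leftover ocfs2_stack_user module.
# module_loaded, stop_cluster_stack and MODULES_FILE are hypothetical names.

# Return 0 if the named kernel module appears in a /proc/modules-style file,
# where each line starts with the module name.
module_loaded() {
    modules_file="${MODULES_FILE:-/proc/modules}"
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Stop the stack in the order that worked when done by hand in this thread.
stop_cluster_stack() {
    service o2cb stop || return 1
    if module_loaded ocfs2_stack_user; then
        echo "warning: ocfs2_stack_user is still loaded; cman may fail to stop" >&2
    fi
    service cman stop
}
```

Nothing runs until stop_cluster_stack is called explicitly, so the
functions can be sourced into a maintenance script and invoked after
pacemaker has been stopped.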
>>>>
>>>>
>>>> On 11/08/2012 06:02 AM, Dejan Muhamedagic wrote:
>>>>> Hi,
>>>>>
>>>>> On Thu, Nov 08, 2012 at 08:23:53PM +1100, Tim Serong wrote:
>>>>>> On 11/08/2012 07:56 PM, Andrew Beekhof wrote:
>>>>>>> On Thu, Nov 8, 2012 at 5:16 PM, Tim Serong <tserong at suse.com> wrote:
>>>>>>>> On 11/08/2012 12:11 PM, Andrew Beekhof wrote:
>>>>>>>>> On Thu, Nov 8, 2012 at 9:59 AM, Matthew O'Connor <matt at ecsorl.com> wrote:
>>>>>>>>>> Follow-up and additional info:
>>>>>>>>>>
>>>>>>>>>> System is Ubuntu 12.04.  Not sure where killproc is supposed to be derived
>>>>>>>>>> from, or if there is an assumption for it to be a standalone binary or
>>>>>>>>>> script.  I did find it defined in /lib/lsb/init-functions.  Adding
>>>>>>>>>> ". /lib/lsb/init-functions" to the start of the
>>>>>>>>>> /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs file makes the
>>>>>>>>>> process-kill work, but I suspect this is not the most desirable solution.
>>>>>>>>> I think that's as good a solution as any.
>>>>>>>>> I wonder where other distros are getting it from.
>>>>>>>> SLES 11 SP2:
>>>>>>>>
>>>>>>>> # rpm -qf /sbin/killproc
>>>>>>>> sysvinit-2.86-210.1
>>>>>>>>
>>>>>>>> openSUSE 12.2:
>>>>>>>>
>>>>>>>> # rpm -qf /sbin/killproc
>>>>>>>> sysvinit-tools-2.88+-77.3.1.x86_64
>>>>>>>>
>>>>>>>> Can't speak for any others offhand...
>>>>>>> Definitely not on Fedora or its derivatives.
>>>>>> Hrm.  Well, I just had a quick skim of the ocfs2-tools source, and I'd
>>>>>> be willing to bet the o2cb RA was based on the upstream o2cb init
>>>>>> script, which uses killproc, but also sources /lib/lsb/init-functions.
>>>>>> Does Fedora have killproc buried somewhere in there maybe?
>>>>>>
>>>>>> On SUSE, /lib/lsb/init-functions defines start_daemon(), killproc(), and
>>>>>> pidofproc() but these just wrap binaries of the same name in /sbin
>>>>>> (which would explain why o2cb works fine on SUSE, as those "missing"
>>>>>> things are presumably in $PATH anyway).
>>>>>>
>>>>>> I don't know about sourcing /lib/lsb/init-functions in .ocf-shellfuncs -
>>>>>> might be a bit broad?  Presumably couldn't hurt to source it in the o2cb
>>>>>> RA though, unless there's some other cleaner solution...
>>>>> I'd also say just in this particular RA. Unfortunately,
>>>>> distro-specific stuff creeps now and again into agents that are
>>>>> supposed to work everywhere.
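
One way to keep the change local to the o2cb RA, as suggested above, is
to source the LSB functions only when killproc is not already
available. A minimal sketch, assuming /lib/lsb/init-functions defines
killproc on distributions that do not ship it as a binary;
ensure_killproc and LSB_FUNCTIONS are hypothetical names and not part
of the shipped agent.

```shell
#!/bin/sh
# Sketch for use near the top of a single resource agent.
# ensure_killproc and LSB_FUNCTIONS are hypothetical names for illustration.

ensure_killproc() {
    # Already available as a shell function or as a binary in $PATH
    # (the SUSE case described in this thread)? Then there is nothing to do.
    if command -v killproc >/dev/null 2>&1; then
        return 0
    fi
    lsb_functions="${LSB_FUNCTIONS:-/lib/lsb/init-functions}"
    # Otherwise fall back to the LSB function library, if it is readable.
    if [ -r "$lsb_functions" ]; then
        . "$lsb_functions"
    fi
    # Succeed only if killproc is usable now.
    command -v killproc >/dev/null 2>&1
}
```

The agent could call ensure_killproc once before its stop action and
report an installation error (for example $OCF_ERR_INSTALLED) when it
returns non-zero, which leaves .ocf-shellfuncs untouched for every
other agent.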
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Dejan
>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Tim
>>>>>> --
>>>>>> Tim Serong
>>>>>> Senior Clustering Engineer
>>>>>> SUSE
>>>>>> tserong at suse.com
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>
>>>>>> Project Home: http://www.clusterlabs.org
>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>> Bugs: http://bugs.clusterlabs.org

