[ClusterLabs] ocf scripts shell and local variables
Gabriele Bulfon
gbulfon at sonicle.com
Mon Aug 29 09:17:35 UTC 2016
Hi Ken,
I have been talking with the illumos guys about the shell problem.
They all agreed that ksh (and especially the ksh93 used in illumos) is fully Bourne-compatible, and that the "local" variables used in the ocf scripts are not Bourne syntax, but probably bash-specific.
This means that pointing the scripts to "#!/bin/sh" is portable only as long as the scripts really use Bourne-shell-only syntax, since any Unix variant may link /bin/sh to whatever Bourne-compatible shell it likes.
Since that is not the case here, the scripts should point to "#!/bin/bash" or whatever shell they were written for.
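
One way to check whether a given shell accepts "local" is to try it inside a throwaway function. This is only a sketch: the function name supports_local and the list of candidate paths are assumptions for illustration and will differ between systems.

```sh
#!/bin/sh
# Return 0 if the shell given as $1 accepts "local" inside a function.
# "local" is not part of POSIX sh; bash provides it, while the ksh93
# discussed in this thread reports "local: not found".
supports_local() {
    "$1" -c 'f() { local x=1; }; f' >/dev/null 2>&1
}

# Candidate paths are assumptions; adjust them for the system at hand.
for candidate in /bin/sh /bin/bash /usr/bin/bash /usr/bin/ksh93; do
    [ -x "$candidate" ] || continue
    if supports_local "$candidate"; then
        echo "$candidate: local supported"
    else
        echo "$candidate: local NOT supported"
    fi
done
```

If /bin/sh shows up as "NOT supported", any RA that starts with #!/bin/sh and uses "local" can be expected to fail in the way seen in the logs.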
Also, the place to change is not the ocf-* scripts but the RAs themselves (IPaddr, and almost all of the others).
What about making the code base of the RAs and ocf-* scripts portable?
It could be done just by changing them to point to bash, or with some kind of configure option to specify the shell to use.
Meanwhile, changing the scripts by hand to #!/bin/bash worked like a charm, and I will start patching.
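
The manual change can be scripted. Below is a minimal sketch, assuming the RAs start with exactly "#!/bin/sh" on the first line; the helper name set_shebang is hypothetical and is not part of the resource-agents package. It avoids "sed -i", which not every sed implements, and it deliberately avoids "local", so its variables are global.

```sh
#!/bin/sh
# Rewrite a leading "#!/bin/sh" line to another interpreter.
# Usage: set_shebang NEW_SHELL FILE...
set_shebang() {
    new_shell=$1
    shift
    for script in "$@"; do
        # Only touch files whose first line is exactly "#!/bin/sh".
        [ "$(head -n 1 "$script")" = "#!/bin/sh" ] || continue
        tmp="$script.tmp.$$"
        # Write the edited copy to a temporary file, then copy the content
        # back in place so the original mode and ownership are kept.
        sed "1s|.*|#!$new_shell|" "$script" > "$tmp" &&
            cat "$tmp" > "$script"
        rm -f "$tmp"
    done
}
```

For example, "set_shebang /bin/bash /usr/lib/ocf/resource.d/heartbeat/IPaddr" would patch the RA named in the logs; files with any other first line are left untouched.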
Gabriele
----------------------------------------------------------------------------------------
Sonicle S.r.l.: http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon
----------------------------------------------------------------------------------
From: Ken Gaillot
To: gbulfon at sonicle.com, Cluster Labs - All topics related to open-source clustering welcomed
Date: 26 August 2016 15:56:02 CEST
Subject: Re: ocf scripts shell and local variables
On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> I tried adding some debug in ocf-shellfuncs, showing env and ps -ef into
> the corosync.log
> I suspect it's always using ksh, because in the env output I produced I
> find this: KSH_VERSION=.sh.version
> This is normally not present in the environment, unless ksh is running
> the shell.

The RAs typically start with #!/bin/sh, so whatever that points to on
your system is what will be used.

> I also tried modifying all ocf shells with "#!/usr/bin/bash" at the
> beginning, no way, same output.

You'd have to change the RA that includes them.

> Any idea how I can change the used shell to support "local" variables?

You can either edit the #!/bin/sh line at the top of each RA, or figure
out how to point /bin/sh to a Bourne-compatible shell. ksh isn't
Bourne-compatible, so I'd expect lots of #!/bin/sh scripts to fail with
it as the default shell.

> Gabriele
------------------------------------------------------------------------
From: Gabriele Bulfon
To: kgaillot at redhat.com, Cluster Labs - All topics related to open-source clustering welcomed
Date: 26 August 2016 10:12:13 CEST
Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
I looked around where you suggested, inside ocf-binaries and
ocf-shellfuncs etc.
I also found these logs in corosync.log:
Aug 25 17:50:33 [2250] crmd: notice: process_lrm_event:
xstorage1-xstorage2_wan2_IP_start_0:22 [
/usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
such file or
directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[354]: local:
not found [No such file or
directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[355]: local:
not found [No such file or
directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[356]: local:
not found [No such file or directory]\nocf-exit-reason:Setup
problem: coul
Aug 25 17:50:33 [2246] lrmd: notice: operation_finished:
xstorage2_wan2_IP_start_0:3613:stderr [
/usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
such file or directory] ]
Looks like the shell is not happy with the "local" variable definition.
I tried running ocf-shellfuncs manually with sh and bash and they
all run without errors.
How can I see what shell is running these scripts?
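
A small debugging aid can answer this from inside the script itself. This is a sketch only: report_shell is a hypothetical helper name, and it relies on the shell-specific variables BASH_VERSION and KSH_VERSION plus the POSIX "ps -p ... -o comm=" options.

```sh
# Report which interpreter is executing the current script.
report_shell() {
    if [ -n "${BASH_VERSION:-}" ]; then
        # Set by bash even when it is invoked under the name "sh".
        echo "shell=bash version=$BASH_VERSION"
    elif [ -n "${KSH_VERSION:-}" ]; then
        echo "shell=ksh version=$KSH_VERSION"
    else
        # Fall back to the process name of the current shell.
        echo "shell=$(ps -p $$ -o comm= 2>/dev/null || echo unknown)"
    fi
}

# Write to stderr; the log excerpts above show the RA's stderr being
# captured in corosync.log.
report_shell >&2
```

Placed near the top of an RA, before the first use of "local", it shows the interpreter picked by the #! line rather than the one used when running the file by hand.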
----------------------------------------------------------------------------------
From: Ken Gaillot
To: users at clusterlabs.org
Date: 25 August 2016 18:07:42 CEST
Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
On 08/25/2016 10:51 AM, Gabriele Bulfon wrote:
> Hi,
> I'm advancing with this monster cluster on XStreamOS/illumos ;)
> In the previous older tests I used heartbeat, and I had these lines to
> take care of the swapping public IP addresses:
>
> primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4" cidr_netmask="255.255.255.0" nic="e1000g1"
> primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5" cidr_netmask="255.255.255.0" nic="e1000g1"
> location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
> location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
>
> They get configured, but then I get this in crm status:
>
> xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
> xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Stopped
> Failed Actions:
> * xstorage1_wan1_IP_start_0 on xstorage1 'not installed' (5): call=20,
>   status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>   last-rc-change='Thu Aug 25 17:50:32 2016', queued=1ms, exec=158ms
> * xstorage2_wan2_IP_start_0 on xstorage1 'not installed' (5): call=22,
>   status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>   last-rc-change='Thu Aug 25 17:50:33 2016', queued=1ms, exec=29ms
> * xstorage1_wan1_IP_start_0 on xstorage2 'not installed' (5): call=22,
>   status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>   last-rc-change='Thu Aug 25 17:50:30 2016', queued=1ms, exec=36ms
> * xstorage2_wan2_IP_start_0 on xstorage2 'not installed' (5): call=20,
>   status=complete, exitreason='Setup problem: couldn't find command: /usr/bin/gawk',
>   last-rc-change='Thu Aug 25 17:50:29 2016', queued=0ms, exec=150ms
>
> The crm configure process already checked for the presence of the
> required IPaddr shell, and it was ok.
> Now it looks like it's looking for "/usr/bin/gawk", and that is
> actually there!
> Is there any known incompatibility with the mixed heartbeat ocf? Should
> I use corosync specific ocf files or something else?

"heartbeat" in this case is just an OCF provider name, and has nothing
to do with the heartbeat messaging layer, other than having its origin
in the same project. There actually has been a recent proposal to rename
the provider to "clusterlabs" to better reflect the current reality.

The "couldn't find command" message comes from the ocf-binaries shell
functions. If you look at have_binary() there, it uses sed and which,
and I'm guessing that fails on your OS somehow. You may need to patch it.
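
If a patch is needed, one possible direction is a lookup that depends only on POSIX shell features. The following is a hypothetical sketch, not the actual have_binary() from ocf-binaries; the name have_binary_portable is made up for illustration. It uses "command -v" instead of which and sed, and needs no "local" variables.

```sh
# Hypothetical helper, NOT the real ocf-binaries code.
# Succeeds if "$1" is an executable file path or a command found in PATH.
have_binary_portable() {
    case "$1" in
        # A name containing a slash is treated as a path and tested directly.
        */*) [ -x "$1" ] && [ ! -d "$1" ] ;;
        # A bare name is looked up by the shell itself.
        *)   command -v "$1" >/dev/null 2>&1 ;;
    esac
}
```

Note that "command -v" also succeeds for shell builtins and functions, so this is slightly more permissive than a pure PATH search.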
> Thanks again!
> Gabriele