[Pacemaker] Trouble with ocf:Squid resource agent

Mon Aug 13 08:07:46 EDT 2012

Hi,

On Mon, Jul 30, 2012 at 12:09:10PM -0400, Jake Smith wrote:
> 
> ----- Original Message -----
> > From: "Julien Cornuwel" <cornuwel at gmail.com>
> > To: pacemaker at oss.clusterlabs.org
> > Sent: Wednesday, July 25, 2012 5:51:28 AM
> > Subject: Re: [Pacemaker] Trouble with ocf:Squid resource agent
> > 
> > Oops! I spoke too soon. The fix below allows squid to start, but the
> > script also has problems in the 'stop' part: it gets stuck in an
> > infinite loop. Here are the logs (repeating every second):
> > 
> > Jul 25 11:38:47 corsen-a lrmd: [24099]: info: RA output:
> > (Proxy:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/Squid: line
> > 320: kill: -: arguments must be process or job IDs
> > Jul 25 11:38:47 corsen-a lrmd: [24099]: info: RA output:
> > (Proxy:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/Squid: line
> > 320: kill: -: arguments must be process or job IDs
> > Jul 25 11:38:48 corsen-a Squid(Proxy)[24659]: [25682]: INFO:
> > squid:stop_squid:318:  try to stop by SIGKILL: -
> > Jul 25 11:38:48 corsen-a Squid(Proxy)[24659]: [25682]: INFO:
> > squid:stop_squid:318:  try to stop by SIGKILL: -
> > 
> > Being on a deadline, I'll use the lsb script for the moment. If
> > someone figures out how to use this ocf script, I'm very interested.
> > 
> 
> I took a quick look at the OCF script. Here is the stop section with my inline comments (marked ###):
> 
> stop_squid()
> {
> 	typeset lapse_sec
> 
> 	if ocf_run $SQUID_EXE -f $SQUID_CONF -k shutdown; then
> 		lapse_sec=0
> 		while true; do
> 			get_pids
> 			if is_squid_dead; then
> 				rm -f $SQUID_PIDFILE
> 				return $OCF_SUCCESS
> 			fi
> 			(( lapse_sec = lapse_sec + 1 ))
> 			if (( lapse_sec > SQUID_STOP_TIMEOUT )); then
> 
> ### It looks to me like you are hitting the condition above, which breaks out of this loop and drops down to the second "while true" loop further down. I would time a manual stop of squid (I know it can take quite a while) and make sure your primitive's "op stop interval="0" timeout="120s"" is set high enough (definitely more than 120s, I would assume) that the elapsed time to stop squid doesn't normally exceed the timeout value.
> 
> 				break
> 			fi
> 			sleep 1
> 			ocf_log info "$SQUID_NAME:$FUNCNAME:$LINENO: " \
> 				"stop NORM $lapse_sec/$SQUID_STOP_TIMEOUT"
> 		done
> 	fi
> 
> 	while true; do
> 		get_pids
> 		ocf_log info "$SQUID_NAME:$FUNCNAME:$LINENO: " \
> 			"try to stop by SIGKILL:${SQUID_PIDS[0]} ${SQUID_PIDS[2]}"
> 		kill -KILL ${SQUID_PIDS[0]} ${SQUID_PIDS[2]}
> 
> ### Have you tried running the line above manually to see what you get (substituting the correct PIDs, of course)? Maybe the kill -KILL syntax is invalid for your flavor of Linux and the OCF script needs to be updated to take that into account when running the kill command? Even if you increase the timeout above to a normally reasonable value, you still want the script to be able to kill squid if it is unresponsive!
> 
> 		sleep 1
> 		if is_squid_dead; then
> 			rm -f $SQUID_PIDFILE
> 			return $OCF_SUCCESS
> 		fi
> 	done
> 
> 	return $OCF_ERR_GENERIC
> }
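
A note on the kill error in the logs: the message "kill: -: arguments
must be process or job IDs" complains about the argument "-", not about
the -KILL option, so kill was handed a literal "-" where a PID was
expected. Where that "-" comes from is not visible in this thread. Below
is a minimal sketch of one possible guard, assuming the PID list can hold
empty or non-numeric entries. filter_pids and kill_valid_pids are
hypothetical names, not functions in the shipped agent.

```shell
# Hypothetical helper (not part of the shipped Squid RA): print only the
# arguments that look like PIDs, dropping empty strings and placeholders
# such as the "-" seen in the log.
filter_pids()
{
	local pid
	for pid in "$@"; do
		case "$pid" in
			''|*[!0-9]*) ;;            # empty or non-numeric: skip
			*) printf '%s\n' "$pid" ;; # digits only: keep
		esac
	done
}

# Hypothetical wrapper: send SIGKILL only if at least one valid PID is left.
kill_valid_pids()
{
	local pids
	pids=$(filter_pids "$@")
	[ -n "$pids" ] || return 0       # nothing valid to kill
	# Unquoted on purpose so each PID becomes a separate argument.
	kill -KILL $pids 2>/dev/null
	return 0
}
```

In stop_squid the call would then read
kill_valid_pids ${SQUID_PIDS[0]} ${SQUID_PIDS[2]}. This only avoids the
kill error; whether is_squid_dead then reports success depends on code
that is not quoted here.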
> 
> 
> > Regards
> > 
> > 
> > 2012/7/24 Julien Cornuwel <cornuwel at gmail.com>:
> > > Hi,
> > >
> > > Fixed! The problem comes from the squid ocf script
> > > (/usr/lib/ocf/resource.d/heartbeat/Squid), which doesn't handle IPv6
> > > addresses correctly.
> > > All you have to do is modify line 198 as follows:
> > > awk '/(tcp.*[0-9]+\.[0-9]+\.+[0-9]+\.[0-9]+:'$SQUID_PORT' |tcp.*:::'$SQUID_PORT' )/{
> > >
> > > Source:
> > > http://www.n3oxid.fr/index.php?post/2012/04/07/Installation-et-configuration-d-un-cluster-Pacemaker/CoroSync-sous-GNU/Linux-Debian-6-%28Squeeze%29
> > >
> 
> I'm not sure whether the above fully patches the OCF script for squid over both IPv4 and IPv6, but I would recommend submitting a patch against the resource agent so that in the future it just works ;-)

Yes. If somebody opens a bugzilla at LF
(https://developerbugs.linuxfoundation.org/) or an issue at
https://github.com/ClusterLabs/resource-agents somebody
(hopefully the author) will take care of it.
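
For whoever picks this up: instead of extending the regular expression,
the port lookup could match on netstat fields. A rough sketch, assuming
"netstat -apn" prints the local address in column 4 and "PID/Program
name" in column 7 for TCP lines. pid_by_port is a hypothetical name, and
this is a different technique from the one-line regex change quoted
above, not the agent's actual code.

```shell
# Hypothetical helper: read "netstat -apn"-style lines on stdin and print
# the PID owning a TCP socket whose local port equals $1 (IPv4 or IPv6).
pid_by_port()
{
	awk -v port="$1" '
		$1 ~ /^tcp/ {
			# $4 is the local address, e.g. 0.0.0.0:3128 or :::3128;
			# the port is whatever follows the last colon.
			n = split($4, parts, ":")
			if (parts[n] != port) next
			# $7 is "PID/Program name", or "-" when the owner is unknown.
			split($7, owner, "/")
			if (owner[1] ~ /^[0-9]+$/) { print owner[1]; exit }
		}'
}
```

Usage would be something like: netstat -apn | pid_by_port "$SQUID_PORT".
Lines whose owner column is "-" are skipped, so the function prints
nothing when no owning PID is reported.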

Thanks,

Dejan

> HTH
> Jake
> 