[Pacemaker] Startup ordering problem

Brett Delle Grazie brett.dellegrazie at intact-is.com
Fri Sep 10 13:24:06 UTC 2010


Hi,

Comments interleaved below,

On Thu, 2010-09-09 at 15:43 +0000, Jody Nickel wrote:
> I'm attempting to deliver my first clustering solution and
> I'm making progress but I'm having trouble with getting
> things to start in the proper order.
> 
> Here is what I'd like it to do:
> 
> The two database services run on "special" nodes that have
> more disk capacity. At most 2 nodes in the cluster can
> be database nodes: 1 primary and 1 replicated.
> 
> Apache can run on any node in the cluster and load-balances to
> the Tomcat nodes.
> 
> Tomcat runs on every node in the cluster.
> 
> 
> When the primary database (non-standby database) goes down,
> I need Tomcat to be restarted so that any database connections
> to the failed db service are dropped, and brought back up only after
> the standby has been restarted as the new primary.

One alternative approach (and there are others) is to configure the
database connection pool in Tomcat to validate its connections and
reconnect to the database automatically.  This usually requires that
the database is served on a floating IP.

Specific database configuration varies according to what database you're
connecting to.  Basic Tomcat configuration is described here:
http://tomcat.apache.org/tomcat-6.0-doc/jndi-datasource-examples-howto.html

Tomcat 6 and earlier use the Commons DBCP pool by default; its
configuration is described here:
http://commons.apache.org/dbcp/configuration.html
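
As a rough sketch of the pool-validation approach, a Tomcat 6
context.xml resource using the DBCP validation settings might look
something like the following.  The resource name, database name,
credentials and the MySQL driver are placeholders of mine (I don't know
which database you are running); the address is the PrimaryDBIP from
your configuration, on the assumption that it always points at the
current primary.

```xml
<!-- Sketch only: the resource name, database name, credentials and
     driver are hypothetical placeholders, not taken from the thread. -->
<Context>
  <!-- validationQuery plus testOnBorrow make DBCP check a connection
       before handing it out; connections that fail are discarded. -->
  <Resource name="jdbc/AppDB"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="com.mysql.jdbc.Driver"
            url="jdbc:mysql://10.9.99.12:3306/appdb"
            username="appuser"
            password="changeme"
            maxActive="20"
            maxIdle="5"
            maxWait="10000"
            validationQuery="SELECT 1"
            testOnBorrow="true"
            testWhileIdle="true"
            timeBetweenEvictionRunsMillis="30000"
            minEvictableIdleTimeMillis="60000"/>
</Context>
```

With testOnBorrow enabled, a stale connection to the failed server
should be caught the next time the application borrows it, at the cost
of one extra query per borrow.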

Tomcat 7 has its own pooling mechanism, which by all reports is
significantly faster.

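Oracle, DB2 and MySQL all have options in their JDBC drivers to connect
to multiple servers.  All of these require a modicum of application
support to preserve ACID compliance under failure conditions; for
example, the application must be prepared to retry a transaction that
was in flight when the connection failed.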
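
For MySQL, for instance, Connector/J accepts a comma-separated host
list in the JDBC URL.  A hedged sketch, using the PrimaryDBIP and
StandbyDBIP addresses from your configuration; the resource name,
database name and credentials are hypothetical, and whether
failOverReadOnly=false is safe depends on how your replication is set
up:

```xml
<!-- Sketch only: assumes MySQL with Connector/J. The resource name,
     database name and credentials are hypothetical placeholders. -->
<Context>
  <!-- The driver tries the first host, then falls back to the second.
       failOverReadOnly=false permits writes after a failover. -->
  <Resource name="jdbc/AppDB"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="com.mysql.jdbc.Driver"
            url="jdbc:mysql://10.9.99.12:3306,10.9.99.13:3306/appdb?failOverReadOnly=false"
            username="appuser"
            password="changeme"
            validationQuery="SELECT 1"
            testOnBorrow="true"/>
</Context>
```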

Alternatively, someone else might know of a constraint that you could
use to force a restart.  I'm not sure what it might be, however.

> Apache should probably be the last service up, so that when its IP
> responds, Tomcat and the database are up and ready and requests can
> be serviced w/o failures.
> 
> 
> If the node running the primary database is taken down, the
> standby database stops and then restarts as the primary, but
> Tomcat isn't restarted, so it still holds database
> connections in its connection pool to the old server and fails
> when trying to serve requests.
> 
> 
> 
> node server1 \
> 	attributes db="true" standby="off"
> node server2 \
> 	attributes db="true" standby="off"
> primitive ApacheServer ocf:heartbeat:apache \
> 	params statusurl="http://10.9.99.10/index.html" port="80" \
>         testregex="success" envfiles="/etc/apache2/envvars" \
>         configfile="/etc/apache2/apache2.conf" httpd="/usr/sbin/apache2" \
> 	op monitor interval="1min" \
> 	op start interval="0" timeout="90s" \
> 	op stop interval="0" timeout="90s"
> primitive ApacheServerEndPointIP ocf:heartbeat:IPaddr2 \
> 	params ip="10.9.99.11" cidr_netmask="20" \
> 	op monitor interval="90s"
> primitive ApacheServerManagementIP ocf:heartbeat:IPaddr2 \
> 	params ip="10.9.99.10" cidr_netmask="20" \
> 	op monitor interval="90s"
> primitive PrimaryDB lsb:primarydb.rb \
> 	op monitor interval="1min" \
> 	meta priority="50"
> primitive PrimaryDBIP ocf:heartbeat:IPaddr2 \
> 	params ip="10.9.99.12" cidr_netmask="20" \
> 	op monitor interval="90s" \
> 	op stop interval="0" timeout="180s" \
> 	op start interval="0" timeout="90s"
> primitive StandbyDB lsb:failoverdb.rb \
> 	op monitor interval="1min" \
> 	meta priority="10"
> primitive StandbyDBIP ocf:heartbeat:IPaddr2 \
> 	params ip="10.9.99.13" cidr_netmask="20" \
> 	op monitor interval="90s" \
> 	op stop interval="0" timeout="180s" \
> 	op start interval="0" timeout="90s"
> primitive Tomcat lsb:tomcat6 \
> 	op monitor interval="1min"
> group apache_group ApacheServerManagementIP \
>                    ApacheServerEndPointIP \
>                    ApacheServer
> 
> group primary_group PrimaryDB PrimaryDBIP
> group standby_group StandbyDB StandbyDBIP
> 
> clone tomcat_group Tomcat \
> 	meta globally-unique="false"
> 
> colocation SeparatePrimaryAndStandby -inf: StandbyDB PrimaryDB
> 
> order startup_order inf: primary_group:start \
>                          tomcat_group:start \
>                          apache_group:start
> order stop_order inf: apache_group:stop \
>                       tomcat_group:stop \
>                       primary_group:stop
> 
> property $id="cib-bootstrap-options" \
> 	dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
> 	cluster-infrastructure="openais" \
> 	expected-quorum-votes="2" \
> 	stonith-enabled="false" \
> 	no-quorum-policy="ignore" \
> 	symmetric-cluster="true"
> rsc_defaults $id="rsc-options" \
> 	resource-stickiness="100"
> 
> 
> 
> 

-- 
Best Regards,

Brett Delle Grazie
