[Pacemaker] Issue with controlling resource Oracle Database Express

Dejan Muhamedagic dejanmm at fastmail.fm
Fri May 7 12:58:14 UTC 2010


Hi,

On Fri, May 07, 2010 at 11:54:27AM +0200, JECH Ladislav wrote:
> Hi,
> 
> I finally have Pacemaker up and running on CentOS 5.4. I am currently using
> Heartbeat, but I want to switch to OpenAIS (Corosync). There were some
> problems related to Python and XML: strace shows that crm still tries to
> open some files which don't exist, and I also had to create some symbolic
> links because of bad paths. I will try to summarize these problems in
> another thread. For now, both nodes of my Active/Passive cluster with
> shared storage, running OCFS2 as the filesystem, are online using the
> following configuration.
> 
> [root@tidevfnkv1 python2.4]# cat /var/lib/heartbeat/crm/cib.xml
> <cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1"
> have-quorum="1" dc-uuid="e7cf0526-5304-45f1-b9ee-0ee9fe69c834"
> admin_epoch="0" epoch="29" num_updates="0" cib-last-written="Fri May  7
> 05:05:28 2010">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
> value="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7"/>
>         <nvpair id="cib-bootstrap-options-cluster-infrastructure"
> name="cluster-infrastructure" value="Heartbeat"/>
>         <nvpair id="cib-bootstrap-options-stonith-enabled"
> name="stonith-enabled" value="false"/>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node type="normal" uname="tidevfnkv2"
> id="e7cf0526-5304-45f1-b9ee-0ee9fe69c834">
>         <instance_attributes
> id="nodes-e7cf0526-5304-45f1-b9ee-0ee9fe69c834">
>           <nvpair name="standby"
> id="nodes-e7cf0526-5304-45f1-b9ee-0ee9fe69c834-standby" value="on"/>
>         </instance_attributes>
>       </node>
>       <node type="normal" uname="tidevfnkv1"
> id="db65bdf6-ecd2-4bcb-9ef5-451681ec2906">
>         <instance_attributes
> id="nodes-db65bdf6-ecd2-4bcb-9ef5-451681ec2906">
>           <nvpair name="standby"
> id="nodes-db65bdf6-ecd2-4bcb-9ef5-451681ec2906-standby" value="off"/>
>         </instance_attributes>
>       </node>
>     </nodes>
>     <resources>
>       <group id="ip_fnkv_cluster">
>         <primitive class="ocf" id="failover-ip" provider="heartbeat"
> type="IPaddr">
>           <instance_attributes id="failover-ip-instance_attributes">
>             <nvpair id="failover-ip-instance_attributes-ip" name="ip"
> value="172.28.140.113"/>
>           </instance_attributes>
>           <operations>
>             <op id="failover-ip-monitor-10s" interval="10s"
> name="monitor"/>
>           </operations>
>         </primitive>
>         <primitive class="lsb" id="failover-apache" type="httpd">
>           <operations>
>             <op id="failover-apache-monitor-15s" interval="15s"
> name="monitor"/>
>           </operations>
>         </primitive>
>       </group>
>       <primitive class="ocf" id="pingd" provider="pacemaker"
> type="pingd">
>         <instance_attributes id="pingd-instance_attributes">
>           <nvpair id="pingd-instance_attributes-host_list"
> name="host_list" value="172.28.140.10"/>
>           <nvpair id="pingd-instance_attributes-multiplier"
> name="multiplier" value="100"/>
>         </instance_attributes>
>         <operations>
>           <op id="pingd-monitor-15s" interval="15s" name="monitor"
> timeout="5s"/>
>         </operations>
>       </primitive>
>       <primitive class="ocf" id="failover-oracle" provider="heartbeat"
> type="oracle">
>         <instance_attributes id="failover-oracle-instance_attributes">
>           <nvpair id="failover-oracle-instance_attributes-sid"
> name="sid" value="XE"/>
>           <nvpair id="failover-oracle-instance_attributes-home"
> name="home"
> value="/usr/lib/oracle/xe/app/oracle/product/10.2.0/server"/>
>           <nvpair id="failover-oracle-instance_attributes-user"
> name="user" value="oracle"/>
>         </instance_attributes>
>         <operations>
>           <op id="failover-oracle-monitor-5s" interval="5s"
> name="monitor" on-fail="restart" timeout="30s"/>
>         </operations>
>       </primitive>
>     </resources>
>     <constraints>
>       <rsc_location id="ip_fnkv_cluster_on_connected_node"
> rsc="ip_fnkv_cluster">
>         <rule boolean-op="or"
> id="ip_fnkv_cluster_on_connected_node-rule" score="-INFINITY">
>           <expression attribute="pingd"
> id="ip_fnkv_cluster_on_connected_node-expression"
> operation="not_defined"/>
>           <expression attribute="pind"
> id="ip_fnkv_cluster_on_connected_node-expression-0" operation="lte"
> value="0"/>
>         </rule>
>       </rsc_location>
>     </constraints>
>     <op_defaults/>
>     <rsc_defaults/>
>   </configuration>

It is better to use "crm configure show" to print the configuration.

> OK, the output of crm_mon is as follows:
> ============
> Last updated: Fri May  7 05:27:20 2010
> Stack: Heartbeat
> Current DC: tidevfnkv2 (e7cf0526-5304-45f1-b9ee-0ee9fe69c834) - partition with quorum
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
> 2 Nodes configured, unknown expected votes
> 3 Resources configured.
> ============
> 
> Online: [ tidevfnkv2 tidevfnkv1 ]
> 
>  Resource Group: ip_fnkv_cluster
>      failover-ip        (ocf::heartbeat:IPaddr):        Started tidevfnkv2
>      failover-apache    (lsb:httpd):    Started tidevfnkv2
> pingd   (ocf::pacemaker:pingd): Started tidevfnkv1
> 
> Failed actions:
>     failover-oracle_start_0 (node=tidevfnkv2, call=18, rc=1, status=complete): unknown error
>     failover-oracle_monitor_5000 (node=tidevfnkv1, call=42, rc=7, status=complete): not running
>     failover-oracle_start_0 (node=tidevfnkv1, call=44, rc=1, status=complete): unknown error
> 
> And there is a problem with starting up the Oracle Database.

You should take a look at the logs and find out why the start
action failed.
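
For instance, something along these lines pulls out everything that was logged about the resource. This is only a rough sketch: rsc_log is a made-up helper name, and /var/log/messages is an assumption, since where Heartbeat logs depends on your ha.cf and syslog setup.

```shell
# rsc_log: hypothetical helper, not part of Pacemaker or Heartbeat.
# Usage: rsc_log <resource-id> [logfile] [max-lines]
rsc_log() {
    rsc=$1
    logfile=${2:-/var/log/messages}   # assumption: adjust to your setup
    count=${3:-50}
    # -F treats the resource id as a fixed string, not a regular expression
    grep -F -- "$rsc" "$logfile" | tail -n "$count"
}
```

Running "rsc_log failover-oracle" on each node, around the time of the failed start, is a reasonable first place to look for whatever the RA reported before it returned rc=1.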

> I have to say that I am using the free Express Edition. It was not my
> decision to select this type of database, but this is the reality. It
> seems that the oracle resource agent is not ready for use with the
> Express Edition, only with the full version of the database. There is
> also a second resource agent for the Oracle listener.
> 
> However, Oracle Database Express is installed with built-in scripts to
> start and stop the database.

Just like any other Oracle version, I guess.

> Each script starts/stops both the listener and the instance.
> Here is the code of both scripts:
> 
> startdb.sh>
> #!/bin/bash
> #
> #       svaggu 09/28/05 -  Creation
> #	svaggu 11/09/05 -  dba groupd check is added
> #
> 
> xsetroot -cursor_name watch
> case $PATH in
>     "") PATH=/bin:/usr/bin:/sbin:/etc
>         export PATH ;;
> esac
> 
> SAVE_LLP=$LD_LIBRARY_PATH
> 
> ORACLE_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server
> ORACLE_SID=XE
> LSNR=$ORACLE_HOME/bin/lsnrctl
> SQLPLUS=$ORACLE_HOME/bin/sqlplus
> export ORACLE_HOME
> export ORACLE_SID
> LOG="$ORACLE_HOME_LISTNER/listener.log"
> user=`/usr/bin/whoami`
> group=`/usr/bin/groups $user | grep dba`
> if test -z "$group"
> then
> 	xterm -T "Warning" -n "Warning" -hold -e "echo Operation failed.
> $user is not a member of \'dba\' group." 
> else
> # Starting Oracle Database 10g Express Edition instance and Listener
> 	$SQLPLUS -s /nolog @$ORACLE_HOME/config/scripts/startdb.sql >
> /dev/null 2>&1
> 	if [ ! `ps -ef | grep tns | cut -f1 -d" " | grep -q oracle` ]
> 	then
> 		$LSNR start > /dev/null 2>&1
> 	else
> 		echo ""
> 	fi
> fi
> 	xsetroot -cursor_name left_ptr
> 
> startdb.sql>
> connect / as sysdba
> startup
> exit
> 
> stopdb.sh>
> #!/bin/bash
> #
> #       svaggu 09/28/05 -  Creation
> #       svaggu 11/09/05 -  dba groupd check is added
> #
> 
> xsetroot -cursor_name watch
> 
> case $PATH in
>     "") PATH=/bin:/usr/bin:/sbin:/etc
>         export PATH ;;
> esac
> 
> SAVE_LLP=$LD_LIBRARY_PATH
> 
> ORACLE_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server
> ORACLE_SID=XE
> SQLPLUS=$ORACLE_HOME/bin/sqlplus
> export ORACLE_HOME
> export ORACLE_SID
> user=`/usr/bin/whoami`
> group=`/usr/bin/groups $user | grep dba`
> if test -z "$group"
> then
>         xterm -T "Warning" -n "Warning" -hold -e "echo Operation failed.
> $user is not a member of \'dba\' group." 
> else
> # Stop Oracle Database 10g Express Edition instance
> 	$SQLPLUS -s /nolog @$ORACLE_HOME/config/scripts/stopdb.sql >
> /dev/null 2>&1
> fi
> 	
> xsetroot -cursor_name left_ptr
> 
> stopdb.sql>
> connect / as sysdba
> shutdown immediate
> exit
> 
> OK, the resource agent is located at
> /usr/lib/ocf/resource.d/heartbeat/oracle, and here is what I want to do:
> I want to create my own resource agent, let me call it "oraclexe". Here
> are my questions:
> 1.) Is it possible to create my own resource agent "oraclexe" (which will
> start both the listener and the db instance) just by creating a new shell
> file in the /usr/lib/ocf/resource.d/heartbeat/ directory?

Not just like that. The script has to follow the OCF standard.
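
Roughly, that means the script takes the action as its first argument, gets its parameters as OCF_RESKEY_* environment variables, returns the OCF exit codes, and prints XML metadata on request. A bare-bones sketch of the shape follows. It is not a working Oracle agent: the state file only stands in for real start/stop/monitor logic, and the name oraclexe and the parameter "state" are purely illustrative.

```shell
#!/bin/sh
# Hypothetical skeleton of an OCF resource agent called "oraclexe".

# Exit codes defined by the OCF resource agent API
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Assumption: a state file stands in for the real database checks
STATE_FILE="${OCF_RESKEY_state:-/tmp/oraclexe-demo.state}"

oraclexe_meta_data() {
    cat <<EOF
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="oraclexe">
<version>0.1</version>
<longdesc lang="en">Illustrative skeleton only.</longdesc>
<shortdesc lang="en">Skeleton agent</shortdesc>
<parameters>
<parameter name="state" unique="1">
<longdesc lang="en">File used to remember the demo state.</longdesc>
<shortdesc lang="en">State file</shortdesc>
<content type="string" default="/tmp/oraclexe-demo.state"/>
</parameter>
</parameters>
<actions>
<action name="start" timeout="120"/>
<action name="stop" timeout="120"/>
<action name="monitor" timeout="30" interval="10"/>
<action name="meta-data" timeout="5"/>
</actions>
</resource-agent>
EOF
}

oraclexe_monitor() {
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

oraclexe_start() {
    # start must succeed if the resource is already running
    if oraclexe_monitor; then
        return $OCF_SUCCESS
    fi
    touch "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

oraclexe_stop() {
    # stop must succeed if the resource is already stopped
    rm -f "$STATE_FILE" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

oraclexe_main() {
    case "$1" in
        meta-data) oraclexe_meta_data ;;
        start)     oraclexe_start ;;
        stop)      oraclexe_stop ;;
        monitor)   oraclexe_monitor ;;
        *)         echo "usage: $0 {start|stop|monitor|meta-data}" >&2
                   return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# Dispatch only when an action was given, so the file can also be sourced
if [ $# -gt 0 ]; then
    oraclexe_main "$@"
    exit $?
fi
```

The important part is the contract, not the file location: monitor has to return 7 when the resource is cleanly stopped, and start and stop have to be safe to repeat.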

> 2.) Is there a way to debug/trace resource agents in case they do not
> work in the expected way?

Yes, you can use ocf-tester. Or run the script by hand:

OCF_RESKEY_sid=XE /usr/lib/ocf/resource.d/heartbeat/oracle start
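
When running it by hand, pass the same parameters as in the CIB, and set OCF_ROOT, which the cluster normally sets for you and which the agent may need to find its helper functions. A small wrapper along these lines may help. Note that run_ra is just an illustrative name, and "sh -x" assumes the agent is a plain shell script.

```shell
# run_ra: hypothetical helper to run one RA action by hand with tracing.
# Usage: run_ra <agent-path> <action> [OCF_RESKEY_name=value ...]
run_ra() {
    agent=$1
    action=$2
    shift 2
    rc=0
    # Remaining arguments are NAME=value pairs handed to env(1);
    # sh -x prints every command the agent executes to stderr.
    env OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" "$@" sh -x "$agent" "$action" || rc=$?
    echo "$action exited with rc=$rc" >&2
    return "$rc"
}
```

For the resource in this thread that would be something like:

run_ra /usr/lib/ocf/resource.d/heartbeat/oracle start OCF_RESKEY_sid=XE OCF_RESKEY_home=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server OCF_RESKEY_user=oracle

The ocf-tester equivalent takes the resource name with -n and each parameter with -o:

ocf-tester -n failover-oracle -o sid=XE -o home=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server -o user=oracle /usr/lib/ocf/resource.d/heartbeat/oracle

Both really start and stop the database, so don't run them while the cluster is trying to manage the resource.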

> 3.) Do you have another approach or solution to my issue?

I'd strongly suggest using the existing oracle RA.
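
Together with the listener agent you mentioned, that could look roughly like the following. Treat it as a sketch: the ids failover-oralsnr and oracle_fnkv, the monitor interval and timeout, and the listener-before-database order are my assumptions, and the commands are wrapped in a function only so that nothing runs until you call it.

```shell
# Oracle home as already used for failover-oracle in the CIB above
ORA_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server

# configure_oracle_xe: hypothetical wrapper around the crm shell
configure_oracle_xe() {
    # Listener, handled by the separate oralsnr agent
    crm configure primitive failover-oralsnr ocf:heartbeat:oralsnr \
        params sid=XE home="$ORA_HOME" user=oracle \
        op monitor interval=30s timeout=60s
    # Group members start in the listed order and stop in reverse
    crm configure group oracle_fnkv failover-oralsnr failover-oracle
}
```

Once the reason for the failed start is fixed, "crm resource cleanup failover-oracle" clears the failed actions so the cluster tries again.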

Thanks,

Dejan

> Thank you very much. Anyway, I have to say that I went deeper into the
> documentation of Pacemaker, Corosync, OpenAIS, ClusterGlue and CRM, and I
> think this is very good stuff. Thank you for your hard work.
> 
> Best regards,
> 
> Ladislav Jech
> 



