[Pacemaker] mysql cluster resource agent failed to start

Fri Aug 17 10:22:19 EDT 2012

> OK, this is why Fridays are better spent writing documentation or
> something like that, instead of fixing problems on clusters. Sorry, that
> was my fault.
> 
> Here are the logs of that service group before the shutdown:
> 
> Aug 17 13:45:33 hydra2 crmd: [26828]: info: te_rsc_command: Initiating
> action 413: start mysqltest03-db_start_0 on hydra2 (local)
> 
> Aug 17 13:45:33 hydra2 crmd: [26828]: info: do_lrm_rsc_op: Performing
> key=413:850:0:eb13866d-3a8f-4d87-bc81-82e893dc72d6
> op=mysqltest03-db_start_0 )
> 
> Aug 17 13:45:35 hydra2 lrmd: [26825]: info: rsc:mysqltest03-db start[238]
> (pid 11388)
> 
> Aug 17 13:45:39 hydra2 lrmd: [26825]: info: operation start[238] on
> mysqltest03-db for client 26828: pid 11388 exited with return code 1
> 
> Aug 17 13:45:39 hydra2 crmd: [26828]: info: process_lrm_event: LRM
> operation mysqltest03-db_start_0 (call=238, rc=1, cib-update=1320,
> confirmed=true) unknown error
> 
> Aug 17 13:45:39 hydra2 crmd: [26828]: WARN: status_from_rc: Action 413
> (mysqltest03-db_start_0) on hydra2 failed (target: 0 vs. rc: 1): Error
> 
> Aug 17 13:45:39 hydra2 crmd: [26828]: WARN: update_failcount: Updating
> failcount for mysqltest03-db on hydra2 after failed start: rc=1
> (update=INFINITY, time=1345203939)

OK. The start operation failed (rc=1), but the agent did not log why. Bad.

> Be patient with me. All I see in the logs is that there was an unknown
> error. As I wrote before, if I start the resource agent by hand, it works
> without problems:
> 
> export OCF_ROOT=/usr/lib/ocf
> export OCF_RESKEY_binary=/srv/mysql-server/releases/mysql/bin/mysqld
> export OCF_RESKEY_config="/srv/mysql/mysqltest03/admin/etc/my.cnf"
> export OCF_RESKEY_user=mysql
> export OCF_RESKEY_group=mysql
> export OCF_RESKEY_datadir=/srv/mysql/mysqltest03/data
> export OCF_RESKEY_log="/srv/mysql/mysqltest03/admin/log/mysqld.log"
> export OCF_RESKEY_pid="/srv/mysql/mysqltest03/admin/run/mysqld.pid"
> export OCF_RESKEY_socket="/srv/mysql/mysqltest03/admin/run/mysqld.sock"
> export OCF_RESKEY_additional_parameters="--bind-address=xx.xx.xx.xx"
> /usr/lib/ocf/resource.d/heartbeat/mysql start; echo $?
> 
> 
> The only reason I can imagine for this behavior is that the cluster sends
> a monitor call right after startup and gets a negative response, because
> MySQL needs more time to start running.
> 
> Thanks
> 
> 
> Josef

Have a look into the resource agent to understand what happens. Add ocf_log 
entries to log debug messages.
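
A minimal sketch of such debug logging, assuming the agent has already
sourced ocf-shellfuncs (which provides ocf_log). The helper name
log_start_context is hypothetical and not part of the shipped mysql agent;
the fallback definition of ocf_log is there only so the snippet also runs
outside the cluster:

```shell
#!/bin/sh
# Fallback for testing outside the cluster only. Inside a real agent,
# ocf_log comes from ocf-shellfuncs and this branch is skipped.
if ! command -v ocf_log >/dev/null 2>&1; then
    ocf_log() {
        level=$1; shift
        echo "$level: $*" >&2
    }
fi

# Hypothetical helper: call it at the top of the agent's start function
# to record what the cluster actually passed in.
log_start_context() {
    ocf_log debug "start called as user=$(id -un) PATH=$PATH"
    ocf_log debug "binary=${OCF_RESKEY_binary:-unset} config=${OCF_RESKEY_config:-unset}"
    ocf_log debug "datadir=${OCF_RESKEY_datadir:-unset} pid=${OCF_RESKEY_pid:-unset}"
    ocf_log debug "socket=${OCF_RESKEY_socket:-unset}"
}
```

Comparing this output from a cluster-initiated start with the output of a
manual start should show whether a parameter or the environment differs.
Depending on the logging configuration, debug-level messages may be
dropped; in that case ocf_log info is an alternative.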

For a quick and dirty workaround you could add a start-delay to the monitor 
operation of your resource to delay the first monitor call. But that does not 
solve the real problem. Fix it properly by debugging the script and your setup.
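
A sketch of what that workaround could look like in crm shell syntax. The
resource definition was not posted in this thread, so the interval, timeout
and start-delay values are placeholders, and the params are simply the ones
from the environment variables quoted above:

```
primitive mysqltest03-db ocf:heartbeat:mysql \
    params binary="/srv/mysql-server/releases/mysql/bin/mysqld" \
        config="/srv/mysql/mysqltest03/admin/etc/my.cnf" \
        user="mysql" group="mysql" \
        datadir="/srv/mysql/mysqltest03/data" \
        log="/srv/mysql/mysqltest03/admin/log/mysqld.log" \
        pid="/srv/mysql/mysqltest03/admin/run/mysqld.pid" \
        socket="/srv/mysql/mysqltest03/admin/run/mysqld.sock" \
        additional_parameters="--bind-address=xx.xx.xx.xx" \
    op monitor interval="30s" timeout="30s" start-delay="15s"
```

Keep in mind that in the log above the start operation itself exited with
return code 1, so delaying the first monitor may not change anything here.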

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98
Fax: (089) 620 304 13