[ClusterLabs] systemd: xxxx.service start request repeated too quickly
Juha Heinanen
jh at tutpro.com
Tue Aug 4 07:27:33 EDT 2015
I have a resource group that consists of a file system, a virtual IP, a MySQL
server, and service xxxx. I removed from the MySQL server a database that
service xxxx needs in order to start. After that, syslog started filling with
a huge number of messages showing corosync/pacemaker trying to restart
service xxxx over and over again. A snapshot of the messages is shown below.
Is there something I can do to prevent this?
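For reference, a crm shell sketch of the kind of group configuration
involved; the resource names, agents and parameters here are placeholders
rather than my actual configuration:

    primitive p_fs ocf:heartbeat:Filesystem \
        params device=/dev/drbd0 directory=/srv fstype=ext4
    primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip=10.0.0.10 cidr_netmask=24
    primitive p_mysql ocf:heartbeat:mysql
    primitive p_xxxx lsb:xxxx
    group g_xxxx p_fs p_vip p_mysql p_xxxx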
-- Juha
Aug 4 14:05:14 node1 systemd[1]: xxxx.service start request repeated too quickly, refusing to start.
Aug 4 14:05:14 node1 systemd[1]: Failed to start LSB: Start/stop XXXX.
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 pacemaker_remoted[865]: notice: operation_finished: xxxx_start_0:5804:stderr [ Job for xxxx.service failed. See 'systemctl status xxxx.service' and 'journalctl -xn' for details. ]
Aug 4 14:05:14 node1 crmd[868]: notice: process_lrm_event: Operation xxxx_start_0: unknown error (node=node1, call=62, rc=1, cib-update=105, confirmed=true)
Aug 4 14:05:14 node1 crmd[868]: notice: process_lrm_event: node1-xxxx_start_0:62 [ Job for xxxx.service failed. See 'systemctl status xxxx.service' and 'journalctl -xn' for details.\n ]
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 crmd[868]: warning: status_from_rc: Action 45 (xxxx_start_0) on node1 failed (target: 0 vs. rc: 1): Error
Aug 4 14:05:14 node1 crmd[868]: warning: update_failcount: Updating failcount for xxxx on node1 after failed start: rc=1 (update=value++, time=1438686314)
Aug 4 14:05:14 node1 crmd[868]: warning: update_failcount: Updating failcount for xxxx on node1 after failed start: rc=1 (update=value++, time=1438686314)
Aug 4 14:05:14 node1 crmd[868]: warning: status_from_rc: Action 45 (xxxx_start_0) on node1 failed (target: 0 vs. rc: 1): Error
Aug 4 14:05:14 node1 crmd[868]: warning: update_failcount: Updating failcount for xxxx on node1 after failed start: rc=1 (update=value++, time=1438686314)
Aug 4 14:05:14 node1 crmd[868]: warning: update_failcount: Updating failcount for xxxx on node1 after failed start: rc=1 (update=value++, time=1438686314)
Aug 4 14:05:14 node1 crmd[868]: notice: run_graph: Transition 35 (Complete=2, Pending=0, Fired=0, Skipped=2, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-96.bz2): Stopped
Aug 4 14:05:14 node1 pengine[867]: notice: unpack_config: On loss of CCM Quorum: Ignore
Aug 4 14:05:14 node1 pengine[867]: warning: unpack_rsc_op_failure: Processing failed op start for xxxx on node1: unknown error (1)
Aug 4 14:05:14 node1 pengine[867]: warning: unpack_rsc_op_failure: Processing failed op start for xxxx on node1: unknown error (1)
Aug 4 14:05:14 node1 pengine[867]: notice: LogActions: Recover xxxx#011(Started node1)
Aug 4 14:05:14 node1 pengine[867]: notice: process_pe_message: Calculated Transition 36: /var/lib/pacemaker/pengine/pe-input-97.bz2
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 crmd[868]: notice: te_rsc_command: Initiating action 4: stop xxxx_stop_0 on node1 (local)
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 crmd[868]: notice: process_lrm_event: Operation xxxx_stop_0: ok (node=node1, call=63, rc=0, cib-update=107, confirmed=true)
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 crmd[868]: notice: te_rsc_command: Initiating action 46: start xxxx_start_0 on node1 (local)
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 crmd[868]: notice: abort_transition_graph: Transition aborted by status-1084752129-fail-count-xxxx, fail-count-xxxx=9: Transient attribute change (modify cib=1.49.37, source=te_update_diff:391, path=/cib/status/node_state[@id='1084752129']/transient_attributes[@id='1084752129']/instance_attributes[@id='status-1084752129']/nvpair[@id='status-1084752129-fail-count-xxxx'], 0)
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] cib:863:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e34340 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now connected to corosync
Aug 4 14:05:14 node1 notifyd[836]: [notice] node1[1084752129] attrd:866:0x7f7e43e2ebd0 is now disconnected from corosync
Aug 4 14:05:14 node1 systemd[1]: xxxx.service start request repeated too quickly, refusing to start.
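If I read the fail-count lines above correctly, Pacemaker keeps retrying the
start while systemd rate-limits the unit, hence the "start request repeated
too quickly" refusals. Would bounding the retries with resource meta
attributes be the right way to handle this? Something like the following
(values are only illustrative):

    # stop retrying on this node after 3 failed starts, and let the
    # accumulated fail count expire after 10 minutes
    crm configure rsc_defaults migration-threshold=3 failure-timeout=600

    # after fixing the underlying problem (the removed database),
    # clear the accumulated failures manually
    crm resource cleanup xxxx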