[Pacemaker] making resource managed

Tue Nov 9 13:14:29 UTC 2010

On Tue, 2010-11-09 at 09:49 +0100, Andrew Beekhof wrote:
> being unmanaged is a side-effect of a) the resource failing to stop
> and b) no fencing being configured
> once you've fixed the error, run crm resource cleanup as misch suggested
> 

I understand that.
However, consider the case where a VPS fails to start (rather than to stop)
because its configuration file is missing, and the resource becomes
unmanaged as a result. I run:

crm(live)# status
============
Last updated: Tue Nov  9 14:53:09 2010
Stack: Heartbeat
Current DC: ha-3 (a1ad8f56-7eb0-4aec-8d32-83e283903879) - partition with quorum
Version: 1.0.9-89bd754939df5150de7cd76835f98fe90851b677
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ ha-3 ha-4 ]

 test_ManageVE (ocf::heartbeat:ManageVE): Started ha-3
 ca (ocf::heartbeat:ManageVE): Started ha-3 (unmanaged) FAILED

Failed actions:
    ca_start_0 (node=ha-3, call=48, rc=5, status=complete): not installed
    ca_stop_0 (node=ha-3, call=49, rc=1, status=complete): unknown error

After fixing the issue (and checking that the VPS really can be started
from the shell), I run:

crm(live)# resource cleanup ca
Cleaning up ca on ha-3
Cleaning up ca on ha-4

This produces the following in /var/log/messages on the current DC, ha-3:

Nov  9 14:58:19 ha-3 crmd: [8434]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Nov  9 14:58:19 ha-3 crmd: [8434]: info: send_direct_ack: ACK'ing resource op ca_delete_60000 from 0:0:crm-resource-17296: lrm_invoke-lrmd-1289307499-777
Nov  9 14:58:20 ha-3 attrd: [8433]: info: attrd_ha_callback: Update relayed from ha-4
Nov  9 14:58:25 ha-3 lrmd: [8431]: info: Resource Agent output: []
Nov  9 14:58:25 ha-3 lrmd: [8431]: notice: read's ret: 0 when lrmd_op finished

crm(live)# resource manage ca
Log:
Nov  9 15:00:48 ha-3 cib: [8430]: info: cib_process_request: Operation complete: op cib_replace for section resources (origin=ha-4/cibadmin/2, version=0.92.2): ok (rc=0)

And after this, the status is still:
Online: [ ha-3 ha-4 ]

 test_ManageVE (ocf::heartbeat:ManageVE): Started ha-3
 ca (ocf::heartbeat:ManageVE): Started ha-3 (unmanaged) FAILED

Failed actions:
    ca_start_0 (node=ha-3, call=48, rc=5, status=complete): not installed
    ca_stop_0 (node=ha-3, call=49, rc=1, status=complete): unknown error

If I then edit the CIB and apply it, all the LRM messages disappear and
the resource starts and is managed, as it should be.
It seems that cleanup does not clear all of the status information.

What am I missing?
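
To check whether the failed operations are really gone after a cleanup, a
small helper along these lines could be used. It is only a sketch:
failed_ops is a hypothetical name, not part of Pacemaker, and it assumes
the "Failed actions:" layout shown in the status output above (wrapped
continuation lines such as "installed" are skipped).

```shell
#!/bin/sh
# failed_ops RESOURCE
# Hypothetical helper: reads status text on stdin and prints one line per
# failed operation recorded for RESOURCE, as "<op_key> <node> rc=<code>".
# Assumes the "Failed actions:" section layout from the status output above.
failed_ops() {
    awk -v rsc="$1" '
        # Everything after this header is the failed-actions section.
        /^Failed actions:/ { in_failed = 1; next }
        # An entry starts with <resource>_<operation>_<interval>.
        # Note: a plain prefix match, so "ca" would also match "ca_foo_*".
        in_failed && index($1, rsc "_") == 1 {
            node = $2; sub(/^\(node=/, "", node); sub(/,$/, "", node)
            rc = $4;   sub(/^rc=/, "", rc);       sub(/,$/, "", rc)
            print $1, node, "rc=" rc
        }
    '
}
```

On a live cluster this would be fed from a one-shot status dump, for
example "crm_mon -1 | failed_ops ca"; empty output after the cleanup would
mean no failed operations are listed for that resource any more.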

-- 
Vadim S. Khondar
v.khondar at o3.ua