[Pacemaker] Problem with state: UNCLEAN (OFFLINE)

Fri Jun 8 11:01:08 UTC 2012

Hello,

I'm trying to set up an ldirectord service with Pacemaker.

However, I ran into a problem with the UNCLEAN (offline) state. The 
initial state of my cluster was:

    Online: [ node2 node1 ]

    node1-STONITH    (stonith:external/ipmi):        Started node2
    node2-STONITH    (stonith:external/ipmi):        Started node1
      Clone Set: Connected
          Started: [ node2 node1 ]
      Clone Set: ldirector-activo-activo
          Started: [ node2 node1 ]
    ftp-vip (ocf::heartbeat:IPaddr):        Started node1
    web-vip (ocf::heartbeat:IPaddr):        Started node2

    Migration summary:
    * Node node1:  pingd=2000
    * Node node2:  pingd=2000
        node2-STONITH: migration-threshold=1000000 fail-count=1000000

Then I disconnected the power from node1, and the state became:

    Node node1 (8b2aede9-61bb-4a5a-aef6-25fbdefdddfd): UNCLEAN (offline)
    Online: [ node2 ]

    node1-STONITH    (stonith:external/ipmi):        Started node2 FAILED
      Clone Set: Connected
          Started: [ node2 ]
          Stopped: [ ping:1 ]
      Clone Set: ldirector-activo-activo
          Started: [ node2 ]
          Stopped: [ ldirectord:1 ]
    web-vip (ocf::heartbeat:IPaddr):        Started node2

    Migration summary:
    * Node node2:  pingd=2000
        node2-STONITH: migration-threshold=1000000 fail-count=1000000
        node1-STONITH: migration-threshold=1000000 fail-count=1000000

    Failed actions:
         node2-STONITH_start_0 (node=node2, call=22, rc=2, status=complete): invalid parameter
         node1-STONITH_monitor_60000 (node=node2, call=11, rc=14, status=complete): status: unknown
         node1-STONITH_start_0 (node=node2, call=34, rc=1, status=complete): unknown error
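
In case it helps anyone compare the failed STONITH operations, here is a minimal Python sketch that pulls the "Failed actions" entries out of a status dump like the one above. It tolerates entries that a mail client has wrapped across two lines. The function name `parse_failed_actions` and the regular expression are only illustrative assumptions based on the text shown in this message; they are not part of Pacemaker, and other versions may print this section differently.

```python
import re

# Sample text copied from the status output in this message.
STATUS = """\
Failed actions:
     node2-STONITH_start_0 (node=node2, call=22, rc=2,
status=complete): invalid parameter
     node1-STONITH_monitor_60000 (node=node2, call=11, rc=14,
status=complete): status: unknown
     node1-STONITH_start_0 (node=node2, call=34, rc=1,
status=complete): unknown error
"""

# Hypothetical pattern matching the entry layout seen above.
ACTION_RE = re.compile(
    r"(?P<op>\S+)\s+\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*"
    r"rc=(?P<rc>\d+),\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.+?)\s*$"
)


def parse_failed_actions(text):
    """Return one dict per failed action found after 'Failed actions:'."""
    entries = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Failed actions:"):
            in_section = True
            continue
        if not in_section or not line:
            continue
        # A wrapped continuation starts with "status="; glue it back on.
        if line.startswith("status=") and entries:
            entries[-1] += " " + line
        else:
            entries.append(line)

    results = []
    for entry in entries:
        match = ACTION_RE.match(entry)
        if match:
            record = match.groupdict()
            record["call"] = int(record["call"])
            record["rc"] = int(record["rc"])
            results.append(record)
    return results


if __name__ == "__main__":
    for action in parse_failed_actions(STATUS):
        print(action["op"], action["node"], action["rc"], action["reason"])
```

Run on the output above, this lists the three failed operations, all recorded on node2, with their return codes and the reason text printed by the cluster.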

I expected node2 to take over the ftp-vip resource, but it did not. 
node1 stayed in the UNCLEAN state and node2 did not take over its 
resources. Only after I reconnected the power to node1 and it recovered 
did node2 take over the ftp-vip resource.

I've seen some similar conversations here. Could you please give me 
some ideas on this, or point me to a thread where it is discussed?

Thanks a lot!

Regards,

-- 
Juan Manuel Sierra Prieto
Administración de Sistemas
Centro Informatico Cientifico de Andalucia (CICA)
Avda. Reina Mercedes s/n - 41012 - Sevilla (Spain)
Tfno.: +34 955 056 600 / FAX: +34 955 056 650
Consejería de Economía, Innovación y Ciencia
Junta de Andalucía
