[Pacemaker] Drbd disk don't run

Fri May 15 11:54:31 UTC 2009

Hi, Dejan

The first problem is solved, but now I have another one.
When I try to start the ms-drbd11 resource I don't get any error, but
crm_mon shows this:

============
Last updated: Fri May 15 08:44:11 2009
Current DC: node1 (57e0232d-5b78-4a1a-976e-e5335ba8266d) - partition with
quorum
Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ node1 node2 ]

Clone Set: drbdinit
        Started: [ node1 node2 ]

Failed actions:
    drbd11:0_start_0 (node=node1, call=9, rc=1, status=complete): unknown
error
    drbd11_start_0 (node=node1, call=17, rc=1, status=complete): unknown
error
    drbd11:1_start_0 (node=node2, call=9, rc=1, status=complete): unknown
error
    drbd11_start_0 (node=node2, call=16, rc=1, status=complete): unknown
error

In the messages log file I get:

May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_resources: No STONITH
resources have been defined
May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node
node1 is online
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:0_start_0
on node1 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing
failed op drbd11:0_start_0 on node1: unknown error
May 15 08:25:03 node1 pengine: [4749]: WARN: process_orphan_resource:
Nothing known about resource drbd11 running on node1
May 15 08:25:03 node1 pengine: [4749]: info: log_data_element:
create_fake_resource: Orphan resource <primitive id="drbd11" type="drbd"
class="ocf" provider="heartbeat" />
May 15 08:25:03 node1 pengine: [4749]: info: process_orphan_resource: Making
sure orphan drbd11 is stopped
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0
on node1 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing
failed op drbd11_start_0 on node1: unknown error
May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node
node2 is online
May 15 08:25:03 node1 pengine: [4749]: info: find_clone: Internally renamed
drbdi:0 on node2 to drbdi:1
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:1_start_0
on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing
failed op drbd11:1_start_0 on node2: unknown error
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0
on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing
failed op drbd11_start_0 on node2: unknown error
May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Clone Set:
drbdinit
May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Started: [
node1 node2 ]
May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Master/Slave
Set: ms-drbd11
May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Stopped: [
drbd11:0 drbd11:1 ]
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has
failed 1000000 times on node1
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness:
Forcing ms-drbd11 away from node1 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has
failed 1000000 times on node1
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness:
Forcing drbd11 away from node1 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has
failed 1000000 times on node2
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness:
Forcing ms-drbd11 away from node2 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has
failed 1000000 times on node2
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness:
Forcing drbd11 away from node2 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0
cannot run anywhere
May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1
cannot run anywhere
May 15 08:25:03 node1 pengine: [4749]: info: master_color: ms-drbd11:
Promoted 0 instances of a possible 1 to master
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource
drbdi:0      (Started node1)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource
drbdi:1      (Started node2)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource
drbd11:0     (Stopped)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource
drbd11:1     (Stopped)
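
The get_failcount records are the key part of this log. A fail count of
1000000 equals the max printed in the same records, so the policy engine
will not place the resource on that node, which is why drbd11:0 and
drbd11:1 "cannot run anywhere". Below is a minimal Python sketch, not part
of Pacemaker, that pulls those counts out of a log wrapped like the one
above. The function names, and the assumption that every record starts with
a "Mon DD HH:MM:SS" timestamp, belong to this sketch only.

```python
import re

# Assumption: each syslog record starts with "Mon DD HH:MM:SS "; any other
# line is a continuation that the mail client wrapped.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} ")
FAILCOUNT = re.compile(
    r"get_failcount: (?P<rsc>\S+) has failed (?P<count>\d+) times on (?P<node>\S+)"
)
INFINITY = 1000000  # the value printed as max=1000000 in the log


def unwrap(lines):
    """Join wrapped continuation lines back onto their syslog record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line.strip()
    return records


def failcounts(lines):
    """Return {(resource, node): count} for every get_failcount record."""
    found = {}
    for record in unwrap(lines):
        match = FAILCOUNT.search(record)
        if match:
            key = (match.group("rsc"), match.group("node"))
            found[key] = int(match.group("count"))
    return found


sample = """\
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has
failed 1000000 times on node1
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness:
Forcing ms-drbd11 away from node1 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has
failed 1000000 times on node2
""".splitlines()

# Resources whose fail count has reached the maximum on some node.
banned = sorted(k for k, v in failcounts(sample).items() if v >= INFINITY)
print(banned)
```

Fixing the cause of the failed start does not reset these counts. With the
crm shell they can probably be cleared with "crm resource cleanup
ms-drbd11", but that has not been checked against this exact version.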

I had this problem with Heartbeat v2, and now I get the same error with
Pacemaker.
My idea is for the crm to manage the drbd, ocfs2 and Xen VM resources and
keep them running...

Does STONITH have to be configured for the drbd resource to start?

Thank you!

On Fri, May 15, 2009 at 7:02 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:

> Hi,
>
> On Fri, May 15, 2009 at 06:47:37AM -0300, Rafael Emerick wrote:
> > Hi, Dejan
> >
> > Thanks for your attention.
> > My CIB XML configuration follows.
> > I am a newbie with Pacemaker, any hint is very welcome! :D
>
> The CIB as seen by crm:
>
> primitive drbd11 ocf:heartbeat:drbd \
>        params drbd_resource="drbd11" \
>        op monitor interval="59s" role="Master" timeout="30s" \
>        op monitor interval="60s" role="Slave" timeout="30s" \
>        meta target-role="started" is-managed="true"
> ms ms-drbd11 drbd11 \
>        meta clone-max="2" notify="true" globally-unique="false" target-role="stopped"
>
> The target-role attribute is defined for both the primitive and
> the container (ms). You should remove the former:
>
> crm configure edit drbd11
>
> and remove all meta attributes (the whole "meta" part). And don't
> forget to remove the backslash in the line above it.
>
> Thanks,
>
> Dejan
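
The clash described above (target-role set on both the ms container and
the primitive inside it) can also be spotted mechanically. The following is
a minimal Python sketch, assuming the CIB XML nests the primitive inside a
<master> element and stores meta attributes as <nvpair> children of
<meta_attributes>. The fragment is reconstructed by hand from the crm
listing and from the ids in the error message; it is not the actual CIB of
this cluster, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment (assumption), modelled on the crm listing above.
CIB_FRAGMENT = """
<master id="ms-drbd11">
  <meta_attributes id="ms-drbd11-meta_attributes">
    <nvpair id="ms-drbd11-meta_attributes-clone-max"
            name="clone-max" value="2"/>
    <nvpair id="ms-drbd11-meta_attributes-target-role"
            name="target-role" value="stopped"/>
  </meta_attributes>
  <primitive id="drbd11" class="ocf" provider="heartbeat" type="drbd">
    <meta_attributes id="drbd11-meta_attributes">
      <nvpair id="drbd11-meta_attributes-target-role"
              name="target-role" value="started"/>
      <nvpair id="drbd11-meta_attributes-is-managed"
              name="is-managed" value="true"/>
    </meta_attributes>
  </primitive>
</master>
"""


def meta_names(element):
    """Map meta attribute name to value for nvpairs set directly on element."""
    return {
        nv.get("name"): nv.get("value")
        for nv in element.findall("./meta_attributes/nvpair")
    }


def shadowed_meta(container):
    """List meta attributes set on both a container and its child primitive.

    Each entry is (name, container_id, container_value, child_id, child_value).
    """
    clashes = []
    outer = meta_names(container)
    for child in container.findall("./primitive"):
        for name, value in meta_names(child).items():
            if name in outer:
                clashes.append(
                    (name, container.get("id"), outer[name],
                     child.get("id"), value)
                )
    return clashes


root = ET.fromstring(CIB_FRAGMENT)
print(shadowed_meta(root))
```

For a complete CIB, shadowed_meta() would be called on each master or clone
element found under the resources section. Any name it reports is a
candidate for removal from the primitive, as suggested above.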
>
> > thank you very much
> > for the help
> >
> >
> > On Fri, May 15, 2009 at 4:46 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> >
> > > Hi,
> > >
> > > On Thu, May 14, 2009 at 05:13:50PM -0300, Rafael Emerick wrote:
> > > > Hi, Dejan
> > > >
> > > > > There are not two sets of meta-attributes.
> > > > >
> > > > > I removed ms-drbd11 and added it again, and the error is the same:
> > > > > Error performing operation: Required data for this CIB API call not found
> > >
> > > Can you please post your CIB. As xml.
> > >
> > > Thanks,
> > >
> > > Dejan
> > >
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > On Thu, May 14, 2009 at 3:43 PM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > On Thu, May 14, 2009 at 03:18:15PM -0300, Rafael Emerick wrote:
> > > > > > Hi,
> > > > > >
> > > > > > > I'm trying to make a cluster with xen-ha using drbd and ocfs2...
> > > > > >
> > > > > > > I want the crm to manage all resources (Xen machines, drbd disks
> > > > > > > and the ocfs2 filesystem).
> > > > > >
> > > > > > > First, I created a cloned lsb resource to init drbd, using the GUI.
> > > > > > > Now I'm following this manual
> > > > > > > http://clusterlabs.org/wiki/DRBD_HowTo_1.0 to set up management of
> > > > > > > the drbd disk, and afterwards I will create the ocfs2 filesystem.
> > > > > >
> > > > > > > So, when I run:
> > > > > > > # crm resource start ms-drbd11
> > > > > > > # Multiple attributes match name=target-role
> > > > > > > # Value: stopped        (id=ms-drbd11-meta_attributes-target-role)
> > > > > > > # Value: started        (id=drbd11-meta_attributes-target-role)
> > > > > > > # Error performing operation: Required data for this CIB API call not found
> > > > >
> > > > > As it says, there are multiple matches for the attribute. Don't
> > > > > know how it came to be. Perhaps you can
> > > > >
> > > > > crm configure edit ms-drbd11
> > > > >
> > > > > and drop one of them. It could also be that there are two sets of
> > > > > meta-attributes.
> > > > >
> > > > > If crm can't edit the resource (in that case please report it)
> > > > > then you can try:
> > > > >
> > > > > crm configure edit xml ms-drbd11
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Dejan
> > > > >
> > > > > > > My messages:
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has
> > > > > > > failed 1000000 times on node2
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: common_apply_stickiness:
> > > > > > > Forcing ms-drbd11 away from node2 after 1000000 failures (max=1000000)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0
> > > > > > > cannot run anywhere
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1
> > > > > > > cannot run anywhere
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: info: master_color: ms-drbd11:
> > > > > > > Promoted 0 instances of a possible 1 to master
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource
> > > > > > > drbdi:0      (Started node1)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource
> > > > > > > drbdi:1      (Started node2)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource
> > > > > > > drbd11:0     (Stopped)
> > > > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource
> > > > > > > drbd11:1     (Stopped)
> > > > > >
> > > > > >
> > > > > > Thank you for any help!
> > > > >
> > > > > > _______________________________________________
> > > > > > Pacemaker mailing list
> > > > > > Pacemaker at oss.clusterlabs.org
> > > > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Pacemaker mailing list
> > > > > Pacemaker at oss.clusterlabs.org
> > > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > > >
> > >
> > > > _______________________________________________
> > > > Pacemaker mailing list
> > > > Pacemaker at oss.clusterlabs.org
> > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > >
> > > _______________________________________________
> > > Pacemaker mailing list
> > > Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
>
> > _______________________________________________
> > Pacemaker mailing list
> > Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>