<div>

                    Hello.

                </div><div><br></div><div>I have been working on this for 3 days now, and I must be so stressed out that I am blind to what is probably an obvious cause. In a word, HELP.</div><div><br></div><div>I am trying specifically to use ocf:heartbeat:IPaddr2, but the issue seems to occur with any of the ocf:heartbeat agents. I will focus on IPaddr2 for the purposes of figuring this out, but it happens exactly the same way with any of the default agents. However, I can successfully use ocf:linbit:drbd, for example. It seems to be limited to the RAs in the resource-agents package that are installed along with corosync/pacemaker.</div><div><br></div><div>I am using CentOS 6.3, fully updated (though this happens in 6.2 with no updates as well), with pacemaker/corosync installed from the default repo.&nbsp;I have stripped everything down to figure this out in VMware: I just install CentOS, update it, install pacemaker/corosync (no DRBD for this discussion), configure corosync, and then start it.&nbsp;Pacemaker starts up fine (or at least I think it's fine). For example, I can set the quorum policy to ignore from crm (crm configure property no-quorum-policy="ignore").</div><div><br></div><div>Here is the process list:</div><div><div>root &nbsp; &nbsp; &nbsp;1447 &nbsp;0.3 &nbsp;0.6 556080 &nbsp;6636 ? &nbsp; &nbsp; &nbsp; &nbsp;Ssl &nbsp;21:09 &nbsp; 0:00 corosync</div><div>499 &nbsp; &nbsp; &nbsp; 1453 &nbsp;0.0 &nbsp;0.5 &nbsp;88720 &nbsp;5556 ? &nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/libexec/pacemaker/cib</div><div>root &nbsp; &nbsp; &nbsp;1454 &nbsp;0.0 &nbsp;0.3 &nbsp;86968 &nbsp;3488 ? &nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/libexec/pacemaker/stonithd</div><div>root &nbsp; &nbsp; &nbsp;1455 &nbsp;0.0 &nbsp;0.2 &nbsp;76188 &nbsp;2492 ? &nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/lib64/heartbeat/lrmd</div><div>499 &nbsp; &nbsp; &nbsp; 1456 &nbsp;0.0 &nbsp;0.3 &nbsp;91160 &nbsp;3432 ? 
&nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/libexec/pacemaker/attrd</div><div>499 &nbsp; &nbsp; &nbsp; 1457 &nbsp;0.0 &nbsp;0.3 &nbsp;87440 &nbsp;3824 ? &nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/libexec/pacemaker/pengine</div><div>499 &nbsp; &nbsp; &nbsp; 1458 &nbsp;0.0 &nbsp;0.3 &nbsp;91312 &nbsp;3884 ? &nbsp; &nbsp; &nbsp; &nbsp;S &nbsp; &nbsp;21:09 &nbsp; 0:00 &nbsp;\_ /usr/libexec/pacemaker/crmd</div></div><div><br></div><div>499 is hacluster btw.</div><div><br></div><div>***BUT***</div><div><br></div><div>When I run as root the following:</div><div># crm ra meta ocf:heartbeat:IPaddr2</div><div><br></div><div>I get this response:</div><div>lrmadmin[1484]: 2012/07/22_13:28:23 ERROR: lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply message of rmetadata with function get_ret_from_msg.</div><div>ERROR: ocf:heartbeat:IPaddr2: could not parse meta-data:&nbsp;</div><div><br></div><div>And this is in /var/log/messages:</div><div><div>Jul 22 16:35:14 MST lrmd: [48093]: ERROR: get_resource_meta: pclose failed: Resource temporarily unavailable</div><div>Jul 22 16:35:14 MST lrmd: [48093]: WARN: on_msg_get_metadata: empty metadata for ocf::heartbeat::IPaddr2.</div><div>Jul 22 16:35:14 MST lrmd: [48093]: WARN: G_SIG_dispatch: Dispatch function for SIGCHLD was delayed 200 ms (&gt; 100 ms) before being called (GSource: 0x187df10)</div><div>Jul 22 16:35:14 MST lrmd: [48093]: info: G_SIG_dispatch: started at 429616889 should have started at 429616869</div><div>Jul 22 16:35:14 MST lrmadmin: [48254]: ERROR: lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply message of rmetadata with function get_ret_from_msg.</div></div><div><br></div><div>I am using crm ra meta as a way to test, but crm will not accept my trying to add the resource as a primitive either.</div><div><br></div><div>In my research, I have found that often it's permissions. 
So, just to rule that out, I set my entire system to 777 permissions. No joy.</div><div><br></div><div>Another suggestion I often find is to set OCF_ROOT (export OCF_ROOT=/usr/lib/ocf) and then run /usr/lib/ocf/resource.d/heartbeat/IPaddr2 meta-data.</div><div>That produces the desired output, but it does not work before I export the variable.&nbsp;</div><div>And crm still does not accept my meta request.&nbsp;</div><div><br></div><div>Another suggestion I find is to make sure that shellfuncs exists in the agents folder. The soft links exist:</div><div><div>lrwxrwxrwx. 1 root root &nbsp; &nbsp;32 Jul 22 04:08 .ocf-binaries -&gt; ../../lib/heartbeat/ocf-binaries</div><div>lrwxrwxrwx. 1 root root &nbsp; &nbsp;35 Jul 22 04:08 .ocf-directories -&gt; ../../lib/heartbeat/ocf-directories</div><div>lrwxrwxrwx. 1 root root &nbsp; &nbsp;35 Jul 22 04:08 .ocf-returncodes -&gt; ../../lib/heartbeat/ocf-returncodes</div><div>lrwxrwxrwx. 1 root root &nbsp; &nbsp;34 Jul 22 04:08 .ocf-shellfuncs -&gt; ../../lib/heartbeat/ocf-shellfuncs</div></div><div><br></div><div>And just to make sure, I also created non-hidden soft links, with no joy.</div><div><br></div><div>I have used assorted how-tos to troubleshoot and make sure I'm not missing something simple:</div><div>http://www.server-world.info/en/note?os=CentOS_6&amp;p=pacemaker&amp;f=1</div><div>http://snozberry.org/blog/2012/05/02/corosync-slash-pacemaker-on-centos-6/</div><div><br></div><div>One other strange (but possibly normal) behavior is that I cannot manually start pacemaker via "service pacemaker start".</div><div>It fails, and I get no useful information in the logs. 
But I get the feeling this is normal behavior now?</div><div><div># service pacemaker start</div><div>Starting Pacemaker Cluster Manager: &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;[FAILED]</div></div><div>The log shows one entry:&nbsp;Jul 22 22:00:50 MST pacemakerd[1511]: &nbsp; &nbsp; info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root</div><div><br></div><div><br></div><div>I have run through it about 30 times at this point.</div><div>I have tried CentOS 6.2 not updated and CentOS 6.3 fully updated, both on a physical server (just in case my VM is doing something weird) and in VMs.&nbsp;</div><div><br></div><div>Frankly, I am so baffled by this, and have been working on it so intensely, that I am hoping I am just missing something subtle because I am freaking out.</div><div>This should be very straightforward. No magic, but obviously "something" is amiss.&nbsp;</div><div>But what's really weird is that I cannot find a single post online from anyone having issues like this with the standard RAs.</div><div><br></div><div>I can try anything suggested, except moving away from CentOS 6. This is all being done in a pair of virtual machines.&nbsp;</div><div><br></div><div>Any help or suggestions at all will be greatly appreciated.</div><div>I am a bit desperate now.</div><div>Thanks.</div>