[ClusterLabs] custom resource agent FAILED (blocked)

Bishoy Mikhael b.s.mikhael at gmail.com
Thu Apr 12 17:38:21 EDT 2018

Hi All,

I'm trying to create a resource agent to promote a standby HDFS namenode to
active when the virtual IP failover to another node.

I've taken the skeleton from the Dummy OCF agent.

The modifications I've done to the Dummy agent are as follows:

HDFSHA_start() {
    if [ $? =  $OCF_SUCCESS ]; then
/opt/hadoop/sbin/hdfs-ha.sh start

HDFSHA_stop() {
    if [ $? =  $OCF_SUCCESS ]; then
/opt/hadoop/sbin/hdfs-ha.sh stop
    return $OCF_SUCCESS

HDFSHA_monitor() {
# Monitor _MUST!_ differentiate correctly between running
# (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
# That is THREE states, not just yes/no.
active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f 1)
current_node=$(uname -n)
if [[ ${active_nn} == ${current_node} ]]; then
   return $OCF_SUCCESS

HDFSHA_validate() {

    return $OCF_SUCCESS

I've created the resource as follows:

# pcs resource create hdfs-ha ocf:heartbeat:HDFSHA op monitor interval=30s

The resource fails right away as follows:

# pcs status

Cluster name: hdfs_cluster

Stack: corosync

Current DC: taulog (version 1.1.16-12.el7_4.8-94ff4df) - partition with

Last updated: Thu Apr 12 03:30:57 2018

Last change: Thu Apr 12 03:30:54 2018 by root via cibadmin on lingcod

3 nodes configured

2 resources configured

Online: [ dentex lingcod taulog ]

Full list of resources:

 VirtualIP (ocf::heartbeat:IPaddr2): Started taulog

 hdfs-ha (ocf::heartbeat:HDFSHA): FAILED (blocked)[ taulog dentex ]

Failed Actions:

* hdfs-ha_stop_0 on taulog 'insufficient privileges' (4): call=12,
status=complete, exitreason='none',

    last-rc-change='Thu Apr 12 03:17:37 2018', queued=0ms, exec=1ms

* hdfs-ha_stop_0 on dentex 'insufficient privileges' (4): call=10,
status=complete, exitreason='none',

    last-rc-change='Thu Apr 12 03:17:43 2018', queued=0ms, exec=1ms

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

I debug the resource as follows, and it returns 0

# pcs resource debug-monitor hdfs-ha

Operation monitor for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha monitor : 0

# pcs resource debug-stop hdfs-ha

Operation stop for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha stop : 0

# pcs resource debug-start hdfs-ha

Operation start for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha start : 0

I don't understand what am I doing wrong!


Bishoy Mikhael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20180412/6795728d/attachment.html>

More information about the Users mailing list