[Pacemaker] quorum lost, Multiple primaries not allowed by config

Mistina Michal Michal.Mistina at virte.sk
Mon Nov 11 02:56:32 EST 2013


Dear all.

I'd like to know why is our cluster loosing quorum, even if there is
no-quorum-policy="ignore",  and why the drbd is unsuccesfull in promoting
primary on the node where it should run according to crm_mon and logs. The
logs (/var/log/messages) are saying there is a problem with promoting, but
drbd according to /proc/drbd is fine and correctly promoted as Secondary on
slave and correctly promoted as a Primary on the master. Is it possible that
there are wrong timouts? Or drbd is failing for some reason? Maybe because
wrong lvm filter defined?

 

Any help would be appreciated.

 

Now to the relevant data.

 

We are running 2 nodes cluster withouth stonith:

-          pcmk1

-          pcmk2

 

Software versions:

RHEL 6.3

Corosync 1.4.1-7

Pacemaker 1.1.7-6

DRBD 8.4.3-2

 

Configuration of pacemaker:

node pcmk1

node pcmk2

primitive drbd_pg ocf:linbit:drbd \

        params drbd_resource="postgres" \

        op monitor interval="15" role="Master" \

        op monitor interval="16" role="Slave" \

        op start interval="0" timeout="240" \

        op stop interval="0" timeout="120"

primitive pg_fs ocf:heartbeat:Filesystem \

        params device="/dev/vg_drbd/lv_pgsql"
directory="/var/lib/pgsql/9.2/data" options="noatime,nodiratime"
fstype="xfs" \

        op start interval="0" timeout="60" \

        op stop interval="0" timeout="120"

primitive pg_lvm ocf:heartbeat:LVM \

        params volgrpname="vg_drbd" \

        op start interval="0" timeout="30" \

        op stop interval="0" timeout="30"

primitive pg_ping ocf:pacemaker:ping \

        params dampen="5s" host_list="10.101.107.30" multiplier="101"
attempts="10" \

        op monitor interval="15s" timeout="30s"

primitive pg_pool ocf:heartbeat:pgpool \

        params pgpoolconf="/etc/pgpool-II/pgpool.conf"
pcpconf="/etc/pgpool-II/pcp.conf" hbaconf="/etc/pgpool-II/pool_hba.conf"
logfile="/var/log/pgpool/pgpool.log" pgpooluser="postgres" checkmethod="pid"
checkstring="/var/run/pgpool/pg

pool.pid" pgpoolcmd="/usr/bin/pgpool" psqlcmd="/usr/pgsql-9.2/bin/psql"
pcpnccmd="/usr/bin/pcp_node_count" \

        op monitor interval="20" timeout="30" \

        op start interval="0" timeout="20" \

        op stop interval="0" timeout="20" \

        meta target-role="Started"

primitive pg_sql lsb:postgresql-9.2 \

        op monitor interval="30" timeout="60" \

        op start interval="0" timeout="60" \

        op stop interval="0" timeout="60" \

        meta target-role="Started"

primitive pg_vip ocf:heartbeat:IPaddr2 \

        params ip="10.101.107.9" iflabel="bbtcapsvip" \

        op monitor interval="5"

group PGServer pg_lvm pg_fs pg_vip pg_sql pg_pool

ms ms_drbd_pg drbd_pg \

        meta master-max="1" master-node-max="1" clone-max="2"
clone-node-max="1" notify="true"

clone clone_pg_ping pg_ping

location cli-prefer-PGServer PGServer \

        rule $id="cli-prefer-rule-PGServer" inf: #uname eq pcmk2

location master-prefer-node1 pg_vip 50: pcmk1

location promote_master_on_node_w_reachable_hosts ms_drbd_pg \

        rule $id="promote_master_on_node_w_reachable_hosts-rule"
$role="master" -inf: pingd lte 0

location run_grp_on_node_w_reachable_hosts PGServer \

        rule $id="run_grp_on_node_w_reachable_hosts-rule" -inf: pingd lte 0

colocation col_pg_drbd inf: PGServer ms_drbd_pg:Master

order ord_pg inf: ms_drbd_pg:promote PGServer:start

property $id="cib-bootstrap-options" \

        dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \

        cluster-infrastructure="openais" \

        expected-quorum-votes="2" \

        stonith-enabled="false" \

        no-quorum-policy="ignore" \

        maintenance-mode="false" \

        last-lrm-refresh="1384154078"

rsc_defaults $id="rsc-options" \

        resource-stickiness="100"

 

Configuration of corosync:

[root at pcmk1 ~]# cat /etc/corosync/corosync.conf

# Please read the corosync.conf.5 manual page

compatibility: whitetank

 

totem {

        version: 2

        secauth: off

        threads: 0

        interface {

                ringnumber: 0

bindnetaddr: 10.101.107.0

mcastaddr: 226.96.1.1

mcastport: 4000

                ttl: 1

        }

}

 

logging {

        fileline: off

        to_stderr: no

        to_logfile: yes

        to_syslog: yes

        logfile: /var/log/cluster/corosync.log

        debug: off

        timestamp: on

        logger_subsys {

                subsys: AMF

                debug: off

        }

}

 

amf {

        mode: disabled

}

aisexec {

        user: root

        group: root

}

service {

        name: pacemaker

        ver: 0

}

 

 

 

Configuration of drbd:

[root at pcmk1 ~]# cat /etc/drbd.d/postgres.res 

resource postgres {

        device /dev/drbd0;

        disk /dev/sdb;

        meta-disk internal;

        syncer {

                rate 120M;

                verify-alg md5;

        }

        on pcmk1 {

                address 10.101.107.5:7788;

        }

        on pcmk2 {

                address 10.101.107.6:7788;

        }

        net { 

                after-sb-0pri discard-younger-primary;

                after-sb-1pri discard-secondary;

                after-sb-2pri call-pri-lost-after-sb;

        }

        disk {

                fencing resource-only;

        }

        handlers {

                outdate-peer "/usr/lib/drbd/crm-fence-peer.sh";

                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";

                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";

        }

}

 

 

Action taken:

crm resource migrate PGServer pcmk2

 

 

This error appears in the output of crm_mon:

 

[root at pcmk1 ~]# crm_mon -1

============

Last updated: Mon Nov 11 08:14:20 2013

Last change: Sun Nov 10 22:54:42 2013 via crm_resource on pcmk1

Stack: openais

Current DC: pcmk1 - partition with quorum

Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14

2 Nodes configured, 2 expected votes

9 Resources configured.

============

 

Online: [ pcmk1 pcmk2 ]

 

Resource Group: PGServer

     pg_lvm     (ocf::heartbeat:LVM):   Started pcmk2

     pg_fs      (ocf::heartbeat:Filesystem):    Started pcmk2

     pg_vip     (ocf::heartbeat:IPaddr2):       Started pcmk2

     pg_sql     (lsb:postgresql-9.2):   Started pcmk2

     pg_pool    (ocf::heartbeat:pgpool):        Started pcmk2

Master/Slave Set: ms_drbd_pg [drbd_pg]

     Masters: [ pcmk2 ]

     Slaves: [ pcmk1 ]

Clone Set: clone_pg_ping [pg_ping]

     Started: [ pcmk1 pcmk2 ]

 

Failed actions:

    drbd_pg:0_promote_0 (node=pcmk1, call=48, rc=-2, status=Timed Out):
unknown exec error

 

 

DRBD is however fine:

[root at pcmk1 ~]# cat /proc/drbd 

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root at pgxc01,
2013-06-16 14:43:44

0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

    ns:2735 nr:29448 dw:32184 dr:3441 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1
wo:f oos:0

 

In the /var/log/messages of the pcmk1 there is logged an issue:

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_delete for section constraints
(origin=local/crm_resource/3, version=0.46.90): ok (rc=0)

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: - <cib admin_epoch="0"
epoch="46" num_updates="90" />

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: + <cib epoch="47"
num_updates="1" admin_epoch="0" validate-with="pacemaker-1.2"
crm_feature_set="3.0.6" update-origin="pcmk1" update-client="cibadmin"
cib-last-written="Sat Nov  2 15:31:01 2013" have-quorum="1" dc-uuid="pcmk1"
>

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +   <configuration >

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +     <constraints >

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +       <rsc_location
id="cli-prefer-PGServer" rsc="PGServer" __crm_diff_marker__="added:top" >

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +         <rule
id="cli-prefer-rule-PGServer" score="INFINITY" boolean-op="and" >

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +           <expression
id="cli-prefer-expr-PGServer" attribute="#uname" operation="eq"
value="pcmk2" type="string" />

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +         </rule>

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +       </rsc_location>

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +     </constraints>

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: +   </configuration>

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib:diff: + </cib>

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:126 - Triggered transition abort (complete=1, tag=diff,
id=(null), magic=NA, cib=0.47.1) : Non-status change

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section constraints
(origin=local/crm_resource/4, version=0.47.1): ok (rc=0)

Nov 10 22:54:42 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: unpack_config: On loss of CCM
Quorum: Ignore

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Move
pg_lvm#011(Started pcmk1 -> pcmk2)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Move
pg_fs#011(Started pcmk1 -> pcmk2)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Move
pg_vip#011(Started pcmk1 -> pcmk2)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Move
pg_sql#011(Started pcmk1 -> pcmk2)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Move
pg_pool#011(Started pcmk1 -> pcmk2)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Demote
drbd_pg:0#011(Master -> Slave pcmk1)

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: LogActions: Promote
drbd_pg:1#011(Slave -> Master pcmk2)

Nov 10 22:54:42 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: do_te_invoke: Processing graph
1977 (ref=pe_calc-dc-1384120482-2107) derived from
/var/lib/pengine/pe-input-93.bz2

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 22: stop pg_pool_stop_0 on pcmk1 (local)

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: cancel_op: operation monitor[35]
on ocf::pgpool::pg_pool for client 1152, its parameters:
pcpnccmd=[/usr/bin/pcp_node_count] pgpoolconf=[/etc/pgpool-II/pgpool.conf]
psqlcmd=[/usr/pgsql-9.2/bin/psql] pgpoolcmd=[/usr/bin/pgpool]
logfile=[/var/log/pgpool/pgpool.log] pcpconf=[/etc/pgpool-II/pcp.conf]
hbaconf=[/etc/pgpool-II/pool_hba.conf] pgpooluser=[postgres]
crm_feature_set=[3.0.6] checkmethod=[pid]
checkstring=[/var/run/pgpool/pgpool.pid] CRM_meta_name=[monitor]
CRM_meta_timeout=[30000] CRM_meta_interval=[20000]  cancelled

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: rsc:pg_pool:36: stop

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 4: cancel drbd_pg:0_monitor_15000 on pcmk1 (local)

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: cancel_op: operation monitor[28]
on ocf::drbd::drbd_pg:0 for client 1152, its parameters: CRM_meta_clone=[0]
CRM_meta_role=[Master] CRM_meta_notify_slave_resource=[ ]
CRM_meta_notify_active_resource=[ ] CRM_meta_timeout=[20000]
CRM_meta_notify_demote_uname=[ ] drbd_resource=[postgres]
CRM_meta_master_node_max=[1] CRM_meta_notify_stop_resource=[ ]
CRM_meta_notify_master_resource=[ ] CRM_meta_clone_node_max=[1]
CRM_meta_notify=[true] CRM_meta_notify_demote_resource=[ ]
CRM_meta_notify_master_uname=[ ] CRM_meta_notify_s cancelled

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 7: cancel drbd_pg:1_monitor_16000 on pcmk2

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_pool_monitor_20000 (call=35, status=1, cib-update=0, confirmed=true)
Cancelled

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_monitor_15000 (call=28, status=1, cib-update=0, confirmed=true)
Cancelled

Nov 10 22:54:42 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_delete for section
//node_state[@uname='pcmk2']//lrm_resource[@id='drbd_pg:1']/lrm_rsc_op[@id='
drbd_pg:1_monitor_16000'] (origin=pcmk2/crmd/36, version=0.47.3): ok (rc=0)

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 85: notify drbd_pg:0_pre_notify_demote_0 on pcmk1 (local)

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:37: notify

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 87: notify drbd_pg:1_pre_notify_demote_0 on pcmk2

Nov 10 22:54:42 PCMK1 pengine[1151]:   notice: process_pe_message:
Transition 1977: PEngine Input stored in: /var/lib/pengine/pe-input-93.bz2

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: RA output: (pg_pool:stop:stderr)
stop request sent to pgpool. waiting for termination...

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: RA output: (pg_pool:stop:stderr)
done.

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_pool_stop_0 (call=36, rc=0, cib-update=2180, confirmed=true) ok

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 19: stop pg_sql_stop_0 on pcmk1 (local)

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: cancel_op: operation monitor[33]
on lsb::postgresql-9.2::pg_sql for client 1152, its parameters:
CRM_meta_name=[monitor] crm_feature_set=[3.0.6] CRM_meta_timeout=[60000]
CRM_meta_interval=[30000]  cancelled

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: rsc:pg_sql:38: stop

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_sql_monitor_30000 (call=33, status=1, cib-update=0, confirmed=true)
Cancelled

Nov 10 22:54:42 PCMK1 lrmd: [21232]: WARN: For LSB init script, no
additional parameters are needed.

Nov 10 22:54:42 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_notify_0 (call=37, rc=0, cib-update=0, confirmed=true) ok

Nov 10 22:54:42 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
Stopping postgresql-9.2 service: 

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
#033[60G

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout) [

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
#033[0;32m

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
OK  

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
#033[0;39m

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout) ]

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout)
#015

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: RA output: (pg_sql:stop:stdout) 

Nov 10 22:54:43 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_sql_stop_0 (call=38, rc=0, cib-update=2181, confirmed=true) ok

Nov 10 22:54:43 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 16: stop pg_vip_stop_0 on pcmk1 (local)

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: cancel_op: operation monitor[31]
on ocf::IPaddr2::pg_vip for client 1152, its parameters:
CRM_meta_name=[monitor] crm_feature_set=[3.0.6] ip=[10.101.107.9]
CRM_meta_timeout=[20000] CRM_meta_interval=[5000] iflabel=[bbtcapsvip]
cancelled

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: rsc:pg_vip:39: stop

Nov 10 22:54:43 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_vip_monitor_5000 (call=31, status=1, cib-update=0, confirmed=true)
Cancelled

Nov 10 22:54:43 PCMK1 IPaddr2(pg_vip)[21267]: INFO: IP status = ok, IP_CIP=

Nov 10 22:54:43 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_vip_stop_0 (call=39, rc=0, cib-update=2182, confirmed=true) ok

Nov 10 22:54:43 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 14: stop pg_fs_stop_0 on pcmk1 (local)

Nov 10 22:54:43 PCMK1 lrmd: [1149]: info: rsc:pg_fs:40: stop

Nov 10 22:54:43 PCMK1 Filesystem(pg_fs)[21305]: INFO: Running stop for
/dev/vg_drbd/lv_pgsql on /var/lib/pgsql/9.2/data

Nov 10 22:54:43 PCMK1 Filesystem(pg_fs)[21305]: INFO: Trying to unmount
/var/lib/pgsql/9.2/data

Nov 10 22:54:44 PCMK1 Filesystem(pg_fs)[21305]: INFO: unmounted
/var/lib/pgsql/9.2/data successfully

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_fs_stop_0 (call=40, rc=0, cib-update=2183, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 12: stop pg_lvm_stop_0 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:pg_lvm:41: stop

Nov 10 22:54:44 PCMK1 LVM(pg_lvm)[21375]: INFO: Deactivating volume group
vg_drbd

Nov 10 22:54:44 PCMK1 LVM(pg_lvm)[21375]: INFO: 0 logical volume(s) in
volume group "vg_drbd" now active

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
pg_lvm_stop_0 (call=41, rc=0, cib-update=2184, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 31: demote drbd_pg:0_demote_0 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:42: demote

Nov 10 22:54:44 PCMK1 kernel: block drbd0: role( Primary -> Secondary ) 

Nov 10 22:54:44 PCMK1 kernel: block drbd0: bitmap WRITE of 0 pages took 0
jiffies

Nov 10 22:54:44 PCMK1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync
by on disk bit-map.

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:demote:stdout) 

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_demote_0 (call=42, rc=0, cib-update=2185, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 86: notify drbd_pg:0_post_notify_demote_0 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:43: notify

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 88: notify drbd_pg:1_post_notify_demote_0 on pcmk2

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:notify:stdout) 

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_notify_0 (call=43, rc=0, cib-update=0, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 81: notify drbd_pg:0_pre_notify_promote_0 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:44: notify

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 83: notify drbd_pg:1_pre_notify_promote_0 on pcmk2

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_notify_0 (call=44, rc=0, cib-update=0, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 36: promote drbd_pg:1_promote_0 on pcmk2

Nov 10 22:54:44 PCMK1 kernel: block drbd0: peer( Secondary -> Primary ) 

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 82: notify drbd_pg:0_post_notify_promote_0 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:45: notify

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 84: notify drbd_pg:1_post_notify_promote_0 on pcmk2

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:notify:stdout) 

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_notify_0 (call=45, rc=0, cib-update=0, confirmed=true) ok

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 13: start pg_lvm_start_0 on pcmk2

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 32: monitor drbd_pg:0_monitor_16000 on pcmk1 (local)

Nov 10 22:54:44 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:46: monitor

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 37: monitor drbd_pg:1_monitor_15000 on pcmk2

Nov 10 22:54:44 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_monitor_16000 (call=46, rc=0, cib-update=2186, confirmed=false) ok

Nov 10 22:54:45 PCMK1 ntpd[1321]: Deleting interface #6 eth0:bbtcapsvip,
10.101.107.9#123, interface stats: received=0, sent=0, dropped=0,
active_time=717805 secs

Nov 10 22:54:46 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 15: start pg_fs_start_0 on pcmk2

Nov 10 22:54:48 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 17: start pg_vip_start_0 on pcmk2

Nov 10 22:54:48 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 18: monitor pg_vip_monitor_5000 on pcmk2

Nov 10 22:54:48 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 20: start pg_sql_start_0 on pcmk2

Nov 10 22:54:51 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 21: monitor pg_sql_monitor_30000 on pcmk2

Nov 10 22:54:51 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 23: start pg_pool_start_0 on pcmk2

Nov 10 22:54:52 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 24: monitor pg_pool_monitor_20000 on pcmk2

Nov 10 22:54:52 PCMK1 crmd[1152]:   notice: run_graph: ==== Transition 1977
(Complete=44, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-93.bz2): Complete

Nov 10 22:54:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 22:56:02 PCMK1 crmd[1152]:  warning: action_timer_callback: Timer
popped (timeout=20000, abort_level=0, complete=true)

Nov 10 22:56:02 PCMK1 crmd[1152]:  warning: action_timer_callback: Ignoring
timeout while not in transition

Nov 10 22:58:41 PCMK1 cib[1147]:     info: cib_stats: Processed 69
operations (724.00us average, 0% utilization) in the last 10min

Nov 10 23:08:41 PCMK1 cib[1147]:     info: cib_stats: Processed 38
operations (263.00us average, 0% utilization) in the last 10min

Nov 10 23:09:52 PCMK1 crmd[1152]:     info: crm_timer_popped: PEngine
Recheck Timer (I_PE_CALC) just popped (900000ms)

Nov 10 23:09:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
origin=crm_timer_popped ]

Nov 10 23:09:52 PCMK1 crmd[1152]:     info: do_state_transition: Progressed
to state S_POLICY_ENGINE after C_TIMER_POPPED

Nov 10 23:09:52 PCMK1 pengine[1151]:   notice: unpack_config: On loss of CCM
Quorum: Ignore

Nov 10 23:09:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:09:52 PCMK1 crmd[1152]:     info: do_te_invoke: Processing graph
1978 (ref=pe_calc-dc-1384121392-2140) derived from
/var/lib/pengine/pe-input-94.bz2

Nov 10 23:09:52 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 2: cancel drbd_pg:0_monitor_15000 on pcmk1 (local)

Nov 10 23:09:52 PCMK1 crmd[1152]:   notice: run_graph: ==== Transition 1978
(Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-94.bz2): Complete

Nov 10 23:09:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:09:52 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_delete for section
//node_state[@uname='pcmk1']//lrm_resource[@id='drbd_pg:0']/lrm_rsc_op[@id='
drbd_pg:0_monitor_15000'] (origin=local/crmd/2188, version=0.47.21): ok
(rc=0)

Nov 10 23:09:52 PCMK1 pengine[1151]:   notice: process_pe_message:
Transition 1978: PEngine Input stored in: /var/lib/pengine/pe-input-94.bz2

Nov 10 23:11:12 PCMK1 crmd[1152]:  warning: action_timer_callback: Timer
popped (timeout=20000, abort_level=0, complete=true)

Nov 10 23:11:12 PCMK1 crmd[1152]:  warning: action_timer_callback: Ignoring
timeout while not in transition

Nov 10 23:18:41 PCMK1 cib[1147]:     info: cib_stats: Processed 39
operations (512.00us average, 0% utilization) in the last 10min

Nov 10 23:24:52 PCMK1 crmd[1152]:     info: crm_timer_popped: PEngine
Recheck Timer (I_PE_CALC) just popped (900000ms)

Nov 10 23:24:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
origin=crm_timer_popped ]

Nov 10 23:24:52 PCMK1 crmd[1152]:     info: do_state_transition: Progressed
to state S_POLICY_ENGINE after C_TIMER_POPPED

Nov 10 23:24:52 PCMK1 pengine[1151]:   notice: unpack_config: On loss of CCM
Quorum: Ignore

Nov 10 23:24:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:24:52 PCMK1 crmd[1152]:     info: do_te_invoke: Processing graph
1979 (ref=pe_calc-dc-1384122292-2143) derived from
/var/lib/pengine/pe-input-95.bz2

Nov 10 23:24:52 PCMK1 crmd[1152]:   notice: run_graph: ==== Transition 1979
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-95.bz2): Complete

Nov 10 23:24:52 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:24:52 PCMK1 pengine[1151]:   notice: process_pe_message:
Transition 1979: PEngine Input stored in: /var/lib/pengine/pe-input-95.bz2

Nov 10 23:28:41 PCMK1 cib[1147]:     info: cib_stats: Processed 38
operations (0.00us average, 0% utilization) in the last 10min

Nov 10 23:30:59 PCMK1 corosync[1141]:   [TOTEM ] A processor failed, forming
new configuration.

Nov 10 23:31:01 PCMK1 cib[1147]:   notice: ais_dispatch_message: Membership
68: quorum lost

Nov 10 23:31:01 PCMK1 cib[1147]:     info: crm_update_peer: Node pcmk2:
id=107701514 state=lost (new) addr=r(0) ip(10.101.107.6)  votes=1 born=20
seen=64 proc=00000000000000000000000000111312

Nov 10 23:31:01 PCMK1 crmd[1152]:   notice: ais_dispatch_message: Membership
68: quorum lost

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: ais_status_callback: status:
pcmk2 is now lost (was member)

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: crm_update_peer: Node pcmk2:
id=107701514 state=lost (new) addr=r(0) ip(10.101.107.6)  votes=1 born=20
seen=64 proc=00000000000000000000000000111312

Nov 10 23:31:01 PCMK1 corosync[1141]:   [TOTEM ] A processor joined or left
the membership and a new membership was formed.

Nov 10 23:31:01 PCMK1 corosync[1141]:   [CPG   ] chosen downlist: sender
r(0) ip(10.101.107.5) ; members(old:2 left:1)

Nov 10 23:31:01 PCMK1 corosync[1141]:   [MAIN  ] Completed service
synchronization, ready to provide service.

Nov 10 23:31:01 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section nodes (origin=local/crmd/2190,
version=0.47.22): ok (rc=0)

Nov 10 23:31:01 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section cib (origin=local/crmd/2192,
version=0.47.24): ok (rc=0)

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: crmd_ais_dispatch: Setting
expected votes to 2

Nov 10 23:31:01 PCMK1 crmd[1152]:  warning: match_down_event: No match for
shutdown action on pcmk2

Nov 10 23:31:01 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section crm_config (origin=local/crmd/2194,
version=0.47.25): ok (rc=0)

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: te_update_diff: Stonith/shutdown
of pcmk2 not matched

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:234 - Triggered transition abort (complete=1, tag=node_state,
id=pcmk2, magic=NA, cib=0.47.23) : Node failure

Nov 10 23:31:01 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: unpack_config: On loss of CCM
Quorum: Ignore

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Start
pg_lvm#011(pcmk1)

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Start
pg_fs#011(pcmk1)

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Start
pg_vip#011(pcmk1)

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Start
pg_sql#011(pcmk1)

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Start
pg_pool#011(pcmk1)

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: LogActions: Promote
drbd_pg:0#011(Slave -> Master pcmk1)

Nov 10 23:31:01 PCMK1 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: do_te_invoke: Processing graph
1980 (ref=pe_calc-dc-1384122661-2145) derived from
/var/lib/pengine/pe-input-96.bz2

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 2: cancel drbd_pg:0_monitor_16000 on pcmk1 (local)

Nov 10 23:31:01 PCMK1 lrmd: [1149]: info: cancel_op: operation monitor[46]
on ocf::drbd::drbd_pg:0 for client 1152, its parameters: CRM_meta_clone=[0]
CRM_meta_master_max=[1] CRM_meta_notify_slave_resource=[ ]
CRM_meta_notify_active_resource=[ ] CRM_meta_notify_demote_uname=[ ]
drbd_resource=[postgres] crm_feature_set=[3.0.6]
CRM_meta_master_node_max=[1] CRM_meta_notify_stop_resource=[ ]
CRM_meta_notify_master_resource=[ ] CRM_meta_clone_node_max=[1]
CRM_meta_notify=[true] CRM_meta_notify_demote_resource=[ ]
CRM_meta_notify_start_resource=[drbd_pg:0 drbd_pg: cancelled

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_monitor_16000 (call=46, status=1, cib-update=0, confirmed=true)
Cancelled

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 63: notify drbd_pg:0_pre_notify_promote_0 on pcmk1 (local)

Nov 10 23:31:01 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:47: notify

Nov 10 23:31:01 PCMK1 pengine[1151]:   notice: process_pe_message:
Transition 1980: PEngine Input stored in: /var/lib/pengine/pe-input-96.bz2

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: process_lrm_event: LRM operation
drbd_pg:0_notify_0 (call=47, rc=0, cib-update=0, confirmed=true) ok

Nov 10 23:31:01 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 21: promote drbd_pg:0_promote_0 on pcmk1 (local)

Nov 10 23:31:01 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:48: promote

Nov 10 23:31:02 PCMK1 cib[1147]:   notice: ais_dispatch_message: Membership
72: quorum acquired

Nov 10 23:31:02 PCMK1 cib[1147]:     info: crm_update_peer: Node pcmk2:
id=107701514 state=member (new) addr=r(0) ip(10.101.107.6)  votes=1 born=20
seen=72 proc=00000000000000000000000000111312

Nov 10 23:31:02 PCMK1 crmd[1152]:   notice: ais_dispatch_message: Membership
72: quorum acquired

Nov 10 23:31:02 PCMK1 crmd[1152]:     info: ais_status_callback: status:
pcmk2 is now member (was lost)

Nov 10 23:31:02 PCMK1 crmd[1152]:  warning: match_down_event: No match for
shutdown action on pcmk2

Nov 10 23:31:02 PCMK1 crmd[1152]:     info: crm_update_peer: Node pcmk2:
id=107701514 state=member (new) addr=r(0) ip(10.101.107.6)  votes=1 born=20
seen=72 proc=00000000000000000000000000111312

Nov 10 23:31:02 PCMK1 corosync[1141]:   [TOTEM ] A processor joined or left
the membership and a new membership was formed.

Nov 10 23:31:02 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_delete for section //node_state[@uname='pcmk2']/lrm
(origin=local/crmd/2197, version=0.47.27): ok (rc=0)

Nov 10 23:31:02 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_delete for section
//node_state[@uname='pcmk2']/transient_attributes (origin=local/crmd/2198,
version=0.47.28): ok (rc=0)

Nov 10 23:31:02 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section nodes (origin=local/crmd/2199,
version=0.47.29): ok (rc=0)

Nov 10 23:31:02 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section cib (origin=local/crmd/2201,
version=0.47.31): ok (rc=0)

Nov 10 23:31:02 PCMK1 crmd[1152]:     info: crmd_ais_dispatch: Setting
expected votes to 2

Nov 10 23:31:02 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_modify for section crm_config (origin=local/crmd/2203,
version=0.47.32): ok (rc=0)

Nov 10 23:31:02 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:320 - Triggered transition abort (complete=0, tag=lrm_rsc_op,
id=pg_sql_last_0, magic=0:0;20:1977:0:0cc9ce4f-2831-4859-96b6-41a50c7aeb56,
cib=0.47.27) : Resource op removal

Nov 10 23:31:02 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:194 - Triggered transition abort (complete=0,
tag=transient_attributes, id=pcmk2, magic=NA, cib=0.47.28) : Transient
attribute: removal

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) 0: State change failed: (-1) Multiple primaries
not allowed by config

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) Command '

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) drbdsetup

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr)  

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) primary

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr)  

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) 0

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) ' terminated with exit code 11

Nov 10 23:31:02 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Called
drbdadm -c /etc/drbd.conf primary postgres

Nov 10 23:31:02 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Exit code 11

Nov 10 23:31:02 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Command
output: 

Nov 10 23:31:02 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stdout) 

Nov 10 23:31:03 PCMK1 cib[1147]:     info: cib_process_request: Operation
complete: op cib_sync_one for section 'all' (origin=pcmk2/pcmk2/(null),
version=0.47.32): ok (rc=0)

Nov 10 23:31:03 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair,
id=status-pcmk2-master-drbd_pg.1, name=master-drbd_pg:1, value=10000,
magic=NA, cib=0.47.33) : Transient attribute: update

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) 0: State change failed: (-1) Multiple primaries
not allowed by config

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) Command '

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) drbdsetup

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr)  

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) primary

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr)  

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) 0

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stderr) ' terminated with exit code 11

Nov 10 23:31:03 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Called
drbdadm -c /etc/drbd.conf primary postgres

Nov 10 23:31:03 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Exit code 11

Nov 10 23:31:03 PCMK1 drbd(drbd_pg:0)[27275]: ERROR: postgres: Command
output: 

Nov 10 23:31:03 PCMK1 lrmd: [1149]: info: RA output:
(drbd_pg:0:promote:stdout) 

Nov 10 23:31:03 PCMK1 corosync[1141]:   [CPG   ] chosen downlist: sender
r(0) ip(10.101.107.5) ; members(old:1 left:0)

Nov 10 23:31:04 PCMK1 corosync[1141]:   [MAIN  ] Completed service
synchronization, ready to provide service.

 

 

. the section ending with Exit code 11 keeps happening many times until
this:

Nov 10 23:31:21 PCMK1 lrmd: [1149]: WARN: drbd_pg:0:promote process (PID
27275) timed out (try 1).  Killing with signal SIGTERM (15).

Nov 10 23:31:21 PCMK1 lrmd: [1149]: WARN: operation promote[48] on
ocf::drbd::drbd_pg:0 for client 1152, its parameters: CRM_meta_clone=[0]
CRM_meta_notify_slave_resource=[drbd_pg:0 ]
CRM_meta_notify_active_resource=[ ] CRM_meta_notify_demote_uname=[ ]
drbd_resource=[postgres] CRM_meta_notify_start_resource=[ ]
CRM_meta_master_node_max=[1] CRM_meta_notify_stop_resource=[ ]
CRM_meta_notify_master_resource=[ ] CRM_meta_clone_node_max=[1]
CRM_meta_clone_max=[2] CRM_meta_notify=[true] CRM_meta_notify_master_uname=[
] CRM_meta_master_max=[1] crm_featur: pid [27275] timed out

Nov 10 23:31:21 PCMK1 crmd[1152]:    error: process_lrm_event: LRM operation
drbd_pg:0_promote_0 (48) Timed Out (timeout=20000ms)

Nov 10 23:31:21 PCMK1 crmd[1152]:  warning: status_from_rc: Action 21
(drbd_pg:0_promote_0) on pcmk1 failed (target: 0 vs. rc: -2): Error

Nov 10 23:31:21 PCMK1 crmd[1152]:  warning: update_failcount: Updating
failcount for drbd_pg:0 on pcmk1 after failed promote: rc=-2
(update=value++, time=1384122681)

Nov 10 23:31:21 PCMK1 crmd[1152]:     info: abort_transition_graph:
match_graph_event:277 - Triggered transition abort (complete=0,
tag=lrm_rsc_op, id=drbd_pg:0_last_failure_0,
magic=2:-2;21:1980:0:0cc9ce4f-2831-4859-96b6-41a50c7aeb56, cib=0.47.38) :
Event failed

Nov 10 23:31:21 PCMK1 crmd[1152]:     info: te_rsc_command: Initiating
action 64: notify drbd_pg:0_post_notify_promote_0 on pcmk1 (local)

Nov 10 23:31:21 PCMK1 attrd[1150]:   notice: attrd_trigger_update: Sending
flush op to all hosts for: fail-count-drbd_pg:0 (1)

Nov 10 23:31:21 PCMK1 lrmd: [1149]: info: rsc:drbd_pg:0:49: notify

Nov 10 23:31:21 PCMK1 attrd[1150]:   notice: attrd_perform_update: Sent
update 126: fail-count-drbd_pg:0=1

Nov 10 23:31:21 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair,
id=status-pcmk1-fail-count-drbd_pg.0, name=fail-count-drbd_pg:0, value=1,
magic=NA, cib=0.47.39) : Transient attribute: update

Nov 10 23:31:21 PCMK1 attrd[1150]:   notice: attrd_trigger_update: Sending
flush op to all hosts for: last-failure-drbd_pg:0 (1384122681)

Nov 10 23:31:21 PCMK1 attrd[1150]:   notice: attrd_perform_update: Sent
update 129: last-failure-drbd_pg:0=1384122681

Nov 10 23:31:21 PCMK1 crmd[1152]:     info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair,
id=status-pcmk1-last-failure-drbd_pg.0, name=last-failure-drbd_pg:0,
value=1384122681, magic=NA, cib=0.47.40) : Transient attribute: update

 

 

 

.. and the rest of the log. don't know if it is necessary:

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:notify:stdout) 

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_notify_0 (call=49, rc=0, cib-update=0, confirmed=true)
ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: run_graph: ==== Transition
1980 (Complete=10, Pending=0, Fired=0, Skipped=11, Incomplete=0,
Source=/var/lib/pengine/pe-input-96.bz2): Stopped

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_state_transition:
Membership changed: 64 -> 72 - join restart

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_INTEGRATION [ input=I_NODE_JOIN
cause=C_FSA_INTERNAL origin=do_state_transition ]

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: join_make_offer: Making join
offers based on membership 72

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_dc_join_offer_all:
join-8: Waiting on 2 outstanding join acks

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: update_dc: Set DC to
bbtcaps01 (3.0.6)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED
cause=C_FSA_INTERNAL origin=check_join_state ]

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_dc_join_finalize: join-8:
Syncing the CIB from bbtcaps01 to the rest of the cluster

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_sync for section 'all' (origin=local/crmd/2207,
version=0.47.40): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_modify for section nodes (origin=local/crmd/2208,
version=0.47.41): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_modify for section nodes (origin=local/crmd/2209,
version=0.47.42): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_dc_join_ack: join-8:
Updating node state to member for bbtcaps01

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_delete for section
//node_state[@uname='bbtcaps01']/lrm (origin=local/crmd/2210,
version=0.47.43): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_dc_join_ack: join-8:
Updating node state to member for bbtcaps02

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_delete for section
//node_state[@uname='bbtcaps02']/lrm (origin=local/crmd/2212,
version=0.47.45): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED
cause=C_FSA_INTERNAL origin=check_join_state ]

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: abort_transition_graph:
do_te_invoke:162 - Triggered transition abort (complete=1) : Peer Cancelled

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_modify for section nodes (origin=local/crmd/2214,
version=0.47.47): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 cib[1147]:     info: cib_process_request:
Operation complete: op cib_modify for section cib (origin=local/crmd/2216,
version=0.47.49): ok (rc=0)

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: unpack_config: On loss of
CCM Quorum: Ignore

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:  warning: unpack_rsc_op: Processing
failed op drbd_pg:0_last_failure_0 on bbtcaps01: unknown exec error (-2)

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: LogActions: Demote
drbd_pg:0#011(Master -> Slave bbtcaps01)

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: LogActions: Recover
drbd_pg:0#011(Master bbtcaps01)

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_local_callback:
Sending full refresh (origin=crmd)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: master-drbd_pg:0 (10000)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:  warning: destroy_action: Cancelling
timer for action 2 (src=4229)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: do_te_invoke: Processing
graph 1981 (ref=pe_calc-dc-1384122681-2160) derived from
/var/lib/pengine/pe-input-97.bz2

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 78: notify drbd_pg:0_pre_notify_demote_0 on bbtcaps01 (local)

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:50: notify

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: pingd (101)

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 80: notify drbd_pg:1_pre_notify_demote_0 on bbtcaps02

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-drbd_pg:0 (1)

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-drbd_pg:0 (1384122681)

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)

Nov 10 23:31:21 BBTCAPS01 pengine[1151]:   notice: process_pe_message:
Transition 1981: PEngine Input stored in: /var/lib/pengine/pe-input-97.bz2

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_notify_0 (call=50, rc=0, cib-update=0, confirmed=true)
ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 26: demote drbd_pg:0_demote_0 on bbtcaps01 (local)

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:51: demote

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_demote_0 (call=51, rc=0, cib-update=2218,
confirmed=true) ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 79: notify drbd_pg:0_post_notify_demote_0 on bbtcaps01 (local)

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:52: notify

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 81: notify drbd_pg:1_post_notify_demote_0 on bbtcaps02

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:notify:stdout) 

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_notify_0 (call=52, rc=0, cib-update=0, confirmed=true)
ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 71: notify drbd_pg:0_pre_notify_stop_0 on bbtcaps01 (local)

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:53: notify

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 72: notify drbd_pg:1_pre_notify_stop_0 on bbtcaps02

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_notify_0 (call=53, rc=0, cib-update=0, confirmed=true)
ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 2: stop drbd_pg:0_stop_0 on bbtcaps01 (local)

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:54: stop

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: Requested state change
failed by peer: Refusing to be Primary while peer is not outdated (-7)

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: peer( Primary -> Unknown )
conn( Connected -> Disconnecting ) disk( UpToDate -> Outdated ) pdsk(
UpToDate -> DUnknown ) 

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: asender terminated

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: Terminating
drbd_a_postgres

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: Connection closed

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: conn( Disconnecting ->
StandAlone ) 

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: receiver terminated

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: Terminating
drbd_r_postgres

Nov 10 23:31:21 BBTCAPS01 kernel: block drbd0: disk( Outdated -> Failed ) 

Nov 10 23:31:21 BBTCAPS01 kernel: block drbd0: bitmap WRITE of 0 pages took
0 jiffies

Nov 10 23:31:21 BBTCAPS01 kernel: block drbd0: 0 KB (0 bits) marked
out-of-sync by on disk bit-map.

Nov 10 23:31:21 BBTCAPS01 kernel: block drbd0: disk( Failed -> Diskless ) 

Nov 10 23:31:21 BBTCAPS01 kernel: block drbd0: drbd_bm_resize called with
capacity == 0

Nov 10 23:31:21 BBTCAPS01 kernel: d-con postgres: Terminating
drbd_w_postgres

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:stop:stdout) 

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:stop:stdout) 

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: master-drbd_pg:0 (<null>)

Nov 10 23:31:21 BBTCAPS01 attrd[1150]:   notice: attrd_perform_update: Sent
delete 146: node=bbtcaps01, attr=master-drbd_pg:0, id=<n/a>, set=(null),
section=status

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: abort_transition_graph:
te_update_diff:194 - Triggered transition abort (complete=0,
tag=transient_attributes, id=bbtcaps01, magic=NA, cib=0.47.63) : Transient
attribute: removal

Nov 10 23:31:21 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:stop:stdout) 

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_stop_0 (call=54, rc=0, cib-update=2219, confirmed=true)
ok

Nov 10 23:31:21 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 73: notify drbd_pg:1_post_notify_stop_0 on bbtcaps02

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:   notice: run_graph: ==== Transition
1981 (Complete=21, Pending=0, Fired=0, Skipped=7, Incomplete=5,
Source=/var/lib/pengine/pe-input-97.bz2): Stopped

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:   notice: unpack_config: On loss of
CCM Quorum: Ignore

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:  warning: unpack_rsc_op: Processing
failed op drbd_pg:0_last_failure_0 on bbtcaps01: unknown exec error (-2)

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:   notice: LogActions: Start
drbd_pg:0#011(bbtcaps01)

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: do_te_invoke: Processing
graph 1982 (ref=pe_calc-dc-1384122682-2173) derived from
/var/lib/pengine/pe-input-98.bz2

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 68: notify drbd_pg:1_pre_notify_start_0 on bbtcaps02

Nov 10 23:31:22 BBTCAPS01 pengine[1151]:   notice: process_pe_message:
Transition 1982: PEngine Input stored in: /var/lib/pengine/pe-input-98.bz2

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 25: start drbd_pg:0_start_0 on bbtcaps01 (local)

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:55: start

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: Starting worker thread
(from drbdsetup [27669])

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: disk( Diskless -> Attaching )


Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: Method to ensure write
ordering: flush

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: max BIO size = 1048576

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: drbd_bm_resize called with
capacity == 6291192

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: resync bitmap: bits=786399
words=12288 pages=24

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: size = 3072 MB (3145596 KB)

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: bitmap READ of 24 pages took
2 jiffies

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: recounting of set bits took
additional 0 jiffies

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: 0 KB (0 bits) marked
out-of-sync by on disk bit-map.

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: disk( Attaching -> Outdated )


Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: attached to UUIDs
DE5E36A22B7D9444:0000000000000000:37551CE28C1B606D:37541CE28C1B606D

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: conn( StandAlone ->
Unconnected ) 

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: Starting receiver thread
(from drbd_w_postgres [27671])

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: receiver (re)started

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: conn( Unconnected ->
WFConnection ) 

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:start:stdout) 

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_start_0 (call=55, rc=0, cib-update=2221, confirmed=true)
ok

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 67: notify drbd_pg:0_post_notify_start_0 on bbtcaps01 (local)

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:56: notify

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 69: notify drbd_pg:1_post_notify_start_0 on bbtcaps02

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: RA output:
(drbd_pg:0:notify:stdout) 

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_notify_0 (call=56, rc=0, cib-update=0, confirmed=true)
ok

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: te_rsc_command: Initiating
action 26: monitor drbd_pg:0_monitor_16000 on bbtcaps01 (local)

Nov 10 23:31:22 BBTCAPS01 lrmd: [1149]: info: rsc:drbd_pg:0:57: monitor

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:     info: process_lrm_event: LRM
operation drbd_pg:0_monitor_16000 (call=57, rc=0, cib-update=2222,
confirmed=false) ok

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:   notice: run_graph: ==== Transition
1982 (Complete=11, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-98.bz2): Complete

Nov 10 23:31:22 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: Handshake successful:
Agreed network protocol version 101

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: conn( WFConnection ->
WFReportParams ) 

Nov 10 23:31:22 BBTCAPS01 kernel: d-con postgres: Starting asender thread
(from drbd_r_postgres [27686])

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: drbd_sync_handshake:

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: self
DE5E36A22B7D9444:0000000000000000:37551CE28C1B606D:37541CE28C1B606D bits:0
flags:0

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: peer
CAF56A22D5E1724D:DE5E36A22B7D9445:37551CE28C1B606C:37541CE28C1B606D bits:0
flags:0

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: uuid_compare()=-1 by rule 50

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: peer( Unknown -> Primary )
conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) 

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 100.0%

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: send bitmap stats
[Bytes(packets)]: plain 0(0), RLE 21(1), total 21; compression: 100.0%

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID
) 

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: updated sync uuid
DE5F36A22B7D9444:0000000000000000:37551CE28C1B606D:37541CE28C1B606D

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0 exit code 0 (0x0)

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: conn( WFSyncUUID ->
SyncTarget ) disk( Outdated -> Inconsistent ) 

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: Began resync as SyncTarget
(will sync 0 KB [0 bits set]).

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: Resync done (total 1 sec;
paused 0 sec; 0 K/sec)

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: updated UUIDs
CAF56A22D5E1724C:0000000000000000:DE5F36A22B7D9444:DE5E36A22B7D9445

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: conn( SyncTarget -> Connected
) disk( Inconsistent -> UpToDate ) 

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: helper command: /sbin/drbdadm
after-resync-target minor-0

Nov 10 23:31:22 BBTCAPS01 crm-unfence-peer.sh[27757]: invoked for postgres

Nov 10 23:31:22 BBTCAPS01 kernel: block drbd0: helper command: /sbin/drbdadm
after-resync-target minor-0 exit code 0 (0x0)

Nov 10 23:31:38 BBTCAPS01 attrd[1150]:   notice: attrd_trigger_update:
Sending flush op to all hosts for: master-drbd_pg:0 (10000)

Nov 10 23:31:38 BBTCAPS01 attrd[1150]:   notice: attrd_perform_update: Sent
update 150: master-drbd_pg:0=10000

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:     info: abort_transition_graph:
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair,
id=status-bbtcaps01-master-drbd_pg.0, name=master-drbd_pg:0, value=10000,
magic=NA, cib=0.47.67) : Transient attribute: update

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL
origin=abort_transition_graph ]

Nov 10 23:31:38 BBTCAPS01 pengine[1151]:   notice: unpack_config: On loss of
CCM Quorum: Ignore

Nov 10 23:31:38 BBTCAPS01 pengine[1151]:  warning: unpack_rsc_op: Processing
failed op drbd_pg:0_last_failure_0 on bbtcaps01: unknown exec error (-2)

Nov 10 23:31:38 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:38 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:     info: do_te_invoke: Processing
graph 1983 (ref=pe_calc-dc-1384122698-2180) derived from
/var/lib/pengine/pe-input-99.bz2

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:   notice: run_graph: ==== Transition
1983 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-99.bz2): Complete

Nov 10 23:31:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:31:38 BBTCAPS01 pengine[1151]:   notice: process_pe_message:
Transition 1983: PEngine Input stored in: /var/lib/pengine/pe-input-99.bz2

Nov 10 23:33:37 BBTCAPS01 yum[28121]: Installed: omping-0.0.4-1.el6.x86_64

Nov 10 23:38:41 BBTCAPS01 cib[1147]:     info: cib_stats: Processed 117
operations (1111.00us average, 0% utilization) in the last 10min

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:     info: crm_timer_popped: PEngine
Recheck Timer (I_PE_CALC) just popped (900000ms)

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
origin=crm_timer_popped ]

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:     info: do_state_transition:
Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED

Nov 10 23:46:38 BBTCAPS01 pengine[1151]:   notice: unpack_config: On loss of
CCM Quorum: Ignore

Nov 10 23:46:38 BBTCAPS01 pengine[1151]:  warning: unpack_rsc_op: Processing
failed op drbd_pg:0_last_failure_0 on bbtcaps01: unknown exec error (-2)

Nov 10 23:46:38 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:46:38 BBTCAPS01 pengine[1151]:   notice: common_apply_stickiness:
ms_drbd_pg can fail 999999 more times on bbtcaps01 before being forced off

Nov 10 23:46:38 BBTCAPS01 pengine[1151]:   notice: process_pe_message:
Transition 1984: PEngine Input stored in: /var/lib/pengine/pe-input-99.bz2

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS
cause=C_IPC_MESSAGE origin=handle_response ]

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:     info: do_te_invoke: Processing
graph 1984 (ref=pe_calc-dc-1384123598-2181) derived from
/var/lib/pengine/pe-input-99.bz2

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:   notice: run_graph: ==== Transition
1984 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-input-99.bz2): Complete

Nov 10 23:46:38 BBTCAPS01 crmd[1152]:   notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]

Nov 10 23:48:41 BBTCAPS01 cib[1147]:     info: cib_stats: Processed 38
operations (0.00us average, 0% utilization) in the last 10min

 

 

S pozdravom / Best regards

 

ING. MICHAL MIŠTINA / Virte, a.s.

SARTORISOVA 21 / 821 08 BRATISLAVA 2

SLOVAK REPUBLIC

T +421-2-208-50-234

M +421-917-497-134

 <mailto:MICHAL.MISTINA at VIRTE.SK> MICHAL.MISTINA at VIRTE.SK

 <http://www.virte.sk/> WWW.VIRTE.SK

 

Before printing, think about ENVIRONMENTAL responsibility

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20131111/a1125dcf/attachment-0002.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 1151 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20131111/a1125dcf/attachment-0002.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 3076 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20131111/a1125dcf/attachment-0002.p7s>


More information about the Pacemaker mailing list