[ClusterLabs] Cluster failover failure with Unresolved dependency
Lorand Kelemen
lorand.kelemen at gmail.com
Fri Mar 18 16:42:29 CET 2016
I reviewed all the logs but found nothing out of the ordinary besides the
"resource cannot run anywhere" line; after the cluster-recheck interval
expired, the services started fine without any suspicious log entries.
If anybody wants to investigate further I can provide logs. This behaviour
is odd, but good enough for me, with a maximum downtime of one
cluster-recheck interval...
Best regards,
Lorand
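P.S. For anyone hitting the same delay: the stall until the next
policy-engine run can usually be shortened instead of waiting out the timer.
A sketch using the pcs/Pacemaker CLI (illustrative only; the commands assume
a live cluster and use my resource and node names):

```shell
# Re-evaluate cluster state sooner by shortening the recheck timer
# (trade-off: more frequent policy-engine runs)
pcs property set cluster-recheck-interval=60s

# Or clear the recorded failure so the resource is immediately
# eligible to run again, without waiting for the recheck
pcs resource cleanup amavisd

# Inspect the fail count that migration-threshold=1 is compared against
crm_failcount -G -r amavisd -N mail2
```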
On Fri, Mar 18, 2016 at 10:53 AM, Lorand Kelemen <lorand.kelemen at gmail.com>
wrote:
> 5 minutes have passed. Cluster recheck interval is set to 5 minutes. I
> will check the logs...
> Sorry for spamming the list; I will try to get back with the solution.
>
> On Fri, Mar 18, 2016 at 10:48 AM, Lorand Kelemen <lorand.kelemen at gmail.com
> > wrote:
>
>> Hmm. While drafting this mail, services started on mail1. Interesting :)
>>
>> On Fri, Mar 18, 2016 at 10:46 AM, Lorand Kelemen <
>> lorand.kelemen at gmail.com> wrote:
>>
>>> Sure thing. Just to highlight the differences from before: below is the
>>> current constraints config, and the mail-services group has grown with
>>> additional systemd resources.
>>>
>>> What happened: mail2 was running all resources, then I killed the
>>> amavisd master process.
>>>
>>> Best regards,
>>> Lorand
>>>
>>> Location Constraints:
>>> Ordering Constraints:
>>> promote mail-clone then start fs-services (kind:Mandatory)
>>> promote spool-clone then start fs-services (kind:Mandatory)
>>> start network-services then start fs-services (kind:Mandatory)
>>> start fs-services then start mail-services (kind:Mandatory)
>>> Colocation Constraints:
>>> fs-services with spool-clone (score:INFINITY) (rsc-role:Started)
>>> (with-rsc-role:Master)
>>> fs-services with mail-clone (score:INFINITY) (rsc-role:Started)
>>> (with-rsc-role:Master)
>>> mail-services with fs-services (score:INFINITY)
>>> network-services with mail-services (score:INFINITY)
>>>
>>> Group: mail-services
>>> Resource: amavisd (class=systemd type=amavisd)
>>> Operations: monitor interval=60s (amavisd-monitor-interval-60s)
>>> Resource: spamassassin (class=systemd type=spamassassin)
>>> Operations: monitor interval=60s (spamassassin-monitor-interval-60s)
>>> Resource: clamd (class=systemd type=clamd@amavisd)
>>> Operations: monitor interval=60s (clamd-monitor-interval-60s)
>>>
>>>
>>>
>>> Cluster name: mailcluster
>>> Last updated: Fri Mar 18 10:43:57 2016 Last change: Fri Mar 18
>>> 10:40:28 2016 by hacluster via crmd on mail1
>>> Stack: corosync
>>> Current DC: mail2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with
>>> quorum
>>> 2 nodes and 10 resources configured
>>>
>>> Online: [ mail1 mail2 ]
>>>
>>> Full list of resources:
>>>
>>> Resource Group: network-services
>>> virtualip-1 (ocf::heartbeat:IPaddr2): Stopped
>>> Master/Slave Set: spool-clone [spool]
>>> Masters: [ mail2 ]
>>> Slaves: [ mail1 ]
>>> Master/Slave Set: mail-clone [mail]
>>> Masters: [ mail2 ]
>>> Slaves: [ mail1 ]
>>> Resource Group: fs-services
>>> fs-spool (ocf::heartbeat:Filesystem): Stopped
>>> fs-mail (ocf::heartbeat:Filesystem): Stopped
>>> Resource Group: mail-services
>>> amavisd (systemd:amavisd): Stopped
>>> spamassassin (systemd:spamassassin): Stopped
>>> clamd (systemd:clamd@amavisd): Stopped
>>>
>>> Failed Actions:
>>> * amavisd_monitor_60000 on mail2 'not running' (7): call=2499,
>>> status=complete, exitreason='none',
>>> last-rc-change='Fri Mar 18 10:42:29 2016', queued=0ms, exec=0ms
>>>
>>>
>>> PCSD Status:
>>> mail1: Online
>>> mail2: Online
>>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>>
>>>
>>>
>>> <cib crm_feature_set="3.0.10" validate-with="pacemaker-2.3" epoch="277"
>>> num_updates="22" admin_epoch="0" cib-last-written="Fri Mar 18 10:40:28
>>> 2016" update-origin="mail1" update-client="crmd" update-user="hacluster"
>>> have-quorum="1" dc-uuid="2">
>>> <configuration>
>>> <crm_config>
>>> <cluster_property_set id="cib-bootstrap-options">
>>> <nvpair id="cib-bootstrap-options-have-watchdog"
>>> name="have-watchdog" value="false"/>
>>> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>>> value="1.1.13-10.el7_2.2-44eb2dd"/>
>>> <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>>> name="cluster-infrastructure" value="corosync"/>
>>> <nvpair id="cib-bootstrap-options-cluster-name"
>>> name="cluster-name" value="mailcluster"/>
>>> <nvpair id="cib-bootstrap-options-stonith-enabled"
>>> name="stonith-enabled" value="false"/>
>>> <nvpair id="cib-bootstrap-options-pe-error-series-max"
>>> name="pe-error-series-max" value="1024"/>
>>> <nvpair id="cib-bootstrap-options-pe-warn-series-max"
>>> name="pe-warn-series-max" value="1024"/>
>>> <nvpair id="cib-bootstrap-options-pe-input-series-max"
>>> name="pe-input-series-max" value="1024"/>
>>> <nvpair id="cib-bootstrap-options-no-quorum-policy"
>>> name="no-quorum-policy" value="ignore"/>
>>> <nvpair id="cib-bootstrap-options-cluster-recheck-interval"
>>> name="cluster-recheck-interval" value="5min"/>
>>> <nvpair id="cib-bootstrap-options-last-lrm-refresh"
>>> name="last-lrm-refresh" value="1458294028"/>
>>> <nvpair id="cib-bootstrap-options-default-resource-stickiness"
>>> name="default-resource-stickiness" value="infinity"/>
>>> </cluster_property_set>
>>> </crm_config>
>>> <nodes>
>>> <node id="1" uname="mail1">
>>> <instance_attributes id="nodes-1"/>
>>> </node>
>>> <node id="2" uname="mail2">
>>> <instance_attributes id="nodes-2"/>
>>> </node>
>>> </nodes>
>>> <resources>
>>> <group id="network-services">
>>> <primitive class="ocf" id="virtualip-1" provider="heartbeat"
>>> type="IPaddr2">
>>> <instance_attributes id="virtualip-1-instance_attributes">
>>> <nvpair id="virtualip-1-instance_attributes-ip" name="ip"
>>> value="10.20.64.10"/>
>>> <nvpair id="virtualip-1-instance_attributes-cidr_netmask"
>>> name="cidr_netmask" value="24"/>
>>> <nvpair id="virtualip-1-instance_attributes-nic" name="nic"
>>> value="lan0"/>
>>> </instance_attributes>
>>> <operations>
>>> <op id="virtualip-1-start-interval-0s" interval="0s"
>>> name="start" timeout="20s"/>
>>> <op id="virtualip-1-stop-interval-0s" interval="0s"
>>> name="stop" timeout="20s"/>
>>> <op id="virtualip-1-monitor-interval-30s" interval="30s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> </group>
>>> <master id="spool-clone">
>>> <primitive class="ocf" id="spool" provider="linbit" type="drbd">
>>> <instance_attributes id="spool-instance_attributes">
>>> <nvpair id="spool-instance_attributes-drbd_resource"
>>> name="drbd_resource" value="spool"/>
>>> </instance_attributes>
>>> <operations>
>>> <op id="spool-start-interval-0s" interval="0s" name="start"
>>> timeout="240"/>
>>> <op id="spool-promote-interval-0s" interval="0s"
>>> name="promote" timeout="90"/>
>>> <op id="spool-demote-interval-0s" interval="0s"
>>> name="demote" timeout="90"/>
>>> <op id="spool-stop-interval-0s" interval="0s" name="stop"
>>> timeout="100"/>
>>> <op id="spool-monitor-interval-10s" interval="10s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> <meta_attributes id="spool-clone-meta_attributes">
>>> <nvpair id="spool-clone-meta_attributes-master-max"
>>> name="master-max" value="1"/>
>>> <nvpair id="spool-clone-meta_attributes-master-node-max"
>>> name="master-node-max" value="1"/>
>>> <nvpair id="spool-clone-meta_attributes-clone-max"
>>> name="clone-max" value="2"/>
>>> <nvpair id="spool-clone-meta_attributes-clone-node-max"
>>> name="clone-node-max" value="1"/>
>>> <nvpair id="spool-clone-meta_attributes-notify" name="notify"
>>> value="true"/>
>>> </meta_attributes>
>>> </master>
>>> <master id="mail-clone">
>>> <primitive class="ocf" id="mail" provider="linbit" type="drbd">
>>> <instance_attributes id="mail-instance_attributes">
>>> <nvpair id="mail-instance_attributes-drbd_resource"
>>> name="drbd_resource" value="mail"/>
>>> </instance_attributes>
>>> <operations>
>>> <op id="mail-start-interval-0s" interval="0s" name="start"
>>> timeout="240"/>
>>> <op id="mail-promote-interval-0s" interval="0s"
>>> name="promote" timeout="90"/>
>>> <op id="mail-demote-interval-0s" interval="0s" name="demote"
>>> timeout="90"/>
>>> <op id="mail-stop-interval-0s" interval="0s" name="stop"
>>> timeout="100"/>
>>> <op id="mail-monitor-interval-10s" interval="10s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> <meta_attributes id="mail-clone-meta_attributes">
>>> <nvpair id="mail-clone-meta_attributes-master-max"
>>> name="master-max" value="1"/>
>>> <nvpair id="mail-clone-meta_attributes-master-node-max"
>>> name="master-node-max" value="1"/>
>>> <nvpair id="mail-clone-meta_attributes-clone-max"
>>> name="clone-max" value="2"/>
>>> <nvpair id="mail-clone-meta_attributes-clone-node-max"
>>> name="clone-node-max" value="1"/>
>>> <nvpair id="mail-clone-meta_attributes-notify" name="notify"
>>> value="true"/>
>>> </meta_attributes>
>>> </master>
>>> <group id="fs-services">
>>> <primitive class="ocf" id="fs-spool" provider="heartbeat"
>>> type="Filesystem">
>>> <instance_attributes id="fs-spool-instance_attributes">
>>> <nvpair id="fs-spool-instance_attributes-device"
>>> name="device" value="/dev/drbd0"/>
>>> <nvpair id="fs-spool-instance_attributes-directory"
>>> name="directory" value="/var/spool/postfix"/>
>>> <nvpair id="fs-spool-instance_attributes-fstype"
>>> name="fstype" value="ext4"/>
>>> <nvpair id="fs-spool-instance_attributes-options"
>>> name="options" value="nodev,nosuid,noexec"/>
>>> </instance_attributes>
>>> <operations>
>>> <op id="fs-spool-start-interval-0s" interval="0s"
>>> name="start" timeout="60"/>
>>> <op id="fs-spool-stop-interval-0s" interval="0s" name="stop"
>>> timeout="60"/>
>>> <op id="fs-spool-monitor-interval-20" interval="20"
>>> name="monitor" timeout="40"/>
>>> </operations>
>>> </primitive>
>>> <primitive class="ocf" id="fs-mail" provider="heartbeat"
>>> type="Filesystem">
>>> <instance_attributes id="fs-mail-instance_attributes">
>>> <nvpair id="fs-mail-instance_attributes-device"
>>> name="device" value="/dev/drbd1"/>
>>> <nvpair id="fs-mail-instance_attributes-directory"
>>> name="directory" value="/var/spool/mail"/>
>>> <nvpair id="fs-mail-instance_attributes-fstype"
>>> name="fstype" value="ext4"/>
>>> <nvpair id="fs-mail-instance_attributes-options"
>>> name="options" value="nodev,nosuid,noexec"/>
>>> </instance_attributes>
>>> <operations>
>>> <op id="fs-mail-start-interval-0s" interval="0s"
>>> name="start" timeout="60"/>
>>> <op id="fs-mail-stop-interval-0s" interval="0s" name="stop"
>>> timeout="60"/>
>>> <op id="fs-mail-monitor-interval-20" interval="20"
>>> name="monitor" timeout="40"/>
>>> </operations>
>>> </primitive>
>>> </group>
>>> <group id="mail-services">
>>> <primitive class="systemd" id="amavisd" type="amavisd">
>>> <instance_attributes id="amavisd-instance_attributes"/>
>>> <operations>
>>> <op id="amavisd-monitor-interval-60s" interval="60s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> <primitive class="systemd" id="spamassassin" type="spamassassin">
>>> <instance_attributes id="spamassassin-instance_attributes"/>
>>> <operations>
>>> <op id="spamassassin-monitor-interval-60s" interval="60s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> <primitive class="systemd" id="clamd" type="clamd@amavisd">
>>> <instance_attributes id="clamd-instance_attributes"/>
>>> <operations>
>>> <op id="clamd-monitor-interval-60s" interval="60s"
>>> name="monitor"/>
>>> </operations>
>>> </primitive>
>>> </group>
>>> </resources>
>>> <constraints>
>>> <rsc_order first="mail-clone" first-action="promote"
>>> id="order-mail-clone-fs-services-mandatory" then="fs-services"
>>> then-action="start"/>
>>> <rsc_order first="spool-clone" first-action="promote"
>>> id="order-spool-clone-fs-services-mandatory" then="fs-services"
>>> then-action="start"/>
>>> <rsc_order first="network-services" first-action="start"
>>> id="order-network-services-fs-services-mandatory" then="fs-services"
>>> then-action="start"/>
>>> <rsc_order first="fs-services" first-action="start"
>>> id="order-fs-services-mail-services-mandatory" then="mail-services"
>>> then-action="start"/>
>>> <rsc_colocation id="colocation-fs-services-spool-clone-INFINITY"
>>> rsc="fs-services" rsc-role="Started" score="INFINITY"
>>> with-rsc="spool-clone" with-rsc-role="Master"/>
>>> <rsc_colocation id="colocation-fs-services-mail-clone-INFINITY"
>>> rsc="fs-services" rsc-role="Started" score="INFINITY" with-rsc="mail-clone"
>>> with-rsc-role="Master"/>
>>> <rsc_colocation id="colocation-mail-services-fs-services-INFINITY"
>>> rsc="mail-services" score="INFINITY" with-rsc="fs-services"/>
>>> <rsc_colocation
>>> id="colocation-network-services-mail-services-INFINITY"
>>> rsc="network-services" score="INFINITY" with-rsc="mail-services"/>
>>> </constraints>
>>> <op_defaults>
>>> <meta_attributes id="op_defaults-options">
>>> <nvpair id="op_defaults-options-on-fail" name="on-fail"
>>> value="restart"/>
>>> </meta_attributes>
>>> </op_defaults>
>>> <rsc_defaults>
>>> <meta_attributes id="rsc_defaults-options">
>>> <nvpair id="rsc_defaults-options-migration-threshold"
>>> name="migration-threshold" value="1"/>
>>> </meta_attributes>
>>> </rsc_defaults>
>>> </configuration>
>>> <status>
>>> <node_state id="1" uname="mail1" in_ccm="true" crmd="online"
>>> crm-debug-origin="do_update_resource" join="member" expected="member">
>>> <transient_attributes id="1">
>>> <instance_attributes id="status-1">
>>> <nvpair id="status-1-shutdown" name="shutdown" value="0"/>
>>> <nvpair id="status-1-probe_complete" name="probe_complete"
>>> value="true"/>
>>> <nvpair id="status-1-last-failure-fs-mail"
>>> name="last-failure-fs-mail" value="1458145164"/>
>>> <nvpair id="status-1-last-failure-amavisd"
>>> name="last-failure-amavisd" value="1458144572"/>
>>> <nvpair id="status-1-master-spool" name="master-spool"
>>> value="10000"/>
>>> <nvpair id="status-1-master-mail" name="master-mail"
>>> value="10000"/>
>>> </instance_attributes>
>>> </transient_attributes>
>>> <lrm id="1">
>>> <lrm_resources>
>>> <lrm_resource id="virtualip-1" type="IPaddr2" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="virtualip-1_last_0"
>>> operation_key="virtualip-1_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="13:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;13:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1930" rc-code="0" op-status="0" interval="0"
>>> last-run="1458292925" last-rc-change="1458292925" exec-time="285"
>>> queue-time="0" op-digest="28a9f5254eca47bbb2a9892a336ab8d6"/>
>>> <lrm_rsc_op id="virtualip-1_monitor_30000"
>>> operation_key="virtualip-1_monitor_30000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="13:3390:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;13:3390:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1886" rc-code="0" op-status="0" interval="30000"
>>> last-rc-change="1458216597" exec-time="46" queue-time="0"
>>> op-digest="c2158e684c2fe8758a545e9a9387caed"/>
>>> </lrm_resource>
>>> <lrm_resource id="mail" type="drbd" class="ocf"
>>> provider="linbit">
>>> <lrm_rsc_op id="mail_last_failure_0"
>>> operation_key="mail_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="9:3026:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;9:3026:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1451" rc-code="0" op-status="0" interval="0"
>>> last-run="1458128284" last-rc-change="1458128284" exec-time="72"
>>> queue-time="0" op-digest="98235597a9743aebee92a6c373a068d5"/>
>>> <lrm_rsc_op id="mail_last_0" operation_key="mail_start_0"
>>> operation="start" crm-debug-origin="do_update_resource"
>>> crm_feature_set="3.0.10"
>>> transition-key="50:3669:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;50:3669:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2014" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294003" last-rc-change="1458294003" exec-time="270"
>>> queue-time="0" op-digest="98235597a9743aebee92a6c373a068d5"/>
>>> <lrm_rsc_op id="mail_monitor_10000"
>>> operation_key="mail_monitor_10000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="50:3670:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;50:3670:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2019" rc-code="0" op-status="0" interval="10000"
>>> last-rc-change="1458294004" exec-time="79" queue-time="0"
>>> op-digest="57464d93900365abea1493a8f6b22159"/>
>>> </lrm_resource>
>>> <lrm_resource id="spool" type="drbd" class="ocf"
>>> provider="linbit">
>>> <lrm_rsc_op id="spool_last_failure_0"
>>> operation_key="spool_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="9:3028:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;9:3028:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1459" rc-code="0" op-status="0" interval="0"
>>> last-run="1458128289" last-rc-change="1458128289" exec-time="73"
>>> queue-time="0" op-digest="dbbf364a9d070ebe47b97831a0be60f4"/>
>>> <lrm_rsc_op id="spool_last_0" operation_key="spool_start_0"
>>> operation="start" crm-debug-origin="do_update_resource"
>>> crm_feature_set="3.0.10"
>>> transition-key="20:3669:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;20:3669:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2015" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294003" last-rc-change="1458294003" exec-time="266"
>>> queue-time="0" op-digest="dbbf364a9d070ebe47b97831a0be60f4"/>
>>> <lrm_rsc_op id="spool_monitor_10000"
>>> operation_key="spool_monitor_10000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="19:3670:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;19:3670:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2018" rc-code="0" op-status="0" interval="10000"
>>> last-rc-change="1458294004" exec-time="80" queue-time="0"
>>> op-digest="97f3ae82d78b8755a2179c6797797580"/>
>>> </lrm_resource>
>>> <lrm_resource id="fs-spool" type="Filesystem" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="fs-spool_last_0"
>>> operation_key="fs-spool_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="78:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;78:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1928" rc-code="0" op-status="0" interval="0"
>>> last-run="1458292923" last-rc-change="1458292923" exec-time="1258"
>>> queue-time="0" op-digest="54f97a4890ac973bd096580098e40914"/>
>>> <lrm_rsc_op id="fs-spool_monitor_20000"
>>> operation_key="fs-spool_monitor_20000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="69:3392:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;69:3392:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1896" rc-code="0" op-status="0" interval="20000"
>>> last-rc-change="1458216598" exec-time="47" queue-time="0"
>>> op-digest="e85a7e24c0c0b05f5d196e3d363e4dfc"/>
>>> </lrm_resource>
>>> <lrm_resource id="fs-mail" type="Filesystem" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="fs-mail_last_0"
>>> operation_key="fs-mail_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="81:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;81:3651:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1926" rc-code="0" op-status="0" interval="0"
>>> last-run="1458292923" last-rc-change="1458292923" exec-time="85"
>>> queue-time="1" op-digest="57adf8df552907571679154e346a4403"/>
>>> <lrm_rsc_op id="fs-mail_monitor_20000"
>>> operation_key="fs-mail_monitor_20000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="71:3392:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;71:3392:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1898" rc-code="0" op-status="0" interval="20000"
>>> last-rc-change="1458216598" exec-time="67" queue-time="0"
>>> op-digest="ad82e3ec600949a8e869e8afe9a21fef"/>
>>> </lrm_resource>
>>> <lrm_resource id="amavisd" type="amavisd" class="systemd">
>>> <lrm_rsc_op id="amavisd_last_0"
>>> operation_key="amavisd_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="9:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:7;9:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2026" rc-code="7" op-status="0" interval="0"
>>> last-run="1458294028" last-rc-change="1458294028" exec-time="5"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> </lrm_resource>
>>> <lrm_resource id="spamassassin" type="spamassassin"
>>> class="systemd">
>>> <lrm_rsc_op id="spamassassin_last_0"
>>> operation_key="spamassassin_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="10:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:7;10:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2030" rc-code="7" op-status="0" interval="0"
>>> last-run="1458294028" last-rc-change="1458294028" exec-time="5"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> </lrm_resource>
>>> <lrm_resource id="clamd" type="clamd@amavisd" class="systemd">
>>> <lrm_rsc_op id="clamd_last_0"
>>> operation_key="clamd_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="11:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:7;11:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="2034" rc-code="7" op-status="0" interval="0"
>>> last-run="1458294028" last-rc-change="1458294028" exec-time="7"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> </lrm_resource>
>>> </lrm_resources>
>>> </lrm>
>>> </node_state>
>>> <node_state id="2" uname="mail2" in_ccm="true" crmd="online"
>>> crm-debug-origin="do_update_resource" join="member" expected="member">
>>> <transient_attributes id="2">
>>> <instance_attributes id="status-2">
>>> <nvpair id="status-2-shutdown" name="shutdown" value="0"/>
>>> <nvpair id="status-2-last-failure-spool"
>>> name="last-failure-spool" value="1457364470"/>
>>> <nvpair id="status-2-probe_complete" name="probe_complete"
>>> value="true"/>
>>> <nvpair id="status-2-last-failure-mail"
>>> name="last-failure-mail" value="1457527103"/>
>>> <nvpair id="status-2-last-failure-fs-spool"
>>> name="last-failure-fs-spool" value="1457524256"/>
>>> <nvpair id="status-2-last-failure-fs-mail"
>>> name="last-failure-fs-mail" value="1457611139"/>
>>> <nvpair id="status-2-last-failure-amavisd"
>>> name="last-failure-amavisd" value="1458294149"/>
>>> <nvpair id="status-2-master-mail" name="master-mail"
>>> value="10000"/>
>>> <nvpair id="status-2-master-spool" name="master-spool"
>>> value="10000"/>
>>> <nvpair id="status-2-fail-count-amavisd"
>>> name="fail-count-amavisd" value="1"/>
>>> </instance_attributes>
>>> </transient_attributes>
>>> <lrm id="2">
>>> <lrm_resources>
>>> <lrm_resource id="virtualip-1" type="IPaddr2" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="virtualip-1_last_failure_0"
>>> operation_key="virtualip-1_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="11:3024:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;11:3024:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="1904" rc-code="0" op-status="0" interval="0"
>>> last-run="1458128280" last-rc-change="1458128280" exec-time="49"
>>> queue-time="0" op-digest="28a9f5254eca47bbb2a9892a336ab8d6"/>
>>> <lrm_rsc_op id="virtualip-1_last_0"
>>> operation_key="virtualip-1_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="14:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;14:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2513" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294156" last-rc-change="1458294156" exec-time="51"
>>> queue-time="0" op-digest="28a9f5254eca47bbb2a9892a336ab8d6"/>
>>> <lrm_rsc_op id="virtualip-1_monitor_30000"
>>> operation_key="virtualip-1_monitor_30000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="12:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;12:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2425" rc-code="0" op-status="0" interval="30000"
>>> last-rc-change="1458293985" exec-time="48" queue-time="0"
>>> op-digest="c2158e684c2fe8758a545e9a9387caed"/>
>>> </lrm_resource>
>>> <lrm_resource id="mail" type="drbd" class="ocf"
>>> provider="linbit">
>>> <lrm_rsc_op id="mail_last_failure_0"
>>> operation_key="mail_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="11:3026:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:8;11:3026:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="1911" rc-code="8" op-status="0" interval="0"
>>> last-run="1458128284" last-rc-change="1458128284" exec-time="79"
>>> queue-time="0" op-digest="98235597a9743aebee92a6c373a068d5"/>
>>> <lrm_rsc_op id="mail_last_0" operation_key="mail_promote_0"
>>> operation="promote" crm-debug-origin="do_update_resource"
>>> crm_feature_set="3.0.10"
>>> transition-key="41:3652:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;41:3652:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2333" rc-code="0" op-status="0" interval="0"
>>> last-run="1458292925" last-rc-change="1458292925" exec-time="41"
>>> queue-time="0" op-digest="98235597a9743aebee92a6c373a068d5"/>
>>> </lrm_resource>
>>> <lrm_resource id="spool" type="drbd" class="ocf"
>>> provider="linbit">
>>> <lrm_rsc_op id="spool_last_failure_0"
>>> operation_key="spool_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="11:3028:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:8;11:3028:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="1917" rc-code="8" op-status="0" interval="0"
>>> last-run="1458128289" last-rc-change="1458128289" exec-time="73"
>>> queue-time="0" op-digest="dbbf364a9d070ebe47b97831a0be60f4"/>
>>> <lrm_rsc_op id="spool_last_0"
>>> operation_key="spool_promote_0" operation="promote"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="14:3652:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;14:3652:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2332" rc-code="0" op-status="0" interval="0"
>>> last-run="1458292925" last-rc-change="1458292925" exec-time="45"
>>> queue-time="0" op-digest="dbbf364a9d070ebe47b97831a0be60f4"/>
>>> </lrm_resource>
>>> <lrm_resource id="fs-mail" type="Filesystem" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="fs-mail_last_failure_0"
>>> operation_key="fs-mail_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="11:3150:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;11:3150:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2281" rc-code="0" op-status="0" interval="0"
>>> last-run="1458145187" last-rc-change="1458145187" exec-time="77"
>>> queue-time="1" op-digest="57adf8df552907571679154e346a4403"/>
>>> <lrm_rsc_op id="fs-mail_last_0"
>>> operation_key="fs-mail_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="81:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;81:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2509" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294155" last-rc-change="1458294155" exec-time="78"
>>> queue-time="0" op-digest="57adf8df552907571679154e346a4403"/>
>>> <lrm_rsc_op id="fs-mail_monitor_20000"
>>> operation_key="fs-mail_monitor_20000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="76:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;76:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2429" rc-code="0" op-status="0" interval="20000"
>>> last-rc-change="1458293985" exec-time="62" queue-time="0"
>>> op-digest="ad82e3ec600949a8e869e8afe9a21fef"/>
>>> </lrm_resource>
>>> <lrm_resource id="fs-spool" type="Filesystem" class="ocf"
>>> provider="heartbeat">
>>> <lrm_rsc_op id="fs-spool_last_failure_0"
>>> operation_key="fs-spool_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="10:3150:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;10:3150:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2277" rc-code="0" op-status="0" interval="0"
>>> last-run="1458145187" last-rc-change="1458145187" exec-time="81"
>>> queue-time="0" op-digest="54f97a4890ac973bd096580098e40914"/>
>>> <lrm_rsc_op id="fs-spool_last_0"
>>> operation_key="fs-spool_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="79:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;79:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2511" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294155" last-rc-change="1458294155" exec-time="1220"
>>> queue-time="0" op-digest="54f97a4890ac973bd096580098e40914"/>
>>> <lrm_rsc_op id="fs-spool_monitor_20000"
>>> operation_key="fs-spool_monitor_20000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="74:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;74:3664:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2427" rc-code="0" op-status="0" interval="20000"
>>> last-rc-change="1458293985" exec-time="44" queue-time="0"
>>> op-digest="e85a7e24c0c0b05f5d196e3d363e4dfc"/>
>>> </lrm_resource>
>>> <lrm_resource id="amavisd" type="amavisd" class="systemd">
>>> <lrm_rsc_op id="amavisd_last_failure_0"
>>> operation_key="amavisd_monitor_60000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="86:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:7;86:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2499" rc-code="7" op-status="0" interval="60000"
>>> last-run="1458294028" last-rc-change="1458294149" exec-time="0"
>>> queue-time="0" op-digest="4811cef7f7f94e3a35a70be7916cb2fd"/>
>>> <lrm_rsc_op id="amavisd_last_0"
>>> operation_key="amavisd_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="7:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;7:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2507" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294153" last-rc-change="1458294153" exec-time="2068"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="amavisd_monitor_60000"
>>> operation_key="amavisd_monitor_60000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="86:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;86:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2499" rc-code="0" op-status="0" interval="60000"
>>> last-rc-change="1458294028" exec-time="2" queue-time="0"
>>> op-digest="4811cef7f7f94e3a35a70be7916cb2fd"/>
>>> </lrm_resource>
>>> <lrm_resource id="spamassassin" type="spamassassin"
>>> class="systemd">
>>> <lrm_rsc_op id="spamassassin_last_failure_0"
>>> operation_key="spamassassin_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="14:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;14:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2494" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294028" last-rc-change="1458294028" exec-time="11"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="spamassassin_last_0"
>>> operation_key="spamassassin_stop_0" operation="stop"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="87:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;87:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2505" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294151" last-rc-change="1458294151" exec-time="2072"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="spamassassin_monitor_60000"
>>> operation_key="spamassassin_monitor_60000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="89:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;89:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2500" rc-code="0" op-status="0" interval="60000"
>>> last-rc-change="1458294028" exec-time="1" queue-time="0"
>>> op-digest="4811cef7f7f94e3a35a70be7916cb2fd"/>
>>> </lrm_resource>
>>> <lrm_resource id="clamd" type="clamd@amavisd" class="systemd">
>>> <lrm_rsc_op id="clamd_last_failure_0"
>>> operation_key="clamd_monitor_0" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="15:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;15:3674:7:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2498" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294028" last-rc-change="1458294028" exec-time="10"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="clamd_last_0" operation_key="clamd_stop_0"
>>> operation="stop" crm-debug-origin="do_update_resource"
>>> crm_feature_set="3.0.10"
>>> transition-key="88:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;88:3677:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2503" rc-code="0" op-status="0" interval="0"
>>> last-run="1458294149" last-rc-change="1458294149" exec-time="2085"
>>> queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
>>> <lrm_rsc_op id="clamd_monitor_60000"
>>> operation_key="clamd_monitor_60000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="92:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:0;92:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail2" call-id="2501" rc-code="0" op-status="0" interval="60000"
>>> last-rc-change="1458294029" exec-time="2" queue-time="0"
>>> op-digest="4811cef7f7f94e3a35a70be7916cb2fd"/>
>>> </lrm_resource>
>>> </lrm_resources>
>>> </lrm>
>>> </node_state>
>>> </status>
>>> </cib>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Mar 17, 2016 at 8:30 PM, Ken Gaillot <kgaillot at redhat.com>
>>> wrote:
>>>
>>>> On 03/16/2016 11:20 AM, Lorand Kelemen wrote:
>>>> > Dear Ken,
>>>> >
>>>> > I already modified the startup as suggested during testing, thanks! I
>>>> > swapped the postfix ocf resource to the amavisd systemd resource, as
>>>> > the latter, it turns out, also controls postfix startup, and having
>>>> > both resources in the mail-services group causes conflicts (postfix is
>>>> > detected as not running).
>>>> >
>>>> > Still experiencing the same behaviour: killing amavisd returns rc=7
>>>> > for the monitor operation on the "victim" node, which sounds logical,
>>>> > but the logs contain the same message: amavisd and virtualip cannot
>>>> > run anywhere.
>>>> >
>>>> > I made sure systemd is clean (amavisd = inactive, not running, instead
>>>> > of failed) and also reset the failcount on all resources before
>>>> > killing amavisd.
>>>> >
>>>> > How can I make sure to have a clean state for the resources besides
>>>> > the above actions?
>>>>
>>>> What you did is fine. I'm not sure why amavisd and virtualip can't run.
>>>> Can you show the output of "cibadmin -Q" when the cluster is in that
>>>> state?
>>>>
>>>> > Also note: when causing a filesystem resource to fail (e.g. with
>>>> > unmount), the failover happens successfully and all resources are
>>>> > started on the "survivor" node.
>>>> >
>>>> > Best regards,
>>>> > Lorand
>>>> >
>>>> >
>>>> > On Wed, Mar 16, 2016 at 4:34 PM, Ken Gaillot <kgaillot at redhat.com>
>>>> wrote:
>>>> >
>>>> >> On 03/16/2016 05:49 AM, Lorand Kelemen wrote:
>>>> >>> Dear Ken,
>>>> >>>
>>>> >>> Thanks for the reply! I lowered migration-threshold to 1 and
>>>> >>> rearranged constraints like you suggested:
>>>> >>>
>>>> >>> Location Constraints:
>>>> >>> Ordering Constraints:
>>>> >>> promote mail-clone then start fs-services (kind:Mandatory)
>>>> >>> promote spool-clone then start fs-services (kind:Mandatory)
>>>> >>> start fs-services then start network-services (kind:Mandatory)
>>>> >>
>>>> >> Certainly not a big deal, but I would change the above constraint to
>>>> >> "start fs-services then start mail-services". The IP doesn't care
>>>> >> whether the filesystems are up yet or not, but postfix does.
>>>> >>
>>>> >>> start network-services then start mail-services (kind:Mandatory)
>>>> >>> Colocation Constraints:
>>>> >>> fs-services with spool-clone (score:INFINITY) (rsc-role:Started)
>>>> >>> (with-rsc-role:Master)
>>>> >>> fs-services with mail-clone (score:INFINITY) (rsc-role:Started)
>>>> >>> (with-rsc-role:Master)
>>>> >>> network-services with mail-services (score:INFINITY)
>>>> >>> mail-services with fs-services (score:INFINITY)
>>>> >>>
>>>> >>> Now virtualip and postfix become stopped. I guess these are the
>>>> >>> relevant lines, but I also attach the full logs:
>>>> >>>
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_color: Resource postfix cannot run anywhere
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_color: Resource virtualip-1 cannot run anywhere
>>>> >>>
>>>> >>> Interesting; I will try to play around with ordering and colocation,
>>>> >>> the solution must be in these settings...
>>>> >>>
>>>> >>> Best regards,
>>>> >>> Lorand
>>>> >>>
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: --- 0.215.7 2
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: +++ 0.215.8 (null)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: + /cib: @num_updates=8
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: ++
>>>> >>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']:
>>>> >>> <lrm_rsc_op id="postfix_last_failure_0"
>>>> >>> operation_key="postfix_monitor_45000" operation="monitor"
>>>> >>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>>> >>> transition-key="86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>>> >>>
>>>> transition-magic="0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>>> >>> on_node="mail1" call-id="1333" rc-code="7"
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> abort_transition_graph: Transition aborted by
>>>> postfix_monitor_45000
>>>> >>> 'create' on mail1: Inactive graph
>>>> >>> (magic=0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> cib=0.215.8,
>>>> >>> source=process_graph_event:598, 1)
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> update_failcount: Updating failcount for postfix on mail1 after
>>>> >> failed
>>>> >>> monitor: rc=7 (update=value++, time=1458124686)
>>>> >>
>>>> >> I don't think your constraints are causing problems now; the above
>>>> >> message indicates that the postfix resource failed. Postfix may not be
>>>> >> able to run anywhere because it has already failed on both nodes, and
>>>> >> the IP would be down because it has to be colocated with postfix,
>>>> >> which can't run.
>>>> >>
>>>> >> The rc=7 above indicates that the postfix agent's monitor operation
>>>> >> returned 7, which is "not running". I'd check the logs for postfix
>>>> >> errors.
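As an aside for anyone reading these dumps: the `transition-magic` strings in the CIB excerpts above pack the op-status and rc-code in front of the transition key, so the failure is visible without waiting for the pengine warning. A minimal shell sketch of pulling them apart (the magic value is copied from the failed amavisd monitor op earlier in this thread):

```shell
# transition-magic format: <op-status>:<rc-code>;<transition-key>
# Value taken from the amavisd_last_failure_0 entry in the CIB dump above.
magic="0:7;86:3675:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"

op_status="${magic%%:*}"     # 0 = operation completed without an LRM error
rest="${magic#*:}"
rc_code="${rest%%;*}"        # 7 = OCF_NOT_RUNNING (agent says "not running")
echo "op-status=${op_status} rc-code=${rc_code}"
```

Here `op-status=0 rc-code=7` means the monitor itself ran fine but reported the service as stopped, matching the "not running (7)" pengine messages.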
>>>> >>
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> process_graph_event: Detected action (2962.86)
>>>> >>> postfix_monitor_45000.1333=not running: failed
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_client_update: Expanded fail-count-postfix=value++ to 1
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Completed cib_modify operation for section
>>>> status:
>>>> >> OK
>>>> >>> (rc=0, origin=mail1/crmd/253, version=0.215.8)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_peer_update: Setting fail-count-postfix[mail1]: (null) ->
>>>> 1 from
>>>> >>> mail2
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [
>>>> >>> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> write_attribute: Sent update 406 with 2 changes for
>>>> >>> fail-count-postfix, id=<n/a>, set=(null)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_peer_update: Setting last-failure-postfix[mail1]:
>>>> 1458124291 ->
>>>> >>> 1458124686 from mail2
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> write_attribute: Sent update 407 with 2 changes for
>>>> >>> last-failure-postfix, id=<n/a>, set=(null)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Forwarding cib_modify operation for section
>>>> status
>>>> >> to
>>>> >>> master (origin=local/attrd/406)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Forwarding cib_modify operation for section
>>>> status
>>>> >> to
>>>> >>> master (origin=local/attrd/407)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: --- 0.215.8 2
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: +++ 0.215.9 (null)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: + /cib: @num_updates=9
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: ++
>>>> >>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']:
>>>> >>> <nvpair id="status-1-fail-count-postfix" name="fail-count-postfix"
>>>> >>> value="1"/>
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Completed cib_modify operation for section
>>>> status:
>>>> >> OK
>>>> >>> (rc=0, origin=mail2/attrd/406, version=0.215.9)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: --- 0.215.9 2
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: +++ 0.215.10 (null)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: + /cib: @num_updates=10
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: +
>>>> >>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix']:
>>>> >>> @value=1458124686
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 406 for fail-count-postfix: OK (0)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 406 for fail-count-postfix[mail1]=1:
>>>> OK (0)
>>>> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Completed cib_modify operation for section
>>>> status:
>>>> >> OK
>>>> >>> (rc=0, origin=mail2/attrd/407, version=0.215.10)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 406 for
>>>> fail-count-postfix[mail2]=(null): OK
>>>> >>> (0)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 407 for last-failure-postfix: OK (0)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 407 for
>>>> >>> last-failure-postfix[mail1]=1458124686: OK (0)
>>>> >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>>>> >>> attrd_cib_callback: Update 407 for
>>>> >>> last-failure-postfix[mail2]=1457610376: OK (0)
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> abort_transition_graph: Transition aborted by
>>>> >>> status-1-fail-count-postfix, fail-count-postfix=1: Transient
>>>> attribute
>>>> >>> change (create cib=0.215.9, source=abort_unless_down:319,
>>>> >>>
>>>> >>
>>>> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1'],
>>>> >>> 1)
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> abort_transition_graph: Transition aborted by
>>>> >>> status-1-last-failure-postfix, last-failure-postfix=1458124686:
>>>> Transient
>>>> >>> attribute change (modify cib=0.215.10, source=abort_unless_down:319,
>>>> >>>
>>>> >>
>>>> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix'],
>>>> >>> 1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> unpack_config: On loss of CCM Quorum: Ignore
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_online_status: Node mail1 is online
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_online_status: Node mail2 is online
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource mail:0
>>>> active in
>>>> >>> master mode on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource spool:0
>>>> active in
>>>> >>> master mode on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-spool
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-spool
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-mail
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-mail
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>> unpack_rsc_op_failure: Processing failed op monitor for
>>>> postfix on
>>>> >>> mail1: not running (7)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource spool:1
>>>> active in
>>>> >>> master mode on mail2
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource mail:1
>>>> active in
>>>> >>> master mode on mail2
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: network-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2):
>>>> >> Started
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> clone_print: Master/Slave Set: spool-clone [spool]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Masters: [ mail1 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Slaves: [ mail2 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> clone_print: Master/Slave Set: mail-clone [mail]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Masters: [ mail1 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Slaves: [ mail2 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: fs-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: fs-spool (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: fs-mail (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: mail-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: postfix (ocf::heartbeat:postfix): FAILED
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: Promoting mail:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: mail-clone: Promoted 1 instances of a possible 1 to
>>>> master
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: Promoting spool:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: spool-clone: Promoted 1 instances of a possible 1 to
>>>> master
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> RecurringOp: Start recurring monitor (45s) for postfix on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave virtualip-1 (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave spool:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave spool:1 (Slave mail2)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave mail:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave mail:1 (Slave mail2)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave fs-spool (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave fs-mail (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> LogActions: Recover postfix (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> process_pe_message: Calculated Transition 2963:
>>>> >>> /var/lib/pacemaker/pengine/pe-input-330.bz2
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> handle_response: pe_calc calculation
>>>> pe_calc-dc-1458124686-5541 is
>>>> >>> obsolete
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> unpack_config: On loss of CCM Quorum: Ignore
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_online_status: Node mail1 is online
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_online_status: Node mail2 is online
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource mail:0
>>>> active in
>>>> >>> master mode on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource spool:0
>>>> active in
>>>> >>> master mode on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-spool
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-spool
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-mail
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource fs-mail
>>>> active on
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>> unpack_rsc_op_failure: Processing failed op monitor for
>>>> postfix on
>>>> >>> mail1: not running (7)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource spool:1
>>>> active in
>>>> >>> master mode on mail2
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> determine_op_status: Operation monitor found resource mail:1
>>>> active in
>>>> >>> master mode on mail2
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: network-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2):
>>>> >> Started
>>>> >>> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> clone_print: Master/Slave Set: spool-clone [spool]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Masters: [ mail1 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Slaves: [ mail2 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> clone_print: Master/Slave Set: mail-clone [mail]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Masters: [ mail1 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> short_print: Slaves: [ mail2 ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: fs-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: fs-spool (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: fs-mail (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> group_print: Resource Group: mail-services
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_print: postfix (ocf::heartbeat:postfix): FAILED
>>>> >> mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> get_failcount_full: postfix has failed 1 times on mail1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>> common_apply_stickiness: Forcing postfix away from mail1 after
>>>> 1
>>>> >>> failures (max=1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: Promoting mail:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: mail-clone: Promoted 1 instances of a possible 1 to
>>>> master
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: Promoting spool:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> master_color: spool-clone: Promoted 1 instances of a possible 1 to
>>>> master
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> rsc_merge_weights: fs-mail: Rolling back scores from postfix
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> rsc_merge_weights: postfix: Rolling back scores from virtualip-1
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_color: Resource postfix cannot run anywhere
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> native_color: Resource virtualip-1 cannot run anywhere
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> LogActions: Stop virtualip-1 (mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave spool:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave spool:1 (Slave mail2)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave mail:0 (Master mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave mail:1 (Slave mail2)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave fs-spool (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>>>> >>> LogActions: Leave fs-mail (Started mail1)
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> LogActions: Stop postfix (mail1)
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> do_state_transition: State transition S_POLICY_ENGINE ->
>>>> >>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>>> >>> origin=handle_response ]
>>>> >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>> process_pe_message: Calculated Transition 2964:
>>>> >>> /var/lib/pacemaker/pengine/pe-input-331.bz2
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> do_te_invoke: Processing graph 2964 (ref=pe_calc-dc-1458124686-5542)
>>>> >>> derived from /var/lib/pacemaker/pengine/pe-input-331.bz2
>>>> >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>> te_rsc_command: Initiating action 5: stop postfix_stop_0 on
>>>> mail1
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: --- 0.215.10 2
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: +++ 0.215.11 (null)
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: + /cib: @num_updates=11
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: +
>>>> >>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_0']:
>>>> >>> @operation_key=postfix_stop_0, @operation=stop,
>>>> >>> @transition-key=5:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>> @transition-magic=0:0;5:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>> @call-id=1335, @last-run=1458124686, @last-rc-change=1458124686,
>>>> >>> @exec-time=435
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> match_graph_event: Action postfix_stop_0 (5) confirmed on mail1
>>>> (rc=0)
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Completed cib_modify operation for section
>>>> status:
>>>> >> OK
>>>> >>> (rc=0, origin=mail1/crmd/254, version=0.215.11)
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>> te_rsc_command: Initiating action 12: stop virtualip-1_stop_0
>>>> on
>>>> >> mail1
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: --- 0.215.11 2
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: Diff: +++ 0.215.12 (null)
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: + /cib: @num_updates=12
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_perform_op: +
>>>> >>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='virtualip-1']/lrm_rsc_op[@id='virtualip-1_last_0']:
>>>> >>> @operation_key=virtualip-1_stop_0, @operation=stop,
>>>> >>> @transition-key=12:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>
>>>> @transition-magic=0:0;12:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>> @call-id=1337, @last-run=1458124687, @last-rc-change=1458124687,
>>>> >>> @exec-time=56
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info:
>>>> >>> match_graph_event: Action virtualip-1_stop_0 (12) confirmed on
>>>> mail1
>>>> >>> (rc=0)
>>>> >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_request: Completed cib_modify operation for section
>>>> status:
>>>> >> OK
>>>> >>> (rc=0, origin=mail1/crmd/255, version=0.215.12)
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>> run_graph: Transition 2964 (Complete=7, Pending=0, Fired=0,
>>>> Skipped=0,
>>>> >>> Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-331.bz2):
>>>> >> Complete
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info:
>>>> do_log:
>>>> >>> FSA: Input I_TE_SUCCESS from notify_crmd() received in state
>>>> >>> S_TRANSITION_ENGINE
>>>> >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>> do_state_transition: State transition S_TRANSITION_ENGINE ->
>>>> S_IDLE [
>>>> >>> input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
>>>> >>> Mar 16 11:38:12 [7415] HWJ-626.domain.local cib: info:
>>>> >>> cib_process_ping: Reporting our current digest to mail2:
>>>> >>> ed43bc3ecf0f15853900ba49fc514870 for 0.215.12 (0x152b110 0)
>>>> >>>
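The "Forcing postfix away from mail1 after 1 failures (max=1)" pengine warning in the log above is what produces the "cannot run anywhere" result: once a resource's fail-count on a node reaches migration-threshold, that node is excluded until the failure is cleaned up or a failure-timeout (or, as observed here, the cluster-recheck-interval) lets the cluster re-evaluate. A rough sketch of that check (not Pacemaker's actual code):

```shell
# Illustrative only: a resource is banned from a node once its fail-count
# reaches migration-threshold, mirroring the pengine warning in the logs.
migration_threshold=1
fail_count=1

if [ "$fail_count" -ge "$migration_threshold" ]; then
  decision="banned"    # pengine: "Forcing postfix away from mail1"
else
  decision="allowed"
fi
echo "postfix on mail1: $decision"
```

With migration-threshold=1 and both nodes having a recorded failure, the resource ends up banned everywhere, which is exactly the "Resource postfix cannot run anywhere" state.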
>>>> >>>
>>>> >>> On Mon, Mar 14, 2016 at 6:44 PM, Ken Gaillot <kgaillot at redhat.com>
>>>> >> wrote:
>>>> >>>
>>>> >>>> On 03/10/2016 09:49 AM, Lorand Kelemen wrote:
>>>> >>>>> Dear List,
>>>> >>>>>
>>>> >>>>> After creating and testing a simple 2-node active-passive
>>>> >>>>> drbd+postfix cluster, nearly everything works flawlessly (standby,
>>>> >>>>> failure of a filesystem resource + failover, split-brain + manual
>>>> >>>>> recovery); however, when deliberately killing the postfix instance,
>>>> >>>>> after reaching the migration threshold failover does not occur and
>>>> >>>>> the resources revert to the Stopped state (except the master-slave
>>>> >>>>> drbd resource, which works as expected).
>>>> >>>>>
>>>> >>>>> Ordering and colocation are configured, STONITH and quorum are
>>>> >>>>> disabled; the goal is to always have one node running all the
>>>> >>>>> resources, and at any sign of error it should fail over to the
>>>> >>>>> passive node, nothing fancy.
>>>> >>>>>
>>>> >>>>> Is my configuration wrong or am I hitting a bug?
>>>> >>>>>
>>>> >>>>> All software from centos 7 + elrepo repositories.
>>>> >>>>
>>>> >>>> With these versions, you can set "two_node: 1" in
>>>> >>>> /etc/corosync/corosync.conf (which will be done automatically if you
>>>> >>>> used "pcs cluster setup" initially), and then you don't need to
>>>> >>>> ignore quorum in pacemaker.
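For reference, the quorum stanza Ken describes looks roughly like this in /etc/corosync/corosync.conf (a sketch; exact contents depend on how the cluster was set up, and note that two_node implicitly enables wait_for_all):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
}
```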
>>>> >>>>
>>>> >>>>> Regarding STONITH: the machines are running on free ESXi instances
>>>> >>>>> on separate hosts, so the VMware fencing agents won't work because
>>>> >>>>> in the free version the API is read-only.
>>>> >>>>> Still trying to figure out a way to go, until then manual
>>>> recovery +
>>>> >> huge
>>>> >>>>> ARP cache times on the upstream firewall...
>>>> >>>>>
>>>> >>>>> Please find pe-input*.bz2 files attached; logs and config below.
>>>> The
>>>> >>>>> situation: on node mail1 postfix was killed 3 times (migration
>>>> >>>> threshold);
>>>> >>>>> it should have failed over to mail2.
>>>> >>>>> When killing a filesystem resource three times this happens
>>>> flawlessly.
>>>> >>>>>
>>>> >>>>> Thanks for your input!
>>>> >>>>>
>>>> >>>>> Best regards,
>>>> >>>>> Lorand
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Cluster Name: mailcluster
>>>> >>>>> Corosync Nodes:
>>>> >>>>> mail1 mail2
>>>> >>>>> Pacemaker Nodes:
>>>> >>>>> mail1 mail2
>>>> >>>>>
>>>> >>>>> Resources:
>>>> >>>>> Group: network-services
>>>> >>>>> Resource: virtualip-1 (class=ocf provider=heartbeat
>>>> type=IPaddr2)
>>>> >>>>> Attributes: ip=10.20.64.10 cidr_netmask=24 nic=lan0
>>>> >>>>> Operations: start interval=0s timeout=20s
>>>> >>>> (virtualip-1-start-interval-0s)
>>>> >>>>> stop interval=0s timeout=20s
>>>> >>>> (virtualip-1-stop-interval-0s)
>>>> >>>>> monitor interval=30s
>>>> (virtualip-1-monitor-interval-30s)
>>>> >>>>> Master: spool-clone
>>>> >>>>> Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>> >> clone-node-max=1
>>>> >>>>> notify=true
>>>> >>>>> Resource: spool (class=ocf provider=linbit type=drbd)
>>>> >>>>> Attributes: drbd_resource=spool
>>>> >>>>> Operations: start interval=0s timeout=240
>>>> (spool-start-interval-0s)
>>>> >>>>> promote interval=0s timeout=90
>>>> >> (spool-promote-interval-0s)
>>>> >>>>> demote interval=0s timeout=90
>>>> (spool-demote-interval-0s)
>>>> >>>>> stop interval=0s timeout=100
>>>> (spool-stop-interval-0s)
>>>> >>>>> monitor interval=10s (spool-monitor-interval-10s)
>>>> >>>>> Master: mail-clone
>>>> >>>>> Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>> >> clone-node-max=1
>>>> >>>>> notify=true
>>>> >>>>> Resource: mail (class=ocf provider=linbit type=drbd)
>>>> >>>>> Attributes: drbd_resource=mail
>>>> >>>>> Operations: start interval=0s timeout=240
>>>> (mail-start-interval-0s)
>>>> >>>>> promote interval=0s timeout=90
>>>> >> (mail-promote-interval-0s)
>>>> >>>>> demote interval=0s timeout=90
>>>> (mail-demote-interval-0s)
>>>> >>>>> stop interval=0s timeout=100
>>>> (mail-stop-interval-0s)
>>>> >>>>> monitor interval=10s (mail-monitor-interval-10s)
>>>> >>>>> Group: fs-services
>>>> >>>>> Resource: fs-spool (class=ocf provider=heartbeat
>>>> type=Filesystem)
>>>> >>>>> Attributes: device=/dev/drbd0 directory=/var/spool/postfix
>>>> >> fstype=ext4
>>>> >>>>> options=nodev,nosuid,noexec
>>>> >>>>> Operations: start interval=0s timeout=60
>>>> >> (fs-spool-start-interval-0s)
>>>> >>>>> stop interval=0s timeout=60
>>>> (fs-spool-stop-interval-0s)
>>>> >>>>> monitor interval=20 timeout=40
>>>> >>>> (fs-spool-monitor-interval-20)
>>>> >>>>> Resource: fs-mail (class=ocf provider=heartbeat type=Filesystem)
>>>> >>>>> Attributes: device=/dev/drbd1 directory=/var/spool/mail
>>>> fstype=ext4
>>>> >>>>> options=nodev,nosuid,noexec
>>>> >>>>> Operations: start interval=0s timeout=60
>>>> (fs-mail-start-interval-0s)
>>>> >>>>> stop interval=0s timeout=60
>>>> (fs-mail-stop-interval-0s)
>>>> >>>>> monitor interval=20 timeout=40
>>>> >>>> (fs-mail-monitor-interval-20)
>>>> >>>>> Group: mail-services
>>>> >>>>> Resource: postfix (class=ocf provider=heartbeat type=postfix)
>>>> >>>>> Operations: start interval=0s timeout=20s
>>>> >> (postfix-start-interval-0s)
>>>> >>>>> stop interval=0s timeout=20s
>>>> (postfix-stop-interval-0s)
>>>> >>>>> monitor interval=45s (postfix-monitor-interval-45s)
>>>> >>>>>
>>>> >>>>> Stonith Devices:
>>>> >>>>> Fencing Levels:
>>>> >>>>>
>>>> >>>>> Location Constraints:
>>>> >>>>> Ordering Constraints:
>>>> >>>>> start network-services then promote mail-clone (kind:Mandatory)
>>>> >>>>> (id:order-network-services-mail-clone-mandatory)
>>>> >>>>> promote mail-clone then promote spool-clone (kind:Mandatory)
>>>> >>>>> (id:order-mail-clone-spool-clone-mandatory)
>>>> >>>>> promote spool-clone then start fs-services (kind:Mandatory)
>>>> >>>>> (id:order-spool-clone-fs-services-mandatory)
>>>> >>>>> start fs-services then start mail-services (kind:Mandatory)
>>>> >>>>> (id:order-fs-services-mail-services-mandatory)
>>>> >>>>> Colocation Constraints:
>>>> >>>>> network-services with spool-clone (score:INFINITY)
>>>> (rsc-role:Started)
>>>> >>>>> (with-rsc-role:Master)
>>>> >>>> (id:colocation-network-services-spool-clone-INFINITY)
>>>> >>>>> network-services with mail-clone (score:INFINITY)
>>>> (rsc-role:Started)
>>>> >>>>> (with-rsc-role:Master)
>>>> >>>> (id:colocation-network-services-mail-clone-INFINITY)
>>>> >>>>> network-services with fs-services (score:INFINITY)
>>>> >>>>> (id:colocation-network-services-fs-services-INFINITY)
>>>> >>>>> network-services with mail-services (score:INFINITY)
>>>> >>>>> (id:colocation-network-services-mail-services-INFINITY)
>>>> >>>>
>>>> >>>> I'm not sure whether it's causing your issue, but I would make the
>>>> >>>> constraints reflect the logical relationships better.
>>>> >>>>
>>>> >>>> For example, network-services only needs to be colocated with
>>>> >>>> mail-services logically; it's mail-services that needs to be with
>>>> >>>> fs-services, and fs-services that needs to be with
>>>> >>>> spool-clone/mail-clone master. In other words, don't make the
>>>> >>>> highest-level resource depend on everything else, make each level
>>>> depend
>>>> >>>> on the level below it.
>>>> >>>>
>>>> >>>> Also, I would guess that the virtual IP only needs to be ordered
>>>> before
>>>> >>>> mail-services, and mail-clone and spool-clone could both be ordered
>>>> >>>> before fs-services, rather than ordering mail-clone before
>>>> spool-clone.
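
A sketch of how that layered arrangement might look as pcs commands, using the resource names from the config above (syntax per pcs 0.9.x on CentOS 7; this is an illustration of the restructuring described, not a tested recipe):

```sh
# Colocation: each level depends only on the level directly below it,
# rather than the top-level group depending on everything.
pcs constraint colocation add network-services with mail-services INFINITY
pcs constraint colocation add mail-services with fs-services INFINITY
pcs constraint colocation add fs-services with master spool-clone INFINITY
pcs constraint colocation add fs-services with master mail-clone INFINITY

# Ordering: both DRBD masters are promoted before the filesystems
# (no artificial mail-clone-before-spool-clone ordering), and the
# virtual IP only needs to come up before the mail services.
pcs constraint order promote spool-clone then start fs-services
pcs constraint order promote mail-clone then start fs-services
pcs constraint order start fs-services then start mail-services
pcs constraint order start network-services then start mail-services
```
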
>>>> >>>>
>>>> >>>>> Resources Defaults:
>>>> >>>>> migration-threshold: 3
>>>> >>>>> Operations Defaults:
>>>> >>>>> on-fail: restart
>>>> >>>>>
>>>> >>>>> Cluster Properties:
>>>> >>>>> cluster-infrastructure: corosync
>>>> >>>>> cluster-name: mailcluster
>>>> >>>>> cluster-recheck-interval: 5min
>>>> >>>>> dc-version: 1.1.13-10.el7_2.2-44eb2dd
>>>> >>>>> default-resource-stickiness: infinity
>>>> >>>>> have-watchdog: false
>>>> >>>>> last-lrm-refresh: 1457613674
>>>> >>>>> no-quorum-policy: ignore
>>>> >>>>> pe-error-series-max: 1024
>>>> >>>>> pe-input-series-max: 1024
>>>> >>>>> pe-warn-series-max: 1024
>>>> >>>>> stonith-enabled: false
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.15 2
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.16 (null)
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=16
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_failure_0']:
>>>> >>>>> @transition-key=4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>>
>>>> @transition-magic=0:7;4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>> @call-id=1274, @last-rc-change=1457613440
>>>> >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> abort_transition_graph: Transition aborted by
>>>> >> postfix_monitor_45000
>>>> >>>>> 'modify' on mail1: Inactive graph
>>>> >>>>> (magic=0:7;4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> cib=0.197.16,
>>>> >>>>> source=process_graph_event:598, 1)
>>>> >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> update_failcount: Updating failcount for postfix on mail1
>>>> after
>>>> >>>> failed
>>>> >>>>> monitor: rc=7 (update=value++, time=1457613440)
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_client_update: Expanded fail-count-postfix=value++ to 3
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Completed cib_modify operation for section
>>>> >> status:
>>>> >>>> OK
>>>> >>>>> (rc=0, origin=mail1/crmd/196, version=0.197.16)
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_peer_update: Setting fail-count-postfix[mail1]: 2 -> 3
>>>> from
>>>> >>>> mail2
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> write_attribute: Sent update 400 with 2 changes for
>>>> >>>>> fail-count-postfix, id=<n/a>, set=(null)
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Forwarding cib_modify operation for section
>>>> >> status
>>>> >>>> to
>>>> >>>>> master (origin=local/attrd/400)
>>>> >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> process_graph_event: Detected action (1234.4)
>>>> >>>>> postfix_monitor_45000.1274=not running: failed
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_peer_update: Setting last-failure-postfix[mail1]:
>>>> 1457613347
>>>> >> ->
>>>> >>>>> 1457613440 from mail2
>>>> >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>>>> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [
>>>> >>>>> input=I_PE_CALC cause=C_FSA_INTERNAL
>>>> origin=abort_transition_graph ]
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> write_attribute: Sent update 401 with 2 changes for
>>>> >>>>> last-failure-postfix, id=<n/a>, set=(null)
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.16 2
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.17 (null)
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=17
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-fail-count-postfix']:
>>>> >>>>> @value=3
>>>> >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Completed cib_modify operation for section
>>>> >> status:
>>>> >>>> OK
>>>> >>>>> (rc=0, origin=mail2/attrd/400, version=0.197.17)
>>>> >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> abort_transition_graph: Transition aborted by
>>>> >>>>> status-1-fail-count-postfix, fail-count-postfix=3: Transient
>>>> attribute
>>>> >>>>> change (modify cib=0.197.17, source=abort_unless_down:319,
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-fail-count-postfix'],
>>>> >>>>> 1)
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 400 for fail-count-postfix: OK (0)
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 400 for fail-count-postfix[mail1]=3:
>>>> OK
>>>> >> (0)
>>>> >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 400 for
>>>> fail-count-postfix[mail2]=(null):
>>>> >> OK
>>>> >>>>> (0)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Forwarding cib_modify operation for section
>>>> >> status
>>>> >>>> to
>>>> >>>>> master (origin=local/attrd/401)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.17 2
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.18 (null)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=18
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix']:
>>>> >>>>> @value=1457613440
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Completed cib_modify operation for section
>>>> >> status:
>>>> >>>> OK
>>>> >>>>> (rc=0, origin=mail2/attrd/401, version=0.197.18)
>>>> >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 401 for last-failure-postfix: OK (0)
>>>> >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 401 for
>>>> >>>>> last-failure-postfix[mail1]=1457613440: OK (0)
>>>> >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info:
>>>> >>>>> attrd_cib_callback: Update 401 for
>>>> >>>>> last-failure-postfix[mail2]=1457610376: OK (0)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> abort_transition_graph: Transition aborted by
>>>> >>>>> status-1-last-failure-postfix, last-failure-postfix=1457613440:
>>>> >> Transient
>>>> >>>>> attribute change (modify cib=0.197.18,
>>>> source=abort_unless_down:319,
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix'],
>>>> >>>>> 1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> unpack_config: On loss of CCM Quorum: Ignore
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_online_status: Node mail1 is online
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_online_status: Node mail2 is online
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource mail:0
>>>> active in
>>>> >>>>> master mode on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource spool:0
>>>> active
>>>> >> in
>>>> >>>>> master mode on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource fs-spool
>>>> active
>>>> >> on
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource fs-mail
>>>> active
>>>> >> on
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>>>> unpack_rsc_op_failure: Processing failed op monitor for
>>>> postfix
>>>> >> on
>>>> >>>>> mail1: not running (7)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource spool:1
>>>> active
>>>> >> in
>>>> >>>>> master mode on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource mail:1
>>>> active in
>>>> >>>>> master mode on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: network-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2):
>>>> >>>> Started
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> clone_print: Master/Slave Set: spool-clone [spool]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Masters: [ mail1 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Slaves: [ mail2 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> clone_print: Master/Slave Set: mail-clone [mail]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Masters: [ mail1 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Slaves: [ mail2 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: fs-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: fs-spool (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: fs-mail (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: mail-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: postfix (ocf::heartbeat:postfix):
>>>> FAILED
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> get_failcount_full: postfix has failed 3 times on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>>>> common_apply_stickiness: Forcing postfix away from mail1
>>>> after 3
>>>> >>>>> failures (max=3)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: Promoting mail:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: mail-clone: Promoted 1 instances of a possible 1 to
>>>> >> master
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: Promoting spool:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: spool-clone: Promoted 1 instances of a possible 1 to
>>>> >> master
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> rsc_merge_weights: postfix: Rolling back scores from
>>>> virtualip-1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_color: Resource virtualip-1 cannot run anywhere
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> RecurringOp: Start recurring monitor (45s) for postfix on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop virtualip-1 (mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave spool:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave spool:1 (Slave mail2)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave mail:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave mail:1 (Slave mail2)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop fs-spool (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop fs-mail (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop postfix (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> process_pe_message: Calculated Transition 1235:
>>>> >>>>> /var/lib/pacemaker/pengine/pe-input-302.bz2
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> handle_response: pe_calc calculation
>>>> pe_calc-dc-1457613441-3756 is
>>>> >>>>> obsolete
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> unpack_config: On loss of CCM Quorum: Ignore
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_online_status: Node mail1 is online
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_online_status: Node mail2 is online
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource mail:0
>>>> active in
>>>> >>>>> master mode on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource spool:0
>>>> active
>>>> >> in
>>>> >>>>> master mode on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource fs-spool
>>>> active
>>>> >> on
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource fs-mail
>>>> active
>>>> >> on
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>>>> unpack_rsc_op_failure: Processing failed op monitor for
>>>> postfix
>>>> >> on
>>>> >>>>> mail1: not running (7)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource spool:1
>>>> active
>>>> >> in
>>>> >>>>> master mode on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> determine_op_status: Operation monitor found resource mail:1
>>>> active in
>>>> >>>>> master mode on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: network-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2):
>>>> >>>> Started
>>>> >>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> clone_print: Master/Slave Set: spool-clone [spool]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Masters: [ mail1 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Slaves: [ mail2 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> clone_print: Master/Slave Set: mail-clone [mail]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Masters: [ mail1 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> short_print: Slaves: [ mail2 ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: fs-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: fs-spool (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: fs-mail (ocf::heartbeat:Filesystem):
>>>> Started
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> group_print: Resource Group: mail-services
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_print: postfix (ocf::heartbeat:postfix):
>>>> FAILED
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> get_failcount_full: postfix has failed 3 times on mail1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning:
>>>> >>>>> common_apply_stickiness: Forcing postfix away from mail1
>>>> after 3
>>>> >>>>> failures (max=3)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: Promoting mail:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: mail-clone: Promoted 1 instances of a possible 1 to
>>>> >> master
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: Promoting spool:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> master_color: spool-clone: Promoted 1 instances of a possible 1 to
>>>> >> master
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> rsc_merge_weights: postfix: Rolling back scores from
>>>> virtualip-1
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> native_color: Resource virtualip-1 cannot run anywhere
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> RecurringOp: Start recurring monitor (45s) for postfix on mail2
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop virtualip-1 (mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave spool:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave spool:1 (Slave mail2)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave mail:0 (Master mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info:
>>>> >>>>> LogActions: Leave mail:1 (Slave mail2)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop fs-spool (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop fs-mail (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> LogActions: Stop postfix (Started mail1)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> do_state_transition: State transition S_POLICY_ENGINE ->
>>>> >>>>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>>> >>>>> origin=handle_response ]
>>>> >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice:
>>>> >>>>> process_pe_message: Calculated Transition 1236:
>>>> >>>>> /var/lib/pacemaker/pengine/pe-input-303.bz2
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> do_te_invoke: Processing graph 1236
>>>> (ref=pe_calc-dc-1457613441-3757)
>>>> >>>>> derived from /var/lib/pacemaker/pengine/pe-input-303.bz2
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>>>> te_rsc_command: Initiating action 12: stop
>>>> virtualip-1_stop_0 on
>>>> >>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>>>> te_rsc_command: Initiating action 5: stop postfix_stop_0 on
>>>> mail1
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.18 2
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.19 (null)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=19
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='virtualip-1']/lrm_rsc_op[@id='virtualip-1_last_0']:
>>>> >>>>> @operation_key=virtualip-1_stop_0, @operation=stop,
>>>> >>>>> @transition-key=12:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>>
>>>> @transition-magic=0:0;12:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>> @call-id=1276, @last-run=1457613441, @last-rc-change=1457613441,
>>>> >>>>> @exec-time=66
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Completed cib_modify operation for section
>>>> >> status:
>>>> >>>> OK
>>>> >>>>> (rc=0, origin=mail1/crmd/197, version=0.197.19)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> match_graph_event: Action virtualip-1_stop_0 (12) confirmed on
>>>> mail1
>>>> >>>>> (rc=0)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.19 2
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.20 (null)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=20
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_0']:
>>>> >>>>> @operation_key=postfix_stop_0, @operation=stop,
>>>> >>>>> @transition-key=5:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>>
>>>> @transition-magic=0:0;5:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a,
>>>> >>>>> @call-id=1278, @last-run=1457613441, @last-rc-change=1457613441,
>>>> >>>>> @exec-time=476
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info:
>>>> >>>>> match_graph_event: Action postfix_stop_0 (5) confirmed on mail1
>>>> >> (rc=0)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice:
>>>> >>>>> te_rsc_command: Initiating action 79: stop fs-mail_stop_0 on
>>>> >> mail1
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_process_request: Completed cib_modify operation for section
>>>> >> status:
>>>> >>>> OK
>>>> >>>>> (rc=0, origin=mail1/crmd/198, version=0.197.20)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: --- 0.197.20 2
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: Diff: +++ 0.197.21 (null)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: + /cib: @num_updates=21
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info:
>>>> >>>>> cib_perform_op: +
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fs-mail']/lrm_rsc_op[@id='fs-mail_last_0']:
>>>> >>>>> @operation_key=fs-mail_stop_0, @operation=stop,
>>>> >>>>> @transition-key=79:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, @transition-magic=0:0;79:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, @call-id=1280, @last-run=1457613441, @last-rc-change=1457613441, @exec-time=88, @queue-time=1
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=mail1/crmd/199, version=0.197.21)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: match_graph_event: Action fs-mail_stop_0 (79) confirmed on mail1 (rc=0)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: te_rsc_command: Initiating action 77: stop fs-spool_stop_0 on mail1
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_perform_op: Diff: --- 0.197.21 2
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_perform_op: Diff: +++ 0.197.22 (null)
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_perform_op: + /cib: @num_updates=22
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fs-spool']/lrm_rsc_op[@id='fs-spool_last_0']: @operation_key=fs-spool_stop_0, @operation=stop, @transition-key=77:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, @transition-magic=0:0;77:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, @call-id=1282, @last-run=1457613441, @last-rc-change=1457613441, @exec-time=86
>>>> >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=mail1/crmd/200, version=0.197.22)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: match_graph_event: Action fs-spool_stop_0 (77) confirmed on mail1 (rc=0)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: warning: run_graph: Transition 1236 (Complete=11, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-303.bz2): Terminated
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: warning: te_graph_trigger: Transition failed: terminated
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_graph: Graph 1236 with 12 actions: batch-limit=12 jobs, network-delay=0ms
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 16]: Completed pseudo op network-services_stopped_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 15]: Completed pseudo op network-services_stop_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 12]: Completed rsc op virtualip-1_stop_0 on mail1 (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 84]: Completed pseudo op fs-services_stopped_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 83]: Completed pseudo op fs-services_stop_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 77]: Completed rsc op fs-spool_stop_0 on mail1 (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 79]: Completed rsc op fs-mail_stop_0 on mail1 (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 90]: Completed pseudo op mail-services_stopped_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 89]: Completed pseudo op mail-services_stop_0 on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 86]: Pending rsc op postfix_monitor_45000 on mail2 (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: * [Input 85]: Unresolved dependency rsc op postfix_start_0 on mail2
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 5]: Completed rsc op postfix_stop_0 on mail1 (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: print_synapse: [Action 8]: Completed pseudo op all_stopped on N/A (priority: 0, waiting: none)
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: do_log: FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
>>>> >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
>>>> >>>>> Mar 10 13:37:26 [7415] HWJ-626.domain.local cib: info: cib_process_ping: Reporting our current digest to mail2: 3896ee29cdb6ba128330b0ef6e41bd79 for 0.197.22 (0x1544a30 0)
>>>> >>
>>>> >>
>>>> >
>>>>
>>>>
>>>
>>
>
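[Editor's note, not part of the original thread: two follow-ups suggested by the log above can be sketched as shell commands. The pe-input path is taken from the run_graph line; the 60s value is purely illustrative, and the commands must run on a cluster node, so treat this as a hedged sketch rather than a tested recipe.]

```shell
# Replay the terminated transition offline from the file the pengine
# saved, to see why postfix_start_0 on mail2 was left unresolved.
crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-303.bz2

# The stalled start was only retried when the next recheck fired, so the
# worst-case downtime equals cluster-recheck-interval (5 min here).
# A smaller interval caps that window, at the cost of more frequent
# policy-engine runs.
pcs property set cluster-recheck-interval=60s

# Confirm the new value took effect.
pcs property show cluster-recheck-interval
```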