[ClusterLabs] Mysql slave did not start replication after failure, and read-only IP also remained active on the much outdated slave

Thu Aug 25 18:02:44 CEST 2016

On 08/22/2016 03:56 PM, Attila Megyeri wrote:
> Hi Ken,
> 
> Thanks a lot for your feedback, my answers are inline.
> 
> 
> 
>> -----Original Message-----
>> From: Ken Gaillot [mailto:kgaillot at redhat.com]
>> Sent: Monday, August 22, 2016 4:12 PM
>> To: users at clusterlabs.org
>> Subject: Re: [ClusterLabs] Mysql slave did not start replication after failure,
>> and read-only IP also remained active on the much outdated slave
>>
>> On 08/22/2016 07:24 AM, Attila Megyeri wrote:
>>> Hi Andrei,
>>>
>>> I waited several hours, and nothing happened.
>>
>> And actually, we can see from the configuration you provided that
>> cluster-recheck-interval is 2 minutes.
>>
>> I don't see anything about stonith; is it enabled and tested? This looks
>> like a situation where stonith would come into play. I know that power
>> fencing can be rough on a MySQL database, but perhaps intelligent
>> switches with network fencing would be appropriate.
> 
> Yes, there is no stonith in place because we found it too agressive for this purpose. And to be honest I'm not sure if that would have worked here.

The problem is, in a situation like this where corosync communication is
broken, both nodes will think they are the only surviving node, and
bring up all resources. Network fencing would isolate one of the nodes
so at least it can't cause any serious trouble.

>> The "Corosync main process was not scheduled" message is the start of
>> the trouble. It means the system was overloaded and corosync didn't get
>> any CPU time, so it couldn't maintain cluster communication.
>>
>  
> True, this was the cause of the issue, but we still couldn't find a solution to get rid of the original problem.
> Nevertheless I think that the issue is that the RA did not properly detect the state of mysql.
> 
> 
>> Probably the most useful thing would be to upgrade to a recent version
>> of corosync+pacemaker+resource-agents. Recent corosync versions run with
>> realtime priority, which makes this much less likely.
>>
>> Other than that, figure out what the load issue was, and try to prevent
>> it from recurring.
>>
> 
> Whereas the original problem might have been caused by corosync CPU issue, I am sure that once the load was gone the proper mysql state should have been detected.
> The RA, responsible for this is almost the latest version, and I did not see any changes related to this functionality.
> 
>> I'm not familiar enough with the RA to comment on its behavior. If you
>> think it's suspect, check the logs during the incident for messages from
>> the RA.
>>
> 
> So did I, but there are very few details logged while this was happening, so I am pretty much stuck :(
> 
> I thought that someone might have a clue what is wrong in the RA  - that causes this fake state detection.
> (Un)fortunately I cannot reproduce this situation for now.

Perhaps with the split-brain, the slave tried to come up as master?

>  
> Who could help me in troubleshooting this?
> 
> Thanks,
> Attila
> 
> 
> 
> 
>>> I assume that the RA does not treat this case properly. Mysql was running,
>> but the "show slave status" command returned something that the RA was
>> not prepared to parse, and instead of reporting a non-readable attribute, it
>> returned some generic error, that did not stop the server.
>>>
>>> Rgds,
>>> Attila
>>>
>>>
>>> -----Original Message-----
>>> From: Andrei Borzenkov [mailto:arvidjaar at gmail.com]
>>> Sent: Monday, August 22, 2016 11:42 AM
>>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> <users at clusterlabs.org>
>>> Subject: Re: [ClusterLabs] Mysql slave did not start replication after failure,
>> and read-only IP also remained active on the much outdated slave
>>>
>>> On Mon, Aug 22, 2016 at 12:18 PM, Attila Megyeri
>>> <amegyeri at minerva-soft.com> wrote:
>>>> Dear community,
>>>>
>>>>
>>>>
>>>> A few days ago we had an issue in our Mysql M/S replication cluster.
>>>>
>>>> We have a one R/W Master, and a one RO Slave setup. RO VIP is
>> supposed to be
>>>> running on the slave if it is not too much behind the master, and if any
>>>> error occurs, RO VIP is moved to the master.
>>>>
>>>>
>>>>
>>>> Something happened with the slave Mysql (some disk issue, still
>>>> investigating), but the problem is, that the slave VIP remained on the
>> slave
>>>> device, even though the slave process was not running, and the server
>> was
>>>> much outdated.
>>>>
>>>>
>>>>
>>>> During the issue the following log entries appeared (just an extract as it
>>>> would be too long):
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Aug 20 02:04:07 ctdb1 corosync[1056]:   [MAIN  ] Corosync main process
>> was
>>>> not scheduled for 14088.5488 ms (threshold is 4000.0000 ms). Consider
>> token
>>>> timeout increase.
>>>>
>>>> Aug 20 02:04:07 ctdb1 corosync[1056]:   [TOTEM ] A processor failed,
>> forming
>>>> new configuration.
>>>>
>>>> Aug 20 02:04:34 ctdb1 corosync[1056]:   [MAIN  ] Corosync main process
>> was
>>>> not scheduled for 27065.2559 ms (threshold is 4000.0000 ms). Consider
>> token
>>>> timeout increase.
>>>>
>>>> Aug 20 02:04:34 ctdb1 corosync[1056]:   [TOTEM ] A new membership
>> (xxx:6720)
>>>> was formed. Members left: 168362243 168362281 168362282 168362301
>> 168362302
>>>> 168362311 168362312 1
>>>>
>>>> Aug 20 02:04:34 ctdb1 corosync[1056]:   [TOTEM ] A new membership
>> (xxx:6724)
>>>> was formed. Members
>>>>
>>>> ..
>>>>
>>>> Aug 20 02:13:28 ctdb1 corosync[1056]:   [MAIN  ] Completed service
>>>> synchronization, ready to provide service.
>>>>
>>>> ..
>>>>
>>>> Aug 20 02:13:29 ctdb1 attrd[1584]:   notice: attrd_trigger_update: Sending
>>>> flush op to all hosts for: readable (1)
>>>>
>>>> …
>>>>
>>>> Aug 20 02:13:32 ctdb1 mysql(db-mysql)[10492]: INFO: post-demote
>> notification
>>>> for ctdb1
>>>>
>>>> Aug 20 02:13:32 ctdb1 IPaddr2(db-ip-master)[10490]: INFO: IP status = ok,
>>>> IP_CIP=
>>>>
>>>> Aug 20 02:13:32 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-ip-master_stop_0 (call=371, rc=0, cib-update=179, confirmed=true) ok
>>>>
>>>> Aug 20 02:13:32 ctdb1 IPaddr2(db-ip-slave)[10620]: INFO: Adding inet
>> address
>>>> xxx/24 with broadcast address xxxx to device eth0
>>>>
>>>> Aug 20 02:13:32 ctdb1 IPaddr2(db-ip-slave)[10620]: INFO: Bringing device
>>>> eth0 up
>>>>
>>>> Aug 20 02:13:32 ctdb1 IPaddr2(db-ip-slave)[10620]: INFO:
>>>> /usr/lib/heartbeat/send_arp -i 200 -r 5 -p
>>>> /usr/var/run/resource-agents/send_arp-xxx eth0 xxx auto not_used
>> not_used
>>>>
>>>> Aug 20 02:13:32 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-ip-slave_start_0 (call=377, rc=0, cib-update=180, confirmed=true) ok
>>>>
>>>> Aug 20 02:13:32 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-ip-slave_monitor_20000 (call=380, rc=0, cib-update=181,
>> confirmed=false)
>>>> ok
>>>>
>>>> Aug 20 02:13:32 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-mysql_notify_0 (call=374, rc=0, cib-update=0, confirmed=true) ok
>>>>
>>>> Aug 20 02:13:32 ctdb1 attrd[1584]:   notice: attrd_trigger_update: Sending
>>>> flush op to all hosts for: master-db-mysql (1)
>>>>
>>>> Aug 20 02:13:32 ctdb1 attrd[1584]:   notice: attrd_perform_update: Sent
>>>> update 1622: master-db-mysql=1
>>>>
>>>> Aug 20 02:13:32 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-mysql_demote_0 (call=384, rc=0, cib-update=182, confirmed=true) ok
>>>>
>>>> Aug 20 02:13:33 ctdb1 mysql(db-mysql)[11160]: INFO: Ignoring post-
>> demote
>>>> notification for my own demotion.
>>>>
>>>> Aug 20 02:13:33 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-mysql_notify_0 (call=387, rc=0, cib-update=0, confirmed=true) ok
>>>>
>>>> Aug 20 02:13:33 ctdb1 mysql(db-mysql)[11185]: ERROR: check_slave
>> invoked on
>>>> an instance that is not a replication slave.
>>>>
>>>> Aug 20 02:13:33 ctdb1 crmd[1586]:   notice: process_lrm_event: LRM
>> operation
>>>> db-mysql_monitor_7000 (call=390, rc=0, cib-update=183,
>> confirmed=false) ok
>>>>
>>>> Aug 20 02:13:33 ctdb1 ntpd[1560]: Listen normally on 16 eth0 xxxx. UDP
>> 123
>>>>
>>>> Aug 20 02:13:33 ctdb1 ntpd[1560]: Deleting interface #12 eth0, xxx#123,
>>>> interface stats: received=0, sent=0, dropped=0, active_time=2637334 secs
>>>>
>>>> Aug 20 02:13:33 ctdb1 ntpd[1560]: peers refreshed
>>>>
>>>> Aug 20 02:13:33 ctdb1 ntpd[1560]: new interface(s) found: waking up
>> resolver
>>>>
>>>> Aug 20 02:13:40 ctdb1 mysql(db-mysql)[11224]: ERROR: check_slave
>> invoked on
>>>> an instance that is not a replication slave.
>>>>
>>>> Aug 20 02:13:47 ctdb1 mysql(db-mysql)[11263]: ERROR: check_slave
>> invoked on
>>>> an instance that is not a replication slave.
>>>>
>>>>
>>>>
>>>> And from this time, the last two lines repeat every 7 seconds (mysql
>>>> monitoring interval)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> The expected behavior was that the slave (RO) VIP should have been
>> moved to
>>>> the master, as the secondary db was outdated.
>>>>
>>>> Unfortunately I cannot recall what crm_mon was showing when the issue
>> was
>>>> present, but I am sure that the RA did not handle the situation properly.
>>>>
>>>>
>>>>
>>>> Placing the slave node into standby and the online resolved the issue
>>>> immediately (Slave started to sync, and in  a few minutes it catched up the
>>>> master).
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Here is the relevant config from the configuration:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> primitive db-ip-master ocf:heartbeat:IPaddr2 \
>>>>
>>>>                 params lvs_support="true" ip="XXX" cidr_netmask="24"
>>>> broadcast="XXX" \
>>>>
>>>>                 op start interval="0" timeout="20s" on-fail="restart" \
>>>>
>>>>                 op monitor interval="20s" timeout="20s" \
>>>>
>>>>                 op stop interval="0" timeout="20s" on-fail="block"
>>>>
>>>>
>>>>
>>>> primitive db-ip-slave ocf:heartbeat:IPaddr2 \
>>>>
>>>>                 params lvs_support="true" ip="XXX" cidr_netmask="24"
>>>> broadcast="XXX" \
>>>>
>>>>                 op start interval="0" timeout="20s" on-fail="restart" \
>>>>
>>>>                 op monitor interval="20s" timeout="20s" \
>>>>
>>>>                 op stop interval="0" timeout="20s" on-fail="block" \
>>>>
>>>>                 meta target-role="Started"
>>>>
>>>>
>>>>
>>>> primitive db-mysql ocf:heartbeat:mysql \
>>>>
>>>>                 params binary="/usr/bin/mysqld_safe"
>>>> config="/etc/mysql/my.cnf" datadir="/var/lib/mysql" user="mysql"
>>>> pid="/var/run/mysqld/mysqld.pid"
>> socket="/var/run/mysqld/mysqld.sock"
>>>> test_passwd="XXX" test_table="XXX" test_user="XXX"
>> replication_user="XXX"
>>>> replication_passwd="XXX" additional_parameters="--skip-slave-start" \
>>>>
>>>>                 op start interval="0" timeout="240s" on-fail="restart" \
>>>>
>>>>                 op stop interval="0" timeout="120s" on-fail="block" \
>>>>
>>>>                 op monitor interval="7" timeout="30s" on-fail="restart"
>>>> OCF_CHECK_LEVEL="1" \
>>>>
>>>>                 op promote interval="0" timeout="120" on-fail="restart" \
>>>>
>>>>                 op demote interval="0" timeout="120" on-fail="block"
>>>>
>>>>
>>>>
>>>> ms mysql db-mysql \
>>>>
>>>>                 meta notify="true" master-max="1" clone-max="2"
>>>> target-role="Started" is-managed="true"
>>>>
>>>>
>>>>
>>>> location db-ip-m-1 db-ip-master 0: ctdb1
>>>>
>>>> location db-ip-m-2 db-ip-master 0: ctdb2
>>>>
>>>> location db-ip-s-1 db-ip-slave 0: ctdb1
>>>>
>>>> location db-ip-s-2 db-ip-slave 0: ctdb2
>>>>
>>>> location db-ip-s-readable db-ip-slave \
>>>>
>>>>                 rule $id="rule-no-reader-slave" -inf: readable lt 1
>>>>
>>>
>>> How long did you wait? Conditions are reevaluated every
>>> cluster-recheck-interval which is 15 minutes by default.
>>>
>>>> location db-mysql-loc-1 mysql 100: ctdb1
>>>>
>>>> location db-mysql-loc-2 mysql 100: ctdb2
>>>>
>>>>
>>>>
>>>> colocation db-ip-slave-master -50: db-ip-slave db-ip-master
>>>>
>>>> colocation db-ip-with-master inf: db-ip-master mysql:Master
>>>>
>>>> colocation db-slave-on-db inf: db-ip-slave mysql
>>>>
>>>> order master-after-db inf: mysql db-ip-master
>>>>
>>>> order slave-after-db inf: mysql db-ip-slave
>>>>
>>>>
>>>>
>>>> property $id="cib-bootstrap-options" \
>>>>
>>>>                 dc-version="1.1.10-42f2063" \
>>>>
>>>>                 cluster-infrastructure="corosync" \
>>>>
>>>>                 symmetric-cluster="false" \
>>>>
>>>>                 cluster-recheck-interval="2m" \
>>>>
>>>>                 no-quorum-policy="stop" \
>>>>
>>>>                 stop-orphan-resources="false" \
>>>>
>>>>                 start-failure-is-fatal="false" \
>>>>
>>>>                 maintenance-mode="false"
>>>>
>>>> property $id="mysql_replication" \
>>>>
>>>>                 db-mysql_REPL_INFO="ctdb2|mysql-bin.002928|107"
>>>>
>>>> rsc_defaults $id="rsc-options" \
>>>>
>>>>                 resource-stickiness="0"
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Do you have any hints what could have gone wrong, and how we could
>> avoid
>>>> such issues in the future?
>>>>
>>>>
>>>>
>>>> Versions:
>>>>
>>>> Ubuntu Trusty Tahr
>>>>
>>>> Pacemaker 1.1.10
>>>>
>>>> Corosync 2.3.3
>>>>
>>>> Resource agents 3.9.3
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Thanks a lot in advance,
>>>>
>>>>
>>>>
>>>> Attila