Official Mafiree Blogs

Keeping you informed about Databases

 Home  /  MySQL  
Avoid deadlocks, certification failures on Galera Cluster using ProxySQL
By Sukan   |   July 06, 2019   |   Posted in : MySQL

One of the major limitations of galera cluster is deadlocks, it is quite common across multi-writer setups. This is mainly because of cluster wise optimistic locking. If your transaction per second is high and if you are updating same rows concurrently then deadlocks will be a major setback for your application. 

Yes, one can retry the failed transaction and hope that it succeed next time, obviously this is a hard way and it can increase the response time. If application is not designed to retry the failed transaction then setting the wsrep_retry_autocommit on galera nodes would automatically retry the failed transaction. 

If your table is bound to have more hotspots, then the frequency of the deadlock would be very high. 

Dealing with deadlocks
Our setup is a classic 3 Node MariaDB Galera cluster all three nodes serving reads and writes equally from the ProxySQL Load Balancer.

           One of our tables is having a hotspot that it will update the table row very frequently, thus our application will always bounce a deadlock error and during our peak transaction hour, almost most of the transactions are retried. 

Galera has few options to monitor and troubleshoot the deadlocks.

wsrep_local_bf_aborts - Total number of local transactions that were aborted by slave transactions while in execution.

wsrep_local_cert_failures - Total number of local transactions that failed certification test.
 

As seen above, during our peak transaction hours, deadlocks and certification failures are quite high in our systems. With wsrep_debug=ON and wsrep_log_conflicts=ON configuration one can enable additional logging and get the queries that are failed / retried. In our case, we know our hotspot queries and these are straight forward queries so we decided to test by sending all the writes to one node and reads to the remaining nodes. Fortunately we have ProxySQL 2.0 which has the native support to galera servers. 

The Solution 

We have put the ProxySQL to send write traffic to node 1 and read traffic to other nodes. This can be done by setting max_writers to 1 in mysql_galera_hostgroup_table and thus by setting up regex on mysql_query_rules we have achieved this without taking any downtime.

          Now writes are sent to node 1 and reads are sent to the nodes 2 and 3.  It is designed in such a way that in case of failure of node 1 other node 2 or 3 can be brought up as a writer thus eliminating the risk of having a downtime because of failed single writer. 

    Post these changes, we have seen a drastic dip in the number of deadlocks occurred and there are no certification failure happening in the cluster. 

This is a real improvement and now our applications are not facing any deadlock or certification failure.

Verdict 

If your database server can single handedly serve all the writes then running galera cluster with single node write would solve all your deadlock issues.  

 



0 Comments


Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate

Copyright © 2019 - All Rights Reserved - Mafiree