Official Mafiree Blogs

Keeping you informed about Databases

Achieving High Availability Using Log Shipping
By Sujith   |   February 21, 2018   |   Posted in : SQL Server
Here we will get the detailed explanation of how we can achieve HA using Log Shipping.

SQL Server On Linux
By Sujith   |   March 30, 2019   |   Posted in : SQL Server
SQL Server 2017 brings the best features of the Microsoft relational database engine to the enterprise Linux ecosystem, including SQL Server Agent, Azure Active Directory (Azure AD) authentication, best-in-class high availability/ disaster recovery, and unparalleled data security.

Types of SQL Server Replication
By Sujith   |   February 19, 2018   |   Posted in : SQL Server
Understanding on the types of replication in SQL Server and the type apt for your traffic pattern.

Data File Splitting in SQL Server
By Murali   |   March 15, 2021   |   Posted in : SQL Server
Huge data blocks resides under single MDF file might cause a performance of the query and impact the application services. This blog help you to understand how we can the split the single MDF file into multiple data files.

Dynamic query routing with ProxySQL
By Sabareesh   |   March 15, 2021   |   Posted in : Proxysql
How ProxySQL helps in achieving query routing without making a application changes.

 Home  /  MySQL  
Avoid deadlocks, certification failures on Galera Cluster using ProxySQL
By Sukan   |   July 06, 2019   |   Posted in : MySQL

One of the major limitations of galera cluster is deadlocks, it is quite common across multi-writer setups. This is mainly because of cluster wise optimistic locking. If your transaction per second is high and if you are updating same rows concurrently then deadlocks will be a major setback for your application.

Yes, one can retry the failed transaction and hope that it succeed next time, obviously this is a hard way and it can increase the response time. If application is not designed to retry the failed transaction then setting the wsrep_retry_autocommit on galera nodes would automatically retry the failed transaction.

If your table is bound to have more hotspots, then the frequency of the deadlock would be very high.

Dealing with deadlocks

Our setup is a classic 3 Node MariaDB Galera cluster all three nodes serving reads and writes equally from the ProxySQL Load Balancer.



One of our tables is having a hotspot that it will update the table row very frequently, thus our application will always bounce a deadlock error and during our peak transaction hour, almost most of the transactions are retried.

Galera has few options to monitor and troubleshoot the deadlocks.

wsrep_local_bf_aborts - Total number of local transactions that were aborted by slave transactions while in execution.

wsrep_local_cert_failures - Total number of local transactions that failed certification test.


As seen above, during our peak transaction hours, deadlocks and certification failures are quite high in our systems.
With wsrep_debug=ON and wsrep_log_conflicts=ON configuration one can enable additional logging and get the queries that are failed / retried. In our case, we know our hotspot queries and these are straight forward queries so we decided to test by sending all the writes to one node and reads to the remaining nodes.
Fortunately we have ProxySQL 2.0 which has the native support to galera servers.

The Solution

We have put the ProxySQL to send write traffic to node 1 and read traffic to other nodes. This can be done by setting max_writers to 1 in mysql_galera_hostgroup_table and thus by setting up regex on mysql_query_rules we have achieved this without taking any downtime.



Now writes are sent to node 1 and reads are sent to the nodes 2 and 3. It is designed in such a way that in case of failure of node 1 other node 2 or 3 can be brought up as a writer thus eliminating the risk of having a downtime because of failed single writer.

Post these changes, we have seen a drastic dip in the number of deadlocks occurred and there are no certification failure happening in the cluster


This is a real improvement and now our applications are not facing any deadlock or certification failure.

Verdict

If your database server can single handedly serve all the writes then running galera cluster with single node write would solve all your deadlock issues.




0 Comments


Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate

Copyright © 2019 - All Rights Reserved - Mafiree