Official Mafiree Blogs

Keeping you informed about Databases

Scaling Databases Horizontally: A Solution for High-Concurrency Systems
By Sukan   |   September 20, 2023   |   Posted in : TiDB
Blog post on how TiDB handles the OLAP requests without compromising OLTP

6 Interesting Features In MongoDB 6.0
By Thivakar   |   June 19, 2023   |   Posted in : MongoDB
Upgrade to MongoDB 6.0 for improved index management, error handling, powerful aggregation, replicaset enhancements, time series collection improvements, and embedded field exclusion. Experience efficient database management.

Access Control & Account Management In MySQL 8
By Sukan   |   June 08, 2023   |   Posted in : MySQL
Discover how MySQL 8 enhances security through features like password expiry, password validation, account lock/unlock, privilege management, and dual passwords. Safeguard your valuable data and streamline access control in your MySQL environment.

Significance of using Invisible Primary key (GIPK) with MySQL 8.0
By Sukan   |   February 21, 2023   |   Posted in : MySQL
How Generated Invisible Primary Keys (GIPK) Can Boost Your Database Efficiency

Achieving High Availability Using Log Shipping
By Sujith   |   February 21, 2018   |   Posted in : SQL Server
Here we will get the detailed explanation of how we can achieve HA using Log Shipping.

 Home  /  MySQL  
Real time data streaming from MariaDB to Columnstore
By Sukan   |   May 28, 2019   |   Posted in : MySQL

In the recent past there has been a massive business shift towards analytics database. Unlike traditional databases, analytics database always stands out in running big data queries. No matter how much we optimize, transaction databases are not suitable for these kind of workload.

Business these days are gripped in running statistics on top of everything they have, for them these reports are critical as they can analyse, optimize and apply them to boost their productivity. So it’s no brainer to look for an analytic storage database.

Choosing the right analytic storage database is not an easy task, as you have to consider many factors such as type of workload, queries, storage, compression.. Etc. If you are looking for one, please refer to the previous blog in this series. Next thing would be streaming the data to the analytic database, ETL jobs are the foremost thing that comes in everyone’s mind. Although it is a proven one, it takes a lot of time to write queries and unfortunately if you are on a tight pipeline you don’t have much time to write these jobs.

Fortunately we have few options with maxscale, proxysql , tungsten which eliminates the need for ETL.

In this blog, we can see how we have achieved real time data streaming from transactional to analytics database.

Background of the setup

  • MariaDB as Transactional DB
  • MariaDB columnstore as Analytics DB
  • MariaDB Maxscale as the binlog reader

Requirements on MariaDB

  • Need to enable binary logging
  • Need to update binary log format as ROW
  • Need to create a user to stream the data from MariaDB

Requirements on Maxscale

  • Configure replication_router and replication_listener in the maxscale config
  • Replication user needs to be used in the maxscale config to communicate with the MariaDB.

Architecture


Steps

  • Maxscale will read the events from the binary log and it uses avro / json format to load the data into the columnstore using the CDC adaptor.
  • CDS adaptor is a utility which will act as an intermediate between Maxscale service and analytics DB.
  • Once it is ready, we can call the mxs_adapter utility, it will start the data streaming to columnstore thus eliminating the need for an ETL job.

Conclusion

ETL is time consuming which is tough to maintain, needs to be updated regularly, whereas this service is easy to deploy and maintain. With this recent trend on analytics, data streaming from transactional database becomes much simpler and hassle free.




1 Comments


Praveen Adepu      August 02, 2020 03:18 am


Nice article. We are trying to implement the real time data streaming from MariaDB 10.5 to ColumnStore 1.5 using MaxScale 2.5. Unfortunately, the CDC adaptor is deprecated in MaxScale 2.5. We tried HTAP, but it is very slow. Can I please know what is the alternative for the CDC adapter in MaxScale 2.5?

Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate

Copyright © 2019 - All Rights Reserved - Mafiree