Official Mafiree Blogs

Keeping you informed about Databases

Significance of using Invisible Primary key (GIPK) with MySQL 8.0
By Sukan   |   February 21, 2023   |   Posted in : MySQL
How Generated Invisible Primary Keys (GIPK) Can Boost Your Database Efficiency

Achieving High Availability Using Log Shipping
By Sujith   |   February 21, 2018   |   Posted in : SQL Server
Here we will get the detailed explanation of how we can achieve HA using Log Shipping.

SQL Server On Linux
By Sujith   |   March 30, 2019   |   Posted in : SQL Server
SQL Server 2017 brings the best features of the Microsoft relational database engine to the enterprise Linux ecosystem, including SQL Server Agent, Azure Active Directory (Azure AD) authentication, best-in-class high availability/ disaster recovery, and unparalleled data security.

Types of SQL Server Replication
By Sujith   |   February 19, 2018   |   Posted in : SQL Server
Understanding on the types of replication in SQL Server and the type apt for your traffic pattern.

Data File Splitting in SQL Server
By Murali   |   March 15, 2021   |   Posted in : SQL Server
Huge data blocks resides under single MDF file might cause a performance of the query and impact the application services. This blog help you to understand how we can the split the single MDF file into multiple data files.

 Home  /  MySQL  
Real time data streaming from MariaDB to Columnstore
By Sukan   |   May 28, 2019   |   Posted in : MySQL

In the recent past there has been a massive business shift towards analytics database. Unlike traditional databases, analytics database always stands out in running big data queries. No matter how much we optimize, transaction databases are not suitable for these kind of workload.

Business these days are gripped in running statistics on top of everything they have, for them these reports are critical as they can analyse, optimize and apply them to boost their productivity. So it’s no brainer to look for an analytic storage database.

Choosing the right analytic storage database is not an easy task, as you have to consider many factors such as type of workload, queries, storage, compression.. Etc. If you are looking for one, please refer to the previous blog in this series. Next thing would be streaming the data to the analytic database, ETL jobs are the foremost thing that comes in everyone’s mind. Although it is a proven one, it takes a lot of time to write queries and unfortunately if you are on a tight pipeline you don’t have much time to write these jobs.

Fortunately we have few options with maxscale, proxysql , tungsten which eliminates the need for ETL.

In this blog, we can see how we have achieved real time data streaming from transactional to analytics database.

Background of the setup

  • MariaDB as Transactional DB
  • MariaDB columnstore as Analytics DB
  • MariaDB Maxscale as the binlog reader

Requirements on MariaDB

  • Need to enable binary logging
  • Need to update binary log format as ROW
  • Need to create a user to stream the data from MariaDB

Requirements on Maxscale

  • Configure replication_router and replication_listener in the maxscale config
  • Replication user needs to be used in the maxscale config to communicate with the MariaDB.

Architecture


Steps

  • Maxscale will read the events from the binary log and it uses avro / json format to load the data into the columnstore using the CDC adaptor.
  • CDS adaptor is a utility which will act as an intermediate between Maxscale service and analytics DB.
  • Once it is ready, we can call the mxs_adapter utility, it will start the data streaming to columnstore thus eliminating the need for an ETL job.

Conclusion

ETL is time consuming which is tough to maintain, needs to be updated regularly, whereas this service is easy to deploy and maintain. With this recent trend on analytics, data streaming from transactional database becomes much simpler and hassle free.




1 Comments


Praveen Adepu      August 02, 2020 03:18 am


Nice article. We are trying to implement the real time data streaming from MariaDB 10.5 to ColumnStore 1.5 using MaxScale 2.5. Unfortunately, the CDC adaptor is deprecated in MaxScale 2.5. We tried HTAP, but it is very slow. Can I please know what is the alternative for the CDC adapter in MaxScale 2.5?

Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate

Copyright © 2019 - All Rights Reserved - Mafiree