Official Mafiree Blogs

Keeping you informed about Databases

 Home  /  MySQL  
Real time data streaming from MariaDB to Columnstore
By Sukan   |   May 28, 2019   |   Posted in : MySQL

In the recent past there has been a massive business shift towards analytics database. Unlike traditional databases, analytics database always stands out in running big data queries. No matter how much we optimize, transaction databases are not suitable for these kind of workload.

Business these days are gripped in running statistics on top of everything they have, for them these reports are critical as they can analyse, optimize and apply them to boost their productivity. So it’s no brainer to look for an analytic storage database.

Choosing the right analytic storage database is not an easy task, as you have to consider many factors such as type of workload, queries, storage, compression.. Etc. If you are looking for one, please refer to the previous blog in this series. Next thing would be streaming the data to the analytic database, ETL jobs are the foremost thing that comes in everyone's mind. Although it is a proven one, it takes a lot of time to write queries and unfortunately if you are on a tight pipeline you don’t have much time to write these jobs.

Fortunately we have few options with maxscale, proxysql , tungsten which eliminates the need for ETL.

In this blog, we can see how we have achieved real time data streaming from transactional to analytics database.

Background of the setup

  • MariaDB as transactional DB

  • MariaDB Columnstore as analytics DB

  • MariaDB Maxscale as the binlog reader

Requirements on MariaDB

  • Need to enable binary logging

  • Need to update binary log format as ROW

  • Need to create a user to stream the data from MariaDB.

Requirements on Maxscale

  • Configure replication_router and replication_listener in the maxscale config

  • Replication user needs to be used in the maxscale config to communicate with the MariaDB.

Architecture

 

 

Steps

  • Maxscale will read the events from the binary log and it uses avro / json format to load the data into the columnstore using the CDC adaptor.

  • CDS adaptor is a utility which will act as an intermediate between Maxscale service and analytics DB.

  • Once it is ready, we can call the mxs_adapter utility, it will start the data streaming to columnstore thus eliminating the need for an ETL job.

Conclusion

ETL is time consuming which is tough to maintain, needs to be updated regularly, whereas this service is easy to deploy and maintain. With this recent trend on analytics,  data streaming from transactional database becomes much simpler and hassle free.


 




0 Comments


Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate nasscom member

Copyright © 2019 - All Rights Reserved - Mafiree