Mafiree logo
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
Schedule a Call
Menu
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
  • Schedule a Call
Database
Database Database Managed Services
MySQL MySQL
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB MongoDB
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL PostgreSQL
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
SQL Server MSSQL
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Aerospike
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
Analytics DB
Analytics DB Analytics DB Services
Clickhouse Clickhouse
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB TiDB
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB ScyllaDB
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
DevOps
DevOps DevOps Services
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Networking
Networking Networking Services
Basic Basic
Advanced Advanced
MySQL MySQL
MongoDB MongoDB
PostgreSQL PostgreSQL
MSSQL MSSQL
Aerospike Aerospike
Clickhouse Clickhouse
TiDB TiDB
ScyllaDB ScyllaDB
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Basic Basic
Advanced Advanced
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
  1. Home
  2. > Blogs
  3. > TiDB
  4. > Replicating Selective Rows from MySQL to TiDB

Replicating Selective Rows from MySQL to TiDB

Analytics with TiDB's Row-Wise Filtering. Discover how our client slashed costs and boosted flexibility with TiDB's granular data management features. Dive into the future of database optimization with Mafiree

sukan May 06, 2024

Subscribe for email updates

One of our esteemed customers, a prominent player in the realm of video streaming services, faced a unique challenge. Tasked with delivering an exclusive feature tailored for Platinum subscription holders, their MySQL database was straining under the weight of resource-intensive aggregate queries.

A separate dashboard, log streaming, and other essential features were in demand, making it clear that a more robust solution was needed.

We have created a TiDB cluster and configured the TiDB cluster as a replica for MySQL.

Detailed architecture can be referred from this post

Challenges

The adoption of TiDB brought about a significant improvement in query performance, thanks to its columnar facility, TiFlash.

However, the sheer volume of data, 5TB, to be replicated across various TiDB nodes for OLAP requests was proving to be a costly affair.

We uncovered a key feature in TiDB that proved to be a useful ie) Row-Wise Filtered Replication

In TiDB DM we have many filtering techniques while replicating the data. When we are replicating data in MySQL, MariaDB, Postgresql,etc we can filter the replication with the Database and Table level only.

In TiDB we can Filter the replication data in row level and also filter the binlog events. So In this blog we will see how these filtering methods are possible by using TiDB

The following example shows how this feature is working


 


In the above diagram we categorized users based on their usage plan and when we replicate it from the MySQL to the TiDB we only replicate the PREMIUM users.

The rows that have PREMIUM user type replicate to the TiDB. This one done by the filter technique used in the DM task.

Samples

Inserting the data into the table

  1. mysql> insert into details(id,name,age,user_type)values(1,'User1',22,'Premium'),(2,'User2',28,'regular'),(3,'User3',19,'Premium'),(4,'User4',21,'Trial');
  2. Query OK, 4 rows affected (0.19 sec)
  3. Records: 4  Duplicates: 0  Warnings: 0

Viewing the data on the MySQL

  1. mysql> select * from details order by id;
  2. +----+-------+------+-----------+
  3. | id | name  | age  | user_type |
  4. +----+-------+------+-----------+
  5. |  1 | User1 |   22 | Premium   |
  6. |  2 | User2 |   28 | Regular   |
  7. |  3 | User3 |   19 | Premium   |
  8. |  4 | User4 |   21 | Trial     |
  9. +----+-------+------+-----------+
  10. 4 rows in set (0.01 sec)

This output has 4 rows, now let’s check on TiDB side

  1. MySQL [users]> select * from details;
  2. +----+-------+------+-----------+
  3. | id | name  | age  | user_type |
  4. +----+-------+------+-----------+
  5. |  1 | User1 |   22 | Premium   |
  6. |  3 | User3 |   19 | Premium   |
  7. +----+-------+------+-----------+
  8. 2 rows in set (0.00 sec)

With this replication row filter, TiDB has only rows with user_type is PREMIUM.

Another feature of TiDB DM replication is we have filter based on binlog events, events such as deletes, drops can be avovied in replication.

Following are the example where we can have multiple filters,

ParameterDescription
schema_patternFilters matching the schema
table_patternFilters matching the table
eventsBinlog events (create,insert, update,delete, etc)
actionDo or ignore

Example of how deletes are not replicated to TiDB,

In upstream MySQL we are deleting all record from the table.

  1. mysql> delete from details;
  2. Query OK, 12 rows affected (0.09 sec)
  3.  
  4. mysql> select * from details;
  5. Empty set (0.02 sec)

Now there are no records in the table and we are checking the TiDB cluster replica.

  1. MySQL [users]> select * from details;
  2. +----+-------+------+-----------+
  3. | id | name  | age  | user_type |
  4. +----+-------+------+-----------+
  5. |  1 | User1 |   22 | Premium   |
  6. |  3 | User3 |   19 | Premium   |
  7. +----+-------+------+-----------+
  8. 2 rows in set (0.00 sec)

There are no records deleted from the TiDB cluster. This one is done by the binlog filter configuration.

Conclusion

In conclusion, TiDB stands out with its native support for features like row-wise filtering, empowering users to optimize storage and compute costs significantly. By selectively replicating only the necessary rows for analytics, our client experienced tangible savings while ensuring seamless data accessibility.

Unlike other databases that often provide only table-level filtration, TiDB offers a superior level of flexibility with its granular row-wise filtering capabilities. This ensures that users have precise control over their data, facilitating more efficient and tailored data management strategies.

At Mafiree, we’re committed to providing comprehensive solutions tailored to your specific requirements. Our dedicated team of experts is poised to review your setup and offer customized solution architecture recommendations.

For any inquiries or assistance, please don’t hesitate to contact sales@mafiree.com. Your database challenges are our priority, and we’re here to help you navigate them effectively.

Author Bio

sukan

Sukan is Database Team Lead at Mafiree with over a decade of experience in database systems, architecture, and performance optimization. He specializes in MySQL, MongoDB, TiDB, and ClickHouse, developing architectural improvements that make data platforms faster, more efficient, and cost-effective. Sukan writes about practical database engineering topics, real-world performance tuning, data replication, and high-scale system design, drawing from extensive hands-on experience solving complex technical challenges.

Leave a Comment

Comments

S
Sukan December 09, 2024

Hey @Jayson, Costly means the storage & compute cost.

J
Jayson July 03, 2024

can you define "costly" in this sentence "However, the sheer volume of data, 5TB, to be replicated across various TiDB nodes for OLAP requests was proving to be a costly affair." Is it in terms of performance, latency, or bandwidth?

Related Blogs

Manage Large Data Import in TiDB

Efficiently import massive datasets into TiDB clusters using TiDB Lightning for faster analytics and scalable performance.

  1270 views
Automate Expired Data Deletion in TiDB with TTL

Say goodbye to manual cleanup! Learn how TiDB’s TTL feature automates expired data deletion and keeps your database lean and efficient.

  1805 views
Sync-Diff-Inspector: Your Go-To Tool for Verifying Data Consistency Across TiDB and MySQL

A powerful tool to compare, detect, and fix data inconsistencies across distributed databases—ensure accurate migrations and seamless replication with Sync-Diff-Inspector.

  1645 views
The Feature You’ve Been Looking For – Speed Up Your Queries with TiDB Cached Tables!

Boost database performance with TiDB cached tables—reduce latency and accelerate read-heavy queries.

  1523 views
Introducing Foreign Keys and more: Exploring New Features in TiDB v8.5.0

Explore the latest features in TiDB v8.5.0, including support for Foreign Keys, improved performance, and enhanced security for modern database management.

  1803 views

Subscribe for email updates

Get in touch with us

Highlights

More than 6000 Servers Monitored

Happy Clients

Certified DBAs

24 x 7 x 365 Support

PCI

Database Services

MySQL MongoDB PostgreSQL SQL Server Aerospike Clickhouse TiDB MariaDB Columnstore

Quick Links

Careers Blog Contact Privacy Policy Disclaimer Policy

Contacts

Linkedin Mafiree Facebook Mafiree Twitter Mafiree

Nagercoil Office

Miru IT Park, Vallankumaranvillai,

Nagercoil, Tamilnadu - 629 002.

Bangalore Office

Unit 303, Vanguard Rise,

5th Main, Konena Agrahara,

Old Airport Road, Bangalore - 560 017.

Call: +91 6383016411

Email: sales@mafiree.com


Copyright © - All Rights Reserved - Mafiree