One of our esteemed customers, a prominent player in the realm of video streaming services, faced a unique challenge. Tasked with delivering an exclusive feature tailored for Platinum subscription holders, their MySQL database was straining under the weight of resource-intensive aggregate queries.
A separate dashboard, log streaming, and other essential features were in demand, making it clear that a more robust solution was needed.
We have created a TiDB cluster and configured the TiDB cluster as a replica for MySQL.
Detailed architecture can be referred from this post
Challenges
The adoption of TiDB brought about a significant improvement in query performance, thanks to its columnar facility, TiFlash.
However, the sheer volume of data, 5TB, to be replicated across various TiDB nodes for OLAP requests was proving to be a costly affair.
We uncovered a key feature in TiDB that proved to be a useful ie) Row-Wise Filtered Replication
In TiDB DM we have many filtering techniques while replicating the data. When we are replicating data in MySQL, MariaDB, Postgresql,etc we can filter the replication with the Database and Table level only.
In TiDB we can Filter the replication data in row level and also filter the binlog events. So In this blog we will see how these filtering methods are possible by using TiDB
The following example shows how this feature is working

In the above diagram we categorized users based on their usage plan and when we replicate it from the MySQL to the TiDB we only replicate the PREMIUM users.
The rows that have PREMIUM user type replicate to the TiDB. This one done by the filter technique used in the DM task.
Samples
Inserting the data into the table
mysql> insert into details(id,name,age,user_type)values(1,'User1',22,'Premium'),(2,'User2',28,'regular'),(3,'User3',19,'Premium'),(4,'User4',21,'Trial');Query OK, 4 rows affected (0.19 sec)Records: 4 Duplicates: 0 Warnings: 0Viewing the data on the MySQL
mysql> select * from details order by id;+----+-------+------+-----------+| id | name | age | user_type |+----+-------+------+-----------+| 1 | User1 | 22 | Premium || 2 | User2 | 28 | Regular || 3 | User3 | 19 | Premium || 4 | User4 | 21 | Trial |+----+-------+------+-----------+4 rows in set (0.01 sec)This output has 4 rows, now let’s check on TiDB side
MySQL [users]> select * from details;+----+-------+------+-----------+| id | name | age | user_type |+----+-------+------+-----------+| 1 | User1 | 22 | Premium || 3 | User3 | 19 | Premium |+----+-------+------+-----------+2 rows in set (0.00 sec)With this replication row filter, TiDB has only rows with user_type is PREMIUM.
Another feature of TiDB DM replication is we have filter based on binlog events, events such as deletes, drops can be avovied in replication.
Following are the example where we can have multiple filters,
| Parameter | Description |
|---|---|
| schema_pattern | Filters matching the schema |
| table_pattern | Filters matching the table |
| events | Binlog events (create,insert, update,delete, etc) |
| action | Do or ignore |
Example of how deletes are not replicated to TiDB,
In upstream MySQL we are deleting all record from the table.
mysql> delete from details;Query OK, 12 rows affected (0.09 sec)mysql> select * from details;Empty set (0.02 sec)Now there are no records in the table and we are checking the TiDB cluster replica.
MySQL [users]> select * from details;+----+-------+------+-----------+| id | name | age | user_type |+----+-------+------+-----------+| 1 | User1 | 22 | Premium || 3 | User3 | 19 | Premium |+----+-------+------+-----------+2 rows in set (0.00 sec)There are no records deleted from the TiDB cluster. This one is done by the binlog filter configuration.
Conclusion
In conclusion, TiDB stands out with its native support for features like row-wise filtering, empowering users to optimize storage and compute costs significantly. By selectively replicating only the necessary rows for analytics, our client experienced tangible savings while ensuring seamless data accessibility.
Unlike other databases that often provide only table-level filtration, TiDB offers a superior level of flexibility with its granular row-wise filtering capabilities. This ensures that users have precise control over their data, facilitating more efficient and tailored data management strategies.
At Mafiree, we’re committed to providing comprehensive solutions tailored to your specific requirements. Our dedicated team of experts is poised to review your setup and offer customized solution architecture recommendations.
For any inquiries or assistance, please don’t hesitate to contact sales@mafiree.com. Your database challenges are our priority, and we’re here to help you navigate them effectively.
Hey @Jayson, Costly means the storage & compute cost.
can you define "costly" in this sentence "However, the sheer volume of data, 5TB, to be replicated across various TiDB nodes for OLAP requests was proving to be a costly affair." Is it in terms of performance, latency, or bandwidth?
Miru IT Park, Vallankumaranvillai,
Nagercoil, Tamilnadu - 629 002.
Unit 303, Vanguard Rise,
5th Main, Konena Agrahara,
Old Airport Road, Bangalore - 560 017.
Call: +91 6383016411
Email: sales@mafiree.com