In today's data trend, there is an exponential increase in the amount of data being stored and processed. As datasets grow larger, balancing cost factors with usability and performance becomes a significant challenge.
Organizations today are not only focused on storing immense volumes of data but also seek solutions that optimize data accessibility and performance.
TiDB's placement rules address this challenge by enabling the strategic allocation of data, optimizing both performance and resource utilization, and managing costs effectively. By intelligently distributing data, TiDB optimizes performance and resource utilization, effectively managing costs while ensuring data remains readily available when needed.
Placement Rules in TiDB
TiDB allow you to control the placement of data within a TiKV cluster through SQL statements. This feature lets you organize data at the cluster, database, table, or partition level assigning it to specific regions, data centers, racks, or hosts as needed.
Use Cases :
Example:
For one of our clients, a finance company handling a high volume of invoices, managing data efficiently became a challenge due to all the data being stored in a single table. To address this, we proposed partitioning the table by date ranges and implementing TiDB placement rules to optimize storage and performance.
By partitioning the table, we were able to store the most recent data on hot storage for fast access and move older data to cold storage, reducing costs. TiDB placement rules allowed us to assign each partition to the appropriate storage tier, ensuring optimal performance and cost efficiency.
Here’s how we achieved the placement policies:
Steps to apply Placement policies
Create placement labels in the TiKV nodes. Two nodes are performance and one is high performance node.
Checking the placement labels :
mysql> SHOW PLACEMENT LABELS;
+-------------+-----------------+
| Key | Values |
+-------------+-----------------+
| performance | ["high", "low"] |
| zone | ["z1", "z2"] |
+-------------+-----------------+
2 rows in set (0.01 sec)
Creating Placement policies :
High performance node
mysql> CREATE PLACEMENT POLICY hot CONSTRAINTS="[+performance=high]";
Query OK, 0 rows affected (0.97 sec)
Low performance node
mysql> CREATE PLACEMENT POLICY cold CONSTRAINTS="[+performance=low]";
Query OK, 0 rows affected (1.02 sec)
Checking the TiKV nodes :
mysql> SELECT store_id, address, label FROM information_schema.tikv_store_status ;
+----------+------------------+---------------------------------------------------------------------------+
| store_id | address | label |
+----------+------------------+---------------------------------------------------------------------------+
| 1 | 172.17.0.5:20160 | [{"key": "performance", "value": "low"}, {"key": "zone", "value": "z1"}] |
| 2 | 172.17.0.7:20160 | [{"key": "performance", "value": "high"}, {"key": "zone", "value": "z2"}] |
| 5 | 172.17.0.6:20160 | [{"key": "performance", "value": "low"}, {"key": "zone", "value": "z1"}] |
+----------+------------------+---------------------------------------------------------------------------+
3 rows in set (0.01 sec)
Create or Alter the table with partitions :
CREATE TABLE invoice (
invoice_id INT AUTO_INCREMENT,
amount DECIMAL(10, 2),
invoice_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
customer_id INT,
PRIMARY KEY (invoice_id, invoice_date)
)
PARTITION BY RANGE (UNIX_TIMESTAMP(invoice_date)) (
PARTITION p01 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-01 00:00:00')) PLACEMENT POLICY=cold,
PARTITION p02 VALUES LESS THAN (UNIX_TIMESTAMP('2024-06-01 00:00:00')) PLACEMENT POLICY=cold,
PARTITION p03 VALUES LESS THAN (UNIX_TIMESTAMP('2024-11-01 00:00:00')) PLACEMENT POLICY=cold,
PARTITION p04 VALUES LESS THAN (UNIX_TIMESTAMP('2024-12-01 00:00:00')) PLACEMENT POLICY=hot,
PARTITION p05 VALUES LESS THAN (MAXVALUE) PLACEMENT POLICY=hot
);
P01: Invoices before Jan 01, 2024 (cold storage).
P02: Invoices before Jun 01, 2024 (cold storage).
P03: Invoices before Nov 01, 2024 (cold storage).
P04: Invoices before Dec 01, 2024 (hot storage).
P05: Invoices after Dec 01, 2024 (hot storage).
Verify the region and partition mapping :
mysql> SELECT trs.partition_name,trp.region_id, trp.store_id FROM information_schema.tikv_region_status trs JOIN information_schema.tikv_region_peers trp ON trp.region_id = trs.region_id where trs.db_name='test' AND trs.table_name='invoice' AND trs.is_index = 0 AND trp.is_leader = 1 order by region_id;
+----------------+-----------+----------+
| PARTITION_NAME | region_id | store_id |
+----------------+-----------+----------+
| p01 | 25005 | 5 |
| p02 | 25009 | 5 |
| p03 | 25013 | 1 |
| p04 | 25017 | 2 |
| p05 | 25021 | 2 |
+----------------+-----------+----------+
5 rows in set (0.02 sec)
When we query the current data, the high performance node can be used. And for the old data, low performance nodes can be used. This can be achieved by using the regions that are mapped below.
Checking the region usage on queries
Use the trace command to check the partition pruning and the region usage for the data.
Query check 1
Checking the region request for the recent data (invoice_date >= '2024-11-01 00:00:00' AND invoice_date < '2024-11-10 00:00:00' ).
mysql> Trace SELECT * FROM invoice WHERE invoice_date >= '2024-11-01 00:00:00' AND invoice_date < '2024-11-10 00:00:00';
+------------------------------------------------------------------------+-----------------+------------+
| operation | startTS | duration |
+------------------------------------------------------------------------+-----------------+------------+
| trace | 15:44:14.478221 | 7.02749ms |
| ├─session.ExecuteStmt | 15:44:14.478231 | 2.597407ms |
| │ ├─executor.Compile | 15:44:14.478302 | 954.345µs |
| │ └─session.runStmt | 15:44:14.479297 | 1.486641ms |
| │ └─TableReaderExecutor.Open | 15:44:14.480585 | 175.913µs |
| │ └─distsql.Select | 15:44:14.480664 | 64.599µs |
| │ └─regionRequest.SendReqCtx | 15:44:14.480966 | 3.088868ms |
| │ └─rpcClient.SendRequest, region ID: 25017, type: Cop | 15:44:14.481033 | 2.946535ms |
| │ └─tikv.RPC | 15:44:14.481037 | 1.024465ms |
| │ ├─tikv.Wait | 15:44:14.481037 | 170.095µs |
| │ │ └─tikv.GetSnapshot | 15:44:14.481037 | 95.042µs |
| │ └─tikv.Process | 15:44:14.481207 | 4.12µs |
| └─*executor.TableReaderExecutor.Next | 15:44:14.480859 | 3.428497ms |
+------------------------------------------------------------------------+-----------------+------------+
13 rows in set (0.01 sec)From the output we can see the query requesting the region 25017 (Hot data) which is in partition P04, high performance node.
Query check 2
Checking the region request for the old data (invoice_date >= '2024-01-01 00:00:00' AND invoice_date < '2024-02-01 00:00:00').
mysql> Trace SELECT * FROM invoice WHERE invoice_date >= '2024-01-01 00:00:00' AND invoice_date < '2024-02-01 00:00:00';
+------------------------------------------------------------------------+-----------------+-------------+
| operation | startTS | duration |
+------------------------------------------------------------------------+-----------------+-------------+
| trace | 15:46:05.285536 | 23.206115ms |
| ├─session.ExecuteStmt | 15:46:05.285545 | 2.119478ms |
| │ ├─executor.Compile | 15:46:05.285585 | 865.709µs |
| │ └─session.runStmt | 15:46:05.286495 | 1.119266ms |
| │ └─TableReaderExecutor.Open | 15:46:05.287456 | 134.934µs |
| │ └─distsql.Select | 15:46:05.287510 | 55.647µs |
| │ └─regionRequest.SendReqCtx | 15:46:05.287774 | 20.168645ms |
| │ └─rpcClient.SendRequest, region ID: 25009, type: Cop | 15:46:05.287841 | 20.052818ms |
| │ └─tikv.RPC | 15:46:05.287845 | 19.036333ms |
| │ ├─tikv.Wait | 15:46:05.287845 | 119.631µs |
| │ │ └─tikv.GetSnapshot | 15:46:05.287845 | 37.721µs |
| │ └─tikv.Process | 15:46:05.287965 | 150.857µs |
| └─*executor.TableReaderExecutor.Next | 15:46:05.287695 | 20.355452ms |
+------------------------------------------------------------------------+-----------------+-------------+
13 rows in set (0.03 sec)Here the query requesting the region 25009 (Cold data) which is in partition P01, low performance node.
Limitations
Miru IT Park, Vallankumaranvillai,
Nagercoil, Tamilnadu - 629 002.
Unit 303, Vanguard Rise,
5th Main, Konena Agrahara,
Old Airport Road, Bangalore - 560 017.
Call: +91 6383016411
Email: sales@mafiree.com