When managing large-scale databases, efficient and reliable data import is essential, especially during initial setup or migration from legacy systems. TiDB Lightning is a high-performance data import tool designed for the TiDB distributed SQL database. This guide covers what you need to get started with TiDB Lightning.
What is TiDB Lightning?
TiDB Lightning is the tool in the TiDB ecosystem for importing large volumes of data into a TiDB cluster quickly. It is particularly useful for large-scale migrations, where import throughput and scalability matter most.
Purpose
The primary purpose of TiDB Lightning is large-scale import, most often the initial load of data into a TiDB cluster. It processes data at high speed, which is crucial for handling large datasets efficiently, and it protects data integrity during the import with mechanisms such as global checksums and metadata coordination.
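TiDB Lightning supports the following file formats: files exported by Dumpling, CSV files, and Apache Parquet files generated by Amazon Aurora.
TiDB Lightning can read data from the following sources: local storage, Amazon S3, and Google Cloud Storage.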
TiDB Lightning import modes
TiDB Lightning supports two import modes, selected with the backend configuration item. The import mode determines how data is written into TiDB.
Physical Import Mode:
TiDB Lightning first encodes the data into key-value pairs and stores them in a local temporary directory, then uploads these key-value pairs to each TiKV node, and finally calls the TiKV Ingest interface to insert the data into TiKV's RocksDB. For an initial import, consider the physical import mode, which is the faster of the two. The backend for the physical import mode is local.
Logical Import Mode:
TiDB Lightning first encodes the data into SQL statements and then runs these statements directly to import the data. Use the logical import mode if the target cluster is in production or if the target tables already contain data. The backend for the logical import mode is tidb.
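The choice between the two modes can be sketched as a small decision helper. This is only an illustration of the guidance above: the function name `choose_backend` and its two parameters are hypothetical, and a real decision would also weigh resources and cluster version.

```python
def choose_backend(target_tables_empty: bool, cluster_in_production: bool) -> str:
    """Return the `backend` value suggested by the guidance above."""
    # Logical import mode is the safer choice when the cluster is serving
    # traffic or the target tables already hold data.
    if cluster_in_production or not target_tables_empty:
        return "tidb"
    # Otherwise (for example, an initial import) physical mode is faster.
    return "local"


print(choose_backend(target_tables_empty=True, cluster_in_production=False))   # local
print(choose_backend(target_tables_empty=False, cluster_in_production=False))  # tidb
```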
| Import mode | Physical Import Mode | Logical Import Mode |
| --- | --- | --- |
| Backend | local | tidb |
| Speed | Fast (100~500 GiB/hour) | Slow (10~50 GiB/hour) |
| Resource consumption | High | Low |
| Network bandwidth consumption | High | Low |
| ACID compliance during import | No | Yes |
| Target tables | Must be empty | Can contain data |
| TiDB cluster version | >= 4.0.0 | All |
| Whether the TiDB cluster can provide service during import | Limited service | Yes |
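To get a feel for what the speed row means in practice, the following sketch turns the throughput ranges from the table into a rough duration estimate. The function name `estimate_import_hours` is hypothetical, and the figures are only the table's ranges; actual throughput depends on hardware, schema, and network.

```python
# Throughput ranges in GiB/hour, copied from the comparison table.
THROUGHPUT_GIB_PER_HOUR = {
    "local": (100, 500),  # physical import mode
    "tidb": (10, 50),     # logical import mode
}


def estimate_import_hours(data_size_gib: float, backend: str) -> tuple[float, float]:
    """Return (best_case_hours, worst_case_hours) for the given data size."""
    slowest, fastest = THROUGHPUT_GIB_PER_HOUR[backend]
    return data_size_gib / fastest, data_size_gib / slowest


best, worst = estimate_import_hours(1000, "local")
print(f"physical: {best:.0f} to {worst:.0f} hours")  # physical: 2 to 10 hours
best, worst = estimate_import_hours(1000, "tidb")
print(f"logical: {best:.0f} to {worst:.0f} hours")   # logical: 20 to 100 hours
```

For a 1000 GiB dataset, the table's ranges imply hours in physical mode versus possibly days in logical mode, which is why physical mode is usually preferred for initial loads.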
TiDB Lightning Architecture
Use TiDB Lightning
Prepare the source data
Use Dumpling to export the data from MySQL/TiDB.
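As a rough illustration, the export step could be scripted as follows. This sketch only builds and prints a `tiup dumpling` command; the host, port, thread count, and file size are placeholder assumptions, and the helper name `dumpling_command` is hypothetical. Check the Dumpling documentation for the options that suit your source database.

```python
import shlex


def dumpling_command(host: str, port: int, user: str, output_dir: str,
                     threads: int = 8, file_size: str = "256MiB") -> list[str]:
    """Build a `tiup dumpling` argument list that exports SQL files."""
    return [
        "tiup", "dumpling",
        "-h", host,
        "-P", str(port),
        "-u", user,
        "--filetype", "sql",
        "-t", str(threads),  # number of export threads
        "-F", file_size,     # split each table into files of about this size
        "-o", output_dir,    # should match data-source-dir in the Lightning config
    ]


cmd = dumpling_command("127.0.0.1", 3306, "root", "/data/dump/")
print(shlex.join(cmd))
# To actually run the export: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path or user name contains spaces.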
Run TiDB Lightning
Set the Config
Create the configuration file tidb-lightning.toml based on your cluster information:
[lightning]
# Logging
level = "info"
file = "tidb-lightning.log"
[tikv-importer]
# Uses the Local-backend
backend = "local"
# Sets the directory for temporarily storing the sorted key-value pairs.
# The target directory must be empty.
sorted-kv-dir = "/data/tidb-tmp/"
[mydumper]
# Local source data directory
data-source-dir = "/data/dump/"
# Configures the wildcard rule. By default, all tables in the mysql, sys, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA, METRICS_SCHEMA, and INSPECTION_SCHEMA system databases are filtered.
# If this item is not configured, the "cannot find schema" error occurs when system tables are imported.
filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']
[tidb]
# Information of the target cluster
host = "172.17.0.6"
port = 4000
user = "root"
password = ""
# Table schema information is fetched from TiDB via this status-port.
status-port = 10080
# The PD address of the cluster
pd-addr = "172.17.0.5:2379"
Start TiDB Lightning
tiup tidb-lightning -config tidb-lightning.toml > lightning.out
If the data is imported successfully, the output contains the message tidb lightning exit successfully.
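For unattended runs, the success check can be automated. The sketch below looks for the success message quoted in this guide in the captured output file; the helper name `import_succeeded` is hypothetical, and the exact message may differ between TiDB Lightning versions, so treat the marker string as an assumption to verify against your own output.

```python
from pathlib import Path

# Success message as quoted in this guide; verify it against your version.
SUCCESS_MARKER = "tidb lightning exit successfully"


def import_succeeded(output_path: str, marker: str = SUCCESS_MARKER) -> bool:
    """Return True if the captured output contains the success marker."""
    path = Path(output_path)
    if not path.exists():
        return False
    return marker in path.read_text(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    print(import_succeeded("lightning.out"))
```

If the check returns False, the log file configured in the [lightning] section is the next place to look for the cause.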
Limitations
Physical Import Mode Limitations
Logical Import Mode Limitations
When running multiple TiDB Lightning instances, do not mix backends: the physical and logical import modes must not be used on the same cluster at the same time.