The Challenge of Growing Databases
As businesses grow, so do their databases. Whatever is your database, either MySQL or PostgreSQL, managing data growth within your main transaction database can become a daunting task. Without an effective data archival strategy, you may face challenges:
Rising Resource Costs: Consider a database with 10TB of data and 5+ replicas. Each replica requires more storage space, this will push up the infrastructure costs significantly.
Slower Query Performance: Larger datasets mean more rows to process, especially for complex joins, leading to longer response times.
Expensive Backups: A bigger database also means a more expensive and time-consuming backup process.
Where Should the Archived Data Go?
Deciding where to store archived data depends on your business needs. Here’s a simplified process to help you determine the best approach:
Identify Key Tables: Start by pinpointing the top 10 tables or collections that consume the most space. Our exclusive dashboard helps easier for the clients to get the top 10 tables residing in a database.
Understand Their Use Cases: Determine how these tables are used. For example, in an E-commerce setup the data such as session information or cache can often be archived quickly, while completed orders or invoices might still be needed for analytics.
Choose Storage Solutions
Archived data can be stored in cold storage (like Amazon S3) or hot storage on a separate server. The key is to ensure the data is accessible but doesn’t burden your main transaction database.
Introducing ClickHouse
One of the most efficient solutions for data archival is ClickHouse, an OLAP (Online Analytical Processing) database that processes data in columns rather than rows. Here’s why ClickHouse is advantageous:
Compression: It compresses data, reducing storage requirements.
Resource Efficiency: It uses fewer resources compared to traditional databases.
SQL Support: ClickHouse supports SQL, making it easy to integrate with existing systems.
A Practical Example
For one of our clients, their database had grown to over 10TB. After extensive discussions, we implemented an archival strategy that involved, selecting the data from MySQL for the order related tables whose order date was older than 1 year and storing them in clickhouse before doing the purging on master.
Data Size before archival:
Archival Flow:

Data Size after archival:

The Result
By moving older data to ClickHouse, one of our E-commerce clients saw a significant reduction in their infrastructure costs—by 50%!
This example highlights the power of a well-planned data archival strategy in managing growing databases effectively.
Conclusion
Data archival is not just about freeing up space; it’s about optimising performance and cutting costs. With databases like ClickHouse, businesses can manage their data growth efficiently, ensuring that their infrastructure remains cost-effective and responsive.
For more insights on how ClickHouse and data archival can benefit your business, stay tuned!
Miru IT Park, Vallankumaranvillai,
Nagercoil, Tamilnadu - 629 002.
Unit 303, Vanguard Rise,
5th Main, Konena Agrahara,
Old Airport Road, Bangalore - 560 017.
Call: +91 6383016411
Email: sales@mafiree.com