Mafiree logo
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
Schedule a Call
Menu
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
  • Schedule a Call
Database
Database Database Managed Services
MySQL MySQL
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB MongoDB
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL PostgreSQL
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
SQL Server MSSQL
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Aerospike
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
Analytics DB
Analytics DB Analytics DB Services
Clickhouse Clickhouse
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB TiDB
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB ScyllaDB
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
DevOps
DevOps DevOps Services
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Networking
Networking Networking Services
Basic Basic
Advanced Advanced
MySQL MySQL
MongoDB MongoDB
PostgreSQL PostgreSQL
MSSQL MSSQL
Aerospike Aerospike
Clickhouse Clickhouse
TiDB TiDB
ScyllaDB ScyllaDB
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Basic Basic
Advanced Advanced
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
  1. Home
  2. > Blogs
  3. > MySQL
  4. > MySQL Schema Migration Without Downtime: A Real Fintech Case Study

MySQL Schema Migration Without Downtime: A Real Fintech Case Study

Schema changes on large MySQL tables can bring production systems to a halt if not handled correctly. This case study walks through how Mafiree helped a fintech client execute a zero-downtime MySQL schema migration on a 500M+ row production database — covering the real challenges faced, the three-phase tool strategy using gh-ost, pt-online-schema-change, and MySQL 8.0 INSTANT DDL, production configuration settings with performance benchmarks, and best practices for safely evolving your MySQL schema without impacting users

Jethish February 20, 2026

Subscribe for email updates

Every growing application eventually faces a defining moment: the database schema that served you well at launch no longer fits the evolving business logic. Adding a column, changing a data type, or creating an index sounds trivial in development, but on a production MySQL table with hundreds of millions of rows, a single ALTER TABLE can lock your application for hours and bring your service down.

This is the story of how Mafiree helped a fintech client migrate the schema of a 500M+ row MySQL table in production with zero downtime, zero data loss, and no impact on end-user experience. If you are looking for a battle-tested approach to MySQL schema migration, this guide breaks down the tools, trade-offs, and exact steps we followed.

 

Running MySQL at 100M+ Rows?

Why MySQL Schema Migrations Are Risky in Production

MySQL’s default ALTER TABLE behavior rebuilds the entire table. For small tables, this completes in seconds. For tables with hundreds of millions of rows, the implications are severe:

 

  • Table Locking: Depending on the operation, MySQL may acquire a metadata lock that blocks all reads and writes for the duration of the ALTER. Even “online” DDL operations require brief exclusive locks at the start and end of the operation.
  • Replication Lag: On replica-based architectures, the ALTER runs sequentially on each replica after the primary completes. A 4-hour ALTER on the primary means 4+ hours of replication lag on every replica.
  • Disk and I/O Pressure: Rebuilding a large table doubles the disk space temporarily, and the heavy I/O can starve other queries of resources.
  • Rollback Complexity: If the ALTER fails midway, rolling back is not instantaneous. You may end up with a partially altered table and need to restore from backup.

Expert Tip: If your production MySQL tables exceed 10 million rows, never run a raw ALTER TABLE during business hours. Even MySQL 8.0’s Instant DDL has limitations that can catch teams off guard.

The Client Challenge: A Real-World MySQL Schema Migration

Our client, a fast-growing fintech company, operated a transaction processing system backed by MySQL 8.0 running on AWS RDS. Their core transactions table had grown to over 500 million rows and was serving 3,000+ queries per second at peak.

The business requirement was straightforward: add three new columns to support a regulatory compliance feature, modify an existing column’s data type from INT to BIGINT to accommodate larger transaction IDs, and add two composite indexes to support new reporting queries. The constraint was equally clear: zero downtime, no replication lag exceeding 5 seconds, and the migration had to complete within a single maintenance window.

 

The Schema Changes Required

-- Required changes on `transactions` table (500M+ rows) 
ALTER TABLE transactions  ADD COLUMN compliance_flag TINYINT DEFAULT 0,  ADD COLUMN review_timestamp DATETIME NULL,  ADD COLUMN audit_ref VARCHAR(64) NULL,  MODIFY COLUMN txn_id BIGINT NOT NULL,  ADD INDEX idx_compliance (compliance_flag, review_timestamp),  ADD INDEX idx_audit (audit_ref, created_at);

Running this as a single ALTER TABLE would have taken an estimated 6–8 hours based on our benchmarks on a staging replica, with full table lock for the MODIFY COLUMN operation. This was unacceptable.

Facing a similar challenge? Talk to our MySQL experts for a free migration assessment.

Evaluating MySQL Schema Migration Tools: gh-ost vs. pt-online-schema-change vs. Native DDL

Before executing, Mafiree team has evaluated three industry-standard approaches for zero-downtime MySQL schema migration. Each has distinct strengths depending on your environment and the type of change required.

Featuregh-ostpt-oscMySQL 8.0 DDL
MechanismBinlog-based replicationTrigger-based shadow tableNative InnoDB engine
Trigger DependencyNo triggers neededRequires 3 triggersNo triggers
ThrottlingBuilt-in, replica-awareManual configurationNo throttling
RollbackPause/abort anytimeDrop shadow tableLimited (depends on op)
Foreign Key SupportLimitedBetter supportFull native support
Best ForLarge tables, complex opsLegacy MySQL versionsSimple adds/drops

Our Zero-Downtime MySQL Schema Migration Strategy

After thorough evaluation, Mafiree Team adopted a phased approach, using different tools for different operations based on their risk profiles. This hybrid strategy minimized risk while maximizing efficiency.

 

Phase 1: Adding New Columns with MySQL 8.0 INSTANT DDL

MySQL 8.0 supports the INSTANT algorithm for adding columns at the end of a table. This modifies only metadata and completes in milliseconds, regardless of table size.

 

-- Phase 1: Instant column additions (< 1 second each) 
ALTER TABLE transactions ADD COLUMN compliance_flag TINYINT DEFAULT 0, ALGORITHM=INSTANT; ALTER TABLE transactions ADD COLUMN review_timestamp DATETIME NULL, ALGORITHM=INSTANT; ALTER TABLE transactions ADD COLUMN audit_ref VARCHAR(64) NULL, ALGORITHM=INSTANT;

Result: All three columns added in under 500 milliseconds total. Zero replication lag. Zero lock contention.

 

Expert Tip: INSTANT DDL in MySQL 8.0 only works for adding columns at the end of the table, adding or dropping virtual columns, and a few other metadata-only operations. Always verify with ALGORITHM=INSTANT explicitly; if the operation cannot be done instantly, MySQL will throw an error rather than silently falling back to a table-copy operation.

 

Phase 2: Column Type Change with gh-ost

The MODIFY COLUMN from INT to BIGINT required a full table rebuild. This was our highest-risk operation. We chose gh-ost for its binlog-based approach, which avoids triggers entirely and provides real-time throttling based on replica lag.

 

gh-ost \  --host=primary-mysql.internal \  --database=fintech_prod \  --table=transactions \  --alter="MODIFY COLUMN txn_id BIGINT NOT NULL" \  --chunk-size=2500 \  --max-lag-millis=4000 \  --throttle-control-replicas=replica1.internal,replica2.internal \  --critical-lag-millis=5000 \  --nice-ratio=0.5 \  --cut-over=default \  --exact-rowcount \  --concurrent-rowcount \  --execute

 

Key Configuration Decisions

 

chunk-size=2500: Mafiree team has started conservatively. On a table doing 3,000 QPS, smaller chunks keep the I/O overhead per cycle low and avoid overwhelming the buffer pool.

max-lag-millis=4000: gh-ost automatically pauses migration if any monitored replica exceeds 4 seconds of lag, keeping us well within our 5-second SLA.

nice-ratio=0.5: For every chunk copied, gh-ost waits an equal duration before the next chunk. This effectively halves the migration speed but ensures production queries are not starved.

cut-over=default: The final table swap uses an atomic RENAME TABLE operation with a brief metadata lock, typically completing in under 50 milliseconds.

Result: The migration ran for 4 hours and 22 minutes. Replica lag never exceeded 2.8 seconds. Application latency increased by only 3ms at the p99 level during peak migration throughput.

 

Phase 3: Index Creation with pt-online-schema-change

For the two new composite indexes, Mafiree team has used pt-online-schema-change. Index creation is generally safer than column modifications, and pt-osc’s trigger-based approach handles it reliably.

 

pt-online-schema-change \  --alter "ADD INDEX idx_compliance (compliance_flag, review_timestamp), \          ADD INDEX idx_audit (audit_ref, created_at)" \  --host=primary-mysql.internal \  D=fintech_prod,t=transactions \  --chunk-size=5000 \  --max-lag=4 \  --check-interval=2 \  --progress=time,30 \  --execute

Result: Both indexes created in 3 hours 15 minutes with zero impact on application performance.

➤ Need help planning your MySQL schema migration? Request a free assessment

MySQL Schema Migration Performance Best Practices

Based on hundreds of schema migrations across our client base, here are the production-tested best practices that prevent downtime and data loss.

1. Always Dry-Run on a Staging Replica First

Both gh-ost and pt-osc support dry-run modes. Never skip this step. A dry run on a replica with production-equivalent data reveals locking issues, unexpected I/O patterns, and accurate time estimates that pure theory cannot provide.

# gh-ost dry run (does everything except the final cut-over) gh-ost --host=staging-replica.internal \  --database=fintech_prod --table=transactions \  --alter="MODIFY COLUMN txn_id BIGINT NOT NULL" \  --test-on-replica --execute

2. Monitor Active Sessions Before Cut-Over

The most common cause of cut-over failure is not the tool but a long-running query or open session holding a metadata lock on the target table. In the minutes before cut-over, actively check for blocking sessions.

-- Check for sessions holding locks on the target table 
SELECT * FROM performance_schema.metadata_locks WHERE OBJECT_NAME = 'transactions' AND LOCK_STATUS = 'GRANTED'; 

-- Identify long-running queries 
SELECT ID, TIME, STATE, INFO FROM information_schema.PROCESSLIST WHERE TIME > 30 AND DB = 'fintech_prod';

3. Stage Schema Changes Ahead of Application Code

Decouple schema changes from application deployments. Apply schema changes first, then deploy code that uses the new columns or indexes. This ensures backward compatibility at every step: old application code works with the new schema, and new application code works with the new schema. If the code deployment fails, your schema change is still safe.

 

4. Plan for Rollback at Every Phase

Every migration phase should have a documented rollback plan. For gh-ost, you can abort mid-migration and the original table remains untouched. For pt-osc, dropping the shadow table reverts the process. For INSTANT DDL additions, a simple ALTER TABLE ... DROP COLUMN reverses the change. The key is knowing your rollback path before you start.

 

5. Monitor Replication Lag Continuously

Use dedicated monitoring that tracks replication lag at sub-second granularity throughout the migration window. Mafiree’s managed MySQL monitoring collects metrics every 5 minutes and provides real-time alerting on lag spikes, I/O saturation, and query latency anomalies during migration windows.

When to Use Each MySQL Schema Migration Approach

Use Native MySQL DDL When:

The operation is supported by ALGORITHM=INSTANT (adding columns at the end, dropping columns in MySQL 8.0.29+, renaming columns). The change is metadata-only and completes in milliseconds. You have verified the operation on a staging environment with ALGORITHM=INSTANT explicitly specified.

 

Use gh-ost When:

The table has more than 50 million rows and the operation requires a full table rebuild (column type changes, character set conversions). You need granular throttling based on replica lag. The table has existing triggers that conflict with pt-osc’s trigger-based approach.

 

Use pt-online-schema-change When:

You are running older MySQL versions (5.6, 5.7) where gh-ost’s binlog requirements may not be met. The table has foreign key relationships that need to be preserved during migration. You prefer a well-tested, widely-deployed tool with simpler configuration.

 

Planning a Large MySQL Schema Change?

Conclusion: Making MySQL Schema Migrations Routine, Not Risky

Zero-downtime MySQL schema migration is not about finding one magic tool. It is about understanding your specific change, choosing the right tool for each operation, and following a disciplined process of testing, monitoring, and staged execution. By splitting our client’s migration into three targeted phases, we turned what could have been an 8-hour outage into a seamless background operation.

The tools exist. The patterns are proven. What separates a smooth migration from a stressful incident is planning, expertise, and real-time monitoring. If you are scaling your MySQL environment and need professional guidance, Mafiree is here to help.

 

Ready to make your next MySQL schema migration stress-free? Request a free migration assessment today and discover how our team can help you evolve your database without downtime.

 

 

FAQ

Yes. MySQL 8.0 supports several INSTANT and INPLACE DDL operations that modify schemas with minimal or zero locking. For operations that require a full table rebuild, tools like gh-ost and pt-online-schema-change enable zero-downtime migrations by creating a shadow copy of the table and applying changes in the background while the original table continues serving traffic. When should you call an expert? Mafiree’s MySQL consulting services can help you identify which operations are safe for native DDL and which require specialized tooling, saving you from unexpected downtime.
gh-ost uses MySQL’s binary log to capture changes and replicate them to a shadow table, avoiding triggers entirely. pt-online-schema-change uses database triggers to capture inserts, updates, and deletes on the original table and replay them on the shadow table. gh-ost generally offers better throttling and is less invasive, while pt-osc has broader compatibility with foreign keys and older MySQL versions.
Migration time depends on table size, row width, I/O capacity, and how aggressively you can throttle. As a benchmark, our 500M-row migration with gh-ost took approximately 4.5 hours with conservative throttling (nice-ratio=0.5). A similar table with no throttling on dedicated hardware could complete in 1.5–2 hours, but this is rarely safe for production.
Yes, with caveats. RDS supports both gh-ost (in “migrate-on-replica” mode) and pt-online-schema-change. However, RDS restricts SUPER privileges and direct filesystem access, so you need to configure tools accordingly. Mafiree has extensive experience running zero-downtime migrations on RDS, Aurora, and self-managed MySQL environments. Need help with RDS migrations? Contact our team for a tailored strategy.
Critical metrics include replication lag across all replicas, InnoDB buffer pool hit ratio, disk I/O utilization, active connection count, query latency at p95 and p99, and the migration tool’s progress output. A spike in any of these should trigger a pause or abort. Mafiree’s managed MySQL services include 24/7 monitoring of these metrics with automated alerting to ensure migrations complete safely. Explore our managed MySQL monitoring and consulting services to ensure your next migration runs smoothly.

Author Bio

Jethish

Jethish is a PostgreSQL DBA at Mafiree with expertise in building scalable, reliable, and high-performance database infrastructures. He focuses on PostgreSQL architecture, replication strategies, performance tuning, and high availability for mission-critical systems. Through his technical writing, he shares clear, practical insights on database internals, replication choices, load balancing, and cross-database integrations that help engineers and DBAs tackle real-world data challenges.

Leave a Comment

Related Blogs

MySQL Architecture Explained: Performance Tuning & Troubleshooting Guide

MySQL features a unique tiered architecture that separates query processing from data storage through its pluggable storage engine model. This guide explores the core components—from connection handling and the SQL optimizer to the physical storage of data on disk. By understanding how engines like InnoDB provide ACID compliance and row-level locking, you can significantly improve your database's scalability. We also break down the query execution workflow and provide actionable tips for performance tuning, such as optimizing the buffer pool. Whether you're managing a replica set or a standalone instance, mastering MySQL’s internal structure is essential for building high-performance applications.

  858 views
The Evolution of MariaDB : Latest Features and What It Offers Compared to MySQL

Discover how MariaDB 11.x is redefining open-source databases with cutting-edge features like system-versioned tables, native AI-ready vector support, UUIDv7 for scalable inserts, and enterprise-grade security—all in the Community Edition, without the paywall.

  2565 views
Stop Hackers at the Gate: Restricting Brute-Force Attacks with MySQL’s Connection Control Plugin

“Fortify Your MySQL Security: Slow Down Attackers with Connection Control Plugin” Learn how the MySQL Connection Control Plugin helps defend against brute-force login attempts by introducing intelligent, progressive delays—without locking out legitimate users.

  1413 views
8 Major MySQL 8.4 Changes That Every Team Should Review Before Migration

Explore the 8 critical changes in MySQL 8.4 you need to know before migrating your production environment. From authentication updates to InnoDB defaults, this release brings significant operational implications for developers and DBAs.

  278 views
Load Balancing in MySQL: Read and Write Traffic Optimization Using MySQL Router

Optimize MySQL traffic with MySQL Router — smart load balancing made easy

  2456 views

Subscribe for email updates

Get in touch with us

Highlights

More than 6000 Servers Monitored

Happy Clients

Certified DBAs

24 x 7 x 365 Support

PCI

Database Services

MySQL MongoDB PostgreSQL SQL Server Aerospike Clickhouse TiDB MariaDB Columnstore

Quick Links

Careers Blog Contact Privacy Policy Disclaimer Policy

Contacts

Linkedin Mafiree Facebook Mafiree Twitter Mafiree

Nagercoil Office

Miru IT Park, Vallankumaranvillai,

Nagercoil, Tamilnadu - 629 002.

Bangalore Office

Unit 303, Vanguard Rise,

5th Main, Konena Agrahara,

Old Airport Road, Bangalore - 560 017.

Call: +91 6383016411

Email: sales@mafiree.com


Copyright © - All Rights Reserved - Mafiree