Mafiree logo
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
Schedule a Call
Menu
  • About
  • Services
  • Blogs
  • Careers
  • Products
    • orbit logo Orbit
    • streamer logo Xstreami
  • Contact
  • Schedule a Call
Database
Database Database Managed Services
MySQL MySQL
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB MongoDB
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL PostgreSQL
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
SQL Server MSSQL
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Aerospike
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
Analytics DB
Analytics DB Analytics DB Services
Clickhouse Clickhouse
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB TiDB
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB ScyllaDB
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
DevOps
DevOps DevOps Services
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Networking
Networking Networking Services
Basic Basic
Advanced Advanced
MySQL MySQL
MongoDB MongoDB
PostgreSQL PostgreSQL
MSSQL MSSQL
Aerospike Aerospike
Clickhouse Clickhouse
TiDB TiDB
ScyllaDB ScyllaDB
Version Control Version Control
Kubernetes Kubernetes
Infrastructure Infrastructure Management
Web Servers Web Servers
Basic Basic
Advanced Advanced
MySQL Consulting Services
MySQL Migration Services
MySQL Optimization & Query Tuning
MySQL Database Administration
MySQL Backup & Recovery
MySQL Security & Maintenance
MySQL Cloud Services (AWS RDS, Aurora, Google Cloud SQL, Azure)
MySQL for Ecommerce
MySQL High Availability & Replication
MongoDB Consulting Services
MongoDB Migration Services
MongoDB Optimization & Query Tuning
MongoDB Database Administration
MongoDB Backup & Recovery
MongoDB Security & Maintenance
MongoDB Cloud (Atlas)
MongoDB Solutions by Industry
MongoDB High Availability & Replication
PostgreSQL Consulting
PostgreSQL Migration & Upgrades
Performance Tuning & Query Optimization
PostgreSQL Administration & Managed Services
High Availability, Clustering & Replication
PostgreSQL Backup, Recovery & Disaster Planning
PostgreSQL Security, Compliance & Auditing
PostgreSQL for Analytics & Data Warehousing
PostgreSQL on Cloud & Containers
PostgreSQL Extensions & Open-Source Integrations
PostgreSQL for Every Industry
MSSQL Consulting Services
MSSQL Migration Services
MSSQL Optimization & Query Tuning Services
MSSQL Database Administration Services
MSSQL Backup & Recovery Services
MSSQL High Availability & Replication Services
MSSQL Security & Compliance Services
MSSQL Performance Monitoring & Health Checks
MSSQL Solutions by Industry
Aerospike Consulting Services
Aerospike Migration Services
Aerospike Performance Optimization & Tuning
Aerospike Database Administration
Aerospike Backup & Recovery
Aerospike High Availability
Aerospike Cloud & Hybrid Deployments
Aerospike for Real-Time Applications (AdTech, FinTech, Retail, IoT)
ClickHouse Consulting Services
ClickHouse Migration Services
ClickHouse Optimization & Query Tuning
ClickHouse Database Administration
ClickHouse Backup & Recovery
ClickHouse Security & Maintenance
ClickHouse Cloud Services (ClickHouse Cloud, AWS, GCP, Azure)
ClickHouse Solutions by Industry
ClickHouse High Availability & Replication
TiDB Consulting & Architecture Planning
TiDB Administration & Maintenance
TiDB Security and Privacy Maintenance
TiDB Performance & Query Optimization
TiDB Migration Services
TiDB Backup & Disaster Recovery
TiDB High Availability Solutions
TiDB Solutions by Industry
TiDB Cloud Services
ScyllaDB Consulting & Architecture Planning
ScyllaDB Administration & Maintenance
ScyllaDB Security and Privacy Maintenance
ScyllaDB Performance & Query Optimization
ScyllaDB Migration Services
ScyllaDB Backup & Disaster Recovery
ScyllaDB High Availability Solutions
ScyllaDB Solutions by Industry
ScyllaDB Cloud Services
  1. Home
  2. > Blogs
  3. > SQL Server
  4. > Data File Splitting in SQL Server

Data File Splitting in SQL Server

Huge data blocks resides under single MDF file might cause a performance of the query and impact the application services. This blog help you to understand how we can the split the single MDF file into multiple data files.

Murali March 15, 2021

Subscribe for email updates

Introduction:-

Most of us do not think about using multiple files for the single database until data grows larger. Query performance starts to perform poorly on the PROD environment because the data blocks reside on a single MDF file. Let’s discuss what are all the techniques in SQL server to improve the performance by creating multiple data files and evenly distribute the data from old data files in this blog. There are two ways to achieve this,

  1. Splitting the single data files eventually to multiple files.
  2. Adding a filegroup by moving the clustered index.

Splitting the single data files eventually to multiple files:-

Step 1: Add required new empty data files to the file group.
Step 2: Disable the autogrowth on new files.
Step 3: Need to empty an existing single data file (original).
Step 4: Re-enable autogrowth.
Step 5: Shrink the original datafile to the same size of the new files.

The reason we disabled autogrowth on the five new files is to prevent the original file from getting too empty. In order to balance all the data files we need to disable autogrowth on the newly created files.

Process behind shrink command with empty option:-

  • Shrink command takes the hot data from the tail of the data file and transfers it into the newly added files.
  • It uses a proportional fill algorithm that will evenly distribute the data to the new files.So the database objects blocks are shared across the multiple data files.

Note: Number of additional files is calculating depending on what level we shrink the existing data file.

Originality of data files before the activity:-

-> A represents the Size of the data files

Originality of data files post the activity:-

-> A represents the Size of the data files

Performance during the activity:-

Disk Queue Length:-

Log Growth:-

During this operation we will be facing a high IO and unexpected heavy log growth.So overcome this log growth,we need to change the recovery model FULL to BULK-LOGGED where the servers in standalone setup.With this change,we can avoid the drastical log growth and we can’t change the recovery model where the servers in a high availability mode.

Benefits:-

This process is done online and it does not affect the live object.

Limitations:-

  • We might face a lot of I/O during the activity.
  • Log growth is high during the activity window.So it should cause the setup’s using always-on,database mirroring and log shipping because all of them need to be shipped.

Adding File-Group Method:-

Normally,data is stored in the leaf node of the clustered index.The File Group Method is nothing but the objects are moved to the new file group by rebuilding the clustered index on the new FG.

Existence of clustered index:-

  1. Creating a FG with new data files.
  2. Drop the clustered index on the existing default FG(Primary).
  3. Rebuild the clustered index on the new FG which we created.
  4. Shrink the old data files which reside in the default FG.

Non-Existence of clustered index:-

  1. Creating a FG with new data files. .
  2. Create the clustered index on the new FG.
  3. Drop the clustered index on the new FG if not needed..
  4. Shrink the old data files which reside in the default FG.

Benefits:-

  • Not a risky Task.
  • Process is done online.
  • Better for HA setup.

Limitations:-

  • If there is no clustered index,it’s a time taking process for larger tables.
  • It will give performance if it is in a separate physical drive.
  • Expect log growth.

Impact of log growth against HA and standalone environment with splitting method:-

Standalone env:-

If we change the recovery model from FULL to BULK-LOGGED,it will reduce the log growth,but we will lose the point-in-time recovery incase of any crashes/failures during the activity.So you should take the hot backup full/diff depending upon your needs without any delay.

HA env:-

Increases of log growth will affect the secondary setup because it’s working concept depends upon the Transaction Log.We are not able to change the recovery model from FULL to BULK-LOGGED because it will work only on FULL recovery model.

We will use adding a filegroup approach for both HA and standalone environments,it might cause log growth depending on recreation of index for the tables which are in larger size and for gaining the performance the new FG should be in separate physical drives.

Comparison between the methods:-

Conclusion:-

Splitting data files in a primary filegroup is a time taking online process with an expense of log growth depending on your database sizes and the addition of new FG method is not bit risky task when compared to the first method.So as we conclude,If we are in a standalone environment,we will go for the splitting method and in a high availability environment,we should go for adding filegroup methods to reduce the impact on the PROD environment.


Leave a Comment

Related Blogs

How MS SQL Server Manages Data at Rest and Data in Motion

Data at Rest refers to inactive data stored in databases, while Data in Motion is actively transmitted or processed between systems in SQL Server.

  1215 views
How to Optimize SQL Server Performance: Implementing Online DDL

SQL Server 2022 added the "Wait at Low Priority" option that allows index creation and alteration to wait for resources to become available before executing.

  2680 views
Resumable Operations for ALTER TABLE Constraints in SQL Server 2022

Empower Your Database Management with SQL Server 2022's Resumable Operations

  1923 views
SQL Server On Linux

SQL Server 2017 brings the best features of the Microsoft relational database engine to the enterprise Linux ecosystem, including SQL Server Agent, Azure Active Directory (Azure AD) authentication, best-in-class high availability/ disaster recovery, and unparalleled data security.

  2234 views
Achieving High Availability Using Log Shipping

Here we will get the detailed explanation of how we can achieve HA using Log Shipping.

  2810 views

Subscribe for email updates

Get in touch with us

Highlights

More than 6000 Servers Monitored

Happy Clients

Certified DBAs

24 x 7 x 365 Support

PCI

Database Services

MySQL MongoDB PostgreSQL SQL Server Aerospike Clickhouse TiDB MariaDB Columnstore

Quick Links

Careers Blog Contact Privacy Policy Disclaimer Policy

Contacts

Linkedin Mafiree Facebook Mafiree Twitter Mafiree

Nagercoil Office

Miru IT Park, Vallankumaranvillai,

Nagercoil, Tamilnadu - 629 002.

Bangalore Office

Unit 303, Vanguard Rise,

5th Main, Konena Agrahara,

Old Airport Road, Bangalore - 560 017.

Call: +91 6383016411

Email: sales@mafiree.com


Copyright © - All Rights Reserved - Mafiree