Official Mafiree Blogs

Keeping you informed about Databases

Significance of using Invisible Primary key (GIPK) with MySQL 8.0
By Sukan   |   February 21, 2023   |   Posted in : MySQL
How Generated Invisible Primary Keys (GIPK) Can Boost Your Database Efficiency

Achieving High Availability Using Log Shipping
By Sujith   |   February 21, 2018   |   Posted in : SQL Server
Here we will get the detailed explanation of how we can achieve HA using Log Shipping.

SQL Server On Linux
By Sujith   |   March 30, 2019   |   Posted in : SQL Server
SQL Server 2017 brings the best features of the Microsoft relational database engine to the enterprise Linux ecosystem, including SQL Server Agent, Azure Active Directory (Azure AD) authentication, best-in-class high availability/ disaster recovery, and unparalleled data security.

Types of SQL Server Replication
By Sujith   |   February 19, 2018   |   Posted in : SQL Server
Understanding on the types of replication in SQL Server and the type apt for your traffic pattern.

Data File Splitting in SQL Server
By Murali   |   March 15, 2021   |   Posted in : SQL Server
Huge data blocks resides under single MDF file might cause a performance of the query and impact the application services. This blog help you to understand how we can the split the single MDF file into multiple data files.

 Home  /  PostgreSQL  
Auto Vacuum in PostgreSQL Explained
By Jerwin   |   May 27, 2019   |   Posted in : PostgreSQL

In the recent time in one of our environment we observed that the auto vacuum process has been running for long time during the production hours which has impacted heavily. In this blog post we are going to discuss about the auto vacuum process, its types, how it works, how to reduce the time taken by the auto vacuum process and what are the parameters associated with it. Ok let’s get started.

Postgres uses MVCC - Multi-Version Concurrency Control to handle the consistency. When any insert or update operations are carried out, the row is not physically deleted. So after a series of multiple insert/update/delete operations there are lot of dead spaces available causing bloat. Hence the vacuum process is done as the cleanup for removing the bloat and reclaiming the space.

How to find bloat in a tupule?

Bloat refers to nothing but the dead spaces that are being accumulated on the database. Thus after a delete/update the new set of rows created are referred to as the "live" rows, and the older set are referred to as "dead" rows. We can use the below query to check the live and dead rows status.

Types of Vacuum:

Vacuum process generally involves two types which are explained below:

Standard VACUUM

This process runs in parallel along with the application hits to reclaim the space. It does not affects the performance since it does not hold any lock on the tables. While running auto vacuum even if indexing is carried on, then the auto vacuum process gets cancelled by itself. This is the most recommended option to perform space reclamation.

Full VACUUM

On the other hand Full Vacuum process holds an exclusive lock on the table that has been currently working on. So it creates a huge impact on the performance of the database. This process tends to clear more dead spaces and it happens slowly.

Autovacuum process:

It is a daemon process that runs automatically when the postgres service has been started. It is just a launcher process that runs every time and it performs clearance only when it achieves the threshold to clear the old dead space. We check the process by the below command:

It automates the execution of VACUUM and ANALYZE commands. The vacuum command is used to clear the dead spaces and analyze command is used to gather the statistical information about the contents of tables.

For example consider the below scenario in which we delete a row and reclaim the space for a table.

Checking the table count:

Deleting a record from the table:

Checking the table bloat now:

Applying Analyze command:

Applying vacuum command:

Checking the table bloat again and now we could see the dead tuples are cleared now.

How auto vacuum runs?

The auto vacuum(VACUUM+ANALYZE) process begins to reclaim the dead space when it reaches the threshold based on the below formula.

Analyze/Vacuum scale factor- Fraction of table of records

Analyze/Vacuum threshold- Number of records to trigger the auto vacuum process.

Increasing the frequency of auto vacuum process by increasing the threshold of the autovacuum_vacuum_scale_factor,autovacuum_vacuum_analyze_factor parameters will reduce the auto vacuum process execution time. Thus we can optimise the Auto Vacuum by avoiding frequent run.

Purpose of Auto Vacuuming:

  • To reclaim the dead tuple spaces

  • To speed up the index scans and query performance

  • To update data statistics used by the query planner

  • To protect data loss of old data

Hope you get a better idea when and how to use the auto vacuum in PostgreSQL. Please let us know if you have any queries in the comments section.

 




0 Comments


Leave a Comment

Name *

Email *

Comment *



Search All Blogs



Need Support?


Solutions      Services      Resources      About Mafiree

More than 3000 servers monitored

Certified database Administrators

24X7X365 Support

Happy Clients


ENQUIRE NOW
For Sales Related Queries, Please Call Our Sales Experts at

 +91-80-41155993


Meet Mafiree on social networks!

     

PCI Certificate

Copyright © 2019 - All Rights Reserved - Mafiree