Hope everyone has a visibility of replication. Let us not get into that. :)
In this blog you will get an introduction on various types of replication possible MSSQL, understanding the types of replication and the ideal type apt for your traffic pattern.
Types:
Snapshot Replication
Transactional Replication
Merge Replication
Before getting into the details, let us understand the components which are used for replication.
Consider replication topology as a publication of magazine, where the publisher publishes an article as a publication and then transfers to subscribers through distributors.
Now let us learn more about it.
Publisher:
A database instance (Source) that contains raw data to be replicated, like a master server.
Distributor:
A database instance (Intermediate agent) that will receive data from Publisher and send it to the subscriber.
NOTE: Distributor can be hosted along with publisher or subscriber or on a separate server.
Subscriber:
A database instance (Destination) that receives data from the distributor. NOTE: A Subscriber can receive data from multiple publishers via distributor.
Article:
A database object can be a table, stored procedure or selected table columns/rows to be replicated.
Publication:
A bundle of one or more articles from a single database.
Subscription:
The request to make a copy of publication to be delivered to Subscriber. NOTE: We can define what, where and when publication will be received in a subscription.
Push subscriptions:
Distribution agent or merge agent runs on the Distributor hence pushes the data to subscriber.
Pull subscriptions:
The distribution agent or merge agent runs on the subscriber, hence pull the data from the distributor.
Time to learn the types of replication,
Snapshot Replication:
The above image represents the data flow in snapshot replication.
Tables, databases (Known as Publication) to be replicated will be defined on the source server(Publisher). Snapshot agent in distributor takes snapshot of entire data and stores it in a snapshot folder. The distribution agent moves the snapshot to subscriber. Snapshot agent and distribution agent run based on a preconfigured schedule.
All the logs are maintained on the Distribution DB by the Distributor.
Pros:
No locking or downtime on publisher while taking the snapshot.
Cons:
Snapshot replication is more expensive in terms of overhead, network traffic. It takes place at defined intervals.
Locks are held during snapshot restoration, this can impact other users of the subscriber database.
Takeaways:
Snapshot replication can be used for applications, where the database size is small and latency is acceptable in available replicated data(subscriptions).
Transactional Replication:
Before configuring replication, Subscriber must contain same schema and data as the Publisher. Initial dataset can be replicated through a snapshot, backup or any other means, such as SQL Server Integration Services.
Note that every SQL Server instance has transaction log and that logs/tracks all changes are done on the databases.
Once the replication is configured, Log reader agent reads the transaction logs from publications and copies those transactions in batches to distribution database of Distributor. Then the transactions are moved from distribution DB to subscriber by a distribution agent.
Pros:
It can replicate the data from one Server to another Server with low latency, Near real-time data availability can be achieved
Takeaways:
Transactional replication is preferable for applications where Publisher has a very high data volume and traffic.
Merge Replication:
It is a bi-directional replication. Like transactional replication, initial data set can be replicated to the subscriber using snapshot or backups. The delta changes made in the server are recorded by a set of triggers in change tracking tables available at both the Publisher and Subscriber. Merge agent uses this recorded informations to synchronize the differences between publisher and all its subscribers. It uses a set of conflict-resolution rules to deal with all the problems that occur when two databases update the same data in different ways.
Pros:
It provides built-in and custom conflict resolution capabilities.
Cons:
Merge replication requires a higher hardware configuration and server maintenance.
Takeaways:
If an application requires active-active setup to handle failovers, merge replication is preferable.