Friday, January 10, 2014

Terms and Definitions for Database Replication

These terms and definitions should help when talking about replication. I've compiled them after many discussions, mostly with other Postgres developers, but the definitions are meant to be universally applicable and should apply to other RDBMSes as well.
Replication
For database systems, replication is the process of sharing transactional data to ensure consistency between redundant database nodes. This improves fault tolerance, which leads to better reliability of the overall system. Replication systems for databases may also be called distributed databases, especially when combined with the other features described below. I tend to count only eager multi-master systems as distributed databases, because anything less violates ACID principles and isn't transparent to the application.
Load Balancing
Replication is often coupled with load balancing to improve read performance. Some replication solutions have an integrated load balancer which knows about the underlying system. Others rely on OS-dependent or third-party load balancers.
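As a minimal sketch of the idea (all names here are made up for illustration): a replication-aware load balancer sends writes to the primary and spreads reads round-robin across the replicas.

```python
import itertools

# Hypothetical sketch: route writes to the primary, distribute reads
# round-robin over the replicas. Real balancers classify queries and
# track replica health far more carefully.

class ReadWriteRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, query):
        # Crude classification: anything starting with SELECT is a read.
        if query.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary

router = ReadWriteRouter("primary", ["replica1", "replica2"])
assert router.route("SELECT * FROM t") == "replica1"
assert router.route("SELECT 1") == "replica2"
assert router.route("UPDATE t SET x = 1") == "primary"
```

Note that such a router only helps if the application can tolerate slightly stale reads, or if replication is synchronous.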
Replication Methods
Databases can be kept coherent in many different ways. A very common and simple approach is statement-based replication, where SQL statements are distributed between the nodes. Non-deterministic functions like now() or random() pose problems for that method, because every node evaluates them independently. Another very common method is log shipping. Unfortunately, the database system's log is often not designed as an exchange format for replication, so it's hard to replicate only parts of a database. For that reason, some replication solutions use their own binary format, specifically designed for replication.
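The non-determinism problem can be sketched in a few lines (a toy model, not a real replication engine): if every node re-evaluates the statement itself, a call like random() yields different results on each node, whereas shipping the evaluated value keeps the replicas identical.

```python
import random

# Toy model of why statement-based replication breaks with
# non-deterministic functions. A "node" is just a list of applied values.

def apply_statement_based(nodes, statement):
    # Each node executes the statement itself, re-evaluating random().
    for node in nodes:
        node.append(statement())

def apply_value_based(nodes, statement):
    # The origin evaluates once and ships the resulting value.
    value = statement()
    for node in nodes:
        node.append(value)

node_a, node_b = [], []
apply_statement_based([node_a, node_b], lambda: random.random())
assert node_a != node_b  # each node drew its own random number

node_c, node_d = [], []
apply_value_based([node_c, node_d], lambda: random.random())
assert node_c == node_d  # shipping the value keeps replicas identical
```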
Divergence
Keeping data coherent across multiple nodes is quite expensive in terms of network latency. Many systems therefore avoid network delays by allowing the nodes to diverge slightly, meaning they allow conflicting transactions to commit. To return to a coherent and consistent database, those conflicts need to be resolved, either automatically or manually. Such conflicts violate the ACID properties of the database system, so the application needs to be aware of them.
Synchronous vs Asynchronous Replication
According to Wikipedia's definition of synchronization, database replication is considered data synchronization, as opposed to process synchronization. However, internally a database system has to synchronize locks, or at least transactions, which would clearly be considered process synchronization.
Within the context of database replication, the most common definition of synchronous replication is that as soon as a transaction is confirmed to be committed, all nodes of the cluster must have committed the transaction as well. This is very expensive in terms of latency and the number of messages to be sent, but it prevents divergence. In asynchronous replication systems, other nodes can apply the transactional data at any later point in time, so the nodes may serve different, possibly even conflicting, snapshots of the database.
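The difference can be illustrated with a toy model (the classes and functions here are invented for the example): a synchronous commit confirms only after every node has applied the transaction, while an asynchronous commit confirms immediately and leaves the replicas to catch up later.

```python
# Toy contrast of synchronous and asynchronous replication.

class Node:
    def __init__(self):
        self.committed = []

    def apply(self, txn):
        self.committed.append(txn)

def commit_sync(origin, replicas, txn):
    origin.apply(txn)
    for r in replicas:          # wait for every node before confirming
        r.apply(txn)
    return "committed"          # all nodes agree at this point

def commit_async(origin, replicas, txn, queue):
    origin.apply(txn)
    queue.append((replicas, txn))  # ship later; replicas lag behind
    return "committed"             # confirmed before replicas applied

o1, r1 = Node(), Node()
commit_sync(o1, [r1], "t1")
assert o1.committed == r1.committed == ["t1"]

o2, r2 = Node(), Node()
queue = []
commit_async(o2, [r2], "t2", queue)
assert o2.committed == ["t2"] and r2.committed == []  # divergent snapshots
```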
Eager vs Lazy Replication
The terms eager and lazy are used more often in the literature on database replication, sometimes synonymously with synchronous and asynchronous replication. At other times there's a nifty difference: instead of requiring that all nodes process transactions synchronously, it's often sufficient to require that the nodes or replicas are eventually kept coherent. This means that transactions must be applied in the very same order, but not necessarily synchronously. According to that definition, eager replication sits somewhere between sync and async: it allows nodes to lag behind others while still preventing divergence.
Another way to think about this distinction is that eager systems have to replicate the transactional data before confirming the commit, while lazy systems replicate that data at some point after committing, so conflicts can arise.
Lazy replication is always asynchronous and does not prevent divergence. Due to this tiny but important difference, I prefer the term lazy over asynchronous. In any case, the process of exchanging data to detect and resolve conflicts is called reconciliation. How often to reconcile is a trade-off between the allowed lag time and the load caused by reconciliation. It's also important to know that lazy systems may reduce the probability of divergence by reconciling more often, but the possibility of divergence cannot be eliminated completely (otherwise the system would be eager).
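One simple reconciliation policy is last-write-wins by timestamp. That policy is an assumption for the sake of the sketch, not something the definitions above prescribe; real systems may resolve conflicts very differently, or hand them to the application.

```python
# Hypothetical reconciliation sketch: two lazily replicated nodes have
# diverged on the same key. Values are (timestamp, data) pairs, and a
# last-write-wins rule picks the version with the later timestamp.

def reconcile(node_a, node_b):
    merged = {}
    for key in node_a.keys() | node_b.keys():
        a = node_a.get(key)
        b = node_b.get(key)
        # Keep whichever version has the later timestamp.
        merged[key] = max(filter(None, (a, b)), key=lambda v: v[0])
    node_a.update(merged)
    node_b.update(merged)

# Both nodes committed a conflicting update to "x" while out of touch.
a = {"x": (10, "from A")}
b = {"x": (12, "from B"), "y": (5, "only on B")}
reconcile(a, b)
assert a == b == {"x": (12, "from B"), "y": (5, "only on B")}
```

Note that last-write-wins silently discards the losing update, which is exactly the kind of ACID violation the application has to be aware of.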
Distributed Querying
Distributed querying allows a single query to use multiple servers. This improves performance for long-running, read-only transactions. Short and simple queries are better answered from a single node. The Postgres documentation speaks of "Multi-Server Parallel Query Execution".
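The general shape is scatter/gather, sketched here with invented helper names: the same read-only aggregate runs on each node over its share of the data, and a coordinator combines the partial results.

```python
# Scatter/gather sketch of distributed querying: run a local aggregate
# on every node's partition, then merge the partial results.

def scatter_gather(partitions, local_query, combine):
    partials = [local_query(p) for p in partitions]  # one per node
    return combine(partials)                          # on the coordinator

partitions = [[3, 1], [4, 1, 5]]   # rows spread over two nodes
total = scatter_gather(partitions, sum, sum)
assert total == 14
```

This only pays off when the per-node work dominates the cost of coordinating the nodes, which is why short queries are better served from a single node.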
Data Partitioning
A common distinction is vertical vs horizontal partitioning. Vertical partitioning splits a table into multiple tables with fewer columns, much like the process of normalization. With horizontal partitioning, the database system stores different rows in multiple different tables.
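The two styles can be sketched on a tiny made-up "users" table (the column names and the hash-by-id placement rule are assumptions for the example):

```python
# Vertical vs horizontal partitioning on a toy table of dicts.

rows = [
    {"id": 1, "name": "ada", "bio": "..."},
    {"id": 2, "name": "bob", "bio": "..."},
    {"id": 3, "name": "cho", "bio": "..."},
]

# Vertical: split by columns, much like normalization.
core = [{"id": r["id"], "name": r["name"]} for r in rows]
bios = [{"id": r["id"], "bio": r["bio"]} for r in rows]

# Horizontal: split by rows, here hashing the key over two partitions.
partitions = [[], []]
for r in rows:
    partitions[r["id"] % 2].append(r)

assert core[0] == {"id": 1, "name": "ada"}
assert [r["id"] for r in partitions[0]] == [2]
assert [r["id"] for r in partitions[1]] == [1, 3]
```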
In a multi-master system, both methods can be used to split data across multiple nodes, so that every node holds only part of the complete database. This obviously decreases reliability, as fewer nodes store the same tuple, and it requires some sort of distributed querying to reach all the data a transaction needs. But it certainly improves the total capacity of the cluster and reduces the total load caused by writing transactions.
Shared-disk vs. Shared-nothing Clusters
A shared-disk cluster describes a bunch of nodes which share a common disk subsystem but otherwise have independent hardware. While that term makes sense, "shared-nothing" often confuses people: it means that the nodes of a cluster share nothing, not even the disks. Of course, both types are commonly connected via some kind of packet-switched network. While shared-nothing clusters mostly consist of commodity hardware, shared-disk clusters often use specialized and expensive storage systems.
Shared-disk systems provide a good base for single-master replication solutions with failover capability. Using a clustered filesystem in a shared-nothing environment can be an inexpensive alternative.
It's a common misconception that a shared-disk cluster would allow a faster eager multi-master replication system than a shared-nothing one, the reasoning being that less data needs to be transferred. But that's not where the problem is, because it's not the network throughput but the latency that matters. In other words: doing conflict detection using CPU, memory and a network is faster than doing it using CPU, memory and shared disks.
Clustering
While clustering is a very popular term, it does not have a well-defined meaning with regard to database systems. Lots of different techniques, like replication, load balancing, distributed querying, etc., are called clustering here and there.
Grids

With regard to database replication, much the same applies to "grids", only more so. It's a plain marketing term, IMO.
