These terms and definitions should help when talking about replication. I've compiled them after lots of discussions, mostly with other Postgres developers, but the definitions are supposed to be universally usable and should apply to other RDBMSes as well.
Replication
For database systems, replication is the process of sharing transactional data to ensure consistency between redundant database nodes. This improves fault tolerance, which leads to better reliability of the overall system. Replication systems for databases can also be called distributed databases, especially when combined with other nice features described below. I tend to count only eager multi-master systems as distributed databases, because anything less hurts the ACID principles and isn't transparent to the application.
Load Balancing
Replication is often coupled with load balancing to improve read performance. Some replication solutions have an integrated load balancer which knows about the underlying system; others rely on OS-dependent or third-party load balancers.
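To illustrate the basic idea, here is a minimal sketch of read/write splitting in front of a replicated cluster. The node addresses and the naive SELECT check are made up for the example; a real load balancer is considerably smarter about classifying statements.

# Minimal sketch of read/write splitting: writes go to the primary,
# reads are spread across the replicas. Addresses are placeholders.
import random

PRIMARY = "db-primary:5432"
REPLICAS = ["db-replica1:5432", "db-replica2:5432"]

def route(sql):
    """Pick a node for the statement (very naive read detection)."""
    is_read_only = sql.lstrip().upper().startswith("SELECT")
    return random.choice(REPLICAS) if is_read_only else PRIMARY

print(route("SELECT count(*) FROM items"))    # one of the replicas
print(route("UPDATE items SET price = 10"))   # always the primary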
Replication Methods
Databases can be kept coherent in many different ways. A very common and simple approach is statement-based replication, where SQL statements are distributed between the nodes. Non-deterministic functions, like now() or random(), pose some problems for that method. Another very common method is log shipping. Unfortunately, the database system's log is often not meant as an exchange format for replication, so it's hard to replicate only parts of a database. Some replication solutions therefore have their own binary format which is specifically designed for replication.
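To see why non-deterministic functions are a problem for statement-based replication, consider this small, made-up simulation: every node replays the same statement, but evaluates now() and random() locally and therefore ends up with different data.

# Each node replays the same INSERT, but evaluates the non-deterministic
# parts locally at replay time, so the stored rows differ between nodes.
import datetime
import random

STATEMENT = "INSERT INTO sessions (created_at, token) VALUES (now(), random())"

def replay_on_node(node_name):
    return {
        "node": node_name,
        "created_at": datetime.datetime.now(),  # differs per node
        "token": random.random(),                # differs per node
    }

rows = [replay_on_node(n) for n in ("node_a", "node_b")]
assert rows[0]["token"] != rows[1]["token"]  # the replicas have diverged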
Divergence
Keeping data coherent across multiple nodes is quite expensive in terms of network latency. Thus many systems try to avoid network delays by allowing the nodes to diverge slightly, meaning they allow conflicting transactions to commit. To return to a coherent and consistent database, those conflicts need to be resolved, either automatically or manually. Such conflicts violate the ACID properties of the database system, so the application needs to be aware of that.
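As a concrete, made-up example of such a conflict and one possible automatic resolution rule: two nodes commit different updates to the same row while out of touch, and a simple last-writer-wins policy later picks one version, silently discarding the other.

# Two nodes committed conflicting versions of the same row.
node_a_version = {"id": 1, "balance": 120, "committed_at": 10}
node_b_version = {"id": 1, "balance":  80, "committed_at": 12}

def resolve(a, b):
    """Last-writer-wins: keep the version with the later commit time."""
    return a if a["committed_at"] >= b["committed_at"] else b

winner = resolve(node_a_version, node_b_version)
print(winner)  # node_b wins; node_a's committed update is lost

This is exactly why the application has to be aware of divergence: a transaction that was confirmed on one node can effectively disappear after resolution.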
Synchronous vs Asynchronous Replication
According to Wikipedia's definition of synchronization, database replication is considered data synchronization, as opposed to process synchronization. However, internally a database system has to synchronize locks or at least transactions, which would clearly be considered process synchronization.
Within the context of database replication, the most common definition of synchronous replication is that as soon as a transaction is confirmed to be committed, all nodes of the cluster must have committed the transaction as well. This is very expensive in terms of latency and the number of messages to be sent, but it prevents divergence. In asynchronous replication systems, other nodes can apply the transactional data at any later point in time, so the nodes may serve different, possibly even conflicting, snapshots of the database.
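The difference boils down to when the commit is confirmed to the client. A rough sketch, with purely illustrative Node objects rather than any real protocol:

# When is the commit confirmed? Synchronous: only after every node has
# committed the transaction. Asynchronous: immediately, shipping is deferred.
class Node:
    def __init__(self, name):
        self.name = name
        self.committed = []

    def apply(self, txn):
        self.committed.append(txn)

def synchronous_commit(txn, nodes):
    for node in nodes:
        node.apply(txn)              # wait for each node: latency adds up
    return "commit confirmed"        # no node can be missing the transaction

def asynchronous_commit(txn, nodes, ship_queue):
    ship_queue.append((txn, nodes))  # applied at some later point in time
    return "commit confirmed"        # nodes may lag and even conflict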
Eager vs Lazy Replication
The terms eager and lazy are used more often in the literature on database replication, sometimes synonymously with synchronous and asynchronous replication. Other times there's a subtle difference: instead of requiring that all nodes process transactions synchronously, it's often sufficient to require that the nodes or replicas are eventually kept coherent. This means that transactions must be applied in the very same order, but not necessarily synchronously. According to that definition, eager replication is somewhere between sync and async: it allows nodes to lag behind others while still preventing divergence.
Another way to think about this distinction is that eager systems have to replicate the transactional data before confirming the commit, while lazy systems replicate that data at some time after committing, so conflicts can arise.
Lazy replication is always asynchronous and does not prevent divergence. Due to this tiny but important difference, I prefer the term lazy to asynchronous. In any case, the process of exchanging data to detect and resolve conflicts is called reconciliation. How often to reconcile is a tradeoff between the allowed lag time and the load caused by reconciliation. It's also important to know that lazy systems may reduce the probability of divergence by reconciling more often, but the possibility of divergence cannot be eliminated completely (otherwise the system would be eager).
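A minimal sketch of such a reconciliation loop in a lazy system; the drain_pending() and apply_and_resolve() methods are hypothetical stand-ins for whatever the replication solution actually uses:

# Periodic reconciliation: a shorter interval means a smaller window for
# divergence, but more reconciliation load; it never eliminates conflicts.
import time

def reconcile(nodes):
    changes = [c for node in nodes for c in node.drain_pending()]
    for node in nodes:
        node.apply_and_resolve(changes)   # conflict resolution happens here

def reconciliation_loop(nodes, interval_seconds):
    while True:
        reconcile(nodes)
        time.sleep(interval_seconds)      # the tradeoff knob: lag vs. load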
Distributed Querying
Distributed querying allows a single query to use multiple servers. This improves performance for long-running read-only transactions; short and simple queries are better answered from a single node. The Postgres documentation speaks of "Multi-Server Parallel Query Execution".
Data Partitioning
A common distinction is vertical vs horizontal partitioning. Vertical partitioning splits a table into multiple tables with fewer columns, much like during the process of normalization. When partitioning horizontally, the database system stores different rows in multiple different tables.
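A tiny illustration of the two splits on an in-memory table; the table and column names are made up for the example:

# Vertical vs horizontal partitioning of the same (made-up) customers table.
customers = [
    {"id": 1, "name": "Ada",  "country": "UK", "notes": "long text ..."},
    {"id": 2, "name": "Bert", "country": "DE", "notes": "long text ..."},
]

# Vertical: same rows, split by columns (much like normalization).
core    = [{k: r[k] for k in ("id", "name", "country")} for r in customers]
details = [{k: r[k] for k in ("id", "notes")} for r in customers]

# Horizontal: same columns, rows split by some key, e.g. an id range.
partition_low  = [r for r in customers if r["id"] < 1000]
partition_high = [r for r in customers if r["id"] >= 1000]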
In a multi-master system, both methods can be used to split data across multiple nodes, so that every node holds only part of the complete database. This obviously decreases reliability, as fewer nodes store the same tuple, and it requires some sort of distributed querying to reach all the data a transaction needs. But it certainly improves the total capacity of your cluster and reduces the total load for writing transactions.
Shared-disk vs. Shared-nothing Clusters
A shared-disk cluster describes a bunch of nodes which share a common disk subsystem, but with otherwise independent hardware. While that term makes sense, "shared-nothing" often confuses people. It means that the nodes of a cluster share nothing, not even the disks. Of course, both types are commonly connected via some type of packet-switched network. While shared-nothing clusters mostly consist of commodity hardware, shared-disk clusters often use specialized and expensive storage systems.
Shared-disk systems can provide a good base for single-master replication solutions with failover capability. Using a clustered filesystem in a shared-nothing environment can be an inexpensive alternative.
It's a common misconception that a shared-disk cluster would allow a faster eager multi-master replication system than a shared-nothing one, the reasoning being that less data needs to be transferred. But that's not where the problem is, because it's not the network throughput that matters, but the latency. In other words: doing conflict detection using CPU, memory and a network is faster than doing it using CPU, memory and shared disks.
Clustering
While clustering is a very popular term, it does not have a well-defined meaning with regard to database systems. Lots of different techniques, like replication, load balancing, distributed querying, etc., are called clustering here and there.
Grids
With regard to database replication, the very same applies to "grids", only more so. It's a plain marketing term, IMO.