Whether you’re growing tired of your MySQL install or you’re just starting out, PostgreSQL 9.2 is a great place to dive into this extremely exciting project. Many great changes landed in 9.2, including several replication improvements, which I’ll cover fairly thoroughly in this article.
When initially investigating replication, I really liked streaming replication: with its ability to cascade slaves, run hot standbys, and take online backups, it provides the base for a highly available and efficient database cluster.
I will make the assumption that you’re building this on an Ubuntu box. I used Ubuntu 12.04 (Precise Pangolin).
For your convenience, a table of contents:
- Initial Master Setup
- Configuration On Master
- Setting Up An Initial Slave
- Starting Replication On The Slave
- Checking For Replication Issues
- Adding A Second Slave
- Resources
So, let’s go.
Initial Master Setup
First, you’ll need to install the PostgreSQL 9.2 packages. These aren’t available on Ubuntu Precise Pangolin by default, but we can use official packages from apt.postgresql.org. Create a new file with whatever name you want under
/etc/apt/sources.list.d/
and add the following line inside:
deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main
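If you’d rather do it in one command, something like this works; the filename pgdg.list is just my choice:
$ echo "deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main" | \
    sudo tee /etc/apt/sources.list.d/pgdg.list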
Then you’ll need to fetch the apt-key, and finally, do an update and install.
$ wget --quiet -O - http://apt.postgresql.org/pub/repos/apt/ACCC4CF8.asc | sudo apt-key add -
$ sudo aptitude update
$ sudo aptitude install postgresql-9.2 postgresql-contrib-9.2
Once installed, drop the default cluster and recreate it with an explicit locale. I found the default (en_CA.utf8 for us) was not what I wanted.
$ sudo pg_dropcluster --stop 9.2 main
$ sudo pg_createcluster --locale=en_US.utf8 --start 9.2 -d /var/lib/postgresql/9.2/main main
Next, we’ll want to create some database users.
$ sudo -u postgres psql -d template1 -c "ALTER USER postgres WITH ENCRYPTED PASSWORD 'replaceme';"
$ sudo -u postgres psql -c "CREATE USER pg_myapp WITH ENCRYPTED PASSWORD 'replaceme';"
$ sudo -u postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'replaceme';"
Fairly straightforward. pg_myapp can be a user with a name relevant to your project. The replicator user is what we’ll use to do replication; having a dedicated user allows for more specific permission settings later on. You should only do this on the master, as we’ll copy it down to a slave eventually.
Configuration On Master
We’re going to set up the master so that it ships enough information to its slave for the slave to act as a hot standby. Additionally, we’ll archive WAL files to the slave in case we need point-in-time recovery, and it helps streaming replication catch up!
Set these within postgresql.conf:
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
archive_mode = on
archive_command = 'rsync -aq %p postgres@slave1.example.internal:/var/lib/postgresql/9.2/archive/%f'
archive_timeout = 3600
A wildcard on listen_addresses? In most cases this is fine, since we’ll use pg_hba.conf for security, but be aware of what you’re doing!
I’m not going to go into huge detail about what these settings mean, as the official documentation explains them really well. I recommend reading it.
Our archive command assumes the archive directory exists on the slave; we’ll create it when we set up the slave. It also assumes the master’s postgres user can rsync to the slave over SSH without a password.
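A minimal sketch of that SSH setup, using the example hostname above; if the slave’s postgres account has no password, you may need to append the public key to its ~/.ssh/authorized_keys by hand instead of using ssh-copy-id:
# on master: generate a key for the postgres user (no passphrase)
$ sudo -u postgres mkdir -p /var/lib/postgresql/.ssh
$ sudo -u postgres ssh-keygen -t rsa -N "" -f /var/lib/postgresql/.ssh/id_rsa
# install the key on the slave, then confirm the connection works
$ sudo -u postgres ssh-copy-id postgres@slave1.example.internal
$ sudo -u postgres ssh postgres@slave1.example.internal true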
You won’t believe me, but we’re pretty close to being done with the master. You’ll want to set up some permissions within pg_hba.conf. I added the following lines to allow local replication (via pg_basebackup), slave replication, and app connectivity:
host all pg_myapp 127.0.0.0/24 md5
host replication replicator 127.0.0.0/24 trust
local replication postgres trust
Replace 127.0.0.0 with the subnet of your master/slave machines, or simply use explicit IP addresses, depending on your setup.
Restart PostgreSQL on this master machine and we can move onto our standby server.
$ sudo service postgresql restart
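After the restart, a quick sanity check that the new settings took effect:
$ sudo -u postgres psql -c "SHOW wal_level;"
This should print hot_standby.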
Setting Up An Initial Slave
I’m going to assume this slave lives on a different box, so go ahead and install the same Ubuntu packages as you did for the master. Initially, we’ll set this slave up as if it were the only slave; further into the article, I’ll go into detail on adding a second slave.
Now again, just as with the master, you’ll need to modify postgresql.conf. The configuration will be nearly identical. We want the slave to match the master as closely as possible, so that in the event of a failover, it can easily become the new master.
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
hot_standby = on
The difference is hot_standby = on, which lets this machine serve read-only queries while it replicates, leaving it ready to replace the master at any time.
You’ll also want to give this slave the same pg_hba.conf permissions as we did the master. Here they are again:
host all pg_myapp 127.0.0.0/24 md5
host replication replicator 127.0.0.0/24 trust
local replication postgres trust
Finally, a new, very important file is introduced: recovery.conf. It is necessary in order to replicate from the master. Place it in /etc/postgresql/9.2/main/ and create it like so:
standby_mode = 'on' # enables stand-by (readonly) mode
# Connect to the master postgres server using the replicator user we created.
primary_conninfo = 'host=pgmaster.mysite.internal port=5432 user=replicator'
# Specifies a trigger file whose presence should cause streaming replication to
# end (i.e., failover).
trigger_file = '/tmp/pg_failover_trigger'
# Shell command to execute an archived segment of WAL file series.
# Required for archive recovery if streaming replication falls behind too far.
restore_command = 'cp /var/lib/postgresql/9.2/archive/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/9.2/archive/ %r'
This is a lot to take in, but it’s fairly straightforward. Key things include:
- Replace the values in primary_conninfo with host names relevant to your master. Make sure this machine can connect to the address you use.
- Ensure the archive directory under /var/lib/postgresql/9.2/archive/ is created. This is a choice we make, so PostgreSQL won’t create it for us; see the one-liner after this list.
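Creating that directory on the slave, owned by the postgres user, is a one-liner:
$ sudo -u postgres mkdir -p /var/lib/postgresql/9.2/archive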
Save that file; we’ll actually use it later. PostgreSQL looks for recovery.conf within /var/lib/postgresql/9.2/main/, rather than where we placed it. But before we can move it there, we need to set up the essentials for replication on the slave.
Starting Replication On The Slave
Now that the slave is configured, it’s necessary to get the contents of the master onto it, so that replication can sync up. This is done using a handy command named pg_basebackup, which essentially copies the entire data directory of your cluster into a location of your choice. It is also what we know as an online backup, as the master can stay up while the command copies all the data.
To start, you should stop postgresql on your slave. We don’t want any data flowing while we do this.
$ sudo service postgresql stop
Then, switch to the master server and execute pg_basebackup:
# on master machine
$ pg_basebackup -U postgres -D - -P -Ft | bzip2 > /var/tmp/pg_basebackup.tar.bz2
$ scp /var/tmp/pg_basebackup.tar.bz2 postgres@pgslave.mysite.internal:/var/tmp/
I love pg_basebackup because it lets us pipe the output any way we want; bzip2 is a nice option, but it’s your choice. After the backup lands in a local temporary directory, I send it straight to a temporary directory on the slave.
Now, back on the slave machine, let’s wipe our cluster’s data directory before importing the backup:
# on slave machine
$ cd /var/lib/postgresql/9.2/main/
$ rm -rf *
$ tar -xjvf /var/tmp/pg_basebackup.tar.bz2
Make sure that the contents of this directory are owned by the postgres user and group. If not, then:
$ sudo chown -R postgres:postgres /var/lib/postgresql/9.2/main
Now, copy the recovery.conf file we made earlier into the data directory.
$ sudo -u postgres cp /etc/postgresql/9.2/main/recovery.conf \
/var/lib/postgresql/9.2/main/recovery.conf
Finally, start the slave up again. If all went well, you should receive no errors:
$ sudo service postgresql start
This is a good time to tail the cluster log to see if all is swell.
$ tail -f /var/log/postgresql/postgresql-9.2-main.log
You should see a couple of lines similar to:
LOG: entering standby mode
LOG: streaming replication successfully connected to primary
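You can also verify things from the master’s side. Each connected standby gets a row in pg_stat_replication:
# on master
$ sudo -u postgres psql -c "SELECT client_addr, state, sent_location, replay_location FROM pg_stat_replication;"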
Excellent! You now have a master/slave setup with streaming replication. In essence, what happened was pretty simple: we took the entire data directory of the master and gave it to the slave. That directory includes the WAL files that let the master and slave agree on what point in time they’re at.
So how can we test this? Well, assuming the log file said everything is ok, let’s try some queries!
# on master
$ sudo -u postgres psql -c "CREATE TABLE fake_customers (name VARCHAR(32));"
Now, go onto the slave. This table should exist.
$ sudo -u postgres psql -c "SELECT name FROM fake_customers;"
Congratulations! You’ve set up master/slave replication. Wasn’t it simple? Now, let’s move on to keeping an eye on this replication, and eventually adding a second slave (totally optional, of course.)
Checking For Replication Issues
We definitely need some way to monitor our postgres instances to ensure they’re actually doing what we expect. There are a variety of solutions out there, including a PostgreSQL Munin plugin, but I’d like to touch a bit on the XLOG location and how to find the offset between your servers, eventually reporting it to some backend.
XLOG records are generated on the master and shipped to the slaves. By gathering data from the master and slave, we can compare some differences and get an idea of how far behind our slaves are in terms of replication.
The first thing your script needs to do is get the XLOG location of the master.
# run on master.
$ psql -c "SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS offset;"
Store this offset in a variable, and then get the receive and replay locations from each slave.
# on each slave
$ psql -c "SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(), '0/0') AS receive, \
pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/0') AS replay;"
Then, a simple subtraction:
receive_delta = master_offset - slave_receive_offset
replay_delta = master_offset - slave_replay_offset
Finally, you’ll have a number that looks rather arbitrary. Great, but what does it mean?
The number returned is the offset in bytes between the master and slave(s). The difference will vary between applications, so in the end, it comes down to knowing your application’s environment and configuration, and keeping an eye on how the difference behaves over time. Eventually, you’ll be able to figure out what a “critical” difference is for your application.
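If you want to automate this, here’s a minimal bash sketch of the whole check. It uses the example hostnames from earlier and assumes the monitoring host can connect as postgres (you may need an extra pg_hba.conf entry for that):
#!/bin/bash
# Replication lag sketch: offsets are bytes since XLOG position 0/0.
MASTER=pgmaster.mysite.internal
SLAVE=pgslave.mysite.internal

master_offset=$(psql -h "$MASTER" -U postgres -Atc \
  "SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0');")
receive_offset=$(psql -h "$SLAVE" -U postgres -Atc \
  "SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(), '0/0');")
replay_offset=$(psql -h "$SLAVE" -U postgres -Atc \
  "SELECT pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/0');")

# The deltas in bytes; report these to your monitoring backend.
echo "receive_delta: $((master_offset - receive_offset))"
echo "replay_delta:  $((master_offset - replay_offset))"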
If you’d like a fully functional script that does this check, I posted it on GitHub.
Adding A Second Slave
We’re now in a situation where we have a master and slave, with streaming replication as well as WAL shipping. At this point, you can be pretty happy: you have a failover slave that’s ready to become your master at the touch of a trigger (almost literally, as we’ll see at the end).
However, let’s say we want to add another slave. Initially, I imagined that simply adding a second slave and setting up an NFS mount that the master ships to, and that both slaves recover from, would be ideal, but I found a few issues with that:
- NFS mounts, at least for us, were notorious for being unreliable in a high-availability situation.
- We use pg_archivecleanup on each slave to clean up the WAL logs, but if an NFS mount is set up and shared, when do you know you can actually clean up a given file? What if the other slave still needs it?
With mostly the latter in mind, I came to the conclusion that the ideal choice was cascading replication. The ability to do this was introduced in PostgreSQL 9.2, and it makes adding additional slaves really simple.
First, how would this setup look?
(repl. stands for replicate)
+-----------+ +-----------+ +-----------+
| master | | slave | | slave |
|-----------| |-----------| |-----------|
| | | (primary) | | |
| | repl. | | repl. | |
| +--------> +--------> |
| | | | | |
| | | | | |
| | | | | |
+-----------+ +-----------+ +-----------+
Simply put, our first slave acts as a primary slave that sends information to other slaves; the second slave never talks to the master directly. The two issues we faced are now gone, and we have an even more powerful setup. The changes to your existing slave are minimal.
First, we need to tell the primary slave to archive to the secondary slave. Go back to the first slave machine and edit postgresql.conf:
archive_command = 'rsync -aq %p postgres@slave2.example.internal:/var/lib/postgresql/9.2/archive/%f'
archive_timeout = 3600
Now, onto the second slave. Provision the box as you did the initial slave, setting up the postgresql.conf, pg_hba.conf, and recovery.conf files as we did, with one minor adjustment in recovery.conf:
# Connect to the primary slave postgres user using the replicator user.
primary_conninfo = 'host=pgslave1.mysite.internal port=5432 user=replicator'
Instead of connecting to the master for replication, this machine will connect to the first slave.
Now, run pg_basebackup just like before, this time treating the primary slave as if it were this machine’s master. Then move recovery.conf into the /var/lib/postgresql/9.2/main directory and start postgres up.
You should see your secondary slave has connected to your primary slave within the logs. With a second slave, you now have greater recovery flexibility and a great server to run backups from, checks, etc.
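As before, pg_stat_replication is handy for verifying this: run it on the primary slave, and the secondary should show up as a connected standby.
# on the primary slave
$ sudo -u postgres psql -c "SELECT client_addr, state FROM pg_stat_replication;"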
That’s A Wrap
And that’s it. Try running some queries that alter data on your master and watch them flow to your primary slave and down to your secondary. Exciting stuff.
Before going into production, you can try triggering a failure of the master. Play with the trigger_file you defined and see the results. There are some interesting quirks that come up when you do these failovers, many of which are being addressed in the next version of PostgreSQL.
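Promotion is driven by the trigger file we configured in recovery.conf. A minimal sketch of a manual failover, using the path from earlier:
# on the slave: creating the trigger file promotes it out of standby
$ touch /tmp/pg_failover_trigger
# watch the log; on promotion, postgres renames recovery.conf to recovery.done
$ tail -f /var/log/postgresql/postgresql-9.2-main.log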
I hope you found this article useful and you can use it as a reference when you’re setting up your own instance. I’ll leave you with some resources for additional quality reading.
Resources
- The PostgreSQL Documentation is invaluable.
- Excellent slides on PostgreSQL Backup Strategies
- A great MySQL to PostgreSQL conversion tool
- Excellent Postgres blog and slow query analyzer
- PostgreSQL community blog