Nagios plugin script written in Bash to check master-slave replication in PostgreSQL. It was tested on PostgreSQL 9.2.4 and is executed on the slave.
The script reports how many bytes the slave server is behind the master, and how long ago the last replay of data occurred (printed as a PostgreSQL interval such as 00:03:14.056747).
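The script must be executed as the ‘postgres’ user. It takes five positional arguments, in this order: slave host, slave port, database name, master host, master port.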
    #!/bin/bash
    # $Id: check_slave_replication.sh 3421 2013-08-09 07:52:44Z jmorano $

    # Nagios plugin exit codes
    STATE_OK=0
    STATE_WARNING=1
    STATE_CRITICAL=2
    STATE_UNKNOWN=3

    ## Master (p_) and Slave (s_) DB Server Information
    export s_host=$1
    export s_port=$2
    export p_db=$3
    export p_host=$4
    export p_port=$5

    export psql=/opt/postgresql/bin/psql
    export bc=/usr/bin/bc

    ## Limits
    export critical_limit=83886080 # 5 * 16MB, size of 5 WAL files
    export warning_limit=16777216 # 16 MB, size of 1 WAL file

    # WAL positions as byte offsets from '0/0': current position on the master,
    # last received and last replayed positions on the slave.
    master_lag=$($psql -U postgres -h"$p_host" -p"$p_port" -A -t -c "SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS offset" "$p_db")
    slave_lag=$($psql -U postgres -h"$s_host" -p"$s_port" -A -t -c "SELECT pg_xlog_location_diff(pg_last_xlog_receive_location(), '0/0') AS receive" "$p_db")
    replay_lag=$($psql -U postgres -h"$s_host" -p"$s_port" -A -t -c "SELECT pg_xlog_location_diff(pg_last_xlog_replay_location(), '0/0') AS replay" "$p_db")
    # Time since the last replayed transaction, returned as an interval (HH:MM:SS.ffffff)
    replay_timediff=$($psql -U postgres -h"$s_host" -p"$s_port" -A -t -c "SELECT NOW() - pg_last_xact_replay_timestamp() AS replication_delay" "$p_db")

    # An empty value means a query failed or replication is not running
    if [[ -z $master_lag || -z $slave_lag || -z $replay_lag ]]; then
        echo "CRITICAL: Stream has no value to compare (is replication configured or connectivity problem?)"
        exit $STATE_CRITICAL
    else
        if [[ $master_lag -eq $slave_lag && $master_lag -eq $replay_lag && $slave_lag -eq $replay_lag ]] ; then
            echo "OK: Stream: MASTER:$master_lag Slave:$slave_lag Replay:$replay_lag"
            exit $STATE_OK
        else
            if [[ $master_lag -eq $slave_lag ]] ; then
                # Everything was received; only the replay is behind
                if [[ $master_lag -ne $replay_lag ]] ; then
                    if [ "$(bc <<< "$master_lag-$replay_lag")" -lt "$warning_limit" ]; then
                        echo "OK: Stream: MASTER:$master_lag Replay:$replay_lag :: REPLAY BEHIND"
                        exit $STATE_OK
                    else
                        echo "WARNING: Stream: MASTER:$master_lag Replay:$replay_lag :: REPLAY $(bc <<< "$master_lag-$replay_lag")bytes BEHIND (${replay_timediff}seconds)"
                        exit $STATE_WARNING
                    fi
                fi
            else
                # The slave has not received everything the master has written
                if [ "$(bc <<< "$master_lag-$slave_lag")" -gt "$critical_limit" ]; then
                    echo "CRITICAL: Stream: MASTER:$master_lag Slave:$slave_lag :: STREAM BEYOND CRITICAL LIMIT ($(bc <<< "$master_lag-$slave_lag")bytes)"
                    exit $STATE_CRITICAL
                else
                    if [ "$(bc <<< "$master_lag-$slave_lag")" -lt "$warning_limit" ]; then
                        echo "OK: Stream: MASTER:$master_lag Slave:$slave_lag Replay:$replay_lag :: STREAM BEHIND"
                        exit $STATE_OK
                    else
                        echo "WARNING: Stream: MASTER:$master_lag Slave:$slave_lag :: STREAM BEYOND WARNING LIMIT ($(bc <<< "$master_lag-$slave_lag")bytes)"
                        exit $STATE_WARNING
                    fi
                fi
            fi
            echo "UNKNOWN: Stream: MASTER: $master_lag Slave: $slave_lag Replay: $replay_lag"
            exit $STATE_UNKNOWN
        fi
    fi
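The threshold logic of the stream branch can be tried without a database. The following is only a minimal sketch: `classify_lag` is a hypothetical helper that is not part of the script above, and it assumes the same two limits (1 and 5 WAL files of 16 MB each).

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): classify a byte lag
# with the same comparisons the script applies to master-slave lag.
warning_limit=16777216    # 16 MB, size of 1 WAL file
critical_limit=83886080   # 5 * 16 MB, size of 5 WAL files

classify_lag() {
    local lag_bytes=$1
    if (( lag_bytes > critical_limit )); then
        echo "CRITICAL"
    elif (( lag_bytes < warning_limit )); then
        echo "OK"
    else
        echo "WARNING"
    fi
}

classify_lag 0          # OK
classify_lag 80249296   # WARNING
classify_lag 83886081   # CRITICAL
```

Note that the script's replay branch only distinguishes OK from WARNING; the critical limit is applied to the master-slave difference alone.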
Possible outputs:
    $ bash check_slave_replication.sh 192.168.0.1 5432 live 192.168.0.2 5432
    WARNING: Stream: MASTER:1907958306184 Replay:1907878056888 :: REPLAY 80249296bytes BEHIND (00:03:14.056747seconds)
    $ bash check_slave_replication.sh 192.168.0.1 5432 live 192.168.0.2 5432
    OK: Stream: MASTER:2055690128376 Slave:2055690143144 Replay:2055690193744 :: STREAM BEHIND
    $ bash check_slave_replication.sh 192.168.0.1 5432 live 192.168.0.2 5432
    OK: Stream: MASTER:2055690497120 Replay:2055690497328 :: REPLAY BEHIND
    $ bash check_slave_replication.sh 192.168.0.1 5432 live 192.168.0.2 5432
    OK: Stream: MASTER:2055691704672 Slave:2055691704672 Replay:2055691704672
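The numbers in the WARNING output can be checked by hand. The sketch below copies the values from that output, recomputes the byte lag, and converts the printed interval to whole seconds. The variable names are illustrative, and the conversion assumes the interval has no "N days" prefix.

```shell
#!/bin/bash
# Values copied from the WARNING sample output.
master=1907958306184
replay=1907878056888
timediff="00:03:14.056747"

# Byte lag, computed with shell arithmetic instead of bc.
lag=$(( master - replay ))
echo "lag: ${lag} bytes"                     # lag: 80249296 bytes

# Split HH:MM:SS.ffffff on ':' and drop the fractional seconds.
# Assumption: the interval is shorter than one day.
IFS=: read -r hours minutes seconds <<< "$timediff"
seconds=${seconds%%.*}
# 10# forces base 10 so values like "08" are not parsed as octal.
whole_seconds=$(( 10#$hours * 3600 + 10#$minutes * 60 + 10#$seconds ))
echo "last replay: ${whole_seconds} s ago"   # last replay: 194 s ago
```

The lag of 80249296 bytes is above the 16 MB warning limit, which is why this run reports WARNING.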