Monday, August 27, 2007

PostgreSQL Replication with PGCluster

I don't know why I am fascinated with databases. I don't work with them on a daily basis. I don't know very much about them. I don't have very much experience working with them. My experience is limited to a few simple SQL queries executed on open-source databases. However, for some reason, I have always been intrigued by the idea of large, complex databases. Perhaps it is because I see the business potential in making extremely large amounts of data available for analysis within a business. Or, perhaps it is due to the unique security challenges that exist when trying to make that data available to different groups of entities, each with their own security concerns. Regardless of the reason, I have become particularly interested in high-availability databases and database replication.

In an earlier post, I briefly mentioned the PGCluster project. PGCluster is an extension of the PostgreSQL database, designed to give it synchronous, multi-master replication. Replication is one of the areas in which PostgreSQL is lacking in comparison to other proprietary databases, such as Oracle or MS SQL Server. However, after playing around with PGCluster, I have become very impressed with its capabilities. If development continues, PGCluster could offer a sound solution to a very important problem that large enterprises will encounter if they wish to role out PostgreSQL in a high-availability situation.

To that end, this article will show the basics of setting up a synchronous, multi-master replicated instance of PGCluster. For simplicity and ease of use, I am running this tutorial using virtual instances of CentOS 5 deployed using the VMWare Player. I highly recommend you obtain this free utility, as it gives you the freedom to run virtual machine instances without having to pay too much. Once you have downloaded and installed VMWare Player, you can download a pre-built CentOS 5 instance to run in the player. I made a copy of the CentOS 5 VM in another directory so that I can run 2 separate, virtual instances of it for this tutorial. You will need to configure each VM instance to have bridged network support, utilizing DHCP. In a real-world scenario, you would use static IPs for addressing, but for the purposes of this tutorial it is not necessary. After starting the VM instances for the first time, you will need to install some compiler tools in order to build PGCluster. This includes gcc, bison, and flex, as well as their respective -devel packages. To install these packages, simply run the following from the command prompt in each VM instance:

yum -y install gcc gcc-c++ flex flex-devel bison bison-devel

Once your development environments are setup, you are ready to download and compile PGCluster. There are 2 different methods for building PGCluster. The first involves downloading a patch to the original PostgreSQL source distribution. You simply apply the patch prior to compiling PostgreSQL in order to add the PGCluster support. This method is very useful if there are other patches you wish to install prior to building PostgreSQL. The second method involves downloading the complete PGCluster distribution. This distribution includes the PostgreSQL source tree already patched with PGCluster. We will use the second, full-distribution method for simplicity. As of this writing, the latest version of PGCluster is 1.7.0rc5, and is available for download here. Download the tar/gz file and unpack it on each VM instance. Building PGCluster is as simple as running:

./configure; make; make install.

This will install PGCluster to /usr/local/pgsql, by default. There is more information about the installation process available at the PGCluster website. Now that we have successfully built PGCluster, it's time to start the configuration process.

First, some terminology. PGCluster consists of 3 main components:

  • Clusters
  • Load Balancers
  • Replicators
A Cluster is simply a database instance. The data in the clusters is what is replicated. A Load Balancer exists to share the query load between all the databases in the replication scheme. Lastly, the Replicator is used to replicate, or synchronize, data between all the clusters. In our tutorial, we will build a replication scheme with the following components:
  • Two clusters, clusterdb1 and clusterdb2
  • Once load balancer, pglb
  • Two replicators, pgrepl1 and pgrepl2
Our logical design will look like this when we are finished:



It is possible to install more than one PGCluster component (cluster, load balancer, replicator) on a single system. For our example, we will put 1 cluster on each VM, with the load balancer on VM 1, and the replicators on each VM. In practice, however, you may find you receive better performance by having each node run as a dedicated PGCluster component. When finished, the physical design will look like this:



Prior to configuring PGCluster, it is important to ensure that the hostnames for all the PGCluster components can be resolved to IP addresses. You may do this using DNS, or simply add the required entries to the /etc/hosts file. In our case, clusterdb1, pglb, and pgrepl1 all resolve to 192.168.72.128. Both clusterdb2 and pgrepl2 resolve to 192.168.72.129.

It is also important that PGCluster run as a non-privileged user. In our case, this user is postgres. You can see the steps for creating the postgres user and setting the appropriate file permissions at the PGCluster install page.

PGCluster makes use of several configuration files, each specific to the component you are installing. First, we will configure each of the clusterdb instances. In the /usr/local/pgsql/data directory, you will find 3 files that need to be modified: cluster.conf, postgresql.conf, and pg_hba.conf. The cluster.conf file defines characteristics of the database cluster, as well as the replication server that will be used. The postgresql.conf is the standard PostgreSQL configuration file. The pg_hba.conf file is used for defining PostgreSQL security and access controls. This file must be modified to trust connections originating from all other databases in the cluster. Below you will find the parameters that must be added or defined for each of these files.

cluster.conf
<replicate_server_info>
<host_name>pgrepl1</host_name>
<port>8001</port>
<recovery_port>8101</recovery_port>
</replicate_server_info>
<recovery_port>7001</recovery_port>
<rsync_path>/usr/bin/rsync</rsync_path>
<rsync_option>ssh -1</rsync_option>
<rsync_compress>yes</rsync_compress>
<pg_dump_path>/usr/local/pgsql/bin/pg_dump</pg_dump_path>
<when_stand_alone>read_only</when_stand_alone>

pg_hba.conf
host all all 192.168.72.128 255.255.255.0 trust
host all all 192.168.72.129 255.255.255.0 trust

postgresql.conf
tcpip_socket = true
port = 5432


Now that the cluster instances have been configured, we can configure the replicators. We have 2 replicators defined, pgrepl1 and pgrepl2. This allows for multi-master replication. The configuration files for each instance follow.

pgreplicate.conf for pgrepl1

<Cluster_Server_Info>
<Host_Name>clusterdb1</Host_Name>
<Port>5432</Port>
<Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
<Host_Name>clusterdb2</Host_Name>
<Port>5432</Port>
<Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>

<loadbalance_server_info>
<host_name>pglb</host_name>
<recovery_port>6001</recovery_port>
</loadbalance_server_info>
<host_name>pgrepl1</host_name>
<replication_port>8001</replication_port>
<recovery_port>8101</recovery_port>
<rlog_port>8301</rlog_port>
<response_mode>normal</response_mode>
<use_replication_log>no</use_replication_log>


pgreplicate.conf for pgrepl2

<Cluster_Server_Info>
<Host_Name>clusterdb1</Host_Name>
<Port>5432</Port>
<Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
<Host_Name>clusterdb2</Host_Name>
<Port>5432</Port>
<Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>

<loadbalance_server_info>
<host_name>pglb</host_name>
<recovery_port>6001</recovery_port>
</loadbalance_server_info>
<host_name>pgrepl2</host_name>
<replication_port>8001</replication_port>
<recovery_port>8101</recovery_port>
<rlog_port>8301</rlog_port>
<response_mode>normal</response_mode>
<use_replication_log>no</use_replication_log>


Once the necessary configuration changes have been made for the clusterdb and replicator instances, we can configure the load balancer. Our load balancer will be located on the same virtual machine as clusterdb1. The configuration file for the load balancer is located in /usr/local/pgsql/etc/pglb.conf. The configuration file follows below.

pglb.conf

<cluster_server_info>
<host_name>clusterdb1</host_name>
<port>5432</port>
<max_connect>32</max_connect>
</cluster_server_info>
<cluster_server_info>
<host_name>clusterdb2</host_name>
<port>5432</port>
<max_connect>32</max_connect>
</cluster_server_info>
<host_name>pglb</host_name>
<backend_socket_dir>/tmp</backend_socket_dir>
<receive_port>5433</receive_port>
<recovery_port>6001</recovery_port>
<max_cluster_num>128</max_cluster_num>
<use_connection_pooling>no</use_connection_pooling>


With our configuration out of the way, we are ready to start the services and begin working with PGCluster. The order in which you start the services is important. The order is as follows:
  1. All replicator instances
  2. All cluster instances
  3. All load balancer instances
To start the replicator services, run the following command as the postgres user on each replicator VM:

/usr/local/pgsql/bin/pgreplicate -D /usr/local/pgsql/etc

To start the clusterdb services, run the following command as the postgres user on each clusterdb VM:

/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -o "-i" start

Lastly, to start the load balancer service, run the following command as the postgres user on the load balancer VM:

/usr/local/pgsql/bin/pglb -D /usr/local/pgsql/etc

You can find more information on how to start and stop PGCluster services on this page.

With our services started, we can begin testing PGCluster. All changes made to either database, either through a direct connection or one coming in through the load balancer, should be replicated across to all the other databases. The screenshot below shows a database being created. In this case, I ran the command against the clusterdb1 database. You can see the log output for this command in the top left VM. If you look at the lower right VM, you will see the log output for the replication that takes place between the two hosts.




The log output shows our database creation and replication was successful. From here, we can create tables, views, and other SQL objects and know that they will be replicated across to all the other systems we have configured.

If you run into problems, here are some common issues that my be creating the errors you encounter:
  • Make sure that the hostnames used for each cluster, load balancer, and replicator can be resolved to an IP address
  • Your firewall may be blocking incoming connections. Ensure that each host is able to communicate on all the necessary ports
  • Check your pg_hba.conf file if you are getting errors about rejected connections. You must set an entry for each IP address to trust in order for the data to be replicated successfully
This was just a short introduction into data replication using PGCluster. My next experiment will be to try and configure PGCluster for use with SE-PostgreSQL for added security. But, that's a post for another day.

7 comments:

Peter said...

When we investigated the matter a couple of months ago we ran into a problem when we wanted to use blobs within the cluster... do you know if this works in your setup?

WhoAmI? said...

Peter,

I have not investigated this as of yet. It sounds like an intriguing problem, though, and I will take a look at it.

Joe Straitiff said...

Quick comment, the two pgreplicate.conf files are missing an end tag for the second Cluster_Server_Info (it appears as a blank line -- at least in firefox...)

WhoAmI? said...

Joe --

Thanks, I've corrected this.

Carlos HernĂ¡n said...

Hi, i have a problem with pgcluster, i follow all steps to install and config files, but i don't know what happend, all start normally, but when i give a instruction how "createdb test" the system enter in a infinite bucle, the DB is created in both clyusters, but after is created never stop the command, and start to generate error: database "test" already exist, if somebody can help me, please tell me, i need this with some urgency is for my tesis. Excuse me for my english, is not perfect.

exothermic said...

Hey what about sequoia?

http://sequoia.continuent.org/HomePage

topremm said...

Hi how pgreplicator obtains query either by log or by someother means?