Thursday, July 26, 2007

PostgreSQL Replication - PGCluster

Earlier, I did an article on MySQL multi-master replication. While MySQL is a very fast database, able to provide the vast majority of features that most people need, there are definite shortcomings. As a result, I've begun looking more and more into PostgreSQL as my database of choice for my development work. The major shortcoming for me, however, has always been PostgreSQL's replication and failover support. The Slony-I project provides replication, however it is asynchronous and master-slave. In many environments, this just isn't good enough. I came across PGCluster, only to see on its project site that the latest code was for PostgreSQL 7.3.

At first, I thought that this was reason enough to abandon PostgreSQL and focus primarily on MySQL. However, I have since found that I am mistaken. The PGCluster site that I initially came across was an old site for the project. The new PGCluster site shows that PGCluster has been updated for PostgreSQL 8.2. This is great news for anyone looking to use PostgreSQL in an enterprise-wide capacity. PGCluster provides synchronous, multi-master replication. It is designed for high-availability, as well as load-balancing. While I haven't had a chance to install the latest version of PGCluster, I will be giving it a long, hard look in the near future.

While working on the Odyssi PKI project's next release, I've spent a bit more time focusing on the database aspect. Certificate Authority servers are typically considered to have high-security requirements, so they are often run using dedicated database servers. You wouldn't typically want your certificate data in the same database server with your general business data. Normally, the database instance is dedicated to the CA, and is hardened for security to prevent compromise. However, in situations where high numbers of certificates are issued, performance and scalability may become a factor. In a hosted PKI environment, for example, large numbers of certificates being issued may put a strain on the database server. In addition, CA servers have high availability requirements, particularly for CRL issuance, etc. The need for failover replication becomes apparent.

With PGCluster, it is possible to implement a database architecture that allows for the performance and reliability needed for this type of scenario. The example provided on the PGCluster website shows a simple replication scenario between 3 servers; two are located in close proximity, with the third connecting remotely over a VPN. When designing a database layout for an Odyssi PKI deployment, you would typically want to have a minimum of 2 databases, acting in a load-balanced, multi-master configuration. This will provide you with failover capability in the event of a server failure, as well as the ability to split processing between servers.

Starting with Odyssi PKI's next release, I plan to include some documentation outlining recommended architecture, best-practices, etc. for designing a PKI. Now that I have discovered PGCluster is not a dead project, I will be sure to use it in my examples.

1 comment:

Jonathan Ellis said...

I looked at pgcluster about a year ago, and went with Slony instead. There just isn't anybody using pgcluster in production to speak of, and that's a little scary.

Today Slony is still the big dog of pg replication. I would also consider the Skype replication tool (Londiste).