Wednesday, November 14, 2007

The Year of PKI? Finally?

There is an article saying to expect more PKI in 2008. Since the late 1990s, several years have been deemed the "Year of PKI". The major push by the U.S. Department of Defense to build the world's largest PKI has provided a lot of momentum and investment in the PKI space. The cost of deploying smartcards across an enterprise for use with PKI, workstation login, and physical security has been greatly reduced due to the economies of scale created by DoD PKI. Microsoft's latest PKI-related products, combined with the PKI integration built into Windows Server 2003, have lowered many barriers to entry for even small organizations. However, the question still remains: Will there ever be a "Year of PKI?" There are many obstacles that must first be overcome before we can definitively answer that question.

Many organizations have grown comfortable with the username/password combination. Passwords aren't secure? Just change the password policy to lengthen the minimum requirements, add special characters, etc. They see no need to move towards client-side certificates, smartcard login, or other strong authentication mechanisms until a major change in their environment requires such an effort. The same is true for intra-server communication. In most organizations, data being transmitted between web servers, application servers, database servers, etc. goes over the network in plaintext. The additional effort involved in configuring SSL for dozens of servers is often enough to justify overlooking this important security fix.

Ease of Use
In the past, rolling out an enterprise-wide PKI was an enormous undertaking in terms of the amount of work required. If an organization wanted to use client-side certificates for authentication, root certificates had to be manually installed on each workstation. Users were enrolled in a time-consuming, manual process. Additionally, the PKI products themselves were often too difficult to administer. A skilled PKI expert also came with a hefty salary. Today, however, many PKI products have become more administrator-friendly. Microsoft's Certificate Server, included within Windows Server 2003, makes deploying client-side certificates much easier by integrating with an organization's existing Active Directory infrastructure. However, this would require an organization to use Active Directory throughout, which many places are unwilling to do.

Interoperability
By this, I do not mean interoperability between products. There are numerous well-defined standards governing PKI that even Microsoft adheres to. I am referring to interoperability between disparate PKI setups. A PKI is a hierarchical trust system created by an organization. The root certificate serves as the basis for this trust hierarchy. The problem arises when two organizations making use of PKI want to interoperate with one another. Imagine that two PKI-enabled companies, WidgetWorld and GadgetBarn, wish to interoperate. Each organization has its own root certificate. Establishing trust between these two companies would involve exchanging root and subordinate CA certificates so that SSL and client certificates can be verified all the way up the trust chain. While there are companies such as Cybertrust or Verisign that offer CA certificate signing using a ubiquitous root, that leads us to our next, and biggest, obstacle.

Cost $$$
Commercial, hosted PKI solutions are typically very expensive. To have certificates that are globally trusted, however, they are necessary. It is possible to create a PKI that is internal to an organization. But, as we discussed earlier, a problem occurs if you want to extend that trust outside of your organization's boundaries. The inclusion of Microsoft Certificate Server in its Windows Server 2003 product, along with open-source PKI solutions, has greatly reduced the cost of deploying an internal PKI. However, this does not address the cost of implementing, maintaining, and monitoring the PKI. In addition, there are equipment costs associated with the servers, smartcards, and smartcard readers necessary to deploy a full-scale, enterprise-wide PKI.

So, given these issues, will there ever be a "Year of PKI?" The answer is "probably not." However, this does not mean that PKI adoption will not continue to grow within the enterprise arena. As more and more organizations realize the potential severity of a data security breach, they are increasingly looking at strong authentication solutions. The benefits to implementing a PKI begin to look very attractive when weighed against the nightmare of a major security breach. In addition, PKI adoption can enable an organization to implement additional security measures, such as encrypted file systems, 802.1x network authentication, code and e-mail digital signatures, and VPN access.

While it is clear there will be no global explosion of PKI use any time soon, the future of PKI adoption does look very bright. The number of PKI implementations will most likely continue to grow at an increasingly rapid pace. However, the amount of effort and investigation required to roll out a PKI -- as with any other security-related endeavor -- will ensure that the transition will not occur overnight.

Friday, October 5, 2007

PostgreSQL Replication with Slony-I

In an earlier blog post, we looked at synchronous, master-master replication in PostgreSQL using PGCluster. PGCluster's load balancing and replication features provide PostgreSQL with high availability features that are not included in the core distribution. While PGCluster does provide an adequate solution for many environments, it is not the only replication mechanism available for PostgreSQL. In addition, its drawbacks may be too great for it to be deployed in many circumstances.

In this post, we look at another replication mechanism available for PostgreSQL, Slony-I. Slony-I is an asynchronous, master-slave replication system. With PGCluster, we had the ability to load balance connections to the PostgreSQL database, knowing that data that is modified in one server will be replicated across to the other server. Additionally, its synchronous nature gave us confidence that, in the event of a failure, all completed transactions will be accounted for. With Slony-I, we run into a very different type of replication. Slony-I is a master-slave system, designed to utilize one master database, and one or more slaves. Data is replicated in a one-way fashion, from the master to the slave(s). It does not include a built-in load balancing feature like PGCluster, and there is no automatic failover from a failed master to one of the slaves. Failover must be manually configured using a third-party utility, such as heartbeat. Slony-I's asynchronous nature means that, in the event of a database failure, there may be committed transactions that have not yet been replicated across. Slony-I batches transactions for replication in order to improve performance. However, if the master fails before a batch of transactions has been replicated, those transactions are lost.
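To make the consequence of batched, asynchronous replication concrete, here is a toy Java model. It is not Slony-I code and uses none of its APIs; the class name AsyncReplicationSketch, the batch size, and the in-memory lists are all assumptions made purely for illustration. The master records every commit locally, and a separate step ships pending commits to the slave in batches, so anything committed after the last shipment exists only on the master.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of asynchronous, batched master-to-slave replication (illustration only, not Slony-I). */
public class AsyncReplicationSketch {
    private final List<String> master = new ArrayList<>(); // every commit made on the master
    private final List<String> slave = new ArrayList<>();  // commits the slave has received so far

    /** Commit on the master only; the slave is not touched. */
    public void commit(String txn) {
        master.add(txn);
    }

    /** Ship up to batchSize not-yet-replicated commits to the slave, in commit order. */
    public void shipBatch(int batchSize) {
        int from = slave.size();
        int to = Math.min(master.size(), from + batchSize);
        slave.addAll(master.subList(from, to));
    }

    /** Commits that would be lost if the master failed right now. */
    public List<String> atRisk() {
        return new ArrayList<>(master.subList(slave.size(), master.size()));
    }

    /** Snapshot of what the slave currently holds. */
    public List<String> slaveContents() {
        return new ArrayList<>(slave);
    }

    public static void main(String[] args) {
        AsyncReplicationSketch r = new AsyncReplicationSketch();
        r.commit("t1");
        r.commit("t2");
        r.commit("t3");
        r.shipBatch(2);   // only t1 and t2 reach the slave
        r.commit("t4");   // committed after the last shipment
        System.out.println("slave has: " + r.slaveContents());
        System.out.println("lost on master failure: " + r.atRisk());
    }
}
```

In this model, a master failure after the last line of main would leave the slave with t1 and t2 only. A synchronous system closes this window by not acknowledging a commit until the other nodes have it, at the cost of slower writes.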

On the surface, it may appear as though Slony-I is a less-than-ideal choice for database replication. However, upon further investigation, it becomes clear that Slony-I may be an acceptable choice, depending on the needs of your organization. In our example, we have one master and one slave. Our master is on a host called pgmaster, while the slave is on a host called pgslave. The database that we wish to replicate is called repl_test_db. The database consists of a single, basic table, repl_test_tbl.

Before we can begin replicating our data, we need to examine a few of the core concepts involved in Slony-I replication. The first is the notion of a 'cluster'. A cluster is simply a collection of database nodes that are connected together. These are the databases that will eventually be replicated. A 'replication set' describes the actual data (tables, sequences, etc.) that will be replicated. So, while a cluster describes the underlying database that will support replication, a replication set describes the underlying database objects that will be replicated.

To make our replication simpler, we will first declare some environment variables that will be used throughout the commands that we will issue. We will store these environment variables in a shell script called The contents are as follows:







Now that our environment variables have been established, we are ready to begin setting up the necessary databases. Once again, I recommend creating a shell script that will encapsulate all the commands needed to perform our setup steps. This script will first create the repl_test_db database on our master host, and then ensure that the plpgsql language is installed. This step is vital, as Slony-I requires that plpgsql be installed in order to run. Once the master database has been set up, our table will be created and populated with some simple test data. From here, we begin to set up the slave host. The slave database is created, and then a pg_dump is performed to copy the initial data from our master database into the slave database. NOTE: It is important that our master and slave databases are able to communicate with each other. This will involve modifying the pg_hba.conf file to ensure that the database permissions are set properly. For more information on how to accomplish this, see the PostgreSQL 8.2 reference manual, or the post on PGCluster replication.

For our tests, we will use a very simple table consisting of 2 columns. In our next article, we will look at the performance differences between PGCluster and Slony-I when using larger and more complex data sets. For this example, however, our simple setup will be sufficient. The SQL script looks like this:

DROP TABLE IF EXISTS repl_test_tbl;

CREATE SEQUENCE repl_test_tbl_id_seq;
CREATE TABLE repl_test_tbl (
    id INTEGER DEFAULT nextval('repl_test_tbl_id_seq') NOT NULL,
    my_str VARCHAR(32),
    -- Slony-I needs a primary key on every replicated table
    PRIMARY KEY (id)
);

INSERT INTO repl_test_tbl (my_str) VALUES ('This is my #1 string');
INSERT INTO repl_test_tbl (my_str) VALUES ('This is my #2 string');

After our databases have been created and set up, we are ready to start initializing the replication sets. The slonik utility is used to issue commands to the Slony-I replication engine. The easiest way to enter these commands is using a shell script like the one in this example. Our replication script looks like this:


echo Setting PostgreSQL environment variables...
. ./


echo Initializing master database...

echo Initializing master database data...

echo Initializing slave database...

echo Initializing the Slony-I cluster...

slonik <<_eof_
cluster name = ${CLUSTERNAME};
node 1 admin conninfo = 'dbname=${MASTERDB} host=${MASTERHOST} user=${REPLICATIONUSER}';
node 2 admin conninfo = 'dbname=${SLAVEDB} host=${SLAVEHOST} user=${REPLICATIONUSER}';
init cluster (id = 1, comment = 'Master Node');
create set (id = 1, origin = 1, comment = 'Test repl. table');
set add table (set id = 1, origin = 1, id = 1, full qualified name = 'public.repl_test_tbl', comment = 'Test repl. table');
set add sequence (set id = 1, origin = 1, id = 2, full qualified name = 'public.repl_test_tbl_id_seq', comment = 'Test repl. table PK');

store node (id = 2, comment = 'Slave Node');
store path (server = 1, client = 2, conninfo = 'dbname=${MASTERDB} host=${MASTERHOST} user=${REPLICATIONUSER}');
store path (server = 2, client = 1, conninfo = 'dbname=${SLAVEDB} host=${SLAVEHOST} user=${REPLICATIONUSER}');
store listen (origin = 1, provider = 1, receiver = 2);
store listen (origin = 2, provider = 2, receiver = 1);
_eof_

It is important to run this script as the postgres user, so make sure you su to postgres before running it. In this script, you will notice how we add our repl_test_tbl table to the replication set. It is also important to note that the repl_test_tbl_id_seq sequence must be added to the replication set as well. Slony-I requires sequences to be explicitly added to the replication set, and it also requires that every table have a primary key.

Now that our cluster and replication sets have been defined, it is time to start the replication engine and subscribe to the replication set. Subscribing to a replication set tells the database to start replicating to the defined slaves. The slon command is used to start and stop the replication engine. Once again, it is best to make use of our environment variable shell script to make things easier on us. Run the first command as postgres at a command prompt on the master host, and the second set of commands on the slave host:

Master Host:
. ./

Slave Host:
. ./

If all was successful, we are ready to subscribe to the replication sets. The shell script for performing the subscription follows:


. ./
echo Subscribing to replication set...

slonik <<_eof_
cluster name = ${CLUSTERNAME};
node 1 admin conninfo = 'dbname=${MASTERDB} host=${MASTERHOST} user=${REPLICATIONUSER}';
node 2 admin conninfo = 'dbname=${SLAVEDB} host=${SLAVEHOST} user=${REPLICATIONUSER}';
subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
_eof_

Once again, run this script as the postgres user. We are now ready to test our data replication. First, connect to the PostgreSQL master and do a SELECT to see the data in our table prior to replication. If you perform the same SELECT on the PostgreSQL slave, you should see the same data. This is the result of the pg_dump that we performed earlier. Now it is time to see if our replication is successful. Once again, connect to the master database, and, this time, INSERT data into our replicated table. If we have configured our replication properly, you should be able to connect to the slave database and see that the newly INSERTed data has been successfully replicated to the slave.

If you followed the steps listed above, you can see our replication works as expected. But how does this compare to PGCluster for data replication? As evidenced by the number of commands we had to issue above to get the replication working, Slony-I requires more administration and has a steeper learning curve. Slony-I also requires us to manually define the replication sets and the data to be replicated. With PGCluster, our databases are replicated automatically. This is due to the fact that PGCluster uses rsync as its underlying mechanism to replicate the data. However, this reliance on rsync may reduce performance, particularly with large datasets. Slony-I uses a trigger-based mechanism for replicating the data. In addition, its asynchronous nature may win out in environments that do not need the high availability that a synchronous, multi-master solution can provide.

So, it is obvious that PGCluster wins out in terms of ease of administration. But what about performance? You'll have to wait until my next article to see how each of these replication mechanisms handles larger datasets, including PGBench.

Friday, September 21, 2007

More PostgreSQL Replication

It appears as though my article on Synchronous Multi-Master Replication in PostgreSQL using PGCluster filled an important need. It has become the most read article on this blog in a very short time. As such, I've started wondering about the other replication mechanisms available for PostgreSQL, as well as for other databases, such as MySQL. While PGCluster does provide a good set of capabilities, it does have its drawbacks and limitations. As a result, the next PostgreSQL mechanism I will be evaluating is Slony-I. Slony-I is an asynchronous, master-slave replication mechanism. While this type of replication does not compare to the synchronous, multi-master capabilities of PGCluster, the Slony-I project is more mature, and the way replication is performed may be a better choice in some environments.

While I've found the official Slony-I documentation to be a bit thin when it comes to providing a good introduction, I did come across a lengthy introduction that has been quite helpful in getting things started. I am still working on my article about Slony-I, but, in the meantime, take a look at the Slony-I introduction to get familiar with some of the concepts used.

Monday, August 27, 2007

PostgreSQL Replication with PGCluster

I don't know why I am fascinated with databases. I don't work with them on a daily basis. I don't know very much about them. I don't have very much experience working with them. My experience is limited to a few simple SQL queries executed on open-source databases. However, for some reason, I have always been intrigued by the idea of large, complex databases. Perhaps it is because I see the business potential in making extremely large amounts of data available for analysis within a business. Or, perhaps it is due to the unique security challenges that exist when trying to make that data available to different groups of entities, each with their own security concerns. Regardless of the reason, I have become particularly interested in high-availability databases and database replication.

In an earlier post, I briefly mentioned the PGCluster project. PGCluster is an extension of the PostgreSQL database, designed to give it synchronous, multi-master replication. Replication is one of the areas in which PostgreSQL is lacking in comparison to proprietary databases, such as Oracle or MS SQL Server. However, after playing around with PGCluster, I have become very impressed with its capabilities. If development continues, PGCluster could offer a sound solution to a very important problem that large enterprises will encounter if they wish to roll out PostgreSQL in a high-availability situation.

To that end, this article will show the basics of setting up a synchronous, multi-master replicated instance of PGCluster. For simplicity and ease of use, I am running this tutorial using virtual instances of CentOS 5 deployed using the VMWare Player. I highly recommend you obtain this free utility, as it gives you the freedom to run virtual machine instances without having to pay too much. Once you have downloaded and installed VMWare Player, you can download a pre-built CentOS 5 instance to run in the player. I made a copy of the CentOS 5 VM in another directory so that I can run 2 separate, virtual instances of it for this tutorial. You will need to configure each VM instance to have bridged network support, utilizing DHCP. In a real-world scenario, you would use static IPs for addressing, but for the purposes of this tutorial it is not necessary. After starting the VM instances for the first time, you will need to install some compiler tools in order to build PGCluster. This includes gcc, bison, and flex, as well as their respective -devel packages. To install these packages, simply run the following from the command prompt in each VM instance:

yum -y install gcc gcc-c++ flex flex-devel bison bison-devel

Once your development environments are set up, you are ready to download and compile PGCluster. There are 2 different methods for building PGCluster. The first involves downloading a patch to the original PostgreSQL source distribution. You simply apply the patch prior to compiling PostgreSQL in order to add the PGCluster support. This method is very useful if there are other patches you wish to install prior to building PostgreSQL. The second method involves downloading the complete PGCluster distribution. This distribution includes the PostgreSQL source tree already patched with PGCluster. We will use the second, full-distribution method for simplicity. As of this writing, the latest version of PGCluster is 1.7.0rc5, and is available for download here. Download the tar/gz file and unpack it on each VM instance. Building PGCluster is as simple as running:

./configure && make && make install

This will install PGCluster to /usr/local/pgsql, by default. There is more information about the installation process available at the PGCluster website. Now that we have successfully built PGCluster, it's time to start the configuration process.

First, some terminology. PGCluster consists of 3 main components:

  • Clusters
  • Load Balancers
  • Replicators
A Cluster is simply a database instance. The data in the clusters is what is replicated. A Load Balancer exists to share the query load between all the databases in the replication scheme. Lastly, the Replicator is used to replicate, or synchronize, data between all the clusters. In our tutorial, we will build a replication scheme with the following components:
  • Two clusters, clusterdb1 and clusterdb2
  • One load balancer, pglb
  • Two replicators, pgrepl1 and pgrepl2
Our logical design will look like this when we are finished:

It is possible to install more than one PGCluster component (cluster, load balancer, replicator) on a single system. For our example, we will put 1 cluster on each VM, with the load balancer on VM 1, and the replicators on each VM. In practice, however, you may find you receive better performance by having each node run as a dedicated PGCluster component. When finished, the physical design will look like this:

Prior to configuring PGCluster, it is important to ensure that the hostnames for all the PGCluster components can be resolved to IP addresses. You may do this using DNS, or simply add the required entries to the /etc/hosts file. In our case, clusterdb1, pglb, and pgrepl1 all resolve to Both clusterdb2 and pgrepl2 resolve to

It is also important that PGCluster run as a non-privileged user. In our case, this user is postgres. You can see the steps for creating the postgres user and setting the appropriate file permissions at the PGCluster install page.

PGCluster makes use of several configuration files, each specific to the component you are installing. First, we will configure each of the clusterdb instances. In the /usr/local/pgsql/data directory, you will find 3 files that need to be modified: cluster.conf, postgresql.conf, and pg_hba.conf. The cluster.conf file defines characteristics of the database cluster, as well as the replication server that will be used. The postgresql.conf is the standard PostgreSQL configuration file. The pg_hba.conf file is used for defining PostgreSQL security and access controls. This file must be modified to trust connections originating from all other databases in the cluster. Below you will find the parameters that must be added or defined for each of these files.

cluster.conf

<rsync_option>ssh -1</rsync_option>

pg_hba.conf

host all all trust
host all all trust

postgresql.conf

tcpip_socket = true
port = 5432

Now that the cluster instances have been configured, we can configure the replicators. We have 2 replicators defined, pgrepl1 and pgrepl2. This allows for multi-master replication. The configuration files for each instance follow.

pgreplicate.conf for pgrepl1



pgreplicate.conf for pgrepl2



Once the necessary configuration changes have been made for the clusterdb and replicator instances, we can configure the load balancer. Our load balancer will be located on the same virtual machine as clusterdb1. The configuration file for the load balancer is located in /usr/local/pgsql/etc/pglb.conf. The configuration file follows below.



With our configuration out of the way, we are ready to start the services and begin working with PGCluster. The order in which you start the services is important. The order is as follows:
  1. All replicator instances
  2. All cluster instances
  3. All load balancer instances
To start the replicator services, run the following command as the postgres user on each replicator VM:

/usr/local/pgsql/bin/pgreplicate -D /usr/local/pgsql/etc

To start the clusterdb services, run the following command as the postgres user on each clusterdb VM:

/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -o "-i" start

Lastly, to start the load balancer service, run the following command as the postgres user on the load balancer VM:

/usr/local/pgsql/bin/pglb -D /usr/local/pgsql/etc

You can find more information on how to start and stop PGCluster services on this page.

With our services started, we can begin testing PGCluster. All changes made to either database, either through a direct connection or one coming in through the load balancer, should be replicated across to all the other databases. The screenshot below shows a database being created. In this case, I ran the command against the clusterdb1 database. You can see the log output for this command in the top left VM. If you look at the lower right VM, you will see the log output for the replication that takes place between the two hosts.

The log output shows our database creation and replication was successful. From here, we can create tables, views, and other SQL objects and know that they will be replicated across to all the other systems we have configured.

If you run into problems, here are some common issues that may be causing the errors you encounter:
  • Make sure that the hostnames used for each cluster, load balancer, and replicator can be resolved to an IP address
  • Your firewall may be blocking incoming connections. Ensure that each host is able to communicate on all the necessary ports
  • Check your pg_hba.conf file if you are getting errors about rejected connections. You must set an entry for each IP address to trust in order for the data to be replicated successfully
This was just a short introduction to data replication using PGCluster. My next experiment will be to try to configure PGCluster for use with SE-PostgreSQL for added security. But that's a post for another day.

Tuesday, August 7, 2007

Palm CAC Solution -- Mobile Smartcards for the DoD

Recently, Palm announced the availability of a Common Access Card solution for their Treo smartphone line. For those of you not familiar with the CAC, it is a smartcard designed for use within the U.S. Department of Defense. The CAC acts as an individual's ID card, with photos and information about the user's identity, as well as a smartcard with PKI credentials for logging into a network, encrypting e-mail, etc.

Palm's CAC solution is quite interesting, particularly with respect to how the smartphone interacts with the smartcard. Palm has developed a Bluetooth-enabled smartcard reader that also acts as a badge holder. The photo, name, and expiration information are still visible, but the smartcard chip itself makes contact with a reader. This allows your smartphone to make use of the credentials on the card when sending e-mail or accessing a network.

This is a truly innovative way to handle smartcards, and with more and more enterprises evaluating the use of PKI and strong authentication, I can see this being a viable solution for the commercial world, as well. The phone itself doesn't have any bulky externally-attached readers. This makes it possible to maintain the phone's current usability and mobility, without sacrificing the security that smartcards provide.

The possibilities are endless when discussing how this technology could be used within the enterprise. Users can log in to the corporate VPN from anywhere in the world, using mobile broadband connections or even WiFi (provided you have an adapter for your phone). E-mails can be digitally signed and encrypted prior to being sent. Given the lack of security in public WiFi hotspots, this is a capability that is long overdue. In addition, being able to encrypt the data on the phone itself, or lock down access to it without the smartcard, provides the user with confidence that a lost phone will not result in data compromise.

For more information on how this product works, see Palm's Flash demo.

Monday, July 30, 2007

Java - Implement your own Service Provider Interface

As I continue development on the next release of Odyssi PKI, I've tried to apply some of the lessons I've learned regarding extensibility in object-oriented code. Version 0.1 of Odyssi PKI was relatively static, in terms of what formats and technologies were supported. The request types that the CA was able to handle were statically defined. While you could implement the X509CertificateRequest interface to define your own request format, the rest of the CA was unable to handle the request beyond a certain point. This made for a very limited, inflexible design. Want to add XKMS support? Get ready to rework a LOT of code. As a result, this lack of preparation for the future was high on my list of things to redesign for the next release.

As I sought out the best way to handle this problem, I realized that working with X.509 certificate request formats/objects is very similar to the way Java already handles certificates, encryption keys, and signature algorithms. In each case, the algorithms and formats for these objects are implemented through the use of a Service Provider Interface (SPI). In this example, the SPI is the implementation of the Java Cryptography Extension that is configured for the JVM. If your preferred algorithm is not supported, perhaps one of the other JCE implementations has it. All that is needed is to configure the JVM to recognize the JCE implementation, and ensure that it is located on your classpath.

Upon realizing this, I decided that there must be a way to take advantage of this type of design to handle X.509 certificate request objects. I needed to look no further than Java's ServiceLoader class. The ServiceLoader class is used to locate classes that implement a given SPI interface or extend an abstract SPI base class. In our example, we have several classes and interfaces that will be used to handle X.509 certificate request objects. The first interface is X509CertificateRequest. This interface defines several methods that are common to all types of certificate requests: getPublicKey(), getSignatureAlgorithm(), etc. We also define a class called X509CertificateRequestFactory. This class is a factory class that creates X509CertificateRequest objects. It is also the class that will make use of the ServiceLoader object. Lastly, we need to define an interface called X509CertificateRequestFactorySpi. This interface defines the methods that must be present in all of our SPI implementations. We'll look at X509CertificateRequestFactory first, as it is the most important.

The X509CertificateRequestFactory class has a static method called getInstance() that takes as its parameter a certificate request format. This format could be PKCS #10, XKMS, CRMF, or any other format that we like, so long as we have an SPI implementation for it. The constructor for this class is private, and takes an X509CertificateRequestFactorySpi object as its only parameter. This is done because the SPI implementation object will be used by the factory class to provide all of its underlying functionality.

Now that we've discussed the general structure of the factory class, let's look at the SPI interface and the methods it defines. Since the X509CertificateRequestFactorySpi interface is used to provide all of the underlying functionality for the request factory, it must provide methods such as getCertificateRequest(...) that will be used to generate the request object itself from another object (byte[], String, XML document, etc.). In addition, another method, isSupported(), is defined, which takes as its only parameter a String denoting the request format. This method returns true if the request format is supported by this SPI implementation.
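The relationship between the factory and its SPI can be sketched in a few lines of Java. This is a minimal illustration, not the Odyssi PKI source: the type names follow the ones used above, but the method bodies, the single-method X509CertificateRequest stand-in, the DemoSpi provider with its made-up "DEMO" format, and the demo() helper are all assumptions introduced for the example.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the factory/SPI split described above; all bodies are illustrative assumptions. */
public class SpiSketch {

    /** Minimal stand-in for the request interface (the real one has more methods). */
    public interface X509CertificateRequest {
        String getSignatureAlgorithm();
    }

    /** Contract that every SPI implementation must satisfy. */
    public interface X509CertificateRequestFactorySpi {
        boolean isSupported(String format);
        X509CertificateRequest getCertificateRequest(byte[] encoded);
    }

    /** Hypothetical provider: pretends the encoded bytes are just the algorithm name. */
    public static class DemoSpi implements X509CertificateRequestFactorySpi {
        @Override
        public boolean isSupported(String format) {
            return "DEMO".equalsIgnoreCase(format);
        }

        @Override
        public X509CertificateRequest getCertificateRequest(byte[] encoded) {
            String alg = new String(encoded, StandardCharsets.US_ASCII);
            return () -> alg;
        }
    }

    /** Factory facade: holds one SPI object and delegates all real work to it. */
    public static final class X509CertificateRequestFactory {
        private final X509CertificateRequestFactorySpi spi;

        // Private: callers obtain instances through a static lookup, never directly
        private X509CertificateRequestFactory(X509CertificateRequestFactorySpi spi) {
            this.spi = spi;
        }

        public X509CertificateRequest getCertificateRequest(byte[] encoded) {
            return spi.getCertificateRequest(encoded);
        }
    }

    /** Wraps the demo provider in a factory and parses a fake request. */
    public static String demo() {
        X509CertificateRequestFactorySpi spi = new DemoSpi();
        if (!spi.isSupported("demo")) {
            throw new IllegalStateException("format not supported");
        }
        X509CertificateRequestFactory factory = new X509CertificateRequestFactory(spi);
        byte[] fake = "SHA1withRSA".getBytes(StandardCharsets.US_ASCII);
        return factory.getCertificateRequest(fake).getSignatureAlgorithm();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the split is that calling code only ever sees the factory, so a new request format can be supported by adding another SPI implementation without touching any caller.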

So how do we use the ServiceLoader class? The ServiceLoader class looks for a provider configuration file located in the META-INF/services directory of the SPI implementation's JAR file. This file is named after the fully-qualified binary name of the service type, meaning the type that is passed to ServiceLoader.load(). That is, if our SPI interface's full name is net.odyssi.certserv.x509.request.X509CertificateRequestFactorySpi, the provider configuration file would be called net.odyssi.certserv.x509.request.X509CertificateRequestFactorySpi, and would be located in the META-INF/services directory of the implementation JAR file. This file contains the fully-qualified name(s) of the SPI implementation class(es) contained within the JAR, one per line. So, for example, if we wanted to support PKCS #10 certificate requests, our provider configuration file may have one entry that reads:


The PKCS10CertificateRequestFactory class does all of the dirty work of generating an X509CertificateRequest object from an arbitrary source, such as an InputStream or a byte[]. Now that we've defined our provider configuration file with our SPI implementation class, we're ready to put the ServiceLoader class to work.

Within the getInstance() method of the X509CertificateRequestFactory class, we would have the following:

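public static X509CertificateRequestFactory getInstance(String format) {
    // Discover every SPI implementation registered under META-INF/services
    ServiceLoader<X509CertificateRequestFactorySpi> sl = ServiceLoader.load(X509CertificateRequestFactorySpi.class);
    Iterator<X509CertificateRequestFactorySpi> it = sl.iterator();
    while (it.hasNext()) {
        X509CertificateRequestFactorySpi impl = it.next();
        if (impl.isSupported(format)) {
            return new X509CertificateRequestFactory(impl);
        }
    }
    // No registered implementation supports the requested format
    return null;
}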

In this method, the ServiceLoader.load(X509CertificateRequestFactorySpi.class) call instructs the ServiceLoader to find all implementations of the X509CertificateRequestFactorySpi interface. From here, we iterate through the returned results and see if an implementation can be found for the given request format. If none is found, the method returns null. The returned X509CertificateRequestFactory would then use the SPI implementation to provide its underlying functionality for getCertificateRequest(...), etc.
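If you want to try the lookup end to end without building a JAR, the sketch below writes a provider configuration file into a temporary directory and points a URLClassLoader at it. RequestFactorySpi and Pkcs10Provider are hypothetical stand-ins for the types in this post, cut down to isSupported(). The temp-directory trick is purely for demonstration; a real provider would ship the file inside its JAR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceLoader;

class ProviderConfigDemo {

    /** Hypothetical, cut-down stand-in for X509CertificateRequestFactorySpi. */
    public interface RequestFactorySpi {
        boolean isSupported(String format);
    }

    /** Hypothetical provider; ServiceLoader needs a public class with a public no-arg constructor. */
    public static class Pkcs10Provider implements RequestFactorySpi {
        public Pkcs10Provider() { }

        @Override
        public boolean isSupported(String format) {
            return "PKCS10".equalsIgnoreCase(format);
        }
    }

    /** Returns true if a provider registered through the configuration file supports the format. */
    static boolean isFormatAvailable(String format) {
        try {
            // Recreate the layout an implementation JAR would have:
            // META-INF/services/<binary name of the service type>
            Path root = Files.createTempDirectory("spi-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            Path config = services.resolve(RequestFactorySpi.class.getName());
            // The file body lists provider class names, one per line.
            Files.write(config, Pkcs10Provider.class.getName().getBytes(StandardCharsets.UTF_8));

            // Put the directory on a class loader's search path, much as a JAR would be.
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] { root.toUri().toURL() },
                    ProviderConfigDemo.class.getClassLoader())) {
                for (RequestFactorySpi spi : ServiceLoader.load(RequestFactorySpi.class, loader)) {
                    if (spi.isSupported(format)) {
                        return true;
                    }
                }
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isFormatAvailable("PKCS10"));
        System.out.println(isFormatAvailable("CRMF"));
    }
}
```

Note that the configuration file is named after the service type that is handed to ServiceLoader.load(), while its contents name the implementation class. The sketch does not clean up its temporary directory.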

The ServiceLoader class provides a great mechanism for adding functionality and future-proofing your applications. Unlike other mechanisms such as Spring's dependency injection, no modifications are necessary for your code to take advantage of the new SPI implementation. The SPI implementation JAR file simply needs to be located on the classpath for the functionality to be available. By making use of a well-defined SPI, it is possible to interchange components within your application without losing functionality. It also provides a great way of adding new capabilities to your already existing applications. For more information about ServiceLoader, take a look at the javadocs, or this article.

Thursday, July 26, 2007

PostgreSQL Replication - PGCluster

Earlier, I did an article on MySQL multi-master replication. While MySQL is a very fast database, able to provide the vast majority of features that most people need, there are definite shortcomings. As a result, I've begun looking more and more into PostgreSQL as my database of choice for my development work. The major shortcoming for me, however, has always been PostgreSQL's replication and failover support. The Slony-I project provides replication; however, it is asynchronous and master-slave. In many environments, this just isn't good enough. I came across PGCluster, only to see on its project site that the latest code was for PostgreSQL 7.3.

At first, I thought that this was reason enough to abandon PostgreSQL and focus primarily on MySQL. However, I have since found that I was mistaken. The PGCluster site that I initially came across was an old site for the project. The new PGCluster site shows that PGCluster has been updated for PostgreSQL 8.2. This is great news for anyone looking to use PostgreSQL in an enterprise-wide capacity. PGCluster provides synchronous, multi-master replication. It is designed for high availability, as well as load balancing. While I haven't had a chance to install the latest version of PGCluster, I will be giving it a long, hard look in the near future.

While working on the Odyssi PKI project's next release, I've spent a bit more time focusing on the database aspect. Certificate Authority servers are typically considered to have high-security requirements, so they are often run using dedicated database servers. You wouldn't typically want your certificate data in the same database server with your general business data. Normally, the database instance is dedicated to the CA, and is hardened for security to prevent compromise. However, in situations where high numbers of certificates are issued, performance and scalability may become a factor. In a hosted PKI environment, for example, large numbers of certificates being issued may put a strain on the database server. In addition, CA servers have high availability requirements, particularly for CRL issuance, etc. The need for failover replication becomes apparent.

With PGCluster, it is possible to implement a database architecture that allows for the performance and reliability needed for this type of scenario. The example provided on the PGCluster website shows a simple replication scenario between 3 servers; two are located in close proximity, with the third connecting remotely over a VPN. When designing a database layout for an Odyssi PKI deployment, you would typically want to have a minimum of 2 databases, acting in a load-balanced, multi-master configuration. This will provide you with failover capability in the event of a server failure, as well as the ability to split processing between servers.

Starting with Odyssi PKI's next release, I plan to include some documentation outlining recommended architecture, best practices, etc. for designing a PKI. Now that I have discovered PGCluster is not a dead project, I will be sure to use it in my examples.

Friday, July 20, 2007

Odyssi PKI - 1,000 Downloads

As of this morning, version 0.1 of Odyssi PKI has been downloaded 1,000 times. My thanks to all those who downloaded the program. I am currently working on version 0.2, which will represent a major rewrite and restructuring of the entire codebase. Version 0.1 was more intended as a proof-of-concept. In writing it, I was able to experiment with several technologies I hadn't played with before, like Spring or Hibernate. The development turned out to be a great learning experience.

Version 0.2 will be much better from a design and functionality standpoint. It is being rewritten from the ground up, and will include many functions and features that were missing in version 0.1. This includes:

  • Easier CA administration, through the use of a web-based admin console
  • Easier creation and management of Registration Authorities
  • Enhanced security requirements, such as enforcement of strong authentication in the Registration Authority console
  • CRL support
  • XKMS (possibly)
  • Enhanced certificate template support, allowing admins to define the characteristics of different types of certificates that will be generated
  • Support for numerous X.509 certificate extensions
  • Stand-alone release (using an included distribution of Apache Tomcat) for easier deployment and administration
This is just a small list of the features I plan on including in the next version. I have no firm dates on when it will be released, as I'm working on it in my free time (which is minimal, now that I have a toddler). However, work is progressing quickly and I hope to have something posted really soon.

Wednesday, May 30, 2007

Free at Last, Free at Last! FiOS is Here!

This morning, I was awakened by the sound of jackhammers outside of our house. Normally, jackhammers at 7:00 would be quite an unpleasant situation. However, this morning, that sound is a welcome one. Why? Because it means that, soon, there will be a fiber optic cable buried in the ground on our street. And it is this cable that represents my freedom. Freedom from our local cable company, Comcast.

Anyone who has ever had Comcast for their cable or Internet service can most likely relate to my feelings about the company. Slow service. Outrageous prices (and frequent price increases). Terrible customer support. The list goes on, and on. Soon, however, I will be able to break free of the shackles of oppression (Comcast), and experience the freedom that is Verizon FiOS. Now, don't get me wrong. I'm well aware of the fact that FiOS will likely cost me just as much as Comcast does right now. However, the mere fact that I'm able to get away from Comcast in the first place makes it all worth it.

When I purchased my house (about 6 years ago), my bill for cable and Internet service was about $80. This included basic cable, and a 4Mbps (advertised) Internet connection. At the time, I had no complaints whatsoever. My cable rarely, if ever, went out. And my Internet connection was always speedy and reliable. Over time, however, the situation has changed dramatically. My combined bill is now over $100. And what has the additional $20 gotten me in 6 years? 2 additional TV channels (only one of which I'm remotely interested in), and an increase in the advertised Internet connection speed. Note the word advertised. The advertisement in question is quick to point out that this is a maximum connection speed. In other words, Comcast could provide me with all of 1bps, and would still be offering the service they advertise.

In reality, my Internet service has gotten consistently worse over the last 6 years. In the first 3 years that I lived there, I lost service only twice. In both instances, I was able to resolve my problem by simply disconnecting my cable modem, waiting a couple of minutes, and reconnecting it. Now, however, I am faced with connection problems on an almost weekly basis. And, sadly, most of these are issues I am unable to resolve. Periodically, I will end up with little to no connectivity. The indicator light on the modem tells me that there is a signal coming in. But, for whatever reason, I can't connect to anything. Other times, I am faced with DNS resolution issues. I can get out if I know the IP address. But, good luck relying on Comcast's DNS servers. What upsets me most, though, is that my downstream service has gotten progressively slower over the last 3 years. My bill has gone up, my connection speed has gone down.

I'm sure that many of these problems boil down to one thing: Comcast doesn't feel compelled to spend money on improving service when they've already got a monopolistic hold on a market. Why invest in infrastructure improvements if your customers don't have an alternative? Such sentiment exists whenever a company has a monopoly. However, when an attractive alternative opens up, customers who feel abused by a monopoly are often quick to migrate to that alternative. Look at Microsoft, and the monopoly it has on the software (particularly Operating System) market. Consumers, fed up with the ongoing instability and security issues found in Windows, have begun migrating to Apple's Mac OS X. While Apple's market share is still small compared to Microsoft, it is growing steadily, particularly in the notebook computer segment. When I walk through our neighborhood, I inevitably hear someone talking to one of the workers laying Verizon's fiber optic cable. The discussion always seems to start out the same way: "When can I call and have FiOS installed at our house?" Consumers are chomping at the bit to find a better alternative.

While Comcast still has a stranglehold on numerous markets (including ours, for the time being), this situation demonstrates how quickly things can change in any market. Dell at one point had a commanding lead over its competitors, based primarily on its reputation for quality and good customer service. However, recent declines in quality combined with poor customer support due to outsourcing have caused Dell's market share to slip, and its reputation to be tarnished. Ed Catmull, one of the creative geniuses behind Pixar, once said, "Quality is the best business plan." By offering its customers sub-par products at an ever-increasing price, Comcast is risking losing those customers to competitors such as Verizon or DirecTV. I, for one, will be scheduling an installation appointment just as soon as the Verizon trucks pull out of our street.

Friday, May 11, 2007

New Odyssi PKI Project Location

I have modified the SourceForge project location for OdyssiCS. The project has moved to a new page and has been given a new project name, Odyssi PKI, to reflect the overall goals of the project. With the inclusion of some additional features, such as an OCSP responder and GUI-based certificate/key management tools, it became clear that I was developing more than just a Certificate Authority server. As a result, I've renamed the project Odyssi PKI to reflect the suite of tools that will eventually be released. Although the project page has changed locations, the website for the project remains the same.

Friday, May 4, 2007

OdyssiCS Update and Milestones

This morning, I logged into the SourceForge project page for OdyssiCS. I was curious to see how many downloads of the application have occurred. To my surprise, the number of downloads since release 0.1 last year has topped 800. I don't know how many of these downloads resulted in any substantial use, but it's still refreshing to know that people are at least looking at it. To all of you that have downloaded and tried OdyssiCS, thank you. And, please, give me your feedback! It will go a long way to helping me in my development.

I have been working on version 0.2 in what little spare time I have. I've been fortunate to learn a great deal about Hibernate, Spring, and general Java security since the first release. The main areas I have been focusing on for release 0.2 are:

  • Security
  • Ease of Administration
  • Inclusion of more advanced features
Release 0.1 was rather weak in terms of security. While I did perform basic group membership checks, I realized that I was not designing for security as much as I'd originally intended. As a result, I have gutted a great deal of the original code to ensure that it is designed with security in mind from the ground up. I've also been spending a bit of time working on a secure JAAS-based application framework that I hope to have included in version 0.3.

Administration in release 0.1 was also quite lacking. In an effort to get the first release out there, I didn't spend as much time on developing an administrative interface. This includes both CA server administration, as well as certificate administration for Registration Authorities. I've sketched out quite a few ideas that I would like to incorporate into the web interface for administration. Hopefully, this will make OdyssiCS easier to deploy and maintain, making it viable for some real-world work. Additionally, for version 0.3 I plan to include an embedded Tomcat server and Windows GUI installer to make installation and configuration even easier. My plans for version 0.3 are still quite sketchy, but hopefully it will represent a really high-quality, feature-rich release.

Lastly, there was quite a bit of functionality missing in version 0.1. This is being addressed in version 0.2, starting with the use of certificate extensions and templates. Part of the power of X.509 certificates is the ability to include certificate extensions. These extensions outline additional characteristics about the certificate, such as the constraints that exist pertaining to what it can be used for, CRL information, and policies that apply to the certificate. I had some code in version 0.1 for working with extensions, but just wasn't happy with how it turned out. I shelved it, deciding it would be best for the next release. Certificate templates provide a way to define the characteristics of the different types of certificates a CA can create. For example, SSL server certificates might have different properties (extensions, validity periods, key sizes, etc.) from e-mail certificates. This will add a great deal of functionality to OdyssiCS, and I look forward to having that code completed.
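As a rough illustration of the template idea, here is a minimal sketch of what a certificate template might carry. Everything in it is hypothetical: the CertificateTemplate class, its field names, and the values for the two sample templates are made up for this example and are not taken from the OdyssiCS code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical template: the settings a CA applies to one kind of certificate. */
final class CertificateTemplate {
    final String name;
    final int validityDays;
    final int minKeySizeBits;
    final List<String> extendedKeyUsages;

    CertificateTemplate(String name, int validityDays, int minKeySizeBits,
                        String... extendedKeyUsages) {
        this.name = name;
        this.validityDays = validityDays;
        this.minKeySizeBits = minKeySizeBits;
        // Defensive copy so the template cannot be changed after creation
        this.extendedKeyUsages =
                Collections.unmodifiableList(Arrays.asList(extendedKeyUsages.clone()));
    }

    /** True if a requested key size satisfies this template's minimum. */
    boolean acceptsKeySize(int bits) {
        return bits >= minKeySizeBits;
    }

    // Illustrative values only
    static final CertificateTemplate SSL_SERVER =
            new CertificateTemplate("ssl-server", 365, 2048, "serverAuth");
    static final CertificateTemplate EMAIL =
            new CertificateTemplate("email", 730, 1024, "emailProtection");
}
```

With something like this, the CA could check a request against the chosen template (for example, SSL_SERVER.acceptsKeySize(1024) is false here) before any certificate is built.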

Some of the features I plan on implementing for version 0.2 include:
  • Certificate revocation, with full CRL support
  • OCSP (Online Certificate Status Protocol) for reporting certificate revocation information
  • Completely redesigned GUI for both end-entities and administrators
  • Enhanced security features throughout all domains of the application
  • Revised SQL schema, and SQL scripts for MySQL, PostgreSQL, and other SQL databases
  • X.509 certificate extensions and certificate templates
This is just a sample of some of the things I'm working on for the next release. I don't have a timeline for it, as my free time for development has been limited. Please feel free to leave a comment if there are additional features you would like me to investigate for this release. I've already started pondering what I want for version 0.3 (XKMS, JAAS-based security framework, JMX management, etc.) so please pass on your suggestions!

Monday, April 2, 2007

Hibernate CompositeUserType and X509Certificate

Often, when working with Hibernate as your ORM tool, you find the need to persist an object to the database that doesn't match one of the supported types. For example, what if you wanted to persist an XML object? One such way would be to convert the XML to a String type and persist it that way. Of course, that messes up our object model, doesn't it? Wouldn't it be nice if a setXXX() method could handle the XML object directly? Using Hibernate's UserType object, we have the flexibility to handle these types of conversions outside of our object model.

I was faced with a very similar problem while reworking the way X.509 certificates are persisted in OdyssiCS. Previously, I would convert the certificate to a byte[] and store its base64 encoded representation in the database. This resulted in a very ugly object model. I contemplated making use of a UserType object for my X.509 certificate, but was then faced with a secondary problem: How would I be able to perform queries against properties of the certificate? What if I wanted to locate a certificate with a specific serial number? One approach would be to maintain those properties separately. However, this results, again, in a polluted object model. After further review, I decided upon Hibernate's CompositeUserType object. A CompositeUserType is a close relative of UserType that allows for properties to be dereferenced when performing a Hibernate query. Since I couldn't find much information about how to work with CompositeUserType objects, I decided to post some of my findings here to serve as a reference to others. First, let's look at what we want our end result to be.

In my object model, I have a CertificateModelImpl class that contains one method related to what we're doing with CompositeUserType: getCertificate(). This method returns the X.509 certificate object. Our database has a table called certificate_tbl with the following columns:

  • ID -- A simple incremented row identifier
  • SUBJECT_ID -- Corresponds to the table containing our subject distinguished names
  • START_DATE -- The start of the certificate validity
  • END_DATE -- The expiration date for the certificate
  • SERIAL_NUMBER -- The certificate serial number
  • CERTIFICATE_DATA -- Contains the certificate in byte[] format
In our relational model, SUBJECT_ID is a foreign key pointing to another table. For our discussion, it is not relevant. We would like to focus on the remaining columns. Our end goal is this: We want to be able to call setCertificate() on CertificateModel, passing an X509Certificate object as a parameter. We also need to be able to use properties of the certificate, such as serial number and expiration date, in a Hibernate query. We want to store the certificate properties in the corresponding columns of the table, but don't want them available to our object model. If these properties were available, we could run into an issue with data inconsistency.

To start implementing our X509CertificateUserType object, we must implement a couple of preliminary methods. The returnedClass() method tells Hibernate what type of object we are dealing with. In our case, we return an X509Certificate.class object. The equals() and hashCode() methods are pretty standard fare, acting as they normally do in Java. There are a couple of other methods that must be implemented, such as deepCopy() and isMutable(). Take a look at the Hibernate Javadocs for more information on what the remaining methods accomplish.

Our first goal is to be able to store an X509Certificate object in the database as a byte[] and retrieve it later as an X509Certificate object. The two methods of CompositeUserType that accomplish this are nullSafeSet() and nullSafeGet(). The nullSafeSet() method is responsible for converting the certificate to a byte[] and storing it in the appropriate column. This is also where we store the other certificate properties, such as expiration information and serial number. For our purposes, this is what the implementation of this method looks like:

public void nullSafeSet(PreparedStatement preparedStatement, Object object,
        int i, SessionImplementor sessionImplementor) throws HibernateException, SQLException {
    X509Certificate cert = (X509Certificate) object;
    // Columns i through i + 2 hold the queryable certificate properties
    Hibernate.DATE.nullSafeSet(preparedStatement, cert.getNotBefore(), i);
    Hibernate.DATE.nullSafeSet(preparedStatement, cert.getNotAfter(), i + 1);
    Hibernate.BIG_INTEGER.nullSafeSet(preparedStatement, cert.getSerialNumber(), i + 2);

    try {
        // Column i + 3 holds the encoded certificate itself
        preparedStatement.setBytes(i + 3, cert.getEncoded());
    } catch (CertificateEncodingException e) {
        throw new HibernateException(e);
    }
}

You see here that we first set the additional properties for the certificate. How do we know which columns the values are set in? This is set as part of our Hibernate mapping file, which I will explain later. From there, we convert the certificate to a byte[] and set that in the database as well. We're now ready to implement the method for retrieving the certificate from the database. Our nullSafeGet() implementation looks like this:

public Object nullSafeGet(ResultSet resultSet, String[] strings,
        SessionImplementor sessionImplementor, Object object)
        throws HibernateException, SQLException {
    X509Certificate cert = null;
    // strings[] holds the mapped column names; index 3 is the certificate data
    byte[] certData = resultSet.getBytes(strings[3]);
    if (!resultSet.wasNull()) {
        try {
            CertificateFactory cf = CertificateFactory.getInstance("X.509");
            cert = (X509Certificate) cf.generateCertificate(new ByteArrayInputStream(certData));
        } catch (CertificateException e) {
            throw new HibernateException(e);
        }
    }
    return cert;
}

Here, you'll see that we create a CertificateFactory object to parse our byte[] into an X509Certificate object. At no point do we concern ourselves with the additional certificate properties we set earlier. Our only concern is the certificate itself. At this point, we can store and retrieve an X509Certificate object with no problems.
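The decoding step can be tried on its own, outside of Hibernate. The sketch below is just that step pulled into a plain class; CertificateDecodeSketch, decode(), and isParseable() are names I made up for this example. It shows the two cases worth remembering: a null value (the SQL NULL case) yields no certificate, and bytes that are not a certificate surface as a CertificateException, which is what nullSafeGet() wraps in a HibernateException.

```java
import java.io.ByteArrayInputStream;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

class CertificateDecodeSketch {

    /** Parses encoded certificate bytes; returns null when there is nothing to parse. */
    static X509Certificate decode(byte[] certData) throws CertificateException {
        if (certData == null) {
            return null; // corresponds to a SQL NULL in the certificate column
        }
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        return (X509Certificate) cf.generateCertificate(new ByteArrayInputStream(certData));
    }

    /** True only if the bytes decode to a certificate. */
    static boolean isParseable(byte[] certData) {
        try {
            return decode(certData) != null;
        } catch (CertificateException e) {
            // Corrupt or truncated column data ends up here
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isParseable(new byte[] { 1, 2, 3 }));
    }
}
```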

We now need the ability to reference properties of an X509Certificate in a Hibernate query to simplify searching. The getPropertyNames(), getPropertyTypes(), getPropertyValue(), and setPropertyValue() methods of CompositeUserType assist with this goal. The getPropertyNames() method returns the names of the properties we wish to make available when querying for a certificate. In our case, these values are:

  • issueDate
  • expirationDate
  • serialNumber
  • certificateData
The getPropertyTypes() method specifies the Hibernate type that corresponds to each of these properties. So, in our case, the following types would be returned:
  • Hibernate.DATE
  • Hibernate.DATE
  • Hibernate.BIG_INTEGER
  • Hibernate.BLOB
The getPropertyValue() method is responsible for determining which property is being queried and then returning the appropriate value. In our case, the implementation looks like this:

public Object getPropertyValue(Object object, int i) throws HibernateException {
    if (object == null) {
        return null;
    }
    X509Certificate cert = (X509Certificate) object;
    // i is the property's position in the getPropertyNames() array
    if (i == 0) {
        return cert.getNotBefore();
    } else if (i == 1) {
        return cert.getNotAfter();
    } else if (i == 2) {
        return cert.getSerialNumber();
    } else if (i == 3) {
        try {
            return cert.getEncoded();
        } catch (CertificateEncodingException e) {
            throw new HibernateException(e);
        }
    } else {
        return null;
    }
}

The final method, setPropertyValue(), does nothing in our implementation. We, obviously, don't want the certificate properties to be modifiable, so our method implementation simply does nothing and then returns.
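One thing that is easy to get wrong is the ordering. The position of a name in getPropertyNames() is the index passed to getPropertyValue(), and in the code above it is also the column offset used in nullSafeSet(). The sketch below shows that contract in plain Java, with no Hibernate involved. CertFields is a hypothetical stand-in for the four values read from an X509Certificate, and valueOf() is a helper I added for illustration.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Date;

class PropertyIndexSketch {

    // Same names, in the same order, as listed for getPropertyNames() above
    static final String[] PROPERTY_NAMES = {
        "issueDate", "expirationDate", "serialNumber", "certificateData"
    };

    /** Hypothetical stand-in for the four values read from an X509Certificate. */
    static final class CertFields {
        final Date notBefore;
        final Date notAfter;
        final BigInteger serialNumber;
        final byte[] encoded;

        CertFields(Date notBefore, Date notAfter, BigInteger serialNumber, byte[] encoded) {
            this.notBefore = notBefore;
            this.notAfter = notAfter;
            this.serialNumber = serialNumber;
            this.encoded = encoded;
        }
    }

    /** Mirrors getPropertyValue(): the index selects the property. */
    static Object getPropertyValue(CertFields cert, int i) {
        switch (i) {
            case 0: return cert.notBefore;
            case 1: return cert.notAfter;
            case 2: return cert.serialNumber;
            case 3: return cert.encoded;
            default: return null;
        }
    }

    /** Looks a property up by name, by way of its position in PROPERTY_NAMES. */
    static Object valueOf(CertFields cert, String propertyName) {
        int i = Arrays.asList(PROPERTY_NAMES).indexOf(propertyName);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown property: " + propertyName);
        }
        return getPropertyValue(cert, i);
    }

    public static void main(String[] args) {
        CertFields cert = new CertFields(new Date(0L), new Date(1000L),
                BigInteger.valueOf(42), new byte[] { 1 });
        System.out.println(valueOf(cert, "serialNumber"));
    }
}
```

If the order of the names, the types, and the columns ever disagree, nothing will complain at compile time, so it is worth keeping all three lists side by side.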

Now that we have implemented the necessary methods, we need to create the Hibernate mapping for our object. In our CertificateModelImpl.hbm.xml file, we have the following mapping for our certificate property:

<property name="certificate" type="net.odyssi.certserv.dao.hibernate.model.X509CertificateUserType">
    <column name="start_date" sql-type="date" not-null="false"/>
    <column name="end_date" sql-type="date" not-null="false"/>
    <column name="serial_number" sql-type="integer" not-null="false"/>
    <column name="certificate_data" sql-type="blob" not-null="false"/>
</property>

That's all there is to it! Hibernate can now store and retrieve X509Certificate objects with no problem, and queries can reference properties of that certificate. To see the finished product, take a look at X509CertificateUserType in the OdyssiCS CVS repository.

Friday, March 30, 2007

Security-Enhanced PostgreSQL

Recently, I came across a posting announcing the release of a SELinux-enhanced version of PostgreSQL. SE-PostgreSQL makes use of the labeling and Mandatory Access Control (MAC) features that SELinux provides. This is a tremendous addition to enterprise security, particularly in the government and financial fields.

What SE-PostgreSQL provides is a mechanism for controlling access to information in a database in a very fine-grained manner. The example provided in the announcement demonstrates this using a table containing beverage information. In this example, the table contains several columns related to different types of beverages (name, price, type of beverage, quantity on hand, etc.). The table is then altered to require an SELinux security context to access specific parts of this data. For example, one security context is required to access the table at all. An additional, higher-security context is required to access rows where the beverage is not a soft drink. If you look at the output provided in the sample queries, you will see that these rules are applied based on the user's security context. The amount of customization doesn't end there, however. You can apply access control based on rows, columns, tables, databases, etc. While this level of customization may be intimidating to smaller enterprises, larger enterprises will welcome the flexibility it gives them.

This type of capability has huge ramifications for the use of PostgreSQL, Linux, and open-source in general within the government and financial sectors. Data of different government classification levels can now reside in the same database. A unified database can be used in financial institutions with assurance that the information cannot be modified by unprivileged or unauthorized users. My small amount of SQL experience has largely revolved around open-source databases, such as MySQL and PostgreSQL. My day job does not require much SQL work, so as a result I only play around with it in my free time. While MySQL seems to be the most widely used, my personal preference is for PostgreSQL. True, it lacks some of the features MySQL has, such as clustering. However, it more than makes up for it in other ways, such as its strict transaction handling abilities. I've always felt PostgreSQL's security architecture was superior as well, with its ability to use external authentication sources (Kerberos, LDAP, etc.) for users. The addition of SELinux support to this mix has put PostgreSQL in a class unmatched by any other open-source database (and many closed-source ones as well).

Thursday, March 15, 2007

CentOS 5-beta -- First Impressions

Last night, I installed the beta release of CentOS 5 to take it for a test drive. For those of you looking for screenshots, there are several screenshots of Red Hat Enterprise Linux (RHEL) 5 available here. Although the artwork is different in CentOS, everything else is the same. These screenshots should give you a general idea of what the experience is like. While I haven't had a chance to play with all of the new and enhanced features, I've been quite impressed with what I have seen so far. I started by installing the beta on my HP Pavilion zt3000 laptop. My biggest criticism of those who claim that "Linux is ready for the desktop" has been issues with running it on a laptop. Between ACPI issues, poor hardware support, and an inability to function the way a laptop should, I have been less than thrilled with several Linux distributions.

For starters, I was pleasantly surprised with the updates to Anaconda, the system installer. The entire install process has been streamlined, requiring far less user intervention, even for more advanced configurations. One of the things I like most about Windows XP installations is that, for the most part, it is "set it, and forget it". You answer a few questions about networking and such, and the rest of the install is automated. CentOS 5 follows this philosophy with the updated installer.

One thing that was obvious from the first time I booted into CentOS 5 is that it is FAST. Really fast. In CentOS 4.4, it took approximately 10 seconds from login to GNOME desktop. With CentOS 5, this time has been reduced to 2 seconds. While this is a pretty trivial achievement, it goes a long way to improving the usability of the system. Speaking of usability, CentOS won me over during the boot process. Why? Because X Windows, by default, was set to my laptop's optimum resolution: 1680x1050. In every other Linux distribution I have used, it defaulted to some other non-widescreen resolution, and I was forced to manually change the configuration file. Not any more. In addition, when I later plugged the laptop into its docking station to use my external 19" LCD, it immediately changed its resolution to the appropriate 1280x1024. This is such a long overdue change in system behavior.

Once I was logged in, I began taking a quick tour of the system. The new Clearlooks theme looks great, and the crispness of screen fonts is incredible. I've never seen a Linux distribution that displays fonts this clearly. This release also includes Compiz, the compositing engine for X Windows. Compiz provides all the nifty (yet pointless) visual effects for the desktop, similar to what Mac OS X is capable of. This was one of the things I was most curious about, as I have not had a chance to work with a Compiz-enabled system. My impressions? Not bad. The wobbly windows are definitely a neat trick, but I don't like that it takes a split second for the windows to snap into place and regain their focus. One thing I didn't get a chance to investigate was the load on the CPU with Compiz enabled. Compiz is OpenGL accelerated, and I wanted to make sure that the extra load was forced off to the graphics card instead of the CPU. I'll take a look at this tonight.

This evening, I plan to investigate some of the server and developer-oriented features of CentOS 5. Since I spend most of my time in Linux doing Java development work, I'm curious as to the performance of Eclipse, IntelliJ IDEA, MySQL/PostgreSQL, and Tomcat. If the server performance has increased in the same way as desktop performance, I think I will find myself spending a lot more time in Linux, and a lot less time using Windows XP.

Wednesday, March 14, 2007

Smartcards at Disneyland?

I stumbled upon this blog entry, showing a picture of the new hotel room keys in use at Disneyland Paris. You're not mistaken, they look like smartcards. It seems every year in recent memory has been called, "The Year of the Smartcard" or, "The Year of PKI". I can certainly imagine some innovative uses for these smartcards. Since they can be used for payments at all of the restaurants and stores within the park, what about a Kerberos-enabled payment system? Or, load the cards with credentials enabling a user to download photos from their vacation when they get home (so long as they have a reader attached to their PC). While I'm not sure what the backend technology behind the cards does, it's not the first time that Disney has used cutting-edge security technologies at their parks and resorts. In 2005, Disney implemented biometric finger scans for all guests at Walt Disney World. However, they used similar technology for annual passholders even before that.