Monday, August 11, 2008

Managing your database schema - upgrading

One of the first problems I encountered at Pointserve (my current place of employment), was how to manage our database schema. Our software comes packaged as an ear, but relies on an everchanging database schema.

My predecessors had setup a file in cvs called nextBuildPatchUpdates.sql. Developers would commit into that file. At release time, that file was dynamically appended with an insert statement which included the release number. This was ok, but it didn't play well our continuous integration servers. Developers hated it as it was hard for them to keep track of changes during a development cycle.

Here is what we have now.

In subversion, we maintain an sql directory.

The sql directory contains these folders:
1) functions
2) procedures
3) upgrades
4) definitions
5) includes
6) triggers
7) views

These should be self explanatory. All of our database scripts are version controlled. That is key. Includes are just some commonly used data snippets.

The upgrades folder is the piece that we will look at. This folder contains two important pieces:

1) A collection of upgrade scripts in the format FROMVERSION-TOVERSION-upgrade.sql. It also contains an xml file called upgradePaths.xml.

Here is a sample of the upgradePaths.xml

<upgradePaths>
<upgradePath>
<fromVersion>0.0.0</fromVersion>
<toVersion>39.0.0</toVersion>
<fileName>0.0.0-39.0.0-upgrade.sql</fileName>
</upgradePath>
<upgradePath>
<fromVersion>39.0.0</fromVersion>
<toVersion>40.0.0</toVersion>
<fileName>39.0.0-40.0.0-upgrade.sql</fileName>
</upgradePath>
</upgradePaths>

What we have defined here, is an xml file that describes a collection of upgrade paths. Imagine that there are several hundred upgrade path definitions here. How do then figure out how to upgrade a schema?

First, you must figure out what version your db is currently at. Next, you must figure out which version you want to upgrade to. Then, you must compute a path between your start version and your end version. This is easy to do. Using the upgradePaths.xml, we transform it into a tree (complete with nodes and vertices). We locate the start node and end node and then use dijkstra's shortest path algorithm to find the path. We new have a listing of all nodes required (and more importantly, the upgrade scripts for each node). We then execute all of these scripts in order and voila! An upgraded database.

This logic was written once as a POJO. Its been wrapped into an ant task for use in both our build system and by our product installers.

The ant definition looks like this:

<taskdef name="dbupgrade" classname="com.pointserve.ant.tasks.DatabaseUpgrade" classpathref="buildtasks"/>
<!-- build customized db upgrade script -->
<dbupgrade servername="${JDBC_SERVER}" dbname="${JDBC_DATABASE}" username="${JDBC_LOGIN}" password="${JDBC_PASSWORD}" driver="${JDBC_DRIVER}" upgradetoversion="${JDBC_DATABASE_VERSION}" upgradeXML="${basedir}/source/sql/upgrades/upgradePaths.xml" upgradesDir="${basedir}/source/sql/upgrades" upgradeScript="${basedir}/source/sql/dbupgrade.sql"/>


That's a quick overview of the process. I'm happy to discuss this more as I haven't seen many standard solutions out there.

1 comment:

Charles Hudak said...

You might want to check out Liquibase