Thursday 16 July 2009

Refactoring of persistent classes in Java

This morning on my way home from the latest job interview, I got to thinking about the costs and benefits of applying different approaches to persisting the domain objects of an application.

The company that I had the interview with have two development teams, working on two major projects. One of the projects is using iBatis, which I took as a sign that this company isn't just blindly following the majority of the industry, who seem to be using Hibernate. The other project is using JPA. So I had to ask why they have gone with a different approach, to which the response was that it was quicker and more flexible while the product is still being fleshed out.

So, what happens when you need to make some changes to the structure of your domain, and, correspondingly, your database structure? Is there a best practice, or does each organisation have their own approach?

Are there tools out there that recognise the side effects of the changes that you are making to your Java code, and can produce SQL DML update scripts for you to apply?

This situation reminds me of the early 2000s when UML diagrams were trendy and tools such as TogetherJ attempted to keep class diagrams and source code in sync with eachother, except this situation has an additional dimension to it - in addition to the mappings between your classes and database tables there is existing data which needs to be transformed to fit the new structure.

I see this as a scenario where you really need to keep a close eye on what the ORM system you have chosen is doing. As far as I am aware if you make changes that effect where data is being stored, you are going to have to involve some human intervention for transforming your old database structure and transferring its data before you can safely put your modified code and its ORM mappings into place.

Of course all of this is nothing new, but I'm a little curious as to whether tools can make the implications of changes visible, or even automate the generation of database update scripts. The alternative would seem to require identifying the changes and following the approaches outlined in Refactoring Databases: Evolutionary Database Design.

It looks like someone has made an attempt at automating some of the updates required for updating a schema to match an updated version:

Another PostgreSQL Diff Tool

A quick Google search reveals there are a few commercial products for MS SQL Server as well.

1 comment:

  1. Hello,

    that someone was me :-) I have described the process I use with regard to database refactoring at my blog at The tool of course is not perfect as there are things that the tool simply cannot know just by reading dumps, like renaming database objects. But still, it saves a lot of time with finding differences and creating update DDL commands and the things that it cannot handle can be added to the script by hand.

    Btw, new official homepage of apgdiff is at