Tuesday 22 November 2011

Hey, that's cool! - Contagious enthusiasm

I find it fun to work with young developers. Their enthusiasm and willingness to learn new things make leading the way less of a chore and more of a privilege. Today was a great example. Late in the day I set up a user account in Jira and emailed the details across to the newest member of the team. As we got ready to shut down our development for the day he said something along the lines of, "Hey, that's cool. I'll definitely be using this. I've only used Bugzilla before, but this looks heaps better. Can it integrate with tasks in Eclipse?" The next ten minutes flew by as I gave a brief demo of the plugin, including uploading and downloading of contexts. Then it was time to show how the Subversion check-in comments automatically tie together which files were updated for each task. I think the next step will be to get Jenkins running on a shared server instead of just on my laptop.

Sunday 20 November 2011

Auto-Refreshing Caches - Part 1

Introduction


This blog entry provides an overview of my recent experience of developing a software system which keeps its core data fresh and ready to be presented to users.

It is a work in progress, so it may end up spanning a few posts.


What do I mean by an auto-refreshing cache?

A cache which has its content pre-loaded and refreshed automatically without user action.

Why do I want an auto-refreshing cache?

The time we can afford to spend displaying content on screen is orders of magnitude shorter than the time required to fetch the data from the data source across the network.

The remote data source and network connection are beyond our control.

There's more than one way data can change over time


There are two primary ways in which the data in this system changes over time, which should be reflected in the state of the cache:
- Data that should no longer be displayed because it is no longer relevant
- Data properties that change over time

Automating removal


The data to be cached includes some date and time properties which can be used as a basis for removal from the cache.

Due to the diverse nature of the data we have multiple caches. In some caches an entity has its own entry keyed by a unique identifier, while in other caches multiple entities are grouped together by a key generated from the criteria used for the data source lookup (e.g. date and group id).
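
To make the two layouts concrete, here is a minimal Java sketch using plain JDK maps. The names (CacheLayouts, GroupKey, byId, byGroup) and the use of strings for entities are assumptions made purely for illustration; they are not taken from the real system.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: names and types are illustrative only.
final class CacheLayouts {

    /** Composite key built from the lookup criteria (date and group id). */
    record GroupKey(LocalDate date, String groupId) {}

    /** Layout 1: one entry per entity, keyed by its unique identifier. */
    final Map<String, String> byId = new ConcurrentHashMap<>();

    /** Layout 2: several entities grouped under one key derived from the lookup criteria. */
    final Map<GroupKey, List<String>> byGroup = new ConcurrentHashMap<>();
}
```

Because a record gets value-based equals and hashCode, two keys built from the same date and group id find the same cache entry.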

For the grouped data, the approach for removing expired data involves:
- iterating over the cache entries
- checking that the overall cache entry is not due to expire
- obtaining a write lock on the cache entry
- iterating over the entities contained in the cached data structure and removing those that are expired
- putting the updated data back into the cache
- releasing the write lock on the cache entry
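
The steps above can be sketched roughly as follows in Java. This is only an illustration under some assumptions: it uses a ConcurrentHashMap and one ReentrantReadWriteLock per key in place of whatever locking the real cache library offers, and the GroupedCacheSweeper, Entity and Group names are hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: none of these names come from the real system.
final class GroupedCacheSweeper {

    /** A cached entity with the date/time property used to decide expiry. */
    record Entity(String id, Instant expiresAt) {}

    /** One cache entry: a group of entities plus an expiry for the entry as a whole. */
    record Group(List<Entity> entities, Instant groupExpiresAt) {}

    private final Map<String, Group> cache = new ConcurrentHashMap<>();
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    void put(String key, Group group) { cache.put(key, group); }

    Group get(String key) { return cache.get(key); }

    /** Removes expired entities from each group and returns how many were removed. */
    int sweep(Instant now) {
        int removed = 0;
        // Iterate over the cache entries.
        for (String key : cache.keySet()) {
            Group snapshot = cache.get(key);
            // Skip entries that are gone or whose whole entry has reached its expiry.
            if (snapshot == null || !snapshot.groupExpiresAt().isAfter(now)) {
                continue;
            }
            // Obtain the write lock for this cache entry.
            ReentrantReadWriteLock lock = locks.computeIfAbsent(key, k -> new ReentrantReadWriteLock());
            lock.writeLock().lock();
            try {
                Group current = cache.get(key); // re-read now that we hold the lock
                if (current == null) {
                    continue;
                }
                // Keep only the entities that have not expired yet.
                List<Entity> kept = new ArrayList<>();
                for (Entity entity : current.entities()) {
                    if (entity.expiresAt().isAfter(now)) {
                        kept.add(entity);
                    }
                }
                removed += current.entities().size() - kept.size();
                // Put the updated data back into the cache.
                cache.put(key, new Group(kept, current.groupExpiresAt()));
            } finally {
                // Release the write lock, even if something above throws.
                lock.writeLock().unlock();
            }
        }
        return removed;
    }
}
```

The lock only protects the data if every other writer of a grouped entry takes the same per-key lock before modifying it.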

Automating updates of existing entries


For data which is already held in the cache and is not ready to be removed, we can re-fetch the data from the data source and write it into the cache.
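
As a rough sketch of that idea, the class below re-fetches every key already in the cache on a fixed schedule. It is not the real implementation: RefreshingCache and its methods are hypothetical names, the loader function stands in for the remote data source lookup, and the scheduling uses the JDK's ScheduledExecutorService as one possible choice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch: the loader stands in for the remote data source lookup.
final class RefreshingCache<K, V> {

    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    RefreshingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Reads only from the cache, so the caller never waits on the network. */
    V get(K key) {
        return entries.get(key);
    }

    /** Loads a key up front so that it is present before anyone asks for it. */
    void preload(K key) {
        V value = loader.apply(key);
        if (value != null) {
            entries.put(key, value);
        }
    }

    /** Re-fetches every key currently held and overwrites the cached value. */
    void refreshAll() {
        for (K key : entries.keySet()) {
            try {
                V fresh = loader.apply(key);
                if (fresh != null) {
                    // replace() only writes if the key is still present,
                    // so a refresh cannot resurrect an entry removed meanwhile.
                    entries.replace(key, fresh);
                }
            } catch (RuntimeException e) {
                // The data source is beyond our control: keep the stale value and retry next cycle.
            }
        }
    }

    /** Runs refreshAll() repeatedly in the background, with no user action needed. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::refreshAll, period, period, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Keeping the stale value when a fetch fails is a design choice in this sketch; it favours always having something to display over strict freshness.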