Saturday, 23 July 2016

Referendum petition data examined

You may have noticed that on 23rd June 2016 the UK held a referendum about whether to remain as a member of the European Union.

When the result didn't go quite the way they wanted, some people signed an online petition asking parliament to consider re-running the referendum in the hope of a more decisive margin.

Link to petition

The apparent popularity of the petition attracted attention on social and conventional media, so I decided to take a look at the data.

Not much information is exposed: just a breakdown of signature counts by country of residence and another by constituency.

After loading the data into an application, sorting it and printing it out, I noticed something conspicuous: Vatican City showing up as one of the most popular countries of residence.

Hold on, isn't that one of the least populated countries in the world?

Sure enough, the number of signatures exceeded the published figures for the total population - that literally didn't add up.
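The check itself is simple enough to sketch in a few lines of Java. This is not the code from my repo, just a minimal illustration of the idea: the class name, method name and all of the numbers below are made up, and it assumes you already have a signature count and a population figure for each country in a pair of maps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class PopulationSanityCheck {

    /** Returns the countries whose signature count exceeds their population figure. */
    static List<String> implausibleCountries(Map<String, Integer> signatures,
                                             Map<String, Integer> populations) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : signatures.entrySet()) {
            Integer population = populations.get(entry.getKey());
            // Skip countries with no population figure rather than guess.
            if (population != null && entry.getValue() > population) {
                flagged.add(entry.getKey());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Invented numbers, NOT the real petition data.
        Map<String, Integer> signatures = new LinkedHashMap<>();
        signatures.put("Exampleland", 5_000);
        signatures.put("Tinystate", 40_000);

        Map<String, Integer> populations = new LinkedHashMap<>();
        populations.put("Exampleland", 1_000_000);
        populations.put("Tinystate", 1_000);

        System.out.println(implausibleCountries(signatures, populations)); // prints [Tinystate]
    }
}
```

A count below the population figure proves nothing on its own, of course, but a count above it is a cheap and unambiguous red flag.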

A few others and I flagged this on Twitter, and the relevant parties ran some checks, worked out how to filter out the dodgy data and blocked later attempts at automated contributions.

My code is available on GitHub. By looking through the Git history you can see for yourself how the country counts changed over time.

Petition Analysis GitHub repo

The code for the petition site itself (not by me) is also on GitHub:

https://github.com/alphagov/e-petitions

Tuesday, 19 January 2016

A post about a post - Java performance for obtaining an array from a Collection

Avid readers may recall that a while back I posted something about creating StringBuilders with the correct initial size to avoid wasted memory allocation and subsequent garbage collection.
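For anyone who missed that post, the idea looks roughly like this. It is a minimal sketch rather than the original code; the class and method names are invented for illustration. The point is to work out the final length first, so the builder's internal buffer never has to grow.

```java
// Hypothetical example, for illustration only.
class PresizedStringBuilderSketch {

    /** Joins the parts with a separator, sizing the StringBuilder exactly up front. */
    static String join(String[] parts, char separator) {
        // One separator between each pair of parts (none for an empty array).
        int length = Math.max(0, parts.length - 1);
        for (String part : parts) {
            length += part.length();
        }

        // Capacity matches the final length, so no intermediate buffers are discarded.
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"a", "bc", "def"}, ',')); // prints a,bc,def
    }
}
```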

Today I came across a blog post that covered a similar topic - creating an array from a Collection - but analysing why advice that had become conventional wisdom may no longer be valid.
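As I understand it, the comparison is between passing a presized array and passing a zero-length array to Collection.toArray. Here is a small sketch of the two idioms, with made-up class and method names. Both return the same contents; which one performs better is exactly what the linked post measures, so I won't try to summarise the numbers here.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical example, for illustration only.
class ToArrayIdioms {

    /** The long-standing advice: hand toArray an array that is already the right size. */
    static String[] presized(Collection<String> items) {
        return items.toArray(new String[items.size()]);
    }

    /** The alternative: hand toArray an empty array and let it allocate the result. */
    static String[] zeroLength(Collection<String> items) {
        return items.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        // Both idioms produce arrays with identical contents.
        System.out.println(Arrays.equals(presized(items), zeroLength(items))); // prints true
    }
}
```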

It goes a little deeper than my day-to-day work requires me to understand, but I think it is well worth a read.

http://shipilev.net/blog/2016/arrays-wisdom-ancients/

I'm now contemplating finally getting around to using JMH to find out whether my approach to StringBuilders is sensible or just programming by superstition.