Thoughts on software development: 2015

Monday 3 August 2015

Continuous improvement

Every few years I like to take a couple of weeks out of the working world to learn about techniques and technologies that are approaching the mainstream.

When I first moved to London back in 2008 I had to learn all about Spring and Hibernate as the local job market was mainly fixated on those technologies.

After the recession cooled down a bit I found my way into a development role based on Hybris - an e-commerce system held together by Spring.

Fast forward to 2012: before joining Springer I dabbled with some YouTube APIs to see what would be involved in establishing a degrees-of-separation relationship between music videos via their related videos. That was a bit of fun and gave me some insight into some simple performance optimisation options in distributed systems:

divide the work up and allocate it to a pool of workers
have the workers share a small piece of information to prevent duplication of effort (checking videos that had already been visited).
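
Those two ideas can be sketched in a few lines of Java. This is a minimal sketch, not the original code: the class name, the relatedLookup function standing in for the YouTube API call, and the sample graph are all assumptions made for illustration. The shared piece of information is a concurrent set of visited ids, and a worker only passes on an id if it was the first to add it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class RelatedVideoCrawler {

    // Crawls level by level; each level's lookups are shared out across the pool.
    static Set<String> crawl(String start, Function<String, List<String>> relatedLookup,
                             int workers, int maxDepth) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // The small piece of shared state: ids that some worker has already claimed.
        Set<String> visited = ConcurrentHashMap.newKeySet();
        visited.add(start);
        try {
            List<String> frontier = Collections.singletonList(start);
            for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
                List<Future<List<String>>> pending = new ArrayList<>();
                for (String id : frontier) {
                    pending.add(pool.submit(() -> {
                        List<String> fresh = new ArrayList<>();
                        for (String related : relatedLookup.apply(id)) {
                            // add() returns false if another worker got there first
                            if (visited.add(related)) {
                                fresh.add(related);
                            }
                        }
                        return fresh;
                    }));
                }
                List<String> next = new ArrayList<>();
                for (Future<List<String>> result : pending) {
                    next.addAll(result.get());
                }
                frontier = next;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Crawl failed", e);
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical related-video graph standing in for real API responses
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("a", Arrays.asList("b", "c"));
        graph.put("b", Arrays.asList("c", "d"));
        graph.put("c", Arrays.asList("a", "d"));
        graph.put("d", Arrays.asList("e"));
        Set<String> seen = crawl("a", id -> graph.getOrDefault(id, Collections.emptyList()), 4, 2);
        System.out.println("Reached within two hops: " + seen);
    }
}
```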

At around the same time I took an interest in NoSQL databases. After attending the inaugural London Data Bar meetup group, I won a ticket to a two-day hands-on NoSQL databases conference, which gave me a chance to dabble with Neo4j and MongoDB.

For 2015 I'm dabbling in open source products: seeing how they work, contributing some code and improving some documentation along the way.

Do drop-outs have an advantage?

Since starting my quest for a new job I've started to notice that the technical interview process often involves some aspect of computer science that I learnt in my first or second year of university.

Twenty years is a long time to stretch my memory back, so I'm going to have to do an online refresher course and / or some reading to keep myself competitive with the recent graduates - where by recent I mean people who have graduated in the last 10 years or so.

Actually, I remember a guy from my class who was getting good grades but decided to leave university without graduating - I think someone in that situation would excel in these technical interviews.

Tuesday 21 July 2015

Java 8 Lambda overhead

Intro

This is a follow-up to an earlier post where I was speculating that Java's Just In Time compilation may have been causing significant performance differences when some code was being exercised more than one or two times.

Here is a snippet of the code which I would like to focus on:

int anIndex = Collections.binarySearch(sorted, target,
        (o1, o2) -> o1.compareTo(o2)
);

The title of this post should have given away that the lambda expression made the performance take a hit until something kicked in and probably replaced the lambda setup in each loop iteration with a single instance.

Alternatives

Comparator<Integer> intComparator = Integer::compareTo;

int anIndex = Collections.binarySearch(sorted, target, intComparator);

Also uses some Java 8 magic and performs slowly.

Comparator<Integer> intComparator = new Comparator<Integer>() {
    @Override
    public int compare(Integer o1, Integer o2) {
        return o1.compareTo(o2);
    }
};

int anIndex = Collections.binarySearch(sorted, target, intComparator);

Doesn't use any lambda expressions and performs quickly.

The final, slightly embarrassing example:

int anIndex = Collections.binarySearch(sorted, target);

So, I didn't even need to define the method for the comparison.
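
The reason the two-argument call works is that Integer implements Comparable<Integer>, so binarySearch can fall back on the natural ordering. The sketch below puts the variants side by side to show that they find the same index. The class name and sample list are assumptions for illustration, and Comparator.naturalOrder() is an extra variant that was not part of the original comparison.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparatorVariants {

    // Runs the same search with each way of supplying (or omitting) the comparator.
    static int[] search(List<Integer> sorted, Integer target) {
        Comparator<Integer> methodReference = Integer::compareTo;
        return new int[] {
            Collections.binarySearch(sorted, target, (o1, o2) -> o1.compareTo(o2)),
            Collections.binarySearch(sorted, target, methodReference),
            Collections.binarySearch(sorted, target, Comparator.<Integer>naturalOrder()),
            Collections.binarySearch(sorted, target) // natural ordering, no comparator
        };
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(1, 3, 5, 7, 9);
        System.out.println(Arrays.toString(search(sorted, 7))); // prints [3, 3, 3, 3]
    }
}
```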

The micro benchmarking trap

JVM Compilation Magic

(Updated - corrected code sample to include actual call to indexOf, which I only noticed after tweeting about this.)

A better indexOf for a sorted ArrayList

I'm in the process of brushing up on my knowledge of the Java Collections API at the moment. When I came across indexOf(Object object) in ArrayList I decided to see how it works.

Although Lists in Java preserve order, they cannot assume that their contents will be sorted. So in indexOf(myObject) on a List, the safest way to guarantee that the index of the first match is returned is to start from the head of the list and iterate towards the tail.

As an exercise for myself, I decided to compare the performance of indexOf against a slightly tweaked binary search for when the elements of the list are known to be in a sorted order.

The Collections class provides a binarySearch static method which returns the index of an Object in a List - so the majority of the work has already been done for us.

The tweak that I mentioned earlier is that my List can contain duplicates. So, once the binarySearch has found a match, we still need to check earlier in the List until we can confirm that we have the very first match.

private static void indexOfSortedList() {
    // indexOf is a bit naive, so let's see if a sorted list
    // can be made to behave better with binarySearch
    ArrayList<Integer> sorted = new ArrayList<>(50000000);

    // Pre-loading randoms
    Random random = new Random();
    random.setSeed(1337L);
    boolean[] randoms = new boolean[20000000];
    for (int b = 0; b < 20000000; b++) {
        randoms[b] = random.nextBoolean();
    }

    for (int i = 0; i < 20000000; i++) {
        sorted.add(i);
        // Introducing duplicates at random points
        if (randoms[i]) {
            sorted.add(i);
        }
    }

    final Integer target = Integer.valueOf(1700000);

    long timeBefore = System.currentTimeMillis();
    int anIndex = Collections.binarySearch(sorted, target,
            (o1, o2) -> o1.compareTo(o2)
    );
    // We treat anIndex as starting position and move backwards
    // until we establish the first instance
    while (anIndex > 0 && sorted.get(anIndex - 1).equals(target)) {
        anIndex--;
    }
    System.out.println("Duration binary approach " +
            (System.currentTimeMillis() - timeBefore));
    System.out.println("binarySearch value: " + anIndex);

    long timeBeforeIndexOf = System.currentTimeMillis();
    int indexOfValue = sorted.indexOf(target);
    System.out.println("Duration indexOf " +
            (System.currentTimeMillis() - timeBeforeIndexOf));
    System.out.println("Index of value: " + indexOfValue);
}
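
The backwards walk above is fine when each value is duplicated at most once, but it degrades towards a linear scan if a value repeats many times. A different technique, a "lower bound" binary search, finds the first match directly. The sketch below is an alternative to the code above rather than a tidied copy of it: the class and method names are assumptions, and it returns -1 for a missing value in the same way that indexOf does.

```java
import java.util.Arrays;
import java.util.List;

class FirstIndexSearch {

    // Returns the index of the first element equal to target in a sorted list, or -1.
    static int firstIndexOf(List<Integer> sorted, Integer target) {
        int low = 0;
        int high = sorted.size(); // exclusive upper bound
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (sorted.get(mid).compareTo(target) < 0) {
                low = mid + 1;  // everything up to mid is too small
            } else {
                high = mid;     // mid may be the first match, keep it in range
            }
        }
        boolean found = low < sorted.size() && sorted.get(low).compareTo(target) == 0;
        return found ? low : -1;
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(1, 2, 2, 2, 3, 5, 5);
        System.out.println(firstIndexOf(sorted, 2) + " vs indexOf " + sorted.indexOf(2));
        System.out.println(firstIndexOf(sorted, 4) + " vs indexOf " + sorted.indexOf(4));
    }
}
```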

The computer scientists amongst you should appreciate how a binary search will involve far fewer operations than the linear search that indexOf is based on.

Would it surprise you to find that the indexOf implementation was consistently much faster?

That was until I scaled up the search space to have tens of millions of elements to search through, and shifted the match to be quite high in the search space - biasing towards the binarySearch in a very unfair way.

Slightly surprised at this counterintuitive outcome, I went away for lunch and picked up a book: Java Performance: The Definitive Guide.

A section about the JIT compiler gave me a theory to try out - perhaps this code wasn't being compiled beyond bytecode because it was only being run once.

Modifying the setup to loop around calling the method showed numbers that would reinforce that theory. After the first iteration the tweaked binarySearch performed faster than the indexOf call.
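
A minimal sketch of that kind of loop is shown below. The class name, helper names, list size and number of passes are assumptions for illustration, not the original test code, and a purpose-built harness such as JMH would be a more rigorous option. The point is only that each pass is timed separately, so the first (cold) pass can be compared against later ones.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WarmUpTiming {

    // Times a single call of the task in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Runs the task several times so later passes can benefit from JIT compilation.
    static long[] timeRepeatedly(Runnable task, int passes) {
        long[] durations = new long[passes];
        for (int pass = 0; pass < passes; pass++) {
            durations[pass] = timeNanos(task);
        }
        return durations;
    }

    public static void main(String[] args) {
        List<Integer> sorted = new ArrayList<>();
        for (int i = 0; i < 2_000_000; i++) {
            sorted.add(i);
        }
        Integer target = 1_700_000;
        int[] sink = new int[1]; // results are accumulated so the calls are not discarded

        int passes = 10;
        long[] binary = timeRepeatedly(() -> sink[0] += Collections.binarySearch(sorted, target), passes);
        long[] linear = timeRepeatedly(() -> sink[0] += sorted.indexOf(target), passes);
        for (int pass = 0; pass < passes; pass++) {
            System.out.println("pass " + pass + ": binarySearch " + binary[pass]
                    + " ns, indexOf " + linear[pass] + " ns");
        }
        System.out.println("(sink " + sink[0] + ")");
    }
}
```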

Pre-allocating for capacity

Within the same code you may have noticed that I chose to pre-initialise the capacity of the ArrayList. This is a habit that I got into when I want to reduce waste in memory allocation.

Some code that I have deleted from the example above was measuring the performance of the calls to populate the ArrayList.

I observed that specifying the capacity in advance resulted in slower insertion - even though the code for setting up the capacity was not included in the calls to be measured.

When I changed the code to have multiple calls to the method I observed that the first pass through had poor performance around 11s, but the next few calls dropped down around 4s, and subsequent passes went as low as a few hundred millis but fluctuated back up to around 4s.
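
For anyone wanting to repeat the experiment, a rough sketch of a multi-pass comparison follows. The deleted measurement code is not reproduced here: the class name, element count and pass count are assumptions, and the timings will vary with heap settings and garbage collection, so no particular result is implied.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityTiming {

    // Populates a list of the given size, optionally reserving capacity up front.
    static List<Integer> populate(int size, boolean preAllocate) {
        List<Integer> list = preAllocate ? new ArrayList<Integer>(size) : new ArrayList<Integer>();
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        int size = 2_000_000;
        for (int pass = 0; pass < 5; pass++) {
            long start = System.nanoTime();
            int withCapacity = populate(size, true).size();
            long middle = System.nanoTime();
            int withoutCapacity = populate(size, false).size();
            long end = System.nanoTime();
            System.out.println("pass " + pass
                    + ": pre-allocated " + (middle - start) / 1_000_000 + " ms (" + withCapacity + " items)"
                    + ", default " + (end - middle) / 1_000_000 + " ms (" + withoutCapacity + " items)");
        }
    }
}
```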

Summary

In future, when I try out minor performance tweaks I will not rely on single runs, but will take into account how the JVM actually massages the code into its final working state.

Monday 13 July 2015

Software Library Dependency Iceberg

A while ago I wrote some Java code as a plugin for the Go Continuous Delivery server software. It works quite nicely and won a contest - but that's not the topic for today.

Out of curiosity I have considered changing the way that dependencies have been managed. So far the application just has three direct runtime dependencies and relies on Gradle to pull in any transitive dependencies. I want to explicitly declare the dependencies and let Gradle only take care of downloading them, compilation, and assembling the jar.

A quick look at the tree of transitive dependencies shows common libraries but different versions.

+--- cd.go.plugin:go-plugin-api:14.4.0
+--- com.google.code.gson:gson:2.3.1
\--- org.cloudfoundry:cloudfoundry-client-lib:1.1.3
     +--- org.springframework:spring-webmvc:4.0.5.RELEASE -> 4.0.8.RELEASE
     |    +--- org.springframework:spring-beans:4.0.8.RELEASE
     |    |    \--- org.springframework:spring-core:4.0.8.RELEASE
     |    |         \--- commons-logging:commons-logging:1.1.3
     |    +--- org.springframework:spring-context:4.0.8.RELEASE
     |    |    +--- org.springframework:spring-aop:4.0.8.RELEASE
     |    |    |    +--- aopalliance:aopalliance:1.0
     |    |    |    +--- org.springframework:spring-beans:4.0.8.RELEASE (*)
     |    |    |    \--- org.springframework:spring-core:4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-beans:4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-core:4.0.8.RELEASE (*)
     |    |    \--- org.springframework:spring-expression:4.0.8.RELEASE
     |    |         \--- org.springframework:spring-core:4.0.8.RELEASE (*)
     |    +--- org.springframework:spring-core:4.0.8.RELEASE (*)
     |    +--- org.springframework:spring-expression:4.0.8.RELEASE (*)
     |    \--- org.springframework:spring-web:4.0.8.RELEASE
     |         +--- org.springframework:spring-aop:4.0.8.RELEASE (*)
     |         +--- org.springframework:spring-beans:4.0.8.RELEASE (*)
     |         +--- org.springframework:spring-context:4.0.8.RELEASE (*)
     |         \--- org.springframework:spring-core:4.0.8.RELEASE (*)
     +--- org.springframework.security.oauth:spring-security-oauth2:2.0.4.RELEASE
     |    +--- org.springframework:spring-beans:4.0.8.RELEASE (*)
     |    +--- org.springframework:spring-core:4.0.8.RELEASE (*)
     |    +--- org.springframework:spring-context:4.0.8.RELEASE (*)
     |    +--- org.springframework:spring-webmvc:4.0.8.RELEASE (*)
     |    +--- org.springframework.security:spring-security-core:3.2.5.RELEASE
     |    |    +--- aopalliance:aopalliance:1.0
     |    |    +--- org.springframework:spring-aop:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-beans:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-context:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-core:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    \--- org.springframework:spring-expression:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    +--- org.springframework.security:spring-security-config:3.2.5.RELEASE
     |    |    +--- aopalliance:aopalliance:1.0
     |    |    +--- org.springframework.security:spring-security-core:3.2.5.RELEASE (*)
     |    |    +--- org.springframework:spring-aop:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-beans:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-context:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    \--- org.springframework:spring-core:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    +--- org.springframework.security:spring-security-web:3.2.5.RELEASE
     |    |    +--- aopalliance:aopalliance:1.0
     |    |    +--- org.springframework.security:spring-security-core:3.2.5.RELEASE (*)
     |    |    +--- org.springframework:spring-beans:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-context:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-core:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    +--- org.springframework:spring-expression:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    |    \--- org.springframework:spring-web:3.2.8.RELEASE -> 4.0.8.RELEASE (*)
     |    +--- commons-codec:commons-codec:1.6
     |    \--- org.codehaus.jackson:jackson-mapper-asl:1.9.13
     |         \--- org.codehaus.jackson:jackson-core-asl:1.9.13
     +--- org.apache.httpcomponents:httpclient:4.3.6
     |    +--- org.apache.httpcomponents:httpcore:4.3.3
     |    +--- commons-logging:commons-logging:1.1.3
     |    \--- commons-codec:commons-codec:1.6
     +--- commons-io:commons-io:2.1
     +--- com.esotericsoftware.yamlbeans:yamlbeans:1.06
     +--- com.fasterxml.jackson.core:jackson-core:2.3.3
     +--- com.fasterxml.jackson.core:jackson-databind:2.3.3
     |    +--- com.fasterxml.jackson.core:jackson-annotations:2.3.0
     |    \--- com.fasterxml.jackson.core:jackson-core:2.3.3
     +--- org.apache.tomcat.embed:tomcat-embed-websocket:8.0.15
     |    \--- org.apache.tomcat.embed:tomcat-embed-core:8.0.15
     +--- org.apache.tomcat:tomcat-juli:8.0.15
     \--- com.google.protobuf:protobuf-java:2.6.1

Just look at the number of times a Spring library version 3.2.8 is specified, but overridden to 4.0.8 - that's a jump in major version number, which may have introduced incompatible code changes, such as removing a method from the API.
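
Spotting those jumps does not have to be done by eye. The sketch below scans lines of the dependency tree for the "requested -> selected" pattern and reports the modules whose major version changed. It is only a heuristic under stated assumptions: the class name and regular expression are my own, it compares nothing beyond the leading number of each version, and it says nothing about whether the code actually calls anything that changed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MajorVersionJumps {

    // Matches e.g. "org.springframework:spring-aop:3.2.8.RELEASE -> 4.0.8.RELEASE (*)"
    // group 1 = group:name, group 2 = requested major, group 3 = selected major
    private static final Pattern OVERRIDE = Pattern.compile(
            "([\\w.\\-]+:[\\w.\\-]+):(\\d+)[\\w.\\-]*\\s+->\\s+(\\d+)[\\w.\\-]*");

    // Returns one entry per module whose requested and selected major versions differ.
    static Set<String> find(List<String> treeLines) {
        Set<String> jumps = new TreeSet<>();
        for (String line : treeLines) {
            Matcher m = OVERRIDE.matcher(line);
            if (m.find() && !m.group(2).equals(m.group(3))) {
                jumps.add(m.group(1) + ": " + m.group(2) + ".x -> " + m.group(3) + ".x");
            }
        }
        return jumps;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "+--- org.springframework:spring-webmvc:4.0.5.RELEASE -> 4.0.8.RELEASE",
                "|    +--- org.springframework:spring-aop:3.2.8.RELEASE -> 4.0.8.RELEASE (*)",
                "|    +--- commons-codec:commons-codec:1.6");
        for (String jump : find(lines)) {
            System.out.println(jump);
        }
    }
}
```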

I wonder if there are any tools out there that could analyse the caller / callee relationships to offer developers some reassurance or warnings when jars get replaced by different versions in a situation like this.

Service lifecycle

Just some notes about the lifecycle of an instance of a service application from deployment through to shutdown.

In a continuous delivery environment this cycle might happen multiple times per day, where each instance represents a new version of working software that has passed through a deployment pipeline.

Start up

Initialisation
- locate and parse configuration
- validate configuration
- establish connectivity to downstream services
- log success or failure details

PASS/FAIL

Ready to accept requests, but no routing in place to receive live traffic.

Smoke tests
- final readiness checks before setting live
- accept and process incoming requests, produce responses

PASS/FAIL

Load balancer updated to direct live traffic
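
The start-up sequence above is essentially an ordered list of checks where the first failure stops the instance going live. A minimal sketch of that shape follows. The class name, step names and the use of BooleanSupplier are assumptions for illustration, and the real checks (parsing configuration, connecting downstream) are stubbed out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

class StartupChecks {

    // Runs each step in insertion order, logging the outcome; stops at the first failure.
    static boolean runAll(Map<String, BooleanSupplier> steps) {
        for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
            boolean passed;
            try {
                passed = step.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                passed = false; // an exception during a check counts as a failure
            }
            System.out.println((passed ? "PASS " : "FAIL ") + step.getKey());
            if (!passed) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stubbed steps; a LinkedHashMap keeps them in the order they were added
        Map<String, BooleanSupplier> steps = new LinkedHashMap<>();
        steps.put("locate and parse configuration", () -> true);
        steps.put("validate configuration", () -> true);
        steps.put("establish connectivity to downstream services", () -> true);
        System.out.println(runAll(steps) ? "Ready for smoke tests" : "Not ready");
    }
}
```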

Routine business of processing incoming requests

Requests come in via load balancer, get processed by service - potentially with calls to downstream services - and responses are returned.

Depending on demand, the number of instances of the service may be scaled up and down to service the volume of requests.

Shut down

Load balancer updated to exclude instance from receiving live traffic requests (connection draining to allow existing requests to each receive a response - check how well this is supported).

Log the fact that shutdown has been initiated.

Finish processing existing requests

Close all connections to downstream services.

Terminate all threads.

Process terminates.
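
The in-process part of that shutdown sequence can be sketched with the standard ExecutorService calls. This is a sketch under assumptions: the class and method names are invented, the downstream connection is represented by any AutoCloseable, and removing the instance from the load balancer happens outside the process so it is not shown. In a real service the drain method would typically be called from a JVM shutdown hook.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class GracefulShutdown {

    // Returns true if all in-flight requests finished within the timeout.
    static boolean drain(ExecutorService requestPool, AutoCloseable downstream, long timeoutSeconds) {
        System.out.println("Shutdown initiated");
        requestPool.shutdown(); // no new work is accepted; existing work carries on
        boolean finished;
        try {
            finished = requestPool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        if (!finished) {
            requestPool.shutdownNow(); // interrupt whatever is still running
        }
        try {
            downstream.close();
        } catch (Exception e) {
            System.out.println("Failed to close downstream connection: " + e);
        }
        return finished;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> System.out.println("Handling a request"));
        boolean clean = drain(pool, () -> System.out.println("Downstream connection closed"), 10);
        System.out.println(clean ? "All requests completed" : "Some requests were interrupted");
    }
}
```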

Thursday 23 April 2015

Tapping into the Zeitgeist

A while ago there was a contest to develop a plugin for a Continuous Delivery system called Go.

I decided to "have a go," and my product was judged to be the best suited to the criteria of the competition.

When I was first contemplating producing something a colleague suggested that I do something with Docker because that was the hot technology at the time. Instead I opted for something that I believed would benefit my team in our use of Cloud Foundry.

Rather than going for the obvious - automating deployment to Cloud Foundry - I chose to give developers a system which could alert their CI pipeline when an app has been re-deployed.

GitHub doesn't seem to give statistics on downloads, so I find myself a little frustrated at not being able to tell how many people are actually making use of my first serious foray into open source software.

Sunday 8 March 2015

Can my GoCD plugin work with Java 8?

Background

A few weeks back I developed a plugin for the Go Continuous Delivery server software.

My main frustration with trying this on a Mac was that the Go Server installer insisted on locating a Java 6 installation. A bit of Googling revealed that this can be a non-trivial feature to work around, so I gave up trying to run Go on a Mac and tried things out on my work Linux machine.

I only have a limited amount of time at work to do hobby-based development, so I went looking for an alternative solution and found the Sample GoCD VirtualBox-based environment.

This is pretty cool and enables me to try things out.

Upgrading Java

For my day to day development I have been programming with Java 8 for almost a year (I know it couldn't be a year yet because it wasn't released until mid-March 2014).

The VirtualBox environment only has Java 7 available, so my plugin development was restricted to the features available pre-Java 8.

During some refactoring of my plugin, I found myself extracting a method because it involved a bit of ugly nesting with a loop and a condition check - I wanted to use Java 8's filter and forEach instead.
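
To illustrate the kind of change involved, here is a sketch that contrasts the two styles. It is not the plugin's actual code: the class name, method names and sample app names are assumptions, and the prefix check simply stands in for whatever condition the real loop tested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterForEach {

    // Pre-Java 8: a loop with a nested condition check
    static List<String> loopStyle(List<String> appNames, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String name : appNames) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }

    // Java 8: the same logic expressed with filter and forEach on a sequential stream
    static List<String> streamStyle(List<String> appNames, String prefix) {
        List<String> matches = new ArrayList<>();
        appNames.stream()
                .filter(name -> name.startsWith(prefix))
                .forEach(matches::add);
        return matches;
    }

    public static void main(String[] args) {
        List<String> apps = Arrays.asList("servicex-1.0.1", "servicey-2.3", "servicex-1.0.2");
        System.out.println(loopStyle(apps, "servicex"));
        System.out.println(streamStyle(apps, "servicex"));
    }
}
```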

Long story short, the upgrade steps for making the VirtualBox environment use Java 8 instead of Java 7 are as follows:

Follow the JDK 8 installation instructions:

http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html

Update the go-server and go-agent files under /etc/default to point to the newly installed JDK's home as JAVA_HOME.

I found the location by tracing through some symbolic links, like so:

vagrant@vagrant-ubuntu-trusty-32:/etc/default$ ls -la /usr/bin/java

lrwxrwxrwx 1 root root 22 Jun 14 2014 /usr/bin/java -> /etc/alternatives/java

vagrant@vagrant-ubuntu-trusty-32:/etc/default$ ls -la /etc/alternatives/java

lrwxrwxrwx 1 root root 39 Mar 8 12:33 /etc/alternatives/java -> /usr/lib/jvm/java-8-oracle/jre/bin/java

vagrant@vagrant-ubuntu-trusty-32:/etc/default$ ls -la /usr/lib/jvm/java-8-oracle/jre/bin/java

-rwxr-xr-x 1 root root 5730 Mar 8 12:33 /usr/lib/jvm/java-8-oracle/jre/bin/java

With the config changes in place, a restart of the go-server and go-agent processes should get them to use Java 8.

Navigating to http://localhost:8153/go/about should show the server details, where the JVM version confirms that GoCD is indeed running with Java 8.

Now I can replace my refactoring with some Java 8 goodness.

Tuesday 24 February 2015

Push versus Poll - Cloud Foundry CI plugins

Out of curiosity I recently Googled what Jenkins plugins are available for Cloud Foundry.

Interestingly - for me at least - the approach the developers have taken is quite different to the direction I took for the Go CD Cloud Foundry plugin.

While my plugin is focussed on detecting when an application has been pushed, the Jenkins plugin supports the process for pushing an application.

So, if you've made it to my GitHub or this blog and you were looking for a plugin for pushing - sorry, my thought process was a little different.

In the interests of shameless self-promotion, here's a link to my GitHub repo:

https://github.com/stephen-souness-springer/springer-gocd-cloudfoundry-plugin

Monday 23 February 2015

Experimenting with Go Continuous Delivery

If you're working on a Mac you may encounter an annoyance when installing the Go Continuous Delivery server. It blindly insists on finding a version 6 JVM - even if a compatible later version is available.

After an initial attempt at adjusting the config, and a bit of Googling around, I gave up on trying to make it work.

A week or so later I found myself using some spare time at home to develop a plugin on a Mac. After a couple of days of having to wait until the following day to try out the functionality on my work Linux setup, I came across some instructions for installing and using a Vagrant virtual box which contains a fully operational Go server.

http://www.go.cd/2014/09/09/Go-Sample-Virtualbox.html

I haven't used it extensively but it has made plugin development easier for me.

Friday 20 February 2015

Go CD Cloud Foundry plugin

In February 2015 I developed a Go CD server plugin to enable triggering of builds when a Cloud Foundry application has been deployed.

This post is intended as an introduction to how to use this plugin.

A starting assumption is that you already have the plugin installed on your Go CD server.

Update: Binary jars are available:
https://github.com/stephen-souness-springer/springer-gocd-cloudfoundry-plugin/releases

The source code is also available on GitHub:
https://github.com/stephen-souness-springer/springer-gocd-cloudfoundry-plugin

Step 1

Navigate to the Package Repositories section of the Admin menu

Add a new repository with your CloudFoundry API credentials.

Update: In version 1.0.1 the password property has changed to be secured - so you won't see it in cleartext the way it is shown here.

To make sure that you have supplied the correct credentials - and confirm that your Go server can connect to Cloud Foundry - you can use the Check Connection button before choosing to save the configuration.

You're now ready to add some Cloud Foundry configuration to a build pipeline with your newly available repository.

Step 2

The Check Package button will trigger a check on two levels - first that the supplied credentials can log in, and second that an application exists which starts with the specified App Name.

Step 3

The next step in your pipeline definition depends on what you want to do when a change is detected. From the example settings above there will be an environment variable called GO_PACKAGE_MYCLOUDFOUNDRYDEV_SERVICEX_LABEL set with the latest matching detected app version.
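
As one example of what a later task could do with that variable, the sketch below reads the label from the environment and fails loudly if it is missing. The class and method names are assumptions for illustration. The variable name is the one from the example settings above and would differ for other repository and package names.

```java
import java.util.Map;

class PackageLabel {

    // Variable name taken from the example settings; yours depends on the repository and package names
    static final String LABEL_VARIABLE = "GO_PACKAGE_MYCLOUDFOUNDRYDEV_SERVICEX_LABEL";

    // Returns the detected app version, or throws if the pipeline did not supply one.
    static String deployedVersion(Map<String, String> environment) {
        String label = environment.get(LABEL_VARIABLE);
        if (label == null || label.isEmpty()) {
            throw new IllegalStateException(LABEL_VARIABLE + " is not set");
        }
        return label;
    }

    public static void main(String[] args) {
        try {
            System.out.println("Detected app version: " + deployedVersion(System.getenv()));
        } catch (IllegalStateException e) {
            System.out.println("Not running inside the pipeline: " + e.getMessage());
        }
    }
}
```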

Friday 23 January 2015

Assembling a jar to include dependencies with Gradle

How to build a jar containing dependency jars with Gradle

As part of a hack day project at work I have started developing a plugin for use with ThoughtWorks Go CD.

After some initial confusion around how Go Server expects to find the plugin bundled, I realised that a jar file containing a lib folder of jars is the way to make it work.

The example plugins GitHub repository only showed the use of Maven as a project build tool, but I have gotten accustomed to using Gradle - so I needed to do a bit of reverse engineering and searching online to find a way to produce a suitable jar to include the managed dependencies of my project.

Without further ado, here is the relevant snippet of Gradle configuration:

jar {
    into('lib') {
        from configurations.runtime
    }
}

This simply creates a lib directory within the generated jar. The lib directory will contain the various jars that are pulled in by the managed dependencies. By default this will include transitive dependencies.
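
One way to confirm that the build produced the expected layout is to list the jars bundled under lib. The sketch below does that with the standard java.util.jar API. The class name is an assumption, and the path to the plugin jar is passed in as an argument because the output location depends on your project.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class BundledJars {

    // Returns the sorted names of all jar entries found under lib/ in the given jar.
    static List<String> list(File pluginJar) {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(pluginJar)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    names.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java BundledJars <path-to-plugin-jar>");
            return;
        }
        for (String name : list(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```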

This works with Gradle 2.2.1 and I would expect it to work for earlier versions.

Thursday 15 January 2015

Basic readiness check before developing with micro services - DNS

So, you have your application nicely sliced up into purposeful, self-contained units which can call upon each other as required - great.

Presuming each of these components is communicating over http, how are you going to deploy them and make them accessible to each other? Why not use a beautiful Platform As A Service environment - developers can deploy new apps, and manage existing apps to their hearts' content without any assistance from other teams. Lovely.

Ok, let's presume that your setup has big ambitions and limited IP addresses so you're going to need these applications to have hostnames. No big deal, and easier to quickly verify what is what. Who would use IP addresses directly - this is the 21st century!

What if DNS hasn't been set up to cope with some additional load? It only craps out every other week, but when it does your applications can't reach each other. Deployments fail, developers get stalled tracking down problems that they haven't caused. Live demonstrations are now considered risky.
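
A basic readiness check for this could be as small as trying to resolve each downstream hostname before relying on it. The sketch below uses the standard InetAddress lookup. The class name and the example hostnames are hypothetical, and a passing check only shows that resolution worked at that moment, not that DNS will stay healthy.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DnsReadiness {

    // Returns the hostnames that could not be resolved; an empty list means all resolved.
    static List<String> unresolved(List<String> hostnames) {
        List<String> failures = new ArrayList<>();
        for (String host : hostnames) {
            try {
                InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                failures.add(host);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Hypothetical downstream services; replace with the hosts your app depends on
        List<String> downstream = Arrays.asList("servicex.example.internal", "servicey.example.internal");
        List<String> failures = unresolved(downstream);
        System.out.println(failures.isEmpty() ? "All hostnames resolved" : "Unresolved: " + failures);
    }
}
```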

Welcome to my world. It's the third or fourth time that I've bugged a team that should be able to trace this problem and arrange for it to be fixed (even temporarily). Their response is that the impact seems to be broader than the single host that the app is currently complaining about. Their recommendation is to update the app's config with IP addresses for a while.

We have a central configuration system, but I don't think it is intended to act as a replacement for an /etc/hosts file.

After publicly lecturing me about how the configuration management system should make dirty hacks like this (I'm paraphrasing) possible, they contacted someone who could flush a cache or restart the misbehaving service to solve the problem temporarily.

Sunday 4 January 2015

Career safety checkpoint

As another year begins, I found myself struggling to get to sleep before the first day back at work. This wasn't a new experience for me as I seem to recall the same sort of nervousness before the first day back at school as a child.

I consider this time of year to be a bit like travelling between very different timezones - just as I have gotten accustomed to going to bed late and sleeping late the following morning, it's time to adjust back to the work life routine.

One of the many thoughts that occurred to me when I should have been blissfully sleeping was whether my skills are still as relevant to my chosen career in software development.

In the morning I woke up early and decided to have a look at some source code from a system that my current project will be interacting with...

A section of code that particularly stood out to me involved something like 12 branches of if / else checks, where each comparison was against a constant defined earlier in the class, and each outcome also involved another constant. This could easily be condensed down to two or three lines with one conditional expression by replacing all of the constants with a single Map.
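
The shape of that refactoring looks roughly like the sketch below. The real code and its constants are not reproduced here: the class name, the status codes and the labels are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class StatusLabels {

    // One Map replaces the pairs of input and output constants
    private static final Map<String, String> LABELS = new HashMap<>();
    static {
        LABELS.put("P", "Pending");
        LABELS.put("A", "Active");
        LABELS.put("S", "Suspended");
        LABELS.put("C", "Closed");
    }

    // Previously a chain of if / else checks, one branch per constant
    static String labelFor(String code) {
        return LABELS.containsKey(code) ? LABELS.get(code) : "Unknown";
    }

    public static void main(String[] args) {
        System.out.println(labelFor("A"));
        System.out.println(labelFor("X"));
    }
}
```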

I'm sleeping much better.