January 13, 2010

Concurrent Builds with Hudson

Multiple Build Executors

We are using the Hudson Continuous Integration Server for our integration builds and are quite happy with it. It is fast, stable, feature-rich, extensible, well integrated with Maven and has an appealing user interface.

One of the nice features that we use regularly is the Build Executor setting, which lets you specify the number of simultaneous builds. This is useful for increasing Hudson's throughput on multi-core systems, where the number of executors should (at least) match the number of available cores.

However, Maven isn't really designed for running multiple instances simultaneously, since the local repository isn't multi-process safe. The chance of a conflict seems small (multiple processes must access the same dependency at the same time, with at least one of them writing). In practice, however, we now encounter this type of concurrency issue at least once a day, which is starting to hurt! The build fails with a message like this:

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

GET request of: some/group/some-artifact-1.2.3-SNAPSHOT.jar from my-repo failed
some.group:some-artifact:jar:1.2.3-SNAPSHOT
...

Caused by I/O exception: ...some-artifact-1.2.3-SNAPSHOT.jar.tmp (The requested operation cannot be performed on a file with a user-mapped section open)

or this:

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

Error copying temporary file to the final destination: Failed to copy full contents from ...some-artifact-1.2.3-SNAPSHOT.jar.tmp to ...\some-artifact-1.2.3-SNAPSHOT.jar

The reason is that the JAR file is locked by another process, for instance one that is executing some long-running test cases. At the same time, a second build tries to download a new snapshot version of the artifact into the local repository, which is done with the help of the .tmp file mentioned in the error message.

Safe Maven Repository

The only way to avoid this type of issue is to use a separate local Maven repository for each process. You can tell Maven to use a custom local repository location by specifying the localRepository setting in your settings.xml file.
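For example, a minimal settings.xml that relocates the local repository might look like this (the path is just an illustration, pick whatever fits your build machine):

```xml
<!-- ~/.m2/settings.xml, or the settings file passed to Maven via -s -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <!-- Point Maven at a custom local repository
       instead of the default ~/.m2/repository. -->
  <localRepository>d:/builds/.m2/repository</localRepository>
</settings>
```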

In Hudson, this is even more convenient: there is a Use private Maven repository checkbox in the advanced part of the Build section of Maven projects. Just tick it to set up a private local Maven repository for that project. You should consider doing so if you run into the described issue now and then.

Obviously, using private repositories will increase the total amount of disk space used, because the same dependencies are cached in multiple places. Additionally, the first build will take significantly longer because everything has to be downloaded once. However, both consequences are quite acceptable given the better stability and isolation of projects.

Instead of clicking the Hudson checkbox for each of your projects, you should consider setting up the local Maven repository in your settings.xml instead. This has a number of advantages:


  • You don't have to set up the option for each and every project, but have it in one central place.
  • You can use a common root for all local Maven repos, like d:/maven-repo. This allows you to easily purge all your local repositories from time to time, in order to reclaim disk space as well as validate their content (i.e. make sure the build still runs in a clean environment and that all required artifacts are available in your corporate Maven repository).

For instance, here is what works fine for us:

<localRepository>d:/builds/.m2/${env.JOB_NAME}/repository</localRepository>

This uses the Hudson environment variable JOB_NAME to create a subfolder for each project, aka job. See the Hudson documentation for a list of available environment variables.

Oh yes, what I suggest is also encouraged by Brian Fox in his Maven Continuous Integration Best Practices blog post, so that's one more reason to adopt this best practice :o)

5 comments:

  1. How do you purge your local repositories? Do you use a cron job?

  2. Well, we do this once every few months and haven't automated the job. We thought about adding a Jenkins job that just calls a batch file to delete the Maven repo folders, so the task would be accessible from the Jenkins UI. However, nobody has done that yet, and nobody has missed it ;-)

  3. Excellent blog. We don't see this issue on our Jenkins Linux slaves, but we do see it on our Windows slaves. I'm glad there is a simple solution to this problem.

  4. Thank you Sir. This has helped me 5 years after posted!
