January 13, 2010

Concurrent Builds with Hudson

Multiple Build Executors

We are using the Hudson Continuous Integration Server for our integration builds and are quite happy with it. It is fast, stable, feature-rich, and extensible; it integrates well with Maven and has an appealing user interface.

One of the nice features that we use regularly is the Build Executor setting, which allows you to specify the number of simultaneous builds. This is useful for increasing Hudson's throughput on multi-core systems, where the number of executors should (at least) match the number of available cores.

However, Maven isn't really designed for running multiple instances simultaneously, since the local repository isn't multi-process safe. The chance of conflicts seems small (multiple processes must access the same dependency at the same time, with at least one of them writing). In practice, however, we now encounter this type of concurrency issue at least once a day, which is starting to hurt! The build fails with a message like this:

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

GET request of: some/group/some-artifact-1.2.3-SNAPSHOT.jar from my-repo failed
some.group:some-artifact:jar:1.2.3-SNAPSHOT
...

Caused by I/O exception: ...some-artifact-1.2.3-SNAPSHOT.jar.tmp (The requested operation cannot be performed on a file with a user-mapped section open)

or this:

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

Error copying temporary file to the final destination: Failed to copy full contents from ...some-artifact-1.2.3-SNAPSHOT.jar.tmp to ...\some-artifact-1.2.3-SNAPSHOT.jar

The reason is that the JAR file is locked by another process, for instance one that is executing long-running test cases. At the same time, a second build tries to download a new version of this snapshot into the local repository, which is done via the mentioned .tmp file.

Safe Maven Repository

The only way to avoid this type of issue is to use a separate local Maven repository for each of the processes. You can tell Maven to use a custom local repository location by specifying the localRepository setting in your settings.xml file.

In Hudson, this is even more convenient. There is a checkbox Use private Maven repository in the advanced part of the Build section of Maven projects. Just tick it to set up a private local Maven repo for that project. You should consider doing so if you run into the described issue now and then.

Obviously, using private repos increases total disk usage, because the same dependencies are cached in multiple places. Additionally, the first build will take significantly longer because everything has to be downloaded once. However, both consequences are perfectly acceptable given the improved stability and isolation of projects.

Instead of clicking the Hudson checkbox for all your projects, consider setting up the local Maven repo in your settings.xml. This has a number of advantages:


  • You don't have to set up the option for each and every project; you have it in one central place.
  • You can use a common root for all local Maven repos, like d:/maven-repo. This allows you to easily purge all your local repositories from time to time, both to reclaim disk space and to validate the content (i.e. make sure the build still runs in a clean environment and all required artifacts are available in your corporate Maven repository).

For instance, here is what works fine for us:

<localRepository>d:/builds/.m2/${env.JOB_NAME}/repository</localRepository>

This uses a Hudson environment variable (JOB_NAME) to create a subfolder for each project aka job. See here for a list of available variables.
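For completeness, here is a minimal settings.xml sketch showing where the element belongs (a sketch only; adjust the paths to your environment):

<settings>
  <!-- One local repository per Hudson job, all under a common root -->
  <localRepository>d:/builds/.m2/${env.JOB_NAME}/repository</localRepository>
  <!-- mirrors, servers, profiles etc. follow as usual -->
</settings>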

Oh yes, what I suggest here is also encouraged by Brian Fox in his Maven Continuous Integration Best Practices blog post, so you should think twice before ignoring this best practice :o)

January 2, 2010

Cargo Maven Plugin: Not Made for JBoss

Again...

Well, actually this blog was supposed to be about Java in general and all the ups and downs I experience in my daily work. However, I haven't been doing much other than Maven configuration and build management lately, so here is another Maven-related post. Sorry, folks.

As already shown in this post, I have been doing integration tests with JBoss by using the Cargo Maven plugin to start JBoss locally and deploy the application to it. This all works quite well once you have figured out how to configure Cargo for JBoss.

But Remotely Now!

Now, the next step is to deploy our EAR file, which is generated during the nightly build, to a running JBoss instance on a separate machine. This is different because no JBoss configuration has to be created locally and no JBoss has to be started. Instead, the EAR file must be transferred to a remote server where JBoss is already running, and JBoss must be persuaded to deploy it.

That sounds feasible, and I've done exactly this before for other servers like Tomcat, so I did not expect any issue here. However, I was wrong.

Itch #1

The first trouble was caused by my lack of knowledge regarding JBoss. With the standard installation, you are not able to connect to the server remotely; all services are bound to localhost only (see here or here). This is intentional, to prevent unprotected installations from appearing all over the net. You have to pass the option -b 0.0.0.0 when starting JBoss to allow remote connections to its services, but take care to secure your JBoss accordingly!
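For example, assuming a standard JBoss 5.x installation, the startup looks like this (use run.bat on Windows):

bin/run.sh -b 0.0.0.0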

Itch #2

Okay, after this was configured, I tried to use Cargo to deploy my EAR file to JBoss. This is the configuration I ended up with:

<!-- *** Cargo plugin: deploy the application to a running JBoss *** -->
<plugin>
  <groupId>org.codehaus.cargo</groupId>
  <artifactId>cargo-maven2-plugin</artifactId>
  <version>1.0</version>
  <configuration>
    <wait>false</wait>
    <!-- Container configuration -->
    <container>
      <containerId>jboss5x</containerId>
      <type>remote</type>
    </container>
    <!-- Configuration to use with the container -->
    <configuration>
      <type>runtime</type>
      <properties>
        <cargo.hostname>...</cargo.hostname>
        <cargo.servlet.port>8080</cargo.servlet.port>
      </properties>
    </configuration>
    <!-- Deployer configuration -->
    <deployer>
      <type>remote</type>
      <deployables>
        <deployable>
          <location>...</location>
        </deployable>
      </deployables>
    </deployer>
  </configuration>

  <executions>
    <execution>
      <id>deploy</id>
      <phase>deploy</phase>
      <goals>
        <goal>deployer-redeploy</goal>
      </goals>
    </execution>
  </executions>
</plugin>
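With the execution bound to the deploy phase as shown, the redeployment is triggered as part of the regular lifecycle, for instance by:

mvn clean deploy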

However, I always got this error message:

[INFO] Failed to deploy to [http://...]
Server returned HTTP response code: 500 for URL: ...

The configuration seems to be correct, so what is the problem?

After asking Google, I realized that Cargo is not able to transfer a file to JBoss! Instead, it requires the deployable to already be present on the server's filesystem (see here). This is apparently caused by the JBoss JMX deployer that Cargo uses, but frankly you don't care who is to blame; you just want it to work. The name "Cargo" implies that the parcel is transferred to its destination, right? Also note that this issue dates from Sep 2006, so there has been some time to fix it one way or the other.

What Can We Do?

Well, there are probably not many options. Since the current version of Cargo is not able to transfer the file to the server, you have to do this on your own. The location given in our Cargo configuration above is actually a path on the JBoss server. So, once the file exists locally on the JBoss server, Cargo should be able to deploy it successfully.

For transferring the file to the JBoss server, we can use the maven-dependency-plugin, a quite useful plugin for analyzing, copying, and unpacking artifacts of all kinds. We configure it to run in the install phase (see the update below) and to copy the EAR file (produced by this POM) to some temp directory on the JBoss server:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>copy</id>
      <phase>install</phase>
      <goals>
        <goal>copy</goal>
      </goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <groupId>${project.groupId}</groupId>
            <artifactId>${project.artifactId}</artifactId>
            <version>${project.version}</version>
            <type>${project.packaging}</type>
            <destFileName>test.ear</destFileName>
            <!-- overwrite an existing copy of the snapshot -->
            <overWrite>true</overWrite>
          </artifactItem>
        </artifactItems>
        <outputDirectory>${publish.tempdir}</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>

The property ${publish.tempdir} can point to any directory on the JBoss server (which must be reachable over the network, e.g. as a network share!) and is exactly the value that has to be used for the location element in the Cargo configuration.
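To illustrate how the two plugins line up, assuming a Windows setup where the JBoss machine exposes a share (host and share names below are made up):

<properties>
  <!-- share on the JBoss machine; the server can resolve this UNC path locally, too -->
  <publish.tempdir>\\jboss-host\temp</publish.tempdir>
</properties>

<!-- ...and in the Cargo deployer configuration: -->
<deployable>
  <location>\\jboss-host\temp\test.ear</location>
</deployable>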

Another option would be to use the hot-deploy directory of JBoss as the outputDirectory for the dependency plugin, and thus rely on JBoss hot deployment instead of Cargo and the JBoss JMX deployer. This way, we could get rid of the Cargo configuration and clean up the POM a bit, but in the end it seemed a bit less clean to me... your mileage may vary.
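A sketch of that variant, again with a made-up share, pointing the dependency plugin straight at the hot-deploy directory (server/default/deploy in a default JBoss 5.x installation):

<!-- copy directly into the JBoss deploy folder instead of a temp dir -->
<outputDirectory>\\jboss-host\jboss\server\default\deploy</outputDirectory>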

So, as always, we got it to work in the end, but not without unforeseen pain. When will Cargo be fixed to get the EAR file to the JBoss server? Who knows.

Updates

2010/01/22: Note that the dependency plugin must be bound to the install phase at the earliest, so that the artifact has already been copied to your local Maven repository. As a consequence, the Cargo plugin must be run in the deploy phase, which is actually a good choice anyway. I have changed this in the code above.