Integration testing CloudBees' RUN@cloud

As part of the preparations for the recent 1.0.0 release of jclouds, I was tidying up the existing Tweetstore demo application as well as porting it to CloudBees' Tomcat-based RUN@cloud platform.

A key part of the test harness for the original versions of Tweetstore that run on the Google App Engine is the neat GoogleDevServer class. Basically, it's a clever wrapper around the KickStart class used under the covers by the GAE SDK tools that allows you to specify the SDK location, address, port and WAR file (or expanded WAR directory) to run.
Better still, it can be shut down cleanly from code, making it ideal for integration test runs [1].

For CloudBees, I was thus looking to put together a similar RunAtCloudServer. It proved more challenging than expected... [2]

Just spawn a container already?

Both the maven-gae-plugin and the bees-maven-plugin allow you to spawn a local GAE or RUN@cloud server simulation, of course. These can easily be integrated into your Maven build, for instance as part of your pre-integration-test setup. So why not just stick with that?

Well, for some scenarios that can be perfectly sufficient. For Tweetstore, having an inline server that cleanly shuts down allows us to:

  • run integration tests alongside unit tests from an IDE without having to remember to start an external process
  • debug the application running in the server on demand, without having to mess with remote debuggers or figure out a way to sometimes get the Maven plugins to start in debug mode
  • work some conditional magic with the server arguments

If your project's requirements are similar, an inline server might be a good option. And luckily, the code's already been written!

Hangin' around

The RUN@cloud SDK's version of KickStart is the StaxSdkAppServer [3]. Figuring out the correct arguments for the launchServer factory method wasn't too hard, so I hooked up a StaxSdkAppServer and kicked off the test.

Started like a charm! Then the test finished and...just hung on in there. Over in the trusty debugger [4], I could see not one, but two zombie threads still running. The design was based on a "shutdown-by-JVM-kill" scenario, I guess.

Assassins and other Ninja Tricks

The first of the two threads is a Timer that something called the WebAppEngine uses to do...well, who knows? In any case, we can get a reference to it with a decent handful of reflection and cancel it. Easy enough.
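The "handful of reflection" might look roughly like the sketch below. The real WebAppEngine's field names aren't shown in this post, so FakeEngine, housekeepingTimer and TimerAssassin are hypothetical stand-ins. The idea is simply to read a private Timer field and cancel it.

```java
import java.lang.reflect.Field;
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical stand-in for the SDK class that owns the timer.
class FakeEngine {
    // new Timer(String) creates a non-daemon thread that keeps the JVM alive.
    private final Timer housekeepingTimer = new Timer("housekeeping");

    FakeEngine() {
        housekeepingTimer.schedule(new TimerTask() {
            @Override public void run() { /* periodic work */ }
        }, 0L, 50L);
    }
}

class TimerAssassin {
    // Reads the named private field reflectively and cancels the Timer it holds.
    static Timer cancelTimer(Object owner, String fieldName) {
        try {
            Field field = owner.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            Timer timer = (Timer) field.get(owner);
            timer.cancel(); // discards scheduled tasks and lets the timer thread end
            return timer;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not reach timer field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        cancelTimer(new FakeEngine(), "housekeepingTimer");
        // With the timer cancelled, no non-daemon thread is left and the JVM can exit.
    }
}
```

A cancelled Timer refuses new tasks with an IllegalStateException, which is a handy way to check that the cancel actually went through.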

Our remaining zombie is a bit more elusive, unfortunately. It's a thread spawned by an instance of the RequestMonitorValve class in a manner analogous to

new Thread(idleTimer, "requestMonitor").start();

So no reference, and no way of interrupting it. However, the idleTimer Runnable is referenced and repeatedly calls a handler as part of its execution. How about injecting a handler that throws an exception to end the thread?

Nice idea, but it turns out the thread really doesn't want to die that way:

while(true) {
    try {
        // do stuff
        if(callbackClient != null)
            callbackClient.updateStatus(state);
    } catch(Exception e) {
        // increment some error counters
    }

    try {
        Thread.sleep(statusIntervalSecs * 1000);
    } catch(InterruptedException interruptedexception) { }
}

So I've had to go one better and throw a ThreadDeath error instead: as an Error rather than an Exception, it slips past the catch block and ends the thread. Urgh.
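To see why an Error gets through where an Exception doesn't, here is a minimal sketch that mirrors the shape of the loop above. StatusCallback, startMonitor and killerAfter are hypothetical names for illustration, not the SDK's API. Note that ThreadDeath is deprecated for removal in recent JDKs, hence the suppressed warning.

```java
import java.util.concurrent.atomic.AtomicInteger;

class KillerCallbackDemo {
    // Hypothetical stand-in for the SDK's status callback interface.
    interface StatusCallback { void updateStatus(String state); }

    // Starts a thread whose loop swallows every Exception, like the one above.
    static Thread startMonitor(StatusCallback callback, AtomicInteger errors) {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    callback.updateStatus("idle");
                } catch (Exception e) {
                    errors.incrementAndGet(); // swallowed, the loop keeps going
                }
                try {
                    Thread.sleep(5);
                } catch (InterruptedException ignored) { }
            }
        }, "requestMonitor");
        t.start();
        return t;
    }

    // Throws an ordinary exception for the first few calls, then ThreadDeath.
    @SuppressWarnings({"removal", "deprecation"})
    static StatusCallback killerAfter(int failures) {
        AtomicInteger calls = new AtomicInteger();
        return state -> {
            if (calls.incrementAndGet() <= failures) {
                throw new IllegalStateException("ordinary exception, gets swallowed");
            }
            throw new ThreadDeath(); // an Error, so catch (Exception) does not match
        };
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger errors = new AtomicInteger();
        Thread monitor = startMonitor(killerAfter(3), errors);
        monitor.join(5000);
        System.out.println("alive=" + monitor.isAlive() + " swallowed=" + errors.get());
    }
}
```

The ordinary exceptions only bump the error counter; the thread ends only once the callback throws something that isn't an Exception.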

Now you see it, now you don't?

The remaining gotcha involves injecting the catchily-named KillerCallback into the idleTimer. Even using reflection, this is an elementary concurrency failure: without a synchronizes-with relationship [5] between injecting the new handler and the invocation of the handler by idleTimer, there is no guarantee the new handler will actually be visible to the thread we're trying to stop.

Such a relationship could be established if the "assassin" was injected before the thread that runs idleTimer starts. Unless we want to override library classes, that would mean setting it after creating but before starting the StaxSdkAppServer [6]...which unfortunately doesn't work either because the instances referencing idleTimer are only created when the server starts.
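For comparison, this is roughly what the safe version would look like if the handler field were ours to declare. It is a hypothetical sketch only, since the library's field is not volatile and can't be changed without overriding library classes. IdleTimer, inject and injectAndAwait are made-up names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class VisibilityDemo {
    // Hypothetical stand-in for the SDK's status callback interface.
    interface StatusCallback { void updateStatus(String state); }

    static class IdleTimer implements Runnable {
        // volatile: a write by one thread synchronizes-with later reads by
        // another, so an injected handler is guaranteed to become visible.
        private volatile StatusCallback callbackClient;

        void inject(StatusCallback callback) { callbackClient = callback; }

        @Override public void run() {
            while (true) {
                StatusCallback callback = callbackClient; // volatile read
                if (callback != null) {
                    callback.updateStatus("idle");
                    return; // this sketch stops once the handler has been seen
                }
                Thread.yield();
            }
        }
    }

    // Injects a handler into an already running thread and waits for it to be called.
    static boolean injectAndAwait(long timeoutMillis) {
        IdleTimer idleTimer = new IdleTimer();
        new Thread(idleTimer, "requestMonitor").start(); // started before injection
        CountDownLatch seen = new CountDownLatch(1);
        idleTimer.inject(state -> seen.countDown());
        try {
            return seen.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("handler seen: " + injectAndAwait(5000));
    }
}
```

Without the volatile modifier the same code would probably still work on most JVMs most of the time, which is exactly the "hope for the best" situation described here.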

So?

So, for now the potential concurrency issue is documented and we hope for the best 😉 No problems observed so far...

Footnotes

  1. The maven-gae-plugin is another nice option for starting the SDK's local GAE simulation. The advantage of GoogleDevServer is that it also stops cleanly, can be run from your IDE and also allows you to add additional files to the WAR before starting. Tweetstore uses that to include login credentials to various cloud storage providers - not the kind of thing you're likely to want to store in Github 😉
  2. but see here
  3. CloudBees acquired Stax in December 2010
  4. This is when being able to run from an IDE really comes into its own!
  5. as provided e.g. by volatile
  6. Actually, the code uses an almost exact clone called StaxSdkAppServer2 that simply splits creating the server instance from starting it.
