EJCP: #9 Stopping threads

Posted by Peter Veentjer around lunchtime: February 12, 2008

Continuing the Enterprise Java Concurrency Problem Top 10 countdown, it's time to talk about number 9 'Stopping threads'.

Stopping a Thread is complicated. In the beginning the Thread.stop() method was added for this purpose, but it has been deprecated for a long time. The reason is that when some thread calls stop() on a victim thread, a ThreadDeath error is thrown inside the victim thread, and while this error propagates up the call stack, all locks the victim thread owns are released. This means that the victim thread could leave objects (on which it held locks) in an inconsistent state. So stopping a thread should be a cooperative mechanism between threads: the victim thread and the initiating thread both need to agree upon the protocol being used.

Luckily I don't often see Thread.stop() (or even worse, Thread.destroy()) being used, but the idiom displayed in the example below is something I do see regularly, although the exception handling is sometimes forgotten. This Task is executed by some thread, and when a different thread wants to stop this Task, it calls the stop method on the Task, which sets the stop variable to true. As soon as the victim thread starts the next iteration of the loop, it reads the stop variable, ends the loop and returns from the run method (returning from the run method eventually terminates the victim thread).

public class Task implements Runnable{
	private boolean stop = false;

	public void stop(){
		stop = true;
	}

	public void run(){
		while(!stop){
			try{
				runSingleUnit();
			}catch(Exception ex){
				//log it, in most cases we
				//don't want to break the loop
			}
		}
	}

	public void runSingleUnit()throws Exception{
		System.out.println("executing some service");
	}
}

The problem is that this example is faulty. There are 2 reasons, and both are linked to the Java Memory Model (JMM), which after a long period has been completely defined for Java 5 and higher (see JSR 133 for more information):

  1. visibility
  2. reorderings

Visibility

Because the read and write of the stop variable are not special (so not volatile, not final, and not done inside a synchronized context), the JVM doesn't provide any guarantee that the most recently written value is read (there are more ways to provide visibility guarantees, but they are out of scope for this post). This means that the victim thread could see the initially written value until the end of time, and this effectively transforms the loop into an infinite one.

You might wonder why a thread is not able to see the most recently written value, because this must surely be some kind of error. Well... it isn't, because this desired behaviour, called sequential consistency, prevents a lot of optimizations from being used. There are different forms of local memory (multi-level caches, CPU registers) where values can be stored instead of in main memory. This can result in the following scenarios:

  1. local memory doesn't need to contain the value most recently written by another thread. So a thread that uses this local memory doesn't need to see writes made by other threads.
  2. local memory can contain the value most recently written by the current thread without that value being visible in main memory, so a write doesn't have to be visible to other reading threads.

The chance of this happening with the strong cache coherence most CPUs provide (realized by snooping and sniffing caches, for example) isn't that big, but it certainly is possible. And concurrency problems have the tendency to appear only once in a while and are therefore very hard to reproduce (they could also be hardware dependent).

Reasoning about caches, invalidation, memory fences and the like can complicate the platform-independent view Java developers have. Therefore the JMM is defined in terms of happens-before rules, where each rule defines a relation between 2 actions (read/write/volatile-read/volatile-write/lock-acquire/lock-release), and this should shield us from reasoning about these low-level issues. If a happens-before relation between 2 actions (in this case the normal write and the normal read) can be found, all changes made in the first action (the write) are visible in the second action (the read). In this case there is no such relation, so no guarantee can be given that the write of the stop variable in the stop method (executed by the initiating thread) will ever be visible to the read of the stop variable in the run method (executed by the victim thread).

Reorderings

The compiler could decide to hoist the read of the stop variable out of the loop, a performance-increasing technique called 'loop-invariant code motion':

public void run(){
	if(stop)
		return;

	while(true){
		try{
			runSingleUnit();
		}catch(Exception ex){
			//log it
		}
	}
}

This optimization is possible because the stop value is not changed inside the loop, so why do an expensive memory access every time? As you can imagine, the new run method never ends once it has passed the if(stop) check. This reordering is allowed because from the victim thread's point of view the transformed code is equivalent to the original, so the within-thread as-if-serial semantics of the transformed code are the same as those of the original.

Solution

The simplest way to solve these problems is to make the stop variable volatile. Volatile variables prevent:

  1. visibility problems: there is a happens-before relation (called the volatile variable rule) between the write to a volatile variable and every subsequent read of the same variable, so a read will always see the most recently written value.
  2. reordering problems: the instructions after the volatile read of the stop variable are not allowed to be moved in front of this read, and every volatile read has to observe the most recent write, so the repeated reads cannot be replaced by a single read. This prevents the movement (reordering) of the check outside the loop.
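
As a minimal sketch of the volatile fix applied to the Task from above: the class name VolatileStopDemo, the counter in runSingleUnit() and the stopsWithin helper are hypothetical and only there to make the example self-contained; the only change that matters is the volatile modifier on the stop field.

```java
class VolatileStopDemo {

    static class Task implements Runnable {
        // volatile: the write in stop() happens-before every later read in run()
        private volatile boolean stop = false;
        private long units = 0; // only touched by the worker thread

        public void stop() {
            stop = true;
        }

        @Override
        public void run() {
            while (!stop) { // a fresh volatile read on every iteration
                try {
                    runSingleUnit();
                } catch (Exception ex) {
                    // log it, in most cases we don't want to break the loop
                }
            }
        }

        public void runSingleUnit() throws Exception {
            units++; // stands in for a short unit of real work
        }
    }

    // Hypothetical helper: starts the task, requests a stop and reports
    // whether the worker thread ended within the given timeout.
    static boolean stopsWithin(long timeoutMillis) {
        Task task = new Task();
        Thread worker = new Thread(task, "worker");
        worker.setDaemon(true); // don't keep the JVM alive if the stop fails
        worker.start();
        try {
            Thread.sleep(50);   // let the worker run some iterations
            task.stop();        // executed by the initiating thread
            worker.join(timeoutMillis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(2000));
    }
}
```

Removing the volatile modifier from this sketch will probably not make it hang on most machines, which is exactly why this kind of bug is so hard to catch with a test.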

But performing the read and the write under a lock (it has to be the same lock for both) also fixes the problem:

public synchronized void stop(){
	stop = true;
}

private synchronized boolean isStopped(){
	return stop;
}

public void run(){
	while(!isStopped()){
		...
	}
}

The reason why the synchronized approach works:

  1. visibility: there is a happens-before relation between the write of stop and the read of stop (called the monitor lock rule) that makes sure that all changes made before releasing the lock in the stop method are visible when the lock is acquired in the isStopped() method.
  2. reordering: instructions prior to the lock acquire are not allowed to be moved past the release of the lock. This principle is called the 'Roach Motel': instructions in front of the lock acquire are not allowed to jump over the lock release, but are allowed to jump over the lock acquire. The same principle applies to instructions after the lock release: they are not allowed to jump over the lock acquire, but they are allowed to jump over the lock release. Instructions that are between the lock acquire and the lock release are not permitted to jump over the lock release or the lock acquire (a cockroach doesn't want to leave its dark shelter).
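
The "same lock" requirement is easy to overlook, so here is a rough sketch that makes the shared lock explicit. It is a variation on the snippet above, not a replacement for it: it uses a private lock object instead of synchronized methods, and the names SynchronizedStopDemo, lock and stopsWithin are made up for this illustration.

```java
class SynchronizedStopDemo {

    static class Task implements Runnable {
        private final Object lock = new Object(); // the one lock guarding 'stop'
        private boolean stop = false;             // only accessed while holding 'lock'
        private long units = 0;                   // only touched by the worker thread

        public void stop() {
            synchronized (lock) {  // the release of 'lock' publishes the write
                stop = true;
            }
        }

        private boolean isStopped() {
            synchronized (lock) {  // acquiring the same 'lock' makes the write visible
                return stop;
            }
        }

        @Override
        public void run() {
            while (!isStopped()) {
                units++; // stands in for a short unit of real work
            }
        }
    }

    // Hypothetical helper: starts the task, requests a stop and reports
    // whether the worker thread ended within the given timeout.
    static boolean stopsWithin(long timeoutMillis) {
        Task task = new Task();
        Thread worker = new Thread(task, "worker");
        worker.setDaemon(true); // don't keep the JVM alive if the stop fails
        worker.start();
        try {
            Thread.sleep(50);
            task.stop();
            worker.join(timeoutMillis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(2000));
    }
}
```

If stop() and isStopped() synchronized on two different objects, the monitor lock rule would no longer connect the write to the read, and the code would be as broken as the original example.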

This stopping technique is quite useful if the execution of a single iteration doesn't take too much time. If it takes longer, different techniques need to be considered, like thread interruption. For more information about stopping threads I recommend chapter 7 "Cancellation and Shutdown" from 'Java Concurrency in Practice'. For more information about the JMM I recommend chapter 16 "The Java Memory Model" from the same book, or check out the JMM website.
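
To give an impression of what thread interruption could look like when a single iteration can block for a long time, here is a minimal sketch that combines the stop flag with an interrupt. It is only one possible approach under a few assumptions: Thread.sleep stands in for an arbitrary blocking call that responds to interruption, and the names InterruptStopDemo, runner and stopsWithin are hypothetical.

```java
class InterruptStopDemo {

    static class Task implements Runnable {
        private volatile boolean stop = false;
        private volatile Thread runner; // the thread currently executing run()

        public void stop() {
            stop = true;            // the flag records the stop request
            Thread t = runner;
            if (t != null) {
                t.interrupt();      // wakes the victim if it is blocked
            }
        }

        @Override
        public void run() {
            runner = Thread.currentThread();
            while (!stop) {
                try {
                    Thread.sleep(60_000); // stands in for a long blocking unit of work
                } catch (InterruptedException ex) {
                    // nothing to do: the loop condition re-reads the stop flag
                }
            }
        }
    }

    // Hypothetical helper: starts the task, requests a stop and reports
    // whether the worker thread ended within the given timeout.
    static boolean stopsWithin(long timeoutMillis) {
        Task task = new Task();
        Thread worker = new Thread(task, "worker");
        worker.setDaemon(true); // don't keep the JVM alive if the stop fails
        worker.start();
        try {
            Thread.sleep(100);  // give the worker time to block
            task.stop();
            worker.join(timeoutMillis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(2000));
    }
}
```

Because the flag is set before the interrupt is sent, the stop request survives even if the interrupt status gets cleared somewhere along the way; the interrupt only serves to wake the victim thread up early.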

13 Responses to “EJCP: #9 Stopping threads”

  1. Maarten Winkels Says:

    Hi Peter,

    Very interesting and well written blog!

    I find instructions jumping over other instructions rather hard to follow, but the comparison is very imaginative.

    Are there any utility classes that can be used with this behavior? You're using the 'synchronized' keyword here to do locking, but IIRC there are a few other ways to do that (using the java.util.concurrent package). Would this work in a similar fashion?

    Regards!

  2. Nirav Thaker Says:

    Very Explanatory! Thanks!

  3. Peter Veentjer Says:

    Hi Maarten,

    thank you!

    The synchronized keyword is not really meant as an exclusion mechanism (something most developers use it for) but it is used to define the happens before relation between the write and the read.

    But there are other classes you can use, like AtomicBoolean. You do need to make sure that the field is final (or volatile):

    private final AtomicBoolean stop = new AtomicBoolean(false);

    If the final is omitted, there is no direct happens-before relation between the write of the stop variable (in the constructor) and its usage in the stop() or run() method. It could be that the environment provides this mechanism (the Spring application context, for example):

    http://blog.xebia.com/2007/03/01/spring-and-visibility-problems/

    But it makes reasoning about visibility much harder.

  4. Lars Vonk Says:

    Nice one Peter. I have a question regarding stopping via interruption: What exactly do you mean by “execution of a single iteration doesn’t take too much time”? Or are you coming back to the interruption of Threads later on in your top 10?

  5. ilgvars Says:

    thread.interrupt() and thread.isInterrupted() will do the same

  6. Erik Says:

    Hi,

    interesting read. Would declaring stop as volatile solve the problem? Under 1.4.2? Under 1.5+? Maybe you can briefly explain what “volatile” means.

    Thanks.

  7. Peter Veentjer Says:

    >> What exactly do you mean by “execution of a single iteration doesn’t take too much time”?

    If a single iteration takes 5 minutes, for example, the shutdown of the thread can take 5 minutes. If the task blocks for an undetermined amount of time, you can’t say anything about the shutdown duration.

    >> Or are you coming back to the interruption of Threads later on in your top 10?

    I don’t think so. If you really want to know more about shutting down threads, please have a look at “Java Concurrency in Practice” chapter 7.

  8. Peter Veentjer Says:

    >> thread.interrupt() and thread.isInterrupted() will do the same

    True.. partly.. I have seen way too much code that just gobbles up the interrupted exception.

    try{
        // ..some blocking call
    }catch(InterruptedException ex){
        throw new RuntimeException(ex);
    }

    In this case the interrupted status is removed from the thread, and the InterruptedException is hidden. So for the loop it is very hard to figure out what to do. If you look at the ThreadPoolExecutor, it also doesn't rely on the Thread interrupt status alone, because it is very easy to break.

  9. Peter Veentjer Says:

    >> Would be declaring stop as volatile solve the
    >> problem? Under 1.4.2? Under 1.5+? Maybe you can
    >> explain shortly what “volatile” means.

    Hmm.. under the 1.4 virtual machine the variable is visible, but the reordering is still possible. The behavior of this code under 1.4 is not very well defined. The 1.4 VMs should not be used; they are broken.

    Volatile (under 1.5) does 2 things:
    -prevent visibility problems
    -prevent reorderings

  10. Stephen Zhang Says:

    public void run()
    {
        while(!Thread.currentThread().isInterrupted())
        {
            //some actions
        }
    }

    public void stop( )
    {
        Thread.currentThread().interrupt();
    }

    and making some other thread call the stop method would be enough to safely stop a thread, though you may need to add some housekeeping code to implement an interruption policy, like cleaning up unfinished operations etc.

  11. Someone Says:

    For what it’s worth, we have that class in our internal framework. It’s called a MortalThread!

  12. Peter Veentjer Says:

    @Stephen Zhang,

    In principle this solution would work. But when a thread hits an interruption-tripwire (an interrupted exception is thrown when such a tripwire is passed), the interrupt status is removed. And if the InterruptedException is wrapped in a RuntimeException (a practice I see quite often) all information about the desired stop is lost.

    Another problem is that you don’t always want to interrupt, but want to stop as soon as the thread is ready for the next iteration.

    So personally I would still go for a stop flag (to prevent not stopping) and maybe for the interrupt (if a thread needs to be shutdown immediately).

  13. JMM: Thank God or the Devil for Strong Cache Coherence? « Blog of Peter Veentjer Says:

    […] Model problems, but the compiler can cause it as well. Currently it already is possible that a compiler optimization could break an application, no matter how strong the cache coherence of the hardware is. And no […]
