# Solving Sudoku in Anger

I guess most of you will be familiar with Sudoku. These puzzles appear in my newspaper every day and I have to say, when every such a semi mathematical puzzle comes by it intrigues me and I tend to get addicted. And when I get addicted, I start to think of a computer program to solve the puzzles. This feeling is strongest when my human incapabilities (carelessness, eagerness) infallibly mislead me in messing up one of the puzzles.

Last week one evening when I messed up another one, I couldn't resist and started up the computer. It took me most of the night (It was about 6 when I finally got to bed), but the puzzles that my newspaper brings me will haunt me no longer: I implemented a Java program that solves Sudokus.

The **model** of these games is pretty simple. To make it a little generic, I stick to this description:

- There is a
*Game*that consists of a rectangular array of*Field*s. - The
*Field*s can be assigned a*Value*, which could be any word but normally will be one of the numbers "1" to "9". - The
*Game*also contains a set of*Group*. These*Group*consist of one or more of the*Field*s and cannot contain the same*Value*(assigned to one of those*Field*s) more than once.

This is a fairly loose description of the puzzle. It doesn't restrict the groups to a size that is equal to the number of assignable values, or even speak of a restricted set of assignable values for that matter. These features of a normal Sudoku game, however, are not strictly necessary to implement a model of the puzzle. They are necessary to implement a winning strategy though! Notice that a *Game* in this implementation is responsible for checking that a value can be assigned to a field following rule 3.

The general idea of a **strategy** for this puzzle is that based on all moves (assignments of values to fields) so far a strategy can suggest another move. To be able to suggest a move, the strategy has to listen to all moves that have been made and maintain a certain statistic on the fields of the game, that might help it to determine a possible or even obligatory move. The interface for a *MoveStrategy* in this implementation is rather simple:

```
public interface MoveStrategy {
/**
* Do a move, if any can be suggested.
*
* @return whether a move has been made.
*/
boolean doMove();
/**
* Notifies the strategy that a move has been made.
*
* @param field The field that has obtained a value.
*/
void notifyMove (Field field);
/**
* Clears all state in the strategy.
*/
void reset ();
}
```

Now, when thinking about actual implementations of a strategy, lets start with the simple one. In the 3x3 games below, the columns and rows form groups and the set of assignable values consists of the numbers "1","2" and "3".

1 | _ | _ |

2 | _ | _ |

_ | _ | _ |

It's easy to see that the number "3" must be assigned to the last cell in the first column, otherwise it must be assigned a value that is already assigned to a field in the column. This strategy is called *LastValueStrategy*, since it assigns the last free value to the last free field in a group. To implement this strategy, we keep track of the number of free values for each group. Starting with an empty game, all assignable values are free values for all groups. Whenever a field has been assigned a value, we decrease the number of free values for each group that the field is in. As soon as a group reaches has only one free value, we find the (a?) free field in the group. The strategy now suggests to assign the free value to this field.

Obviously this strategy is not capable of solving even the simplest Sudoku puzzle. It is a very straight forward and easy to calculate strategy though. Lets look at the following game.

1 | _ | _ |

_ | _ | _ |

_ | 2 | _ |

No field in the first row of this game can be assigned the value "1". No field in the second column can be assigned "2". This means that the field in row 1 and column 2 must be assigned the value "3" (The same holds for the field in column 1 and row 3). This strategy is called *PossibleValueForFieldStrategy*. To implement it, the strategy maintains a set of possible values for all fields. Whenever a field is assigned a value, that assigned value is removed from the set of possible values for all fields that are in a group with the assigned field. Whenever there is a sole single value for a field, the strategy suggests to assign this value to the field.

Now, this strategy seems quite powerfull, but it falls silent for the next game, although there is an obligatory move to make.

1 | _ | _ |

_ | _ | _ |

_ | 1 | _ |

No field in the first column can be assigned the value "1", neither can any field in the second column. This means that the first two fields in the second row cannot hold the value "1", thus the last field in this row must be assigned this value (someone has to take it...). This strategy is called *PossibleFieldForValueInGroup*. The strategy maintains a set of free fields for each value, **per group**. For the group that consists of the first column of the game, there are two free fields for value "2" and two for value "3". There are no free fields for value "1", since it has yet been assigned in this group. For the last column there is only one free field for value "1". To maintain the correct state for this strategy, whenever a field is assigned a value, all fields that are in a group with the assigned field are "forbidden" fields; the assigned value cannot be assigned to any of the fields. For every group in the game, these fields are removed from the possible fields for the assigned value. Furthermore, the assigned field is removed as possible field for all values in all groups.

I was still thinking about guessing strategies, that would choose a field and a value to try and then backtrack when an inconsistency is found, but when I ran these three strategies against the puzzles that my newspaper had brought me, they were solved in mere milli seconds! I have to admit that the type of sudoku game is rather typical and actually rather simple. In the puzzles published in the newspaper, there are four additional groups. These consist of a block of 9 cells spaced evenly in the game. There is a *GameSpecification* interface and a *GroupSpecification* interface to help construct other types of games (leaving out the extra groups and/or adding new ones). The interface to this game is currently very simple. There is a *ConsolePlayer* that will start a command line client. It can read simple games from text files or follow instructions. This is surely an area of improvement.

When experimenting further, I found that either of the latter two strategies could solve all puzzles. Furthermore, the "classic" Sudoku puzzles (without the additional groups) can also be solved with these simple strategies. So far, I haven't been able to find a single Sudoku puzzle that the strategies do not solve. Maybe you can help?

* disclaimer:* I am aware of other programs that solve Sudoku puzzles, but I still like my simple, little implementation enough to share it with you all.

## Comments

Of course, right after publishing this post, I ran into this puzzle:

2__67____

__6___2_1

4_____8__

5____93__

_3_____5_

__28____7

__1_____4

7_8___6__

____53__8

It appears not to be solved by the strategies. Anybody interested in improving the program?

Ahhh... Sudoku's, a favourite passtime of mine :-)!

Nice article, there are of course different strategies which you have still missed. Things like 2 cells in a group which can only have 2 possible values (thus eliminating those values from all other cells in the group. Same goes with triplets. Try to feed it any 4-star Sudoku from "Denksport".

You are right, I missed a few. I think the one you describe will solve the problem in the first comment. Haven't come around to implement it yet (though shouldn't be too dificult)