
Wed, Oct. 7th, 2009, 07:35 pm
Sudoku and the way of the SAT solver

Writing a Sudoku solver is a slightly tricky but reasonably straightforward problem. Here I'll present a non-obvious way of implementing it which is short in terms of lines of code and also much easier to modify and extend than the more direct approaches. The general technique is applicable to a very wide range of problems, so hopefully someone will read this and then find it useful in the future.

My general approach is in multiple steps. First, express the Sudoku problem as a much more general type of problem called a SAT problem (short for 'satisfiability', traditionally capitalized for no particular reason). Then solve the SAT problem, then translate the solution back into the original format.

A SAT problem works on boolean variables, and the problem is to find an assignment which satisfies every one of a list of clauses. An individual clause states that at least one of a set of variables and their negations must be true; for example 'A or B or C' and 'not B or not D or E' are both typical clauses. To translate a Sudoku problem into a SAT problem, I make one boolean variable for each possible state of each cell of the original problem, so there's one variable for whether the upper-left cell is a one, another for whether it's a two, etc. Translating the constraints of the Sudoku problem is then straightforward - each digit must occur exactly once in each row, column, and subsquare, and each pair of variables for the same digit in a row, column, or subsquare must not both be true (expressed as: either the first one is false or the second one is false).
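
To make the encoding concrete, here's a minimal sketch of the clauses for 'exactly one of three variables is true' (the helper name is mine; the integer-literal convention - a positive integer for a variable, its negation for 'not' - matches the code later in the post):

```python
# Sketch: generate the clauses for "exactly one of these variables is
# true", in the integer-literal CNF convention (positive int = variable,
# negative int = its negation). The full code below builds Sudoku's
# constraints out of exactly these two kinds of clauses.
def exactly_one(variables):
    clauses = [list(variables)]  # at least one must be true
    for i in range(len(variables)):
        for j in range(i):
            # no two may both be true: 'not v[i] or not v[j]'
            clauses.append([-variables[i], -variables[j]])
    return clauses

print(exactly_one([1, 2, 3]))
# [[1, 2, 3], [-2, -1], [-3, -1], [-3, -2]]
```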

To solve the SAT problem, the solver first scans for an empty clause, one containing no variables. If there is one, then the problem is insoluble and it returns. Otherwise, it scans for clauses containing exactly one variable. If there is such a clause, it makes that assignment and recurses. If not, it finds the clause with the smallest number of variables in it (skipping the 'not A or not B' clauses, because every variable appears in about the same number of those) and recurses twice, setting its first variable to true and then to false.

The advantage of this approach is that the heuristics for noticing when there's only one possible value left in a row, column, subsquare, or individual cell don't have to be special-cased - they're all subsumed by the single simple deduction that if a clause contains only one variable, then that assignment has to be made. As a result, adding, for example, a constraint that each digit must occur only once on each diagonal is trivial, and making irregular shapes and adding novel types of constraints is also fairly easy, while it would be quite difficult in the more obvious but less flexible approaches to writing Sudoku solvers.
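
As a concrete illustration of that flexibility, here's a sketch (the names are mine) of the two extra cell groups a diagonals constraint needs, assuming the row-major cell numbering 0-80 used by the code in this post:

```python
# Sketch (names mine): the two diagonals of a 9x9 grid, with cells
# numbered 0..80 in row-major order as in the solver below. Appending
# these to the existing list of cell groups (svals in the post's code)
# would give them the same once-per-digit clauses as rows and columns.
main_diag = list(range(0, 81, 10))  # cells 0, 10, 20, ..., 80
anti_diag = list(range(8, 73, 8))   # cells 8, 16, 24, ..., 72
print(main_diag)
print(anti_diag)
```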

It would be possible to make this code a lot faster, even without changing the architecture - for example, one could build an index of which variables occur in which clauses, to get rid of all the linear scans. But for solving a small number of Sudokus, or even a largeish number, taking a few seconds each is no big deal, and this approach is much more about maintainability, flexibility, and extensibility. Interestingly, for a large number of real-world problems the fastest known way to solve them is to translate to SAT and run a general SAT solver with appropriately tuned parameters on the result. Obviously this is a hack, but applying custom heuristics to the unencoded problem is so messy and difficult, and so much work has been done on fine-tuning the general-purpose solvers, that more often than not the theoretically better approach can't compete in practice. Sudoku has some very special properties which tip the balance in favor of a very specialized custom bit of code. In fact I made my SAT solver use a very Sudoku-specific branching heuristic, and skipped the common pure-literal rule - if a variable appears only positively or only negatively, assign it that way - because it doesn't apply here. But such customization being warranted is highly unusual, and even in this case the general SAT solver has some clear advantages, starting with containing fewer lines of code.

I used exception handling to indicate when a solution is found, adding assignments to it in handle-and-reraise clauses, mostly to make the point that this is a perfectly valid and highly pragmatic way of using exceptions.

A flat file of the code can be found here, but I've included it verbatim below.

By the way, the method of testing I used on this is to run it on a Sudoku problem with no initial constraints, which is a simple one-line test and does a good job of hitting all the edge cases regardless of the implementation details.

class SolutionFound(BaseException):
	def __init__(self):
		self.s = []

def assign(clauses, val):
	# Simplify clauses under the assignment that literal val is true:
	# drop satisfied clauses, and strip -val from the rest.
	n = []
	for c in clauses:
		if val in c:
			continue
		elif -val in c:
			n.append([x for x in c if x != -val])
		else:
			n.append(c)
	return n

def try_sat(clauses, val):
	try:
		solve_sat(assign(clauses, val))
	except SolutionFound as s:
		s.s.append(val)
		raise s

def solve_sat(clauses):
	if [] in clauses:
		return
	if len(clauses) == 0:
		raise SolutionFound()
	for x in clauses:
		if len(x) == 1:
			try_sat(clauses, x[0])
			return
	# Branch on the shortest clause which starts with a positive literal,
	# skipping the 'not A or not B' clauses.
	smallest = clauses[0]
	for c in clauses:
		if c[0] > 0 and (smallest[0] < 0 or len(smallest) > len(c)):
			smallest = c
	try_sat(clauses, smallest[0])
	try_sat(clauses, -smallest[0])

svals = [list(range(x, x+9)) for x in range(0, 81, 9)] + \
		[list(range(x, x+81, 9)) for x in range(9)] + \
		[[x,x+1,x+2,x+9,x+10,x+11,x+18,x+19,x+20] for x in [0,3,6,27,30,33,54,57,60]]

def group(c, p):
	# Exactly one of the variables in c is true: at least one of them,
	# plus pairwise at most one.
	p.append(list(c))
	for i in range(len(c)):
		for j in range(i):
			p.append([-c[i], -c[j]])

def groups(c, p):
	for i in range(9):
		group([x + 100 * i + 1 for x in c], p)

def decode(r):
	r2 = [[0 for x in range(9)] for y in range(9)]
	for v in r:
		if v > 0:
			pos = (v-1) % 100
			r2[pos // 9][pos % 9] = (v // 100) + 1
	return r2

def solve_sudoku(problem):
	p = []
	for i in range(81):
		group(list(range(i+1, 901, 100)), p)
	for s in svals:
		groups(s, p)
	pre = []
	for x in range(9):
		for y in range(9):
			if problem[x][y]:
				m = x * 9 + y + 100 * (problem[x][y] - 1) + 1
				pre.append(m)
				p = assign(p, m)
	try:
		solve_sat(p)
	except SolutionFound as s:
		return decode(s.s + pre)
	return None

import sys
# the solver recurses a couple of frames per assignment, which can
# exceed the default limit
sys.setrecursionlimit(10000)

if __name__ == '__main__':
	tt = [[0] * 9 for i in range(9)]
	print('\n'.join(' '.join([str(p) for p in q]) for q in solve_sudoku(tt)))
	print(' ')
	tt[0][0] = 6
	tt[3][4] = 1
	print('\n'.join(' '.join([str(p) for p in q]) for q in solve_sudoku(tt)))

Thu, Oct. 8th, 2009 04:31 am (UTC)

Very cool! When I wrote my Clue solver, I tried reducing it to a SAT problem, but eventually had to give up and use Clue-specific logic because it took forever. I was just using a textbook algorithm, so it's possible using a "real" solver would have helped a lot...

Thu, Oct. 8th, 2009 05:38 am (UTC)

OK, so now I had to compare your code with my sudoku solver. (I had to write one, as I was spending far too much time solving them manually.) I took twice as many lines (in perl) to write a sudoku specific solver, but mine had a user interface and comments. You win on elegance, but I actually used mine to cure a sudoku addiction.
(Deleted comment)

Fri, Oct. 9th, 2009 12:49 pm (UTC)

I worked on the blackbox solver, eons ago, before constraint solvers were even a general category, and came up with walksat for it. Constraint solvers have come a long way since then though, I keep meaning to read up on how survey propagation works but haven't waded through it yet.

Fri, Oct. 9th, 2009 12:50 pm (UTC)

I wrote a custom SAT solver in this case just to show the technique. It is of course more practical to just use a general one.

Thu, Oct. 8th, 2009 01:29 pm (UTC)
knowbuddy: TMTOWTDI

I teach an Advanced Database Structures course for a Bachelors program. Our review for the mid-term exam is to write a Sudoku solver in SQL. It's a great review, as it ends up using unions, several types of joins, aggregate functions, and most of the base concepts they'll need for the second half of the course. I get everyone up at the white board and we figure out, together, how to solve the generic case, then variations.

It turns out that you need 4 or 5 tables and as many (very simple) queries to populate them with data. Once you've done that, it's just a repetition of two queries to iteratively solve the puzzle.

The students always have fun with it, even though they are disappointed to see how easy it is.

Fri, Oct. 9th, 2009 07:42 am (UTC)

I've not investigated SAT solvers much, but some of the things I would worry about as regards to generalizing to other types of Sudoku puzzles would be:

How well does it handle constraints that might not be satisfied at all? For example, in a variant I call an "Udoku" puzzle, each row/column/area has all but one of the digits, but you don't know which one is missing.

How well does it handle negative constraints? For example, a "Non-consecutive Sudoku" has the additional constraint that digits that are next to each other can't be consecutive. Generally implementing this constraint results in adding something like 72 constraints for each of 144 pairs of adjacent digits, and the number of pure constraints tends to balloon.

How well does it handle arithmetic constraints (which can often end up being polynomial in complexity)? For instance, in my recent book "Mutant Sudoku", we have "Target Sum Sudoku", where a smattering of cells are circled in each grid, and you're told that in each row, column, and region, the circled cells add up to a magic sum (for example, 11).

I mostly mention these three examples because they are the types that tend to run the slowest with my generalized Sudoku Solver (which is based on Dancing Links and Knuth's Algorithm X, but with a customization I added to handle arbitrary column weights).

Fri, Oct. 9th, 2009 12:37 pm (UTC)

Udoku can be represented straightforwardly. The clauses saying that a digit must appear in each cell remain, but the clauses which say that a digit must occur at least once in each row and column go away. They're redundant anyhow, but they speed things up a lot, hence the Udoku solver running slower.

Non-consecutive Sudoku can be trivially built using the obvious clauses. The ballooning isn't a big deal, or at least, is far less than the ballooning necessary to do the basic encoding of sudoku in the first place.
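
Those obvious clauses might be generated along these lines - a sketch under the post's variable numbering (var = cell + 100 * (digit - 1) + 1, cells row-major), with helper names of my own:

```python
# Sketch, not from the original post. Uses the post's variable numbering:
# the variable for "cell holds digit" is cell + 100 * (digit - 1) + 1,
# with cells numbered 0..80 in row-major order.
def var(cell, digit):
    return cell + 100 * (digit - 1) + 1

def nonconsecutive_clauses():
    clauses = []
    for cell in range(81):
        r, c = divmod(cell, 9)
        neighbors = []
        if c < 8:
            neighbors.append(cell + 1)  # right neighbor
        if r < 8:
            neighbors.append(cell + 9)  # neighbor below
        for n in neighbors:
            for d in range(1, 9):
                # adjacent cells can't hold d and d+1 in either order
                clauses.append([-var(cell, d), -var(n, d + 1)])
                clauses.append([-var(cell, d + 1), -var(n, d)])
    return clauses

print(len(nonconsecutive_clauses()))  # 144 adjacent pairs x 16 clauses
```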

Target sum sudoku can be expressed by having cells for additive values. For example, there's a cell for all values of A + B, with clauses containing 3 elements essentially making a lookup table for sums, then another one for A + B + C with another set of clauses doing the same thing for adding up the A + B cells and the C cell, and so on for everything making up a single sum constraint. Such constraints are by their nature quite loose and can easily result in high runtimes.
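
A sketch of those 3-element lookup clauses (the helper and the fresh-variable scheme are mine, not from the post's code): each clause says 'if A is a and B is b, then the sum variable for a+b is true':

```python
# Sketch (names and variable scheme mine): build a CNF lookup table for
# A + B. a_vars[d] / b_vars[d] are the variables for "A == d+1" and
# "B == d+1"; sum_var(v) returns a fresh variable meaning "A + B == v".
def sum_lookup_clauses(a_vars, b_vars, sum_var):
    clauses = []
    for i, av in enumerate(a_vars):
        for j, bv in enumerate(b_vars):
            # not(A == i+1) or not(B == j+1) or (A + B == i+j+2)
            clauses.append([-av, -bv, sum_var(i + j + 2)])
    return clauses

# Example: A's variables are 1..9, B's are 101..109, and the sum value v
# gets variable 1000 + v.
table = sum_lookup_clauses(list(range(1, 10)), list(range(101, 110)),
                           lambda v: 1000 + v)
print(len(table))  # 81 clauses, one per (a, b) pair
```

Chaining another such table over the A + B variables and C's variables handles three-cell sums, and so on.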

For cases where dancing links can represent the problem straightforwardly it will generally do better than a general SAT solver, because it's doing the same logic with some implementation tricks which make it faster. It's sort of in the same vein though, in that dancing links supports generalized constraints, just not as generalized as full-blown CNF. (That's Conjunctive Normal Form, which is the type of constraint I describe in the post.)

Your comments about which problems take longer to solve I think correspond accurately with the innate difficulty of those problems, so different techniques will generally give similar relative performance times.

Sat, Oct. 10th, 2009 07:03 am (UTC)

For many problems, it may be easier to use an SMT solver: http://combination.cs.uiowa.edu/smtlib/solvers.html
These are a generalisation of SAT solvers which have a higher-level input.

Sat, Oct. 10th, 2009 11:06 pm (UTC)

Heh. My solver tries to be human-like, but merges a whole bunch of advanced techniques into a 2-SAT reduction.