Sudoku Generator#

Description: Generate Sudoku boards with unique solution via iterative method and mip formulation.

Tags: mip, heuristics, puzzles, amplpy

Notebook author: Marcos Dominguez Velad <marcos@ampl.com>

Model author: Marcos Dominguez Velad <marcos@ampl.com>

Solving Sudokus with Mixed-Integer Linear Programming techniques is one of the most well-known examples to show Mathematical Optimization potential.

This popular puzzle can be solved in many different ways, from using bitmasks to speed-up a backtracking approach, to Constraint Programming solvers. There are many Sudoku variants (take a look to Cracking the Cryptic’s Youtube channel), but what about puzzle generation? Getting Sudoku instances is also a hard problem.

Sudokus are usually unique, in the sense that, given a grid partially filled, there should only exists one solution. Gary McGuire, Bastian Tugemann, and Gilles Civario proved that at least 17 numbers are necessary to ensure the uniqueness of a partially filled Sudoku (2012). If there are less than 17 clues, there are, for sure, multiple solution. However, if there are 20 clues, it is possible that there are multiple solutions.

A partially filled Sudoku is called “irreducible” if there is only one way to complete it, and removing any of the clues allows multiple solutions.

Find out whether a Sudoku is irreducible or not is not straightforward. We are integrating AMPL and Highs within an iterative method to generate irreducible puzzles.

Algorithm description#

First, generate a completely filled Sudoku. Given an empty Sudoku, it is not difficult to find one solution by shifting consecutive blocks of 3 numbers, and permuting them correctly.

. 456 . 789
. 789 . 123
. 123 . 456
...............
. 564 . 897
. 897 . 231
. 231 . 564
...............
. 645 . 987
. 987 . 312
. 312 . 645

Save solved Sudoku solution in \(T\) matrix (template).
Guess which clues are necessary. Loop over all the tiles (i in 1..9, j in 1..9). For each tile \((a,b)\):

3.1. Solve the Sudoku using a MIP solver. Sudoku models use binary variables x[i,j,k] whose value is 1 if the tile \((i,j)\) ends up having the value \(k\), 0 otherwise. There are a couple of tweaks before solving:

Drop the "Fix tile $(a,b)$" constraint, so $(a,b)$ is not forced to take the value $T_{a,b}$. This is the same as forgetting this clue for this particular solve.

Add the following objective function:

\[ minimize \quad x_{a,b,T_{a,b}} \]

If the Sudoku has a unique solution, the optimal value is going to be 1, and tile $(a,b)$ is going to be the same as $T_{a,b}$. If the solution is not unique, the $(a,b)$ tile can be filled in more than one way, so the solver will not use $x[a,b,k]=1$ to minimize: there will be some $k' \neq T_{a,b}$ (not added to the objective) such that $x[a,b,k']=1$, so the optimal value will be 0. Solve this problem.

3.2. Analyze the optimal value found. If the solution was unique (objective = 1) remove \(T_{a,b}\) clue and the Sudoku would be determined. If not (objective = 0), keep \(T_{a,b}\) clue (since removing it prevents the Sudoku from being irreducible).

If there are remaining tiles, back to step 3.
Finally, \(T\) template must contain an irreducible Sudoku.

Theorem: the previous algorithm returns an irreducible Sudoku. Proof:

By induction, let’s show that \(T\) will have a unique solution after each iteration. In the first iteration the solution is unique. Suppose \(T\) provides a Sudoku template with unique solution after \(k\) iterations. In the iteration \(k+1\), remove clue \((a,b)\) (if not, the template remains the same). If the solution is not unique, there should be many ways to complete the Sudoku, and \((a,b)\) value must be different from \(T_{a,b}\). Fix that tile according to step 3.2, so the template is the same as in iteration \(k\), which had unique solution. It is possible to determine if the solution is unique or not by checking the optimal value of the problem solved in step 3.1.

We have proved that the solution is unique. Let’s see that the Sudoku obtained is also irreducible. By contradiction, suppose there exist some tiles \((i,j)\) with \(T_{i,j}\) filled such that when any of them is removed, the solution is still unique. Pick the last one of these tiles that was processed in the iterative method. When iterated over that \((i,j)\), multiple solutions should have been found since the tile is fixed. But that is a contradiction, because later steps would only clear or keep clues, so the alternative solutions that show up when removing \((i,j)\) clue are feasible if the clue is removed.

Model#

# Install dependencies
%pip install -q amplpy

# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["highs"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

The algorithm and mixed-integer linear model is generalized to any (square) Sudoku size.

We are using a weighted objective function to simplify the algorithm implementation. When processing tile \((a,b)\), set \(w_{a,b,T_{a,b}} = 1\), and \(w_{i,j,k} = 0\) for any other tile. Only the \(x\) variable whose digit is \(T_{a,b}\) is added to the objective. Any non-processed tile should have a \(T_{a,b}\) value, as the initial template is filled.

Another alternative to obtain a initial Sudoku could be using random weights in the objective function and solving the problem with that objective.

%%ampl_eval
reset;

# PARAMETERS AND SETS
param n; # Sudoku size, usually n=3
set ROWS := 1..n^2;
set COLS := 1..n^2;
set DIGITS := 1..n^2;
set SUBSQUARES{sr in 1..n, sc in 1..n} =
    setof {i in ROWS, j in COLS : sr = ceil(i/n) and sc = ceil(j/n)} (i,j);

param T{ROWS, COLS} default 0 <= n^2; # Template
param Weights{ROWS, COLS, DIGITS} default 0; # Objective function

# VARIABLES
var x{ROWS,COLS,DIGITS}, binary;

# CONSTRAINTS
subject to Diff_Rows{i in ROWS, k in DIGITS}:
	sum{j in COLS} x[i,j,k] = 1;

Diff_Cols{j in COLS, k in DIGITS}:
	sum{i in ROWS} x[i,j,k] = 1;

Diff_Subsq{sr in 1..n, sc in 1..n, k in DIGITS}:
	sum{(i,j) in SUBSQUARES[sr,sc]} x[i,j,k] = 1;

One_Digit{i in ROWS, j in COLS}:
	sum{k in DIGITS} x[i,j,k] = 1;

Fill_Tile{i in ROWS, j in COLS : T[i,j] > 0}:
	x[i,j,T[i,j]] = 1;
    
minimize f: sum{i in ROWS, j in COLS : T[i,j] > 0} Weights[i,j,T[i,j]]*x[i,j,T[i,j]];

First template#

To start the algorithm (step 1) we need a completely filled Sudoku.

import itertools

n = 3
ampl.param["n"] = n

solution = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [4, 5, 6, 7, 8, 9, 1, 2, 3],
    [7, 8, 9, 1, 2, 3, 4, 5, 6],
    [2, 3, 1, 5, 6, 4, 8, 9, 7],
    [5, 6, 4, 8, 9, 7, 2, 3, 1],
    [8, 9, 7, 2, 3, 1, 5, 6, 4],
    [3, 1, 2, 6, 4, 5, 9, 7, 8],
    [6, 4, 5, 9, 7, 8, 3, 1, 2],
    [9, 7, 8, 3, 1, 2, 6, 4, 5],
]

rows = range(1, n * n + 1)  # 1..n^2
ampl.param["T"] = {
    (i, j): solution[i - 1][j - 1] for i, j in itertools.product(rows, rows)
}
ampl.solve(solver="highs")
assert ampl.solve_result == "solved", ampl.solve_result

Solution determined by presolve;
objective f = 0.

Iterative method#

As the order of the tiles matter, iterate over them in a random order. For example, first processed tiles are going to be cleaned, so if they are not ordered, the resulting Sudoku will have many empty squares on the top.

Drop the constraint related to the clue, set the weight value to 1, solve, analyze the optimal value in order to clean the clue or restore the constraint, and finally set the weight to 0.

We use Highs (open source MIP solver) to solve the MIP instances.

import random
import time

# Step 2
x = ampl.var["x"]
solution = {
    (i, j): k
    for (i, j, k) in itertools.product(rows, rows, rows)
    if x[i, j, k].value() > 0
}
# Initial template
ampl.param["T"] = solution
tiles = list(itertools.product(rows, rows))
# Tiles are shuffled, as the very first tiles are going to be cleared.
random.shuffle(tiles)

t0 = time.time()
# Step 3
for i, j in tiles:
    # Step 3.1
    # Clear clue + Add variable to the objective
    ampl.eval("drop Fill_Tile[" + str(i) + "," + str(j) + "];")
    ampl.param["Weights"][
        i, j, solution[i, j]
    ] = 1  # minimize x[i,j,T[i,j]] is the objective
    # Solve
    ampl.solve()
    assert ampl.solve_result == "solved"
    # Step 3.2
    # Check if the objective changed
    if ampl.obj["f"].value() > 0:  # Unique solution: Clear clue
        ampl.param["T"][i, j] = 0
    else:  # Multiple solution: Fix tile
        ampl.eval("restore Fill_Tile[" + str(i) + "," + str(j) + "];")
    # Take out from the objective
    ampl.param["Weights"][i, j, solution[i, j]] = 0
t1 = time.time()

Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: 0; Iter: Time   9.775e-08; average =   9.775e-09; Bound =   9.932e-06
HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
1 branching nodes
HiGHS 1.4.0: 0; Iter: Time   4.053e-08; average =   4.053e-09; Bound =   4.069e-06
HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
1 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes
Solution determined by presolve;
objective f = 1.
Solution determined by presolve;
objective f = 1.
HiGHS 1.4.0: HiGHS 1.4.0: optimal solution; objective 0
0 simplex iterations
0 branching nodes

def show_sudoku(T):
    for i in rows:
        for j in rows:
            print(
                int(T[i, j]) if int(T[i, j]) > 0 else ".",
                end=" | " if j % n == 0 and j < n * n else "",
            )
        print("\n" + "-" * ((n - 1) * 3 + n * n) if i % n == 0 else "")


print("--- Sudoku template with unique solution:\n")
show_sudoku(ampl.param["T"])
print("Time: " + str(t1 - t0) + " seconds")

--- Sudoku template with unique solution:

... | ..6 | .8.
45. | ..9 | 1..
..9 | 1.. | ...
---------------
2.1 | ..4 | .9.
5.. | 8.. | ...
... | .3. | .64
---------------
3.. | 6.. | ..8
... | ... | 31.
.7. | ..2 | .4.
---------------
Time: 1.4933266639709473 seconds

This method requires solving one MIP for each tile. Fortunately, most of these problems are easy to solve as AMPL keeps the previous solution as initial guess.

Solving an empty Sudoku with random weights in the objective function is a way to have different initial templates. The order of the tiles also matters, as it determines the clues that are going to be removed.

Similar methods could be used to find out alternative solutions to a Sudoku grid. This is a way to solve the “Another Solution Problem” for Sudokus (a difficult problem that consists in finding out if there are alternative solutions to a problem).

References:

McGuire, G., Tugemann, B., & Civario, G. (2014). There is no 16-clue Sudoku: Solving the Sudoku minimum number of clues problem via hitting set enumeration. Experimental Mathematics, 23(2), 190-217.