Mirrored from **Sudopedia**, the Free Sudoku Reference Guide

## Contents |

The following simplistic definition was given on the Terminology page:

- For any given computer program which creates puzzles, a random puzzle essentially is a puzzle which is created chaotically, that is to say so randomly that the created puzzle would never (for all practical purposes) be created again.

This is a pragmatic, but most certainly a controversial view point. The fundamental problem is that it is hard to define what is meant by *random* without substantial clarification.

- Does a random puzzle have to be
*valid*? - Does a random puzzle have to be
*minimal*? - Does a random puzzle have to be
*canonical*?

Without answers to these questions *random* is ill defined.

But before we dissect *random* further, let's define how a puzzle can be created *chaotically* in a bit more detail. As per the definition, if a long series of puzzles was created then the same puzzle would be unlikely to be created again. For both a chaotic method and a purely random method, it is very very *unlikely* but not *impossible* for the same puzzle to be created twice in a row. It is however more likely that a *chaotic* method will create the same puzzle twice in a row than a perfectly *random* method.

Conceptually it would be easy to test the two. A particular puzzle is picked as the target and puzzles are generated randomly until the target puzzle is created again. The count of how many puzzles were generated between the duplicates could tabulated as a trial. Over many trials the results could be compared to the exact theoretical expectations. Of course the probability of detecting the non-randomness of a chaotic method would require many samples of a massive number of puzzles. It would be totally impractical.

To start from the beginning, it is know that there are exactly 6,670,903,752,021,072,936,960 different Sudoku solution grids. That is about 6.67E+21 different patterns of the 9x9 digits in a completed Sudoku puzzle. That number is much smaller that the ways to simply randomly fill each of 81 cells in the grid with one of the digits 1 to 9 which is 1.96E+77. It is also much smaller than the number of possible Latin Squares for a 9x9 grid which is about 5.52E+27.

But many puzzles can be made from each solution grid? It is not know exactly how many Sudoku puzzles exist. But we can create a crude upper limit. Image 81 bits which can be 0 or 1. If 0, then the cell does not have a clue, if it is 1 then the cell does have a clue. So for each solution grid there are about 2^81 or 2,417,851,639,229,258,349,412,352 (about 2.41E+24) different puzzles. So there are about: 6.67E+21 x 2.41E+24 = 1.60747e+46 different puzzles in total. This is a very loose interpretation of the word puzzle simply meaning that one or more solutions exits for the incomplete grid.

Conceptually we could pick a random number between 1 and 1.60747e+46, then generate the puzzle of the corresponding number, and we'd have a *random puzzle*. Unfortunately it is not that simple.

To show how simplicity fails the situation consider the following facts:

- There are exactly 6,670,903,752,021,072,936,960
**different Sudoku solution**grids - There are exactly 5,472,730,538
**Unique Solution**grids - For each
**Unique Solution**grid there are exactly 1,218,998,108,160**grid permutations**.

**Different Sudoku solution** grids means the pattern of 1s-9s in the 81 cells.

**Grid permutations** are what we get when we renumber the grid (substitute 1s for 9s, and 9s for 1s for example. Group theory also allows for various patterns of swapping of rows and columns. All the permutations of a particular grid form a *solution group*.

**Unique Solution** grids are formed by selecting just one of each of the possible Grid permutations to represent its solution group.

But if we multiply the number of Unique Solution grids by the number of grid permutations we do not get the number of different Sudoku solution grids. 5,472,730,538 x 1,218,998,108,160 = 6,671,248,172,291,458,990,080

There are fewer different solution grids because some of the permutations are duplicates. This requires a somewhat complicated explanation. Now when the number of permutations are counted, it is unique permutations that are counted one 9 looks just like another. But let's imagine that we use the box number to subscript all the 81 digits. Now each permutation will be unique. But there are permutations for which we erase the subscripts and two permutations become identical. So we can't even choose a random number between 1 and 6.670903E+21 to represent a random grid.

A list of the solution grids has been created. However there is not a list of how many permutations are valid for each solution grid.

So of the roughly 1.60747e+46 different puzzles, it is unknown how many are *valid* and *minimal* puzzles. There are several different aspects to a puzzle being counted in the total number of puzzles.

- Does the puzzle have any solution?

- If we started with a valid solution grid, then the puzzle must have one or more solutions. if the puzzle has ten 9s, then it has to be inconsistent, and hence
*invalid*.

- If we started with a valid solution grid, then the puzzle must have one or more solutions. if the puzzle has ten 9s, then it has to be inconsistent, and hence

- Is there one one unique solution?

- In general the notion of a
*puzzle*would require one unique solution.

- In general the notion of a

- Is the puzzle minimal?

- Essentially minimal means that every clue is needed to make the puzzle unique.

Even with the above questions answered, the answers to the simple question *How many puzzles exist? is unknown. *

The best estimate for the total number of minimal puzzles is from Berthier's paper and is:

- 3.1055 +/- 0.0020 E37

If the various symmetries are considered, and the 3.1055E37 puzzels are reduced to canonical form, then the best estimate for the number of unique puzzles is also from Berthier's paper and is:

- 2.5477 +/- 0.0016 E+25

The best we could do is put a crude upper limit on the number of uniquely different puzzles. We can surmise from experimental efforts to find puzzles with different number of clues that the minimum number of clues needed is 17 and that no more than 39 for a minimal puzzle. So a crude upper limit of the number of **unique minimal puzzles** is:

#Clues ways to pick %minimal Total minimal 17 1.2845E+17 ? ? 18 4.5670E+17 ? ? 19 1.5143E+18 ? ? 20 4.6944E+18 ? ? 21 1.3636E+19 ? ? 22 3.7190E+19 ? ? 23 9.5400E+19 ? ? 24 2.3055E+20 ? ? 25 5.2565E+20 ? ? 26 1.1322E+21 ? ? 27 2.3063E+21 ? ? 28 4.4478E+21 ? ? 29 8.1288E+21 ? ? 30 1.4090E+22 ? ? 31 2.3180E+22 ? ? 32 3.6219E+22 ? ? 33 5.3780E+22 ? ? 34 7.5924E+22 ? ? 35 1.0196E+23 ? ? 36 1.3028E+23 ? ? 37 1.5844E+23 ? ? 38 1.8346E+23 ? ? 39 2.0228E+23 ? ? Total=9.9653E+23 total correct for minimal?

- 5,472,730,538 x 9.9653E+23
- = 5.472e+9 x 9.9653E+23 =
**5.453e+33**

The fraction of minimal random puzzles so generated has to be pretty small because the number has to come down by about eight orders of magnitude.

The best estimate

- Unbiased Statistics of a Constraint Satisfaction Problem – a Controlled-Bias Generator by Denis Berthier