Mirrored from Sudopedia, the Free Sudoku Reference Guide


Generalized Fishy Cycles

(sorry for the dollar signs, I haven't found yet how to do LaTeX on this wiki)

Contents

Preliminaries

In an incomplete sudoku, we construct a graph as follows. Take a fixed symbol $s$ (in a typical sudoku, $s\in\{1,\ldots,9\}$) which can still be placed somewhere on the board. As vertices, we take all positions in the sudoku where the symbol $s$ can still be placed. Two vertices are connected iff their corresponding positions belong to the same row, column or zone.

Definition 1.1. We call an edge `strong' if the two connected positions are the only ones in their respective row, column or zone. We call an edge `weak' if the two connected positions are not the only ones in their respective row, column or zone.

Remark 1.2. Note that this is not necessarily well-defined, since they can be the only ones in their row/column but not in their zone, or vice versa. In this case, we can immediately remove all other options for symbol $s$ in that zone or row/column, and consider the link as strong.

Remark 1.3. The choice for these names can be explained as follows:

We will mark the vertices where the $s$ is placed with $1$ and the others with $0$, hence a strong link will have $(0,1)$ or $(1,0)$, while a weak link can have $(0,0)$, $(0,1)$ or $(1,0)$. This marking function is denoted $f:V\to\{0,1\}$, the set of all marking functions is denoted $M$.

Notation 1.4. We usally denote a strong link between vertices $v$ and $w$ as $v=w$, and a weak link as $v-w$. For example, a $3$-cycle with two strong links and one weak link can then be written as $v_1-v_2=v_3=v_1$. If we want to denote a link without specifying wether it is weak or strong, we denote $v\sim w$.

Reducing the cycles

Now we will study the cycles in this graph. In a common sudoku, there are many such cycles (hundreds, even thousands in some cases), but only a few turn out to be useful (i.e. give information). We will study in detail which cycles contribute information, and how a human (or computer) can look specifically for these cycles.

But first we have to define information.

Definition 2.1. We obtain information about an edge if \begin{itemize} \item either the edge is strong, and at least one of the markings $(0,1)$ or $(1,0)$ is not possible, \item or the edge is weak, and at least one of the markings $(0,0)$, $(0,1)$ or $(1,0)$ is not possible. \end{itemize} We obtain information from the chain if we obtain information about any of its edges.

Now we can begin our classification, but we start with an important remark which will greatly reduce the number of cases to be handled.

Remark 2.2. A cycle of the form $v_1\sim \cdots \sim v_i=v_{i+1}=v_{i+2}\sim \cdots \sim v_1$ gives information if and only if the shorter cycle $v_1\sim \cdots \sim v_i\sim v_{i+3}\sim \cdots \sim v_1$ does. In other words, we may remove any two consecutive strong links by merging the three respective vertices to one big vertex.

We define the formal term rewriting system $\Phi$ as the system applying the above shortening on the set of (weak-strong) cycles. To find out wether a cycle provides information, it is sufficient to study wether its normal form (with respect to $\Phi$) provides information. Hence, we only need to study cycles in which there are no two consecutive strong links. Now classify these cycles and the information they provide, depening on the parity of the length of the cycle (an invariant in the term rewriting process) and the maximum number of consecutive weak links.

Conclusions on unreduced weak-strong cycles

Translating our information back to the unreduced model is simple: if we have a conclusion on a vertex (i.e. all possible markings on one of its incident edges have either $0$ or $1$ on that vertex) and that vertex gets expanded to a pair of strong links, then that expansion must be either from $0$ to $0=1=0$ or from $1$ to $1=0=1$.

The reduced forms from the previous section are handy for the classification and its drawings. However, for some human players it may slow them down slightly if they can't do the reduction in their mind.

How do we avoid using the reduction? Well, instead of looking at consecutive weak links, one can look at the parity of the number of strong links between two weak links. Two weak links are consecutive in the reduced cycle if and only if there is an even number of strong links between them in the original cycle. With this information, one can easily do the exercice of writing up the classification without reduction now.

Finding the useful cycles and translating our conclusions back to the sudoku

There are often many, many cycles in this graph. However, only a few usually give information, so it's crucial to spot the difference quickly. An algorithm to find these cycles with minimal effort could be the following:

For every vertex $v_1$, manually track all cycles starting and ending in $v_1$, but stop the current cycle as soon as one of the following happens:

hereby we note that the second stop condition is just the unreduced extension of the first, you can also reduce the cycles in your head while tracing them, and only use the first stop condition.

In this way, we skip all cycles that do not give extra information: in its reduced form, such a cycle allows at most one pair of two consecutive weak links, with $v_1$ the middle vertex. This removes most of the information-free cycles (in fact, all of them, except the trivial case where all links are strong), and none of the cycles that can contribute information. This algorithm is easy to implement on a computer, but with some routine it can also be easily done by a human.

Finally, translating our information back to the sudoku is easy:

Final remarks

I will end with a short word on the techniques which are generalized by this one. Technically, it also generalizes the trivial techniques such as Full House, Last Digit, Hidden Single and Naked Single. However, these are usually spotted on sight so you will never get any benefit from them being generalized by this cycle procedure.

More interestingly, it generalizes several of the currently known single-digit techniques, including several Fishes and Single Digit Patterns, and some chain techniques such as Fishy Cycles. While harder to apply than most of these individual techniques, I find it is (once used to it) certainly less time-consuming than applying each trick separatedly.

This page was last modified 13:39, 10 January 2010.