# Sudoku solver

## Question

How to describe a sudoku and the sudoku rules in Opasnet so that it can be solved automatically?

### Procedure

The following terms are used:

• The possible space of solutions is described by a logical vector A, which contains all possible values given the current information. A is indexed by h, i, j, k, and l. A develops in time, when more information occurs or is processed. In the beginning, all hypotheses are considered as potentially TRUE.
• h = {1, 2, ..., 9}, i = {1, 2, ..., 9}, j = {1, 2, ..., 9}, k = {1, 2, ..., 9}, l = {1, 2, ..., 81} are indices for hypothesis, row, column, area, and cell, respectively. Note l is just another way to say (i,j), as l = i + (j-1)*9. Also, k is known if (i,j) is known, as k = ceiling(i/3) + (ceiling(j/3)-1)*3. However, k does not contain all information that (i,j) or l contains.
• a_h and b_h are sub-vectors of A such that a_h = A_{h,m} and b_h = A_{h,n}, where m and n (m < n) are values of index l.
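
The index relations above can be checked with a few lines of R. This is only a minimal sketch: the helper names `cell_index` and `area_index` are illustrative assumptions and do not appear in the solver code.

```r
# Hypothetical helpers illustrating the index definitions above.
cell_index <- function(i, j) i + (j - 1) * 9                            # l from (i, j)
area_index <- function(i, j) ceiling(i / 3) + (ceiling(j / 3) - 1) * 3  # k from (i, j)

grid <- expand.grid(i = 1:9, j = 1:9)
grid$l <- cell_index(grid$i, grid$j)
grid$k <- area_index(grid$i, grid$j)

# l identifies each (i, j) pair uniquely; k does not, because nine cells share each area.
length(unique(grid$l))   # 81
table(grid$k)            # 9 cells in each of the 9 areas
```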

1. Expand the missing index values of the Hypotheses table to create the full A.
2. Take the sudoku data table and replace hypotheses with data, if available.
3. Compare all two-cell pairs at once using matrices.
   1. Create matrices of all critical properties: SudRow, SudCol, SudArea, and Hypothesis. These are compared pairwise, two cells at a time.
   2. Use rules to deduce if a pair is incompatible or not.
   3. Aggregate plausible hypotheses in the second cell across all plausible values in the first cell. This gives a set of hypotheses that are plausible in at least some conditions.
   4. Aggregate along the first cell: each second cell must be compatible with all other (first) cells.
   5. Do not apply these rules if the first and second cells are the same.
4. Go through every row, column, and area and find hypotheses where there is exactly one cell where it is plausible.
   1. Remove all other hypotheses from these cells.
5. Do steps 3 and 4 until the sudoku does not improve (i.e., further hypotheses are not falsified).
6. Take a user-defined list of cells for which a random sample from plausible hypotheses is taken. It would be elegant to do this stepwise, but for simplicity, it is done at once. Therefore, the choice of cells is critical, and a wise user will not select cells whose uncertainties are clearly interdependent.
7. Solve all sudokus that are created in the sampling.
8. If a sudoku results in cells with zero plausible hypotheses, remove that iteration.
9. Calculate the number of different solutions still plausible and print it.
10. If the number is smaller than 100, print also the solutions.
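
The pairwise comparison in step 3 can be sketched on a toy case. The example below assumes a single row of three cells with values 1 to 3, so only the row rule applies; the object names (`toy`, `compatible`, `survives`) are illustrative and are not taken from the solver code.

```r
# Toy case: one row of three cells, values 1-3, only the row rule applies.
toy <- expand.grid(Hypothesis = 1:3, SudCell = 1:3)
toy$Result <- TRUE
toy$Result[toy$SudCell == 1 & toy$Hypothesis != 2] <- FALSE  # cell 1 is known to be 2

samehypo <- outer(toy$Hypothesis, toy$Hypothesis, "==")
samecell <- outer(toy$SudCell, toy$SudCell, "==")

# compatible[a, b]: candidate a (first cell) is plausible and does not clash with b.
compatible <- toy$Result & !samehypo   # all cells share the row in this toy case
compatible[samecell] <- NA             # no inference within the same cell

# For each candidate b: does every other cell keep at least one compatible candidate?
per_cell <- apply(compatible, 2, function(x) tapply(x, toy$SudCell, any))
survives <- apply(per_cell, 2, function(x) all(x, na.rm = TRUE))
toy$Result <- toy$Result & survives

toy[toy$Result, c("SudCell", "Hypothesis")]
```

In this toy case, value 2 is removed from cells 2 and 3 because the only plausible value in cell 1 clashes with it.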

## Rationale

Development needed to improve the solver:

1. Change the sudoku input string so that a cell can also hold possibilities other than a single number or any number. For example, a syntax listing four numbers would mean that any of those four numbers could be in a particular position. This makes interpreting the sudoku string somewhat problematic, because its length is then no longer fixed at 81. However, this makes it possible to restrict the possibilities iteratively.
2. When guessing, a new rule should be used: if a particular hypothesis is falsified in ALL scenarios, it can be falsified always. This rule can be extended in a way that when no unique solution is found, the solver would make guesses automatically and gradually falsify all wrong hypotheses. --Jouni (talk) 09:36, 31 January 2016 (UTC) (type: truth; paradigms: science: defence)
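
The proposed rule in item 2 could look roughly like the sketch below. It assumes each guess scenario is a logical vector over the same hypotheses (in the solver this would presumably be the `Result` column of each scenario); the `scenarios` list is made-up illustration data.

```r
# Hypothetical example: three guess scenarios over the same five hypotheses.
scenarios <- list(
  c(TRUE,  FALSE, TRUE,  FALSE, FALSE),
  c(TRUE,  FALSE, FALSE, TRUE,  FALSE),
  c(FALSE, FALSE, TRUE,  TRUE,  FALSE)
)

# A hypothesis stays plausible if it is TRUE in at least one scenario;
# it is falsified for good only when it is FALSE in every scenario.
still_plausible <- Reduce(`|`, scenarios)
still_plausible
```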

### Data

This table will be expanded by fillna to be a 9*9*9 array (formatted as data.frame). As default, each hypothesis is assumed to be true unless shown otherwise.

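List of all possible hypotheses, which are a priori assumed to be true.

Hypotheses (Boolean)

| Obs | SudRow | SudCol | Hypothesis | Result |
|-----|--------|--------|------------|--------|
| 1 | 1 | | | TRUE |
| 2 | 2 | | | TRUE |
| 3 | 3 | | | TRUE |
| 4 | 4 | | | TRUE |
| 5 | 5 | | | TRUE |
| 6 | 6 | | | TRUE |
| 7 | 7 | | | TRUE |
| 8 | 8 | | | TRUE |
| 9 | 9 | | | TRUE |
| 10 | | 1 | | TRUE |
| 11 | | 2 | | TRUE |
| 12 | | 3 | | TRUE |
| 13 | | 4 | | TRUE |
| 14 | | 5 | | TRUE |
| 15 | | 6 | | TRUE |
| 16 | | 7 | | TRUE |
| 17 | | 8 | | TRUE |
| 18 | | 9 | | TRUE |
| 19 | | | 1 | TRUE |
| 20 | | | 2 | TRUE |
| 21 | | | 3 | TRUE |
| 22 | | | 4 | TRUE |
| 23 | | | 5 | TRUE |
| 24 | | | 6 | TRUE |
| 25 | | | 7 | TRUE |
| 26 | | | 8 | TRUE |
| 27 | | | 9 | TRUE |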
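
The expansion to a full 9*9*9 hypothesis space can be imitated in base R. The sketch below uses `expand.grid` as a stand-in for the `fillna` expansion mentioned above, so it shows the intended result rather than the OpasnetUtils mechanism itself; the name `full` is illustrative.

```r
# Stand-in for the fillna() expansion, using base R only.
full <- expand.grid(SudRow = 1:9, SudCol = 1:9, Hypothesis = 1:9)
full$Result <- TRUE   # every hypothesis starts as plausible

nrow(full)            # 729 = 9 * 9 * 9
```
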
Rules of inference

The table is for illustration only, because the code is too complex to implement from a table entry. Rules 1-4 come directly from the rules of the sudoku game. All other rules are logically derived from them.

Rules (-)

| Obs | Rule name | Rule | Description |
|-----|-----------|------|-------------|
| 1 | rule1 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same row as A. | |
| 2 | rule2 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same column as A. | |
| 3 | rule3 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is in the same area as A. | |
| 4 | rule4 | If cells A and B are actually the same cell, no inferences are made. | This is a stronger rule than the others. |
| 5 | rule5 | If a hypothesis in cell B is TRUE at least once given all hypotheses in cell A, the hypothesis in B is considered TRUE. | This rule loses a lot of information about dependencies between cells, but it saves huge amounts of memory. Even so, the set of rules effectively narrows down the potential hypothesis space. |
| 6 | rule6 | Rules 1-5 apply to all pairs of cells A and B. | |
| 7 | rule7 | If a hypothesis in cell B is FALSE after applying rule 5 for even one cell A, it is always FALSE. | |
| 8 | rule8 | If a hypothesis is TRUE in exactly one cell in a particular row, all other hypotheses in that cell are FALSE. | |
| 9 | rule9 | If a hypothesis is TRUE in exactly one cell in a particular column, all other hypotheses in that cell are FALSE. | |
| 10 | rule10 | If a hypothesis is TRUE in exactly one cell in a particular area, all other hypotheses in that cell are FALSE. | |
| 11 | rule11 | If two cells on the same row contain exactly two TRUE hypotheses that are the same in both cells, these hypotheses cannot be TRUE in any other cell on that row. | This rule is analogous to rule1 but with two cells. However, this rule (and the respective rules for columns and areas) is not implemented in the code. |
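
Rules 8-10 can be illustrated on a toy unit. The sketch below assumes one unit (a row, column, or area) of three cells with values 1 to 3; it is a simplified illustration with made-up names, not the solver's own function.

```r
# Toy unit of three cells; value 3 is plausible only in cell 3.
toy <- expand.grid(Hypothesis = 1:3, SudCell = 1:3)
toy$Result <- TRUE
toy$Result[toy$SudCell %in% 1:2 & toy$Hypothesis == 3] <- FALSE

# Count, for each hypothesis, the cells of the unit where it is still plausible.
n_cells <- tapply(toy$Result, toy$Hypothesis, sum)
unique_hypo <- as.integer(names(n_cells)[n_cells == 1])

# Find the cells holding those unique hypotheses and clear their other candidates.
hit <- toy$Result & toy$Hypothesis %in% unique_hypo
fixed_cells <- toy$SudCell[hit]
toy$Result[toy$SudCell %in% fixed_cells & !hit] <- FALSE

toy[toy$Result, c("SudCell", "Hypothesis")]
```

Here cell 3 keeps only value 3, while cells 1 and 2 keep values 1 and 2.
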
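Sudoku data (this is "the most difficult sudoku in the world").

Sudoku (-)

| Obs | SudRow | Given digits (columns 1-9, left to right) |
|-----|--------|-------------------------------------------|
| 1 | 1 | 8 |
| 2 | 2 | 3, 6 |
| 3 | 3 | 7, 9, 2 |
| 4 | 4 | 5, 7 |
| 5 | 5 | 4, 5, 7 |
| 6 | 6 | 1, 3 |
| 7 | 7 | 1, 6, 8 |
| 8 | 8 | 8, 5, 1 |
| 9 | 9 | 9, 4 |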
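
The solver code reads a puzzle as an 81-character string, row by row, with a space for each empty cell. The sketch below shows that parsing step in isolation; the `puzzle` string is a made-up illustration (only two givens), not one of the sudokus on this page.

```r
# Illustrative string only: 8 in cell (1,1), 4 in cell (9,9), everything else empty.
puzzle <- paste0("8", strrep(" ", 79), "4")
stopifnot(nchar(puzzle) == 81)

chars <- strsplit(puzzle, split = "")[[1]]
grid <- matrix(chars, nrow = 9, ncol = 9, byrow = TRUE)  # the string is read row by row

# Long format, with empty cells stored as "".
given <- data.frame(
  SudRow = rep(1:9, each = 9),
  SudCol = rep(1:9, times = 9),
  dataResult = gsub(" ", "", chars)
)
given[given$dataResult != "", ]
```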

### Formula

The code takes three user inputs: the sudoku as a string with spaces in empty cells (`userdata`), an optional list of cells to guess (`cellguess`), and whether to show intermediate results (`verbose`).

```
library(OpasnetUtils)

hypotheses <- tidy(opbase.data("Op_en5817.hypotheses"))
for (i in 1:3) {
  hypotheses[[i]] <- ifelse(
    hypotheses[[i]] == "",
    NA,
    as.integer(as.character(hypotheses[[i]]))
  )
}
hypotheses <- unique(fillna(hypotheses, marginals = c(1, 2, 3)))

# Get the sudoku data
# data <- tidy(opbase.data("Op_en5817.sudoku"), objname = "data")
# data$SudRow <- as.numeric(data$SudRow)
# data$SudCol <- as.numeric(data$SudCol)

# Each puzzle string must hold 81 characters, read row by row,
# with a space for every empty cell (SudMake relies on this length).

# World's hardest
data <- " 8 36 7 9 2 5 7 457 1 3 1 68 85 1 9 4 "
data <- " 4 2 9 58 6 7 5 8 751 8 4 4 8 2 765 7 2 2 1 68 1 3 9 "
data <- " 3 8 2 5 8 6 71 4 2 124 967 9 4 35 4 2 6 3 2 6 "
data <- " 45 81 2 6 5 4 9 1 4 9 8 89 2 45 5 8 9 6 2 1 7 9 3 35 41 "
# Four stars
data <- " 23 9 5 6 9 76 5 6 38 4 523 7 39 4 2 94 8 4 3 5 72 "

if (!is.null(userdata)) data <- userdata

# Full hypothesis space: 9 rows * 9 columns * 9 hypotheses, all initially TRUE.
hypotheses <- data.frame(
  SudRow = rep(rep(1:9, each = 9), times = 9),
  SudCol = rep(rep(1:9, each = 9), each = 9),
  Hypothesis = rep(rep(1:9, times = 9), times = 9),
  Result = TRUE
)

# Combine the hypothesis space with the given digits.
SudMake <- function(hypotheses, data) {
  hypotheses$SudRow <- as.numeric(hypotheses$SudRow)
  hypotheses$SudCol <- as.numeric(hypotheses$SudCol)
  hypotheses$SudCell <- hypotheses$SudRow + (hypotheses$SudCol - 1) * 9
  hypotheses$SudArea <- ceiling(hypotheses$SudRow / 3) +
    (ceiling(hypotheses$SudCol / 3) - 1) * 3
  hypotheses$Result <- as.logical(hypotheses$Result)
  data <- data.frame(
    SudRow = rep(1:9, each = 9),
    SudCol = rep(1:9, times = 9),
    dataResult = gsub(" ", "", strsplit(gsub("\n", "", data), split = "")[[1]])
  )
  # oprint(tapply(data["Hypothesis"], data[c("SudRow", "SudCol")], function(x) paste(x, sep = "", collapse = "")))
  out <- merge(hypotheses, data)
  out$dataResult <- as.character(out$dataResult)
  # Where a digit is given, only the matching hypothesis stays TRUE.
  out$Result <- ifelse(
    out$dataResult == "",
    out$Result,
    as.character(out$dataResult) == as.character(out$Hypothesis)
  )
  return(out)
}

# Rules 8-10: a hypothesis that is plausible in exactly one cell of a unit
# (given by condition) becomes the only hypothesis of that cell.
ykköskarsinta <- function(out2, condition) {
  # Find the condition-hypothesis combinations.
  valinta <- as.data.frame(as.table(
    tapply(out2$Result, out2[c("Hypothesis", condition)], sum) == 1
  ))
  valinta <- valinta[valinta$Freq, colnames(valinta) != "Freq"] # restrict to the unique solutions
  valinta <- merge(valinta, out2)        # merge with the SudCell information
  valinta <- valinta[valinta$Result, ]   # remove the unnecessary rows
  # print(valinta$SudCell)
  # First reject all hypotheses in the single-candidate cells.
  out2[out2$SudCell %in% valinta$SudCell, ]$Result <- FALSE
  # print(paste(out2$Hypothesis, out2$SudCell) %in% paste(valinta$Hypothesis, valinta$SudCell))
  # Then restore the correct answer in the single-candidate cells.
  out2[paste(out2$Hypothesis, out2$SudCell) %in%
    paste(valinta$Hypothesis, valinta$SudCell), ]$Result <- TRUE
  return(out2)
}

SudSolve <- function(sudoku, verbose = FALSE) {
  iteraatio <- 1
  repeat {
    if (verbose) SudShow(sudoku)

    n <- nrow(sudoku)
    # rule6: compare all pairs of (cell, hypothesis) rows at once.
    samerow <- matrix(sudoku$SudRow, nrow = n, ncol = n)
    samerow <- samerow == t(samerow)
    samecol <- matrix(sudoku$SudCol, nrow = n, ncol = n)
    samecol <- samecol == t(samecol)
    samearea <- matrix(sudoku$SudArea, nrow = n, ncol = n)
    samearea <- samearea == t(samearea)
    samehypo <- matrix(sudoku$Hypothesis, nrow = n, ncol = n)
    samehypo <- samehypo == t(samehypo)
    samecell <- matrix(sudoku$SudCell, nrow = n, ncol = n)
    samecell <- samecell == t(samecell)

    rule1 <- !(samerow & samehypo)
    rule2 <- !(samecol & samehypo)
    rule3 <- !(samearea & samehypo)
    # rule4 <- ifelse(samecell, NA, TRUE)

    temp <- sudoku$Result & rule1 & rule2 & rule3 # & rule4
    temp <- ifelse(samecell, NA, temp) # This rule (rule4) must override all others.

    temp2 <- apply(temp, 2, function(x) tapply(x, sudoku$SudCell, any)) # rule5
    temp3 <- apply(temp2, 2, function(x) { # rule7
      x <- ifelse(is.na(x), TRUE, x)
      return(all(x))
    })

    sudtemp <- sudoku
    sudtemp$Result <- ifelse(sudtemp$Result, temp3, sudtemp$Result)

    # Stop if some cell has no plausible hypothesis left.
    if (any(tapply(sudtemp$Result, sudtemp["SudCell"], sum) == 0)) {
      if (verbose) {
        warning("Implausible solution\n")
      }
      return(sudtemp)
    }

    sudtemp <- ykköskarsinta(sudtemp, "SudCol")  # rule9
    sudtemp <- ykköskarsinta(sudtemp, "SudRow")  # rule8
    sudtemp <- ykköskarsinta(sudtemp, "SudArea") # rule10

    iteraatio <- iteraatio + 1
    test <- all(sudtemp$Result == sudoku$Result)
    if (verbose) {
      cat("Has the solution changed during iteration", iteraatio, "?", !test, "\n")
    }
    sudoku <- sudtemp
    if (test || iteraatio > 15) break
  }
  return(sudoku)
}

# Print the remaining plausible hypotheses of each cell as a 9 * 9 table.
SudShow <- function(sudoku) {
  if (is.data.frame(sudoku)) sudoku <- list(sudoku)
  for (i in seq_along(sudoku)) {
    out <- sudoku[[i]]
    out <- out[out$Result, ]
    out <- tapply(
      out$Hypothesis,
      out[c("SudRow", "SudCol")],
      function(x) paste(x, sep = "", collapse = "")
    )
    if (exists("oprint")) oprint(out) else print(out)
  }
}

SudGuess <- function(sudoku, cellguess, verbose = FALSE) {
  if (is.data.frame(sudoku)) sudokulist <- list(sudoku) else sudokulist <- sudoku
  for (i in cellguess) { # each additional cell separately
    if (verbose) cat("Guessing at cell", i, "\n")
    lap <- 0
    templist <- list()
    for (j in seq_along(sudokulist)) { # each sudoku scenario separately
      sudo <- sudokulist[[j]]
      nscen <- nrow(sudo[sudo$SudCell %in% i & sudo$Result, ])
      for (k in 1:nscen) { # each possible alternative of the selected cell separately
        faagi <- rep(FALSE, nscen)
        faagi[k] <- TRUE
        temp <- sudo
        temp[temp$SudCell %in% i & temp$Result, "Result"] <- faagi
        temp <- SudSolve(temp)
        if (verbose) SudShow(temp)
        # Keep the scenario only if every cell still has a plausible hypothesis.
        if (!any(tapply(temp$Result, temp["SudCell"], sum) == 0)) {
          templist[[lap + k]] <- temp
        }
      }
      lap <- lap + k
    }
    templist <- templist[!sapply(templist, is.null)]
    sudokulist <- templist
  }
  if (length(sudokulist) == 1) sudokulist <- sudokulist[[1]]
  return(sudokulist)
}

example <- SudMake(hypotheses, data)
SudShow(example)

example <- SudSolve(example, verbose = verbose)
SudShow(example)

if (!is.null(cellguess)) {
  example <- SudGuess(example, cellguess, verbose = verbose)
  SudShow(example)
}
```