Sudoku solver: Difference between revisions

(new version)
(→‎Rationale: improvement ideas for later)
 
(7 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Decision analysis]]
[[Category:Decision analysis]]
[[Category:Contains R code]]
{{method|moderator=Jouni|stub=Yes}}
{{method|moderator=Jouni|stub=Yes}}


Line 7: Line 8:


== Answer ==
== Answer ==
You need the following tables.
{| {{prettytable}}
|+ '''Hypotheses
! Row || Column || Result || Description
|----
| All || All || 1,2,3,4,5,6,7,8,9 || For all row and column locations it applies that the plausible hypotheses are a single integer between 1 and 9 (unless more information is available).
|}
{| {{prettytable}}
|+ '''Area descriptions
! Row || Column || Area
|----
|| 1|| 1|| A
|----
|| 1|| 2|| A
|----
|| 1|| 3|| A
|----
|| 1|| 4|| B
|----
|| …|| ||
|----
|| 2|| 1|| A
|----
|| …|| ||
|----
|| 4|| 1|| D
|----
|| …|| ||
|----
|| 9|| 9|| I
|}
{| {{prettytable}}
|+ '''Rules of exclusion when comparing two cells.
! Property1 || Condition1|| Property2|| Condition2|| Rule|| Description
|----
| Row
| Same
| Column
| Different
| Same integer not allowed
| Two cells with the same row and different column are not allowed to have the same integer.
|----
| Row
| Different
| Column
| Same
| Same integer not allowed
| Two cells with a different row and the same column are not allowed to have the same integer.
|----
| Area
| Same
| Column
| Different
| Same integer not allowed
| Two cells with the same area and different column are not allowed to have the same integer.
|----
| Area
| Same
| Row
| Different
| Same integer not allowed
| Two cells with the same area and different row are not allowed to have the same integer.
|----
|}
{| {{prettytable}}
|+ '''The sudoku data (this example is "the most difficult sudoku in the world")
! rowspan="2"| Row
!colspan="9"| Column
|----
! 1|| 2|| 3|| 4|| 5|| 6|| 7|| 8|| 9
|----
|| '''1'''|| 8|| || || || || || || ||
|----
|| '''2'''|| || || 3|| 6|| || || || ||
|----
|| '''3'''|| || 7|| || || 9|| || 2|| ||
|----
|| '''4'''|| || 5|| || || || 7|| || ||
|----
|| '''5'''|| || || || || 4|| 5|| 7|| ||
|----
|| '''6'''|| || || || 1|| || || || 3||
|----
|| '''7'''|| || || 1|| || || || || 6|| 8
|----
|| '''8'''|| || || 8|| 5|| || || || 1||
|----
|| '''9'''|| || 9|| || || || || 4|| ||
|----
|}


===Procedure===
===Procedure===
Line 123: Line 27:
# Go through every row, column, and area and find hypotheses where there is exactly one cell where it is plausible.
# Go through every row, column, and area and find hypotheses where there is exactly one cell where it is plausible.
## Remove all other hypotheses from these cells.
## Remove all other hypotheses from these cells.
# Do steps 3 and 4 for five times, assuming that all hypotheses have been eliminated by then, if it is possible to eliminate them using these rules.
# Do steps 3 and 4 until the sudoku does not improve (i.e., further hypotheses are not falsified).
# Take a user-defined list of cells for which a random sample from plausible hypotheses is taken. It would be more elegant to do this stepwise, but for simplicity it is done all at once. The choice of cells is therefore critical, and a wise user will not select cells whose uncertainties are clearly interdependent.
# Take a user-defined list of cells for which a random sample from plausible hypotheses is taken. It would be more elegant to do this stepwise, but for simplicity it is done all at once. The choice of cells is therefore critical, and a wise user will not select cells whose uncertainties are clearly interdependent.
# Solve all sudokus that are created in the sampling.
# Solve all sudokus that are created in the sampling.
Line 132: Line 36:
== Rationale ==
== Rationale ==


 
{{defend|# |Development needs to improve the solver:
# Change the sudoku input string so that a cell can hold possibilities other than a single number or any number. For example, the syntax [3578] would mean that any of those four numbers could be in that position. This makes interpreting the sudoku string somewhat harder, because its length is no longer fixed at 81. However, it makes it possible to restrict possibilities iteratively.
# When guessing, a new rule should be used: if a particular hypothesis is falsified in ALL scenarios, it can be falsified always. This rule can be extended in a way that when no unique solution is found, the solver would make guesses automatically and gradually falsify all wrong hypotheses.|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 09:36, 31 January 2016 (UTC)}}


=== Data ===
=== Data ===


This table will be expanded by fillna to be a 9*9*9 array (formatted as data.frame). As default, each hypothesis is assumed to be true unless shown otherwise.
This table will be expanded by fillna to be a 9*9*9 array (formatted as data.frame). As default, each hypothesis is assumed to be true unless shown otherwise.
; List of all possible hypotheses, which are ''a priori'' assumed to be true.


<t2b name="Hypotheses" index="SudRow,SudCol,Hypothesis" obs="Result" unit="Boolean">
<t2b name="Hypotheses" index="SudRow,SudCol,Hypothesis" obs="Result" unit="Boolean">
Line 168: Line 76:
</t2b>
</t2b>


;Sudoku data
; Rules of inference: the table is for illustration only, because the rules are too complex to be implemented directly from table entries. Rules 1-4 come directly from the rules of the sudoku game. All other rules are logically derived from them.
 
<t2b name="Rules" index="Rule name" obs="Rule" desc="Description" unit="-">
rule1|If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same row as A.|
rule2|If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same column as A.|
rule3|If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same area as A.|
rule4|If cells A and B are actually the same cell, no inferences are made.|This is a stronger rule than others.
rule5|If a hypothesis in cell B is TRUE at least once given all hypotheses in cell A, the hypothesis in B is considered TRUE.|This rule loses a lot of information about dependencies between cells, but it saves huge amounts of memory. Even so, the set of rules effectively narrows down the potential hypothesis space.
rule6|Rules 1-5 apply to all pairs of cells A and B.|
rule7|If a hypothesis in cell B is FALSE after applying rule 5 for even one cell A, it is FALSE always.|
rule8|If a hypothesis is TRUE in exactly one cell in a particular row, all other hypotheses in that cell are FALSE.|
rule9|If a hypothesis is TRUE in exactly one cell in a particular column, all other hypotheses in that cell are FALSE.|
rule10|If a hypothesis is TRUE in exactly one cell in a particular area, all other hypotheses in that cell are FALSE.|
rule11|If two cells on the same row contain exactly two TRUE hypotheses that are the same in both cells, these hypotheses cannot be TRUE in any other cell on that row.|This rule is analogous to rule1 but with two cells. However, this rule (and the respective rules for columns and areas) is not implemented in the code.
</t2b>
 
;Sudoku data (this is "the most difficult sudoku in the world").


<t2b name="Sudoku" index="SudRow,SudCol" locations="1,2,3,4,5,6,7,8,9" unit="-">
<t2b name="Sudoku" index="SudRow,SudCol" locations="1,2,3,4,5,6,7,8,9" unit="-">
Line 184: Line 108:
=== Formula ===
=== Formula ===


<rcode variables="name:userdata|description:Enter sudoku as a string, with spaces in empty cells|type:text">
<rcode variables="
name:userdata|description:Enter sudoku as a string, with spaces in empty cells|type:text|
name:cellguess|description:If you want to guess, enter the (list of) cell(s) here|
name:verbose|description:Do you want to see intermediate results?|type:selection|options:FALSE;No;TRUE;Yes|default:FALSE
">


library(OpasnetUtils)
library(OpasnetUtils)
Line 265: Line 193:
"
"


if(!is.null(userdata)) data <- userdata


hypotheses <- data.frame(
hypotheses <- data.frame(
Line 313: Line 242:
repeat{
repeat{
if(verbose) {print(SudShow(sudoku))}
if(verbose) {print(SudShow(sudoku))}
samerow <- matrix(sudoku$SudRow, nrow = nrow(sudoku), ncol = nrow(sudoku))
samerow <- matrix(sudoku$SudRow, nrow = nrow(sudoku), ncol = nrow(sudoku)) # rule6
samerow <- samerow == t(samerow)
samerow <- samerow == t(samerow)


Line 336: Line 265:
temp <- ifelse(samecell, NA, temp) # This rule must override all others (rule4).
temp <- ifelse(samecell, NA, temp) # This rule must override all others (rule4).


temp2 <- apply(temp, 2, function(x) tapply(x, sudoku$SudCell, any))
temp2 <- apply(temp, 2, function(x) tapply(x, sudoku$SudCell, any)) # rule5


temp3 <- apply(temp2, 2, function(x) {
temp3 <- apply(temp2, 2, function(x) { # rule7
x <- ifelse(is.na(x), TRUE, x)
x <- ifelse(is.na(x), TRUE, x)
return(all(x))
return(all(x))
Line 351: Line 280:
}
}
sudtemp <- ykköskarsinta(sudtemp, "SudCol")
sudtemp <- ykköskarsinta(sudtemp, "SudCol") # rule9
sudtemp <- ykköskarsinta(sudtemp, "SudRow")
sudtemp <- ykköskarsinta(sudtemp, "SudRow") # rule8
sudtemp <- ykköskarsinta(sudtemp, "SudArea")
sudtemp <- ykköskarsinta(sudtemp, "SudArea") # rule10
iteraatio <- iteraatio + 1
iteraatio <- iteraatio + 1
test <- all(sudtemp$Result == sudoku$Result)
test <- all(sudtemp$Result == sudoku$Result)
cat("Is the sudoku the same at iteration", iteraatio, "?", test, "\n")
if(verbose) cat("Has the solution changed during iteration", iteraatio, "?", !test, "\n")
sudoku <- sudtemp
sudoku <- sudtemp
if(test | iteraatio > 15) {break}
if(test | iteraatio > 15) {break}
}
}
return(sudoku)
return(sudoku)
Line 377: Line 306:
if(inherits(sudoku, "data.frame")) {sudokulist <- list(sudoku)} else {sudokulist <- sudoku}
if(inherits(sudoku, "data.frame")) {sudokulist <- list(sudoku)} else {sudokulist <- sudoku}
for(i in cellguess) { # Each additional cell separately
for(i in cellguess) { # Each additional cell separately
if(verbose) cat("Guessing at cell", i, "\n")
lap <- 0
lap <- 0
templist <- list()
templist <- list()
Line 398: Line 328:
if(length(sudokulist) == 1) {sudokulist <- sudokulist[[1]]}
if(length(sudokulist) == 1) {sudokulist <- sudokulist[[1]]}
return(sudokulist)
return(sudokulist)
}
example <- SudMake(hypotheses, data)
SudShow(example)
example <- SudSolve(example, verbose = verbose)
SudShow(example)
if(!is.null(cellguess)) {
example <- SudGuess(example, cellguess, verbose = verbose)
SudShow(example)
}
}



Latest revision as of 09:36, 31 January 2016



Question

How to describe a sudoku and the sudoku rules in Opasnet so that it can be solved automatically?

Answer

Procedure

The following terms are used:

  • The possible space of solutions is described by a logical vector A, which contains all possible values given the current information. A is indexed by h, i, j, k, and l. A changes over time as more information becomes available or is processed. In the beginning, all hypotheses are considered potentially TRUE.
  • h = {1, 2, ..., 9}, i = {1, 2, ..., 9}, j = {1, 2, ..., 9}, k = {1, 2, ..., 9}, l = {1, 2, ..., 81} are indices for hypothesis, row, column, area, and cell, respectively. Note that l is just another way to say (i,j), because l = i + (j-1)*9. Also, k is known if (i,j) is known, because k = ceiling(i/3) + (ceiling(j/3)-1)*3. However, k does not contain all the information that (i,j) or l contains.
  • a_h and b_h are sub-vectors of A such that a_h = A_{h,m} and b_h = A_{h,n}, where m and n (m < n) are values of index l.
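
The index relations above can be checked with a few lines of R. This is only an illustrative sketch: the helper names cell_index and area_index are hypothetical and do not appear in the page's code.

```r
# Hypothetical helpers: derive the cell index l and the area index k
# from row i and column j, using the formulas given in the text.
cell_index <- function(i, j) i + (j - 1) * 9
area_index <- function(i, j) ceiling(i / 3) + (ceiling(j / 3) - 1) * 3

# Enumerate all 81 (row, column) pairs and attach both derived indices.
grid <- expand.grid(i = 1:9, j = 1:9)
grid$l <- cell_index(grid$i, grid$j)
grid$k <- area_index(grid$i, grid$j)

# Every area should contain nine cells, and l should run over 1..81 once.
table(grid$k)
range(grid$l)
```

Because k is computed from (i,j) but nine cells share each k, the area index alone cannot identify a cell, which is the point made in the second bullet.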


  1. Expand the missing index values of the Hypotheses table to create the full A.
  2. Take the sudoku data table and replace hypotheses with data, if available.
  3. Compare all two-cell pairs at once using matrices.
    1. Create matrices of all critical properties: SudRow, SudCol, SudArea, and Hypothesis. These are compared pairwise, two cells at a time.
    2. Use rules to deduce if a pair is incompatible or not.
    3. Aggregate plausible hypotheses in the second cell across all plausible values in the first cell. This gives a set of hypotheses that are plausible in at least some conditions.
    4. Aggregate along the first cell: each second cell must be compatible with all other (first) cells.
    5. Do not apply these rules if the first and second cells are the same.
  4. Go through every row, column, and area and find hypotheses where there is exactly one cell where it is plausible.
    1. Remove all other hypotheses from these cells.
  5. Do steps 3 and 4 until the sudoku does not improve (i.e., further hypotheses are not falsified).
  6. Take a user-defined list of cells for which a random sample from plausible hypotheses is taken. It would be more elegant to do this stepwise, but for simplicity it is done all at once. The choice of cells is therefore critical, and a wise user will not select cells whose uncertainties are clearly interdependent.
  7. Solve all sudokus that are created in the sampling.
  8. If a sudoku results in cells with zero plausible hypotheses, remove that iteration.
  9. Calculate the number of different solutions still plausible and print it.
  10. If the number is smaller than 100, print also the solutions.
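
Steps 1 to 5 can be sketched compactly in base R. The sketch below is not the page's implementation: it assumes a 9 x 9 x 9 logical array A[h, i, j] instead of the data.frame and pairwise matrices used in the actual code, and the function names block, propagate, and to_grid are hypothetical. Step 3 is simplified to the observation that a cell with a single plausible hypothesis falsifies that hypothesis in all cells sharing its row, column, or area.

```r
# Rows (or columns) of the 3x3 area that contains index x.
block <- function(x) 3 * ((x - 1) %/% 3) + 1:3

propagate <- function(grid) {
  A <- array(TRUE, dim = c(9, 9, 9))              # step 1: all hypotheses plausible
  for (i in 1:9) for (j in 1:9) {
    if (!is.na(grid[i, j])) {                     # step 2: replace hypotheses with data
      A[, i, j] <- FALSE
      A[grid[i, j], i, j] <- TRUE
    }
  }
  set_cell <- function(A, h, i, j) { A[, i, j] <- FALSE; A[h, i, j] <- TRUE; A }
  repeat {
    old <- A
    # Step 3 (simplified): a solved cell falsifies its value in all peer cells.
    for (i in 1:9) for (j in 1:9) {
      h <- which(A[, i, j])
      if (length(h) == 1) {
        A[h, i, ] <- FALSE
        A[h, , j] <- FALSE
        A[h, block(i), block(j)] <- FALSE
        A[h, i, j] <- TRUE
      }
    }
    # Step 4: a hypothesis plausible in exactly one cell of a row, column,
    # or area removes all other hypotheses from that cell.
    for (h in 1:9) for (u in 1:9) {
      cs <- which(A[h, u, ])
      if (length(cs) == 1) A <- set_cell(A, h, u, cs)
      rs <- which(A[h, , u])
      if (length(rs) == 1) A <- set_cell(A, h, rs, u)
      rows <- block(3 * ((u - 1) %% 3) + 1)
      cols <- block(3 * ((u - 1) %/% 3) + 1)
      hit <- which(A[h, rows, cols], arr.ind = TRUE)
      if (nrow(hit) == 1) A <- set_cell(A, h, rows[hit[1, 1]], cols[hit[1, 2]])
    }
    if (identical(old, A)) break                  # step 5: stop when nothing changes
  }
  A
}

# Collapse the array to a 9x9 grid; unresolved cells become NA.
to_grid <- function(A) {
  apply(A, c(2, 3), function(x) if (sum(x) == 1) which(x) else NA_integer_)
}

# Usage on an easy, artificially generated puzzle with three blanked cells.
solution <- outer(1:9, 1:9, function(i, j) {
  (3 * ((i - 1) %% 3) + (i - 1) %/% 3 + (j - 1)) %% 9 + 1
})
puzzle <- solution
puzzle[cbind(c(1, 5, 9), c(1, 5, 9))] <- NA
solved <- to_grid(propagate(puzzle))
all(solved == solution)
```

On harder puzzles, some cells are likely to stay NA after propagation; those are the cells for which the guessing in steps 6 to 10 is intended.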

Rationale

←--#: Development needs to improve the solver:

  1. Change the sudoku input string so that a cell can hold possibilities other than a single number or any number. For example, the syntax [3578] would mean that any of those four numbers could be in that position. This makes interpreting the sudoku string somewhat harder, because its length is no longer fixed at 81. However, it makes it possible to restrict possibilities iteratively.
  2. When guessing, a new rule should be used: if a particular hypothesis is falsified in ALL scenarios, it can be falsified always. This rule can be extended so that when no unique solution is found, the solver makes guesses automatically and gradually falsifies all wrong hypotheses. --Jouni (talk) 09:36, 31 January 2016 (UTC) (type: truth; paradigms: science: defence)
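
Both ideas can be sketched in a few lines of R. Neither is implemented in the page's code; parse_sudoku and combine_scenarios are hypothetical names, and the tokenising rule (one digit, one space, or one bracketed group per cell) is an assumption about how the proposed syntax would be read.

```r
# Idea 1: read an input string in which each cell is a digit (fixed value),
# a space (anything goes), or a bracketed set such as [3578].
parse_sudoku <- function(s) {
  tokens <- regmatches(s, gregexpr("\\[[1-9]+\\]|[1-9 ]", s))[[1]]
  stopifnot(length(tokens) == 81)        # one token per cell, whatever the string length
  lapply(tokens, function(tok) {
    if (tok == " ") return(1:9)
    as.integer(strsplit(gsub("\\[|\\]", "", tok), "")[[1]])
  })
}

# Idea 2: a hypothesis stays plausible only if it is plausible in at least one
# scenario, so one that is falsified in ALL scenarios is falsified.
# Each scenario is a logical vector or array of the same shape.
combine_scenarios <- function(scenarios) Reduce(`|`, scenarios)

# Usage: first cell fixed to 8, second restricted to 3 or 5, the rest open.
cells <- parse_sudoku(paste0("8", "[35]", strrep(" ", 79)))
cells[1:3]
```

Returning a list of candidate vectors sidesteps the variable string length: after parsing there are always exactly 81 entries.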

Data

This table will be expanded by fillna to be a 9*9*9 array (formatted as data.frame). As default, each hypothesis is assumed to be true unless shown otherwise.
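
As a rough illustration of the expanded table, the following base-R sketch builds the full cross of the three indices directly. It does not call fillna from OpasnetUtils and only assumes that the result has one row per (SudRow, SudCol, Hypothesis) combination.

```r
# One row for every combination of row, column, and hypothesis: 9 * 9 * 9 rows.
hypotheses <- expand.grid(SudRow = 1:9, SudCol = 1:9, Hypothesis = 1:9)
hypotheses$Result <- TRUE   # every hypothesis is assumed true by default

nrow(hypotheses)
head(hypotheses)
```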

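List of all possible hypotheses, which are a priori assumed to be true.
Hypotheses (Boolean)
Obs | SudRow | SudCol | Hypothesis | Result
1 | | | 1 | TRUE
2 | | | 2 | TRUE
3 | | | 3 | TRUE
4 | | | 4 | TRUE
5 | | | 5 | TRUE
6 | | | 6 | TRUE
7 | | | 7 | TRUE
8 | | | 8 | TRUE
9 | | | 9 | TRUE
10 | | | 1 | TRUE
11 | | | 2 | TRUE
12 | | | 3 | TRUE
13 | | | 4 | TRUE
14 | | | 5 | TRUE
15 | | | 6 | TRUE
16 | | | 7 | TRUE
17 | | | 8 | TRUE
18 | | | 9 | TRUE
19 | | | 1 | TRUE
20 | | | 2 | TRUE
21 | | | 3 | TRUE
22 | | | 4 | TRUE
23 | | | 5 | TRUE
24 | | | 6 | TRUE
25 | | | 7 | TRUE
26 | | | 8 | TRUE
27 | | | 9 | TRUE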
Rules of inference: the table is for illustration only, because the rules are too complex to be implemented directly from table entries. Rules 1-4 come directly from the rules of the sudoku game. All other rules are logically derived from them.
Rules (-)
Obs | Rule name | Rule | Description
1 | rule1 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same row as A. |
2 | rule2 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same column as A. |
3 | rule3 | If a hypothesis is TRUE in cell A, that hypothesis must be FALSE in cell B, if B is on the same area as A. |
4 | rule4 | If cells A and B are actually the same cell, no inferences are made. | This is a stronger rule than others.
5 | rule5 | If a hypothesis in cell B is TRUE at least once given all hypotheses in cell A, the hypothesis in B is considered TRUE. | This rule loses a lot of information about dependencies between cells, but it saves huge amounts of memory. Even so, the set of rules effectively narrows down the potential hypothesis space.
6 | rule6 | Rules 1-5 apply to all pairs of cells A and B. |
7 | rule7 | If a hypothesis in cell B is FALSE after applying rule 5 for even one cell A, it is FALSE always. |
8 | rule8 | If a hypothesis is TRUE in exactly one cell in a particular row, all other hypotheses in that cell are FALSE. |
9 | rule9 | If a hypothesis is TRUE in exactly one cell in a particular column, all other hypotheses in that cell are FALSE. |
10 | rule10 | If a hypothesis is TRUE in exactly one cell in a particular area, all other hypotheses in that cell are FALSE. |
11 | rule11 | If two cells on the same row contain exactly two TRUE hypotheses that are the same in both cells, these hypotheses cannot be TRUE in any other cell on that row. | This rule is analogous to rule1 but with two cells. However, this rule (and the respective rules for columns and areas) is not implemented in the code.
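
Since rule11 is described but not implemented, here is a minimal sketch of how it could look for a single row. It assumes the candidates of one row are held in a 9 x 9 logical matrix cand[h, j] (hypothesis h plausible in the row's cell j); that layout and the name naked_pairs_row are hypothetical and differ from the data.frame used in the actual code.

```r
# rule11 for one row: if two cells share exactly the same two plausible
# hypotheses, those hypotheses are falsified in every other cell of the row.
naked_pairs_row <- function(cand) {
  for (a in 1:8) for (b in (a + 1):9) {
    if (sum(cand[, a]) == 2 && identical(cand[, a], cand[, b])) {
      pair <- which(cand[, a])              # the two shared hypotheses
      others <- setdiff(1:9, c(a, b))       # all remaining cells on the row
      cand[pair, others] <- FALSE
    }
  }
  cand
}

# Usage: cells 1 and 2 can only hold 2 or 7; all other cells are fully open.
cand <- matrix(TRUE, nrow = 9, ncol = 9)
cand[, 1] <- FALSE
cand[c(2, 7), 1] <- TRUE
cand[, 2] <- cand[, 1]
out <- naked_pairs_row(cand)
out[c(2, 7), ]
```

The same function would apply to a column or an area if its nine cells were arranged as the columns of cand.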
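Sudoku data (this is "the most difficult sudoku in the world").
Sudoku (-)
Obs | SudRow | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
1 | 1 | 8 | | | | | | | |
2 | 2 | | | 3 | 6 | | | | |
3 | 3 | | 7 | | | 9 | | 2 | |
4 | 4 | | 5 | | | | 7 | | |
5 | 5 | | | | | 4 | 5 | 7 | |
6 | 6 | | | | 1 | | | | 3 |
7 | 7 | | | 1 | | | | | 6 | 8
8 | 8 | | | 8 | 5 | | | | 1 |
9 | 9 | | 9 | | | | | 4 | |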

Formula

Enter sudoku as a string, with spaces in empty cells:

If you want to guess, enter the (list of) cell(s) here:

Do you want to see intermediate results?:
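
The first field expects the sudoku as a single string with spaces in empty cells. A sketch of how such a string could be turned into a 9 x 9 grid is shown below; it assumes the string lists the cells row by row, which the page does not state, and string_to_grid is a hypothetical name rather than a function in the page's code.

```r
# Convert an 81-character string (digits, spaces for empty cells) to a matrix.
# Assumption: cells are listed row by row.
string_to_grid <- function(s) {
  chars <- strsplit(s, "")[[1]]
  stopifnot(length(chars) == 81)
  digits <- suppressWarnings(as.integer(chars))   # spaces become NA
  matrix(digits, nrow = 9, ncol = 9, byrow = TRUE)
}

# The sudoku from the Data section, one string per row, joined into one input.
rows <- c(
  "8        ",
  "  36     ",
  " 7  9 2  ",
  " 5   7   ",
  "    457  ",
  "   1   3 ",
  "  1    68",
  "  85   1 ",
  " 9    4  "
)
g <- string_to_grid(paste0(rows, collapse = ""))
g
```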

