Sudoku solver: Difference between revisions

Revision as of 17:04, 5 May 2013

This page is a knowledge crystal of subtype method. The page identifier is Op_en5817
Moderator: Jouni

Question

How can a sudoku and the sudoku rules be described in Opasnet so that the sudoku can be solved automatically?

Answer

You need the following tables.

**Hypotheses**
Row	Column	Result	Description
All	All	1,2,3,4,5,6,7,8,9	For every row and column location, the plausible hypotheses are the integers from 1 to 9, exactly one of which is correct (unless more information is available).

**Area descriptions**
Row	Column	Area
1	1	A
1	2	A
1	3	A
1	4	B
…
2	1	A
…
4	1	D
…
9	9	I
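
The area table does not have to be typed by hand. A minimal sketch in R, assuming the areas are lettered A to I from left to right and then top to bottom, as the rows shown above suggest (the name `areas` is introduced here for illustration only):

```r
# Build the 81-row area table; letters run left to right, then top to bottom
areas <- expand.grid(Column = 1:9, Row = 1:9)[, c("Row", "Column")]
areas$Area <- LETTERS[ceiling(areas$Column / 3) + (ceiling(areas$Row / 3) - 1) * 3]
head(areas, 4)
```

Note that the Formula code on this page numbers the areas with rows and columns swapped (SudArea = ceiling(SudRow / 3) + (ceiling(SudCol / 3) - 1) * 3). The division into nine 3 x 3 blocks is the same; only the labels differ.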

**Rules of exclusion when comparing two cells.**
Property1	Condition1	Property2	Condition2	Rule	Description
Row	Same	Column	Different	Same integer not allowed	Two cells with the same row and different column are not allowed to have the same integer.
Row	Different	Column	Same	Same integer not allowed	Two cells with a different row and the same column are not allowed to have the same integer.
Area	Same	Column	Different	Same integer not allowed	Two cells with the same area and different column are not allowed to have the same integer.
Area	Same	Row	Different	Same integer not allowed	Two cells with the same area and different row are not allowed to have the same integer.
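
Taken together, the four rules say that two distinct cells may not hold the same integer if they share a row, a column, or an area. A minimal sketch of that test in R; the function name `same_integer_forbidden` and its helper `area` are hypothetical and not part of the page's code:

```r
# TRUE when two distinct cells may not hold the same integer
same_integer_forbidden <- function(row1, col1, row2, col2) {
  area <- function(row, col) ceiling(col / 3) + (ceiling(row / 3) - 1) * 3
  same_cell <- row1 == row2 & col1 == col2
  shares_unit <- row1 == row2 | col1 == col2 |
    area(row1, col1) == area(row2, col2)
  shares_unit & !same_cell
}

same_integer_forbidden(1, 1, 1, 9)  # the two cells share a row
same_integer_forbidden(1, 1, 2, 2)  # the two cells share an area
same_integer_forbidden(1, 1, 4, 5)  # the two cells share nothing
```

Because the function uses only vectorised operators, it also accepts vectors of rows and columns, which is the property the matrix comparison in the Formula code relies on.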

**The sudoku data (this example is "the most difficult sudoku in the world")**
Row	Column
Row	1	2	3	4	5	6	7	8	9
1	8
2			3	6
3		7			9		2
4		5				7
5					4	5	7
6				1				3
7			1					6	8
8			8	5				1
9		9					4

Procedure

The following terms are used:

The space of possible solutions is described by a logical vector A, which marks every hypothesis that is still plausible given the current information. A is indexed by h, i, j, k, and l. A changes over time as more information arrives or is processed. In the beginning, all hypotheses are considered potentially TRUE.
h = {1, 2, ..., 9}, i = {1, 2, ..., 9}, j = {1, 2, ..., 9}, k = {1, 2, ..., 9}, l = {1, 2, ..., 81} are indices for hypothesis, row, column, area, and cell, respectively. Note that l is just another way to say (i,j), as l = i + (j-1)*9. Also, k is known if (i,j) is known, as k = ceiling(i/3) + (ceiling(j/3)-1)*3. However, k does not contain all the information that (i,j) or l contains.
a_h and b_h are sub-vectors of A in such a way that a_h = A_h,m and b_h = A_h,n, where m and n (m < n) are values from index l.
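
The two index formulas can be checked directly. A small sketch in R; the names `cell_index`, `area_index`, and `grid` are introduced here for illustration:

```r
# Cell index l and area index k from row i and column j
cell_index <- function(i, j) i + (j - 1) * 9
area_index <- function(i, j) ceiling(i / 3) + (ceiling(j / 3) - 1) * 3

grid <- expand.grid(i = 1:9, j = 1:9)
grid$l <- cell_index(grid$i, grid$j)
grid$k <- area_index(grid$i, grid$j)

# l can be turned back into (i, j), so it carries the same information
grid$i_back <- (grid$l - 1) %% 9 + 1
grid$j_back <- (grid$l - 1) %/% 9 + 1

# Each k is shared by nine cells, so k alone does not identify a cell
length(unique(grid$l))
table(grid$k)
```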

1. Expand the missing index values of the Hypotheses table to create the full A.
2. Take the sudoku data table and replace hypotheses with data, if available.
3. Compare all two-cell pairs at once using matrices.
   1. Create matrices of all critical properties: SudRow, SudCol, SudArea, and Hypothesis. These are compared pairwise, two cells at a time.
   2. Use the rules to deduce whether a pair is incompatible.
   3. Aggregate plausible hypotheses in the second cell across all plausible values in the first cell. This gives the set of hypotheses that are plausible under at least some conditions.
   4. Aggregate along the first cell: each second cell must be compatible with all other (first) cells.
   5. Do not apply these rules if the first and second cells are the same.
4. Go through every row, column, and area and find the hypotheses that are plausible in exactly one cell.
   1. Remove all other hypotheses from these cells.
5. Repeat steps 3 and 4 five times, on the assumption that by then every hypothesis that these rules can eliminate has been eliminated. (The code below instead repeats until the hypotheses stop changing, for at most 15 rounds.)
6. Take a user-defined list of cells for which a random sample from the plausible hypotheses is taken. It would be elegant to do this stepwise, but for simplicity, let's do it all at once. The choice of cells is therefore critical, and a wise user will not select cells whose uncertainties are clearly interdependent.
7. Solve all sudokus that are created in the sampling.
8. If a sudoku results in cells with zero plausible hypotheses, remove that iteration.
9. Calculate the number of different solutions still plausible and print it.
10. If the number is smaller than 100, also print the solutions.
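
The two deductions described above (pairwise exclusion and the exactly-one-cell rule) can be sketched compactly on a 9 x 9 x 9 logical array. This is an illustrative alternative under stated assumptions, not the page's implementation: the Formula code works on a data.frame and pairwise matrices instead, and the names `make_A`, `propagate`, and `givens` are hypothetical. For cells that share a row, column, or area, the pairwise rule reduces to "a cell with a single plausible integer excludes that integer from its peers", which is what the first loop does.

```r
# A[i, j, h] is TRUE while integer h is still plausible in row i, column j
make_A <- function(givens) {            # givens: 9 x 9 matrix, 0 = empty cell
  A <- array(TRUE, dim = c(9, 9, 9))
  for (i in 1:9) for (j in 1:9) {
    if (givens[i, j] > 0) A[i, j, ] <- seq_len(9) == givens[i, j]
  }
  A
}

propagate <- function(A, max_rounds = 15) {
  area <- outer(1:9, 1:9,
                function(i, j) ceiling(i / 3) + (ceiling(j / 3) - 1) * 3)
  units <- list(row(area), col(area), area)
  for (round in seq_len(max_rounds)) {
    before <- A
    # Pairwise rules: a cell with one plausible integer excludes it from peers
    for (i in 1:9) for (j in 1:9) {
      if (sum(A[i, j, ]) == 1) {
        h <- which(A[i, j, ])
        peers <- row(area) == i | col(area) == j | area == area[i, j]
        peers[i, j] <- FALSE            # never compare a cell with itself
        slice <- A[, , h]
        slice[peers] <- FALSE
        A[, , h] <- slice
      }
    }
    # Unique-cell rule: an integer plausible in one cell of a unit is fixed there
    for (u in units) for (g in 1:9) for (h in 1:9) {
      cells <- which(u == g & A[, , h])
      if (length(cells) == 1) {
        ij <- arrayInd(cells, c(9, 9))
        A[ij[1], ij[2], ] <- seq_len(9) == h
      }
    }
    if (identical(A, before)) break     # nothing changed, so stop early
  }
  A
}

givens <- matrix(0, 9, 9)
givens[1, 1:8] <- 1:8                   # toy input: row 1 filled except column 9
A <- propagate(make_A(givens))
which(A[1, 9, ])                        # integers still plausible in row 1, column 9
```

A cell whose slice `A[i, j, ]` is all FALSE signals an implausible sudoku, which corresponds to the iterations that step 8 removes.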

Rationale

Data

This table is expanded by fillna into a 9*9*9 array (stored as a data.frame). By default, each hypothesis is assumed to be true unless shown otherwise.

Hypotheses(Boolean)
Obs	SudRow	SudCol	Hypothesis	Result
1			1	TRUE
2			2	TRUE
3			3	TRUE
4			4	TRUE
5			5	TRUE
6			6	TRUE
7			7	TRUE
8			8	TRUE
9			9	TRUE
10		1		TRUE
11		2		TRUE
12		3		TRUE
13		4		TRUE
14		5		TRUE
15		6		TRUE
16		7		TRUE
17		8		TRUE
18		9		TRUE
19	1			TRUE
20	2			TRUE
21	3			TRUE
22	4			TRUE
23	5			TRUE
24	6			TRUE
25	7			TRUE
26	8			TRUE
27	9			TRUE
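
For readers without OpasnetUtils, the expanded table can be approximated in base R. A minimal sketch, assuming that the expansion yields every combination of SudRow, SudCol, and Hypothesis with Result TRUE; the name `hypotheses_full` is introduced here for illustration:

```r
# Every combination of row, column, and hypothesis, all TRUE to begin with
hypotheses_full <- expand.grid(SudRow = 1:9, SudCol = 1:9, Hypothesis = 1:9)
hypotheses_full$Result <- TRUE
dim(hypotheses_full)
```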

Sudoku data

Sudoku(-)
Obs	SudRow	1	2	3	4	5	6	7	8	9
1	1	8
2	2			3	6
3	3		7			9		2
4	4		5				7
5	5					4	5	7
6	6				1				3
7	7			1					6	8
8	8			8	5				1
9	9		9					4
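
The same puzzle can also be written as an 81-character string, read row by row with spaces for empty cells, which is the form the Formula code uses. A sketch of turning such a string into a table of given cells; the names `puzzle`, `cells`, and `givens` are hypothetical:

```r
# The puzzle from the table above, one 9-character piece per row
puzzle <- paste0(
  "8        ", "  36     ", " 7  9 2  ",
  " 5   7   ", "    457  ", "   1   3 ",
  "  1    68", "  85   1 ", " 9    4  "
)

# One row per cell; 0 marks an empty cell
cells <- data.frame(
  SudRow = rep(1:9, each = 9),
  SudCol = rep(1:9, times = 9),
  Given  = as.integer(strsplit(chartr(" ", "0", puzzle), "")[[1]])
)
givens <- cells[cells$Given > 0, ]
nrow(givens)  # number of cells filled in advance
```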

Formula



library(OpasnetUtils)

hypotheses <- tidy(opbase.data("Op_en5817.hypotheses"))
for(i in 1:3) {
	hypotheses[[i]] <- ifelse(hypotheses[[i]] == "", NA, as.integer(as.character(hypotheses[[i]])))
}

hypotheses <- unique(fillna(hypotheses, marginals = c(1, 2, 3)))

# Get the sudoku data

# data <- tidy(opbase.data("Op_en5817.sudoku"), objname="data")
# data$SudRow <- as.numeric(data$SudRow)
# data$SudCol <- as.numeric(data$SudCol)

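# The world's most difficult sudoku. Each `data` assignment below overwrites the previous one, so only the last puzzle is used.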

data <- "
8        
  36     
 7  9 2  
 5   7   
    457  
   1   3 
  1    68
  85   1 
 9    4  
"

data <- "
  4 2   9
58 6   7 
   5  8  
 751 8  4
    4    
8  2 765 
  7  2   
 2   1 68
1   3 9  
"

data <- "
 3  8 2  
   5   8 
6 71     
 4      2
 124 967 
9      4 
     35 4
 2   6   
  3 2  6 
"

data <- "
  45   81
  2 6 5 4
 9   1   
4  9  8  
 89 2 45 
  5  8  9
   6   2 
1 7 9 3  
35   41  
"

# Four stars

data <- "
   23 9  
     5 6 
9  76  5 
  6   38 
4  523  7
 39   4  
 2  94  8
 4 3     
  5 72   
"


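# Build the full 9 x 9 x 9 hypothesis table directly (this replaces the table loaded from Opasnet Base above)
hypotheses <- data.frame(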
	SudRow = rep(rep(1:9, each = 9), times = 9),
	SudCol = rep(rep(1:9, each = 9), each = 9),
	Hypothesis = rep(rep(1:9, times = 9), times = 9),
	Result = TRUE
)

SudMake <- function(hypotheses, data) {

	hypotheses$SudRow <- as.numeric(hypotheses$SudRow)
	hypotheses$SudCol <- as.numeric(hypotheses$SudCol)
	hypotheses$SudCell = hypotheses$SudRow + (hypotheses$SudCol - 1) * 9
	hypotheses$SudArea = ceiling(hypotheses$SudRow / 3) + (ceiling(hypotheses$SudCol / 3) - 1) * 3
	hypotheses$Result <- as.logical(hypotheses$Result)

	data <- data.frame(
		SudRow = rep(1:9, each = 9), 
		SudCol = rep(1:9, times = 9), 
		dataResult = gsub(" ", "", strsplit(gsub("\n", "", data), split = "")[[1]])
	)

#	oprint(tapply(data["Hypothesis"], data[c("SudRow", "SudCol")], function(x) paste(x, sep = "", collapse = "")))

	out <- merge(hypotheses, data)
	out$dataResult <- as.character(out$dataResult)
	out$Result <- ifelse(out$dataResult == "", out$Result, (as.character(out$dataResult) == as.character(out$Hypothesis)))
	return(out)
}

ykköskarsinta <- function(out2, condition) {

	# For each unit named by condition ("SudRow", "SudCol" or "SudArea"), find the
	# hypotheses that are plausible in exactly one cell and fix them in that cell.
	valinta <- as.data.frame(as.table(tapply(out2$Result, out2[c("Hypothesis", condition)], sum) == 1)) # find the condition-hypothesis combinations
	valinta <- valinta[valinta$Freq, colnames(valinta) != "Freq"] # keep only the unique solutions
	valinta <- merge(valinta, out2) # merge with the SudCell information
	valinta <- valinta[valinta$Result , ] # drop the rows that are not needed
	# print(valinta$SudCell)
	# Assigning into out2$Result directly also works when no row matches.
	out2$Result[out2$SudCell %in% valinta$SudCell] <- FALSE # first reject every hypothesis in the single-solution cells
	# print(paste(out2$Hypothesis, out2$SudCell) %in% paste(valinta$Hypothesis, valinta$SudCell))
	out2$Result[paste(out2$Hypothesis, out2$SudCell) %in% paste(valinta$Hypothesis, valinta$SudCell)] <- TRUE # then restore the correct answer in those cells
	return(out2)
}

SudSolve <- function(sudoku, verbose = FALSE) {

	iteraatio <- 1
	repeat{
		if(verbose) {SudShow(sudoku)} # SudShow prints by itself; wrapping it in print() would also print NULL
		samerow <- matrix(sudoku$SudRow, nrow = nrow(sudoku), ncol = nrow(sudoku))
		samerow <- samerow == t(samerow)

		samecol <- matrix(sudoku$SudCol, nrow = nrow(sudoku), ncol = nrow(sudoku))
		samecol <- samecol == t(samecol)

		samearea <- matrix(sudoku$SudArea, nrow = nrow(sudoku), ncol = nrow(sudoku))
		samearea <- samearea == t(samearea)

		samehypo <- matrix(sudoku$Hypothesis, nrow = nrow(sudoku), ncol = nrow(sudoku))
		samehypo <- samehypo == t(samehypo)

		samecell <- matrix(sudoku$SudCell, nrow = nrow(sudoku), ncol = nrow(sudoku))
		samecell <- samecell == t(samecell)

		rule1 <- ! (samerow & samehypo)
		rule2 <- ! (samecol & samehypo)
		rule3 <- ! (samearea & samehypo)
	#	rule4 <- ifelse(samecell, NA, TRUE)

		temp <- sudoku$Result & rule1 & rule2 & rule3 #& rule4
		temp <- ifelse(samecell, NA, temp) # This rule must override all the others.

		temp2 <- apply(temp, 2, function(x) tapply(x, sudoku$SudCell, any))

		temp3 <- apply(temp2, 2, function(x) {
			x <- ifelse(is.na(x), TRUE, x)
			return(all(x))
		})
		
		sudtemp <- sudoku
		sudtemp$Result <- ifelse(sudtemp$Result, temp3, sudtemp$Result)
		
		if(any(tapply(sudtemp$Result, sudtemp["SudCell"], sum) == 0)) {
			if(verbose) {warning("Implausible solution\n")}
			return(sudtemp)
		}
		
		sudtemp <- ykköskarsinta(sudtemp, "SudCol")
		sudtemp <- ykköskarsinta(sudtemp, "SudRow")
		sudtemp <- ykköskarsinta(sudtemp, "SudArea")
		iteraatio <- iteraatio + 1
		test <- all(sudtemp$Result == sudoku$Result)
		cat("Is the sudoku unchanged at round ", iteraatio, "?", test, "\n")
		sudoku <- sudtemp
		if(test || iteraatio > 15) {break} # stop when nothing changed or after 15 rounds
	}
	return(sudoku)
}

SudShow <- function(sudoku) {

	if(is.data.frame(sudoku)) {sudoku <- list(sudoku)} # accept a single sudoku or a list of sudokus
	for(i in seq_along(sudoku)) {
		out <- sudoku[[i]]
		out <- out[out$Result , ]
		out <- tapply(out$Hypothesis, out[c("SudRow", "SudCol")], function(x) paste(x, sep = "", collapse = ""))
		if(exists("oprint")) oprint(out) else print(out)
	}	
}

SudGuess <- function(sudoku, cellguess, verbose = FALSE) {
	if(is.data.frame(sudoku)) {sudokulist <- list(sudoku)} else {sudokulist <- sudoku}
	for(i in cellguess) { # Each additional cell separately
		lap <- 0
		templist <- list()
		for(j in seq_along(sudokulist)) { # Each sudoku scenario separately
			sudo <- sudokulist[[j]]
			nscen <- nrow(sudo[sudo$SudCell %in% i & sudo$Result , ])
			for(k in seq_len(nscen)) { # Each plausible alternative of the selected cell separately
				faagi <- rep(FALSE, nscen)
				faagi[k] <- TRUE
				temp <- sudo
				temp[temp$SudCell %in% i & temp$Result, "Result"] <- faagi
				temp <- SudSolve(temp)
				if(verbose) {SudShow(temp)}
				# Keep the scenario only if every cell still has a plausible hypothesis
				if(!any(tapply(temp$Result, temp["SudCell"], sum) == 0)) {templist[[lap + k]] <- temp}
			}
			lap <- lap + nscen
		}
		templist <- templist[!vapply(templist, is.null, logical(1))]
		sudokulist <- templist
	}
	if(length(sudokulist) == 1) {sudokulist <- sudokulist[[1]]}
	return(sudokulist)
}
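
SudSolve compares every hypothesis with every other one by filling a square matrix with a property and comparing it with its own transpose. A toy sketch of that idiom on five made-up values (the vector `sud_row` is illustrative and not taken from the sudoku data):

```r
# Toy property vector: the row number of five hypothetical hypotheses
sud_row <- c(1, 1, 2, 3, 3)
n <- length(sud_row)

m <- matrix(sud_row, nrow = n, ncol = n)  # filled column-wise: every column is sud_row
same_row <- m == t(m)                     # element [a, b] compares entries a and b

# outer() builds the same comparison in a single call
all(same_row == outer(sud_row, sud_row, "=="))
```

With the full table of 729 hypotheses, each such matrix has 729 x 729 = 531441 elements, which is the price of replacing the nested for-loops with a single vectorised comparison.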

Keywords

References

Related files

@@ Line 115: / Line 115: @@
 # Expand the missing index values of the Hypotheses table to create the full A.
 # Take the sudoku data table and replace hypotheses with data, if available.
-# Compare two cells a and b in the sudoku. Make a for-loop for the first cell a: for(m in 1:nrow(l))).
+# Compare all two-cell pairs at once using matrices.
-## Make another for-loop for the second cell b: for(n in (m+1):nrow(l)).
+## Create matrices of all critical properties: SudRow, SudCol, SudArea, and Hypothesis. These are compared pairwise, two cells at a time.
-### Make a third for-loop for the hypotheses of a: for(h in 1:nrow(a))
+## Use rules to deduce if a pair is incompatible or not.
-#### Make a fourth for-loop for the rules: for(r in 1:nrow(rules)).
+## Aggregate plausible hypotheses in the second cell across all plausible values in the first cell. This gives a set of hypotheses that are plausible in at least some conditions.
-##### Test for the rule with the pair of cells, creating a set of plausible hypothesis for the other cell b conditional on the first cell a.
+## Aggregate along the first cell: each second cell must be compatible with all other (first) cells.
-##### If the set of plausible hypotheses for b is empty, the condition is implausible; remove the condition and thus that hypothesis from the first cell a.
+## Do not apply these rules if the first and second cells are the same.
-##### Take the union of plausible hypothesis for b (which then covers all plausible hypotheses unconditionally) and replace the current hypotheses of b with the new hypotheses.
+# Go through every row, column, and area and find hypotheses where there is exactly one cell where it is plausible.
-### Do the third for-loop again but now with the other cell b.
+## Remove all other hypotheses from these cells.
-# Do the second and first loops for all values of m and n.
+# Do steps 3 and 4 for five times, assuming that all hypotheses have been eliminated by then, if it is possible to eliminate them using these rules.
-# Do another set of loops for each row, column, and area to test whether there is exactly one cell where each hypothesis is plausible. (Remember, that in sudoku, unlike in most parts of the world, we know that each hypothesis is correct at exactly one cell). If a unique cell is found, remove all other hypotheses from that cell.
+# Take a user-defined list of cells for which a random sample from plausible hypotheses is taken. It would be elegant to do this stepwise, but for simplicity, let's do it at once. Therefore, the cells that the user selects is critical, and a wise user will not select cells where uncertainties are clearly interdependent.
-## Make a loop for each row and then hypothesis.
+# Solve all sudokus that are created in the sampling.
-## Make a loop for each column and then hypothesis.
+# If a sudoku results in cells with zero plausible hypotheses, remove that iteration.
-## Make a loop for each area and then hypothesis.
-# If a unique solution was not found and if the current set of hypotheses is not the same as the previous set, save the current set as "previous set" and go to number 2.
 # Calculate the number of different solutions still plausible and print it.
 # If the number is smaller than 100, print also the solutions.
@@ Line 187: / Line 185: @@
 <rcode variables="name:userdata|description:Enter sudoku as a string, with spaces in empty cells|type:text">
+library(OpasnetUtils)
+hypotheses <- tidy(opbase.data("Op_en5817.hypotheses"))
+for(i in 1:3) {
+	hypotheses[[i]] <- ifelse(hypotheses[[i]] == "", NA, as.integer(as.character(hypotheses[[i]])))
+}
+hypotheses <- unique(fillna(hypotheses, marginals = c(1, 2, 3)))
+# Get the sudoku data
+# data <- tidy(opbase.data("Op_en5817.sudoku"), objname="data")
+# data$SudRow <- as.numeric(data$SudRow)
+# data$SudCol <- as.numeric(data$SudCol)
 # The world's most difficult sudoku
@@ Line 252: / Line 265: @@
 "
-if(!is.null(userdata)) data <- userdata
-data
 hypotheses <- data.frame(
@@ Line 261: / Line 272: @@
 	Result = TRUE
 )
-library(OpasnetUtils)
-hypotheses <- tidy(opbase.data("Op_en5817.hypotheses"))
-for(i in 1:3) {
-	hypotheses[[i]] <- ifelse(hypotheses[[i]] == "", NA, as.integer(as.character(hypotheses[[i]])))
-}
-hypotheses <- unique(fillna(hypotheses, marginals = c(1, 2, 3)))
-# Get the sudoku data
-# data <- tidy(opbase.data("Op_en5817.sudoku"), objname="data")
-# data$SudRow <- as.numeric(data$SudRow)
-# data$SudCol <- as.numeric(data$SudCol)
 SudMake <- function(hypotheses, data) {
@@ Line 299: / Line 295: @@
 }
-SudSolve <- function(sudoku) {
+ykköskarsinta <- function(out2, condition) {
-	ykköskarsinta <- function(out2, condition) {
 		valinta <- as.data.frame(as.table(tapply(out2$Result, out2[c("Hypothesis", condition)], sum) == 1)) # find the condition-hypothesis combinations
@@ Line 307: / Line 301: @@
 		valinta <- merge(valinta, out2) # merge with the SudCell information
 		valinta <- valinta[valinta$Result , ] # drop the rows that are not needed
+#print(valinta$SudCell)
 		out2[out2$SudCell %in% valinta$SudCell , ]$Result <- FALSE # first reject every hypothesis in the single-solution cells
+#print(paste(out2$Hypothesis, out2$SudCell) %in% paste(valinta$Hypothesis, valinta$SudCell))
 		out2[paste(out2$Hypothesis, out2$SudCell) %in% paste(valinta$Hypothesis, valinta$SudCell) , ]$Result <- TRUE # then restore the correct answer in those cells
 		return(out2)
 	}
-#	temp <- sudoku["Result"]
+SudSolve <- function(sudoku, verbose = FALSE) {
-#	for(m in 1:nrow(sudoku)) {
-#		a <- sudoku[m , ]
-#		if(a$Result) {
-#			temp[[m]] <- ifelse(a$SudRow == sudoku$SudRow & a$Hypothesis == sudoku$Hypothesis, FALSE, sudoku$Result)
-#			temp[[m]] <- ifelse(a$SudCol == sudoku$SudCol & a$Hypothesis == sudoku$Hypothesis, FALSE, temp[[m]])
-#			temp[[m]] <- ifelse(a$SudArea == sudoku$SudArea & a$Hypothesis == sudoku$Hypothesis, FALSE, temp[[m]])
-#		} else {
-#			temp[[m]] <- FALSE
-#		}
-#		temp[[m]] <- ifelse(a$SudCell == sudoku$SudCell, NA, temp[[m]])
-#	}
-	for(i in 1:5) {
+	iteraatio <- 1
+	repeat{
+		if(verbose) {print(SudShow(sudoku))}
 		samerow <- matrix(sudoku$SudRow, nrow = nrow(sudoku), ncol = nrow(sudoku))
 		samerow <- samerow == t(samerow)
@@ Line 349: / Line 335: @@
 		temp <- sudoku$Result & rule1 & rule2 & rule3 #& rule4
 		temp <- ifelse(samecell, NA, temp) # This rule must override all the others.
-	#	temp2 <- data.frame(Result = rep(NA, 9^2))
-	#	temp3 <- rep(NA, 9^3)
-	#	temp <- as.data.frame(temp)
-	#	for(m in 1:nrow(sudoku)) {temp2[m] <- tapply(temp[[m]], sudoku$SudCell, any)}
 		temp2 <- apply(temp, 2, function(x) tapply(x, sudoku$SudCell, any))
-	#	for(n in 1:nrow(sudoku)) {
-	#		temp2[[n]] <- ifelse(is.na(temp2[[n]]), TRUE, temp2[[n]])
-	#		temp3[n] <- all(temp2[[n]])
-	#	}
 		temp3 <- apply(temp2, 2, function(x) {
@@ Line 366: / Line 342: @@
 			return(all(x))
 		})
-		sudoku$Result <- ifelse(sudoku$Result, temp3, sudoku$Result)
-		sudoku <- ykköskarsinta(sudoku, "SudCol")
+		sudtemp <- sudoku
-		sudoku <- ykköskarsinta(sudoku, "SudRow")
+		sudtemp$Result <- ifelse(sudtemp$Result, temp3, sudtemp$Result)
-		sudoku <- ykköskarsinta(sudoku, "SudArea")
+		if(any(tapply(sudtemp$Result, sudtemp["SudCell"], sum) == 0)) {
+			if(verbose) {warning("Implausible solution\n")}
+			return(sudtemp)
+		}
+		sudtemp <- ykköskarsinta(sudtemp, "SudCol")
+		sudtemp <- ykköskarsinta(sudtemp, "SudRow")
+		sudtemp <- ykköskarsinta(sudtemp, "SudArea")
+		iteraatio <- iteraatio + 1
+		test <- all(sudtemp$Result == sudoku$Result)
+cat("Is the sudoku unchanged at round ", iteraatio, "?", test, "\n")
+		sudoku <- sudtemp
+	if(test | iteraatio > 15) {break}
 	}
 	return(sudoku)
 }
@@ Line 379: / Line 365: @@
 SudShow <- function(sudoku) {
-	pasthyp <- function(x) {
+	if(class(sudoku) == "data.frame") {sudoku <- list(sudoku)}
-		return(paste(x, sep = "", collapse = ""))
+	for(i in 1:length(sudoku)) {
+		out <- sudoku[[i]]
+		out <- out[out$Result , ]
+		out <- tapply(out$Hypothesis, out[c("SudRow", "SudCol")], function(x) paste(x, sep = "", collapse = ""))
+		if(exists("oprint")) oprint(out) else print(out)
+	}
+}
+SudGuess <- function(sudoku, cellguess, verbose = FALSE) {
+	if(class(sudoku) == "data.frame") {sudokulist <- list(sudoku)} else {sudokulist <- sudoku}
+	for(i in cellguess) { # Each additional cell separately
+		lap <- 0
+		templist <- list()
+		for(j in 1:length(sudokulist)) { # Each sudoku scenario separately
+			sudo <- sudokulist[[j]]
+			nscen <- nrow(sudo[sudo$SudCell %in% i & sudo$Result , ])
+			for(k in 1:nscen) { # Each plausible alternative of the selected cell separately
+				faagi <- rep(FALSE, nscen)
+				faagi[k] <- TRUE
+				temp <- sudo
+				temp[temp$SudCell %in% i & temp$Result, "Result"] <- faagi
+				temp <- SudSolve(temp)
+				if(verbose) {print(SudShow(temp))}
+				if(!any(tapply(temp$Result, temp["SudCell"], sum) == 0)) {templist[[lap + k]] <- temp}
+			}
+			lap <- lap + k
+		}
+		templist <- templist[!sapply(templist, is.null)]
+		sudokulist <- templist
 	}
-	out <- aggregate(
+	if(length(sudokulist) == 1) {sudokulist <- sudokulist[[1]]}
-		ifelse(sudoku$Result, sudoku$Hypothesis, ""),
+	return(sudokulist)
-		sudoku[c("SudRow", "SudCol", "SudCell")],
-		pasthyp
-	)
-	out <- array(out[[4]], dim = c(9,9))
-	return(out)
 }
-oprint(SudShow(SudSolve(SudMake(hypotheses, data))))
 </rcode>