Code Review: Sudoku solver


Re: Code Review: Sudoku solver

Tom Schrijvers
On Thu, 6 Apr 2006, Claus Reinke wrote:

>> Curry does not have a constraint solver of its own. It simply delegates all
>> constraints to the FD solver of SICStus Prolog.
>
> or that of SWI Prolog (which prompted my attempt to install Curry).
>
> which was implemented by..    hi, again!-)  (*)

The SWI-Prolog FD library is just a prototype implementation... looking
for someone to replace it with a state-of-the-art implementation.

>> The all_different constraint subsumes the rules that you describe,
>> depending on the consistency algorithm used. FD solvers implement general
>> but efficient algorithms that are much more complex than a few simple
>> rules.
>
> I haven't yet been able to get Curry to work on my windows machine,
> but it seems to do a straightforward translation of these constraints to
> Prolog+FD solver, close to the one I've attached (an easy way to "use"
> external constraint solvers from Haskell;-).
:)

> the docs you pointed to state that all_different, in declarative view, simply
> translates into mutual inequalities between the list members, and although I
> don't fully understand the sources, they seem to confirm that not much more
> is going on.

The SWI-Prolog prototype library does nothing more than the declarative
meaning; that's why it's a prototype.

State-of-the-art all_different implementations are a lot more complicated
(often based on graph algorithms) in order to achieve much stronger propagation.
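
For concreteness, here is a minimal Haskell sketch of just the declarative
reading (the function name is mine, not from any posted code); a
state-of-the-art propagator such as Régin's matching-based algorithm prunes
far more than this pairwise test:

  import Data.List (tails)

  -- The declarative reading of all_different: all pairs of elements
  -- must differ.  This checks ground values only; a real FD propagator
  -- prunes candidate domains instead.
  allDifferentDecl :: Eq a => [a] -> Bool
  allDifferentDecl xs = and [ x /= y | (x:ys) <- tails xs, y <- ys ]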

Here is a paper about solving Sudoku with constraint (logic)
programming that compares a number of all_different algorithms and additional
tricks:
http://www.computational-logic.org/iccl/master/lectures/winter05/fcp/fcp/sudoku.pdf

Cheers,

Tom

--
Tom Schrijvers

Department of Computer Science
K.U. Leuven
Celestijnenlaan 200A
B-3001 Heverlee
Belgium

tel: +32 16 327544
e-mail: [hidden email]

Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by haskell-2
On Wednesday, 5 April 2006, 15:09, Chris Kuklewicz wrote:

> Henning Thielemann wrote:
> > On Mon, 3 Apr 2006, Jared Updike wrote:
> >> or ambiguously) with your Sudoku solver? A rough measure of the
> >> difficulty of the unsolved puzzle could be how long the solver took to
> >> solve it (number of steps) (and the number of possible solutions)? Are
> >> puzzles with multiple solutions usually considered harder or easier?
> >> Are these considered proper puzzles?
> >
> > It's an interesting test to run a Sudoku solver on an empty array. :-)
>
> I am cleaning up my old (aka inexperienced) solver based on Knuth's dancing
> links to put on the wiki.  The code is very different from most Haskell
> solutions, since it revolves around a mutable data structure (which is not
> an MArray).
>
> It "solves" an empty array in 81 steps with no backtracking.   It will
> produce a list of all the solutions of an empty board quite efficiently.
>
> Cleaning up my "logic" based solver will take longer.
I've cleaned up my solver, removed a lot of redundant inference steps and made
the board a DiffArray (is that really faster?).
Now it completely solves (following all guesses) the 36,628 17-hint puzzles in
about 32 minutes (1909 secs).
It "solves" an empty board in 81 steps without false guesses, but still takes
over four seconds (that's the price for excessive inference).

I've also written a version using David F. Place's EnumSet instead of [Int],
that takes less MUT time, but more GC time, so is slower on the 36,628 test,
but faster for a single puzzle.

If anybody feels like improving the code (especially finding better names for
the functions) and/or placing it on the wiki, I'll be honoured.

Just out of curiosity, speed was not the objective when I wrote my solver, I
wanted to avoid guesswork (as much as possible), but in comparison with Cale
Gibbard's and Alson Kemp's solvers (which are much more beautifully coded),
it turned out that mine is blazingly fast, so are there faster solvers around
(in Haskell, in other languages)?

Cheers,
Daniel

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Attachment: Sudoku.hs (10K)
Attachment: Check.hs (278 bytes)

Re: Code Review: Sudoku solver

David Place

On Apr 6, 2006, at 6:05 PM, Daniel Fischer wrote:

> I've also written a version using David F. Place's EnumSet instead of [Int],
> that takes less MUT time, but more GC time, so is slower on the 36,628 test,
> but faster for a single puzzle.


That's a curious result.  Did you compile with optimization?  It should compile into primitive bit-twiddling operations and do no allocating at all.  I'd be curious to see how fast my solver works on the 36,628 test.  I'm afraid to run my ancient limping powerbook in such a tight loop for that long.  It gets too hot!  

If you'd find it amusing to give it a whirl, I'd love to know the result.



--------------------------------

David F. Place

[hidden email]




Attachment: incrsud.hs (7K)

Re[2]: Code Review: Sudoku solver

Bulat Ziganshin-2
In reply to this post by Daniel Fischer-4
Hello Daniel,

Friday, April 7, 2006, 2:05:03 AM, you wrote:

> I've cleaned up my solver, removed a lot of redundant inference steps and made
> the board a DiffArray (is that really faster?).

btw, the DiffArray implementation can be made significantly faster by
using IORefs instead of MVars
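
A sketch of why that matters (illustrative, not Bulat's code): an MVar
read-modify-write is a synchronised take/put pair, while an IORef is a
plain mutable cell with no locking:

  import Control.Concurrent.MVar
  import Data.IORef

  -- MVar variant: takeMVar locks the cell until the matching putMVar.
  bumpMVar :: MVar Int -> IO ()
  bumpMVar v = takeMVar v >>= putMVar v . succ

  -- IORef variant: a plain read and write, no synchronisation cost.
  bumpIORef :: IORef Int -> IO ()
  bumpIORef r = readIORef r >>= writeIORef r . succ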

--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Code Review: Sudoku solver

Claus Reinke
In reply to this post by Daniel Fischer-4
> Just out of curiosity, speed was not the objective when I wrote my solver, I
> wanted to avoid guesswork (as much as possible), but in comparison with Cale
> Gibbard's and Alson Kemp's solvers (which are much more beautifully coded),
> it turned out that mine is blazingly fast, so are there faster solvers around
> (in Haskell, in other languages)?

if I modify your solver to produce similar output to mine (input/first
propagation, solved puzzle, number and list of guesses made), yours
takes about a third of the time of mine (solving 36628 17-hint puzzles
in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with
my solver before I did this comparison!-)

like you, I've been trying to remove guesses, and the speed came as a
welcome bonus (I'm still using lists all over the place, with lots of not
nice adhoc code still remaining; not all propagators are iterated fully
yet because I only recently removed a logic bug that slowed down the
search instead of speeding it up; ..). so the more interesting bit is that
our solvers disagree on which are the most difficult puzzles (requiring
the largest number of guesses):

df
puzzles involving guesses: 5319
largest number of guesses:
    10 (#36084), 11 (#22495)

cr
puzzles involving guesses: 8165
largest number of guesses:
    10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)

df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs
5/9 guesses for df's. lots of potential for interesting investigations, though
mostly for me!-)

cheers,
claus


Attachment: Sudoku.hs (12K)

Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by David Place
On Friday, 7 April 2006, 01:50, you wrote:
> On Apr 6, 2006, at 6:05 PM, Daniel Fischer wrote:
> > I've also written a version using David F. Place's EnumSet instead
> > of [Int],
> > that takes less MUT time, but more GC time, so is slower on the
> > 36,628 test,
> > but faster for a single puzzle.
>
> That's a curious result.  Did you compile with optimization?  It

I considered that curious, too.
Everything was compiled with -O2 (does -O3 give better results? would adding
-optc-On improve performance? I'll try).
What makes it even weirder is that the EnumSet-version does indeed allocate
fewer bytes and performs fewer garbage collections (small difference,
though).
But I got consistent results: when running on a large number of puzzles, the
list version's faster gc'ing led to shorter overall run times.
The same held when compiled with -O3 -optc-O3. However, I've been stupid (my
excuse is that I was ill this week): the list version spent 46.5% of its time
gc'ing and the set version 53.5%, which is not really cricket, so today I used
-AxM, x <- [10,16,32,64], all of which reduced GC time to a reasonable
0.5 - 2%. That, plus a few polishings, reduced the running time to about 16
minutes for EnumSet, a little more for lists.

But, lo and behold, I also tried how plain Array fared in comparison to
DiffArray and ... reduced the running time to under ten minutes (a little
above for the list version), 5% GC time without -AxM, 1.2% with -A8M.

And I thought, DiffArrays were supposed to be fast!

> should compile into primitive bit-twiddling operations and do no
> allocating at all.  I'd be curious to see how fast my solver works on

Well, since I've wrapped the sets in the Entry datatype, I wouldn't expect
that; switching from Poss (singleton k) to Def k costs.
I tried making Board a DiffArray (Int,Int) (Set Int), but then I had the
problem that either I lost the information gained by placing & forbidding for
those positions where the number of possibilities dropped to one by
inference, or had to scan the grid and re-place every now and then, both
resulting in poor performance.

> the 36,628 test.  I'm afraid to run my ancient limping powerbook in
> such a tight loop for that long.  It gets too hot!
>
> If you'd find it amusing to give it a whirl, I'd love to know the
> result.

I ran your incrsud on the first fifteen 17-hint puzzles, took over 20s, so I
decided against the full 36,628 test. Extrapolation makes me believe it'd
take thirteen to fourteen hours.
The really big thing is to include the "if there is only one place to put a
number in a row/column/cell, then put it there" inference step. Further
inference has smaller impact (but the group-inference bought me a 20%
speedup, which isn't bad, is it?).
But using Array instead of DiffArray gave almost 40%, that's impressive.
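
For readers following along, this is roughly what that "only one place"
inference, often called a hidden single, looks like; the types here are
invented for the sketch and are not from the attached solver:

  import Data.List (foldl')
  import qualified Data.Map as Map

  type Cell  = (Int, Int)        -- a board position
  type Cands = [(Cell, [Int])]   -- remaining candidates per cell

  -- For one unit (row, column or box): any value that fits in exactly
  -- one cell of the unit must be placed in that cell.
  hiddenSingles :: [Cell] -> Cands -> [(Cell, Int)]
  hiddenSingles unit cands = [ (c, v) | (v, [c]) <- Map.toList places ]
    where
      places = foldl' ins Map.empty
                 [ (v, c) | (c, vs) <- cands, c `elem` unit, v <- vs ]
      ins m (v, c) = Map.insertWith (++) v [c] m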

Attached is the fastest version I have. Oddly, compiled with -O2 it's faster
than with -O3 -optc-O3 (on my computer); how come?

setUni +RTS -A8M -sstderr
True
99,859,933,904 bytes allocated in the heap
104,713,900 bytes copied during GC
    150,260 bytes maximum residency (72 sample(s))

      11558 collections in generation 0 (  6.83s)
         72 collections in generation 1 (  0.16s)

         13 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  554.68s  (568.29s elapsed)
  GC    time    6.99s  (  7.22s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  561.67s  (575.51s elapsed)

  %GC time       1.2%  (1.3% elapsed)

  Alloc rate    180,031,610 bytes per MUT second

  Productivity  98.8% of total user, 96.4% of total elapsed

an average of 0.015s per 17-hint puzzle, cool!

Cheers,
Daniel

>
> --------------------------------
> David F. Place
> mailto:[hidden email]

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Attachment: SudokuSet.hs (14K)

Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by Claus Reinke
On Friday, 7 April 2006, 17:33, you wrote:

> > Just out of curiosity, speed was not the objective when I wrote my
> > solver, I wanted to avoid guesswork (as much as possible), but in
> > comparison with Cale Gibbard's and Alson Kemp's solvers (which are much
> > more beautifully coded), it turned out that mine is blazingly fast, so
> > are there faster solvers around (in Haskell, in other languages)?
>
> if I modify your solver to produce similar output to mine (input/first
> propagation, solved puzzle, number and list of guesses made), yours
> takes about a third of the time of mine (solving 36628 17-hint puzzles
> in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with
> my solver before I did this comparison!-)

Mine's even faster now (at least on my computer, would you care to test it on
yours? If you don't want to get EnumSet, just change DiffArray to Array,
worked wonders for me), I'll dig into yours tomorrow to see what I can get
out of it to improve my algorithm.

>
> like you, I've been trying to remove guesses, and the speed came as a
> welcome bonus (I'm still using lists all over the place, with lots of not
> nice adhoc code still remaining; not all propagators are iterated fully

lists and adhoc code tend to be performance killers: I doubled the speed of
mine by de-adhoccing the code (and that although I introduced the
speed-killer DiffArray)

> yet because I only recently removed a logic bug that slowed down the
> search instead of speeding it up; ..). so the more interesting bit is that
> our solvers disagree on which are the most difficult puzzles (requiring
> the largest number of guesses):
>
> df
> puzzles involving guesses: 5319

If that's not a typo, I'm baffled. My original needed to guess in 5309
puzzles, and I can't imagine what inference I could have dropped when
cleaning up the code.

> largest number of guesses:
>     10 (#36084), 11 (#22495)
>
> cr
> puzzles involving guesses: 8165
> largest number of guesses:
>     10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)
>
> df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs
> 5/9 guesses for df's. lots of potential for interesting investigations,
> though mostly for me!-)
^^^^^^^^^^^^^^^^^^^^^^^^^^
I'm not sure about that :-)
>
> cheers,
> claus

Cheers back,
Daniel

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Re[2]: Code Review: Sudoku solver

Bulat Ziganshin-2
In reply to this post by Daniel Fischer-4
Hello Daniel,

Saturday, April 8, 2006, 3:06:03 AM, you wrote:
> And I thought, DiffArrays were supposed to be fast!

1. your arrays are too small (8 elements only)
2. DiffArray uses MVars internally; with IORefs it would be a lot
faster



--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Code Review: Sudoku solver

haskell-2
In reply to this post by Daniel Fischer-4
Daniel Fischer wrote:

> But, lo and behold, I also tried how plain Array fared in comparison to
> DiffArray and ... reduced the running time to under ten minutes (a little
> above for the list version), 5% GC time without -AxM, 1.2% with -A8M.
>
> And I thought, DiffArrays were supposed to be fast!

No.  DiffArrays are faster for the usual imperative single-threaded usage
pattern.  The haddock documentation explains:

> Diff arrays have an immutable interface, but rely on internal updates in
> place to provide fast functional update operator //.
>
> When the // operator is applied to a diff array, its contents are physically
> updated in place. The old array silently changes its representation without
> changing the visible behavior: it stores a link to the new current array
> along with the difference to be applied to get the old contents.
>
> So if a diff array is used in a single-threaded style, i.e. after //
> application the old version is no longer used, a'!'i takes O(1) time and a //
> d takes O(length d). Accessing elements of older versions gradually becomes
> slower.
>
> Updating an array which is not current makes a physical copy. The resulting
> array is unlinked from the old family. So you can obtain a version which is
> guaranteed to be current and thus have fast element access by a // [].

I assume the usage in a Sudoku solver involves a non-trivial amount of
back-tracking.  So as the solver backs up and goes forward again it ends up
being much more work than having used a plain Array.
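
To make the two usage patterns concrete, here is a minimal sketch (mine, not
from any posted code) of single-threaded versus branching access to a
DiffArray; the re-read of the old version in the branching case is what
replays the diff trail:

  import Data.Array.Diff (DiffArray)
  import Data.Array.IArray (listArray, (!), (//))

  -- Single-threaded: each new version supersedes the old one, so (//)
  -- updates in place and (!) stays O(1).
  singleThreaded :: Int
  singleThreaded =
    ((listArray (1,9) [1..9] :: DiffArray Int Int) // [(3,0)]) ! 3

  -- Branching, as in a backtracking solver: the old version a is read
  -- again after the update, which forces the diffs to be undone on
  -- every such access.
  branching :: (Int, Int)
  branching = let a = listArray (1,9) [1..9] :: DiffArray Int Int
                  b = a // [(3,0)]
              in (a ! 3, b ! 3)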

And as was pointed out by someone else on this list: to be thread safe the
DiffArray uses MVars (with locking) instead of IORefs.

But I expect the main problem is that a DiffArray is simply not the right
mutable data structure for the job.

I have had the flu this week, so I did not finish cleaning up my port of Knuth's
mutable dancing links based Sudoku solver.  But it uses a much more lightweight
way to alter a mutable data structure both going forward and backwards while
backtracking.  And I can use STRef's to build it, instead of MVars.


--
Chris

Re[2]: Code Review: Sudoku solver

Bulat Ziganshin-2
Hello Chris,

Saturday, April 8, 2006, 12:21:07 PM, you wrote:

> backtracking.  And I can use STRef's to build it, instead of MVars.

maybe it's better to use unboxed arrays/references?



--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Code Review: Sudoku solver

haskell-2
In reply to this post by haskell-2
I have finished my cleanup of the dancing links based solver for Sudoku.

I don't have time to compare performance with the other programs that have been
posted recently, or to do more profiling of my code.

For those who will say "It is ugly, imperative, and ugly!" please remember this
is a conversion of Knuth's C code, which depended on five non-trivial goto jumps
because he did not have tail recursion.  And the whole point of the algorithm
is the imperative unlink & relink routines acting on a sparse binary matrix.

This does not use any clever logic, but it does pick the tightest constraint at
every step.  This means if there is only one obvious possibility for a
position/row/column/block then it will immediately act on it.

The literate source file is attached.

[ My clever logic solver may eventually be cleaned up as well. ]

--
Chris


 A Sudoku solver by Chris Kuklewicz (haskell (at) list (dot) mightyreason (dot) com)

 I compile on a powerbook G4 (Mac OS X, ghc 6.4.2) using
 ghc -optc-O3 -funbox-strict-fields -O2 --make -fglasgow-exts

 This is a translation of Knuth's GDANCE from dance.w / dance.c

 http://www-cs-faculty.stanford.edu/~uno/preprints.html
 http://www-cs-faculty.stanford.edu/~uno/programs.html
 http://en.wikipedia.org/wiki/Dancing_Links

 I have an older version that uses lazy ST to return the solutions on
 demand, which was more useful when trying to generate new puzzles to
 solve.

> module Main where

> import Prelude hiding (read)
> import Control.Monad
> import Control.Monad.Fix
> import Data.Array.IArray
> import Control.Monad.ST.Strict
> import Data.STRef.Strict
> import Data.Char(intToDigit,digitToInt)
> import Data.List(unfoldr,intersperse)

> new = newSTRef
> {-# INLINE new #-}
> read = readSTRef
> {-# INLINE read #-}
> write = writeSTRef
> {-# INLINE write #-}
> modify = modifySTRef
> {-# INLINE modify #-}

 Data types to prevent mixing different index and value types

> type A = Int
> newtype R = R A deriving (Show,Read,Eq,Ord,Ix,Enum)
> newtype C = C A deriving (Show,Read,Eq,Ord,Ix,Enum)
> newtype V = V A deriving (Show,Read,Eq,Ord,Ix,Enum)
> newtype B = B A deriving (Show,Read,Eq,Ord,Ix,Enum)

 Sudoku also has block constraints, so we want to look up a block
 index in an array:

> lookupBlock :: Array (R,C) B
> lookupBlock = listArray bb [ toBlock ij | ij <- range bb ]
>     where ra :: Array Int B
>           ra = listArray (0,pred (rangeSize b)) [B (fst b) .. B (snd b)]
>           toBlock (R i,C j) = ra ! ( (div (index b j) 3)+3*(div (index b i) 3) )

 The value for an unknown location is 'u'.
 The bound and range are given by b and rng.  And bb is a 2D bound.

> u = V 0  -- unknown value
> b :: (Int,Int)
> b = (1,9) -- min and max bounds
> rng = enumFromTo (fst b)  (snd b)  -- list from '1' to '9'
> bb = ((R (fst b),C (fst b)),(R (snd b),C (snd b)))

  A Spec can be turned into a parsed array with ease:

> type Hint = ((R,C),V)
> newtype Spec = Spec [Hint] deriving (Eq,Show)

> type PA = Array (R,C) V

> parse :: Spec -> PA
> parse (Spec parsed) = let acc old new = new
>                       in accumArray acc u bb parsed

 The dancing links algorithm depends on a sparse 2D node structure.
 Each column represents a constraint.  Each row represents a Hint.
 The number of possible hints is 9x9x9 = 729

> type (MutInt st)  = (STRef st) Int

 The pointer types:

> type (NodePtr st) = (STRef st) (Node st)
> type (HeadPtr st)  = (STRef st) (Head st)

 The structure is a 2D grid of nodes, with Col's on the top of
 columns and a sparse collection of nodes.  Note that topNode of Head
 is not a strict field.  This is because the topNode needs to refer to
 the Head, and they are both created monadically.

> type HeadName = (Int,Int,Int) -- see below for meaning

> data Head st = Head {headName:: !HeadName
>                     ,topNode:: (Node st) -- header node for this column
>                     ,len:: !(MutInt st)  -- number of nodes below this head
>                     ,next,prev:: !(HeadPtr st)  -- doubly-linked list
>                     }

> data Node st = Node {getHint:: !Hint
>                     ,getHead:: !(Head st)  -- head for the column this node is in
>                     ,up,down,left,right :: !(NodePtr st)  -- two doubly-linked lists
>                     }

> instance Eq (Head st) where
>     a == b = headName a == headName b

> instance Eq (Node st) where
>     a == b = up a == up b

 To initialize the structures is a bit tedious.  Knuth's code reads in
 the problem description from a data file and builds the structure
 based on that.  Rather than short strings, I will use HeadName as the
 identifier.
 
 The columns are (0,4,5) for nodes that put some value in Row 4 Col 5
                 (1,2,3) for nodes that put Val 3 in Row 2 and some column
                 (2,7,4) for nodes that put Val 4 in Col 7 and some row
                 (3,1,8) for nodes that put Val 8 in some (row,column) in Block 1

 The first head is (0,0,0) which is the root.  The non-root head data
 will be put in an array with the HeadName as an index.

> headNames :: [HeadName]
> headNames = let names = [0,1,2,3]
>             in (0,0,0):[ (l,i,j) | l<-names,i<-rng,j<-rng]

 A "row" of left-right linked nodes is a move.  It is defined by a
 list of head names.

> type Move = [(Hint,HeadName)]

 Initial hints are enforced by making them the only legal move for
 that location.  Blank entries with value 'u = V 0' have a move for
 all possible values [V 1..V 9].

> parseSpec :: Spec -> [Move]
> parseSpec spec =
>   let rowsFrom :: Hint -> [Move]
>       rowsFrom (rc@(R r,C c),mv@(V v')) =
>           if mv == u then [ rsyms v | v <- rng ]
>           else [ rsyms v' ]
>         where (B b) = lookupBlock ! rc
>               rsyms :: A -> Move
>               rsyms v = map ( (,) (rc,V v) ) [(0,r,c),(1,r,v),(2,c,v),(3,b,v)]
>   in concatMap rowsFrom (assocs (parse spec))

 mkDList creates doubly linked lists using a monadic smart
 constructor and the recursive "mdo" notation as documented at
 http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#mdo-notation
 http://www.cse.ogi.edu/PacSoft/projects/rmb/

 For more fun with this, see the wiki page at
 http://haskell.org/hawiki/TyingTheKnot

> mkDList :: (MonadFix m) => (b -> a -> b -> m b) -> [a] -> m b
> mkDList _ [] = error "must have at least one element"
> mkDList mkNode xs = mdo (first,last) <- go last xs first
>                         return first
>   where go prev []     next = return (next,prev)
>         go prev (x:xs) next = mdo this <- mkNode prev x rest
>                                   (rest,last) <- go this xs next
>                                   return (this,last)

 toSingle takes a function and a header node and iterates (read . function)
 until the header is reached again, but does not return the header
 itself.

> toSingle step header = loop =<< (read . step) header
>     where loop y = if header/=y then liftM (y:) (read (step y) >>= loop)
>                                 else return []
>        

 forEach is an optimization of (toSingle step header >>= mapM_ act)

> forEach step header act = loop =<< (read . step) header
>      where loop y = if header/=y then (act y >> (read (step y)) >>= loop)
>                                  else return ()

 Now make the root node and all the head nodes. This also exploits mdo:

> makeHeads :: [HeadName] -> (ST st) (Head st)
> makeHeads names = mkDList makeHead names
>     where makeHead before name after = mdo
>             ~newTopNode <- liftM4 (Node ((R 0,C 0),V 0) newHead) (new newTopNode) (new newTopNode)
>                                                                 (new newTopNode) (new newTopNode)
>             newHead <- liftM3 (Head name newTopNode)                        
>                                    (new 0) (new after) (new before)
>             return newHead

 The Head nodes will be placed in an array for easy lookup while building moves:

> type HArray st = Array HeadName (Head st)
> hBounds = ((0,1,1),(3,9,9))
> type Root st =  (Head st,HArray st)

 The addMove function creates the (four) nodes that represent a move and adds
 them to the data structure.  The HArray in Root makes for a fast
 lookup of the Head data.

> addMove :: forall st. (Root st) -> Move -> (ST st) (Node st)
> addMove (_,ha) move = mkDList addNode move
>     where addNode :: (Node st) -> (Hint,HeadName) -> (Node st) -> (ST st) (Node st)
>           addNode before (hint,name) after = do
>             let head = ha ! name
>             let below = topNode head
>             above <- read (up below)
>             newNode <- liftM4 (Node hint head) (new above) (new below)
>                                                (new before) (new after)
>             write (down above) newNode
>             write (up below) newNode
>             modify (len head) succ
>             l <- read (len head)
>             seq l (return newNode)

 Create the column headers, including the fast lookup array.  These
 will be reused between puzzles.

> initHA :: (ST st) (Root st)
> initHA = do
>   root <- makeHeads headNames
>   heads <- toSingle next root
>   let ha = array hBounds (zip (map headName heads) heads)
>   return (root,ha)

 Take the Root from initHA and a puzzle Spec and fill in all the Nodes.

> initRoot :: (Root st) -> Spec -> (ST st) ()
> initRoot root spec = do
>   let moves = parseSpec spec
>   mapM_ (addMove root) moves

  Return the column headers to their condition after initHA

> resetRoot :: (Root st) -> (ST st) ()
> resetRoot (root,ha) = do
>   let heads@(first:_) = elems ha
>   let resetHead head = do
>         write (len head) 0
>         let node = topNode head
>         write (down node) node
>         write (up node) node
>       reset (last:[]) = do
>         write (prev root) last
>         write (next root) first
>       reset (before:xs@(head:[])) = do
>         resetHead head
>         write (prev head) before
>         write (next head) root
>         reset xs
>       reset (before:xs@(head:after:_)) = do
>         resetHead head
>         write (prev head) before
>         write (next head) after
>         reset xs
>   reset (root:heads)

 getBest iterates over the unmet constraints (i.e. the Heads that are
 reachable from root). It locates the one with the lowest number of
 possible moves that will solve it, aborting early if it finds 0 or 1
 moves.

> getBest :: (Head st) -> (ST st) (Maybe (Head st))
> getBest root = do
>   first <- read (next root)
>   if first == root then return Nothing
>     else do
>       let findMin m best head | head == root = return (Just best)
>                               | otherwise = do
>             l <- read (len head)
>             if l <= 1 then return (Just head)
>               else if l < m then findMin l head =<< read (next head)
>                      else findMin l best =<< read (next head)
>       findMin 10 first first

 The unlink and relink operations are from where Knuth got the name
 "dancing links".  So long as "a" does not change in between, the
 relink call will undo the unlink call.  Similarly, uncover will
 undo the changes of cover and uncoverOthers will undo coverOthers.

> unlink :: (a->STRef st a) -> (a->STRef st a) -> a -> (ST st) ()
> unlink prev next a = do
>   before <- read (prev a)
>   after <- read (next a)
>   write (next before) after
>   write (prev after) before

> relink :: (a->STRef st a) -> (a->STRef st a) -> a -> (ST st) ()
> relink prev next a = do
>   before <- read (prev a)
>   after <- read (next a)
>   write (next before) a
>   write (prev after) a

> cover :: (Head st) -> (ST st) ()
> cover head = do
>   unlink prev next head
>   let eachDown rr = forEach right rr eachRight
>       eachRight nn = do
>         unlink up down nn
>         modify (len $ getHead nn) pred
>   forEach down (topNode head) eachDown

> uncover :: (Head st) -> (ST st) ()
> uncover head = do
>   let eachUp rr = forEach left rr eachLeft
>       eachLeft nn = do
>         modify (len $ getHead nn) succ
>         relink up down nn
>   forEach up (topNode head) eachUp
>   relink prev next head

> coverOthers :: (Node st) -> (ST st) ()
> coverOthers node = forEach right node (cover . getHead)

> uncoverOthers :: (Node st) -> (ST st) ()
> uncoverOthers node = forEach left node (uncover . getHead)

 A helper function for gdance:

> choicesToSpec :: [(Node st)] -> Spec
> choicesToSpec = Spec . (map getHint)

 This is the heart of the algorithm.  I have altered it to return only
 the first solution, or produce an error if none is found.

 Knuth used several goto links to do what is done below with tail
 recursion.

> gdance :: (Head st) -> (ST st) Spec -- [Spec]
> gdance root =
>     let
>         forward choices = do
>             maybeHead <- getBest root
>             case maybeHead of
>                 Nothing -> if null choices
>                              then error "No choices in forward" -- return [] -- for [Spec]
>                              else do -- nextSols <- recover choices -- for [Spec]
>                                      return $ (choicesToSpec choices) -- :nextSols -- for [Spec]
>                 Just head -> do cover head
>                                 startRow <- readSTRef (down (topNode head))
>                                 advance (startRow:choices)
>
>         advance choices@(newRow:oldChoices) = do
>             let endOfRows = topNode (getHead newRow)
>             if (newRow == endOfRows)
>               then do uncover (getHead newRow)
>                       if (null oldChoices)
>                         then error "No choices in advance" -- return [] -- for [Spec]
>                         else recover oldChoices
>               else do coverOthers newRow
>                       forward choices
>
>         recover (oldRow:oldChoices) = do
>             uncoverOthers oldRow
>             newRow <- readSTRef (down oldRow)
>             advance (newRow:oldChoices)
>
>     in forward []

 Convert a text board into a Spec

> parseBoard :: String -> Spec
> parseBoard s = Spec (zip rcs vs'check)
>   where rcs :: [(R,C)]
>         rcs = [ (R r,C c) | r <- rng, c <- rng ]
>         isUnset c = (c=='.') || (c==' ') || (c=='0')
>         isHint c = ('1'<=c) && (c<='9')
>         cs = take 81 $ filter (\c -> isUnset c || isHint c) s
>         vs :: [V]
>         vs = map (\c -> if isUnset c then u else (V $ digitToInt c)) cs
>         vs'check = if 81==length vs then vs else error ("parse of board failed\n"++s)

 This is quite useful as a utility function which partitions the list into groups of n elements.
 Used by showSpec.

> groupTake :: Int->[a]->[[a]]
> groupTake n b = unfoldr foo b
>     where foo [] = Nothing
>           foo b = Just (splitAt n b)
 
 Make a nice 2D ascii board from the Spec (not used at the moment)

> showSpec :: Spec -> String
> showSpec spec = let pa = parse spec
>                     g = groupTake 9 (map (\(V v) -> if v == 0 then '.' else intToDigit v) $ elems pa)
>                     addV line = concat $ intersperse "|" (groupTake 3 line)
>                     addH list = concat $ intersperse ["---+---+---"] (groupTake 3 list)
>                 in unlines $ addH (map addV g)

  One line display

> showCompact spec = map (\(V v) -> intToDigit v) (elems (parse spec))

 The main routine is designed to handle the input from http://www.csse.uwa.edu.au/~gordon/sudoku17

> main = do
>   all <- getContents
>   let puzzles = zip [1..] (map parseBoard (lines all))
>   root <- stToIO initHA
>   let act :: (Int,Spec) -> IO ()
>       act (i,spec) = do
>         answer <- stToIO (do initRoot root spec
>                              answer <- gdance (fst root)
>                              resetRoot root
>                              return answer)
>         print (i,showCompact  answer)
>   mapM_ act puzzles


Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by Daniel Fischer-4
On Saturday, 8 April 2006, 02:20, Daniel Fischer wrote:

> On Friday, 7 April 2006, 17:33, you wrote:
> > > Just out of curiosity, speed was not the objective when I wrote my
> > > solver, I wanted to avoid guesswork (as much as possible), but in
> > > comparison with Cale Gibbard's and Alson Kemp's solvers (which are much
> > > more beautifully coded), it turned out that mine is blazingly fast, so
> > > are there faster solvers around (in Haskell, in other languages)?
> >
> > if I modify your solver to produce similar output to mine (input/first
> > propagation, solved puzzle, number and list of guesses made), yours
> > takes about a third of the time of mine (solving 36628 17-hint puzzles
> > in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with
> > my solver before I did this comparison!-)
>
> Mine's even faster now (at least on my computer, would you care to test it
> on yours? If you don't want to get EnumSet, just change DiffArray to
> Array, worked wonders for me), I'll dig into yours tomorrow to see what I
> can get out of it to improve my algorithm.

Unfortunately, no new inference rules :-(
Two things I don't like about your code:
1. no type declarations
2. too much name shadowing, which makes following the code difficult

apart from that: clever
>
> > like you, I've been trying to remove guesses, and the speed came as a
> > welcome bonus (I'm still using lists all over the place, with lots of not
> > nice adhoc code still remaining; not all propagators are iterated fully
>
> lists and adhoc code tend to be performance killers: I doubled the speed of
> mine by de-adhoccing the code (and that although I introduced the
> speed-killer DiffArray)

I believe if you change the representation of puzzles from [(pos,range)]
to an Array, you'll get a significant speedup
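
A minimal sketch of what such a representation could look like (types and
names are mine, purely illustrative):

  import Data.Array

  type Pos    = (Int, Int)
  type Puzzle = Array Pos [Int]    -- remaining candidates per position

  -- a blank grid: every cell may still hold any of 1..9
  blank :: Puzzle
  blank = listArray ((1,1),(9,9)) (repeat [1..9])

  -- O(1) lookup during propagation; each update copies the (small)
  -- array, so narrowings are best applied in bulk
  narrow :: Puzzle -> [(Pos, [Int])] -> Puzzle
  narrow = (//)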

>
> > yet because I only recently removed a logic bug that slowed down the
> > search instead of speeding it up; ..). so the more interesting bit is
> > that our solvers disagree on which are the most difficult puzzles
> > (requiring the largest number of guesses):
> >
> > df
> > puzzles involving guesses: 5319
>
> If that's not a typo, I'm baffled. My original needed to guess in 5309

Rot! Typo in _my_ previous message, 5319 is correct.

> puzzles, and I can't imagine what inference I could have dropped when
> cleaning up the code.
>
> > largest number of guesses:
> >     10 (#36084), 11 (#22495)
> >
> > cr
> > puzzles involving guesses: 8165
> > largest number of guesses:
> >     10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)
> >
> > df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs
> > 5/9 guesses for df's. lots of potential for interesting investigations,

We use different guessing strategies (plus, I also have group-inference).
But the given number of guesses is the number of guesses in the successful
branch, and I think a better measure for the nefariousness of a puzzle is the
accumulated number of guesses in all branches or the number of branches
(still better is the time needed to solve the puzzle).
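
For anyone who wants to count that way, a generic sketch (not from either
solver) of a backtracking search instrumented to count every branch entered,
not just those on the successful path; step either solves a state or returns
the branch states to try:

  searchCount :: (p -> Either [p] r) -> p -> (Maybe r, Int)
  searchCount step = go
    where
      go p = case step p of
               Right r -> (Just r, 0)   -- solved, no further guesses
               Left qs -> tryAll qs     -- guess point: try each branch
      tryAll []     = (Nothing, 0)
      tryAll (q:qs) = case go q of
        (Just r , n) -> (Just r, n + 1)
        (Nothing, n) -> let (r', m) = tryAll qs in (r', n + m + 1)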

This is the list of the 30 puzzles with the most branches:
puzzle  #branches
  3992   213
  7475   120
 12235   117
  5341    69
 11815    60
  9402    60
 11544    59
  9184    54
 10403    50
 31110    48
  8575    48
  1489    45
  2732    40
 11523    39
  6730    39
 10929    38
   960    35
 19474    32
  6412    31
  1599    30
 36084    29
 21832    29
 22495    28
  4657    28
 34747    27
 10404    27
 29931    26
   942    25
   563    24

the top 30 in CPUTime (in milliseconds, cpuTimePrecision = 10^10)
  3992  6480
  9184  1520
 31110  1470
 10403  1310
 12235  1260
  7475  1130
  2732  1080
   960  1050
  5341   990
 11544   960
 11815   930
  1395   730
 10929   710
  1863   710
  1330   700
 20807   630
  4181   610
 10634   570
 34401   550
   959   550
 34747   520
  1599   520
 14912   510
 29282   500
  7983   500
 29273   480
 23958   470
  2245   460
  2232   440
 36425   430

So puzzle 3992 is outstandingly bad in both respects (I fed it into my old
step by step solver and boy, failure is detected _very_ late in practically
all branches). From a couple of tests I have the vague impression that the
correlation between the number of guesses in the successful branch and time
is not very strong (3992 has 6; 9184 and 2732 only 3; 31110 has 5; 10403, 8;
12235, 9; 7475, 6; and 960, 7), but I don't think I'll do a statistical
analysis; I'll stick to time as the measure.

Here's the meanest puzzle:

0 0 0 0 4 0 0 5 9
3 0 0 2 0 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 7 0 0
0 4 6 0 0 0 0 0 0
9 5 0 0 0 0 0 0 0
0 0 0 0 5 6 0 4 0
0 0 0 8 0 0 3 0 0
0 0 0 0 0 0 0 0 0

and that's so mean that David Place's incrsud beats my solver on this by a
factor of 5.5!

> > though mostly for me!-)
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^
> I'm not sure about that :-)
>
> > cheers,
> > claus
>
Cheers again,
Daniel

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by haskell-2
On Saturday, 8 April 2006, 20:28, Chris Kuklewicz wrote:
> I have finished my cleanup of the dancing links based solver for Sudoku.
>
> I don't have time to compare performance with the other programs that have
> been posted recently, or to do more profiling of my code.

Your dancing links:

ckSud +RTS -sstderr -H32M -A8M < sudoku17 > Solutions.txt
ckSud +RTS -sstderr -H32M -A8M
62,941,602,892 bytes allocated in the heap
330,404,632 bytes copied during GC
    465,944 bytes maximum residency (41 sample(s))

       2023 collections in generation 0 ( 15.60s)
         41 collections in generation 1 (  0.30s)

         32 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  734.59s  (781.93s elapsed)
  GC    time   15.90s  ( 16.73s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  750.49s  (798.66s elapsed)

  %GC time       2.1%  (2.1% elapsed)

  Alloc rate    85,682,629 bytes per MUT second

  Productivity  97.9% of total user, 92.0% of total elapsed

Without -HxM, -AxM:

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  597.47s  (915.94s elapsed)
  GC    time  912.65s  (1363.63s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  1510.12s  (2279.57s elapsed)

My version using EnumSet (with the faster 'size'):

sudokus +RTS -sstderr > Solutions
sudokus +RTS -sstderr
82,190,535,600 bytes allocated in the heap
771,054,072 bytes copied during GC
    153,512 bytes maximum residency (394 sample(s))

     286104 collections in generation 0 ( 33.98s)
        394 collections in generation 1 (  0.35s)

          2 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time  482.51s  (1105.12s elapsed)
  GC    time   34.33s  ( 79.90s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  516.84s  (1185.02s elapsed)

  %GC time       6.6%  (6.7% elapsed)

  Alloc rate    170,339,548 bytes per MUT second

  Productivity  93.4% of total user, 40.7% of total elapsed

Nice that original Haskell code can beat a translation from C.

However:
setSud ../puzzle3992 +RTS -sstderr
628|743|159
395|261|478
174|589|632
---+---+---
832|195|764
746|328|591
951|674|283
---+---+---
213|956|847
469|817|325
587|432|916
===========
888,672,920 bytes allocated in the heap
  3,352,784 bytes copied during GC
     45,648 bytes maximum residency (1 sample(s))

       3287 collections in generation 0 (  0.21s)
          1 collections in generation 1 (  0.00s)

          2 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    4.77s  (  4.77s elapsed)
  GC    time    0.21s  (  0.22s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.98s  (  4.99s elapsed)

  %GC time       4.2%  (4.4% elapsed)

  Alloc rate    186,304,595 bytes per MUT second

  Productivity  95.8% of total user, 95.6% of total elapsed

But
ckSud +RTS -sstderr < oneBad
ckSud +RTS -sstderr
(1,"673941852148256379295378461534167928826594713917832546351729684762483195489615237")
(2,"683941752179258463245376918534167829826594371917832546351729684762483195498615237")
(3,"829143657361785492547629381678954213934271568215836974152468739486397125793512846")
(4,"713642958825917436649835127594781362378264519261593784136478295482359671957126843")
(5,"763942158425817936189635427594781362378264519216593784631478295842359671957126843")
(6,"269743158371865924458921637945137286836492571712658349597386412683214795124579863")
(7,"269743158371865924485921637943157286856492371712638549597386412638214795124579863")
(8,"628743159395261478174589632832195764746328591951674283213956847469817325587432916")
(9,"983541762761328945524679813679483251835162497142957638457816329296735184318294576")
(10,"578942361923165748614837925867491532235786194149253876796514283452378619381629457")
(11,"938145762127396854654872931873629145546713289291584673415967328369258417782431596")
(12,"792548361531672894846931572657384129483129657219756438965817243174263985328495716")
(13,"738249561296517483154386927673192854981654372425873619547928136319765248862431795")
(14,"957842361386719452124653879598364127673281945412975638845137296231596784769428513")
(15,"598241367673958124421736985254873691317692548986415732742169853165384279839527416")
 25,708,036 bytes allocated in the heap
  9,097,220 bytes copied during GC
    329,648 bytes maximum residency (5 sample(s))

         97 collections in generation 0 (  0.42s)
          5 collections in generation 1 (  0.04s)

          2 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.23s  (  0.23s elapsed)
  GC    time    0.46s  (  0.46s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.69s  (  0.69s elapsed)

  %GC time      66.7%  (66.7% elapsed)

  Alloc rate    111,774,069 bytes per MUT second

  Productivity  33.3% of total user, 33.3% of total elapsed

The infamous puzzle 3992 is the eighth in oneBad, so the links dance rings
around me for that.
I wonder where the dancing links run into difficulties.
I'll see whether I can grok (that does mean understand, doesn't it?) your
programme one of these days.

>
> For those who will say "It is ugly, imperative, and ugly!" please remember
> this is a conversion of Knuth's c-code, which depended on five non-trivial
> goto jumps because he did not have tail recursion.  And the whole point of
> the algorithm are the imperative unlink & relink routines acting on a
> sparse binary matrix.
>
> This does not use any clever logic, but it does pick the tightest
> constraint at every step.  This means if there is only one obvious
> possibility for a position/row/column/block then it will immediately act on
> it.
>
> The literate source file is attached.
>
> [ My clever logic solver may eventually be cleaned up as well. ]

I really would like to see that.

Cheers,
Daniel

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Re: Code Review: Sudoku solver

haskell-2
In reply to this post by Daniel Fischer-4
I just ran a simple metric for the dancing-links solver.

The only real metric I could use was the number of "coverOthers" calls which
counts the number of selections made (there is no distinction between certainty
and guessing).

So the minimum # of selections is 81 (of which 17 are the free hints, and 64
were real moves).  Each selection beyond 81 involved backtracking.

Of the 36628 puzzles, there were 21395 puzzles that were solved with 81
selections, and therefore no backtracking.  So 58% of them were (some with luck)
solved instantly.  This low percentage is why this is not a "by logic" solver.

puzzles      selections each  selections  excess/above 81
21395 minimal  81 (17+64*1)     1732995        0
 9841 up to   145 (17+64*2)     1043838   246717 (== 1043838 - 9841*81)
 3192 up to   273 (17+64*4)      611275   352723
 1215 up to   529 (17+64*8)      453302   354887
  576 up to  1041 (17+64*16)     415496   368840
  248 up to  2065 (17+64*32)     354028   333940
  105 up to  4113 (17+64*64)     300909   292404
   42 up to  8209 (17+64*128)    248645   245243
   10 up to 16401 (17+64*256)    120724   119914
    3 up to 32785 (17+64*512)     62319    62076
    1 used  32875                 32875    32794 (puzzle # 10995)
-----                           -------  -------
36628                           5376406  2409538

I think the important thing about the list is that the work in each group is
double the one before it, but the count of puzzles is always less than half the
one before it.  Thus the total selections decreases.  So the hard puzzles are
time consuming but not catastrophic.

Anyway, 94% of the puzzles were solved with no more than 17+64*4 selections.

And only 50.7% of the selections beyond the 17 hints required backtracking.
(Computed from 2409538 / ( 5376406 - 36628 * 17 ) ).  That's an average of 65.8
bad selections per puzzle that requires 64 good selections to solve.  This seems
like an excellent benchmark for comparing brute force methods.

A final point about dancing links is that this is a generic data structure and
algorithm for solving any "binary cover" problem.   (Such as arranging
pentaminos).  It is not specialized to knowing anything about Sudoku.  The
initialization tells it "there are 324 constraints" and "this is a comprehensive
set of moves, limited only by the initial hints".  And it runs only 50% slower
than bit-twiddling specialized logic.

The funny thing is that Sudoku is too small for this method, by which I mean
that the constant cost of initializing the data structure that is used can
become comparable to running the algorithm.  I had to profile my code while
rewriting it and optimize the initialization code to have a much smaller
overhead than Knuth's code.  Most non-Sudoku problems that one needs to brute
force with this will run much longer than any setup costs.

Finally, here were the worst 30 puzzles:

selections puzzle#
32875   10995

23577   412
20707   27267
18035   26632

14350   21765
14247   33677
13966   10992
13660   7250
13453   28608
12243   19106
12181   10993
9300    18177
8686    24590
8638    12445

8161    19387
8157    7921
7916    35782
7900    33131
7850    36162
7663    21951
7427    31110
7321    16608
7268    33554
7200    1259
7108    9750
6805    28607
6781    21919
6716    14969
6432    26234
6361    15697

I don't notice any overlap with the list of hard puzzles to solve with the other
programs.

--
Chris

Re: Code Review: Sudoku solver

Daniel Fischer-4
In reply to this post by haskell-2
On Saturday, 8 April 2006, 10:21, Chris Kuklewicz wrote:

> Daniel Fischer wrote:
> > But, lo and behold, I  also tried how plain Array fared in comparison to
> > DiffArray and ... reduced the running time to under ten minutes (a little
> > above for the list version), 5% GC time without -AxM, 1.2% with -A8M.
> >
> > And I thought, DiffArrays were supposed to be fast!
>
> No.  DiffArrays are faster for the usual imperative single-threaded usage
>
> pattern.  The haddock documentation explains:

Well, it's single-threaded for 31,309 of the puzzles and doesn't have much
branching for most of the others.
I hoped that when guessing was necessary, a copy was made to be used for the
other branches, but apparently I couldn't persuade the compiler to do that.

> > Diff arrays have an immutable interface, but rely on internal updates in
> > place to provide fast functional update operator //.
> >
> > When the // operator is applied to a diff array, its contents are
> > physically updated in place. The old array silently changes its
> > representation without changing the visible behavior: it stores a link to
> > the new current array along with the difference to be applied to get the
> > old contents.
> >
> > So if a diff array is used in a single-threaded style, i.e. after //
> > application the old version is no longer used, a'!'i takes O(1) time and
> > a // d takes O(length d). Accessing elements of older versions gradually
> > becomes slower.
> >
> > Updating an array which is not current makes a physical copy. The
> > resulting array is unlinked from the old family. So you can obtain a
> > version which is guaranteed to be current and thus have fast element
> > access by a // [].
>
> I assume the usage in a Sudoku solver involves a non-trivial amount of
> back-tracking.  So as the solver backs up and goes forward again it ends up
> being much more work than having used a plain Array.

I tried to keep backtracking to a minimum, and in most cases that minimum is
reached.
>
> And as was pointed out by someone else on this list: to be thread safe the
> DiffArray uses MVar's (with locking) instead of IOVars.
>
> But I expect the main problem is that a DiffArray is simply not the right
> mutable data structure for the job.

Should I try an MArray?
And what's more promising, IOArray or STArray?
>
> I have had the flu this week, so I did not finish cleaning up my port of

So did I, hope you're well again.

> Knuth's mutable dancing links based Sudoku solver.  But it uses a much more
> lightweight way to alter a mutable data structure both going forward and
> backwards while backtracking.  And I can use STRef's to build it, instead
> of MVars.

Cheers,
Daniel
--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Re: Code Review: Sudoku solver

Claus Reinke
In reply to this post by Daniel Fischer-4
>> Mine's even faster now (at least on my computer, would you care to test it
>> on yours? If you don't want to get EnumSet, just change DiffArray to
>> Array, worked wonders for me), I'll dig into yours tomorrow to see what I
>> can get out of it to improve my algorithm.
>
> Unfortunately, no new inference rules :-(

indeed, the crucial difference between our versions was that I'd avoided
group inference; I thought it would be expensive to implement and, based
on my own limited experience, rarely used - boy, was I wrong on both
accounts: it is easily implemented, with a little care for the inner loop, not
too inefficient, either, and the examples I mentioned (on which my solver
needed guesswork) simply fall apart once grouping is added (it was quite
impressive to take the output of my old propagation code and apply
grouping by hand!).

now my solver takes slightly fewer guesses than yours, though yours
is still significantly faster (same factor of about 3 between them, both
having improved by a factor of 2 or so..).

> Two things I don't like about your code:
> 1. no type declarations

that's what type inference is for!-) I tend to add them only as debugging
assertions, when language limitations require them, or when I'm fairly
certain that the code is stable (i.e., dead;-). it is useful to think about types,
but engraving them into the code is overrated (any good ide should be
able to show them as if they'd been written in anyway..; lacking such ide,
:browse Main will do the trick, though ghci's current formatting is ugly!).

> 2. too much name shadowing, which makes following the code difficult

depends on which parts you mean - I don't like overusing primed or
numbered variables, and using the same variable name through a
sequence of steps is usually the last step before factoring that variable
out into some monad anyway (I did mention that the code was still
ad-hoc, not cleaned up).

> apart from that: clever

as in: you'd prefer it if I'd express myself more clearly?-) agreed.
I've been working on cleaning up the code (also trying to remove
stupidities as they become exposed in the process..).

>> lists and adhoc code tend to be performance killers, I doubled the speed of
>> mine by de-adhoccing the code (and that although I introduced the
>> speed-killer DiffArray)

there's nothing wrong with lists, just as arrays are not synonymous with
good performance - it all depends on the usage patterns. there are several
interesting examples in this problem:

1 when I first tried EnumSet for the ranges, there was no substantial
    performance difference to a list-based representation.

    in theory, the change sounds like a clear win (constant vs linear time
    membership test), but (a) the ranges are small (they start at size 9,
    and we do our best to shrink them rapidly, so even if they were at
    average size 6 or 7, the test would take an average of 3 or 4 steps);
    (b) membership is not the only operation - for the inner loops in my
    code, singleton testing and extraction of the single element from a
    singleton were equally important, and here the relation was reversed:
    lists give us those operations in constant time while they were linear
    in the old EnumSet.

    Bulat's tabling suggestion (which I interpreted rather more extremely
    by tabling the actual calls to size and range2list:-) solves this issue
    beautifully, making EnumSet starting to look competitive in spite of
    the particular usage pattern in this problem, and when the grouping
    inference asked for more calls to size, as well as intersection and
    union, the choice flipped completely, now in favour of EnumSet
    (has agreement been reached about an updated version of EnumSet?) -
    a small sketch of the tabling idea follows below.
 
    [if I might make a suggestion for EnumSet: it is not just that the
     implementation of the standard API can be more efficient for these
     small sets, but there are additional operations that ought to be
     included (unless the EnumSet.Set constructor gets exported),
     such as lifting the Ix instance from Word to Set a]
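
to make that tabling concrete, here is a minimal sketch - not the actual
EnumSet code: the 9-bit encoding and all names are assumptions for
illustration only.

import Data.Array
import Data.Bits ((.&.), shiftR)

-- candidate set for one cell: bit (d-1) set <=> digit d still possible
type Cands = Int

-- built once; afterwards, size and members are single array reads
sizeTable :: Array Cands Int
sizeTable = listArray (0, 511) (map count [0 .. 511])
  where count 0 = 0
        count w = (w .&. 1) + count (w `shiftR` 1)

membersTable :: Array Cands [Int]
membersTable = listArray (0, 511)
  [ [ d | d <- [1 .. 9], (w `shiftR` (d - 1)) .&. 1 == 1 ] | w <- [0 .. 511] ]

size :: Cands -> Int
size = (sizeTable !)

members :: Cands -> [Int]
members = (membersTable !)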

2 DiffArrays: they are neither a speed killer nor a universal cure.

    they were intended to enable functional arrays with inplace
    update for sequential access, but without the complex analyses
    needed to ensure such sequential access, and with the reverse
    diff to old versions preserving the persistence of copies.

    using them, we'll get inplace update plus a long trail recording
    the diffs needed to reconstruct the old versions, and if we use
    them single-threadedly, that trail will just be ignored and the
    garbage collector will neglect to copy it at some point. but if
    we combine them with backtracking, those trails are likely to
    kill much of the performance gains, and it is not so much the
    width of branching as the depth of each branch: we iterate on
    that array to narrow ranges and to update positions, and each
    tiny update extends that trail of reverse diffs (they are not even
    collapsed, so they can be longer than the array range) - so
    when we finally abandon a branch and need to return to any
    of the older versions of that array, we are going through that
    trail, linearly, to reconstruct that old version from the new one
    (doing an identity update before entering a branch should give
    us two independent copies, though we'd need to make that
    update not on the current version of the array..).
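
for illustration, the identity-update trick might look like the sketch
below (with the caveat just mentioned: to really decouple the copies,
the update must be made on the right version of the array):

import Data.Array.Diff

-- writing one element back to itself forces a fresh diff node, so the
-- copy we keep and the version the branch mutates no longer share one
-- mutation trail (sketch only)
forkCopy :: Ix i => DiffArray i e -> DiffArray i e
forkCopy a = a // [(lo, a ! lo)]
  where (lo, _) = bounds a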

3 Arrays: again, they are no universal performance cure, and
    their advantages depend on the usage patterns.

    tabling calls to size is an ideal example in favour: a single array
    construction vs frequent reads. in contrast, a sequence of
    frequent small updates, as in constraint propagation, seems to
    be rather disadvantageous: every time we narrow a range or
    fix a position, the whole array gets copied! on the other hand,
    those arrays are small, and the cost of copying may be lost in
    other costs. but if we have to go through many elements,
    the cost can be reduced by doing updates in bulk: a _list_ of
    updates, with only one new copy being created, incrementally (see the
    sketch after this item).

    there is, however, one part of the problem where I'd expect
    arrays to be advantageous, and that is in generating those lists
    of updates: here we only inspect the sudoku matrix, reading
    its elements according to various more or less complex views.
    this part should be efficiently handled over an array, and as
    your code shows, doing so may even result in fairly readable
    code (although the clarity of my own code has improved as
    well since the time when it used to be very concerned with
    constructing those several views in rather complex ways;-).
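
a sketch of both halves of this item - view reads that only inspect the
array, and bulk updates via (//) - assuming a 0..80 row-major board of
candidate bitsets (an illustrative representation, not the code of
either solver):

import Data.Array

type Pos   = Int               -- 0..80, row-major
type Board = Array Pos Int     -- cell -> 9-bit candidate set

-- reading a view only inspects the array:
row, col, box :: Int -> [Pos]
row r = [ 9 * r + c | c <- [0 .. 8] ]
col c = [ 9 * r + c | r <- [0 .. 8] ]
box b = [ 9 * (3 * (b `div` 3) + dr) + 3 * (b `mod` 3) + dc
        | dr <- [0 .. 2], dc <- [0 .. 2] ]

-- a whole list of narrowings costs only one incremental copy:
applyUpdates :: Board -> [(Pos, Int)] -> Board
applyUpdates = (//)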

> I believe if you change the representation of puzzles from [(pos,range)]
> to an Array, you'll get a significant speedup

since my solver's logic now seems to be on par, I might try that
for the generation of constraints, but I still doubt it is ideal for the
actual updating of the matrix - perhaps bulk updates might be
sufficient to remedy that.

but the important lesson is that one can leave non-algorithmic
performance concerns to the very end (where they need to be
guided by profiling, and limited to the actual "inner loops") - I
got my solver fairly fast without ever worrying about specialised
representations, or strictness, etc.; focussing on algorithmic
aspects was sufficient and necessary, however:

the naive generate-and-test is impractical, and while simple
constraint propagation will lead to drastic speedups, those are
not sufficient (they still left me in the 10,000 puzzles/hour stage),
so more careful propagation is needed.

my first narrowing propagation sped up the difficult puzzles, but
slowed down the simple ones (and there are a lot of those in this
set); I had already started on complex heuristics to apply that
propagation selectively before I recalled my own good principles
and looked at the algorithm again, as that slowdown should not
have happened in the first place.

and in fact, it turned out to be related to one of those optimizations
we are all too likely to throw in automatically, without much thought:
to avoid stumbling over the fixed positions as singleton ranges
again and again, I had disabled that propagation, and had instead
left it to each of the other propagation steps to note the appearance
of new singletons, while ignoring the old ones.

so, when my new narrowing step narrowed the range on some
position to a singleton, without notifying anyone, that singleton
got duly ignored by everyone else, assuming that it had already
been taken care of. result: longer searches, in spite of more
information!-( rectifying that, and simplifying some structures,
got the time for the whole set down to the 17min I reported.

on the other hand, one cannot always ignore performance completely,
relying only on the algorithm: but usage patterns and profiling of actual
code are the key here. when I finally added my group inference, it did
slice down the number of guesses for the previously difficult puzzles,
but the actual runtime for the whole set of puzzles increased by a few
minutes! profiling showed some rather extravagantly complex code
("clever" in the worst sense!-) in the innermost loop, cured by
simplifying (and clarifying) the code in question. the next step was
moving to EnumSet, followed by the size/foldBits issue, cured by
Bulat's tabling suggestion. now we're down to 8min20s or so, still
with lists for backtracking and for the main matrix. (compared to
2min40s or so for yours)

> Rot! Typo in _my_ previous message, 5319 is correct.

I changed the output format to a line per puzzle (solution/number
and list of guesses - I still like that metric as each guess means that
we've run out of ideas, which suggests that we want to get the number
of guesses down to zero), using the same format for both solvers to
allow for easy grep/diff.
 
>> > puzzles involving guesses: 8165
>> > largest number of guesses:
>> >     10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)

down to 5319 and 0/0/3/0 now.

> We use different guessing strategies (plus, I also have group-inference).

yep, that convinced me that group inference is a must (now added).

as for guessing strategies, you're sorting the non-fixed positions by size
of range before extracting candidates for guesses, whereas I simply take
them in the order they appear in the puzzle (which may have been
somewhat hard to see thanks to another one of those overcomplicated
pieces of code that has now disappeared..). I had thought about that, but
couldn't make up my mind about preferring any order, and while the
numbers of guesses for our two solvers vary wildly between puzzles,
the total of guesses shows no advantage of sorting (the total looks
slightly smaller for my solver, it has no two-digit guesses, and fewer
puzzles with more than 6 guesses, so if anything, the evidence is against
sorting, but I'd rather focus on reducing the guesses than try to attach
any significance to these variations..).

if I eliminate the sorting from your solver, both solvers deliver the same
results, so that's nice!-) but it also means that we have room for further
improvement.

one reason why I got interested in this problem in the first place was
that I had been reading about constraint propagation, and wanted to
get some hands-on experience. in particular, there is a technique
called "generalized propagation":
   
    Generalised Constraint Propagation Over the CLP Scheme,
    by Thierry Le Provost and Mark Wallace. Journal of Logic
    Programming 16, July 1993. Also [ECRC-92-1]
    http://eclipse.crosscoreop.com:8080/eclipse/reports/ecrc_genprop.ps.gz

if I may oversimplify things a bit:

1 split inferences into branching (search) and non-branching (propagation)
2 distribute common information from non-branching inferences
    out of branches
3 to enhance applicability of step 2, add approximation of information

applied to our sudoku problem, we have deterministic propagation
and branching search, and what we do if we run out of propagation
opportunities is to select a non-fixed position, then branch on each
of the numbers in its range. adding generalised propagation (as I
understand it) amounts to a form of lookahead: we take each of
those branches, but look only at the deterministic information we
can extract in each (ignoring further search). then we check whether
the branches have any of that information in common, and apply any
such common information to our matrix. rinse and repeat. an obvious
way to refine this process by approximation is to build the union of
the ranges on each position for the matrices resulting from all the
branches.
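
to make that concrete, a rough sketch of single-level lookahead -
illustrative, not my actual code - assuming the board representation
from the sketches above and a solver-supplied
propagate :: Board -> Maybe Board that performs only the deterministic
inferences (Nothing marking a contradiction):

import Data.Array
import Data.Bits (bit, (.|.))
import Data.Maybe (mapMaybe)

type Pos   = Int               -- 0..80, row-major
type Board = Array Pos Int     -- cell -> 9-bit candidate set

-- branch on each candidate digit at position p, propagate
-- deterministically in each branch, and keep, per position, the union
-- of the surviving boards' candidate sets: any candidate outside that
-- union fails in every branch and can be removed from the real board
lookaheadUnion :: (Board -> Maybe Board) -> Board -> Pos -> [Int]
               -> Maybe Board
lookaheadUnion propagate board p digits =
  case mapMaybe (\d -> propagate (board // [(p, bit (d - 1))])) digits of
    []       -> Nothing        -- every branch failed: contradiction
    branches -> Just (listArray (bounds board)
                        [ foldr1 (.|.) [ b ! i | b <- branches ]
                        | i <- indices board ])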

using this idea with a single level of branch lookahead, and selecting
a position with the widest range for inspection, reduces the number
of puzzles requiring guesses to 373, with a maximum depth of 3
guesses.

it doesn't reduce runtime much, yet (7min50s), but I haven't
even started profiling, either. I guess I have to look into
representations now, if I ever want my solver to catch up with
yours in terms of runtime;-)

cheers,
claus


Re: Code Review: Sudoku solver

Daniel Fischer-4
Hi Claus,

On Monday, 10 April 2006 10:11, you wrote:

> >> Mine's even faster now (at least on my computer, would you care to test
> >> it on yours? If you don't want to get EnumSet, just change DiffArray to
> >> Array, worked wonders for me), I'll dig into yours tomorrow to see what
> >> I can get out of it to improve my algorithm.
> >
> > Unfortunately, no new inference rules :-(
>
> indeed, the crucial difference between our versions was that I'd avoided
> group inference; I thought it would be expensive to implement and, based
> on my own limited experience, rarely used - boy, was I wrong on both
> counts: it is easily implemented and, with a little care for the inner loop,
> not too inefficient, either, and the examples I mentioned (on which my
> solver needed guesswork) simply fall apart once grouping is added (it was
> quite impressive to take the output of my old propagation code and apply
> grouping by hand!).

Just to be sure, with group inference added, your solver must guess for
exactly the same puzzles as mine?

>
> now my solver takes slightly fewer guesses than yours, though yours
> is still significantly faster (same factor of about 3 between them, both
> having improved by a factor of 2 or so..).
>
> > Two things I don't like about your code:
> > 1. no type declarations
>
> that's what type inference is for!-) I tend to add them only as debugging
> assertions, when language limitations require them, or when I'm fairly
> certain that the code is stable (i.e., dead;-). it is useful to think about
> types, but engraving them into the code is overrated (any good ide should

My IDE is kate/kwrite + ghci + hugs; if you know a better one (which doesn't
involve downloading a 99MB thing like eclipse), I'd be interested to try it
out.

> be able to show them as if they'd been written in anyway..; lacking such
> ide,
>
> :browse Main will do the trick, though ghci's current formatting is ugly!).

Well, I didn't :b Main and add the signatures before I printed it out, and
since I read it in bed, I had to figure them out myself.
And as somebody else somewhere stated, a type signature is about the most
useful piece of documentation.
But I concede that there are also good reasons for leaving them out
(especially in the early development stage).

> > 2. too much name shadowing, that makes following the code difficult
>
> depends on which parts you mean - I don't like overusing primed or

guesses, candidates and the where clause of findNarrowPos

> numbered variables, and using the same variable name through a
> sequence of steps is usually the last step before factoring that variable
> out into some monad anyway (I did mention that the code was still
> ad-hoc, not cleaned up).

Yes, and if you hadn't, I would have been annoyed by it (after all, I'm the
only one who may write unreadable code :-)

>
> > apart from that: clever
>
> as in: you'd prefer it if I'd express myself more clearly?-) agreed.

That, too - don't we all want everybody else to express themselves
with immaculate lucidity?
But mostly: clever, as in: clever

> I've been working on cleaning up the code (also trying to remove
> stupidities as they become exposed in the process..).
>

I'd like to see it, when you consider it fit for posting.

> >> lists and adhoc code tend to be performance killers, I doubled the speed
> >> of mine by de-adhoccing the code (and that although I introduced the
> >> speed-killer DiffArray)
>
> there's nothing wrong with lists, just as arrays are not synonymous with
> good performance - it all depends on the usage patterns. there are several

Of course, and I love lists! I meant that lists are slow in situations like
these, where you repeatedly access elements at various positions, as in
findSingletonPos and findNarrowPos, where you have to traverse the whole list
for each row/column/block. Such selections are much faster done over an array
(I believe; and imagining a 1024 by 1024 sudoku, I'm rather confident).

> interesting examples in this problem:
>
> 1 when I first tried EnumSet for the ranges, there was no substantial
>     performance difference to a list-based representation.
>
>     in theory, the change sounds like a clear win (constant vs linear time
>     membership test), but (a) the ranges are small (they start at size 9,
>     and we do our best to shrink them rapidly, so even if they were at
>     average size 6 or 7, the test would take an average of 3 or 4 steps);
>     (b) membership is not the only operation - for the inner loops in my
>     code, singleton testing and extraction of the single element from a
>     singleton were equally important, and here the relation was reversed:
>     lists give us those operations in constant time while they were linear
>     in the old EnumSet.

Yes, and for those reasons I didn't expect much difference from changing to
EnumSet, but in the group inference, I form a lot of unions (and before I
cleaned up my code, also many differences) which I expected to be
significantly faster with EnumSet, so I gave it a try and it was faster even
before Bulat's optimization. After that, using EnumSet is about twice as fast
as using lists.

>
>     Bulat's tabling suggestion (which I interpreted rather more extremely
>     by tabling the actual calls to size and range2list:-) solves this issue

Err, I'm not sure what you mean. Did you also build a table

ranges :: Array Word [Int]
ranges = listArray (l, u) $ map toList' [l .. u]

? Interesting idea, I should try that.

I did - it lost me 10 seconds, so that's not the way.

>     beautifully, making EnumSet start to look competitive in spite of
>     the particular usage pattern in this problem, and when the grouping
>     inference asked for more calls to size, as well as intersection and
>     union, the choice flipped completely, now in favour of EnumSet
>     (has agreement been reached about an updated version of EnumSet?).
>
>     [if I might make a suggestion for EnumSet: it is not just that the
>      implementation of the standard API can be more efficient for these
>      small sets, but there are additional operations that ought to be
>      included (unless the EnumSet.Set constructor gets exported),
>      such as lifting the Ix instance from Word to Set a]
>
> 2 DiffArrays: they are neither a speed killer nor a universal cure.

I was talking about my program, where it was a speed killer.
I'm well aware of the fact that they may be _much_ faster than ordinary
arrays, and I hoped that they were here, but they weren't.
Today, I wrote a version using IOArray, hoping to get incredibly fast in-place
updating, and explicitly making copies via freeze and thaw when guessing is
required, but no luck (maybe I just don't know how to do it properly): it spent
85% of the time in GC (18.3% with -H64M -A32M), and MUT time is about 15% more
than with plain Array and EnumSet.
If I had even the vaguest idea how I could provide an
instance MArray IOUArray Entry IO
(or such with Entry replaced by (Int, Set Int) or even
(# Int, Word #)), I would give that a try.
I don't understand it: I would have thought that when I call newArray, an
array of the appropriate size is created somewhere on the heap (or wherever)
and stays there until no longer referenced, and that writeArray would just
change the contents of the relevant memory cell and not touch the rest, so
that should be fast and allocate less than plain Array, but it seems that
isn't so :-(

Now I've changed writeArray to unsafeWrite (analogously for read); that
brought MUT time below the plain Array version, but it still spends a lot
of time gc'ing (16.4% with -H64M -A64M, 10.2% with -H64M -A64M; that brought
total time below plain Array, but had a maximum residency of
74,252,696 bytes, too much for my liking - plain Array has a maximum residency
of several K)
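
for reference, a minimal sketch of the freeze/thaw scheme described
above (illustrative only - not the actual code, and the Int elements
stand in for whatever the cells really contain):

import Data.Array
import Data.Array.IO

-- snapshot the board before a guess; on failure, write the saved
-- contents back, so the branch may mutate the board in place
guessWithSnapshot :: IOArray Int Int -> (IOArray Int Int -> IO Bool) -> IO Bool
guessWithSnapshot board branch = do
  saved <- freeze board :: IO (Array Int Int)   -- explicit immutable copy
  ok    <- branch board                         -- in-place updates
  if ok
    then return True
    else do                                     -- roll the writes back
      sequence_ [ writeArray board i e | (i, e) <- assocs saved ]
      return False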

>
>     they were intended to enable functional arrays with inplace
>     update for sequential access, but without the complex analyses
>     needed to ensure such sequential access, and with the reverse
>     diff to old versions preserving the persistence of copies.
>
>     using them, we'll get inplace update plus a long trail recording
>     the diffs needed to reconstruct the old versions, and if we use
>     them single-threadedly, that trail will just be ignored and the
>     garbage collector will neglect to copy it at some point. but if
>     we combine them with backtracking, those trails are likely to
>     kill much of the performance gains, and it is not so much the
>     width of branching as the depth of each branch: we iterate on
>     that array to narrow ranges and to update positions, and each
>     tiny update extends that trail of reverse diffs (they are not even
>     collapsed, so they can be longer than the array range) - so
>     when we finally abandon a branch and need to return to any
>     of the older versions of that array, we are going through that
>     trail, linearly, to reconstruct that old version from the new one
>     (doing an identity update before entering a branch should give
>     us two independent copies, though we'd need to make that
>     update not on the current version of the array..).

I tried to achieve that, but failed.

>
> 3 Arrays: again, they are no universal performance cure, and
>     their advantages depend on the usage patterns.
>
>     tabling calls to size is an ideal example in favour: a single array
>     construction vs frequent reads. in contrast, a sequence of
>     frequent small updates, as in constraint propagation, seems to
>     be rather disadvantageous: every time we narrow a range or
>     fix a position, the whole array gets copied! on the other hand,
>     those arrays are small, and the cost of copying may be lost in
>     other costs. but if we have to go through many elements,
>     the cost can be reduced by doing updates in bulk: a _list_ of
>     updates, with only one new copy being created, incrementally.
>
>     there is, however, one part of the problem where I'd expect
>     arrays to be advantageous, and that is in generating those lists
>     of updates: here we only inspect the sudoku matrix, reading
>     its elements according to various more or less complex views.
>     this part should be efficiently handled over an array, and as
>     your code shows, doing so may even result in fairly readable
>     code (although the clarity of my own code has improved as
>     well since the time when it used to be very concerned with
>     constructing those several views in rather complex ways;-).
>
> > I believe if you change the representation of puzzles from [(pos,range)]
> > to an Array, you'll get a significant speedup
>
> since my solver's logic now seems to be on par, I might try that
> for the generation of constraints, but I still doubt it is ideal for the
> actual updating of the matrix - perhaps bulk updates might be
> sufficient to remedy that.
>
> but the important lesson is that one can leave non-algorithmic
> performance concerns to the very end (where they need to be
> guided by profiling, and limited to the actual "inner loops") - I
> got my solver fairly fast without ever worrying about specialised
> representations, or strictness, etc.; focussing on algorithmic
> aspects was sufficient and necessary, however:

agree

>
> the naive generate-and-test is impractical, and while simple
> constraint propagation will lead to drastic speedups, those are
> not sufficient (they still left me in the 10,000 puzzles/hour stage),
> so more careful propagation is needed.
>
> my first narrowing propagation sped up the difficult puzzles, but
> slowed down the simple ones (and there are a lot of those in this
> set); I had already started on complex heuristics to apply that
> propagation selectively before I recalled my own good principles
> and looked at the algorithm again, as that slowdown should not
> have happened in the first place.
>
> and in fact, it turned out to be related to one of those optimizations
> we are all too likely to throw in automatically, without much thought:
> to avoid stumbling over the fixed positions as singleton ranges
> again and again, I had disabled that propagation, and had instead
> left it to each of the other propagation steps to note the appearance
> of new singletons, while ignoring the old ones.
>
> so, when my new narrowing step narrowed the range on some
> position to a singleton, without notifying anyone, that singleton
> got duly ignored by everyone else, assuming that it had already
> been taken care of. result: longer searches, in spite of more
> information!-( rectifying that, and simplifying some structures,
> got the time for the whole set down to the 17min I reported.
>
> on the other hand, one cannot always ignore performance completely,
> relying only on the algorithm: but usage patterns and profiling of actual
> code are the key here. when I finally added my group inference, it did
> slice down the number of guesses for the previously difficult puzzles,
> but the actual runtime for the whole set of puzzles increased by a few
> minutes! profiling showed some rather extravagantly complex code
> ("clever" in the worst sense!-) in the innermost loop, cured by
> simplifying (and clarifying) the code in question. the next step was
> moving to EnumSet, followed by the size/foldBits issue, cured by
> Bulat's tabling suggestion. now we're down to 8min20s or so, still
> with lists for backtracking and for the main matrix. (compared to
> 2min40s or so for yours)
>
> > Rot! Typo in _my_ previous message, 5319 is correct.
>
> I changed the output format to a line per puzzle (solution/number
> and list of guesses - I still like that metric as each guess means that
> we've run out of ideas, which suggests that we want to get the number
> of guesses down to zero), using the same format for both solvers to

and if we get down to zero, all's fine, but I think we shouldn't ignore the
false guesses.

> allow for easy grep/diff.
>
> >> > puzzles involving guesses: 8165
> >> > largest number of guesses:
> >> >     10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)
>
> down to 5319 and 0/0/3/0 now.
>
> > We use different guessing strategies (plus, I also have group-inference).
>
> yep, that convinced me that group inference is a must (now added).
>
> as for guessing strategies, you're sorting the non-fixed positions by size
> of range before extracting candidates for guesses, whereas I simply take
> them in the order they appear in the puzzle (which may have been
> somewhat hard to see thanks to another one of those overcomplicated
> pieces of code that has now disappeared..). I had thought about that, but

I have also tried guessing on the first blank; that increased time by about
12%, so until I find a better criterion, I'll stay with fewest possibilities.
The (simple) idea behind that choice is: if I have only two possibilities, I
have a 50% chance of being right on the first attempt. That reasoning of
course was based on the 'find one solution and stop' target; for determining
all solutions, i.e. following all branches, I'm not so sure I could justify
it.
However, I've also tried the opposite (just to compare), guessing at a
position with the maximum number of choices: MUCH slower.
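
a sketch of the 'fewest possibilities' choice (illustrative: the
(pos, range) pairing is an assumption, not either solver's actual
representation):

import Data.List (minimumBy)
import Data.Ord (comparing)

-- pick an open cell with the fewest remaining candidates (assumes at
-- least one open cell); with a range of two, the first guess is right
-- half the time
pickGuess :: [(pos, [Int])] -> (pos, [Int])
pickGuess = minimumBy (comparing (length . snd))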

> couldn't make up my mind about preferring any order, and while the
> numbers of guesses for our two solvers vary wildly between puzzles,
> the total of guesses shows no advantage of sorting (the total looks
> slightly smaller for my solver, it has no two-digit guesses, and fewer
> puzzles with more than 6 guesses, so if anything, the evidence is against
> sorting, but I'd rather focus on reducing the guesses than try to attach
> any significance to these variations..).

Oh, absolutely, avoiding guesses is the best.

>
> if I eliminate the sorting from your solver, both solvers deliver the same
> results, so that's nice!-) but it also means that we have room for further
> improvement.
>
> one reason why I got interested in this problem in the first place was
> that I had been reading about constraint propagation, and wanted to
> get some hands-on experience. in particular, there is a technique
> called "generalized propagation":
>
>     Generalised Constraint Propagation Over the CLP Scheme,
>     by Thierry Le Provost and Mark Wallace. Journal of Logic
>     Programming 16, July 1993. Also [ECRC-92-1]
>     http://eclipse.crosscoreop.com:8080/eclipse/reports/ecrc_genprop.ps.gz
>
> if I may oversimplify things a bit:
>
> 1 split inferences into branching (search) and non-branching (propagation)
> 2 distribute common information from non-branching inferences
>     out of branches
> 3 to enhance applicability of step 2, add approximation of information
>
> applied to our sudoku problem, we have deterministic propagation
> and branching search, and what we do if we run out of propagation
> opportunities is to select a non-fixed position, then branch on each
> of the numbers in its range. adding generalised propagation (as I
> understand it) amounts to a form of lookahead: we take each of
> those branches, but look only at the deterministic information we
> can extract in each (ignoring further search). then we check whether
> the branches have any of that information in common, and apply any
> such common information to our matrix. rinse and repeat. an obvious
> way to refine this process by approximation is to build the union of
> the ranges on each position for the matrices resulting from all the
> branches.
>
> using this idea with a single level of branch lookahead, and selecting
> a position with the widest range for inspection, reduces the number
> of puzzles requiring guesses to 373, with a maximum depth of 3
> guesses.

But this method, if I'm not grossly mistaken, does involve guessing - refined,
educated guessing, but guessing still. On the other hand, one might correctly
state that even the wildest guess-and-check is logic (proof by contradiction:
if I put values v_1 to v_n at positions p_1 to p_n respectively and then
value v_(n+1) at position p_(n+1), I eventually can't satisfy the constraints;
hence, given v_1 ... v_n at p_1 ... p_n, the value at p_(n+1) cannot be
v_(n+1) ...)

>
> it doesn't reduce runtime much, yet (7min50s), but I haven't
> even started profiling, either. I guess I have to look into
> representations now, if I ever want my solver to catch up with
> yours in terms of runtime;-)
>
> cheers,
> claus

Cheers,
Daniel

--

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
        -- Blair P. Houghton


Re: Code Review: Sudoku solver

haskell-2
Daniel Fischer wrote:
> My IDE is kate/kwrite + ghci + hugs; if you know a better one (which doesn't
> involve downloading a 99MB thing like eclipse), I'd be interested to try it
> out.

(I already knew emacs, so using haskell-mode is my choice)

>> I changed the output format to a line per puzzle (solution/number
>> and list of guesses - I still like that metric as each guess means that
>> we've run out of ideas, which suggests that we want to get the number
>> of guesses down to zero), using the same format for both solvers to
>
> and if we get down to zero, all's fine, but I think we shouldn't ignore the
> false guesses.

As a metric, you need to include all the blind alleys.

>> as for guessing strategies, you're sorting the non-fixed positions by size
>> of range before extracting candidates for guesses, whereas I simply take
>> them in the order they appear in the puzzle (which may have been
>> somewhat hard to see thanks to another one of those overcomplicated
>> pieces of code that has now disappeared..). I had thought about that, but
>
> I have also tried guessing on the first blank; that increased time by about
> 12%, so until I find a better criterion, I'll stay with fewest possibilities.
> The (simple) idea behind that choice is: if I have only two possibilities, I
> have a 50% chance of being right on the first attempt. That reasoning of
> course was based on the 'find one solution and stop' target; for determining
> all solutions, i.e. following all branches, I'm not so sure I could justify
> it.
> However, I've also tried the opposite (just to compare), guessing at a
> position with the maximum number of choices: MUCH slower.

The *only* "optimization" in Knuth's dancing-links binary-code-problem-solver is
that it always chooses the constraint with the minimum number of possibilities.
 (Sometimes there is only one possibility).

>> if I eliminate the sorting from your solver, both solvers deliver the same
>> results, so that's nice!-) but it also means that we have room for further
>> improvement.
>>
>> one reason why I got interested in this problem in the first place was
>> that I had been reading about constraint propagation, and wanted to
>> get some hands-on experience. in particular, there is a technique
>> called "generalized propagation":
>>
>>     Generalised Constraint Propagation Over the CLP Scheme,
>>     by Thierry Le Provost and Mark Wallace. Journal of Logic
>>     Programming 16, July 1993. Also [ECRC-92-1]
>>     http://eclipse.crosscoreop.com:8080/eclipse/reports/ecrc_genprop.ps.gz
>>
>> if I may oversimplify things a bit:
>>
>> 1 split inferences into branching (search) and non-branching (propagation)
>> 2 distribute common information from non-branching inferences
>>     out of branches
>> 3 to enhance applicability of step 2, add approximation of information
>>
>> applied to our sudoku problem, we have deterministic propagation
>> and branching search, and what we do if we run out of propagation
>> opportunities is to select a non-fixed position, then branch on each
>> of the numbers in its range. adding generalised propagation (as I
>> understand it) amounts to a form of lookahead: we take each of
>> those branches, but look only at the deterministic information we
>> can extract in each (ignoring further search). then we check whether
>> the branches have any of that information in common, and apply any
>> such common information to our matrix. rinse and repeat. an obvious
>> way to refine this process by approximation is to build the union of
>> the ranges on each position for the matrices resulting from all the
>> branches.
>>
>> using this idea with a single level of branch lookahead, and selecting
>> a position with the widest range for inspection, reduces the number
>> of puzzles requiring guesses to 373, with a maximum depth of 3
>> guesses.

373 out of the list of 36,628? Impressive.

> But this method, if I'm not grossly mistaken, does involve guessing - refined,
> educated guessing, but guessing still. On the other hand, one might correctly
> state that even the wildest guess-and-check is logic (proof by contradiction:
> if I put values v_1 to v_n at positions p_1 to p_n respectively and then
> value v_(n+1) at position p_(n+1), I eventually can't satisfy the constraints;
> hence, given v_1 ... v_n at p_1 ... p_n, the value at p_(n+1) cannot be
> v_(n+1) ...)

There is no clear definition that separates logic and guessing, because the
algorithm does not understand your semantics.  I would say the semantics of
"lookahead 1 step" are different from guessing because the amount of computation
involved is predictable ahead of time: you can look at the current possibilities
and know how much work goes into "lookahead 1 step" but you can't look at the
current state and know how much work a brute force search will take.

>
>> it doesn't reduce runtime much, yet (7min50s), but I haven't
>> even started profiling, either. I guess I have to look into
>> representations now, if I ever want my solver to catch up with
>> yours in terms of runtime;-)
>>
>> cheers,
>> claus
>
> Cheers,
> Daniel
>


Re[2]: Code Review: Sudoku solver

Bulat Ziganshin-2
In reply to this post by Daniel Fischer-4
Hello Daniel,

Tuesday, April 11, 2006, 3:26:06 AM, you wrote:

> Today, I wrote a version using IOArray, hoping to get incredibly fast in-place
> updating, and explicitly making copies via freeze and thaw when guessing is
> required, but no luck (maybe I just don't know how to do it properly), spent

the Arrays implementation does the same; for example (from GHC's Data.Array.Base):

class HasBounds a => IArray a e where
    unsafeReplace arr ies = runST (unsafeReplaceST arr ies >>= unsafeFreeze)

unsafeReplaceST :: (IArray a e, Ix i) => a i e -> [(Int, e)] -> ST s (STArray s i e)
unsafeReplaceST arr ies = do
    marr <- thaw arr
    sequence_ [unsafeWrite marr i e | (i, e) <- ies]
    return marr

maybe you should try using unsafeFreeze or unsafeThaw? they differ
from the 'safe' analogues in that they don't take a copy of the array, but
just modify the appropriate flag in the array header

> 85% of the time in GC, 18.3% with -H64M -A32M, MUT time is about 15% more
> than plain Array and EnumSet.
> If I had even the vaguest idea how I could provide an
> instance MArray IOUArray Entry IO
> (or such with Entry replaced by (Int, Set Int) or even
> (# Int, Word #)), I would give that a try.

on a 32-bit CPU you can just use "IOUArray Int Word64" :)

> Now I've changed writeArray to unsafeWrite (analogously for read), that

have you used `unsafeAt` for Arrays?

> brought MUT time to less than the plain Array version, but still spends a lot
> of time gc'ing (16.4% with -H64M -A64M, 10.2% with -H64M -A64M, that brought
> total time to less than plain Array, but had a maximum residency of
> 74,252,696 bytes, too much for my liking, plain Array has maximum residency
> of several K)

if you use "-H1G", the residency will be even larger ;)

the real problem is IOArray - its usage raises GC times significantly
in ghc 6.4; ghc 6.6 solved this problem. anyway, using an unboxed
array will be much faster. because you use only 9 bits for the
enumeration, and I think even less for the Int, it's no problem for you to
use "UArray Int Int" and manually split each extracted "Int" into two parts
- the Int and the Set. the other custom solution is mangling indexes - for
example "2*i" for the Int and "2*i+1" for the Set, or "i" for the Int and
"i+10" for the Set - of course, in these cases you should allocate a large
enough array
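
a sketch of that index-mangling variant (illustrative: the 2*i/2*i+1
layout and the names are assumptions):

import Data.Array.Unboxed

-- one flat unboxed array for 81 cells: the fixed value at 2*i,
-- the 9-bit candidate set at 2*i+1
type Packed = UArray Int Int

valueAt, candsAt :: Packed -> Int -> Int
valueAt a i = a ! (2 * i)
candsAt a i = a ! (2 * i + 1)

emptyBoard :: Packed
emptyBoard = listArray (0, 161) (concat (replicate 81 [0, 511]))
  -- value 0 = not yet fixed; candidate set 511 = all nine digits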


>I don't understand it: I would have thought that when I call
>newArray, an array of the appropriate size is created somewhere on the
>heap (or wherever) and stays there until no longer referenced, and that
>writeArray would just change the contents of the relevant memory cell
>and not touch the rest, so that should be fast and allocate less than
>plain Array, but it seems that isn't so :-(

you are right. IOArray raises GC times not because it allocates more
but because on each GC (even minor ones, which happen after each
256kb allocated with the default -A/-H settings) ALL IOArrays must be
scanned, because pointers to newly allocated parts of the heap may have
been written into these arrays (after all, these arrays are mutable and
can contain references to other data). the solution is either using
UNBOXED arrays, which can't contain references and are therefore not
scanned, or immutable arrays, which again are not scanned because they
cannot be changed. ghc 6.6 just saves the list of all arrays/references
that were mutated since the last GC. the ultimate problem is that the
2-stage GC mechanism is optimized for immutable functional
data structures - not IOArrays

--
Best regards,
 Bulat                            mailto:[hidden email]


Re: Code Review: Sudoku solver

haskell-2
In reply to this post by Daniel Fischer-4
> I believe if you change the representation of puzzles from [(pos,range)]
> to an Array, you'll get a significant speedup
>>> yet because I only recently removed a logic bug that slowed down the
>>> search instead of speeding it up; ..). so the more interesting bit is
>>> that our solvers disagree on which are the most difficult puzzles
>>> (requiring the largest number of guesses):
>>>
>>> df
>>> puzzles involving guesses: 5319
>> If that's not a typo, I'm baffled. My original needed to guess in 5309
>
> Rot! Typo in _my_ previous message, 5319 is correct.
>

After posting my cleaned up dancing links solver, I went back to my logical
solver.  I sent the 36,628-line sudoku17 puzzle file through it and it could
solve 31,322 of the puzzles, leaving 5,306 resistant.  I haven't checked my
algorithms against your code, but it does not seem like I do anything
spectacularly more clever.

I don't have any time to clean up my code, but I may try turning off some steps
to see if one of them gives me the same 5319 number you are seeing.

--
Chris Kuklewicz