Wednesday, April 9, 2008

WebWeekly: Creating an Online Boggle Solver :: Solving the Puzzle

4GuysFromRolla.com

Internet.com Network
Wednesday April 9, 2008

Creating an Online Boggle Solver :: Solving the Puzzle
By By Scott Mitchell

Introduction
My family enjoys playing games and one of our favorites is Boggle, an addictive word game where players attempt to find as many words in a 4x4 grid of letters. At the end of a game, players are left wondering whether there were any unearthed words. To answer this question once and for all, I created an online Boggle solver using ASP.NET version 3.5.

This article is the second installment in a two-part series. Last week's article, Building the User Interface, examined the Boggle solver web page's user interface, which consists of 16 TextBox Web controls arranged in a 4x4 grid and three Button Web controls for solving the user-entered puzzle, solving a randomly-generated puzzle, and clearing the board. A ListView control is used to display the solutions in a three-column HTML <table>. The user interface also included a handful of JavaScript functions to ease entering the board data.

This second and final installment details the code used to solve the puzzle. Solving the puzzle requires having a dictionary of legal words and objects that mirror the structure and functionality of the board and of solutions. These objects are implemented as classes that include internal data structures that use a number of features in the .NET Framework, including: Generics; automatic properties; and caching. The complete source code is available for download at the end of the article.

Try out the live demo or read on to learn more!

Have You Read Building the User Interface Yet?
If you have not yet read the first installment of this two-part series, Building the User Interface, please do so before tackling this article. The Building the User Interface article includes information about the rules of Boggle, design decisions, and other important topics.

Maintaining a Dictionary of Legal Words
For the Boggle solver to determine what combinations of letters constitute valid solutions it needs to maintain a dictionary of legal words. The dictionary used by the Boggle solver is implemented as a class named BoggleDictionary. The BoggleDictionary class is used by the Boggle solver in the following way: for each new combinations of letters, the solver asks the BoggleDictionary, "Is this combination of letters a legal word?" For example, consider the following Boggle board:

 r e i b t m f w i r a e r h s t 
 
The solver will try various legal combinations of letters, like "fari", "farh", "fars", and "farm". With each of these combinations, it asks the BoggleDictionary, "Is this combination of letters valid?" The BoggleDictionary, then, would reply in the affirmative for "farm" and in the negative for "fari", "farh", and "fars". Furthermore, the Boggle solver needs to ascertain from the BoggleDictionary whether it makes sense to keep probing for additional letters. For example, the letters "irat" is not a valid word, but if the Boggle solver gives up at this point it won't find the word "irate". However, the Boggle solver should not foolishly chase word combinations that have no hope of reaching a solution. That is, the letters "hrir" is not a valid word, nor are there any legal words that start with "hrir", so there's no need to keep probing for more solutions.

One of the challenges in implementing BoggleDictionary is to ensure that the words in the dictionary can be efficiently searched in the manner by which the Boggle solver requires. Because the dictionary will be very large (tens of thousands of words, for instance) and because there are many different ways for the solver to combine the tiles on the board, the efficiency with which the dictionary can be searched will dramatically impact the overall performance of the puzzle solver.

In the end, I had the BoggleDictionary use a Dictionary object with the first n letters of each valid word combination used as the key, where n is the minimum number of characters that must appear in a word for it to be considered a solution. (According to the official rules, any word with three or more letters is valid; I typically only count four-letter words as valid.) Each Dictionary key points to a list of words that start with the letters.

The Dictionary structure is illustrated by the figure below, which shows a subset of the Dictionary keys and their corresponding values. The picture below shows the Dictionary structure when searching for four-letter words. It depicts three keys: "fata", "fate", and "fath". Each key's value is a list of strings that start with the same value as the key. For example, the key "fata" has a list of strings as its value that includes words like "fatal", "fatally", "fatalistic", and others.

The BoggleDictionary structure.

Background on the Dictionary Data Structure
If you are not familiar with the Dictionary object, it is .NET's Generics-based implementation of a hashtable. A hashtable is a type of data structure that stores values indexed by some developer-defined key (in our case, a string); like an array, which indexes elements ordinally, a hashtable's values can be retrieved very efficiently when accessed by the key. For more information on hashtables and the Dictionary, see: An Extensive Examination of Data Structures Using C# 2.0: The Queue, Stack, and Hashtable.

Now that we've decided how to model the words in the BoggleDictionary class, we need to determine the allowable words that will populate the internal Dictionary structure. There are many websites that offer lists of English words in the form of a text file. I already had such a file on my computer from a previous project that lists words by the letters they contain, ordered alphabetically. Specifically, on each line the text file contains an entry of the following form:

alphabetical list of letters constituting the words|word1|word2|...|wordN

For example, the words "polyps" and "sloppy" contain the same combination of letters: one "o"; one "l"; two "p"s; and one "s". These letters, when arranged alphabetically, are "loppsy"; consequently, the following entry exists in the dictionary text file:

loppsy|polyps|sloppy

The dictionary text file is read in and its words are populated into the Dictionary object in the BoggleDictionary class's Load method. The Load method accepts a single string input parameter, dictionaryPath, which specifies the full phsyical path to the dictionary file. (Note that the Load method is marked as private; it's not meant to be called from outside the class. Rather, it's called from the BoggleDictionary's constructor.)

private void Load(string dictionaryPath) {
   // Read in the contents of dictionaryPath
   string[] entries = File.ReadAllLines(dictionaryPath);

   // Each entry has the form SORTED_LETTERS|Word1|Word2|...|WordN
   // Just grab the various words
   foreach (string entry in entries)
   {
      string[] words = entry.Split("|".ToCharArray());
      for (int wordIndex = 1; wordIndex < words.Length; wordIndex++)
      {
         // Only process words within the range of interest
         if (words[wordIndex].Length >= index_size && words[wordIndex].Length <= max_word_size)
         {
            List<string> DictTableEntryList = null;

            // Get the first index_size letters of the word
            string dictIndex = words[wordIndex].Substring(0, index_size);

            // See if we have a DictEntry table for these index_size letters
            if (!DictTable.ContainsKey(dictIndex))
            {
               // No entry, add it
               DictTableEntryList = new List<string>();
               DictTable.Add(dictIndex, DictTableEntryList);
            }
            else
            {
               // Entry exists, reference it
               DictTableEntryList = DictTable[dictIndex];
            }

            // Add the word to the DictTableEntry
            DictTableEntryList.Add(words[wordIndex]);
         }
      }
   }
}

The Load method starts by reading in the lines of the dictionaryPath text file into the entries string array. It then loops through each string in the array, splitting the line by the pipe character (|), which is what delimits each word. These "pieces" are loaded into the words string array using the string class's Split method.

Recall that the first entry on each line is the alphabetical arrangement of letters in the remaining words on the same line. Therefore, we don't want to process this first entry, only the subsequent ones. Hence, the loop that walks through the elements of the words starts at index 1 (instead of 0). For each word in words whose length is greater than or equal to index_size (which is the minimum number of letters needed in a word to be considered a solution) and less than or equal to max_word_size (which is the total number of tiles on the board and, therefore, the maximum length of any solution), then the item is added to the Dictionary object.

Before adding the word to the Dictionary object we must first make sure that the Dictionary object (DictTable) contains an appropriate key. For example, when adding the word "fatal", we must ensure that the Dictionary object has the key "fata". This is handled by the call to ContainsKey. If the key does not exist, the Dictionary object entry is added; otherwise, if it already exists, the list of string is referenced. In the end, the key is created (if needed) and the word is added to the Dictionary object.

From the Internet.com eBook Library: Navigating Your IT Career

A career in information technology usually has its share of
ups and downs. Download this Internet.com eBook to learn
where the jobs are in IT, how to negotiate a salary, and
helpful advice on job security and how to deal with a layoff.
Join Internet.com now to download!
http://www.devx.com/ebook/Link/34938

Interested in placing your TEXT AD HERE? Click Here
 
Caching the Dictionary
Every time a user solves a Boggle board, the solver must first load the dictionary. However, the dictionary information is likely to change infrequently, if at all. It is a prime candidate for caching. As discussed in .NET Data Caching and Caching with ASP.NET, ASP.NET offers a rich and robust API for caching objects in memory.

The BoggleDictionary includes a GetFromCache method that looks for a BoggleDictionary instance in the NET data cache. If it is not found, a new instance of the class is created and stored in the cache.

public static BoggleDictionary GetFromCache(string dictionaryPath, int minWordLen)
{
   string CacheKey = string.Concat("Boggle_Dictionary-", dictionaryPath, "-", minWordLen);

   // See if the dictionary is in the cache
   BoggleDictionary dict = HttpContext.Current.Cache[CacheKey] as BoggleDictionary;
   if (dict == null)
   {
      // Load dictionary
      dict = new BoggleDictionary(dictionaryPath, minWordLen);

      // Cache dictionary
      HttpContext.Current.Cache.Insert(CacheKey, dict, new CacheDependency(dictionaryPath));
   }

   return dict;
}

If the BoggleDictionary object is not found in the cache, it is inserted back into the cache via the call to Cache.Insert, which includes a cache depedency. A cache dependency is a mechanism by which to instruct the cache to evict an item if a dependent object is changed. In this case, we instruct the data cache to automatically remove the BoggleDictionary instance from the cache whenever the specified dictionary text file (dictionaryPath) is modified. That means that the cached BoggleDictionary object will automatically be "refreshed" if you modify the dictionary text file (such as by adding or removing words).

Modeling Solutions
During the puzzle solving process, every time a solution is found it needs to be remembered so that it may later be displayed in the web page. (It must also be remembered so that identical solutions are not returned. That is, with many board arrangements there are more than one way to find the same solution. We do not want to include these duplicates in the output.)

The solutions are modeled via the BoggleWord class, which has two properties:

  • Word - the actual solution
  • Score - the score for the solution. The score is based on the length of the words and is spelled out in the official Boggle rules.
The Word property demonstrates a new new feature of C# 3.0: automatic properties. Oftentimes, classes include very simple property statements that return and set a private member variable without performing any special logic or validation in the getter or setter. These simplistic properties have the following pattern:

private string _StringProperty;
public string StringProperty
{
   get
   {
      return _StringProperty;
   }
   set
   {
      _StringProperty = value;
   }
}

These types of properties can now be condensed into a much more terse syntax:

public string StringProperty
{
   get;
   set;
}

Or, if you prefer an uber-terse syntax:

public string StringProperty { get; set; }

Behind the scenes, the compiler automatically adds the private member variable and flushes out the getter and setter. For more on automatic properties, see C# 3.0 Features: Automatic Properties.

The BoggleWord class has two methods used in dispaying the solutions in the web page:

  • ToString - returns the solution in the format, "word (x points)", where word is the solution and x is the number of points for the solution.
  • CompareTo - used to sort the results alphabetically.
There's also a BoggleWordList class that holds a collection of BoggleWord instances. The BoggleWordList class also has a TotalScore property that returns the sum of its BoggleWord scores.

BoggleBoard - the Puzzle Solving Workhorse
The BoggleBoard represents a particular Boggle board configuration. We examined this class in the Building the User Interface article. Recall that the web page that collects the user's input solves the puzzle by first creating a BoggleBoard instance based on the values entered by the user in the 16 TextBox controls and then calling the BoggleBoard's Solve method. The Solve method returns a BoggleWordList that contains the solutions for the board (if any).

The BoggleBoard class's Solve method starts by retrieving the BoggleDictionary object in the data cache by calling BoggleDictionary.GetFromCache. It then loops through the 16 tiles in the board and, for each tile, calls the SolvePuzzle method, which finds solutions starting from the specified position and branching outwards recursively.

public BoggleWordList Solve() {
   // Load the dictionary
   string dictionaryPath = HttpContext.Current.Server.MapPath("~/App_Data/Dictionary.txt");
   BoggleDictionary dict = BoggleDictionary.GetFromCache(dictionaryPath, min_word_length);

   BoggleWordList solutions = new BoggleWordList();

   for (int rowPos = 0; rowPos < ROWS; rowPos++)
      for (int colPos = 0; colPos < COLS; colPos++)
         SolvePuzzle(dict, solutions, rowPos, colPos, string.Empty);

   return solutions;
}

The SolvePuzzle method accepts as inputs:

  • The BoggleDictionary object to use to determine whether a particular combination of letters is a valid word, and whether to keep searching.
  • The set of current solutions
  • The row and column position to use to get the next letter
  • The current combination of letters being constructed
The SolvePuzzle method starts by asking the BoggleDictionary object whether it's worthwhile to keep on searching given the current combination of letters. The BoggleDictionary class's KeepGoing method, which we won't examine in detail, returns a Boolean value indicating whether there are any words starting with the passed-in combination of letters. In a nutshell, the KeepGoing method always returns true if the length of the passed-in string is less than the number of letters that make up the Dictionary object's key. If the passed-in string is long enough, it actually looks up the combination in the Dictionary object to determine whether it makes sense for the solver to continue probing.

private void SolvePuzzle(BoggleDictionary dict, BoggleWordList solutions, int rowPos, int colPos, string current) {
   // Try to drill into to neighboring tiles (if it's worth the hassle)
   if (dict.KeepGoing(current)) {

If the BoggleDictionary object's KeepGoing method gives us a thumbs up, we then take the value at rowPos, colPos and tack it onto the existing combination of letters being tested. If the new combination of letters contains at least as many letters as are needed for a valid solution, the BoggleDictionary's Contains method is called to see if the solution is valid. If it is - and if the solution has not already been found, it's added to the list of solutions.

      string originalValue = board[rowPos, colPos];
      string newChain = current + originalValue;

      if (newChain.Length >= min_word_length)
      {
         // See if current is in the dictionary
         if (dict.Contains(newChain) && !solutions.Contains(newChain))
            solutions.Add(newChain);
      }

The next step is the most important: attempting to form new combinations with neighboring tiles. In short, we want to try to form new combinations with every one of the current tile's neighbors. This is accomplished with two loops to check all positions immediately adjacent to the current tile. A conditional statement is used to ensure that the neighbor being considered is within the board of play and that it has not already been used in this exploration. (According to Boggle's rules you cannot use a particular tile more than once when forming a solution.) Used tiles are marked by a tilde (~).

If the neighbor is valid - it's on the board of play and has not already been used in this particular solution attempt - then the SolvePuzzle method is called, passing in the new position and new combination of letters.

      for (int vert = -1; vert <= 1; vert++)
         for (int horiz = -1; horiz <= 1; horiz++)
         {
            if (rowPos + vert >= 0 && rowPos + vert < ROWS &&
               colPos + horiz >= 0 && colPos + horiz < COLS &&
               !(vert == 0 && horiz == 0) &&
               board[rowPos + vert, colPos + horiz] != "~")
            {
               board[rowPos, colPos] = "~";
               SolvePuzzle(dict, solutions, rowPos + vert, colPos + horiz, newChain);
               board[rowPos, colPos] = originalValue;
            }
         }
   }
}

And that's all there is to it! The entire SolvePuzzle method is a mere 29 lines of code (including whitespace and comments). This simplicity is possible thanks to the power of recursion.

Conclusion
In this article and the previous one we looked at building an online Boggle solver using ASP.NET version 3.5. The Boggle solver uses many interesting .NET Framework features to accomplish its goal. The BoggleDictionary uses Generics to create a strongly-typed Dictionary data structure for efficiently storing and querying the dictionary; the BoggleWord class demonstrates the use of automatic properties; and ASP.NET's data cache is used to cache a BoggleDictionary instance in memory so that it does not need to be recreated from the ground up every time the solver is used.

The complete code for this application is available for download at the end of this article. I also invite you to try out a live demo.

Happy Programming!

  • By Scott Mitchell


    Further Readings:

  • Building the User Interface
  • Boggle Wikipedia Entry
  • C# 3.0 Features: Automatic Properties
  • Introducing NET Generics
  • .NET Data Caching
  • An Extensive Examination of Data Structures Using C# 2.0
  • Attachments

  • Download the Demo (in ZIP format)
  • Live Demo




  • JupiterOnlineMedia

    internet.comearthweb.comDevx.commediabistro.comGraphics.com

    Search:

    Jupitermedia Corporation has two divisions: Jupiterimages and JupiterOnlineMedia

    Jupitermedia Corporate Info


    Legal Notices, Licensing, Reprints, & Permissions, Privacy Policy.

    Advertise | Newsletters | Tech Jobs | Shopping | E-mail Offers

    All newsletters are sent from the domain "internet.com." Please use this domain name (not the entire "from" address, which varies) when configuring e-mail or spam filter rules, if you use them.

    No comments: