This week’s problem set requires us to implement a spell-checker. The usage is
speller [dictionary] text, where speller is our program, dictionary is optional, text is a file that contains the paragraphs that need to be spell-checked. If user does not specify a dictionary, a default dictionary will be used.
The main body of the program will be in
speller.c had been written by the CS50 staff, we just have to implement functions that is called by main function of
speller.c. These functions are in
dictionary.h contains declaration of these functions. Do not edit
load function is called to load dictionary into heap. After that, the text file containing paragraphs of words to be spell-checked is opened. Then, main continue gets character from text file until end of file. When a whole word is met,
check function is called to check if the word is in dictionary. If it is not, the misspelled word is printed. Lastly, after all the words are spell-checked, the memory in the heap is freed using
There are some codes implemented by the CS50 staff to record the time taken for each function to run. Besides the misspelled words, you should see the following information when you run your program:
WORDS MISSPELLED: 644
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 19190
TIME IN load: 0.17
TIME IN check: 0.03
TIME IN size: 0.00
TIME IN unload: 0.09
TIME IN TOTAL: 0.29
The four functions that you need to implement in dictionary.c are
unload. You can use either a hash table or trie to store your dictionary in heap. Optimise your algorithm to achieve the fastest running time.
Besides achieving fast running time, you also need to ensure no memory is leaked. Use Valgrind to check memory leaks.
I used a trie to store my dictionary. Trie is made up of nodes, each node has a structure:
// structure of node in trie
typedef struct node
struct node* node;
The isWord boolean is to tell if the node is letter of the end of a word. Each node has an array of 27 pointers (26 for alphabets, 1 for apostrophe), each pointer pointing to another node. Each dictionary word has a maximum length of 47, contains alphabet or apostrophe (apostrophe cannot be at the start of a word).
How do we insert a word into trie? We iterate over each character of the word, for each iteration, we check if struct node pointer of the array of node pointed by cursor pointer is NULL, if it is, we malloc some memory for it. If we arrive at the end of the word, we make isWord true. Cursor pointer is a pointer variable that we initialized to keep track of pointers when we go down the trie.
check returns true if word is in dictionary else false. We iterate over each character of a word to be checked. We go to corresponding element in children, if it is NULL, return false because the word is not in the trie. If it is not NULL, we move to the next character. Once we are at the end of the word, we check if isWord is true, if true, return true as the word is in trie.
size returns number of words in dictionary if loaded else 0 if not yet loaded. This function only needs one line.
unload unloads dictionary from memory. Returns true if successful else false. We have to free trie from leaves up to the root. I used a recursive function called
freeNode that frees the memory of node and every node allocated under it.
* Free the memory of node and every node allocated under it.
bool freeNode(struct node* node)
// iterate through all children of node
for (int i = 0; i < 27; i++)
// For each child, if that child is not null, then call freeNode() on that child.
if (node->node[i] != NULL)
// After the loop, free this node itself.
It is recursive because, when I pass in the pointer to root node to the function, it frees the root node’s children by iterating over each child to see if they are NULL, if any child node is not NULL, we proceed to call another
freeNode on the child to free the child’s children node. Lastly, we free the node itself.