I can’t say Speller is the most difficult Pset of Cs50 neither can I say it’s the easiest, though I haven’t check future pset 6 & 7, I hope there are going to be cool.
It took me 5 days before I could complete it, a lot of errors, mistakes, regrets, even I am too closed to quit from cs50 specifically.
With the Knowledge of Pointers, Array, and Linked list, Speller problem’s set helps me learn about the Hash table.
Speller Step by Step Solution
The main problem here is for us to write a Code that Spell-check a file after loading a dictionary of words from disk into memory, thus taking into consideration an actual “wall-clock,” not asymptotic, time. we are to implement the fastest spell checker we can.
Firstly, we need to download the program distribution code with wget, after which we can see the various pre-written file, I hope you read the Specification and watch all 6 walkthroughs.
from that file, we are only required to update empty functions on Dictionary.c
The dictionary.c has 5 function’s we are to update. they are:
- Load
- Hash
- Size
- Check
- Unload
Load function
This function is responsible for taking the Dictionary and loading it into a hash table.
The load function will take a string(char*) as its argument which is the Dictionary file we are to open up and read from to load all the data into your hash table, it’s going to return a boolean value, true if successful and false if unsuccessful
TO DO
To implement the load function, we need to consider the following.
- Opening a dictionary file
- Reading Strings from file one at a time
- Creating a node for each word
- Hashing word to Obtain a hash value
- Insert node into the hash table at that location
Opening the Dictionary file
recall that in Week 4, we are being taught how to open a file for a different operation like reading and writing with FILE *.
For this Pset we are opening the dictionary file for reading ‘r’, then we need to check if Dictionary is not equal to NULL
Reading String from the file
We need to make use of the fscanf, a function that takes in three arguments, the File (where we are reading from), the second argument is ‘%s’ (which means we are reading strings from the file) and the last argument is the word (A character array where we are going to store the strings we reading from the file). We also need to take into consideration how we are going to read all string in that Dictionary till it reaches the end of file (EOF)
Create a new node for each word
in this section, we are going to create a new node with malloc that is going to store the particular word into hash table, and also copy strings from word into the new node with strcasecmp function, declared in strings.h
Hash word to obtain a value
to hash word to a value, we will need to call the hash function like
int index = hash(word);
Insert node into hash table at that location
This is where most of us are not getting right, when you check the file you are being given to, dictionary.c, you will notice that we have node *table[N]; where N is the number of buckets we want to wrap our linked list to.
so to insert a node into the hash table at the location index,
- we will check if table[index] has not been used earlier, if not,
- then we add the new word into the Hash table else
- we add the word to the next available index in the hash table
- I also created a global integer variable to count the number of words being loaded into the hash table and increment it.
- Close all open file with fclose.
- then change the Return false at the very end of the function to true.
My Load function code
/ Loads dictionary into memory, returning true if successful else false
bool load(const char *dictionary)
{
//opening dictionary file for reading
FILE *open_dictionary = fopen(dictionary,"r");
if (open_dictionary == NULL)
{
return false;
}
char Dword[LENGTH + 1];
while(fscanf(open_dictionary,"%s", Dword) != EOF)
{
node *newNode = malloc(sizeof(node));
if (newNode == NULL)
{
return false;
}
strcpy(newNode -> word, Dword);
newNode -> next = NULL;
int index = hash(Dword);
if (table[index] == NULL)
{
table[index] = newNode;
}
else
{
newNode -> next = table[index];
table[index] = newNode;
}
Count_size++;
}
fclose(open_dictionary);
return true;
}
Hash function
This function will take a string (char *) and return an unsigned int (non-negative integer), it is responding for spinning out where we are going to store the word from the Dictionary.
At the very top of the Dictionary.c there is node *table[N];, where N is the number of buckets being given to us, we can increase the default N from 1 for the efficiency of the program.
from the Walkthrough we are told that we can browse through the internet to source for a good hash table,
Hash function code
// Hashes word to a number
unsigned int hash(const char *word)
{
unsigned int value = 0;
unsigned int key_len = strlen(word);
for (int i = 0; i < key_len; i++)
{
value = value + 37 * tolower(word[i]);
}
value = value % N;
return value;
}
Hash explanation
- I create a new unsigned integer Value and set it to 0
- I also store the string length of word in the variable key_len
- Then I iterate through the key_len of each word
- adding value (0) with 37 and multiply it to Ascii digit of the word
- then I do the value mod N, to ensure that the return value is not greater than N.
- then return value
Size function
This function is meant to return the number of word that are in the dictionary.
for easy implementation, I have included the count_size in the load function already. So all I need to do is to return the count_size in this function.
size function code
// Returns number of words in dictionary if loaded else 0 if not yet loaded
unsigned int size(void)
{
return Count_size;
}
Check Function
in the Check function, our job is to take a word and check if it is in the dictionary or not, the check function will take a char* representing the word as its input and the output will be a boolean variable (True or false). The check function is required to be case insensitive
Pseudocode
- Hash word to obtain a hash value
- Access linked list at that index in the hash table
- Traverse linked list, looking for the word with strcasecmp function (strcasecmp compares two string case insensitive, it is declared in strings.h)
- Create a new node *cursor and set it to the first element in the linked list
- Then Check until Cursor is not NULL(if the cursor is not pointing to anything)
- compare the strings with the help of strcasecmp,
- if same, return true
- else return false
- compare the strings with the help of strcasecmp,
check function code
bool check(const char *word)
{
int index = hash(word);
node *cursor = table[index];
while (cursor != NULL)
{
if(strcasecmp(cursor -> word, word) == 0)
{
return true;
}
cursor = cursor -> next;
}
return false;
}
Unload function
This function is responsible for freeing up any memory that has been allocated in the process of running spell checker, thereby returning true if successful
Pseudocode
- iterate through the bucket of linked list (N)
- Check Until hashtable of bucket is not NULL (table[i])
- create a temporary node and set it to point to the next node.
- free any memory allocated as you go
- then put temporary node into table[i].
Unload function code
// Unloads dictionary from memory, returning true if successful else false
bool unload(void)
{
for (int i = 0; i < N; i++)
while (table[i] != NULL)
{
node *tmp = table[i] ->next;
free(table[i]);
table[i] = tmp;
}
return true;
}
Cs50 Speller code:
Conclusion
I find this Pset helpful, it helps me develop a basic understanding to the linked list and hash table, more generally Data Structures.
thanks for reading
please, if you find this solution helpful, don’t hesitate to drop a comment
to me this pset is somehow time consuming and lot of brainstorming…..
How is yours?
Realy , very good job 🙂
Thanks.
bro, thank you very much you helped me alot
Thank you!