Introduction to C Programming - Coding a Command Line Hangman Game
2014-08-20
I wrote a command line hangman game in C. This post walks through the process explaining in detail each step of the way.
Necessary Tools
Editing
All you need is a text editor, but something better is very helpful. Something like jEdit, or even a full IDE, like Eclipse.
Compiling
If you're compiling from within an IDE, you'll need to figure that out on your own. Otherwise, you simply open the command line to the directory containing your hangman.c file and issue the command gcc -o hangman hangman.c
Running
An IDE can compile and run the code is one go. Depending on the OS, the command line option for running a executable may differ. On Linux it's ./hangman
to run the result of the above compile command.
Includes
At the top of any C program, the first necessary elements are the includes, which provide access to the advanced built in functionality of C. You can find more information at cplusplus.com, which is an excellent reference site for C. These are the include we'll be needing.
#include <stdio.h> // The Standard Input and Output library; functions for reading and writing data
#include <string.h> // The String library; functions for manipulating strings
#include <stdlib.h> // The Standard library; general purpose functions
Structs and Pointers
Next, we know a hangman game requires words. In my design the program loads all the words initially, so we never need to go back to the disk. We require data structures to hold all that data.
struct word { // define a word structure as:
size_t length; // an unsigned integer (non-negative)
char * text; // a pointer to one or more characters
};
struct wordlist { // define a wordlist as:
size_t count; // a count of words
struct word * list[90000]; // a pointer to 90000 word pointers worth of memory
size_t maxlen; // the length of the longest word
} * words; // a pointer, named words, to a wordlist
Let's talk for a second about what we're looking at here.
Structs
A struct is a way of grouping related data, to make it easier to manage. It's a precursor to the idea of object oriented programming, but structs contain no functionality, only data.
To learn about structs, try these tutorials: simple, detailed.
Pointers
Pointers are conceptually very simple. A pointer is a variable containing a location in memory. Pointers needs to know what type of data they point to, so when the data is requested, C knows how much to read. The syntax VariableType * pointerToDataOfType
defines a pointer, while the syntax *pointerToDataOfType
requests the data pointed to by the pointer (de-referencing).
There is a shortcut when using pointers to structs. Namely, words->count
is equivalent to (*words).count
, as both de-reference the words pointer to provide access to the count
variable.
For more on pointers, read this simple pointer tutorial or, if you're feeling bold, this more in depth one.
Pointers and Memory Allocation
Now let's populate those data structures with information from a file on the disk. To keep things simple, we'll be reading a plain text file of whitespace separated words.
void load(char * path) { // word loading function, requires a wordlist location argument
char temp[30]; // 30 characters worth of memory for holding each word temporarily
size_t index = 0; // which word we're currently handling
words->maxlen = 1; // the maxlen value from our above-mentioned wordlist struct
FILE * pFileWords; // FILE struct (part of C itself) pointer
pFileWords = fopen(path, "r"); // the fopen function opens a file and returns a pointer to it
if (pFileWords == NULL) { // if the file couldn't be opened for whatever reason
printf("\nno wordlist found at %s\n", path);
exit(1); // tell the user and exit the program
}
// the fscanf function scans a file (1st argument), according to format (2nd argument),
// and saves the data (subsequent arguments), returning EOF (end of file) when applicable
while (fscanf(pFileWords, "%s", temp) != EOF) {
size_t length = strlen(temp); // the strlen function returns the length of a string
if (words->maxlen < length) { // if the current word is longer than the previous longest
words->maxlen = length; // that should be the new maxlen
}
// the malloc function allocates memory on the heap, which is larger but slower
// the size we're allocating here is enough to hold a word struct
words->list[index] = malloc(sizeof(struct word));
// now we allocate memory for the actual characters of each word
// (char *) casts this memory as a pointer to one or more characters
words->list[index]->text = (char *)malloc(length*sizeof(char)+1);
strcpy(words->list[index]->text, temp); // strcpy copies a string, copying the 2nd arg to the 1st
words->list[index]->length = length; // each word also remembers its length
index++; // words->list[index] is now occupied; thus we increment index by 1
}
words->count = index; // once all words have been read, the number of words == index
}
Whew. That's a lot of code and comments. This is a good point to pause for recapitulation and rumination.
Memory Allocation
When variables are created directly, without an asterisk (not a pointer), the memory is allocated automatically on the stack, which is smaller and faster. For example, char temp[30]
allocates 30 characters worth of memory.
On the other hand, when memory is allocated explicitly, using malloc
or the like, it allocates that memory on the heap, which is larger and slower. Usually such memory would need to be set free (otherwise you have what's called a 'memory leak'). In our case however, the entire word-list is kept on the heap throughout the game.
For more about the stack vs. the heap, see this stackoverflow question.
For a less detailed overview, as well as information regarding freeing memory, check this stackoverflow question.
Game Loop and Basic IO
This is an old school game loop of the sort that waits for user input. It has two nested loops. The inner one manages a single game, while the outer one generates new games infinitely.
int play(int minlen) { // main game function, requires a minimum length argument
printf("\n");
srand(time(NULL)); // initialize the random number generator
do { // main game loop
struct word * word; // pointer, named word, to a word struct
do {
word = words->list[rand()%words->count]; // pointing to a random word
} while (word->length < minlen); // of at least minlen length
int guess;
int chances = 5;
char wrong[6] = ""; // incorrect guesses
char found[word->length+1]; // found characters
memset(found, '\0', word->length+1); // null terminate found
memset(found, '_', word->length); // otherwise fill with underscores
// prints to the console, what you've found, remaining chances, and wrong guesses
printf("solved %s | chances %i | wrong '%s' ", found, chances, wrong);
do { // loop for each ...
guess = getchar(); // get the next character from the console
if (guess == '\n') { // on enter, show our game state again
printf("solved %s | chances %i | wrong '%s' ", found, chances, wrong);
continue; // and move on to waiting for the next char
}
else if (guess == EOF) {
printf("\n"); // when we encounter an end of file (Ctrl-D)
return 0; // abandon the game function
}
else if (guess < 97 || guess > 122) {
continue; // if it's not a lowercase ASCII char, ignore
}
else { // if it is a lowercase ASCII char
int i;
for (i = 0; i < strlen(found); i++) { // loop through word
if (word->text[i] == guess) { // comparing to each char
found[i] = guess; // adding to found as needed
}
}
// if char guess is in neither found nor wrong
// strchr returns the index of a char (2nd arg) in a string (1st arg)
if (strchr(found, guess) == NULL && strchr(wrong, guess) == NULL) {
strncat(wrong, (char *)&guess, 5); // append guess to wrong
chances--; // reduce the remaining chances
}
}
// loop so long as we're still missing characters and not out of chances
} while (strchr(found, '_') != NULL && chances > 0);
printf("solved %s | chances %i | wrong '%s'\n", word->text, chances, wrong);
if (chances > 0) {
printf("VICTORY!\n\n");
} else {
printf("DEFEAT!\n\n");
}
while (getchar() != '\n') { /* do nothing */ }; // clear build-up of bad chars
} while(1); // loop forever
return 0;
}
Input / Output
In this code we use two basic IO paradigms, character based and formatted data based.
Character based IO is very simple, but there are a couple subtle oddness to be aware of. First, because getchar is a blocking function, all code execution stops until it receives a character. Second, because stdin (standard input) is buffered by default, no inputted characters are handed off to the program until enter (\n
) is pressed. Read the above code again with this in mind.
Formatted data based IO is more complex, to the point where I need to look it up every time I use it for anything but the simplest operations. The basic notion is to pass the functions printf or scanf a string with special markers (starting with %
), which are filled with the values of subsequent arguments.
Main and Arguments
Custom argument parsing, the way I'm doing here, is difficult. It's easy to make mistakes, especially as the number of arguments grow.
// the main function is automatically run and accepts arguments from the command line
// argc is a count of the arguments, argv is an array of arguments (space separated)
int main(int argc, char * argv[]) {
char * path = "/usr/share/wordlist";
long minlen = 3; // the default wordlist location and minimum word length
if (argc > 1) { // we've received arguments
if (strcmp(argv[1], "-h") == 0) { // print help and exit if the first argument is '-h'
printf("\nhangman command line arguments:\n\
\n-w path\n\tPath to a whitespace seperated list of words.\
\n-s size\n\tMinimum word size.\n");
return 0;
}
else {
int i = 1;
do { // looping through our arguments
if (strcmp(argv[i], "-w") == 0) { // if one of them is '-w'
if (i+1 < argc) { i++; } // if there's another argument
else {
printf("\n-w requires a wordlist path\n");
return 1;
}
path = argv[i]; // make that the path of our wordlist
}
if (strcmp(argv[i], "-s") == 0) { // if one of them is '-s'
if (i+1 < argc) { i++; } // if there's another argument
else {
printf("\n-s requires a minimum size\n");
return 1;
}
minlen = strtol(argv[i], NULL, 0); // make that our minimum length
}
i++;
} while (i < argc);
}
}
load(path); // call our load function (remember that?)
if (minlen > words->maxlen) { // exit if the longest word is shorter than the requested minimum
printf("\nsize must not be larger than the longest word, %u\n", words->maxlen);
exit(1);
}
return play(minlen); // otherwise call our play function, which loops infinitely
}
Arguments
As you can see, even three fairly simple command line arguments are not simple to parse. There are actually established rules, but even those are often ignored, and they don't cover every scenario.
That being said, the -h
for help with command line arguments is basically universal. This allows you to inform the user of any oddities in an accessible way.
If you want to write more complex command line options or you're a stickler for the conventions, you should probably using the getopt function.
Fin
Take this C knowledge I've (hopefully) imparted and do something wonderful.
The entire source, minus most of the comments, can be found on GitHub.