In this episode of our C-saga, we'll cover the basics of debugging a program with gdb. If you haven't installed gdb on your computer and you run Windows, follow the mingw instructions from part one here. Otherwise, if you're running Linux, you know how to install a package.
Basic Errors
What is a program failure? It's a tricky thing to define, because programs are just instructions that run. If you had a program that incremented an integer forever, it would run just fine until the value overflows, and whatever the platform does at that point could itself be considered acceptable behavior.
The point is that the concept of a program error only makes sense in the context of an operating system or platform. Since our platform is gcc on some x86 architecture, our list of error types becomes fairly easy to understand.
Here are the core causes of almost all run-time errors when programming in C:
- reading/writing memory your process doesn't own
- accessing a device or file you don't have permission to use
- giving garbage values to system calls (your own functions may or may not care)
- running assembly instructions with garbage input, e.g. dividing by zero (this varies by architecture)
- generally, trying to do something you don't have permissions to
What actually happens in a lot of these cases is that the CPU raises an exception, a kind of interrupt, which hands control to the operating system. It does this for things like dereferencing a null pointer. The operating system then sends a signal to the running instance of your program (this is called a process), which then has to handle it or die. You can actually structure a C program around handling these signals rather than having to sanitize input.
Of course, most errors won't be from code trying to do any of these low-level things. Those are all topics to be covered later, anyway. When your program crashes, the reported errors are all symptoms of an initial programmatic error. What happens most of the time is that your code messes up a bit due to human error, some garbage value gets passed into some function or other, that function can't handle it, and your operating system tells you to get your act together.
Anyway, let's start with the most basic of errors for a newbie to C.
Errors Involving the Usage of gcc
All the sample code will be in the same prelude C repository, in part2.
The most basic errors are missing function definitions and missing libraries. Both mean your compiler or linker couldn't resolve those symbols we talked about in part 1.
Here's the most basic example of forgetting a function definition.
//example for causing missing function definitions
#include <stdio.h>
int main(){
    printf("you won't see this until you fix this");
    this_function_was_never_defined();
}
And here's forgetting to link the libraries. This code uses the ncurses library, but the compiler command doesn't link it.
#include <ncurses.h>
//declaration-only = promising this function will exist
int draw_the_rest();
//basic ncurses program
int main(){
    initscr(); //initializes the screen
    draw_the_rest();
}
Resolving these means installing the appropriate library, and if you think you already installed it, you'll have to check your search path for headers and libraries. This stuff is how the compiler finds all your favorite <stdio.h>'s and their implementations.
We check our library search path with ld --verbose | grep SEARCH_DIR
and our include search path with echo | gcc -E -Wp,-v -
The compiler will also detect trivial errors related to types, but since C is very flexible about types, it'll rarely be a problem.
Once you resolve compiler errors, you'll encounter code errors.
A segfault is a "segmentation fault", which means attempting to access a memory segment the process does not own.
Here's an example of a program that causes a segmentation fault very close to 100% of the time.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
    srand(time(NULL)); //NULL is basically a fancy alias for 0
    int * uninitialized_pointer = rand();
    printf("%s", uninitialized_pointer);
    return 0;
}
The operating system keeps track of the memory we've explicitly allocated via malloc() and calloc(), and the memory we've implicitly allocated on the stack via function calls. In almost all cases, we can't write to anything else.
In the real world we won't have these obvious examples, so let's debug a program with a non-obvious problem. Here's a C program that uses the curses API to make balls bounce around the screen, but it has one key issue: the balls eventually fall off the screen!
#include <ncurses.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#define DELAY 30000
#define NUMBALLS 10
#define MAX_X_VEL 1.2
#define MAX_Y_VEL 1.2
//----UTILITY FUNCTIONS----
//returns within [0,1]
float rfloat(){
    return ((float)rand() / (float)RAND_MAX);
} //C doesn't have a built-in random float func
int roundfl(float in){
    return (in + 0.5);
}
//----MAIN SECTION STARTS HERE----
//
//our central type
typedef struct{
    float x_pos, y_pos;
    float x_vel, y_vel;
} ball_t;
//global object pool
ball_t* ball_pool[NUMBALLS] = {0};
//used to initialize
ball_t* spawn_random_ball(int max_x, int max_y){
    ball_t* raw_ball = malloc(sizeof(ball_t));
    raw_ball->x_pos = rfloat() * max_x;
    raw_ball->y_pos = rfloat() * max_y;
    raw_ball->x_vel = rfloat() * MAX_X_VEL;
    raw_ball->y_vel = rfloat() * MAX_Y_VEL;
    return raw_ball;
}
//----LOOP LOGIC----
void draw_balls(){
    int i, x_round=0, y_round=0;
    for (i=0; i<NUMBALLS;i++){
        x_round = roundfl(ball_pool[i]->x_pos);
        y_round = roundfl(ball_pool[i]->y_pos);
        mvprintw(y_round, x_round, "o");
    }
}
void check_collisions(float max_x, float max_y){
    //will I hit the wall?
    int i;
    for (i=0; i<NUMBALLS;i++){
        float next_x=0, next_y=0;
        next_x = ball_pool[i]->x_pos + ball_pool[i]->x_vel;
        next_y = ball_pool[i]->y_pos + ball_pool[i]->y_vel;
        //collide with the wall and add the remaining distance,
        //then move it a bit back for the update() step
        if (next_x < 0){
            ball_pool[i]->x_vel *= -1;
            ball_pool[i]->x_pos = next_x*-1.0 - ball_pool[i]->x_vel;
        }
        else if (next_x > max_x){
            ball_pool[i]->x_vel *= -1;
            ball_pool[i]->x_pos = max_x - (next_x - max_x) - ball_pool[i]->x_vel;
        }
        if (next_y < 0){
            ball_pool[i]->y_vel *= -1;
            ball_pool[i]->y_pos = next_y*-1.0 - ball_pool[i]->y_vel;
        }
        else if (next_x > max_x){
            ball_pool[i]->y_vel *= -1;
            ball_pool[i]->y_pos = max_y - (next_y - max_y) - ball_pool[i]->y_vel;
        }
    }
}
void step(){
    int i;
    for (i=0; i<NUMBALLS; i++){
        ball_pool[i]->x_pos += ball_pool[i]->x_vel;
        ball_pool[i]->y_pos += ball_pool[i]->y_vel;
    }
}
//----MAIN PROGRAM----
int main(int argc, char *argv[]) {
    //init random seed
    srand(time(NULL));
    //ncurses related stuff
    int max_y=0, max_x=0;
    initscr(); //initialize screen
    noecho(); //don't echo input
    curs_set(FALSE); //don't display cursor
    //get rows and columns, init global "standard screen" var
    getmaxyx(stdscr, max_y, max_x);
    //create balls
    int i;
    for (i=0; i<NUMBALLS; i++){
        ball_pool[i] = spawn_random_ball(max_x,max_y);
    }
    //game loop
    while(1) {
        clear();
        draw_balls();
        refresh();
        usleep(DELAY);
        check_collisions(max_x, max_y);
        step();
    }
    endwin();
}
To compile this you need the ncurses development library. If you're on Linux, just install the latest ncurses dev library and include -lncurses when compiling. On Windows, you can use a public domain version called pdcurses. Download the pdc34dllw.zip file from the latest branch. The w at the end denotes the Windows version. Then put the files in mingw's install folder as follows: .h files go in the "include" folder, .lib files in the "lib" folder, and pdcurses.dll in the "bin" folder. Then add -lpdcurses to the end of your compilation command. This way gcc knows to link against its libraries.
Debugging Errors with GDB
In order to debug this code properly, we need to compile it with debugging enabled. We do this by adding the -g flag to our compilation command. This includes a lot of meta-information in the executable, such as a link to the sources, which gdb takes advantage of to make debugging much easier. Go ahead, run gcc -g big_hunt_example.c (plus the curses linker flag from above), and then start it with gdb by running gdb a.out
So, a thing you'd want to do is to stop gdb at some point and print out the states of variables. We can do this. Suspend a process in gdb with ctrl+z. gdb nicely handles the signal by suspending the child process.
You can start and restart the program at any time by typing run, then press ctrl+z to suspend execution.
[Screenshot: suspending an ncurses app might look weird, but it's fine]
Alright, we're in. But wait, where are we?
[Screenshot: bt = backtrace]
[Screenshot: l = list]
You can list the code you're in by typing list, which is also aliased to just l. It also accepts a function or line number, so you can look code up directly and then decide whether you want to set a breakpoint there.
"Wait, breakpoint?", you ask. Breakpoints are just places in the code where the debugger will stop execution and let you examine local variables. Let's reset our program by typing run again. This time let's break at main().
Since it's not detecting the wall, let's examine the collision detection function by typing break check_collisions
Hit run again and...
Oh no, that's called every time we draw a frame! Well, while we're here we can check out the local variables with info locals. You can check their values using print val_name. But we only care about this function when one of our balls has gone out of bounds. Instead of stepping through check_collisions 10*(velocity of the fastest dude) times, we can set a conditional breakpoint.
break check_collisions if ball_pool[i]->y_pos > 120
run
gdb evaluates the condition every time the breakpoint is reached and only stops when it is true. (At the very start of check_collisions the loop counter i may not hold a meaningful value yet, so breaking on a line inside the loop is more reliable.) From here we can step through and see where our y bound isn't being checked. Give it 5 minutes of reading the code in that section, and you should figure out the issue.
While we're here, let's use print to evaluate arbitrary expressions, assignments included:
print max_x
print max_x=2
print max_x // yep, we messed with a variable

At this point it's pretty obvious what the problem in our code is, so let's save a checkpoint so we don't have to rerun our program with the same breakpoint every time to get here. Just run checkpoint; gdb will tell you the checkpoint number, and you can go back to it later by typing restart checkpoint_number
Ah, that's the line! We just copied it from the x position check but forgot to change it. How silly!
That's about it for this tutorial. Here are some features you should be aware of so you can look them up later.
- reverse-continue: continue debugging but run in reverse (execution recording has to be switched on first, with record)
- gdb program core: analyses the core dump file created by program
- define: defines your own gdb commands, useful to have some in your .gdbinit
- disassemble f: shows the assembly of function f
- x <address> (x for "examine"): lets you inspect memory directly, including at addresses held in registers. Has a lot of options for interpreting memory as different types, and even instructions!
It's a lot to take in at once, and proficiency with gdb is built more with muscle memory than with reading. However, it should all make sense within the context of C programming and its innards. Hopefully this will give you enough knowledge to debug your code effectively and efficiently. And by your code, I really mean mine, tyvm, gg no re








