---
id: debugging
title: Debugging
author: Benjamin Qi, Aaron Chew
description: "Debugging your code is an extremely important skill. Here are some useful debugging-related tips."
---

## Within Your Program

- using a script to stress test
- some parts from above video

### Style

- don't agree with everything but important to read nonetheless

### Assertions & Warnings

- includes `static_assert` and `#define NDEBUG`
- subset of above
- `#warning`, `#error`

### Printing Variables

Although not feasible if you have to write all code from scratch, [this template](https://github.com/bqi343/USACO/blob/master/Implementations/content/contest/CppIO.h) is very helpful for simplifying input / output / debug output. Note that `dbg()` only produces debug output when `-DLOCAL` is included as part of the compilation command, so you don't need to comment out those lines before submitting.

[Examples - Debug Output](https://github.com/bqi343/USACO/blob/master/Implementations/content/contest/CppIO_test.cpp)

## Compiling

I use the following to compile and run.

```
co() { g++ -std=c++17 -O2 -o $1 $1.cpp -Wall -Wextra -Wshadow -DLOCAL -Wl,-stack_size -Wl,0xF0000000; }
run() { co $1 && ./$1 & fg; }
```

### Mac

According to [this comment](https://codeforces.com/blog/entry/60999?#comment-449312), `-Wl,-stack_size -Wl,0xF0000000` increases the stack size on Mac. Without it, you might get a runtime error if you have many levels of recursion. This matters particularly for contests such as Facebook Hacker Cup, where you submit the output of a program you run locally.

(Documentation?)

### [Warning Options](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html)

Includes `-Wall -Wextra -Wshadow`.

[Variable shadowing](https://en.wikipedia.org/wiki/Variable_shadowing) should be avoided whenever possible.

- more options

### Other Options

Ben - I don't use these because they can significantly slow down compilation time and I don't find these messages particularly helpful. (But maybe I'm wrong?
I'm not so familiar with these.)

In Errichto's blog he says that he uses the following as part of his compilation command:

```
-fsanitize=undefined -fsanitize=address -D_GLIBCXX_DEBUG -g
```

Let's demonstrate what each of these does with the following program `prog.cpp`, which gives a segmentation fault.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
	vector<int> v;
	cout << v[-1];
}
```

`g++ prog.cpp -o prog -fsanitize=undefined && ./prog` produces:

```
/usr/local/Cellar/gcc/9.2.0_1/include/c++/9.2.0/bits/stl_vector.h:1043:34: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xfffffffffffffffc
zsh: segmentation fault  ./prog
```

`g++ prog.cpp -o prog -fsanitize=address && ./prog` produces:

```
AddressSanitizer:DEADLYSIGNAL
=================================================================
==31035==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000106ac6326 bp 0x7ffee913aaa0 sp 0x7ffee913aa20 T0)
==31035==The signal is caused by a READ memory access.
==31035==Hint: address points to the zero page.
    #0 0x106ac6325 in main (prog:x86_64+0x100001325)
    #1 0x7fff72208cc8 in start (libdyld.dylib:x86_64+0x1acc8)

==31035==Register values:
rax = 0xfffffffffffffffc  rbx = 0x00007ffee913aa20  rcx = 0xfffffffffffffffc  rdx = 0x20000fffffffffff
rdi = 0x00007ffee913aa40  rsi = 0x1fffffffffffffff  rbp = 0x00007ffee913aaa0  rsp = 0x00007ffee913aa20
 r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x0000000000000000
r12 = 0x00000fffdd227544  r13 = 0x00007ffee913aa80  r14 = 0x00007ffee913aa20  r15 = 0x0000000000000000
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (prog:x86_64+0x100001325) in main
==31035==ABORTING
zsh: abort      ./prog
```

Finally, `g++ prog.cpp -o prog -D_GLIBCXX_DEBUG -g && ./prog` produces:

```
/usr/local/Cellar/gcc/9.2.0_1/include/c++/9.2.0/debug/vector:427:
In function:
    std::__debug::vector<_Tp, _Allocator>::reference std::__debug::vector<_Tp, _Allocator>::operator[](std::__debug::vector<_Tp, _Allocator>::size_type) [with _Tp = int; _Allocator = std::allocator<int>; std::__debug::vector<_Tp, _Allocator>::reference = int&; std::__debug::vector<_Tp, _Allocator>::size_type = long unsigned int]

Error: attempt to subscript container with out-of-bounds index -1, but container only holds 0 elements.

Objects involved in the operation:
    sequence "this" @ 0x0x7ffee2503a50 {
      type = std::__debug::vector<int, std::allocator<int> >;
    }
zsh: abort      ./prog
```

Another example, with `prog.cpp` as follows:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
	int v[5];
	cout << v[5];
}
```

`g++ prog.cpp -o prog -fsanitize=undefined && ./prog` produces:

```
prog.cpp:6:13: runtime error: index 5 out of bounds for type 'int [5]'
prog.cpp:6:13: runtime error: load of address 0x7ffee0a77a94 with insufficient space for an object of type 'int'
0x7ffee0a77a94: note: pointer points here
  b0 7a a7 e0 fe 7f 00 00  25 b0 a5 0f 01 00 00 00  b0 7a a7 e0 fe 7f 00 00  c9 8c 20 72 ff 7f 00 00
              ^
32766%
```

`g++ prog.cpp -o prog -fsanitize=address && ./prog` produces:

```
=================================================================
==31227==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffeef0e4a54 at pc 0x000100b1bce5 bp 0x7ffeef0e4a10 sp 0x7ffeef0e4a08
READ of size 4 at 0x7ffeef0e4a54 thread T0
    #0 0x100b1bce4 in main (prog:x86_64+0x100000ce4)
    #1 0x7fff72208cc8 in start (libdyld.dylib:x86_64+0x1acc8)

Address 0x7ffeef0e4a54 is located in stack of thread T0 at offset 52 in frame
    #0 0x100b1bc35 in main (prog:x86_64+0x100000c35)

  This frame has 1 object(s):
    [32, 52) 'v' (line 5) <== Memory access at offset 52 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (prog:x86_64+0x100000ce4) in main
Shadow bytes around the buggy address:
  0x1fffdde1c8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1fffdde1c940: 00 00 00 00 f1 f1 f1 f1 00 00[04]f3 f3 f3 f3 f3
  0x1fffdde1c950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1fffdde1c990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==31227==ABORTING
zsh: abort      ./prog
```

## Running (on Mac)

According to [StackOverflow](https://stackoverflow.com/a/60516966/5834770), the `& fg` is necessary for getting `zsh` on Mac to display crash messages (such as segmentation fault). For example, consider running the first `prog.cpp` above with `run prog`.

If `& fg` is removed from the run command above, then the terminal displays no message at all.
Leaving it in produces the following (ignore the first two lines):

```
[2] 30594
[2]  - running    ./$1
zsh: segmentation fault  ./$1
```

### Measuring Time & Memory Usage

- [CF Comment](https://codeforces.com/blog/entry/49371?#comment-333749)
- [time -v on Mac](https://stackoverflow.com/questions/32515381/mac-os-x-usr-bin-time-verbose-flag)
  - use `gtime`

For example, suppose that `prog.cpp` consists of the following:

```cpp
#include <bits/stdc++.h>
using namespace std;

const int BIG = 1e7;
int a[BIG];

int main() {
	int sum = 0;
	for (int i = 0; i < BIG; ++i) sum += a[i];
	cout << sum;
}
```

Then `co prog && gtime -v ./prog` gives the following:

```
	Command being timed: "./prog"
	User time (seconds): 0.01
	System time (seconds): 0.01
	Percent of CPU this job got: 11%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.22
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 40216
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 91
	Minor (reclaiming a frame) page faults: 10088
	Voluntary context switches: 3
	Involuntary context switches: 38
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
```

Note that $10^7$ four-byte integers require $4\cdot 10^7\cdot 10^{-3}\approx 40000$ kilobytes of memory, which is close to the maximum resident set size of $40216$ kilobytes in the above output, as expected.

## Stress Testing

See Errichto's video for details. You can use a [simple script](https://github.com/bqi343/USACO/blob/master/Implementations/content/contest/stress.sh) to test two solutions against each other.

## Debuggers

Using a debugger varies from language to language and even from IDE to IDE. For now I will describe the basic operations of a debugger.

A debugger allows you to pause your code during execution and inspect the values of its variables at that point.
To do this, set a "breakpoint" at a certain line of code. When the code runs to that breakpoint, it will pause and you will be able to inspect all the different variables at that instant.

There are two more useful and common operations. Once you are at the breakpoint, you may want to see what happens after the current line is executed. This is what the "Step Over" button does: it moves you to the next line. Say you are at a line with the following code: `dfs(0,-1)`. If you click "step over", the debugger will not show you what happens inside this function and will go straight to the next line. If you click "step in", however, you will enter the function and be able to step through it.

In essence, a debugger is a tool to "trace code" for you. It is not much different from just printing the values out at various points in your program.

Pros of using a debugger:

- No need to write print statements, so you save time
- You can step through the code in real time

Cons of using a debugger:

- You cannot see the overall "output" of your program at each stage. For example, if I wanted to see every single value of `i` in the program at once, I could not do so using a debugger.
- Most advanced competitive programmers do not use debuggers; it is quite time-inefficient.

(gdb, valgrind?)