A Hands On Exercise in Python DebuggingBefore we even begin, here's how look it up for yourself:Useful Resourceshttp://docs.python.org/library/pdb.html - the official pdb documentation
http://openp2p.com/pub/a/python/2005/09/01/debugger.html - lengthy O'Reilly article
http://pythonconquerstheuniverse.wordpress.com/category/the-python-debugger/ - concise blog entry about the basics
http://aymanh.com/python-debugging-techniques - broad debugging blog entry
http://www.jcameroncooper.com/pytexas2010 - this document; http://www.jcameroncooper.com/pytexas - in-class page see also http://pytexas.org/TalkProposals2010 for original proposal
This is about it! Doesn't feature in Dive Into Python or most other general Python sources. Who Am I?J. Cameron Cooper. Long-time Python/Zope/Plone developer, author of "Building Websites with Plone. Independent developer, sometime trainer. And I started using structured debugging tools way too late; I was intimidated. But you shouldn't be! How are you debugging?Staring at the code? Developing or repeating code in the console? Changing things randomly? Throwing in print statements? Throwing in log statements? Those all have their place, but pdb can make it easier. Here's some very simple code for calculating a mean, with a test at the end: def mean(first, second): result = first + second / 2 return result shouldbe = 6 m = mean(4, 8) print("mean(4,8) =", m) print("should be:", shouldbe) print("correct?", m==shouldbe) Go ahead and save this locally and run it with your Python. (It's written for Python 3, but just change the prints for Python 2 if that's what you have. Python 3 also has different division rules...) There's (at least) one problem in here, as the program will tell you when it runs. This is pretty simple, and you can probably figure it out by staring. You can use 'print', but you might have to make a few runs to figure it out... def mean(first, second): result = first + second / 2 print("first", first) print("second", second) print("result", result) print("first+second", a+b return result is maybe helpful, but it's also a lot of typing, and other expressions and experiments might tell you more. Plus, it mixes debugging prints with real prints, and you have to sort them visually. Imagine if there were lots of output. So let's just jump in with pdb and see how it works. The magic words are import pdb; pdb.set_trace() ...def mean(first, second): result = first + second / 2 import pdb; pbd.set_trace() return result Then run it: > .../mean.py(4)mean() -> return result (Pdb) first 4 (Pdb) second 8 (Pdb) result 8.0 (Pdb) first+second 12 (Pdb) first+second/2 8.0 (Pdb) 12/2 6.0 (Pdb) (first+second)/2 6.0 (Pdb) c mean(4,8) = 8.0 should be: 6 correct? False Ah, order of operations. Should have parens in there. So, what did we just do? The Wonderful 'pdb'We just got into a 'pdb' session, basically an interactive console in the middle of the running program. We were looking at the live variables at that point in time. We can examine other parts of the program state as well, and move around in it. There's several ways to start a pdb session; the easiest is to pick a point in the code and insert pdb.set_trace() . It's handy to stack the import with it so it's easy to remove. Thus the magic line above. When the interpreter hits this call, it opens a pdb session on the console, right in the middle of the running program; you'll note the output after the call only happens after we quit pdb.Let's get back in there and look around some more. It has interactive help if you type " ? ":(Pdb) ? Documented commands (type help <topic>): ======================================== EOF bt cont enable jump pp run unt a c continue exit l q s until alias cl d h list quit step up args clear debug help n r tbreak w b commands disable ignore next restart u whatis break condition down j p return unalias where Miscellaneous help topics: ========================== exec pdb Undocumented commands: ====================== print retval rv You can also use ? for specifics about anything listed. Take the 'q ' command. What does it do?(Pdb) ? q q(uit) or exit - Quit from the debugger. The program being executed is aborted. You get the same for ? exit ; this is a thing pdb does; its commands come in word or one-letter format. In fact, the ? is a shorthand for help . Often these are written like h(elp) ; meaning the bit outside the parens or the whole thing can be used. The single-letter versions are easier to type, so you'll usually see those.Here's another thing pdb does: hit enter on a blank line. (Pdb) q(uit) or exit - Quit from the debugger. The program being executed is aborted. Unlike the normal interpreter, it repeats the last command when you return a blank line. In a debugger, we're often running the same command a lot (to go through uninteresting areas of code or pass through loops.) Besides the commands listed, pdb will also print out values of variables, like we did with second and such. But what if our variables are named like the pdb commands? Just use the bang: ! Exit out of pdb (you know how now), and modify the code to include a retval variable; this is a common name and also a pdb command. (So are many single letters.)def mean(first, second): retval = first + second / 2 import pdb; pdb.set_trace() return retval Then run it: (Pdb) retval *** Not yet returned! (Pdb) ? retval *** No help on retval (Pdb) !retval 8.0
The bang says "use the variable by this name, not the command." You can also use print/p to do this. Getting into the habit of using one of these for variable inspection can be a good idea if you like one-letter variables.Let's look at what else we can do. Examine the Environmentshow location with l(ist)(Pdb) ?l l(ist) [first [,last]] List source code for the current file. Without arguments, list 11 lines around the current line or continue the previous listing. With one argument, list 11 lines starting at that line. With two arguments, list the given range; if the second argument is less than the first, it is a count. Helpful for seeing where we are in the code. (Pdb) l 1 def mean(first, second): 2 retval = first + second / 2 3 import pdb; pdb.set_trace() 4 -> return retval 5 6 shouldbe = 6 7 m = mean(4, 8) 8 print("mean(4,8) =", m) 9 print("should be:", shouldbe) 10 print("correct?", m==shouldbe) [EOF] Here, we can see that the next line to execute is #4. Subsequent uses of list will keep listing code; you could hit l, enter, enter, enter to get a long list of code below where you are. If you want to see the same place again, you'll need to use arguments.(Pdb) [EOF] (Pdb) l [EOF] (Pdb) l 1 1 def mean(first, second): 2 retval = first + second / 2 3 import pdb; pdb.set_trace() 4 -> return retval 5 6 shouldbe = 6 7 m = mean(4, 8) 8 print("mean(4,8) =", m) 9 print("should be:", shouldbe) 10 print("correct?", m==shouldbe) [EOF] see args with a(rgs)(Pdb) ?a a(rgs) Print the arguments of the current function. A shorthand for printing all the function/method arguments individually. (Pdb) a first = 4 second = 8 see return value with retvalUndocumented (it has no help entry.) Not all that common, and you can do the same thing knowing the name of the return variable. If the return is not a single variable but an expression, it will let you know what is calculated. Let's try this code: def mean(first, second): import pdb; pdb.set_trace() return first + second / 2 The next line will return the value; we can go to it with the n(ext) command (more on that soon) and check what it will return.> .../mean.py(3)mean() -> return first + second / 2 (Pdb) next --Return-- > .../mean.py(3)mean()->8.0 -> return first + second / 2 (Pdb) retval 8.0 Note that the "current line" info also provides the value. look at stack with w(here)(Pdb) ?w w(here) Print a stack trace, with the most recent frame at the bottom. An arrow indicates the "current frame", which determines the context of most commands. 'bt' is an alias for this command. (Pdb) w <string>(1)<module>() .../mean.py(6)<module>() -> m = mean(4, 8) > .../mean.py(3)mean()->8.0 -> return first + second / 2 mean.py on line 6 called us, and we're currently in line 3 in the method mean() about to return the value 8.0 .execute statements with ! or p(print)(Pdb) ? exec (!) statement Execute the (one-line) statement in the context of the current stack frame. The exclamation point can be omitted unless the first word of the statement resembles a debugger command. To assign to a global variable you must always prefix the command with a 'global' command, e.g.: (Pdb) global list_options; list_options = ['-l'] We've seen this already, but it does more than just values; you can execute whole (one-line) expressions. And you can do it without the ! if you don't start with a debugger reserved word.Going back to this code: def mean(first, second): retval = first + second / 2 import pdb; pdb.set_trace() return retval (Pdb) !retval+first 12.0 (Pdb) first+second 12 You can even change variables before letting the program run on: (Pdb) !retval = 33 (Pdb) continue mean(4,8) = 33 should be: 6 correct? False Neat! Move AroundYou don't have to stay stuck in the spot you put the set_trace . go forward with s(tep), n(ext)(Pdb) ? n n(ext) Continue execution until the next line in the current function is reached or it returns. (Pdb) ? s s(tep) Execute the current line, stop at the first possible occasion (either in a function that is called or in the current function). Let's modify our code so we can look at these better. Let's inspect the "main" part of the program by moving the set_trace into there:def mean(first, second): retval = first + second / 2 return retval import pdb; pdb.set_trace() shouldbe = 6 m = mean(4, 8) print("mean(4,8) =", m) print("should be:", shouldbe) print("correct?", m==shouldbe) next command keeps us at the current level:> .../mean.py(6)<module>() -> shouldbe = 6 (Pdb) n > .../mean.py (7)<module>() -> m = mean(4, 8) (Pdb) n > .../mean.py (8)<module>() -> print("mean(4,8) =", m) (Pdb) n mean(4,8) = 8.0 > .../mean.py (9)<module>() -> print("should be:", shouldbe) (Pdb) n should be: 6 > .../mean.py (10)<module>() -> print("correct?", m==shouldbe) (Pdb) n correct? False --Return-- > .../mean.py (10)<module>()->None -> print("correct?", m==shouldbe) Note how the line number keeps increasing, and when we hit the mean() method, we don't see inside it. We also see the print function printing our values. Of course, we could stop and check on them at any time. The step command follows into method calls:> .../mean.py(6)<module>() -> shouldbe = 6 (Pdb) (Pdb) s > .../mean.py(7)<module>() -> m = mean(4, 8) (Pdb) s --Call-- > .../mean.py(1)mean() -> def mean(first, second): (Pdb) s > .../mean.py(2)mean() -> retval = first + second / 2 (Pdb) s > .../mean.py(3)mean() -> return retval (Pdb) s --Return-- > .../mean.py(3)mean()->8.0 -> return retval (Pdb) s > .../mean.py(8)<module>() -> print("mean(4,8) =", m) (Pdb) s --Call-- > .../python31/lib/idlelib/rpc.py(546)__getattr__() -> def __getattr__(self, name): (Pdb) We went through the details of mean . And then...Oops, we went into the print method. Should've used next for that one. We'll see how to save ourselves from this situation next.fast-forward with r(eturn), c(ontinue), and unt(il)(Pdb) ? r r(eturn) Continue execution until the current function returns. (Pdb) ? c c(ont(inue)) Continue execution, only stop when a breakpoint is encountered. (Pdb) ? until unt(il) Continue execution until the line with a number greater than the current one is reached or until the current frame returns Note that "until" has an odd short form unt ; u is for "up".The return command fast forwards to the return statement for this method and then stops, remaining in pdb.We can escape from the print method (where we got testing step ) by return ing.(Pdb) r --Return-- > c:\program files\python31\lib\idlelib\rpc.py(550)__getattr__()-><idlelib.rpc....at 0x01F43DB0> -> return MethodProxy(self.sockio, self.oid, name) (Pdb) r mean(4,8) = 8.0 > .../mean.py(9)<module>() -> print("should be:", shouldbe) (Pdb) If we set the trace back inside mean and re-run we can return into a more normal result:(Pdb) r --Return-- > .../mean.py(4)mean()->8.0 -> return retval (Pdb) The continue command doesn't stop at method boundaries; it just lets the program run. If you have another pdb breakpoint (a set_trace , say), you'll go back into pdb; otherwise, it'll go like normal. With only one break point: > .../mean.py(3)mean() -> retval = first + second / 2 (Pdb) c mean(4,8) = 8.0 should be: 6 correct? False Let's add another: def mean(first, second): import pdb; pdb.set_trace() retval = first + second / 2 return retval shouldbe = 6 m = mean(4, 8) import pdb; pdb.set_trace() print("mean(4,8) =", m) print("should be:", shouldbe) print("correct?", m==shouldbe) And we can see: > .../mean.py(3)mean() -> retval = first + second / 2 (Pdb) n > .../mean.py(4)mean() -> return retval (Pdb) c > .../mean.py(9)<module>() -> print("mean(4,8) =", m) (Pdb) !m 8.0 (Pdb) c mean(4,8) = 8.0 should be: 6 correct? False until command is rarer, and is fairly specific. Put in this code:import pdb; pdb.set_trace() a = 22 for x in range(a): mean(x, x+1) b = 8 and try to step through it. > .../mean.py(12)<module>() -> a = 22 (Pdb) n > .../mean.py(13)<module>() -> for x in range(a): mean(x, x+1) (Pdb) n > .../mean.py(13)<module>() -> for x in range(a): mean(x, x+1) (Pdb) n > .../mean.py(13)<module>() -> for x in range(a): mean(x, x+1) (Pdb) Hey, that's pretty annoying. (Pdb) unt > .../mean.py(14)<module>() -> b = 8 Ah, that's better. go back with j(ump)(Pdb) ? j j(ump) lineno Set the next line that will be executed. You can go backwards, too. Or just plain random access. Use with care! Note that you can't leave your current scope. Let's try this code: print(1) print(2) print(3) import pdb; pdb.set_trace() print(4) print(5) print(6) print(7) Jump around! 1 2 3 > .../sequence(5)<module>() -> print(4) (Pdb) l 2 print(2) 3 print(3) 4 import pdb; pdb.set_trace() 5 print(4) 6 print(5) 7 -> print(6) 8 print(7) [EOF] (Pdb) n 4 > .../ sequence(6)<module>() -> print(5) (Pdb) j 1 > .../ sequence(1)<module>() -> print(1) (Pdb) n 1 > .../ sequence(2)<module>() -> print(2) (Pdb) n 2 > .../ sequence(3)<module>() -> print(3) (Pdb) j 6 > .../ sequence(6)<module>() -> print(5) (Pdb) n 5 > .../ sequence(7)<module>() -> print(6) (Pdb) c 6 7 The StackYou can also navigate the stack, going into methods that called you to see how they look when they call you. printing the stack with w(here)... and what it meansWe've seen this above, actually, but it's handy to look at the stack to see where we're going. Time to load up some new code: def top(): a = 12 b = 3 print("Hello") middle() print("done!") def middle(): a = 0 c = 9 import pdb; pdb.set_trace() print("world") bottom() def bottom(): d = 88 print("!") top() Hello > .../nested(12)middle() -> print("world") (Pdb) w .../nested(19)<module>() -> top() .../ nested(5)top() -> middle() > .../ nested(12)middle() -> print("world") You can read that stack trace just like any other. We're in middle , on line 12, called from top at line 5, called from the module on line 19.One unusual thing, which is important: see the right angle next to nested(12)middle()? That's where we are in the stack--the end. navigating the stack with u(p) and d(own)We can easily see the value of a in middle :(Pdb) !a 0 but what does it look like in top ? We can go into our caller using up :(Pdb) u > .../nested(5)top() -> middle() (Pdb) !a 12 (Pdb) w .../nested(19)<module>() -> top() > .../nested(5)top() -> middle() .../nested(12)middle() -> print("world") The right angle has moved up a line in the trace, and you can see we are looking at the value of a in top .You should also try list here.Of course, we get out of there using down (and only after an up .)(Pdb) d > .../nested(12)middle() -> print("world") (Pdb) !a 0 Other Ways Into pdbWe've been hard-coding so far with After a crash: post-mortem with 'pm' and 'post_mortem'You can also debug a crashing program with pdb, and you don't have to guess where to stick in a set_trace . This is "post-mortem" debugging.Here's some new code. Save it in a file ( mean.py is fine), start the interpreter and import it.def divide(x, y): a = 3 / x b = 4 / y return a + b pdb.pm() uses the last generated traceback. Run this in the interpreter to create a crash:>>> divide(3,4) 2.0 >>> divide(4,0) Traceback (most recent call last): File "<pyshell#77>", line 1, in <module> divide(4,0) File "<pyshell#75>", line 3, in divide b = 4 / y ZeroDivisionError: int division or modulo by zero Then we can pm it:>>> import pdb; pdb.pm() > ../mean.py(7)divide() -> b = 4 / y
We can grab a specific traceback (or the current one) with >>> try: ... divide(4,0) ... except: ... pdb.post_mortem() > .../mean.py(7)divide() -> b = 4 / y (Pdb) c or >>> try: ... divide(12,0) ... except: ... import sys ... tb = sys.exc_info()[2] >>> tb <traceback object at 0x0198F738> >>> import pdb; pdb.post_mortem(tb) > .../mean.py(7)divide() -> b = 4 / y (Pdb) c As you can see, you can run post-mortems automatically in exception handling, or on demand interactively.Without error or source changes: -m, run, runeval, and runcallYou can start from line 1 in pdb by doing:
This might be useful with a simple file, but is also useful for running without modifying the source. Perhaps you don't have write permission, or just don't want to get in there. Another way is to open up the interpreter and use one of several methods: Let's say we have the mean-calculating code saved as mean.py .$ python >>> import pdb >>> from mean import mean >>> pdb.run("mean(1,4)") > <string>(1)<module>()->None (Pdb) s --Call-- > .../mean.py(1)mean() -> def mean(first, second): (Pdb) l 1 -> def mean(first, second): 2 retval = first + second / 2 3 return retval 4 5 shouldbe = 6 6 m = mean(4, 8) 7 print("mean(4,8) =", m) 8 print("should be:", shouldbe) 9 print("correct?", m==shouldbe) [EOF]
Using run you pass in some statement as a string, as you see above. Returns nothing.Using runeval you also pass in a string, except it returns the evaluated value.>>> pdb.runeval("mean(2,4)") > <string>(1)<module>()->None (Pdb) c 4.0 >>> runcall you pass in a callable object (the method/function) and parameters, rather than a string.>>> pdb.runcall(mean, 2, 4) > c:\users\cameron\documents\pytexas\mean-correct.py(2)mean() -> retval = first + second / 2 (Pdb) c 4.0 >>> Once you're in pdbOf course, simply running pdb can be a big pain without the right breakpoint; lots of next s would be required in some programs. But pdb also lets you set breakpoints without code.setting breakpoints with b(reak) and tbreak(Pdb) ?b b(reak) ([file:]lineno | function) [, condition] With a line number argument, set a break there in the current file. With a function name, set a break at first executable line of that function. Without argument, list all breaks. If a second argument is present, it is a string specifying an expression which must evaluate to true before the breakpoint is honored. The line number may be prefixed with a filename and a colon, to specify a breakpoint in another file (probably one that hasn't been loaded yet). The file is searched for on sys.path; the .py suffix may be omitted. (Pdb) ?tbreak tbreak same arguments as break, but breakpoint is removed when first hit. We need a new program to see this best; call it callself.py :def callself(a=0): a = a + 1 print(a) if a < 10: callself(a) Import it in console: >>> from callself import callself >>> import pdb; pdb.run("callself()") > <string>(1)<module>() (Pdb) s --Call-- > .../callself(1)callself() -> def callself(a=0): (Pdb) l 1 -> def callself(a=0): 2 a = a + 1 3 print(a) 4 if a < 10: 5 callself(a) [EOF] (Pdb) b 3 Breakpoint 1 at .../callself:3 (Pdb) l 1 1 -> def callself(a=0): 2 a = a + 1 3 B print(a) 4 if a < 10: 5 callself(a) [EOF]
We've stopped at the temporary breakpoint. Note the B indicating breakpoint. Temporary is similar, but goes away first time by. managing breakpoints with cl(ear), disable, enableAlso notice above how it gives us a "breakpoint number". You can use this with disable to stop using the breakpoint temporarily. (Pdb) disable 1 (Pdb) c 2 3 4 5 6 7 8 9 10 You can do the same with clear for permanent removal. Also, enable will undo disable .skipping breakpoints with condition and ignoreWe can do a similar think with ignore ; ignore waits for a few times by:>>> import pdb; pdb.run("callself()") > <string>(1)<module>() (Pdb) s --Call-- > c:\users\cameron\documents\pytexas\callself(1)callself() -> def callself(a=0): (Pdb) b 3 Breakpoint 1 at c:\users\cameron\documents\pytexas\callself:3 (Pdb) ignore 1 3 Will ignore next 3 crossings of breakpoint 1. (Pdb) c 1 2 3 > c:\users\cameron\documents\pytexas\callself(3)callself() -> print(a) (Pdb) !a 4 condition sets a condition (which you can also do on breakpoint set):(Pdb) condition 1 a==8 (Pdb) c 5 6 7 > c:\users\cameron\documents\pytexas\callself(3)callself() -> print(a) (Pdb) !a 8 hooking into breakpoints with 'commands'You can also specify some commands to run when hitting a breakpoint: (Pdb) condition 1 Breakpoint 1 is now unconditional. (Pdb) commands 1 (com) print "I broke" (com) print a (com) end (Pdb) c 8 'I broke' 9 > c:\users\cameron\documents\pytexas\callself(3)callself() -> print(a) (Pdb) c 9 'I broke' 10 > c:\users\cameron\documents\pytexas\callself(3)callself() -> print(a) On a signal (like Ctrl-C)Put this code at the top of your program, and you can launch pdb at any point with Ctrl-C. import signal import pdb def int_handler(signal, frame): pdb.set_trace() signal.signal(signal.SIGINT, int_handler) You'll want to n(ext) to get to the actual code (because it stops in the int_handler .) To continue, use c(ontinue) ; to stop use q/exit . It'll throw an ugly exception, but don't worry about it. Note that time.sleep has its own problems with being interrupted.An easy test: put the following code after the signal setup, run, and hit Ctrl-C. Logging still has a place in debugging; if you have lots of tight loops, or want to see a big spew of data, or want end-users to be able to report. Basic logging is pretty simple:
|