PYTHON PROGRAMMING

Bugs in Python? Pdb to the Rescue!

Is the Pdb debugger worth learning and using?

Various tools can be used to debug Python code, from the simplest print() function, via static but more advanced icecream and its sibling ycecream, to the various interactive debuggers that IDEs offer. My choice, however, has always been the built-in pdb debugger, along with the built-in breakpoint() function.

Debugging

Debugging lies at the heart of programming. You start debugging when you start learning to program, and you stop only once you’ve promised yourself that you’ve just written your very last line of code, and then only if you keep that promise.

You could think that one way to decrease the time spent on debugging your code is to write good code. Let’s face it: More often than not, writing good code means… debugging a lot during development. Certainly, a good programmer will write better code and make fewer mistakes — but this does not mean he or she does not need to debug.

There is, however, one way to debug less: write good unit tests.

Whether or not you’re using test-driven development, write good tests. Writing good tests means writing a sufficient number of well-written tests. I don’t aim here to discuss testing, so I’ll leave you with this thought; I wrote more about testing here:

Make Yourself Enjoy Writing Unit Tests

Most developers dislike writing tests. If you’re among them, do your best to change that.

medium.com

We can assume that all programmers need to debug their code. Some may say they don’t, but that’s not true. They do; they simply don’t use dedicated debugging tools, called debuggers. Instead, they run their code for particular input, then they check it, and then, seeing something is wrong, they change the code and repeat the process. So, despite not using debuggers, they do debug their code; they just have to spend more time doing that. Debuggers were created for a reason!

Often, a single call to print() can do the job. But don’t fool yourself: This is not a very effective way of debugging. I am not saying you shouldn’t use it, but it’s an overly simplistic method that will work in the simplest situations only.

Many of those who use IDEs for code development like using debuggers built into these IDEs. Visual Studio Code has its own debugger, PyCharm has one, and even Thonny has one.

You can also use various debuggers available as Python packages to be installed from PyPI. Open PyPI and search for the term “debugger”; you will get a lot of hits, though you may need quite some patience to find those that can help you debug your code.

You can read about Python debuggers in the Towards Data Science article below:

5 Python Debugging Tools That Are Better Than “Print”

Debug your code faster and more efficiently.

towardsdatascience.com

It discusses — though doesn’t show how to use — pdb, PyCharm’s and Visual Studio’s (and VS Code’s) debuggers, Komodo, and Jupyter Visual Debugger.

Static versus interactive debuggers

Debuggers can be either static or interactive. The former only show objects; the latter let you play with them.

Both can be helpful, but it’s interactive ones that offer the most debugging power, resulting from their ability to stop the program and look around. You can see and use all the objects in the local and global scopes; you can check if a particular command or a set of commands will work or not. That’s why more often than not I prefer interactive over static debugging.

The print() function is a perfect example of static debugging. IDE debuggers are usually interactive.
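As a small illustration of static debugging, the sketch below prints intermediate values with the f-string = specifier (available since Python 3.8). The scale() function and its arguments are made up for this example; they do not come from any real project.

```python
def scale(values, factor):
    """Multiply each value by factor, printing debug information on the way."""
    # Static debugging: we see the inputs, but we cannot explore any further.
    print(f"{values = }, {factor = }")
    result = [value * factor for value in values]
    # A second print() is needed for every additional thing we want to see.
    print(f"{result = }")
    return result


scaled = scale([1, 2, 3], 2)
```

Each new question about the program requires another print() call and another run, which is exactly the limitation an interactive debugger removes.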

There is, however, a debugger that offers simplicity and power at the same time. It’s pdb, a built-in interactive Python debugger:

pdb - The Python Debugger

The module defines an interactive source code debugger for Python programs. It supports setting (conditional)…

docs.python.org

Yes, pdb is built-in, so you don’t have to install it. It comes with your Python installation, and you can use it in any environment. And yes, pdb is interactive. That’s actually most of what I expect from a debugger!

In this article, we will discuss the basics of pdb. This powerful tool offers much more, but the good thing is that these basics are more than enough to start using it. To be honest, I seldom use pdb’s more advanced options. Thus, reading this article will equip you with powerful tools for debugging Python code.

A few words about pdb

One of pdb’s advantages is that you can use it anywhere, without installing anything beyond what your virtual environment already has. It can be a remote environment; pdb will work just fine. Just run it and voilà, you have your interactive debugger ready to be used remotely. Or locally, for that matter.

First things first. Let me explain how to use pdb, and then you can decide if it’s a tool for you.

Basically, you can use pdb in two modes. First, you can run your Python program in the pdb mode. The debugger then stops at the first line of the program and waits for your commands: you can step through the code line by line, or let it run until it finishes or an uncaught error occurs. In the latter case, pdb enters the post-mortem mode: it stops in the frame where the error was raised, so you can see what’s going on in the local and global scopes.

Second, you can add a so-called breakpoint to your code, and the debugger will stop your program right there. You can also add more breakpoints. Certainly, the debugger will be able to stop the program only if no error has been raised before the breakpoint. Below, we will discuss both of these scenarios.

The pdb mode

To run your program in the pdb mode, simply run it this way:

$ python -m pdb myapp.py

This opens the pdb console and stops at the first line of myapp.py, waiting for your commands. From there, you can run the script line by line, or run it up to either the first error or the end of the program. It’s best to show how this works using some examples.

We will use the following script, saved to myapp.py:

def foo(s: str):
    if not isinstance(s, str):
        raise TypeError
    return s.upper()


if __name__ == "__main__":
    for s in ("string1", "string2"):
        _ = foo(s)

(It’s a playground script, nothing to be proud of. But we do need simplistic cases to analyze.)

We will also use its incorrect version, in which Python will throw an error; this script is saved in a myapp_error.py file:

def foo(s: str):
    if not isinstance(s, str):
        raise TypeError
    return s.upper()


if __name__ == "__main__":
    for s in ("string1", 10):
        _ = foo(s)

As you see, the correct program will run a for loop, and in each iteration, it will run the foo() function for a different value of the s argument: first for "string1" and then for "string2", both correct. In the incorrect version, instead of running foo("string2"), foo() will be run with an incorrect value of 10, which will lead to TypeError being raised.
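In the spirit of the earlier advice to write good unit tests, the behavior of foo() described above can also be pinned down in two tests. This is only a sketch: the TestFoo class and its method names are invented for this illustration, and foo() is copied here so that the block is self-contained.

```python
import unittest


def foo(s: str):
    if not isinstance(s, str):
        raise TypeError
    return s.upper()


class TestFoo(unittest.TestCase):
    def test_uppercases_strings(self):
        # The correct script only ever passes strings to foo().
        self.assertEqual(foo("string1"), "STRING1")

    def test_rejects_non_strings(self):
        # The incorrect script passes 10, which must raise TypeError.
        with self.assertRaises(TypeError):
            foo(10)


# Run the two tests programmatically and keep the outcome for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFoo)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these tell you that something is wrong; the debugger, discussed next, helps you find out why.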

For the moment, the only pdb commands you need to know are

c, or continue; another version of the command is cont;
n, or next; and
q, or quit.

Sometimes you will have to use quit two or three times, or even more, to exit the debugger.

The continue command executes the program until one of two things happens: either the program ends or an error is thrown. To see how this works, let’s run the correct version of our script, myapp.py:

$ python -m pdb myapp.py
> /{path}/myapp.py(1)<module>()
-> def foo(s: str):
(Pdb) c
The program finished and will be restarted
> /{path}/myapp.py(1)<module>()
-> def foo(s: str):
(Pdb)

(In the code block, {path} represents a long path from my computer.)

As you see, after running the shell command python -m pdb myapp.py, we’re navigated to a new pdb session, and the debugger is awaiting our first command. As shown above, c will continue the program until the first error or its end. Since we ran the correct script, the debugger did not encounter any problems, and it printed The program finished and will be restarted. This moved us back to the first line of our program, and the debugger awaited, again, our command. We could now, for example, start debugging line by line (as shown below).

Let’s see what happens if we use the c command for the incorrect script:

$ python -m pdb myapp_error.py
> /{path}/myapp_error.py(1)<module>()
-> def foo(s: str):
(Pdb) c
Traceback (most recent call last):
  File "/usr/lib/python3.9/pdb.py", line 1726, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib/python3.9/pdb.py", line 1586, in _runscript
    self.run(statement)
  File "/usr/lib/python3.9/bdb.py", line 580, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/{path}/myapp_error.py", line 1, in <module>
    def foo(s: str):
  File "/{path}/myapp_error.py", line 3, in foo
    raise TypeError
TypeError
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /{path}/myapp_error.py(3)foo()
-> raise TypeError
(Pdb)

As you see, this time the program raised an error (TypeError, without a message). When an uncaught error is thrown, the program stops execution and the debugger enters so-called post mortem debugging. This is when you can learn what happened with your program and why it failed.
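Post-mortem debugging can also be started from inside a program, using the documented pdb.post_mortem() function with the traceback of a caught exception. The sketch below makes one assumption for the sake of being runnable unattended: instead of typing at the prompt, it feeds pdb two commands from a string (p s to print the local variable s, then c to continue, which ends the session) and collects the output in a buffer. The names scripted_commands and session are invented for this illustration. In real use, you would simply call pdb.post_mortem() and type the commands yourself.

```python
import contextlib
import io
import pdb
import sys


def foo(s: str):
    if not isinstance(s, str):
        raise TypeError
    return s.upper()


scripted_commands = io.StringIO("p s\nc\n")  # print s, then continue
session = io.StringIO()                       # collects what pdb prints

try:
    foo(10)
except TypeError as error:
    original_stdin = sys.stdin
    sys.stdin = scripted_commands  # pdb reads its commands from here
    try:
        with contextlib.redirect_stdout(session):
            # Opens the debugger in the frame where the error was raised.
            pdb.post_mortem(error.__traceback__)
    finally:
        sys.stdin = original_stdin

print(session.getvalue())
```

The collected session shows the debugger sitting inside foo(), where s holds the offending value.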

Hit n and pdb will run the next line of the code. Note that it is the next line, not the next command: if the next command is split across two or more lines, you will have to step through each of them before the command is eventually executed. Consider this pdb session:

$ python -m pdb myapp_error.py
> /{path}/myapp_error.py(1)<module>()
-> def foo(s: str):
(Pdb) n
> /{path}/myapp_error.py(7)<module>()
-> if __name__ == "__main__":
(Pdb) 
> /{path}/myapp_error.py(8)<module>()
-> for s in ("string1", 10):
(Pdb) 
> /{path}/myapp_error.py(9)<module>()
-> _ = foo(s)
(Pdb) 
> /{path}/myapp_error.py(8)<module>()
-> for s in ("string1", 10):
(Pdb) 
> /{path}/myapp_error.py(9)<module>()
-> _ = foo(s)
(Pdb) 
TypeError
> /{path}/myapp_error.py(9)<module>()
-> _ = foo(s)
(Pdb)

First, note that once you have used a command (here, n), you do not have to type it again to repeat it. pdb remembers your last command, and hitting Enter runs it again. After we hit Enter a couple of times, pdb took us to the error that stopped the program.

Note that in the pdb mode, tab-completion does not work in a regular way. This does not mean it doesn’t work at all; you just have to use the p command before entering anything else. For instance, hitting the Tab key in this scenario:

(Pdb) al

will lead to nothing. But hitting it here:

(Pdb) p al

will lead to completing the alpha name:

(Pdb) p alpha

There are many pdb commands for you to use. You will find them here:

pdb - The Python Debugger

Source code: Lib/pdb.py The module pdb defines an interactive source code debugger for Python programs. It supports…

docs.python.org

Before moving on, I’d like to share with you a simple command; maybe not the most important one, but one I’ve appreciated quite a lot in my past. It’s pp, for pretty-print:

(Pdb) {f"{x_i = }, {alpha = }, and {beta = }": (x_i + alpha)/(1 + beta) for x_i in x}
{'x_i = 1, alpha = 4, and beta = 0': 5.0, 'x_i = 2, alpha = 4, and beta = 0': 6.0, 'x_i = 3, alpha = 4, and beta = 0': 7.0}
(Pdb) pp {f"{x_i = }, {alpha = }, and {beta = }": (x_i + alpha)/(1 + beta) for x_i in x}
{'x_i = 1, alpha = 4, and beta = 0': 5.0,
 'x_i = 2, alpha = 4, and beta = 0': 6.0,
 'x_i = 3, alpha = 4, and beta = 0': 7.0}

As you see, pretty-printing the expression with the pp command makes quite a difference compared with evaluating it directly. Hence, it’s good to remember this command.
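According to the pdb documentation, pp pretty-prints a value using the standard pprint module, so the same difference can be reproduced outside the debugger. The sketch below assumes the values of x, alpha and beta implied by the session above; the name values is introduced only for this example.

```python
from pprint import pformat

x = [1, 2, 3]
alpha, beta = 4, 0

values = {
    f"{x_i = }, {alpha = }, and {beta = }": (x_i + alpha) / (1 + beta)
    for x_i in x
}

# One long line, like evaluating the expression at the (Pdb) prompt.
print(values)

# One entry per line, like the pp command: the dict is wider than
# pprint's default width of 80 characters, so it gets wrapped.
print(pformat(values))
```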

One more thing. Even though the dict comprehension above is long, I did not split it into two or more lines. This is because pdb does not allow that in its debugging mode. You can, however, use its interactive mode, which you start with the interact command:

(Pdb) interact
>>> {f"{x_i = }, {alpha = }, and {beta = }":
...      (x_i + alpha)/(1 + beta) for x_i in x}
{'x_i = 1, alpha = 4, and beta = 0': 5.0, 'x_i = 2, alpha = 4, and beta = 0': 6.0, 'x_i = 3, alpha = 4, and beta = 0': 7.0}

Remember that in the interactive mode, pdb commands don’t work. To leave this mode and return to the pdb one, hit <Ctrl + D>.

Debugging using the breakpoint() function

Above, we discussed debugging in the pdb mode. Oftentimes, however, it will be easier to set a so-called breakpoint. A breakpoint is a location in the code at which you want the program to stop, so that you can analyze it; you can create more than one breakpoint in your code, and the program will stop at each of them, unless an error is thrown earlier.

To create one, add a call to the breakpoint() function in the very location in your code where you want the debugger to stop and let you in:

def y(x, alpha, beta):
    breakpoint()
    return [(xi + alpha)/(1 + beta) for xi in x]


x = [1, 2, 3]
y(x, 4, 0)

Running this script will lead you to this very debug session:

-> return [(xi + alpha)/(1 + beta) for xi in x]
(Pdb) l
  1     def y(x, alpha, beta):
  2         breakpoint()
  3  ->     return [(xi + alpha)/(1 + beta) for xi in x]
  4  
  5  
  6     x = [1, 2, 3]
  7     y(x, 4, 0)
[EOF]
(Pdb)

The l (list) command shows you eleven lines surrounding the location you’re in at this moment. You can also use ll (longlist), which prints the whole source code for the current function or frame.

The rest is the same as before, as you’ve entered the pdb mode, which we discussed above. The obvious advantage of using the breakpoint() function is the ability to stop the program exactly where you want. Frankly, I use breakpoint() in almost all my debugging sessions.
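Because breakpoint() is an ordinary function call, it can be switched off without editing the code: setting the PYTHONBREAKPOINT environment variable to 0 turns every call into a no-op. Under the hood, breakpoint() calls sys.breakpointhook(), which opens pdb by default. The sketch below swaps in a recording hook to show that delegation without opening an interactive session; record_hit and hits are names invented for this illustration.

```python
import sys


def y(x, alpha, beta):
    breakpoint()  # delegates to sys.breakpointhook(); by default that opens pdb
    return [(xi + alpha) / (1 + beta) for xi in x]


hits = []


def record_hit(*args, **kwargs):
    """Hypothetical replacement hook: note the call instead of starting pdb."""
    hits.append("breakpoint reached")


original_hook = sys.breakpointhook
sys.breakpointhook = record_hit
try:
    result = y([1, 2, 3], 4, 0)
finally:
    sys.breakpointhook = original_hook  # restore the normal pdb behavior

print(hits, result)
```

With the original hook restored, the same call to y() would stop in pdb right before the return statement.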

An object gone missing?

You may encounter a strange situation, although it’s strange only for those who do not know how to manage it. Sometimes, you may find pdb behaving in a very peculiar way: although it will see local variables, it will… not see these local variables.

Sounds like total nonsense? Let me explain. Consider this very simple function:

def y(x, alpha, beta):
    return [(xi + alpha)/(1 + beta) for xi in x]

It calculates values of a simple model for a list of values, x, given two model parameters, alpha and beta. For example:

>>> def y(x, alpha, beta):
...     return [(xi + alpha)/(1 + beta) for xi in x]
... 
>>> x = [1, 2, 3]
>>> y(x, .25, 0)
[1.25, 2.25, 3.25]

Now imagine you would like to get inside the function and check the function yourself, for a number of x lists. You can certainly do it with pdb’s help:

>>> def y(x, alpha, beta):
...     breakpoint()
...     return [(xi + alpha)/(1 + beta) for xi in x]
... 
>>> y(x, .25, 0)
> <stdin>(3)y()
(Pdb) alpha, beta
(0.25, 0)
(Pdb) [(xi + alpha)/(1 + beta) for xi in x]
*** NameError: name 'alpha' is not defined

What? What has just happened? How come pdb does not see alpha? Didn’t it just see it? It did, in this very line:

(Pdb) alpha, beta
(0.25, 0)

So, it sees alpha and beta — but it does not see them?

Maybe we should assign values to these variables once more? Let’s check this out:

(Pdb) alpha = .25; beta = 0
(Pdb) alpha
0.25
(Pdb) [(xi + alpha)/(1 + beta) for xi in x]
*** NameError: name 'alpha' is not defined

No, this didn’t help at all.

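The problem is that list comprehensions (and other comprehensions, for that matter) have their own scope, and the function’s local variables are invisible there when the expression is typed at the pdb prompt. pdb evaluates what you type with the frame’s globals and locals passed as two separate dictionaries, and the body of the comprehension, which in the Python 3.9 used here runs as a nested function, looks up alpha in the globals only. Fortunately, you have a number of solutions for this, as shown below.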
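The effect can be reproduced outside pdb with eval(). The sketch below is a simplified model, not pdb’s actual code: frame_globals and frame_locals are made-up stand-ins for the module’s globals and the locals of y(). A generator expression is used because it is compiled as a nested function in every current Python version; as far as I know, newer interpreters (3.12 onward) inline list comprehensions, so the list version of this experiment may behave differently there.

```python
# Stand-ins for what the debugger sees in the stopped frame (assumed values).
frame_globals = {}
frame_locals = {"x": [1, 2, 3], "alpha": 0.25, "beta": 0}

expression = "list((xi + alpha) / (1 + beta) for xi in x)"

# Separate globals and locals: x is found (it is evaluated in the outer
# scope), but alpha inside the nested scope is looked up in globals only.
try:
    eval(expression, frame_globals, frame_locals)
except NameError as error:
    failure = str(error)

# A single merged namespace makes every name visible to the nested scope.
merged = {**frame_globals, **frame_locals}
result = eval(expression, merged)

print(failure)
print(result)
```

Merging the two namespaces is the idea behind the solutions that follow.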

The interactive mode

The interactive mode, actually, can be quite helpful in various situations. You can start it using the interact command inside the pdb shell:

(Pdb) interact
*interactive*
>>> [(xi + alpha)/(1 + beta) for xi in x]
[1.25, 2.25, 3.25]

As you can see, in the interactive mode the code works in a regular way.

Add the missing object(s) to globals

A particular object is missing, so simply add it to globals():

(Pdb) globals()['alpha'] = alpha
(Pdb) [(xi + alpha)/(1 + beta) for xi in x]
*** NameError: name 'beta' is not defined

As you see, pdb now sees alpha but it doesn’t see beta. One solution is to add it to globals() the same way we added alpha, but it’s no fun to add all the missing variables one by one; the next solution does the trick in just one command.

Add all locals to globals

Both locals() and globals() are dictionaries, so we can simply add the former to the latter. You can do so in the following way:

(Pdb) globals().update(locals())
(Pdb) [(xi + alpha)/(1 + beta) for xi in x]
[1.25, 2.25, 3.25]

I hope you enjoyed this article. While it doesn’t cover everything pdb offers, it does provide enough knowledge to use this debugger in most scenarios.

In more than five years of working with Python, I’ve noticed that few people use pdb to debug code. I don’t know why. IDE debuggers can indeed offer more, but pdb’s great strength is its availability in the Python standard library.

I am not sure this is something to be proud of, but I will be honest with you: pdb is the debugger of my choice. I practically do not use any other debuggers. I have never had any problems with it; on the contrary, it has helped in all my Python projects.

When I was experimenting with other debuggers, I did have various problems from time to time. Maybe it’s on me; maybe I did not practice them long enough to enjoy their power. That can be true — but I can say I have practiced pdb long enough to say that despite its simplicity, it can be a fantastic debugger.

Thanks for reading.