(Deprecated) Debugging Python Applications with the gdb Python Debugging Extension in gdb4hpc
NOTE: These commands are deprecated in favor of python mode. Python mode has more functionality and fewer requirements. See the Debugging Python Applications with Python Mode document for more info.
Background
gdb is a debugger for compiled binaries, not interpreted languages like Python. This means that when debugging a Python application with gdb, gdb is debugging the Python interpreter internals and not the Python application code. For people developing Python applications, this isn’t very useful.
In the source of the CPython interpreter, there is a gdb extension that gdb can load to help translate data and state internal to the Python interpreter into data and state relevant to the Python source code.
By loading the extension in a gdb session, gdb gains Python equivalents to the gdb backtrace, list, up, down, info locals, and print commands, as well as pretty printers for internal Python data types.
gdb4hpc supports using this extension by loading it into the gdb instances attached to each rank and providing aggregated versions of the commands provided by the extension.
See https://docs.python.org/3/howto/gdb_helpers.html for more info about the gdb extension for debugging the Python interpreter.
Requirements
gdb4hpc will automatically load the Python extension for you, but the following conditions need to be met for the extension to work.
The Python interpreter being debugged is the CPython interpreter
This is true in most situations. CPython is the interpreter in your Linux distribution’s Python installation, in virtual environments, in Conda environments, and in cray-python. This list is not exhaustive. If you don’t have evidence suggesting otherwise, you are probably using CPython.
The Python interpreter being debugged has debug information available
Unfortunately, this is not true by default in most situations.
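One quick way to confirm the first requirement is to ask the interpreter itself. The following is a minimal sketch using only the standard library; the helper name is_cpython is made up for this illustration and is not part of gdb4hpc.

```python
import platform
import sys


def is_cpython() -> bool:
    """Return True when the running interpreter identifies itself as CPython."""
    # platform.python_implementation() returns a string such as
    # "CPython" or "PyPy".
    return platform.python_implementation() == "CPython"


# Show which interpreter was checked so the result can be matched
# against the path you plan to debug.
print(platform.python_implementation(), sys.version.split()[0], sys.executable)
print("CPython:", is_cpython())
```

Run it with the same python3 that will run your application (for example, after loading the same modules or activating the same environment), since different environments can supply different interpreters.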
How to Check for Debug Info
To check if your Python interpreter has debug info, run gdb $(which python3) in a terminal and check the output for messages like “no debugging symbols found”.
You can also run file $(which python3) and look for “with debug_info”.
$ file -L $(which python3)
/usr/bin/python3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e84005b12f8e1e707681c3c55d5631f67299312f, for GNU/Linux 3.2.0, with debug_info, not stripped
If the output of the file command contains symbolic link to python, use the -L flag to tell file to follow links before checking for debug info.
Note that using file is not 100% reliable because there are multiple ways to provide debug info for the Python interpreter, and file only checks one of them: debug info embedded in the binary itself. It does not detect debug info that is loaded dynamically from a standalone debug info file.
If no debug info is found with the above methods, proceed to How to Get Debug Info below.
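The file check can also be scripted, which is convenient when several interpreters need to be compared. This is a sketch only: the function names are made up for this illustration, it assumes the file utility is installed, and it shares the limitation described above (it only sees debug info embedded in the binary).

```python
import subprocess
import sys


def has_embedded_debug_info(file_output: str) -> bool:
    """Return True if output from the `file` command reports embedded debug info."""
    return "with debug_info" in file_output


def describe_interpreter(path: str = sys.executable) -> str:
    """Run `file -L` on an interpreter path and return its output.

    -L makes `file` follow symbolic links, as recommended above.
    Raises FileNotFoundError if the `file` utility is not installed.
    """
    result = subprocess.run(
        ["file", "-L", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, has_embedded_debug_info(describe_interpreter()) checks the interpreter that is running the script. A False result does not prove debug info is unavailable, because a standalone debug info file may still be present.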
How to Get Debug Info
If you don’t have debug info available for your Python interpreter, there are a few approaches you can take to get it.
Install the Python Debug Info Package
If you are using your distribution’s Python interpreter (e.g. /usr/bin/python3), most distributions have a package called python-dbg or python-debuginfo that you can install. You will need admin privileges to install it, or you will need to request that a system administrator install it.
Once the package is installed, gdb4hpc will automatically detect the debug info and use it.
Note that installing the package only supplies debug info for the specific Python interpreter at /usr/bin/python3 that comes with your Linux distribution. For info about getting debug info for other Python interpreters (cray-python, Python compiled from source, custom interpreters in Conda environments, etc.), see the other sections below.
Use cray-python
The Cray Programming Environment comes with a CPython interpreter that is pre-built with some HPC libraries (mpi4py, numpy, etc.). Some installations of cray-python include debug info.
To load cray-python and check if it has debug info available, run the following:
$ module load cray-python
$ which python3
/opt/cray/pe/python/3.11.7/bin/python3
$ gdb $(which python3)
...
Reading symbols from /opt/cray/pe/python/3.11.7/bin/python3...
If you don’t see any messages indicating debug symbols are missing (e.g. “no debugging symbols found”), your cray-python installation has debug info.
Compile Python from Source with Debug Info
If none of the above options are available, you can build Python from source. Building Python from source is fairly straightforward and outlined here: https://devguide.python.org/getting-started/setup-building/#unix
The important part is adding --with-pydebug to the ./configure call. This has performance implications for the Python interpreter as it adds extra checks and assertions.
When you build Python, you also build its package manager, pip3. You can use the pip3 installed in the same location as your built Python to install libraries to the newly built Python instance.
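After building, you may want to confirm that the interpreter you are running really is the --with-pydebug build and not another python3 earlier on your PATH. A minimal sketch, assuming a Unix build where the Py_DEBUG configuration variable is recorded; the helper name is_pydebug_build is made up for this illustration.

```python
import sys
import sysconfig


def is_pydebug_build() -> bool:
    """Return True if this interpreter was configured with --with-pydebug."""
    # Py_DEBUG is recorded in the build configuration; it is 1 for
    # --with-pydebug builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_DEBUG"))


print(sys.executable)
print("pydebug build:", is_pydebug_build())
```

Note that this only reports whether the build used --with-pydebug. An interpreter can have debug info without being a pydebug build, so a False result here does not mean debug info is missing.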
Connecting to a Python Application in gdb4hpc
Via launch
Find the path of your current Python interpreter:
$ which python3
/opt/cray/pe/python/3.11.7/bin/python3
Use it in the launch command in gdb4hpc to launch your Python application.
Pass the name of your Python script to the launched Python interpreter with the -a option. This adds command line arguments to the launched job. The launch command below is the equivalent of srun /opt/cray/pe/python/3.11.7/bin/python3 my-python-app.py.
Use --non-mpi to tell gdb4hpc that the binary (the Python interpreter) does not contain a call to MPI_Init. You can still debug applications that use MPI in Python code (e.g. mpi4py); the --non-mpi flag applies specifically to the Python interpreter that runs the mpi4py code.
dbg all> launch $pyapp{3} /opt/cray/pe/python/3.11.7/bin/python3 -a "my-python-app.py" --non-mpi
Using --non-mpi and debugging Python interpreters are not supported on systems with the ALPS or PALS workload managers. The attach command is still available on these systems.
Via attach
Launch your Python app as you would normally (e.g. with srun or sbatch):
$ srun -n4 python3 ./main4.py
Next, obtain your application’s job ID and attach to your job with gdb4hpc.
On a Slurm System
Use squeue -s to find the job ID.
$ squeue -s
STEPID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
9431671.0 allnodes python3 hpe R 0:01 1 n023
The job ID is in the STEPID column. Start gdb4hpc and attach to your application.
dbg all> attach $pyapp 9431671.0
On a Flux System
Use flux jobs to find the job ID.
$ flux jobs
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ2BzGNARhpB user python3 R 2 1 3.021s node1
Start gdb4hpc and attach to your application.
dbg all> attach $pyapp ƒ2BzGNARhpB
Depending on the environment, Flux job IDs might display with ƒ (Unicode U+0192), not f (ASCII). This can cause problems in older terminal emulators when copying and pasting. Set the environment variable FLUX_F58_FORCE_ASCII to 1 to force flux jobs to use an ASCII f.
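If a job ID has already been copied with the Unicode prefix, it can also be rewritten by hand or in a script. The helper below is a hypothetical illustration, not part of Flux or gdb4hpc; it assumes the only difference between the two forms is the leading character, which matches the description above but is worth verifying on your system.

```python
FLUX_UNICODE_PREFIX = "\u0192"  # the "ƒ" character shown by flux jobs


def ascii_flux_jobid(jobid: str) -> str:
    """Replace a leading Unicode ƒ in a Flux job ID with an ASCII f."""
    if jobid.startswith(FLUX_UNICODE_PREFIX):
        return "f" + jobid[len(FLUX_UNICODE_PREFIX):]
    # IDs that already use the ASCII prefix are returned unchanged.
    return jobid


print(ascii_flux_jobid("\u01922BzGNARhpB"))
```

Setting FLUX_F58_FORCE_ASCII remains the simpler option when you control the environment in which flux jobs runs.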
Obtaining the Job ID on Other Systems
For other workload managers, check the help text of the attach command in gdb4hpc via the help attach command.
Commands
The Python extension adds pretty printers for Python types and the following commands.
maint pyext bt
maint pyext bt is the Python equivalent of gdb4hpc’s bt / backtrace commands. It shows all ranks’ backtraces merged into a single tree:
dbg all> maint pyext bt
a{0..2}: #2 <module> at main4.py:48
a{0..2}: #1 main at main4.py:39
|
|_ a{0,2}: #0 scatter at main4.py:18
|
\_ a{1}: #1 scatter at main4.py:17
a{1}: #0 sleep
This backtrace shows that all ranks started at <module> (basically Python’s _start equivalent), then diverged in the scatter function. Rank 1 is currently in sleep.
maint pyext list
maint pyext list is the Python equivalent of gdb4hpc’s list command.
dbg all> maint pyext list
a{0,2}:
15 def scatter(nums):
16 if world_rank == 1:
17 time.sleep(5)
>18 return comm.scatter(nums)
19
20 def get_avg(a):
21 avg = 0.0
22 for n in a:
a{1}:
15 def scatter(nums):
16 if world_rank == 1:
>17 time.sleep(5)
18 return comm.scatter(nums)
19
20 def get_avg(a):
21 avg = 0.0
22 for n in a:
maint pyext list takes range arguments similar to gdb4hpc’s list command that allow you to list a specific range.
dbg all> maint pyext list 10,12
10 if world_rank == 0:
11 return [[j * nums_per_rank + i for i in range(nums_per_rank)] for j in range(world_size)]
12 else:
maint pyext print and maint pyext locals
maint pyext print and maint pyext locals are the Python equivalents of gdb4hpc’s print and info locals commands.
maint pyext print can’t evaluate arbitrary expressions like gdb4hpc’s print command can. It can look up the value of local, global, and builtin names.
maint pyext locals shows the local variables in the current Python frame and their values.
maint pyext up and maint pyext down
maint pyext up and maint pyext down are the Python equivalents of gdb4hpc’s up and down commands. They navigate the Python call stack and affect the results of maint pyext print and maint pyext locals. For example, if you wanted to print a variable in the top-level function of your Python application, you could repeatedly use maint pyext up to climb up the stack to the top-level frame, then use maint pyext locals.
Hybrid Applications
The maint pyext commands provided by the extension only display information about Python code. If you are debugging a hybrid application, use the standard versions of the commands (without the maint pyext prefix) to debug the native code as you usually would.
For example, maint pyext bt can be used to show the Python backtrace, but it won’t include any frames from native code. If native code is being run, the usual backtrace command can be used to see it.
The usual backtrace also includes the frames of the Python interpreter itself, and there can be many of them. You can pass an argument to backtrace to limit the number of frames that are displayed. Native extension frames will be at the bottom of the backtrace stack, so limiting the number of frames works nicely to isolate them.
dbg all> backtrace 5
app{0}: #4 step_one at src/C++/cpp_bt.cpp:62
app{0}: #3 step_two at src/C++/cpp_bt.cpp:67
app{0}: #2 rare_branch at src/C++/cpp_bt.cpp:54
app{0}: #1 step_three at src/C++/cpp_bt.cpp:73
app{0}: #0 merge at src/C++/cpp_bt.cpp:58
app{1..4}: #4 loop at src/C++/cpp_bt.cpp:86
app{1..4}: #3 step_one at src/C++/cpp_bt.cpp:62
app{1..4}: #2 step_two at src/C++/cpp_bt.cpp:69
app{1..4}: #1 step_three at src/C++/cpp_bt.cpp:73
app{1..4}: #0 merge at src/C++/cpp_bt.cpp:58
Loading Your Own Extensions
You can use gdb4hpc’s gdbmode feature to directly control gdb4hpc’s many instances of gdb. If you want to load your own extensions, you can use the source command in gdb mode.
gdb4hpc automatically does this for you and loads its included Python debugging extension. If you have your own extension, or a newer version that you want to use, you can load it like this:
dbg all> gdbmode
Entering gdb pass-thru mode. Type "end" to exit mode...
gdbmode> source path-to-my-python-extension.py
gdbmode> end
Or equivalently:
dbg all> gdb source path-to-my-python-extension.py