If the suggestions are unclear, use "man" to find out more about the commands.
pgrep -n a.out
pkill -9 a.out
where you replace a.out by the name of your binary. Also, consider
pgrep with -o, and with no flags.
When compiling with gcc or g++, I use the following flags:
gcc -g3 -O0 FILE.c
The -O0 flag sets optimization level 0. Otherwise, the compiler will
intermix assembly instructions from different source-level statements.
The -g3 flag says to include debugging information, including values
of C/C++ macros (level 3 debugging). The simpler -g flag will not save
the values of your C macros.
gdb --args <COMMAND STRING>
(gdb) info threads # What threads are there?
(gdb) thread 2 # Show me thread number 2
(gdb) where 5 # Show me the innermost (most recent) 5 call frames of the stack for this thread
(gdb) where -5 # Show me the outermost 5 call frames of the stack
(gdb) frame 3 # Show me call frame number 3
(gdb) print x # Show me the local variable x of call frame number 3
(gdb) list
(gdb) macro expand MACRO_CALL(ARGS)
Within gdb, do help COMMAND
to see the meanings of
the commands.
(gdb) apropos kernel-calls
(or whatever).
GDB can call a function of your program directly:
(gdb) print my_utility_len_linked_list(mylist)
This also works on system calls:
(gdb) print lseek(4, 0, 1)
This finds the current file offset for file descriptor number 4,
where the final 1 corresponds to SEEK_CUR, whose value is
found from grep -R SEEK_CUR /usr/include .
(A final 0, SEEK_SET, would instead reset the offset to the start of the file.)
Similarly, you can find out what fd 4 corresponds to:
(gdb) set $pid = getpid()
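(gdb) eval "shell ls -l /proc/%d/fd", $pid
NOTE: getpid() can be called only after
the gdb run command (after the target is running). The eval form is used
because the shell command does not expand GDB convenience variables such as $pid.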
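The whence argument of lseek decides whether the call reports the offset or moves it. The following is a minimal Python sketch, assuming a Linux system; os.lseek wraps the same system call, and the helper names current_offset and demo are hypothetical, chosen only for illustration.

```python
import os
import tempfile

def current_offset(fd):
    """Report the offset of fd without moving it (offset 0 relative to SEEK_CUR)."""
    return os.lseek(fd, 0, os.SEEK_CUR)

def demo():
    """Write 11 bytes, then compare a SEEK_CUR query with a SEEK_SET call."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello world")
        f.flush()                                # push the bytes to the kernel
        fd = f.fileno()
        before = current_offset(fd)              # query only
        rewound = os.lseek(fd, 0, os.SEEK_SET)   # moves the offset to the start
        after = current_offset(fd)
        return before, rewound, after

if __name__ == "__main__":
    print(demo())
```

Note that a call with SEEK_SET changes the state of the process being debugged, while a call with SEEK_CUR and offset 0 does not.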
Sometimes, where will show you the file and
line number of a call frame, but GDB list will not
show you the source code. This is because GDB has a default directory
search path that does not yet include the directory for the current
call frame. The solution is:
(gdb) dir PREFIX_DIR_PATH
As an example, suppose libc.so crashes on you. And the call frame
shows that you are at line 100 in glibc: "io/readlink.c:100"
% ldd /bin/ls
...
libc.so.6 => /lib64/libc.so.6 (0x00007f9ab5d8f000)
% ls -l /lib64/libc.so.6
lrwxrwxrwx. 1 root root 12 Dec 23 11:42 /lib64/libc.so.6 -> libc-2.17.so
% wget .../glibc-2.17.tar.gz # From GLIBC downloads on the web
% tar xf glibc-2.17.tar.gz
And now you are ready to see the source code of the call frame
inside glibc, to help in debugging:
(gdb) dir DOWNLOAD_DIR/glibc-2.17
(gdb) list io/readlink.c:100 # Should now show source code for io/readlink.c:100
set follow-fork-mode child
will follow the child process on fork. Using parent instead will have it
follow the parent process.
break fork
will have GDB stop before executing the fork system call.
An alternative to set follow-fork-mode is
set detach-on-fork off
This will suspend either the parent or child (whichever process
you are not debugging). Then use info inferiors
and inferior INFERIOR_NUMBER to
choose which process to debug currently. (This extends the
hierarchy of what to debug by using:
"frame NUMBER",
"thread NUMBER",
"inferior NUMBER".)
(gdb) break _exit
(gdb) break exit
gdb a.out <PID>
gdb a.out
(gdb) attach <PID>
(where the attach command is given within gdb). A convenient
single command that finds the PID is:
gdb a.out `pgrep a.out | tail -1`
echo 0 > /proc/sys/kernel/yama/ptrace_scope
Alternatively, add
prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);
to the source code of your target early
(e.g., in main or in a constructor like
DmtcpWorker::DmtcpWorker()), and re-compile.
{volatile int x = 1; while(x);}
Then do "gdb attach" and
(gdb) print x=0
An alternate form to stop at the 5th occurrence is:
{static volatile int x = 1; if (x++ >= 5) while(x);}
(gdb) display/5i $pc
followed by stepi (si) or nexti (ni). To set a breakpoint in
assembly, try:
(gdb) break *0xbfdea000
for a breakpoint
at the given address.
(gdb) info threads
(gdb) thread <NUM>
(gdb) thread apply all where full
(gdb) set scheduler-locking on
ulimit -c unlimited
Then: gdb a.out core
ls core.*
or on some machines, either:
cat /proc/sys/kernel/core_pattern
to find where
the core file was saved; or
coredumpctl
to access the core file.
(gdb) set breakpoint always-inserted on
See: help set breakpoint always-inserted
(gdb) set detach-on-fork off
(gdb) info inferiors
(gdb) inferior <NUM>
(gdb) break 'dmtcp::FileConnection::doLocki<TAB>
(doLocki in this example):
(gdb) info functions doLocki
info symbol address, and info types regexp, and
info functions regexp (as above), and info variables regexp.
print getpid()
). The corresponding
technique in C++ is:
(gdb) info functions methodName
this, the current object.
handle SIGCHLD stop
print $_siginfo # if GDB stops at signal
(gdb) # Diagnose signal handlers (Here, the SIGCHLD macro is: 17)
call malloc(sizeof(struct sigaction))
call __sigaction(17, 0, $1)
print *((struct sigaction *)$1)
help catch
to stop at fork, exec, exceptions,
etc.
gdb -x gdbinit a.out
(gdb) source gdbinit
Save the commands in gdbinit, instead of remembering the sequence
of commands.
(I use the name gdbinit instead of .gdbinit (the
GDB initialization file), because
I don't like hidden files, but you can also do
ln -s .gdbinit gdbinit
to "unhide" the hidden file.)
Use
set history save on
to capture
the GDB commands from your previous session, and copy them into
gdbinit.
define procmaps
python gdb.execute("shell cat /proc/" + str(gdb.selected_inferior().pid) + "/maps")
end
And execute:
(gdb) source mygdbinit
(gdb) procmaps
Or alternatively, integrate a new GDB command. In mygdbinit:
python
class Procmaps(gdb.Command):
    """procmaps (using the Python API)"""
    def __init__(self):
        super(Procmaps, self).__init__("procmaps", gdb.COMMAND_USER)
        self.dont_repeat()
    def invoke(self, arg, from_tty):
        # argv = gdb.string_to_argv(arg) # not needed here
        gdb.execute("shell cat /proc/" +
                    str(gdb.selected_inferior().pid) + "/maps")
Procmaps()
end
And execute:
(gdb) source mygdbinit
(gdb) help procmaps
(gdb) procm (auto-complete the GDB command)
strace -o myoutput a.out
(trace system calls
based on kernel API: /usr/include/asm/unistd*.h ;
decide in advance if it should trace all child processes or not;
the flags -f and -ff exist for tracing
parent and all children)
ltrace -o myoutput a.out
(not as useful as
strace, but sometimes interesting: trace library
calls instead of system calls).
ps auxw | grep a.out
pstree -pu $USER
or pstree -lu $USER
(tree of processes and child
process; names in curly braces are additional threads);
Note idioms like: pstree -p | grep -C2 a.out
top
and htop
(and you can use things
like strace
directly from inside htop)
man iostat
man vmstat
for
disk/file I/O (Blk_read/s / Blk_wrtn/s),
and paging to disk (bi/bo/id), respectively. A local disk (not
on the network; SANs are different) can sequentially
read or write (not both at once) roughly at a rate
from 50 MB/s to 100 MB/s.
If you are mostly accessing files and you don't
see that bandwidth, then your program is not efficient. If you are
paging to disk and you do see a rate
anywhere near that bandwidth, then you are using too much RAM.
watch -d ls -l /tmp/myfile.txt
watch -d "pstree -l | grep -A1 `basename $SHELL`"
(watch -d COMMAND repeatedly executes COMMAND and highlights
the differences between successive outputs)
find dmtcp| xargs grep SUBSTRING
grep -r SUBSTRING
grep -C3 ...; grep -A5 ...; (and so on.)
grep and google are your friends when
searching for information. Besides "grep'ing" through source code,
here is a grep trick you may not have seen:
find /usr/share/man/man3 | xargs zgrep MYSTRING
find /usr/share/man/man3 | xargs gzip -dc | grep -C3 MYSTRING
less /proc/PID/maps
ls -l /proc/PID/fd
lsof | grep a.out
If you discover an interesting socket with SOCKET_ID
through 'lsof' or 'ls -l /proc/PID/fd', then find
the other end of the socket:
lsof | grep SOCKET_ID
lsof | grep TCP
(or else: lsof | grep TCP | grep $USER)
See man lsof for other information
on sockets, ports, etc.
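The file descriptor listing above can also be gathered from a script. The following is a minimal Python sketch, assuming a Linux /proc filesystem and assuming that SOCKET_ID refers to the number N shown in a link target of the form socket:[N]. The function names socket_inode, list_fds and sockets_of are hypothetical.

```python
import os
import re

def socket_inode(target):
    """Return N if target looks like 'socket:[N]', else None."""
    m = re.fullmatch(r"socket:\[(\d+)\]", target)
    return int(m.group(1)) if m else None

def list_fds(pid="self"):
    """Map each open fd of a process to its link target, like ls -l /proc/PID/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed between listdir and readlink
    return result

def sockets_of(pid="self"):
    """Return {fd: N} for the fds of a process that are sockets."""
    found = {}
    for fd, target in list_fds(pid).items():
        inode = socket_inode(target)
        if inode is not None:
            found[fd] = inode
    return found

if __name__ == "__main__":
    print(sockets_of())
```

The number returned for a socket is what you would then search for in the lsof output.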
cat /proc/PID/environ | tr '\0' '\n'
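The environ file separates entries with NUL bytes, which is why the command above pipes it through tr. The same parsing can be sketched in Python, assuming a Linux /proc filesystem; parse_environ and read_environ are hypothetical helper names.

```python
def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/PID/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty piece after the final NUL
        name, _, value = entry.partition(b"=")  # split at the first '=' only
        env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env

def read_environ(pid="self"):
    """Read and parse the environment of a process (needs permission to read it)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())

if __name__ == "__main__":
    for name, value in sorted(read_environ().items()):
        print(f"{name}={value}")
```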
man proc
, or at
https://www.kernel.org/doc/Documentation/filesystems/proc.txt
The /dev
directory can be informative. For example,
compare: ls /dev
with: cat /proc/cpuinfo
man sys
,
man sysctl
, and
https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt
mount
,
and also consider /etc/fstab
to see the default
filesystems that are mounted over your root filesystem (over '/').
ls -l /dev/pts | grep $USER
tty
)
nm -D a.out
(or nm -D library.so) exists for
seeing all of the dynamic symbols in the ELF symbol table.
(ELF specifies a static symbol table (extended using gcc -g),
which is used by GDB, but can be stripped out with strip.
ELF also specifies a separate dynamic symbol table so that if a base
executable calls "foo", the runtime loader can search
the "library search path" to find a matching definition of "foo"
in some library.)
touch myfile.c && make -n myfile.o
Note also the form nm -o for printing out filenames. This
can be useful with brute force strategies:
strings -a a.out
(for some binary, a.out)
cpp -I. -Iother/path/to/include/files myfile.c
and you can see the expanded C or C++ code with no #include files.
This often makes it easier to find the syntax error.
cpp -dM -I. -Iother/path/to/include/files myfile.c
cpp -fdirectives-only
exists to limit processing to #define
and other directives, without expanding macros.
gcc -E
will stop after the preprocessing stage, and
before the compilation stage.
rm myfile.o; make myfile.o
and copy the command line used by 'make' to build myfile.o.
If 'make' uses libtool, you may also have to remove hidden
directories with names like .libs .
touch myfile.c && make -n myfile.o
man ld.so
env LD_DEBUG=help a.out
env LD_DEBUG=files a.out
(and try other options to LD_DEBUG)
ldd a.out
(for some binary, a.out)
pushd /proc/PID;
ls -l exe;
echo -n "cmdline: "; cat -v cmdline;
echo ""; cat -v environ; echo "";
popd
./configure --enable-debug; make clean; make
and then run and look at
/tmp/dmtcp-USER@HOST/jassert*
files
for your value of USER and HOST. Before
your next test, rm -rf /tmp/dmtcp-USER@HOST .
list 'dmtcp::myC<TAB>
info functions substring
to discover
the full signature in C++.
*(int *)__errno_location()
env LD_LIBRARY_PATH=/usr/lib/debug dmtcp_checkpoint a.out
(Presumably, after you checkpoint, the restarted a.out process
will be using the pre-checkpoint libraries and hence the
debugging versions. So, probably you don't need to
use env LD_LIBRARY_PATH=/usr/lib/debug for the restart
command. But if you're unsure, it doesn't hurt.)
add-symbol-file FILE ADDR
where FILE is the full pathname you identified in the
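/proc/PID/maps file. The ADDR
will be the hexadecimal sum of the starting address of FILE in
/proc/PID/maps and the address of the .text section in FILE,
as shown by either of:
readelf -S FILE
objdump -h FILE
GDB can compute the sum for you:
(gdb) p/x addr1 + addr2
where addr1 and addr2 are the two addresses we discussed. If those
addresses are in hexadecimal, make sure to include 0x
at the beginning of each hexadecimal number.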
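The sum can also be computed outside of GDB. The following is a minimal Python sketch under two assumptions: that the first address is the lowest start address of FILE among the lines of /proc/PID/maps, and that the second is the Addr column of the .text line printed by readelf -S. The function names are hypothetical.

```python
def load_base(maps_text, path):
    """Lowest start address among the mappings of path in /proc/PID/maps text."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # range, perms, offset, dev, inode, path
        if len(fields) == 6 and fields[5].strip() == path:
            starts.append(int(fields[0].split("-")[0], 16))
    if not starts:
        raise ValueError("no mapping found for " + path)
    return min(starts)

def text_section_address(readelf_text):
    """Addr column of the .text line in the output of readelf -S."""
    for line in readelf_text.splitlines():
        fields = line.replace("[", " ").replace("]", " ").split()
        # fields: section number, name, type, address, offset, ...
        if len(fields) >= 4 and fields[1] == ".text":
            return int(fields[3], 16)
    raise ValueError("no .text section found")

def add_symbol_file_command(maps_text, readelf_text, path):
    """Build the add-symbol-file command line for GDB."""
    addr = load_base(maps_text, path) + text_section_address(readelf_text)
    return f"add-symbol-file {path} {addr:#x}"
```

In practice, maps_text would come from reading /proc/PID/maps and readelf_text from running readelf -S on the same file.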
ldd /bin/ls | grep libc.so; ls -l /lib64/libc.so.6
to find libc and its version number (assuming that
ldd
points you to /lib64
).
ld --wrap=symbol
to create wrapper
functions when statically linking some .o files together.
This plays tricks with the ELF symbol tables to create
new symbols, __wrap_symbol
and
__real_symbol
. See man ld
for
more information.
objdump -S a.out > a.out.listing
where a.out should be replaced by your binary.
For a more verbose form, try one of:
gcc -c -g -Wa,-alh,-L file.c > file.s
gcc -c -g -Wa,-ahls=file.s file.c
Variations of this can also produce assembly code that can be
directly assembled by gcc or by as.
For example, if you want to modify and re-compile the source
code for libc.so,
this is normally quite painful. A nice trick is to disassemble
libc.so into assembly, and then cut or copy out the particular assembly
routines that you want to assemble into a modified library.
-fsanitize
is available. Search
for -fsanitize
in
Options for Debugging Your Program or GCC. This alternative
to valgrind will do some of the same things as valgrind
(detect memory leaks, detect races, array bounds checking,
enum checking, etc.)
catchsegv COMMAND_LINE
man backtrace
It mangles any
C++ names, but they are mostly readable (and utilities exist
for demangling the names). Read the notes of
man backtrace .
(For example, compile with gcc -rdynamic
to get
symbol names.) Also, note man addr2line.
backtrace.c
, for this course.
/proc/<PID>/maps
.
Use addr2line
to translate hex addresses into
line numbers in source code. (If it's a .so dynamic library,
give it the offset: the hex address minus the beginning
library address as shown by /proc/<PID>/maps.)
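The offset for a shared library can be computed from the text of /proc/<PID>/maps. The following is a minimal Python sketch, assuming that the "beginning library address" is the lowest start address among the mappings of that library. The function names parse_maps and library_offset are hypothetical.

```python
def parse_maps(maps_text):
    """Return (start, end, path) for each line of /proc/PID/maps that has a path."""
    mappings = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # range, perms, offset, dev, inode, path
        if len(fields) < 6:
            continue  # anonymous mapping, no file name
        start_s, end_s = fields[0].split("-")
        mappings.append((int(start_s, 16), int(end_s, 16), fields[5].strip()))
    return mappings

def library_offset(maps_text, address):
    """Find the file that contains address, and the offset to pass to addr2line."""
    mappings = parse_maps(maps_text)
    for start, end, path in mappings:
        if start <= address < end:
            base = min(s for s, _, p in mappings if p == path)
            return path, address - base
    raise ValueError("address is not inside any file-backed mapping")
```

The result would then be used as: addr2line -C -e PATH OFFSET, with OFFSET written in hexadecimal.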
less /proc/PID/maps
readelf -S a.out | grep '\.text '
SHELL% nm a.out | grep main
080483e4 T main
(As described in the 'man' pages, 't', 'T', 'd', 'D', 'b', 'B', 'U'
tell you if the symbol is in text, data, bss, or undefined
(presumably defined in a different library). Lower case means
file-private, and upper case means a globally visible symbol.
Look up __attribute__ ((visibility ("hidden"))) for
declaring a symbol library-private:
globally visible within a .o file, but file-private
within the .so (library) file.)
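The meaning of the type letters can be captured in a few lines. The following is a minimal Python sketch that classifies one line of nm output using only the letters listed above; any other letter is reported as "other". The function name classify_nm_line is hypothetical.

```python
SECTIONS = {"t": "text", "d": "data", "b": "bss", "u": "undefined"}

def classify_nm_line(line):
    """Split one line of nm output into name, address, section and visibility."""
    fields = line.split()
    if len(fields) == 3:
        address = int(fields[0], 16)
        letter, name = fields[1], fields[2]
    elif len(fields) == 2:
        address = None  # undefined symbols are printed without an address
        letter, name = fields
    else:
        raise ValueError("unexpected nm line: " + repr(line))
    return {
        "name": name,
        "address": address,
        "section": SECTIONS.get(letter.lower(), "other"),
        "global": letter.isupper(),  # upper case: globally visible
    }

if __name__ == "__main__":
    print(classify_nm_line("080483e4 T main"))
```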
(gdb) help add-symbol-file
along with the above information will allow you to tell gdb
at what address in RAM the executable or library file on disk
was loaded. The file on disk contains the symbol information.
(gdb) shell utils/gdb-add-symbol-file
SHELL% nm a.out
...
080483e4 T main
SHELL% addr2line -C -e /tmp/a.out 080483e4
/tmp/tmp.c:2
sudo dmidecode -t help