How to use GDB to troubleshoot Python programs

2025-01-21 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

How do you use GDB to troubleshoot a misbehaving Python program? Many people who are new to this kind of debugging are unsure where to start, so this article walks through one real case, covering the causes of the problem and the fixes.

A team was writing Python code that spawns child processes, and they wanted to avoid leaving zombie processes behind. In practice, their Python program kept exiting unexpectedly. Their first plan was to debug the Python interpreter with GDB and find out where exit() was being called from. When I heard this, I felt the problem called for a different approach. While helping them troubleshoot the program, I ran into several other problems besides the original one.

To simplify the scene and make debugging easier, the original problem and the ones derived from it have been condensed into DebugPythonWithGDB_6.py and DebugPythonWithGDB_7.py.

$ vi DebugPythonWithGDB_6.py

#! /usr/bin/env python
# -*- encoding: utf-8 -*-

import sys, os, signal, subprocess, shlex, traceback

def on_SIGCHLD(signum, frame):
    print "[on_SIGCHLD"
    sys.stdout.write("signum = %u\n" % signum)
    traceback.print_stack(frame)
    print os.waitpid(-1, os.WNOHANG)
    """
    try:
        print os.waitpid(-1, os.WNOHANG)
    except OSError:
        sys.stdout.write('Line[%u]: OSError\n' % sys.exc_info()[2].tb_lineno)
    """
    print "on_SIGCHLD]"

def do_more(count):
    print '[do_more() begin %u]' % count
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    """
    #
    # There is a race condition here; these calls raise the probability of triggering the OSError exception
    #
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    """
    print '[do_more() end %u]' % count

def main(prog, args):
    if 0 == len(args):
        print 'Usage: %s' % prog
    else:
        sys.stdout.write("Parent = %u\n" % os.getpid())
        #
        # In this example Ctrl-C still does not work, even with the following line.
        #
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        #
        # signal.signal(signal.SIGCHLD, signal.SIG_IGN)
        #
        signal.signal(signal.SIGCHLD, on_SIGCHLD)
        #
        count = 0
        while True:
            #
            # In this example the parent is only a scheduling framework and does not need to talk to the
            # child, so there is no need to handle "stdin=None, stdout=None, stderr=None" specially.
            #
            child = subprocess.Popen \
            (
                #
                # Do not use args[0].split() directly; it does not behave as expected with single and
                # double quotes. Consider this example: ls -l "/tmp/non exist"
                #
                shlex.split(args[0]),
                #
                # All file descriptors except 0, 1 and 2 will be closed
                # before the child process is executed
                #
                close_fds = True,
                cwd = "/tmp"
            )
            sys.stdout.write("Child = %u\n" % child.pid)
            #
            # child.send_signal(signal.SIGTERM)
            # child.terminate()
            #
            child.kill()
            #
            # child.wait()
            #
            do_more(count)
            count += 1

if '__main__' == __name__:
    try:
        main(os.path.basename(sys.argv[0]), sys.argv[1:])
    except KeyboardInterrupt:
        pass
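The listing's comment about args[0].split() can be checked directly. The following is a minimal sketch, written for Python 3 (the script itself targets Python 2.7), that splits the example command line from that comment both ways. The variable names are only illustrative.

```python
import shlex

# Example command from the listing's comment: one argument contains a space.
command = 'ls -l "/tmp/non exist"'

naive = command.split()        # splits on whitespace only; quotes stay in the text
proper = shlex.split(command)  # applies shell-style quoting rules

print(naive)
print(proper)
```

With str.split() the quoted path is broken into two arguments that still carry the quote characters, which is not what Popen() should receive.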

$ python DebugPythonWithGDB_6.py 'python -c "import time;time.sleep(3600)"'
Parent = 10244
Child = 10245
[do_more() begin 0]
[on_SIGCHLD
signum = 17
  File "DebugPythonWithGDB_6.py", line 81, in <module>
    main(os.path.basename(sys.argv[0]), sys.argv[1:])
  File "DebugPythonWithGDB_6.py", line 76, in main
    do_more(count)
  File "DebugPythonWithGDB_6.py", line 20, in do_more
    print '[do_more() begin %u]' % count
(10245, 9)
on_SIGCHLD]
Child = 10246
[on_SIGCHLD
signum = 17
  File "DebugPythonWithGDB_6.py", line 81, in <module>
    main(os.path.basename(sys.argv[0]), sys.argv[1:])
  File "DebugPythonWithGDB_6.py", line 76, in main
    do_more(count)
  File "DebugPythonWithGDB_6.py", line 21, in do_more
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
Traceback (most recent call last):
  File "DebugPythonWithGDB_6.py", line 81, in <module>
    main(os.path.basename(sys.argv[0]), sys.argv[1:])
  File "DebugPythonWithGDB_6.py", line 76, in main
    do_more(count)
  File "DebugPythonWithGDB_6.py", line 21, in do_more
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
  File "DebugPythonWithGDB_6.py", line 10, in on_SIGCHLD
    print os.waitpid(-1, os.WNOHANG)
OSError: [Errno 10] No child processes

Execution enters on_SIGCHLD(), but os.waitpid() raises an OSError exception. According to the documentation, the exception is raised whenever the waitpid() system call returns -1: "An OSError is raised with the value of errno when the syscall returns -1."

waitpid() on child 10245 succeeded inside on_SIGCHLD(); the 9 in (10245, 9) means the process was killed by SIGKILL, as expected.

Child 10246 is the shell process spawned by os.system() in do_more(). When it exits, SIGCHLD is sent to parent 10244. By the time on_SIGCHLD() calls waitpid(), child 10246 has already been reaped by a wait*() call elsewhere and no longer exists, so the waitpid() system call returns -1 and the Python function os.waitpid() raises the exception. The whole sequence is fairly involved; in pseudocode it looks like this:

do_more()
    os.system()
        posix_system()                  // posixmodule.c
            __libc_system()             // weak_alias (__libc_system, system)
                do_system()             // sysdeps/posix/system.c
                    /*
                     * SIG_IGN
                     *
                     * Ctrl-C is temporarily disabled
                     */
                    sigaction(SIGINT, &sa, &intr)
                    /*
                     * Block the SIGCHLD signal
                     */
                    sigaddset(&sa.sa_mask, SIGCHLD)
                    sigprocmask(SIG_BLOCK, &sa.sa_mask, &omask)
                    fork()
                        Child process (10246)
                            /*
                             * Restore the original SIGINT disposition
                             */
                            sigaction(SIGINT, &intr, (struct sigaction *)NULL)
                            /*
                             * Run "sh -c ..."
                             */
                            execve()
                            [the shell child exits and SIGCHLD is sent to DebugPythonWithGDB_6.py]
                            [because SIGCHLD is blocked, it stays on the pending-signal list in the kernel]
                        Parent process (10244)
                            /*
                             * Synchronous call; it blocks. It is not called asynchronously from a signal handler.
                             *
                             * Child 10246 disappears completely once it has been reaped by wait*().
                             */
                            waitpid(pid, &status, 0)
                            /*
                             * Restore the original SIGINT disposition
                             */
                            sigaction(SIGINT, &intr, (struct sigaction *)NULL)
                            /*
                             * Unblock SIGCHLD
                             */
                            sigprocmask(SIG_SETMASK, &omask, (sigset_t *)NULL)
                            [SIGCHLD is unblocked]
                            [the C-level signal handler of DebugPythonWithGDB_6.py, signal_handler(), schedules a "deferred call" and returns]
                            [on_SIGCHLD() of DebugPythonWithGDB_6.py does not run yet, because the built-in os.system() has not returned]
    /*
     * After the built-in os.system() returns, parent 10244 processes the "deferred call" and invokes
     * the Python-level signal handler. This SIGCHLD was sent by child 10246.
     *
     * on_SIGCHLD() of DebugPythonWithGDB_6.py now runs
     */
    on_SIGCHLD()
        /*
         * Calls waitpid(-1, &status, WNOHANG) in an attempt to handle child 10246.
         *
         * Child 10246 has already been reaped by the waitpid(pid, &status, 0) above, so the system call
         * returns -1, which makes os.waitpid() raise an OSError exception.
         */
        os.waitpid(-1, os.WNOHANG)
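The sequence above can be reproduced in a few lines. This is a minimal sketch, assuming Python 3 on Linux (the article's scripts target Python 2.7) and a process that has no other children; the function name and the "true" command are illustrative choices, not taken from the original scripts.

```python
import os
import signal


def run_system_with_sigchld_handler(command="true"):
    """Run os.system() while a Python-level SIGCHLD handler is installed.

    Returns what the handler observed when it tried to reap a child.
    """
    outcome = []

    def on_sigchld(signum, frame):
        try:
            outcome.append(os.waitpid(-1, os.WNOHANG))
        except ChildProcessError:
            # errno ECHILD, i.e. "No child processes": the shell child was
            # already reaped by the waitpid() inside the C library's system().
            outcome.append("ECHILD")

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        # system() blocks SIGCHLD, waits for the shell, then unblocks it,
        # so the Python handler only runs after os.system() has returned.
        os.system(command)
    finally:
        signal.signal(signal.SIGCHLD, previous)
    return outcome


print(run_system_with_sigchld_handler())
```

In Python 3 the failure surfaces as ChildProcessError, a subclass of OSError, rather than the bare OSError shown in the Python 2.7 traceback.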

The sequence is this complicated because Python layers its own signal handling on top of the Linux signal mechanism, which is already complex by itself. See:

"2.50 debug the Python interpreter"

"22.0 Linux signal mechanism"

For this example, to keep DebugPythonWithGDB_6.py from being terminated by the OSError exception, it is enough to catch OSError around the os.waitpid() call in on_SIGCHLD():

def on_SIGCHLD(signum, frame):
    try:
        print os.waitpid(-1, os.WNOHANG)
    except OSError:
        sys.stdout.write('Line[%u]: OSError\n' % sys.exc_info()[2].tb_lineno)
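The same idea can be taken one step further. Since several children may exit while only one SIGCHLD is delivered, a handler can keep reaping until nothing is left. The following is a sketch under stated assumptions, not the author's code: it is written for Python 3, and reap_children() and on_sigchld() are hypothetical names.

```python
import os


def reap_children():
    """Reap every exited child without blocking.

    Returns a list of (pid, status) tuples, possibly empty.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # ECHILD: this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def on_sigchld(signum, frame):
    # Hypothetical handler: report whatever could be reaped, never raise.
    for pid, status in reap_children():
        print("reaped %d, status %d" % (pid, status))
```

It would be installed with signal.signal(signal.SIGCHLD, on_sigchld). Catching the exception inside the loop means a child already reaped by os.system() simply ends the loop instead of terminating the program.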

Some of the conclusions above came from dynamic debugging and some from static analysis. One might ask: why not break on the Python process's C-level signal handler and inspect the source of the SIGCHLD signal, to confirm that child 10246 is subject to a second reaping attempt? That was my first idea too, but it does not work, because Python's C-level signal handler, signal_handler(), is the original one-argument kind of handler, not the richer three-argument kind. Debugging the Python interpreter with GDB:

# gdb -q -ex "b *signal_handler" -ex r --args /usr/bin/python2.7-dbg DebugPythonWithGDB_6.py '/usr/bin/python2.7-dbg -c "import time;time.sleep(3600)"'
...
Breakpoint 1 at 0x8216f2d: file ../Modules/signalmodule.c, line 185.
Starting program: /usr/bin/python2.7-dbg DebugPythonWithGDB_6.py /usr/bin/python2.7-dbg\ -c\ \"import\ time\;time.sleep\(3600\)\"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
Parent = 10284
Child = 10288
[do_more() begin 0]
Child = 10289
Breakpoint 1, signal_handler (sig_num=17) at ../Modules/signalmodule.c:185
185     {
(gdb) py-bt
#10 Frame 0xb7c20034, for file DebugPythonWithGDB_6.py, line 21, in do_more (count=0)
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
#13 Frame 0xb7cb37dc, for file DebugPythonWithGDB_6.py, line 76, in main (prog='DebugPythonWithGDB_6.py', args=['/usr/bin/python2.7-dbg -c "import time;time.sleep(3600)"'], count=0, child=)
    do_more(count)
#16 Frame 0xb7cbe49c, for file DebugPythonWithGDB_6.py, line 81, in <module> ()
    main(os.path.basename(sys.argv[0]), sys.argv[1:])
(gdb) bt 7
#0  signal_handler (sig_num=17) at ../Modules/signalmodule.c:185
#1  <signal handler called>
#2  0xb7fdcd3c in __kernel_vsyscall ()
#3  0xb7db25eb in __sigprocmask (how=how@entry=2, set=0x0, set@entry=0xbffff0d4, oset=oset@entry=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:57
#4  0xb7dc2084 in do_system (line=line@entry=0xb7cbf9e4 "printf \"Child = %u\\n\" $$;/bin/sleep 1") at ../sysdeps/posix/system.c:161
#5  0xb7dc2380 in __libc_system (line=line@entry=0xb7cbf9e4 "printf \"Child = %u\\n\" $$;/bin/sleep 1") at ../sysdeps/posix/system.c:184
#6  0xb7fa9bfb in system (line=0xb7cbf9e4 "printf \"Child = %u\\n\" $$;/bin/sleep 1") at pt-system.c:28
(More stack frames follow...)

Frame #4 is at system.c:161, which is already past waitpid(pid, &status, 0): the call in progress is sigprocmask(SIG_SETMASK, &omask, (sigset_t *)NULL), which unblocks SIGCHLD. The memory layout at the breakpoint is as follows:

High memory addresses
fpstate          // ESP+0x2DC  output/x *(struct _fpstate *)($esp+0x2dc)
retcode          // ESP+0x2D4  x/3i $esp+0x2d4
extramask        // ESP+0x2D0  x/1wx $esp+0x2d0
fpstate_unused   // ESP+0x60   output/x *(struct _fpstate *)($esp+0x60)
sigcontext_ia32  // ESP+8      output/x *(struct sigcontext *)($esp+8)
sig              // ESP+4      signal number, the signal handler's parameter
pretcode         // ESP        RetAddr=__kernel_sigreturn
                 //            hexdump $esp 0x2dc
Low memory addresses

(gdb) x/2wa $esp
0xbfffea6c:     0xb7fdcd18      0x11
(gdb) x/3i $esp+0x2d4
0xbfffed40:     pop    eax
0xbfffed41:     mov    eax,0x77
0xbfffed46:     int    0x80
(gdb) output/x *(struct sigcontext *)($esp+8)
{
  gs = 0x33,
  __gsh = 0x0,
  fs = 0x0,
  __fsh = 0x0,
  es = 0x7b,
  __esh = 0x0,
  ds = 0x7b,
  __dsh = 0x0,
  edi = 0xb7f2a000,
  esi = 0x8,
  ebp = 0x1,
  esp = 0xbfffeff0,
  ebx = 0x2,
  edx = 0x0,
  ecx = 0xbffff0d4,
  eax = 0x0,
  trapno = 0x1,
  err = 0x0,
  eip = 0xb7fdcd3c,
  cs = 0x73,
  __csh = 0x0,
  eflags = 0x246,
  esp_at_signal = 0xbfffeff0,
  ss = 0x7b,
  __ssh = 0x0,
  fpstate = 0xbfffed50,
  oldmask = 0x0,
  cr2 = 0x0
}

Because this is a one-argument signal handler, no siginfo is available, so the sender of the signal cannot be determined from user space. My analysis, however, is that the sender here is child 10288, not child 10289. When 10288 generated SIGCHLD, the signal was already blocked and could only wait on the pending-signal list in the kernel. When 10289 later generated SIGCHLD, the corresponding bit in sigpending.signal was already set, so the SIGCHLD from 10289 was discarded and never entered the pending-signal list. Once SIGCHLD was unblocked, the SIGCHLD generated by 10288 was taken off the pending-signal list and delivered, and that is the delivery the breakpoint caught.

With the experiment and analysis above fully understood, it becomes clear that DebugPythonWithGDB_6.py contains a race condition. When the child created by subprocess.Popen() sends its SIGCHLD, the parent may be in one of two states:

1) before os.system() calls sigprocmask(SIG_BLOCK, &sa.sa_mask, &omask)

2) after os.system() calls sigprocmask(SIG_BLOCK, &sa.sa_mask, &omask)
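Both cases hinge on the fact that a standard signal such as SIGCHLD is not queued while it is blocked: at most one instance stays pending. The following sketch, assuming Python 3 on Linux, illustrates this with SIGUSR1 so that no child processes are needed. The function name is hypothetical, and the block/restore pattern mirrors the sigprocmask() calls in do_system(). Note that CPython's own deferred dispatch would also merge repeated signals, so the handler count illustrates the behaviour rather than proving it.

```python
import signal
import threading


def deliveries_after_unblock(signum=signal.SIGUSR1, sends=2):
    """Send `signum` several times while it is blocked, then unblock it.

    Returns (was_pending, handler_calls).
    """
    hits = []
    previous = signal.signal(signum, lambda s, f: hits.append(s))
    # Block the signal and remember the old mask, like SIG_BLOCK with &omask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        for _ in range(sends):
            # Thread-directed, so the signal stays pending on this thread.
            signal.pthread_kill(threading.get_ident(), signum)
        was_pending = signum in signal.sigpending()
    finally:
        # Restore the old mask, like SIG_SETMASK with &omask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    signal.signal(signum, previous)
    return was_pending, len(hits)


print(deliveries_after_unblock())
```

Two signals go in while the signal is blocked, yet the handler runs once after unblocking, which is the same reason only one SIGCHLD reaches on_SIGCHLD() in case 2).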

Case 1) triggers the OSError exception; case 2) does not.

Running

$ python DebugPythonWithGDB_6.py 'python -c "import time;time.sleep(3600)"'

sometimes terminates with an OSError exception and sometimes keeps looping. That difference is how the race condition shows itself.

Summary: suppose a Python-level signal handler is installed for SIGCHLD and it calls os.waitpid(-1, os.WNOHANG) to reap child processes. If os.system() is also called elsewhere, OSError must be caught around os.waitpid(). Mixing the two in this way is not recommended.

That concludes the analysis of waitpid(). Now for some other problems that surfaced during debugging.

I was surprised to find that in case 2) Ctrl-C could not terminate the program, even though I had called signal.signal(signal.SIGINT, signal.SIG_DFL). This is because do_system() calls:

sa.sa_handler = SIG_IGN;
sigaction(SIGINT, &sa, &intr);

This disables Ctrl-C until do_system() returns. With DebugPythonWithGDB_6.py running in case 2), inspect its signal dispositions:

# ps auwx | grep python
root     10355  0.0  0.5   8116  5812 pts/0    S+   15:57   0:00 python DebugPythonWithGDB_6.py python -c "import time;time.sleep(3600)"
root     10389  0.0  0.0      0     0 pts/0    Z+   15:57   0:00 [python] <defunct>
root     10393  0.0  0.0   2936   852 pts/1    R+   15:57   0:00 grep python
# stap -DMAXACTION=10000 -g /usr/share/doc/systemtap-doc/examples/process/psig.stp -x 10355
10355:  python
HUP      default
INT      ignored                // not the expected default
QUIT     ignored
ILL      default
TRAP     default
ABRT     default
BUS      default
FPE      default
KILL     default
USR1     default
SEGV     default
USR2     default
PIPE     ignored
ALRM     default
TERM     default
STKFLT   default
CHLD     blocked,caught 0x818a480 0
...
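The effect of that temporary SIG_IGN can also be observed without SystemTap. In this sketch, assuming Python 3 and a POSIX shell at sh, a shell command sends SIGINT back to its parent, once through os.system() and once through subprocess.call(). The helper name sigint_hits() is hypothetical.

```python
import os
import signal
import subprocess

# Run by the shell; $PPID expands to the PID of this Python process.
SEND_SIGINT_TO_PARENT = "kill -INT $PPID"


def sigint_hits(run):
    """Count how often a Python-level SIGINT handler runs while `run()` executes."""
    hits = []
    previous = signal.signal(signal.SIGINT, lambda s, f: hits.append(s))
    try:
        run()
    finally:
        signal.signal(signal.SIGINT, previous)
    return len(hits)


# Inside system() the parent ignores SIGINT, so the signal is discarded.
via_system = sigint_hits(lambda: os.system(SEND_SIGINT_TO_PARENT))
# subprocess does not touch the parent's SIGINT disposition.
via_subprocess = sigint_hits(
    lambda: subprocess.call(["sh", "-c", SEND_SIGINT_TO_PARENT]))

print(via_system, via_subprocess)
```

The SIGINT that arrives while os.system() is running never reaches the handler, which matches the behaviour seen with Ctrl-C in the article's script.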

The psig output shows SIGINT as ignored. In reality its disposition alternates between ignored and default, but the default window is so short that we have almost no chance of observing it.

$ vi DebugPythonWithGDB_7.py

#! /usr/bin/env python
# -*- encoding: utf-8 -*-

import sys, os, subprocess, shlex, traceback

def do_more(count):
    print '[do_more() begin %u]' % count
    os.system(r'printf "Child = %u\n" $$;/bin/sleep 1')
    print '[do_more() end %u]' % count

def main(prog, args):
    if 0 == len(args):
        print 'Usage: %s' % prog
    else:
        sys.stdout.write("Parent = %u\n" % os.getpid())
        count = 0
        while True:
            child = subprocess.Popen \
            (
                shlex.split(args[0]),
                close_fds = True,
                cwd = "/tmp"
            )
            sys.stdout.write("Child = %u\n" % child.pid)
            child.kill()
            do_more(count)
            count += 1

if '__main__' == __name__:
    try:
        main(os.path.basename(sys.argv[0]), sys.argv[1:])
    except KeyboardInterrupt:
        pass

$ python DebugPythonWithGDB_7.py 'python -c "import time;time.sleep(3600)"'

DebugPythonWithGDB_7.py never calls wait() explicitly, yet it keeps looping. I expected subprocess.Popen() to leave a pile of zombie processes behind. Watching the related processes from another terminal showed that there was only ever one zombie process at a time, and it was reaped quickly. This is strange enough that the only explanation is an implicit wait() somewhere.
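One quick way to probe that assumption before reaching for a debugger is sketched below. It assumes Python 3 on Linux with CPython's reference counting, so that `del` runs the Popen finalizer immediately; the function name is hypothetical, and the check relies on observed behaviour rather than on a documented guarantee.

```python
import os
import subprocess
import sys


def dropped_popen_is_reaped():
    """Return True if dropping a Popen object reaps its dead child."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(3600)"])
    pid = child.pid
    child.kill()
    # Wait until the child is dead, but leave it as a zombie (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    # Drop the last reference without ever calling wait() or poll().
    del child
    try:
        # If the zombie were still there, this call would reap it and return.
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return True  # ECHILD: something already reaped the child
    return False


print(dropped_popen_is_reaped())
```

If the function returns True, the child was reaped between the `del` and the explicit os.waitpid(), which points at the Popen object's finalizer as the place where the implicit wait happens.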

    # gdb -q -ex "b *waitpid" -ex r --args /usr/bin/python2.7-dbg DebugPythonWithGDB_7.py '/usr/bin/python2.7-dbg -c "import time;time.sleep(3600)"'
    ...
    Parent = 14508
    Child = 14512
    [do_more() begin 0]
    Child = 14513
    [do_more() end 0]

    Breakpoint 1, waitpid () at ../sysdeps/unix/syscall-template.S:81
    81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
    (gdb) py-bt
    #4 Frame 0xb7c21034, for file /usr/lib/python2.7/subprocess.py, line 1363, in _internal_poll (self=, _deadstate=2147483647, _waitpid=, _WNOHANG=1, _os_error=, _ECHILD=10)
        pid, sts = _waitpid(self.pid, _WNOHANG)
    #8 Frame 0xb7c6549c, for file /usr/lib/python2.7/subprocess.py, line 762, in __del__ (self=, _maxint=2147483647)
        self._internal_poll(_deadstate=_maxint)
    #18 Frame 0xb7cb37dc, for file DebugPythonWithGDB_7.py, line 22, in main (prog='DebugPythonWithGDB_7.py', args=['/usr/bin/python2.7-dbg -c "import time;time.sleep(3600)"'], count=1, child=)
        cwd = "/tmp"
    #21 Frame 0xb7cbe49c, for file DebugPythonWithGDB_7.py, line 31, in ()
        main(os.path.basename(sys.argv[0]), sys.argv[1:])
    (gdb) bt 9
    #0  waitpid () at ../sysdeps/unix/syscall-template.S:81
    #1  0x081f80a3 in posix_waitpid (self=0x0, args=(14512, 1)) at ../Modules/posixmodule.c:6207
    #2  0x080bc300 in PyCFunction_Call (func=, arg=(14512, 1), kw=0x0) at ../Objects/methodobject.c:81
    #3  0x08149d0b in call_function (pp_stack=0xbfffebd4, oparg=2) at ../Python/ceval.c:4033
    #4  0x081454ec in PyEval_EvalFrameEx (f=Frame 0xb7c21034, for file /usr/lib/python2.7/subprocess.py, line 1363, in _internal_poll (self=, _deadstate=2147483647, _waitpid=, _WNOHANG=1, _os_error=, _ECHILD=10), throwflag=0) at ../Python/ceval.c:2679
    #5  0x08147a77 in PyEval_EvalCodeEx (co=0xb7c60448, globals={'STDOUT': -2, '_has_poll': True, 'gc': , 'check_call': , 'mswindows': False, 'select': , 'list2cmdline': , '__all__': ['Popen', 'PIPE', 'STDOUT', 'call', 'check_call', 'check_output', 'CalledProcessError'], 'errno': , '_demo_posix': , '__package__': None, 'PIPE': -1, '_cleanup': , '_eintr_retry_call': , 'call': , '__doc__': 'subprocess - Subprocesses with accessible I/O streams\n\nThis module allows you to spawn processes, connect to their\ninput/output/error pipes, and obtain their return codes.  This module\nintends to replace several older modules and functions:\n\nos.system\nos.spawn*\nos.popen*\npopen2.*\ncommands.*\n\nInformation about how the subprocess module can be used to replace these\nmodules and functions can be found below.\n\...(truncated), locals=0x0, args=0xb7c655e4, argcount=1, kws=0xb7c655e8, kwcount=1, defs=0xb7c73a20, defcount=5, closure=0x0) at ../Python/ceval.c:3265
    #6  0x0814a1e5 in fast_function (func=, pp_stack=0xbfffeef4, n=3, na=1, nk=1) at ../Python/ceval.c:4129
    #7  0x08149e93 in call_function (pp_stack=0xbfffeef4, oparg=256) at ../Python/ceval.c:4054
    #8  0x081454ec in PyEval_EvalFrameEx (f=Frame 0xb7c6549c, for file /usr/lib/python2.7/subprocess.py, line 762, in __del__ (self=, _maxint=2147483647), throwflag=0) at ../Python/ceval.c:2679
    (More stack frames follow...)

The call stack shows that waitpid() is reached from the Popen object's __del__() method by way of _internal_poll().

See /usr/lib/python2.7/subprocess.py:1363:

    try:
        _waitpid(self.pid, _WNOHANG)
        ...
    except _os_error as e:
        ...
        if e.errno == _ECHILD:
            #
            # This happens if SIGCLD is set to be ignored or waiting for child
            # processes has otherwise been disabled for our process. This
            # child is dead, we can't get the status.
            #
            # http://bugs.python.org/issue15756
            #
            ...
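The ECHILD case mentioned in that comment can be reproduced in isolation. The following is a minimal sketch, not code from the article: it assumes Linux semantics for an ignored SIGCHLD, is written for Python 3 rather than the article's 2.7, and the function name waitpid_errno_with_sigchld_ignored is made up for illustration.

```python
import errno
import os
import signal


def waitpid_errno_with_sigchld_ignored():
    """Fork a child while SIGCHLD is ignored; return the errno raised by waitpid().

    Hypothetical helper for illustration only. Returns None if waitpid()
    unexpectedly succeeds.
    """
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately without running cleanup
        try:
            # Assumption (Linux): with SIGCHLD ignored the kernel reaps the
            # child itself, so this call blocks until the child is gone and
            # then fails because nothing is left to wait for.
            os.waitpid(pid, 0)
        except OSError as e:
            return e.errno
        return None
    finally:
        # Restore whatever disposition was active before the experiment.
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print(waitpid_errno_with_sigchld_ignored() == errno.ECHILD)
```

This is the same error that the except branch in _internal_poll() is written to absorb.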

This situation arises when one of the following was executed before calling subprocess.Popen():

    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    signal.signal(signal.SIGCHLD, on_SIGCHLD)
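The second case, a Python-level SIGCHLD handler that reaps children itself, can be sketched as follows. This is an illustrative Python 3 example under stated assumptions, not the article's DebugPythonWithGDB_7.py: the names status_stolen_by_handler and on_sigchld are hypothetical, and the fallback to a return code of 0 reflects how CPython's subprocess module handles ECHILD in poll().

```python
import os
import signal
import subprocess
import sys
import time


def status_stolen_by_handler(exit_code=3, timeout=10.0):
    """Return (exit code seen by the handler, returncode seen by Popen.poll()).

    Hypothetical helper for illustration only.
    """
    reaped = {}  # pid -> raw wait status collected by the handler

    def on_sigchld(signum, frame):
        # Same idea as the article's on_SIGCHLD(): reap whichever child exited.
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            return  # no children left to wait for
        if pid != 0:
            reaped[pid] = status

    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        child = subprocess.Popen(
            [sys.executable, "-c", "import sys; sys.exit(%d)" % exit_code])
        deadline = time.time() + timeout
        # Wait until the handler has reaped our child (or the guard expires).
        while child.pid not in reaped and time.time() < deadline:
            time.sleep(0.01)
        # The handler already consumed the exit status, so the waitpid() inside
        # poll() fails with ECHILD and Popen cannot learn the real exit code.
        popen_view = child.poll()
    finally:
        signal.signal(signal.SIGCHLD, previous)
    if child.pid not in reaped:
        return None, popen_view
    return os.WEXITSTATUS(reaped[child.pid]), popen_view


if __name__ == "__main__":
    print(status_stolen_by_handler())
```

The handler sees the child's real exit code, while the Popen object is left with a placeholder, which is the "we can't get the status" situation described in the subprocess.py comment.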

When _internal_poll() calls _waitpid(), the call sits inside a try block that catches _os_error, precisely to handle the possibility above.

subprocess.Popen() is not a built-in function; it is implemented in Python and executes as many Python virtual machine (PVM) instructions. os.system(), by contrast, is a built-in function invoked by a single PVM instruction. Python-level signal handlers run only between PVM instructions, so while Popen() is executing, the handler on_SIGCHLD() has a good chance of running and reaping the child first. The later _waitpid() call in _internal_poll() is then likely to raise _os_error.

The destructor of a Popen object calls wait*() automatically, so the child process is reaped automatically when the Popen object goes out of scope. In Python, only def, class, lambda, and global change the scope of a variable. The following constructs do not:

    if/elif/else
    try/except/finally
    for/while
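A short sketch makes the scoping rule concrete. It is an illustration only: the function and variable names are invented, and the code is plain Python that runs on both 2.7 and 3.

```python
def block_scope_demo():
    """Names bound inside for/if/try blocks stay visible in the function."""
    for i in range(3):
        inside_loop = i            # bound inside a for block
    if True:
        inside_if = "set in if"    # bound inside an if block
    try:
        inside_try = "set in try"  # bound inside a try block
    finally:
        pass
    # All three names are ordinary locals of block_scope_demo().
    return inside_loop, inside_if, inside_try


def function_scope_demo():
    """A nested def creates a new scope, so its locals do not leak out."""
    def inner():
        hidden = 1  # local to inner() only
        return hidden

    inner()
    return "hidden" in locals()  # is 'hidden' a local of the outer function?


if __name__ == "__main__":
    print(block_scope_demo())
    print(function_scope_demo())
```

Because blocks do not end a variable's lifetime, an object bound inside a while loop is not destroyed merely because the loop body finishes an iteration.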

In DebugPythonWithGDB_7.py the Popen object lives in the scope of main(). The child variable never leaves that scope, but each reassignment of child drops the last reference to the previous Popen object and triggers its destructor. This explains why there is only ever one zombie process at a time. Taken together, the original and derived problems show once again that knowing how to debug the Python interpreter itself lets you quickly resolve many program faults that look mysterious but are in fact basic.
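The effect of rebinding a name can be reproduced without any child processes. This is a minimal sketch that assumes CPython's reference counting, where an object is destroyed as soon as its last reference disappears; the Tracked class and rebinding_demo function are hypothetical stand-ins for the Popen object and main().

```python
class Tracked(object):
    """Stand-in for a Popen object: records its name when destroyed."""

    log = []  # names of destroyed instances, in order of destruction

    def __init__(self, name):
        self.name = name

    def __del__(self):
        Tracked.log.append(self.name)


def rebinding_demo():
    """Rebind one local name twice and report what was destroyed in between."""
    del Tracked.log[:]          # start from an empty log
    child = Tracked("first")
    child = Tracked("second")   # "first" loses its last reference here
    # Snapshot taken while "second" is still alive and still in scope.
    return list(Tracked.log)


if __name__ == "__main__":
    print(rebinding_demo())
```

Under that assumption the snapshot contains only the first object: its destructor ran at the moment of reassignment, even though the name child never left the function's scope.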
