Locating problems on Linux

1. CPU working principle

2. Linux memory allocation

3. Stack

1) The stack is where local variables, function parameters, and function return values are stored.
2) Each thread's stack space is contiguous, and the stacks of different threads are independent of each other.
3) Use x/100a $esp to view the raw data in stack memory (on x86-64, use $rsp, or gdb's generic $sp).

3.1 Function call process

This part covers how a function call organizes its data on the stack.

4. Heap: three-level heap management

To learn more, read "glibc memory management ptmalloc source code analysis.pdf" and the heapdump source code.

5. Information collection

A core file stores thread information, stack space, the entire heap, and global and local variables. All of this information can be extracted with the gdb commands listed in this section.

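5.1 View memory

x/100x <address>

5.2 View the call stack

x/100a $esp

5.3 View local variables

info locals

5.4 View global variables

p <global variable name>

5.5 View a class instance

info symbol *<class instance address>

p *(<class name> *) <class instance address>

For a polymorphic class compiled with GCC, the first word of an instance is normally its vtable pointer, so info symbol on the dereferenced address reveals which class the instance belongs to. The second command then prints the instance's members.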

5.6 View STL containers

plist, pmap, …

These helper commands are expected to come from the .gdbinit file loaded in 6.2.

5.7 View heap memory allocation

heapdump -a

6. Troubleshooting example

Troubleshooting is a process of analyzing and integrating information. You combine the information-collection methods introduced above, dig toward the root of the problem step by step, and peel away one layer at a time. It takes constant trial, hypothesis, and verification. This is demanding mental work that requires a great deal of patience and familiarity with the structure of the source code. Locating a deep-seated problem sometimes takes several months of repeated reasoning and verification.

The program crashes occasionally. What next?

6.1 Examine the clues in the core file

1) The core file is about 200 MB, which suggests that the crash was not caused by a memory leak.

2) The process ID is 5030, so log entries can be correlated with the crash through this PID.

3) The error type is 11, a segmentation fault (SIGSEGV), which means an invalid memory access.

6.2 Prepare the environment and open the core file with gdb

1) Install the dss7016_tools.tar.gz toolkit so that gdb (7.6) and heapdump are available.

2) Open the core file with gdb UMTS.exe ../UMTS_5030-1437863062-11.core -x .gdbinit. Loading .gdbinit ensures that STL containers can be parsed correctly.

3) Use info sharedlibrary to check whether the shared libraries were loaded correctly.

6.3 Confirm the immediate cause of the crash

1) bt shows that the crash happened in the CZString::CZString constructor.

2) Use disassemble to see the instruction at which the crash occurred.

3) Use info registers to view the register values.

4) The immediate cause is confirmed: in mov edx,[eax+0x4], the value of eax is wrong, so the memory read faults.

6.4 Why is the value of eax wrong? Working backwards (first difficulty: the optimized stack)

1) Why is the value of the parameter __x wrong?

See stl_map.h and stl_tree.h.

2) Find the value of __x.

3) Now it can be confirmed that the value in this map is wrong.

4) Why is the value in the map wrong? This is the hardest part: there is no routine procedure to follow for this check.

5) The root cause: a CLiveChannel instance had already been released, yet the lock inside it was still being operated on. This corrupted the map structure used by thread 1 and caused the segmentation fault.


Origin blog.csdn.net/huapeng_guo/article/details/132494333