[Transfer Day 7] | How Functions Work

2018-05-04

"C++ Disassembly and Reverse Technology", Chapter 6, "How Functions Work": reading notes

  A function call in the Debug build:

call    func
func:
    push    ebp             ; save ebp
    mov     ebp, esp        ; set up the new stack frame
    sub     esp, 40h        ; move esp down to reserve stack space for local variables
    push    ...             ; save registers
    ...
    pop     ...             ; restore registers
    add     esp, 40h        ; release the local variable space
    cmp     ebp, esp        ; check stack balance
    call    __chkesp        ; call the stack balance check function
    mov     esp, ebp        ; restore esp
    pop     ebp             ; restore ebp
    ret

  __chkesp is a function specific to the Debug compilation options; it checks stack balance. In the Debug build it is called as each function exits: if ebp and esp differ at the cmp instruction, the stack is unbalanced and __chkesp reports the error.
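
  The arithmetic behind the check can be sketched in C++. This is only a model under stated assumptions: the Frame struct, the function names, and the leaked_bytes parameter are hypothetical and exist purely for illustration, and the register save/restore pair from the listing is left out because it cancels itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two registers involved in the check.
struct Frame {
    std::uint32_t esp;
    std::uint32_t ebp;
};

// Models "cmp ebp, esp": the stack is balanced only if esp is back
// at the frame base that was saved in ebp.
bool stack_is_balanced(const Frame& f) {
    return f.esp == f.ebp;
}

// Walks through the prologue/epilogue arithmetic of the listing.
// leaked_bytes simulates code in the body that leaves extra bytes
// on the stack (0 means well-behaved code).
bool run_debug_function(std::uint32_t entry_esp, std::uint32_t leaked_bytes) {
    Frame f{entry_esp, 0};
    f.esp -= 4;            // push ebp
    f.ebp = f.esp;         // mov  ebp, esp
    f.esp -= 0x40;         // sub  esp, 40h
    f.esp -= leaked_bytes; // body: e.g. an unmatched push
    f.esp += 0x40;         // add  esp, 40h
    return stack_is_balanced(f); // cmp ebp, esp / call __chkesp
}
```

  With leaked_bytes set to 0 the model reports a balanced stack; any non-zero value leaves esp below ebp, which is the situation the Debug check is meant to catch.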

  With the O2 option there is no stack balance check, and the compiler may also skip saving the environment and using ebp to hold the current stack base, so the code becomes more compact and efficient.

  [call instruction and retn instruction]

call instruction (subroutine call):
Intra-segment (near) transfer:
    push    eip             ; eip already points at the instruction after the call
    jmp     target location
Inter-segment (far) transfer:
    push    CS
    push    eip
    jmp     target location
retn/retf instruction:
Intra-segment return, retn:
    pop     eip
Inter-segment return, retf:
    pop     eip
    pop     CS
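
  The intra-segment case can be modelled in a few lines of C++. This is a sketch, not real CPU behavior in full: the Cpu struct and the function names are hypothetical, and only eip and the stack are tracked.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: an instruction pointer plus a stack of dwords.
struct Cpu {
    std::uint32_t eip = 0;
    std::vector<std::uint32_t> stack; // back() is the top of the stack
};

// Near call: push the address of the next instruction, then jump.
void near_call(Cpu& cpu, std::uint32_t next_instruction, std::uint32_t target) {
    cpu.stack.push_back(next_instruction); // push eip
    cpu.eip = target;                      // jmp  target
}

// Near return: pop the saved address back into eip.
void near_ret(Cpu& cpu) {
    cpu.eip = cpu.stack.back(); // pop eip
    cpu.stack.pop_back();
}
```

  After a near_call followed by a near_ret, eip holds the address of the instruction after the call and the stack is back to its earlier depth.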

  [Look at a piece of assembly code]

mov     ecx, 10h                ; set the repeat count: ecx = 0x10
mov     eax, 0CCCCCCCCh         ; fill value for the local variables: 0CCCCCCCCh
rep stos dword ptr [edi]        ; write eax to the memory at edi, 4 bytes at a time, ecx times

  rep is a prefix that repeats the string instruction it is attached to (stos here). The value of ecx is the number of repetitions and is decremented on each one. This sequence is generally used to initialize local variables.

  The stos instruction stores the value in eax at the destination address in edi and then advances edi by the operand size (4 bytes for a dword, when the direction flag is clear).
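
  As a rough C++ equivalent of the three instructions, assuming the direction flag is clear (the function name rep_stosd is made up for this sketch):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models "rep stos dword ptr [edi]" with the direction flag clear:
// store eax at [edi], advance edi by one dword, decrement ecx,
// and repeat until ecx reaches 0. Returns the final value of edi.
std::uint32_t* rep_stosd(std::uint32_t* edi, std::uint32_t eax, std::size_t ecx) {
    while (ecx != 0) {
        *edi = eax; // stos: write eax to the destination
        ++edi;      // edi advances by 4 bytes
        --ecx;      // rep: one repetition consumed
    }
    return edi;
}
```

  Filling 0x10 dwords covers 0x40 bytes, which matches the 40h of local variable space reserved in the Debug listing.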

 

  A look at the calling conventions:

  In assembly code, "ret xxxx" is usually used to release the stack space occupied by the parameters. When a function takes a variable number of parameters, the function itself cannot know how much stack space they occupy, so it cannot do the balancing; the caller has to do it instead. A calling convention fixes who balances the parameters and how the parameters are passed. There are three calling conventions in the VC++ environment: _cdecl, _stdcall, and _fastcall.

  * _cdecl: the default calling convention for C/C++. The caller balances the stack, so functions with a variable number of parameters can use it.

  * _stdcall: the callee balances the stack, so functions with a variable number of parameters cannot use it.

  * _fastcall: parameters are passed in registers and the callee balances the stack, so functions with a variable number of parameters cannot use it.

  printf, used so often in C, is a typical _cdecl function: because it accepts a variable number of parameters, it can only be called with the _cdecl convention.
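
  Who moves esp back can be shown with a small C++ model that tracks nothing but the stack pointer. Everything here (the Stack struct and the four functions) is hypothetical and written for illustration; it assumes every argument occupies 4 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: only the stack pointer is tracked.
struct Stack {
    std::uint32_t esp;
    void push() { esp -= 4; }
};

// _stdcall style: the callee ends with "ret n" and removes its own arguments.
void callee_stdcall(Stack& s, std::uint32_t arg_bytes) {
    s.esp += 4;         // ret pops the return address...
    s.esp += arg_bytes; // ...and "ret n" also discards n bytes of arguments
}

// _cdecl style: the callee ends with a plain "ret".
void callee_cdecl(Stack& s) {
    s.esp += 4; // only the return address is popped
}

// Returns esp after a complete _stdcall call with argc arguments.
std::uint32_t call_stdcall(Stack s, std::uint32_t argc) {
    for (std::uint32_t i = 0; i < argc; ++i) s.push(); // push arguments
    s.push();                                          // call pushes the return address
    callee_stdcall(s, 4 * argc);
    return s.esp;
}

// Returns esp after a complete _cdecl call with argc arguments.
std::uint32_t call_cdecl(Stack s, std::uint32_t argc) {
    for (std::uint32_t i = 0; i < argc; ++i) s.push(); // push arguments
    s.push();                                          // call pushes the return address
    callee_cdecl(s);
    s.esp += 4 * argc;                                 // caller: add esp, 4*argc
    return s.esp;
}
```

  Both paths return esp to its starting value. The difference is that callee_stdcall needs arg_bytes as a fixed constant in its own code, while in the _cdecl path only the caller, which knows how many arguments it pushed, does the adjustment.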

  When printf is called several times, the Debug build balances the stack after every call. With the O2 option, copy propagation is applied: the per-call balancing operations are merged and the stack top pointer esp is balanced once.
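
  The effect of merging can be counted in a short C++ model. It is a simplification with hypothetical names (Cleanup, balance_after_each_call, balance_once): each entry of arg_bytes stands for the argument bytes pushed for one _cdecl call, and the return address push and pop are ignored because they cancel inside each call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Result of a sequence of calls: how many "add esp, n" instructions
// were needed and where esp ended up.
struct Cleanup {
    int add_esp_instructions;
    std::uint32_t final_esp;
};

// Debug style: "add esp, n" after every _cdecl call.
Cleanup balance_after_each_call(std::uint32_t esp,
                                const std::vector<std::uint32_t>& arg_bytes) {
    int adds = 0;
    for (std::uint32_t n : arg_bytes) {
        esp -= n; // push the arguments for this call
        esp += n; // add esp, n
        ++adds;
    }
    return {adds, esp};
}

// O2 style: let the arguments pile up and balance once at the end.
Cleanup balance_once(std::uint32_t esp,
                     const std::vector<std::uint32_t>& arg_bytes) {
    std::uint32_t pending = 0;
    for (std::uint32_t n : arg_bytes) {
        esp -= n;     // push the arguments for this call
        pending += n; // remember how much is still on the stack
    }
    esp += pending;   // one "add esp, total"
    return {arg_bytes.empty() ? 0 : 1, esp};
}
```

  For three calls that push 8, 12, and 8 bytes, the first function needs three adjustments and the second needs one, and both leave esp where it started.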

  Analysis shows that _cdecl and _stdcall differ only in who balances the parameters; everything else is the same. After optimization, however, a _cdecl function called several times in the same scope is more efficient than a _stdcall one, because the caller can merge the balancing operations through copy propagation, whereas _stdcall balances the parameters inside the callee, where this optimization cannot be applied.

  Of the three conventions, _fastcall is the most efficient, because it is the only one that passes parameters in registers (ecx and edx). Stack space still has to be reserved for those parameters, though: the registers may be needed for other values while the function runs (for example, ecx is used by the local variable initialization shown earlier), which would overwrite the parameters. So the _fastcall parameters in ecx and edx are copied into reserved stack space in the local variable area, [ebp-xx], whereas the parameters of _cdecl and _stdcall are passed on the stack and sit beyond the return address, at [ebp+xx].
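
  The two kinds of parameter location can be pictured with a toy frame in C++. This is an illustration only: FrameMemory and fastcall_frame are hypothetical, the function is assumed to take three dword parameters, and the exact slots [ebp-4] and [ebp-8] are an assumption, since the notes only say [ebp-xx].

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical frame model: key = offset from ebp, value = dword stored there.
using FrameMemory = std::map<int, std::uint32_t>;

// Builds the frame a _fastcall function with three dword parameters might see:
// the first two arrive in ecx/edx and are spilled to local slots,
// the third was pushed by the caller.
FrameMemory fastcall_frame(std::uint32_t ecx, std::uint32_t edx,
                           std::uint32_t pushed_third) {
    FrameMemory frame;
    frame[8] = pushed_third; // stack parameter: past saved ebp (+0) and return address (+4)
    frame[-4] = ecx;         // mov [ebp-4], ecx : first register parameter saved
    frame[-8] = edx;         // mov [ebp-8], edx : second register parameter saved
    return frame;
}
```

  Once the register parameters sit at negative offsets, ecx and edx are free to be reused, for instance as the counter of a rep stos.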

  [Stack space on 64-bit platforms] Unlike x86 compilers, which explicitly push parameters onto the stack and pop them off, the x64 code generator reserves enough stack space up front for the call that needs the largest parameter area, and then reuses that same area to set up the parameters each time a subfunction is called.

  

  Using ebp or esp addressing:

  In most cases, code that addresses local variables through ebp is generated only when the O2 option is not used. This makes debugging and stack balance checking easier and makes the object code more readable.

  With the O2 option, these checks are omitted to make the program faster. In user code, as long as the stack top is stable, ebp is no longer needed: local variables can be addressed directly through esp, which frees one register. For example:

lea     eax, [esp+8+var_4]      ; esp + 8 here is equivalent to ebp
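
  The offset arithmetic in that operand can be written out in C++. This sketch assumes var_4 is a disassembler-style name for the frame offset -4, and esp_relative is a helper invented for the example.

```cpp
#include <cassert>

// Assumed meaning of the disassembler name: the local lives 4 bytes
// below the frame base.
constexpr int var_4 = -4;

// Converts a frame-relative offset into an esp-relative one.
// esp_delta is how far esp currently sits below the frame base.
constexpr int esp_relative(int esp_delta, int frame_offset) {
    return esp_delta + frame_offset;
}
```

  With a delta of 8, [esp+8+var_4] is simply [esp+4]. If the code pushes one more dword before the access, the delta grows to 12 and the same variable is found at [esp+8], which is why the stack top has to be tracked when ebp is not used.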

  

  Parameters of the function:

  C/C++ places these requirements on functions with variable-length parameters:

  * At least one parameter is required.

  * Every variable-length argument is passed as a dword. // Actually, I think no argument of any kind is passed as less than a dword: each push instruction pushes one machine word, that is, 32 or 64 bits.

  * Either one parameter must give the total number of arguments, or the last argument must act as an end marker.
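
  Both ways of telling the callee where the argument list ends can be sketched with the standard <cstdarg> macros. The function names and the choice of 0 as the end marker are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdarg>

// Style 1: the first parameter states how many values follow.
int sum_counted(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int); // read the next int argument
    }
    va_end(args);
    return total;
}

// Style 2: the list ends at a marker value (0 here, chosen for illustration).
int sum_until_zero(int first, ...) {
    va_list args;
    va_start(args, first);
    int total = 0;
    for (int value = first; value != 0; value = va_arg(args, int)) {
        total += value;
    }
    va_end(args);
    return total;
}
```

  A call such as sum_counted(3, 1, 2, 3) reads exactly three values, and sum_until_zero(4, 5, 6, 0) stops when it reaches the marker. In both cases the one fixed parameter is what va_start uses to find the rest, which is why at least one parameter is required.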

 

  The return value of the function:

  VC uses the eax register to hold the return value. Because the 32-bit eax register holds only 4 bytes, values larger than 4 bytes are returned by other means. In general, eax carries the return value only for basic types and user-defined types whose sizeof(type) is at most 4, floating-point types excepted.

  If a structure type is returned and the structure consists entirely of basic types, it is returned in registers and stored into a local variable of the caller after the return. (Later chapters analyze this; these notes are incomplete here.)
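
  The eax rule of thumb can be encoded as a compile-time check in C++. This only restates the rule from these notes and does not query a real compiler's behavior; returned_in_eax and the two structures are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Encodes the rule of thumb from the notes: a value comes back in eax
// when it is at most 4 bytes and is not a floating-point type.
template <typename T>
constexpr bool returned_in_eax() {
    return sizeof(T) <= 4 && !std::is_floating_point<T>::value;
}

struct TwoShorts { short a; short b; };    // hypothetical 4-byte structure
struct ThreeInts { int a; int b; int c; }; // hypothetical 12-byte structure
```

  Under this rule int, char, and the 4-byte TwoShorts qualify, while float, double, and the 12-byte ThreeInts do not.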

 

  

