Taking on the Goose Factory (Tencent), OS Series (1): System Calls and How Programs Run

I was bored at home and couldn't just sleep until the next interview, so I spent the afternoon digging into the questions I answered badly in the last one: operating systems and compilation. Follow-up posts will cover process switching and other things that happen inside the operating system. Rote memorization is too dull, and if the interviewer asks about anything slightly different it's GG. Once you actually understand what happens, you can have a real conversation with the interviewer; otherwise all you get is three seconds of awkward silence. While I'm at it, I'm rereading CSAPP, which I didn't understand as a freshman, and I plan to use April at home to learn more.

I originally meant to finish this series in the afternoon (so the layout wouldn't be a mess), but a senior who works at a big company sent me chicken feet that evening and held up my writing.

I've read plenty of blogs before, and I find a question-and-answer style more fun; at least I can get through it myself. Interviews also test whether you can draw diagrams, so I draw them when I blog too. I'll do my best to write up the interview questions, partly to record my own thinking and partly as lessons learned. My ability and knowledge are limited, so if something is wrong, please correct me, haha.

First, let's talk about system calls

When a running process needs to do something like read or write a file, it has to ask the kernel. From the OS course we reflexively know that, because this is I/O, the process may go from running to blocked. Going one level deeper, an interrupt (a trap) is raised at this point. It makes the system save some of the current user process's register state on the kernel stack, so that execution can later resume from the point of interruption, and then run the interrupt service routine. Here, the interrupt service routine is the system call handler.

So here is the question: the kernel implements many system calls, so how does it know which one the process wants? The process passes a system call number, which is placed in the %eax register. On 32-bit x86 Linux, the interrupt that triggers a system call is int $0x80.
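To make the "system call number" idea concrete without writing assembly, here is a minimal sketch, assuming Linux with glibc. syscall() and SYS_getpid come from <unistd.h> and <sys/syscall.h>; the helper names pid_via_wrapper and pid_via_number are made up for this illustration. I use getpid rather than time here because SYS_time is not defined on every architecture.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid: the system call number */
#include <unistd.h>        /* syscall(), getpid() */

/* Ask the kernel for our process ID through the usual libc wrapper. */
long pid_via_wrapper(void) {
    return (long)getpid();
}

/* Ask for the same thing by handing over the system call number ourselves.
   syscall() loads the number into the register the kernel expects and traps. */
long pid_via_number(void) {
    return syscall(SYS_getpid);
}
```

Both helpers should return the same value, since the wrapper ends up issuing the same numbered call. Note that on 64-bit x86 the trap is the syscall instruction with the number in %rax, not int $0x80.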

For example, to print the current time in C or C++ we can use library functions.

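#include <stdio.h>
#include <time.h>

int main(){
    time_t tt;
    struct tm *t;
    tt=time(NULL);          /* seconds since the epoch */
    t=localtime(&tt);       /* broken-down local time */
    /* tm_year counts from 1900 and tm_mon from 0, so adjust both */
    printf("time:%d:%d:%d:%d:%d:%d\n",t->tm_year+1900, t->tm_mon+1, t->tm_mday, t->tm_hour, t->tm_min, t->tm_sec);
    return 0;
}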

// The structure looks like this:
struct tm {
   int tm_sec;         /* seconds, normally 0 to 59 (60 for a leap second) */
   int tm_min;         /* minutes, 0 to 59                                 */
   int tm_hour;        /* hours, 0 to 23                                   */
   int tm_mday;        /* day of the month, 1 to 31                        */
   int tm_mon;         /* month, 0 to 11                                   */
   int tm_year;        /* years since 1900                                 */
   int tm_wday;        /* day of the week, 0 to 6                          */
   int tm_yday;        /* day of the year, 0 to 365                        */
   int tm_isdst;       /* daylight saving time flag                        */
};

If we make the system call in assembly instead, we can see that the system call number really is placed in the %eax register, by mov $0xd,%%eax (the system call number of time() is 13, i.e. 0xd). Executing int $0x80 then makes the system run the time() system call.

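/* parts omitted ... (32-bit x86 Linux only) */
time_t tt;
struct tm *t;
asm volatile(
    "mov $0,%%ebx\n\t"      /* first argument: NULL */
    "mov $0xd,%%eax\n\t"    /* system call number 13 = time() */
    "int $0x80\n\t"         /* trap into the kernel */
    "mov %%eax,%0\n\t"      /* the result comes back in %eax */
    : "=m" (tt)
    :
    : "eax", "ebx"          /* registers this snippet overwrites */
);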

Second, how the program works

This came up in my Tencent director-round interview, so I'm recording it here.

Interviewer: So, what has to happen before a project can run?

Me: Compile it, um, and check for errors.

Interviewer: Is there anything before compilation?

Me: ???

I had never paid attention to this part, so all I said was compilation. Afterwards I asked a classmate, and the answer should have been preprocessing (pre-compilation), compilation, assembly, and linking. Knowing the general process is not enough; you have to get to the bottom of what actually happens when a program executes. Keep asking until you reach the bottom of the pot (haha), and you avoid the awkward silence in the interview.

Just reciting that makes me feel like I'm only memorizing for the interviewer, so let's understand it properly. Here are the classic steps in running a program:

  • Once the operating system has created the process, it hands control to the program's entry point, which is usually an entry function in the runtime library.

  • The entry function initializes the runtime library and the program's execution environment, including the heap, I/O, threads, and the construction of global variables.

  • When initialization is complete, the entry function calls main, and the main body of the program starts executing.

  • When main finishes, control returns to the entry function, which cleans up: it destroys global variables, tears down the heap, and closes I/O. It then makes a system call to end the process.
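The four steps above can be sketched as a toy model. This is not how any real runtime library is written; entry, program_main, record and entry_trace are hypothetical names, and the "initialization" and "cleanup" are just entries in a trace string so the order is visible.

```c
#include <string.h>

static char trace[64];                  /* records the order of events */

/* Append one step to the trace, never overflowing the buffer. */
static void record(const char *step) {
    strncat(trace, step, sizeof trace - strlen(trace) - 1);
}

/* Stand-in for the user's main(): does its work and returns a status. */
int program_main(void) {
    record("main ");
    return 7;
}

/* Hypothetical entry function: initialize, call main, clean up. */
int entry(int (*main_fn)(void)) {
    trace[0] = '\0';
    record("init ");                    /* heap, I/O, threads, globals */
    int status = main_fn();             /* the program proper */
    record("cleanup ");                 /* destroy globals, close I/O */
    return status;                      /* a real entry would now exit */
}

/* Lets a caller inspect the recorded order. */
const char *entry_trace(void) {
    return trace;
}
```

Calling entry(program_main) returns main's status, and entry_trace() then shows that initialization ran before main and cleanup after it.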

This looks easy enough to understand, but how it relates to preprocessing, compilation, assembly, and linking is left for the next article.

Next comes a detailed walkthrough using assembly instructions and the changes in memory. Time for an example again: a simple addition. Everyone can write this code, but what is going on underneath?

int add(int a,int b){
    return a+b;
}
int main(){
    int a=1,b=2;
    int c=add(a,b);
}

Assembly code is as follows (only selected parts):

[Two images of the assembly listing; not available in this copy]

Let's start with an interview question from Tencent.

Interviewer: Do you know how a process's memory is laid out? Draw it for me.

Me: Here you go, it's drawn (swish, swish, done).

Interviewer: Yes, that's the answer I wanted. Now tell me which end is the low address and which is the high address, and which way the stack grows.

Me: Uh, what??? (I reluctantly came up with something.)

When reviewing the difference between processes and threads, you will remember that each process has its own memory space. The figure below shows what it contains.

[Figure: memory layout of a process; image not available]

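The local variables in the program live on the stack. In the figures, top to bottom runs from high addresses to low addresses, and the stack grows toward low addresses. Suppose the stack starts out as shown; %esp (the stack pointer) and %ebp (the frame pointer) are registers, and both are marked in the figure to show what they are for. Judging from the numbers used below, both start at address 96 in this example.
[Figure: the initial stack, with %esp and %ebp marked; image not available]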
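One way to check the growth direction for yourself is to compare the address of a local variable in a caller with one in its callee. This is a minimal sketch, assuming GCC or Clang (for the noinline attribute) and a platform where each call gets a fresh frame on the ordinary stack; the function names are made up for illustration.

```c
#include <stdint.h>

/* noinline (a GCC/Clang attribute) keeps the callee in its own stack frame. */
__attribute__((noinline))
int callee_is_lower(uintptr_t caller_addr) {
    volatile int callee_local = 0;
    return (uintptr_t)&callee_local < caller_addr;
}

/* Returns 1 if the callee's local sits at a lower address than the caller's. */
int stack_grows_down(void) {
    volatile int caller_local = 0;
    int lower = callee_is_lower((uintptr_t)&caller_local);
    caller_local = lower;   /* keeps caller_local alive across the call */
    return caller_local;
}
```

On x86 Linux I would expect stack_grows_down() to return 1, matching the "high address at the top, stack grows downward" picture, but the C standard itself promises nothing about this.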

We start with the code in main. The changes in the stack and the assembly instructions correspond as follows.

[Figure: stack changes alongside main's assembly instructions; image not available]

The subl $24,%esp instruction subtracts 24 from the stack pointer (%esp), so it now points to address 72. Next come two similar instructions:

movl $1,-12(%ebp)
movl $2,-8(%ebp)

These store the numbers 1 and 2 at offsets -12 and -8 from %ebp: with %ebp at 96, 1 goes to address 84 and 2 goes to address 88.

The next four instructions are very similar to each other, so they are listed together and then explained.
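The address arithmetic can be replayed in a few lines of C. This is only a toy model of the three instructions, assuming %esp and %ebp both start at 96 as the numbers in the text imply; Frame, store32, load32 and main_prologue are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

enum { STACK_TOP = 96 };            /* starting %esp and %ebp in the example */

typedef struct {
    uint32_t esp, ebp;
    uint8_t  mem[STACK_TOP + 4];    /* toy memory, indexed by byte address */
} Frame;

/* movl $value, addr : store a 32-bit value at a byte address. */
void store32(Frame *f, uint32_t addr, int32_t value) {
    memcpy(&f->mem[addr], &value, sizeof value);
}

/* Read back the 32-bit value stored at a byte address. */
int32_t load32(const Frame *f, uint32_t addr) {
    int32_t value;
    memcpy(&value, &f->mem[addr], sizeof value);
    return value;
}

/* Replays the start of main() from the walkthrough. */
Frame main_prologue(void) {
    Frame f = { .esp = STACK_TOP, .ebp = STACK_TOP };
    f.esp -= 24;                    /* subl $24,%esp      : 96 - 24 = 72 */
    store32(&f, f.ebp - 12, 1);     /* movl $1,-12(%ebp)  : address 84   */
    store32(&f, f.ebp - 8, 2);      /* movl $2,-8(%ebp)   : address 88   */
    return f;
}
```

After main_prologue(), esp is 72 and the words at 84 and 88 hold 1 and 2, the same numbers as in the text.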

movl -8(%ebp),%eax
movl %eax,4(%esp)
movl -12(%ebp),%eax
movl %eax,(%esp)

First, the value stored at %ebp - 8 is 2. The first movl copies it into the %eax register, and the second movl stores it at %esp + 4.

My blind guess is that the reader has forgotten what %eax is, so here it is again. At the start of the article we said it carries the system call number; more precisely, it is the accumulator, a general-purpose register. When a function ends with return x, that x is returned in %eax. This is the 32-bit case, of course; in 64-bit mode %eax is only the lower 32 bits of %rax.

The remaining two are no harder, but let's go through them anyway. The value stored at %ebp - 12 is 1. One movl copies it into %eax, and the next stores it at the address held in %esp. **Together, these four instructions pass the arguments to add().** The figure below makes this easier to see.

[Figure: add()'s arguments placed at %esp and %esp + 4; image not available]
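Here is the same argument-passing step as a small C sketch, assuming %ebp = 96 and %esp = 72 as in the walkthrough. Each copy goes through eax because x86 mov cannot move data directly from memory to memory. The names words, slot and args_are_in_place are made up for illustration.

```c
#include <stdint.h>

/* Toy word-addressed stack: slot(addr) is the 4-byte word at that address. */
static int32_t words[32];

static int32_t *slot(uint32_t addr) {
    return &words[addr / 4];
}

/* Returns 1 if, after the four movl instructions, add()'s arguments
   sit at %esp and %esp + 4. */
int args_are_in_place(void) {
    const uint32_t ebp = 96, esp = 72;   /* values from the walkthrough */
    int32_t eax;

    *slot(ebp - 12) = 1;                 /* a, stored earlier at 84 */
    *slot(ebp - 8)  = 2;                 /* b, stored earlier at 88 */

    eax = *slot(ebp - 8);                /* movl -8(%ebp),%eax  */
    *slot(esp + 4) = eax;                /* movl %eax,4(%esp)   */
    eax = *slot(ebp - 12);               /* movl -12(%ebp),%eax */
    *slot(esp) = eax;                    /* movl %eax,(%esp)    */

    return *slot(72) == 1 && *slot(76) == 2;
}
```

The first argument ends up at the lower address (72) and the second just above it (76), which is where the callee will look for them.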

With all that assigning and argument passing done, it is finally time to call add(), which is what the call add instruction does. Here is a brief look at call:

pushl %eip        # pseudo-instruction: %eip cannot be named directly
movl  add, %eip   # pseudo-instruction: jump to the start of add

These two pseudo-instructions are equivalent to call. The %eip register holds the address of the next instruction to execute, so call pushes the return address (the instruction right after call) onto the stack, and %esp now points to 68. Then the address of add's first instruction is assigned to %eip.

Next come the instructions inside add(), which look much like those in main. The attentive reader may wonder what leal, popl and ret are for, and whether every function really needs pushl %ebp as its first line.

pushl %ebp saves the caller's frame pointer on the stack when a function is entered. When add() finishes, popl %ebp pops it back, so %ebp returns to main's value as if nothing had happened (only on the surface, of course; we did perform a simple addition). As for leal, compilers often use it as a cheap way to add two values, which is presumably its job in add(). Then there is ret, which can be understood as popl %eip: it restores %eip to the saved return address, so execution continues in main with the instruction after call add.
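The call and ret pair can be modeled in C as a push and a pop of the return address. This is a sketch of the idea only; Cpu, sim_push, sim_pop, sim_call and sim_ret are hypothetical names, and the instruction addresses used with them are invented.

```c
#include <stdint.h>

typedef struct {
    uint32_t esp, eip;
    uint32_t words[32];      /* toy stack, one slot per 4-byte address */
} Cpu;

/* pushl: move %esp down by 4, then store the value there. */
void sim_push(Cpu *c, uint32_t value) {
    c->esp -= 4;
    c->words[c->esp / 4] = value;
}

/* popl: read the value at %esp, then move %esp up by 4. */
uint32_t sim_pop(Cpu *c) {
    uint32_t value = c->words[c->esp / 4];
    c->esp += 4;
    return value;
}

/* call target: push the return address, then jump to the target. */
void sim_call(Cpu *c, uint32_t target, uint32_t return_addr) {
    sim_push(c, return_addr);
    c->eip = target;
}

/* ret: pop the saved return address back into %eip. */
void sim_ret(Cpu *c) {
    c->eip = sim_pop(c);
}
```

Starting from esp = 72, sim_call leaves esp at 68, as in the text, and sim_ret brings it back to 72 with eip pointing at the return address again.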

After all that, two instructions remain...

[Figure: image not available]

Back in main, movl %eax,-4(%ebp) stores the value in %eax, which is add's return value, at the address %ebp - 4 (92 in our example). Then leave is executed. The leave instruction is equivalent to the following two instructions:

mov %ebp,%esp
pop %ebp

This is equivalent to restoring the initial state of the stack.
[Figure: the stack restored to its initial state; image not available]
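To see why leave undoes the function prologue, here is a toy model of both halves. Regs, enter_frame and leave_frame are hypothetical names; the starting values used with them (esp = 100, caller's ebp = 120) are assumptions chosen so that the frame pointer lands on 96 as in the walkthrough.

```c
#include <stdint.h>

typedef struct {
    uint32_t esp, ebp;
    uint32_t words[32];      /* toy stack, one slot per 4-byte address */
} Regs;

/* Prologue: pushl %ebp ; movl %esp,%ebp ; subl $locals,%esp */
void enter_frame(Regs *r, uint32_t locals) {
    r->esp -= 4;
    r->words[r->esp / 4] = r->ebp;   /* save the caller's %ebp */
    r->ebp = r->esp;                 /* new frame pointer */
    r->esp -= locals;                /* reserve space for locals */
}

/* leave: movl %ebp,%esp ; popl %ebp */
void leave_frame(Regs *r) {
    r->esp = r->ebp;                 /* discard the locals */
    r->ebp = r->words[r->esp / 4];   /* restore the caller's %ebp */
    r->esp += 4;
}
```

After enter_frame followed by leave_frame, both registers hold exactly what they held before the function was entered, which is the "initial state" the text refers to.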
And with that, the program has finished executing.

Origin blog.csdn.net/weixin_40992982/article/details/105303829