2019-2020-1 20199309 "Linux Kernel Principles and Analysis", Week 9 Assignment

Process switching and the general execution process of the system

1. Knowledge summary

(1) Timing of process scheduling:

  • During interrupt handling, the kernel can call schedule() directly, or it can call schedule() on return to user mode according to the need_resched flag.
  • A kernel thread is a special process that has only kernel mode and no user mode. It can call schedule() directly to switch processes, and it can also be scheduled during interrupt handling (kernel threads can call kernel functions directly, so no system call is involved). As a special class of process, a kernel thread can therefore be scheduled either actively or passively; a minimal sketch of voluntary (active) scheduling follows this list.
  • A user-mode process cannot switch actively; it can only be scheduled during interrupt handling (schedule() is a kernel function, not a system call).
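
As an illustration of the second point, here is a minimal sketch of how a kernel thread typically yields the CPU voluntarily. The thread function name and the work inside the loop are hypothetical, invented for illustration; set_current_state(), schedule() and kthread_should_stop() are standard kernel interfaces.

#include <linux/kthread.h>
#include <linux/sched.h>

/* Hypothetical kernel-thread body: it runs entirely in kernel mode and
   can call schedule() directly to switch itself out. */
static int my_kthread_fn(void *data)
{
        while (!kthread_should_stop()) {
                /* ... do some work ... */

                /* Voluntarily give up the CPU: mark this task as sleeping,
                   then call schedule() to pick and switch to another task. */
                set_current_state(TASK_INTERRUPTIBLE);
                schedule();
        }
        return 0;
}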

(2) Suspending the process that is executing on the CPU is different from saving the scene at an interrupt. Before and after an interrupt the CPU stays within the same process context; execution merely shifts from user mode to kernel mode. The process context contains all the information the process needs in order to execute:

  • User address space: program code, data, user stack, and so on
  • Control information: process descriptor, kernel stack, etc.
  • Hardware context

(3) The schedule() function selects a new process to run and calls context_switch() to perform the context switch; the key part of context_switch() is the macro switch_to, which does the critical switching of stack and registers.

(4) The range 0 to 3G is accessible in user mode, while the range above 3G is accessible only in kernel mode. The region above 3G is fully shared by all processes. When switching from process X to process Y, execution is still in the shared region above 3G; what is switched is the process descriptor and the rest of the process context, and the difference only shows up when returning to user mode. A process can "dip" into kernel mode and, after a few steps, return to user mode; when there is nothing to run, the CPU enters the idle process and idles.

2. Key code analysis

(1) schedule

asmlinkage __visible void __sched schedule(void)
{  
         struct task_struct *tsk = current;  
   
         sched_submit_work(tsk);  
         __schedule();  
}  

schedule() calls __schedule() at its tail. The key line in __schedule() is next = pick_next_task(rq, prev);, which encapsulates the process scheduling algorithm: it uses a particular scheduling policy to select the next process. After the next process has been chosen, context_switch(rq, prev, next); performs the process context switch, and within it the key macro switch_to(prev, next, prev); switches the stack and registers.
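
To make this path concrete, below is a simplified sketch of the core of __schedule(); locking, preemption handling and statistics are omitted, and only the pick-and-switch step described above is kept.

static void __sched __schedule(void)
{
        struct task_struct *prev, *next;
        struct rq *rq;

        rq = cpu_rq(smp_processor_id());        /* this CPU's run queue */
        prev = rq->curr;                        /* the currently running task */

        next = pick_next_task(rq, prev);        /* the scheduling policy chooses the next task */
        if (prev != next)
                context_switch(rq, prev, next); /* switches mm and, via switch_to, stack and registers */
}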

(2) switch_to

#define switch_to(prev, next, last)  /* prev points to the current process, next to the process being switched in */
do {

         unsigned long ebx, ecx, edx, esi, edi;

         asm volatile("pushfl\n\t"           /* save prev's flags on prev's kernel stack */
                      "pushl %%ebp\n\t"      /* save prev's base pointer ebp on prev's kernel stack */

                      "movl %%esp,%[prev_sp]\n\t"   /* save prev's kernel stack pointer esp into prev->thread.sp */
                      "movl %[next_sp],%%esp\n\t"   /* point esp at the top of next's kernel stack (next->thread.sp) */

                      "movl $1f,%[prev_ip]\n\t"     /* store the address of label "1:\t" in prev->thread.ip; when prev is switched back in later, it resumes at "1:\t", i.e. it goes on to execute "popl %%ebp\n\t" and "popfl\n" */
                      "pushl %[next_ip]\n\t"        /* push next->thread.ip onto the top of next's kernel stack */
                      __switch_canary
                      "jmp __switch_to\n"           /* jump to __switch_to() to complete the hardware context switch */
                      "1:\t"
                      "popl %%ebp\n\t"
                      "popfl\n"

                      /* output parameters */
                      : [prev_sp] "=m" (prev->thread.sp),
                        [prev_ip] "=m" (prev->thread.ip),
                        "=a" (last),

                      /* clobbered output registers: */
                        "=b" (ebx), "=c" (ecx), "=d" (edx),
                        "=S" (esi), "=D" (edi)

                        __switch_canary_oparam

                      /* input parameters: */
                      : [next_sp]  "m" (next->thread.sp),
                        [next_ip]  "m" (next->thread.ip),

                      /* regparm parameters for __switch_to(): */
                      /* arguments are passed to __switch_to() in the eax and edx registers */
                        [prev]     "a" (prev),
                        [next]     "d" (next)

                        __switch_canary_iparam

                      : /* reloaded segment registers */
                        "memory");
} while (0)

[prev_sp] "=m"(prev->thread.sp), Before the analysis, when compiled, see reference numbers are used (0%, 1%, 2%, etc.) mark parameters, for better readability, where string ([prev_sp]) Parameters marked (prev- > thread.sp).

First the flags and ebp of the prev process are saved on its kernel stack. "movl %%esp,%[prev_sp]" and "movl %[next_sp],%%esp" then complete the kernel-stack switch, so that esp points to the top of the next process's kernel stack. Next, prev->thread.ip is set to the address of the label "1:\t", so that when prev is switched back in by a later switch_to it resumes execution at "1:\t". Then next->thread.ip is pushed onto the top of the next process's kernel stack, and jmp __switch_to is executed (note that it is a jmp, not a call) to complete the hardware context switch. When __switch_to finishes, its return pops the saved next->thread.ip off the next process's kernel stack and eip moves to that location. Two cases need to be distinguished:

  • If the next process has been switched out by switch_to before (that is, it has previously played the role of prev), then its kernel stack already holds the ebp and flags that were saved when it was switched out. Because movl $1f,%[prev_ip] was executed at that time, next->thread.ip is the address of "1:\t", so the address popped at the end of __switch_to is "1:\t", eip points there, and "popl %%ebp" and "popfl" restore the next process's ebp and flags; the next process then continues running.
  • If the next process has never been switched out by switch_to before (it is newly created), then next->thread.ip is ret_from_fork, and after __switch_to returns, execution continues at ret_from_fork.

So if a call were used instead, call __switch_to would push the address of the following "1:\t" label as the return address, and eip would always end up at "1:\t" after __switch_to returned. That only fits the first case; it cannot satisfy the second case, which needs to continue at ret_from_fork. This is why a jmp is used and the return address (next->thread.ip) is pushed by hand.

Textbook notes

  • The memory that a user-space process can address is called the process address space; it is the memory that each user-space process in the system sees. The process address space consists of addressable virtual memory.
  • The kernel represents the process address space with the memory descriptor structure mm_struct. Each memory descriptor corresponds to the unique address intervals of a process's address space. All mm_struct structures are linked into a doubly linked list through their mmlist field; the first element of the list is the init_mm memory descriptor, which represents the address space of the init process.
  • The mm field of the process descriptor task_struct holds the memory descriptor used by that process.
  • Kernel threads have no process address space and no associated memory descriptor; the mm field of a kernel thread's process descriptor is also empty. A kernel thread directly uses the memory descriptor of the process that ran before it.
  • Within mm_struct, memory areas are described by the vm_area_struct structure. In the Linux kernel, memory areas are also called virtual memory areas (VMAs). A vm_area_struct describes a single memory range over a contiguous interval in a given address space. The kernel manages each memory area as a separate memory object, and each memory area has consistent properties.
  • The vm_ops field of vm_area_struct points to the table of operations associated with the memory area; the kernel uses the methods in this table to operate on the VMA.
  • The kernel often needs to perform operations on a memory area. find_vma() searches the given address space for the first memory area whose vm_end is greater than addr. find_vma_prev() works in the same way as find_vma(), but it also returns the last VMA before addr. find_vma_intersection() returns the first VMA that intersects a given address interval. A small usage sketch of find_vma() is given after this list.
  • do_mmap() creates a new linear address interval, i.e. it adds an address range to the process's address space. do_munmap() removes a specified address interval from a given process's address space.
  • Address translation (virtual to physical) splits the virtual address into fields, each of which is used as an index into a page table. A page table entry points either to the next-level page table or to the final physical page. Linux uses three levels of page tables to complete address translation (the top level is the page global directory, PGD; the middle level is the page middle directory, PMD; the last level is simply called the page table). The pgd field of a process's memory descriptor points to its page global directory. A conceptual sketch of the walk is given after this list.
  • The translation lookaside buffer (TLB) is a hardware cache of virtual-to-physical address mappings. When a virtual address is accessed, the processor first checks whether the TLB holds the mapping for that address; if it does, the physical address is returned immediately, otherwise the page tables have to be walked to find the physical address.
  • To reduce disk I/O and improve system performance, the Linux kernel implements a disk-caching technique called the page cache: data on disk is cached in physical memory, so that accesses to the disk become accesses to physical memory.
  • The size of the page cache can be adjusted dynamically. The page cache relies mainly on three mechanisms, read caching, write caching and cache reclaim, to handle reads, writes and the release of cached pages.
  • The core data structure of the page cache is the address_space object, which is embedded in the inode of the object that owns the pages. The address_space structure is used to manage cache entries and page I/O operations. A file may be mapped at multiple virtual addresses (described by multiple vm_area_struct structures), but it has only one address_space structure.
  • Each address_space object has a unique radix tree; given a file offset, the desired data can be located quickly in the radix tree.
  • When data in the page cache is newer than the data in the backing store, it is called dirty data.
  • Write-back of the Linux page cache is carried out by the flusher threads. Flusher threads trigger write-back in the following three situations:
    • When free memory falls below a threshold: when free memory runs short, part of the cache must be released; since only clean pages can be released, dirty pages are first written back to disk so that they become clean pages.

    • When dirty pages have resided in memory longer than a threshold: this ensures that dirty pages do not stay in memory indefinitely, reducing the risk of data loss.

    • When a user process calls the sync() or fsync() system call: this gives users a way to force write-back, for scenarios with strict write-back requirements (a tiny example is given after this list).
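
Following up on the find_vma() item above, here is a minimal sketch of how its return value is usually checked. The wrapper function addr_is_mapped() is hypothetical, made up for illustration; find_vma() itself is the kernel interface described above.

#include <linux/mm.h>

/* Illustrative helper: returns 1 if addr lies inside a mapped area of mm.
   find_vma() returns the first VMA whose vm_end is greater than addr, so
   vm_start must also be checked to be sure addr is actually covered. */
static int addr_is_mapped(struct mm_struct *mm, unsigned long addr)
{
        struct vm_area_struct *vma = find_vma(mm, addr);

        return vma && vma->vm_start <= addr;
}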
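
For the three-level page-table item, the following is a conceptual sketch of the walk from the memory descriptor down to a physical page. The helper names follow the classic three-level layout described in the textbook; their exact signatures differ between kernel versions (newer kernels insert PUD/P4D levels), so treat this as an illustration rather than code for one specific kernel.

/* Conceptual three-level walk for virtual address addr in mm:
   PGD -> PMD -> page table -> physical page.
   Locking, huge pages and the extra levels of newer kernels are ignored. */
pgd_t *pgd = pgd_offset(mm, addr);        /* index into the page global directory */
pmd_t *pmd = pmd_offset(pgd, addr);       /* index into the page middle directory */
pte_t *pte = pte_offset_map(pmd, addr);   /* index into the final page table */
struct page *page = pte_page(*pte);       /* the physical page frame itself */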
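
Finally, the third write-back trigger can be exercised directly from user space. A minimal example (the file name is made up):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        char buf[] = "hello";
        int fd = open("data.log", O_WRONLY | O_CREAT, 0644);

        if (fd < 0)
                return 1;
        write(fd, buf, sizeof(buf) - 1);
        fsync(fd);      /* force this file's dirty pages to be written back to disk now */
        close(fd);
        return 0;
}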
