The difference between processes, threads, and coroutines

 Multi-process and multi-threaded programming are now commonplace, and coroutines have become popular in recent years. Python has the coroutine library gevent, and the Python web framework tornado is also built around coroutines. This article mainly introduces the differences between processes, threads, and coroutines.

First, the concepts:

  1. Process

A process is one execution of a program over a particular data set, and it is the independent unit the system uses to allocate and schedule resources. Each process has its own memory space, and different processes exchange data through inter-process communication (IPC). Because a process is relatively heavy and owns its memory, switching context between processes (stack, registers, virtual memory, file handles, etc.) is relatively expensive, but processes are comparatively stable and safe.
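The independent memory space can be illustrated with a minimal sketch. It assumes a POSIX system where the "fork" start method is available; the names bump and run_isolation_demo are hypothetical and exist only for this illustration.

```python
import multiprocessing as mp


def bump(state, queue):
    """Runs in the child: modify the child's own copy and report its value."""
    state["value"] += 1
    queue.put(state["value"])


def run_isolation_demo():
    # Assumption: a POSIX system where the "fork" start method is available.
    ctx = mp.get_context("fork")
    state = {"value": 0}       # lives in the parent's memory
    queue = ctx.Queue()        # IPC channel between the two processes
    child = ctx.Process(target=bump, args=(state, queue))
    child.start()
    child_value = queue.get()  # the value as seen inside the child
    child.join()
    # The parent's dictionary was never touched by the child.
    return child_value, state["value"]
```

Under these assumptions, run_isolation_demo() returns (1, 0): the child incremented its own copy, and the result had to come back through a queue because the parent's memory is not shared.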

  2. Thread

A thread is an entity within a process and the basic unit of CPU scheduling and dispatch. It is a unit smaller than a process that can run independently. A thread owns almost no system resources of its own, only the few that are essential for running (such as a program counter, a set of registers, and a stack), but it shares all the resources owned by the process with the other threads of that process. Threads communicate mainly through shared memory, so context switching is fast and resource overhead is low; compared with processes, however, threads are less stable, and shared data is easily corrupted or lost unless access is synchronized.
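As a rough sketch of threads communicating through shared memory, the following uses the standard threading module. The function name count_with_threads and its parameters are made up for this example; the lock is what keeps the shared counter from losing updates.

```python
import threading


def count_with_threads(n_threads=4, increments=1000):
    state = {"counter": 0}   # shared by every thread in this process
    lock = threading.Lock()  # serializes the read-modify-write below

    def worker():
        for _ in range(increments):
            with lock:
                state["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]
```

With the lock in place, count_with_threads() returns n_threads * increments, which is 4000 with the defaults. Unlike processes, no queue or pipe is needed, because every thread sees the same dictionary.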

  3. Coroutines

A coroutine is a lightweight, user-mode thread whose scheduling is controlled entirely by the user program. A coroutine has its own register context and stack. When a coroutine is switched out, its register context and stack are saved elsewhere; when it is switched back in, they are restored. The switch operates directly on the stack with no kernel-switch overhead, so it is very fast, and because coroutines in the same thread only switch at explicit points, global variables can be accessed without locking.
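User-controlled scheduling can be sketched with plain Python generators standing in for coroutines. This is only an illustration of the idea, not how gevent or tornado are implemented; task, run_round_robin, and demo are hypothetical names.

```python
from collections import deque


def task(name, steps, log):
    """A generator used as a coroutine: each yield is an explicit switch point."""
    for i in range(steps):
        log.append(f"{name}{i}")  # shared list, no lock needed
        yield


def run_round_robin(tasks):
    """A tiny user-space scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # switch into the task until its next yield
        except StopIteration:
            continue               # task finished, drop it
        ready.append(current)      # not finished, put it back in the queue


def demo():
    log = []
    run_round_robin([task("a", 2, log), task("b", 3, log)])
    return log
```

Here demo() returns ['a0', 'b0', 'a1', 'b1', 'b2']. The switch order is decided entirely by run_round_robin, with no involvement from the kernel.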

 

Second, the differences:

  1. Comparison of processes and threads

A thread is an execution unit within a process and a schedulable entity within a process. The differences between threads and processes:
1) Address space: a process has its own independent address space and contains at least one thread; the threads of a process share that address space.
2) Resource ownership: a process is the unit of resource allocation and ownership, and the threads within a process share its resources.
3) Scheduling: a thread is the basic unit of processor scheduling; a process is not.
4) Concurrency: both can be executed concurrently.
5) Execution: each thread has its own entry point, sequential execution flow, and exit point, but a thread cannot run on its own; it must live inside an application program, which provides control over the execution of multiple threads.
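The relationship described in the list above (several threads, one owning process) can be observed with a short sketch using only the standard library. The function name collect_ids is an assumption made for this example.

```python
import os
import threading


def collect_ids(n_threads=3):
    """Return one (process id, thread id) pair per worker thread."""
    records = []
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        # Wait until every thread is alive, so thread ids cannot be reused.
        barrier.wait()
        with lock:
            records.append((os.getpid(), threading.get_ident()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records
```

Every pair returned by collect_ids() carries the same process id, while the thread ids are all different: the threads are separately scheduled but live inside one process.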

  2. Comparing coroutines with threads

1) A thread can host multiple coroutines, and a process can also host multiple coroutines on its own; combining multiple processes with coroutines is what lets Python make use of multi-core CPUs.

2) Threads and processes are synchronous mechanisms scheduled by the operating system, while coroutines are asynchronous: they give up control voluntarily and are resumed later.

3) A coroutine retains the state of its last call; each time it is re-entered, it resumes from exactly that state.
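The state retention mentioned in point 3 can be shown with a generator-based coroutine. This is a minimal sketch with hypothetical names (running_average, demo_average), not code from gevent or tornado.

```python
def running_average():
    """Coroutine that keeps its totals alive between calls to send()."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # pause here; resume when a value is sent in
        total += value
        count += 1
        average = total / count


def demo_average():
    avg = running_average()
    next(avg)  # advance to the first yield so the coroutine can accept values
    return [avg.send(v) for v in (10, 20, 30)]
```

demo_average() returns [10.0, 15.0, 20.0]. The local variables total and count survive between send() calls, which is the "state of the last call" that an ordinary function would lose on return.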

 

 Third, using processes, threads, and coroutines in Python

  1. Multi-processing generally uses the multiprocessing library to take advantage of multi-core CPUs, mainly for CPU-intensive programs. It also fits producer-consumer designs. The advantage of multiple processes is that a crash in one child process does not affect the other child processes or the main process. The disadvantage is that you cannot start too many processes at once, because doing so seriously affects system resource scheduling, especially CPU usage and load. For multi-process usage, see the article "Summary of python multi-process usage". Note: in python2, using a process pool inside a class is problematic (bound methods cannot be pickled), so the class method needs to be defined as a global function. For details, please refer to http://bbs.chinaunix.net/thread-4111379-1-1.html
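For the CPU-intensive case in item 1, a rough sketch might split the work across a few worker processes. It assumes a POSIX system where the "fork" start method is available and a positive n; the function names are invented for this example, and the workers are deliberately module-level functions, in line with the note above.

```python
import multiprocessing as mp


def sum_of_squares(start, stop, queue):
    """CPU-bound work: sum i*i over [start, stop) and report the partial result."""
    queue.put(sum(i * i for i in range(start, stop)))


def parallel_sum_of_squares(n, workers=4):
    # Assumptions: POSIX "fork" start method, n > 0.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    step = -(-n // workers)  # ceiling division: size of each chunk
    procs = []
    for start in range(0, n, step):
        stop = min(start + step, n)
        p = ctx.Process(target=sum_of_squares, args=(start, stop, queue))
        p.start()
        procs.append(p)
    total = sum(queue.get() for _ in procs)  # drain results before joining
    for p in procs:
        p.join()
    return total
```

parallel_sum_of_squares(1000) gives the same result as sum(i * i for i in range(1000)), but each chunk is computed in its own process and can run on a different core. The number of workers is kept small on purpose, since starting too many processes hurts scheduling as described above.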

  2. Multi-threading generally uses the threading library for IO-intensive concurrent operations. The advantages of multi-threading are fast switching and low resource consumption, but if one thread crashes it can affect every thread in the process, so it is not as stable. In practice, thread pools are used in many scenarios. For details, please refer to "Python Thread Pool Implementation".
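For the thread pool scenario in item 2, one possible sketch uses the standard concurrent.futures.ThreadPoolExecutor rather than a hand-written pool. The function fake_fetch is a made-up stand-in for blocking I/O (time.sleep simulates the wait), and fetch_all is likewise a hypothetical name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(name, delay=0.05):
    """Stand-in for a blocking I/O call such as a network request."""
    time.sleep(delay)
    return f"{name}: done"


def fetch_all(names, max_workers=4):
    # The pool reuses a fixed number of threads instead of one per task.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of the inputs.
        return list(pool.map(fake_fetch, names))
```

fetch_all(["a", "b", "c"]) returns ['a: done', 'b: done', 'c: done']. Because the threads spend their time waiting rather than computing, the three calls overlap instead of running one after another.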

  3. Coroutines generally use the gevent library. This library is somewhat troublesome to use, so it is not used very widely. Coroutines are used much more in tornado, which uses them to provide single-threaded asynchronous handling and is said to be able to address the C10K problem as well. As a result, coroutines are most commonly used in web applications.
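The single-threaded asynchronous style in item 3 can be sketched with the standard asyncio module. Note that this swaps in asyncio in place of gevent or tornado, so it shows the general idea rather than either library's API; handle_request, serve, and run_demo are hypothetical names, and asyncio.sleep stands in for waiting on a client.

```python
import asyncio


async def handle_request(request_id, delay, log):
    log.append(f"start {request_id}")
    await asyncio.sleep(delay)  # hand control back to the event loop while waiting
    log.append(f"end {request_id}")
    return request_id


async def serve(delays):
    log = []
    # All handlers run concurrently on one thread.
    results = await asyncio.gather(
        *(handle_request(i, d, log) for i, d in enumerate(delays))
    )
    return results, log


def run_demo():
    return asyncio.run(serve([0.05, 0.01]))
```

In run_demo(), both requests start before either finishes, and the shorter one ends first, so the log reads start 0, start 1, end 1, end 0, while the results come back in request order as [0, 1]. One thread serves both requests because each handler yields while it waits.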

To sum up: IO-intensive workloads generally use multi-threading or multi-processing, CPU-intensive workloads generally use multi-processing, and non-blocking asynchronous concurrency generally uses coroutines. Sometimes a combination is required, such as multiple processes each running a thread pool, or some other mix.

  
