1. About CPU multi-core and hyper-threading technology
The number of physical CPUs is determined by the number of sockets on the motherboard; each physical CPU can have multiple cores, and each core may have multiple hardware threads. Each core of a multi-core CPU (each core is an independent processing unit on the die) appears to the OS as an independent CPU.
With hyper-threading, each CPU core can have multiple hardware threads (usually two: 1 core gives 2 threads, 2 cores give 4 threads, 4 cores give 8 threads). Each hardware thread is a virtual, logical CPU (Windows, for example, calls them logical processors), and each is also seen by the OS as an independent CPU.
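The arithmetic above (sockets x cores x hardware threads = logical CPUs) can be sketched with a small hypothetical helper; `logical_cpus` is an illustrative name, not a real API, while `os.cpu_count()` is the standard-library call that reports the logical CPU count on the running machine:

```python
import os

def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total logical CPUs the OS will see: sockets x cores x hardware threads."""
    return sockets * cores_per_socket * threads_per_core

# The examples from the text: 1 core / 2 threads, 2 cores / 4 threads, 4 cores / 8 threads.
print(logical_cpus(1, 1, 2))  # 2
print(logical_cpus(1, 2, 2))  # 4
print(logical_cpus(1, 4, 2))  # 8

# os.cpu_count() reports this machine's logical CPU count (hardware threads, not cores).
print(os.cpu_count())
```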
This is, in a sense, deceiving the operating system: physically there is still only one core, but from the OS's perspective the hyper-threaded core looks like multiple CPUs, and the OS assumes that hyper-threading will speed up program execution.
CPU multithreading is different from program multithreading. CPU multithreading is hardware multithreading; program multithreading is software multithreading. Software multithreading defines multiple task branches that can execute concurrently, while each hardware thread acts as a CPU resource that the OS schedules to execute those tasks.
To take advantage of hyper-threading, the operating system needs to be specifically optimized for it.
A hyper-threaded CPU core is more capable than a non-hyper-threaded core, but each hardware thread is no match for an independent CPU core.
The threads on each core share that core's resources. For example, suppose each core has only one "engine" resource: once virtual thread 1 is using the "engine", thread 2 cannot use it and must wait.
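The single shared "engine" can be modeled as a lock: while one thread holds it, its sibling must wait. This is a toy simulation of the sharing described above, not how real hardware arbitration works; the counter simply proves that the two threads never used the "engine" at the same time:

```python
import threading
import time

engine = threading.Lock()   # the core's single shared "engine" resource
concurrent = 0
max_concurrent = 0
guard = threading.Lock()    # protects the counters

def use_engine(name: str) -> None:
    global concurrent, max_concurrent
    with engine:            # only one thread may hold the engine at a time
        with guard:
            concurrent += 1
            max_concurrent = max(max_concurrent, concurrent)
        time.sleep(0.01)    # pretend to compute while holding the engine
        with guard:
            concurrent -= 1

threads = [threading.Thread(target=use_engine, args=(f"ht-{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_concurrent)  # 1: thread 2 waited while thread 1 used the engine
```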
Therefore, the main purpose of hyper-threading is to feed more independent instructions into the pipeline (see the earlier explanation of pipelines), so that thread 1 and thread 2 do not compete for the core's resources. In other words, hyper-threading takes advantage of the superscalar architecture.
Multithreading means each core can hold the state of multiple threads. For example, thread 1 of a given core may be idle while thread 2 is running.
Hyper-threading does not provide parallelism in the full sense: at any given moment each core can still run only one process, because thread 1 and thread 2 share that core's resources. You can simply think of each core as having a single, exclusive "execute a process" resource: if thread 1 holds it, thread 2 cannot. (How hyper-threading achieves parallel operation is explained in depth later.)
However, thread 1 and thread 2 can still work in parallel in many ways: instructions can be fetched in parallel, decoded in parallel, and executed in parallel. So although a single core can only execute one process at a time, thread 1 and thread 2 can help each other speed up that process's execution.
Moreover, if thread 1 holds the core's execute-a-process ability at some moment and the process issues an I/O request, that ability can be handed to thread 2, i.e. execution switches to thread 2. This switch between hardware threads is very lightweight. (Wikipedia: "if resources for one process are not available, then another process can continue if its resources are available")
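The payoff of switching away from a thread that is waiting on I/O can be sketched with software threads, using `time.sleep` as a stand-in for a blocking I/O request. This is an analogy at the OS level, not a demonstration of hardware thread switching: the point is only that two waits can overlap, so the total time is close to one wait, not two:

```python
import threading
import time

def blocking_io() -> None:
    time.sleep(0.2)  # stand-in for a blocking I/O request

start = time.monotonic()
t1 = threading.Thread(target=blocking_io)
t2 = threading.Thread(target=blocking_io)
t1.start()
t2.start()
t1.join()
t2.join()
elapsed = time.monotonic() - start

# The two 0.2 s waits overlap instead of running back to back.
print(round(elapsed, 1))
```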
One phenomenon can arise with multithreading: if a 2-core, 4-thread CPU has only two runnable threads to schedule, and both land on the same core, the other core sits completely idle and is wasted. The more desirable outcome is to schedule one of the two threads on each core.
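That scheduling preference (spread across cores first, fall back to sibling hyper-threads only when every core is busy) can be sketched as a toy placement function. `place` is a hypothetical illustration, far simpler than any real OS scheduler:

```python
from typing import Dict, List

def place(n_tasks: int, cores: List[str], threads_per_core: int = 2) -> Dict[str, List[int]]:
    """Toy scheduler: assign each task to the least-loaded core that still
    has a free hardware thread, so tasks spread across cores first."""
    placement: Dict[str, List[int]] = {c: [] for c in cores}
    for task in range(n_tasks):
        core = min((c for c in cores if len(placement[c]) < threads_per_core),
                   key=lambda c: len(placement[c]))
        placement[core].append(task)
    return placement

# Two runnable tasks on a 2-core / 4-thread CPU: one task per core, no idle core.
print(place(2, ["core0", "core1"]))  # {'core0': [0], 'core1': [1]}
```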
2. The relationship and difference between CPU multithreading and program multithreading
Program multithreading is software multithreading; multiple software threads make it possible to execute multiple tasks concurrently. The threads inside a CPU are hyper-threads, i.e. hardware multithreading. Each hardware thread (hyper-thread) can be seen as a logical CPU, but not a CPU in the full sense.
Hyper-threads within a core share some of the core's functional units. For example, while hyper-thread 1 performs an addition, hyper-thread 2 can perform a multiplication, but hyper-thread 2 cannot perform another addition at the same time. (This is only an illustration, not an accurate description, but it shows why a hyper-threaded logical CPU cannot be regarded as a complete physical core.) On the other hand, since each hyper-thread is a CPU resource that can run tasks independently, each hyper-thread has its own execution state, such as its own registers and its own program counter (PC).
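A loose software analogy for "each hardware thread has its own registers and PC" is thread-local storage: each software thread below keeps its own private copy of `state.name`, invisible to its sibling. This is only an analogy for per-thread execution state, not a model of hardware registers:

```python
import threading

state = threading.local()  # each thread sees its own private attributes
seen = {}

def run(name: str) -> None:
    state.name = name       # private to this thread, like its own "register"
    seen[name] = state.name

threads = [threading.Thread(target=run, args=(f"t{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen.items()))  # [('t0', 't0'), ('t1', 't1')]
```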
Also, because the hyper-threads live in the same core, they share that core's L1 and L2 caches, so each core's L1 and L2 need a dedicated cache controller and a dedicated caching policy.
Let's go back to the scheduling problem of multi-process and software multi-threading.
Whether we are dealing with multiple processes or software multithreading, both mean multiple task branches that can execute concurrently. On a multi-CPU system these tasks can execute in parallel, so multiple processes and software threads can be scheduled onto hyper-threads for execution, because hyper-threads are CPU resources.
3. Multiple CPUs
For the architectural organization of multiple CPUs, there are:
- AMP (Asymmetric Multiprocessing): asymmetric multiprocessor structure
- SMP (Symmetric Multiprocessing): symmetric multiprocessor structure
- UMA (Uniform Memory Access): uniform memory access structure
- NUMA (Non-Uniform Memory Access): non-uniform memory access structure
- MPP (Massively Parallel Processing): massively parallel processing structure
The structures usually discussed are SMP and NUMA; MPP is a massively parallel processing structure.
4. SMP
In the symmetric multiprocessing structure, all CPUs are considered equal, and they all share resources such as memory and the bus. In fact, the inside of a multi-core CPU is also an SMP structure: all the cores share memory and bus resources.
In an SMP structure, since every CPU core operates on shared storage (memory), the consistency of memory data must be guaranteed (i.e. cache coherence). For example, when Core1 of CPU1 modifies data A in its cache, it must send a broadcast on the bus (bus snooping) to notify the cores of all CPUs to invalidate their cached copies of A. With a small number of CPUs this isn't a huge problem, but as the CPU count increases, the bus traffic caused by cache coherence and shared objects explodes.
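The invalidation broadcast described above can be modeled as a toy simulation. `write_with_snooping` is a hypothetical illustration, far simpler than a real coherence protocol such as MESI, but it shows why traffic grows with the number of caches: every write to a shared line costs one invalidation per other cache holding it:

```python
from typing import Dict, Set

def write_with_snooping(caches: Dict[int, Set[str]], writer: int, addr: str) -> int:
    """Toy bus-snooping step: 'writer' modifies 'addr', so every other cache
    holding 'addr' must invalidate its copy. Returns the number of
    invalidation messages broadcast on the bus."""
    messages = 0
    for cpu, lines in caches.items():
        if cpu != writer and addr in lines:
            lines.discard(addr)   # invalidate the stale copy
            messages += 1
    caches[writer].add(addr)
    return messages

# Four cores all caching address "A"; core 0 writes it.
caches = {cpu: {"A"} for cpu in range(4)}
print(write_with_snooping(caches, writer=0, addr="A"))  # 3 invalidation messages
print(caches[1])  # set(): core 1's copy is gone
```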
Therefore, the SMP structure does not scale well to more CPUs. For example, SMP can be considered for 2-4 CPUs, but it is not suitable for more than 4 CPUs.
5. NUMA
In the NUMA (non-uniform memory access) structure, each CPU has its own memory resources, and each CPU can also access the other CPUs' memory through an interconnect module between the CPUs.
Because every CPU has its own local memory, each can manage its own memory and maintain its own cache coherence. However, because each CPU's memory is separate, CPU1 accessing CPU2's memory through the interconnect module is much slower (the data passes through an intermediate transmission medium and travels a longer distance). So when using the NUMA structure, programs should try to avoid parallel work in which CPUs touch each other's memory.
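The local-versus-remote penalty can be made concrete with a toy cost model. The costs below are made-up illustrative numbers, not real latencies; the point is only that remote accesses through the interconnect multiply the cost of memory-heavy work:

```python
def access_cost(cpu_node: int, mem_node: int,
                local_cost: int = 1, remote_cost: int = 3) -> int:
    """Toy NUMA model: memory attached to your own node is cheap to reach;
    memory on another node costs more because it crosses the interconnect.
    The 1:3 ratio is an arbitrary illustrative choice."""
    return local_cost if cpu_node == mem_node else remote_cost

# A CPU on node 0 touching 1000 pages of local vs. remote memory:
local = sum(access_cost(0, 0) for _ in range(1000))
remote = sum(access_cost(0, 1) for _ in range(1000))
print(local, remote)  # 1000 3000
```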
In addition, the more CPUs there are, the farther a CPU may have to go to access memory, and the slower that access becomes, so the performance of the NUMA structure does not scale linearly with the number of CPUs.
The figure below shows a NUMA structure composed of 4 CPUs, with 32 cores in total and 32 GB of memory in total, where each CPU is allocated 8 GB as its own local memory.
Original address: https://www.junmajinlong.com/os/multi_cpu/