[Computer Architecture-05] Pipeline Hazards - Control Hazards

1. Pipeline Hazards

  In a pipelined processor, an instruction in one stage of the pipeline may prevent the next instruction from executing during its expected clock cycle. Such a situation is called a pipeline hazard (Pipeline Hazards). When a pipeline hazard occurs, the ideal speedup gained from pipelining is reduced. There are three types of hazards, as follows.
[Note]: Pipeline speedup = average instruction execution time without pipelining / average instruction execution time with pipelining

(1) Structural hazards (Structural Hazards): when the processor works in a pipelined manner, the execution of instructions overlaps. If the hardware cannot support all combinations of instructions at the same time, resource conflicts occur, resulting in structural hazards. That is, the hardware required by the current instruction is still busy with a previous instruction;
(2) Data hazards (Data Hazards): when the execution of instructions overlaps and there is a dependency between an earlier and a later instruction, data hazards may result (for example, the instruction currently entering the pipeline needs the result of the previous instruction for its calculation, but that result is not yet available);
(3) Control hazards (Control Hazards): branch instructions and other instructions that change the program counter may cause control hazards in a pipeline, because which instruction to fetch next depends on the result of executing an earlier instruction.

  This article first takes a closer look at control hazards; the other hazard types will be covered in subsequent articles. Before that, let's review some things you are already familiar with.

2. Branch and jump instructions (Branches, Jumps)

  You should have noticed that branch and jump instructions are a bit different from regular arithmetic and logic instructions. Executing either type of instruction affects the value of the program counter (PC, Program Counter).

  A branch instruction changes the PC to a target value computed from the current PC value plus an offset: it is a relative addressing instruction.

  A jump instruction (unconditional jump) changes the PC without caring about its current value: it is an "absolute addressing instruction". Note the quotation marks here. A jump instruction is intended to jump to an absolute address, but when the instruction length is fixed and equal to the address length, no instruction can hold both an opcode and a full jump address. Therefore, a jump instruction in practice also forms its target from the current PC value together with a field encoded in the instruction.

  • Unconditional jumps (Jumps)
    What the instruction needs: opcode (Opcode), offset (Offset), program counter (PC)

  For example, the address that the MIPS J instruction jumps to is not given directly as a 32-bit address (all MIPS instructions are 32 bits long, so an instruction cannot use all of its bits as an address field; such an encoding would leave no room for the opcode). Because the highest 4 bits of the destination address cannot be given in the instruction encoding, the highest 4 bits of the 32-bit target are taken from the highest 4 bits of the current PC value. For general programs, the 256MB jump space covered by the 28-bit address is large enough.
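The address formation just described can be sketched in a few lines of Python. This is only an illustration: the function name `mips_j_target` is made up for this example, and it assumes the usual MIPS32 detail that the upper 4 bits come from PC + 4 (the address of the instruction after the jump) and that the 26-bit field is shifted left by 2 to give the 28-bit part.

```python
def mips_j_target(pc: int, instr_index: int) -> int:
    """Form the 32-bit target of a MIPS J instruction (illustrative sketch).

    pc          -- address of the J instruction itself
    instr_index -- the 26-bit address field stored in the instruction
    """
    if not 0 <= instr_index < (1 << 26):
        raise ValueError("instr_index must fit in 26 bits")
    upper4 = (pc + 4) & 0xF000_0000   # highest 4 bits are taken from the PC
    lower28 = instr_index << 2        # 26-bit field * 4 gives a 28-bit byte address
    return upper4 | lower28


# Hypothetical values: a jump located at 0x90000000 whose field encodes word 0x4C.
target = mips_j_target(0x9000_0000, 0x4C)
print(hex(target))

# The region reachable without changing the upper 4 bits is 2**28 bytes.
print((1 << 28) // (1024 * 1024), "MB")
```

Because only 28 bits come from the instruction, a single J cannot leave the current 256MB region; reaching further needs a register jump.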

Another example is the JAL instruction in the RISC-V architecture (RV32), which uses the J-type immediate encoding format. The offset, an integer multiple of 2, is added to the PC value to form the jump target address, so the instruction can jump within a range of 1 MiB before and after the PC. JAL stores the address of the following instruction (PC + 4) in rd. The standard software calling convention uses register x1 as the return address register.

When rd = x0, JAL is a plain jump instruction (the assembler pseudo-instruction J).
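As a rough illustration of the JAL behaviour described above, the following Python sketch decodes the J-type immediate and applies it. The helper names `jal_offset` and `execute_jal` and the list-based register file are assumptions made for this example; the bit positions follow the J-type layout (imm[20|10:1|11|19:12] in instruction bits 31 down to 12).

```python
def jal_offset(inst: int) -> int:
    """Extract the signed J-type offset (a multiple of 2) from a 32-bit JAL word."""
    imm20 = (inst >> 31) & 0x1       # inst[31]    -> imm[20] (sign bit)
    imm10_1 = (inst >> 21) & 0x3FF   # inst[30:21] -> imm[10:1]
    imm11 = (inst >> 20) & 0x1       # inst[20]    -> imm[11]
    imm19_12 = (inst >> 12) & 0xFF   # inst[19:12] -> imm[19:12]
    imm = (imm20 << 20) | (imm19_12 << 12) | (imm11 << 11) | (imm10_1 << 1)
    if imm20:                        # sign-extend the 21-bit value
        imm -= 1 << 21
    return imm


def execute_jal(inst: int, pc: int, regs: list) -> int:
    """Write PC + 4 to rd (unless rd is x0) and return the new PC."""
    rd = (inst >> 7) & 0x1F
    if rd != 0:                      # x0 is hard-wired to zero
        regs[rd] = (pc + 4) & 0xFFFF_FFFF
    return (pc + jal_offset(inst)) & 0xFFFF_FFFF


regs = [0] * 32
new_pc = execute_jal(0x008000EF, 0x100, regs)   # encoding of "jal x1, 8"
print(hex(new_pc), hex(regs[1]))
```

With rd = x0 (for example the word 0xFFDFF06F, which encodes "j -4") nothing is written back, which is exactly the plain jump case.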
  OK. When the computer fetches (IF) a jump instruction, the decoder identifies it from the Opcode in the ID stage. It then needs the Offset given in the instruction and the current value of the program counter PC. The offset is added to the program counter, either in the ALU or in a dedicated adder, which finally changes the value of the PC.

  • Register jumps (Jump Register)
    What the instruction needs: opcode (Opcode), register value

  When the computer fetches (IF) a register jump instruction, the decoder in the ID stage determines from the Opcode that the current instruction is a register jump. At this point it does not yet know the address to jump to; it only knows which register holds the jump address, and the address must then be read from that register. There is no offset here: the jump goes directly to the address held in the register.

  • Conditional branches (Conditional Branches)
    What the instruction needs: opcode (Opcode), program counter (PC), register value (for evaluating the condition), offset (Offset)

  This case is a bit more complicated. When the computer fetches (IF) a conditional branch instruction, the decoder in the ID stage determines from the Opcode that the current instruction is a conditional branch. It then needs to obtain the PC value and read the corresponding register, evaluate the condition from the register value (for example, compare it with 0 to see whether it is greater or less), and add the offset given in the instruction to the PC value to get the branch target, completing a PC-relative branch.
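The three cases above differ only in which inputs decide the next PC value. The sketch below puts them side by side; the function `next_pc`, its string tags, and the convention that offsets are relative to PC + 4 are assumptions chosen for illustration, not a description of any particular instruction set.

```python
def next_pc(kind: str, pc: int, offset: int = 0,
            reg_value: int = 0, cond_value: int = 0) -> int:
    """Return the address of the next instruction to execute (illustrative only)."""
    sequential = pc + 4                  # 4-byte instructions assumed
    if kind == "jump":                   # needs: opcode, offset, PC
        return sequential + offset
    if kind == "jump_register":          # needs: opcode, register value
        return reg_value
    if kind == "beqz":                   # needs: opcode, PC, register value, offset
        return sequential + offset if cond_value == 0 else sequential
    return sequential                    # any other instruction falls through


print(next_pc("jump", 100, offset=200))                 # unconditional jump
print(next_pc("jump_register", 100, reg_value=512))     # target read from a register
print(next_pc("beqz", 100, offset=200, cond_value=0))   # condition true: branch taken
print(next_pc("beqz", 100, offset=200, cond_value=7))   # condition false: fall through
```

The conditional branch is the only case whose result depends on a value that is not known from the instruction word and the PC alone.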

3. Control Hazards

3.1. Pipeline control hazards caused by jump instructions

First, let's understand the basic control hazard. The most basic control hazard is the question of how to ensure that the correct next instruction is executed. Look at the following pipeline diagram.

[Figure: pipeline diagram for instructions I1 and I2]
  In the pipeline above there are two instructions, I1 and I2. Instruction 1 takes the value from register r0, adds the immediate value 10, and saves the result to register r1. Instruction 2 takes the value from register r2, adds the immediate value 17, and saves the result to register r3. These are two very simple instructions with no data dependency between them, so no data hazard occurs in this case. The focus here is on the control hazard.

  Instruction 1 executes in the normal five-stage pipeline sequence: instruction fetch IF, decode ID, execute EX, memory access MEM, and write back WB. You may notice that the execution of the second instruction looks a bit different. When the second instruction enters the pipeline, the first instruction has only just completed its fetch stage. This raises a question: "Is the second instruction really the instruction we need to execute?" The question arises because the first instruction has not been decoded yet, so it is not known whether it is a branch or jump instruction. Therefore instruction 2, which flows into the pipeline sequentially, is not necessarily the instruction the program needs to execute next. Only when the first instruction has been decoded in the ID stage do we learn, "Oh, the previous instruction was not a branch or jump instruction", or that it indeed was one.

  So what if instruction 1 is a branch or jump instruction? The decode stage will identify it as such, so in the clock cycle of its ID stage instruction 1 changes the program counter PC, and with it the address from which the next instruction is read. This causes a control hazard. To avoid the hazard, we insert a bubble (Bubble) at the ID stage, delaying the fetch stage of the next instruction by one cycle. If this continues, the pipeline becomes as follows.
[Note]: To avoid this kind of hazard, a no-operation nop is often inserted into the pipeline. Such a no-operation is usually called a pipeline bubble, or simply a bubble (Pipeline Bubble/Bubble).

[Figure: pipeline diagram with a bubble inserted after each instruction]

  Every instruction that flows into the pipeline has to take control hazards into account, so to avoid them a Bubble must be inserted at the ID stage of every instruction. Fetching each instruction then takes two clock cycles, and you will realize that this is a very inefficient pipeline. Now let's analyze such a pipeline in detail by drawing it in another way (with the coordinates changed).

[Figure: the same pipeline redrawn with the coordinates changed]

Drawn this way, it is clear that the execution order of this pipeline is I1, nop, I2, nop, I3, nop, I4... For this pipeline CPI = 2 (1 + 1), whereas ideally CPI = 1, so a machine built to this design delivers exactly half the ideal performance.
[Note]:
Non-pipelined CPI = execution clock cycles / number of instructions executed;
Pipelined CPI = ideal CPI + pipeline stall clock cycles per instruction (ideal CPI = 1)
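The CPI arithmetic above can be checked with a small Python sketch. The function names `issue_order` and `pipelined_cpi` are invented for this example; the model simply places one bubble after every instruction, as in the pipeline just described.

```python
def issue_order(instructions: list) -> list:
    """Issue sequence when one bubble (nop) follows every instruction."""
    order = []
    for name in instructions:
        order.append(name)
        order.append("nop")   # one stall cycle per instruction
    return order


def pipelined_cpi(stall_cycles_per_instruction: float, ideal_cpi: float = 1.0) -> float:
    """Pipelined CPI = ideal CPI + pipeline stall cycles per instruction."""
    return ideal_cpi + stall_cycles_per_instruction


program = ["I1", "I2", "I3", "I4"]
order = issue_order(program)
print(order)
print(len(order) / len(program))   # issue slots used per real instruction
print(pipelined_cpi(1))            # same value from the formula
```

Both ways of counting give the same answer: one stall per instruction doubles the CPI.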

3.2. Solve the control hazard caused by the jump instruction (basic method)

  Now it is clear that jump instructions in the pipeline bring control hazards, so how can this problem be solved? The method is actually simple: guess (Speculate) that the instruction just fetched is not a jump instruction, and directly add 4 to the PC value (if instructions are 4 bytes long).
[Figure: pipeline datapath, with the PC + 4 fetch logic circled in purple]
  The current way the pipeline handles this is the part circled in purple in the figure: guess that the fetched instruction is not a jump instruction, and simply use an adder to add 4 to the PC so that it points to the next sequential instruction. In normal order, the instructions at address 96, address 100, and address 104 would be executed here.

  But looking at the instruction code, the instruction at address 100 is actually a jump instruction. After the instruction at address 100 is fetched, the decode stage ID finds that it is a jump instruction, but by then the instruction at address 104 has already been fetched (because the earlier guess was that the instruction is not a jump, so fetching continued sequentially). The instruction at address 100 tells us that we should fetch and execute the instruction at address 304. We then need to do two things. First, prevent the instruction at address 104 from continuing to execute in the pipeline (that is, kill it). Second, change the PC value to the address to jump to.

[Figure: pipeline datapath with the IRSrc selector added]

  To solve the first problem above, we add a selector, IRSrc, to the pipeline. When the previous instruction is identified as a jump instruction in the decode stage, the selector switches to an empty instruction, nop. And at the end of the cycle, an additional adder adds a field of the instruction to the PC to obtain the new jump address, which takes care of the second point.

[Figure: timeline table of the program executing in the pipeline above]
  The execution of the program in the pipeline circuit above is described in the form of a timeline table. As the figure shows, the second instruction, I2, is decoded as a jump instruction in the ID stage. Although the fetch of I3 has already completed, the selector switches to nop, so I3 flows into the pipeline as a nop and performs no actual work. At the same time, the calculation of the new PC value is completed by the end of the clock cycle of I2's ID stage. When clock cycle t3 arrives, IF fetches again, this time the instruction at address 304, and the jump is complete.
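The timeline above can be reproduced with a very small fetch-stage model in Python. Everything here is a simplification made up for illustration: the function `simulate_jump`, the "J" mnemonic, the tuple format of `program`, and the assumption that the jump offset is relative to PC + 4 (so that the jump at address 100 with offset 200 lands on 304). Only IF and ID are modelled, and the kill stands in for the IRSrc selector choosing nop.

```python
def simulate_jump(program: dict, start_pc: int, cycles: int) -> list:
    """Trace (cycle, fetched address, 'kept' or 'killed') for each clock cycle."""
    pc = start_pc
    in_decode = None   # (address, mnemonic, offset) currently in ID, None for a bubble
    trace = []
    for t in range(cycles):
        fetched = (pc, *program[pc])          # IF: fetch at the speculated PC
        if in_decode is not None and in_decode[1] == "J":
            # ID decoded a jump: the instruction just fetched is replaced by nop
            # and the PC is redirected to the jump target.
            trace.append((t, pc, "killed"))
            pc = in_decode[0] + 4 + in_decode[2]
            in_decode = None
        else:
            # Guess was right (not a jump): keep the fetch and continue with PC + 4.
            trace.append((t, pc, "kept"))
            in_decode = fetched
            pc = pc + 4
    return trace


program = {
    96: ("ADD", 0),
    100: ("J", 200),    # hypothetical jump whose target is 100 + 4 + 200 = 304
    104: ("ADD", 0),
    304: ("ADD", 0),
}
for row in simulate_jump(program, start_pc=96, cycles=4):
    print(row)
```

The trace shows one wasted fetch (address 104) per jump, instead of one wasted cycle per instruction as in the bubble-everywhere design.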

3.3. Pipeline control hazards caused by conditional branches

I1            096      ADD
I2            100      BEQZ   r1   +200
I3            104      ADD
              108      ...
I4            304      ADD

  Here is a piece of instruction code. The instruction at address 100 checks whether the value of register r1 is equal to 0 and, if so, jumps to the address offset by 200. As you can see, the pipeline now has a branch, so the questions are:

  • 1. How do we know whether to take this branch;
  • 2. And what do we do once the branch is taken.

3.4. Solve the pipeline control hazard caused by conditional branch instructions (basic method)

  Let's look at the first question first: how do we know whether to take this branch, and can this be decided directly in the decode stage, as for a jump instruction (that is, from the type of the decoded instruction)? For conditional branch instructions this approach does not work, because at that point we only know that the instruction is a conditional branch, not whether its condition is true. A hardware logic unit capable of comparison is therefore needed. Subtraction and comparison operations of this kind are well suited to the ALU, from which a zero line (zero wire) is then led out, as shown in the figure below.

[Figure: pipeline datapath with the zero wire led out of the ALU]
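What the zero wire carries can be sketched as follows. This is a software model only, with invented names (`alu_sub`, `beqz_taken`) and an assumed 32-bit register width: the ALU subtracts, and the zero output is true exactly when the result is 0.

```python
MASK32 = 0xFFFF_FFFF   # assumed 32-bit registers


def alu_sub(a: int, b: int) -> tuple:
    """Return (a - b truncated to 32 bits, zero flag)."""
    result = (a - b) & MASK32
    zero = result == 0            # this is the signal on the zero wire
    return result, zero


def beqz_taken(r1_value: int) -> bool:
    """BEQZ compares r1 with 0; the branch is taken when the zero wire is asserted."""
    _, zero = alu_sub(r1_value, 0)
    return zero


print(beqz_taken(0))   # r1 == 0, branch taken
print(beqz_taken(5))   # r1 != 0, fall through
```

The important point for the pipeline is timing: this result only exists once the branch has reached the stage that contains the ALU.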

  With this approach, whether to take the branch is decided from the zero wire. The PC prediction scheme is still to guess that no jump occurs, so the IF stage fetches address PC + 4 = 104 during the ID stage of I2, and during the clock cycle in which the branch is computed, IF fetches again at address PC + 4 = 108. So by the time we can determine whether to take the branch, we are already fetching yet another instruction.

  You should have noticed that by the time we can determine whether to take the branch, two instructions have already been fetched. Until the zero wire signal is available, it is not clear whether these two instructions should be killed (dropped and replaced with nop) or allowed to continue in the pipeline.
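The cost of resolving the branch one stage later can be seen in a small Python model. It is a sketch under stated assumptions: the function `simulate_branch` and the tuple format of `program` are invented here, the value of r1 is passed in as a parameter instead of modelling a register file, the offset is taken relative to PC + 4 so that the target is 304 as in the listing above, and only the IF, ID, and EX stages are tracked.

```python
def simulate_branch(program: dict, start_pc: int, r1_value: int, cycles: int) -> tuple:
    """Return (addresses fetched per cycle, addresses killed)."""
    pc = start_pc
    in_decode = None    # instruction in ID: (address, mnemonic, offset)
    in_execute = None   # instruction in EX
    fetched_log, killed_log = [], []
    for _ in range(cycles):
        fetched = (pc, *program[pc])     # IF keeps fetching at PC + 4 (guess: not taken)
        fetched_log.append(pc)
        branch_taken = (
            in_execute is not None
            and in_execute[1] == "BEQZ"
            and r1_value == 0            # zero wire asserted in EX
        )
        if branch_taken:
            # Both younger instructions (in ID and in IF) were fetched on a wrong guess.
            killed_log.extend(x[0] for x in (in_decode, fetched) if x is not None)
            pc = in_execute[0] + 4 + in_execute[2]
            in_execute, in_decode = None, None
        else:
            in_execute, in_decode = in_decode, fetched
            pc += 4
    return fetched_log, killed_log


program = {
    96: ("ADD", 0),
    100: ("BEQZ", 200),
    104: ("ADD", 0),
    108: ("ADD", 0),
    112: ("ADD", 0),
    304: ("ADD", 0),
}
print(simulate_branch(program, 96, r1_value=0, cycles=5))   # branch taken
print(simulate_branch(program, 96, r1_value=3, cycles=5))   # branch not taken
```

When the branch is taken, the instructions at 104 and 108 are both discarded; when it is not taken, the sequential guess was right and nothing is lost.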

  This raises another problem. Think about it: we can use the stall signal to stop the registers from changing, and we can use the selector to redirect the instructions flowing into the pipeline; with these two mechanisms we can cancel work already in the pipeline. So how should the priority of these two actions be chosen? Can it be arbitrary, or must there be a fixed order?

[Figure: pipeline datapath with the stall signal line drawn in red]

  OK, now assuming that the stall signal has the higher priority, the red stall signal line in the picture above will prevent the registers from changing,
Note: (this article is not finished; it will be improved in the near future...)

Origin blog.csdn.net/qq_36393978/article/details/129435614