Windows Protected Mode Study Notes (x) - TLB

Foreword

First, these notes follow the intermediate course from 滴水编程达人 (official site: https://bcdaren.com).
Second, teacher Haidong rocks!

Address Resolution

When we access a physical page through a linear address (for example, MOV EAX,[0x12345678]), the CPU actually reads more than just those four bytes.

10-10-12 paging

  1. The CPU uses the linear address to find the PDE: 4 bytes
  2. The CPU uses the linear address and the PDE to find the PTE: 4 bytes
  3. Finally, the PTE locates the physical page and the data is read: 4 bytes

That is 12 bytes in total, and possibly more if the access crosses a page boundary.

2-9-9-12 paging

  1. Find the PDPTE: 8 bytes
  2. Find the PDE: 8 bytes
  3. Find the PTE: 8 bytes
  4. Finally, find the physical page and read the data: 4 bytes

That is 28 bytes in total (8 + 8 + 8 + 4), and possibly more if the access crosses a page boundary.


  • To make access more efficient, the CPU records which physical address each linear address maps to.
  • The CPU keeps an internal table for these records. It is as fast as a register and is called the TLB (Translation Lookaside Buffer).
  • Because the TLB has to be this fast, it cannot be large: it holds anywhere from a few dozen to a few hundred entries.

Question: a process has 4 GB of address space and a huge number of linear addresses, yet a TLB can hold at most a few hundred records. Is such a table really useful?

TLB

TLB structure

ATTR: attributes
In 10-10-12 paging mode: ATTR = PDE attributes & PTE attributes
In 2-9-9-12 paging mode: ATTR = PDPTE attributes & PDE attributes & PTE attributes

LRU: usage statistics
Because the TLB is small, when it is full and a new address has to be recorded, the TLB uses these statistics to decide which addresses are rarely used and removes those records.

Notes:

  1. Different CPUs have TLBs of different sizes.
  2. Whenever CR3 changes, the TLB is flushed immediately. Each core has its own TLB.
  3. The operating system's upper 2 GB mapping is essentially the same for every process, so flushing those records on every CR3 change and then rebuilding them would be a waste.
    For this reason the PDE and PTE have a G flag (in a PDE, the G flag only takes effect when the PDE maps a large page). If the G bit is 1, the corresponding TLB record is not flushed when CR3 changes.
    For pages with the G bit set to 1, when the TLB is full the CPU still uses the statistics to discard rarely used addresses and keep the most frequently used ones.

TLB types

In x86 CPUs, the TLB first saw practical use in Intel's 486. x86 CPUs generally have the following four groups of TLBs:

Group 1: caches page table entries for ordinary (4 KB) instruction pages (Instruction-TLB);
Group 2: caches page table entries for ordinary (4 KB) data pages (Data-TLB);
Group 3: caches page table entries for large (2 MB / 4 MB) instruction pages (Instruction-TLB);
Group 4: caches page table entries for large (2 MB / 4 MB) data pages (Data-TLB).


Note: all of the exercises below use the 10-10-12 paging mode.

Exercise 1: Observe the TLB in action

Step 1: Run the code

Note: set a breakpoint anywhere before the interrupt gate call (int 0x20) executes, and run to that breakpoint.

#include <stdio.h>
#include <windows.h>

DWORD x, y, z;

void __declspec(naked) PageOnNull() {
	__asm
	{
		// save context
		push ebp
		mov ebp, esp
		sub esp, 0x100
		push ebx
		push esi
		push edi
	}

	DWORD* pPTE;			// linear address of the PTE for the target linear address
	DWORD* pNullPTE;		// linear address of the PTE for address 0
	pNullPTE = (DWORD*)0xC0000000;

	// map address 0 to the physical page behind 0x50000000
	pPTE = (DWORD*)(0xC0000000 + (0x50000000 >> 10));
	*pNullPTE = *pPTE;

	x = *(DWORD*)0;

	// remap address 0 to the physical page behind 0x60000000
	pPTE = (DWORD*)(0xC0000000 + (0x60000000 >> 10));
	*pNullPTE = *pPTE;

	y = *(DWORD*)0;

	// flush the TLB by reloading CR3
	__asm {
		mov eax, cr3
		mov cr3, eax
	}

	// read the data at address 0 again
	z = *(DWORD*)0;



	__asm
	{
		// restore context
		pop edi
		pop esi
		pop ebx
		mov esp, ebp
		pop ebp
		iretd
	}
}

int main(int argc, char* argv[])
{
	DWORD* p5 = (DWORD*)VirtualAlloc((LPVOID)0x50000000, 4, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
	DWORD* p6 = (DWORD*)VirtualAlloc((LPVOID)0x60000000, 4, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

	if (p5 != (DWORD*)0x50000000 || p6 != (DWORD*)0x60000000)
	{
		printf("Error alloc!\n");
		return -1;
	}

	*p5 = 0x1234;
	*p6 = 0x5678;

	__asm
	{
		// elevate privilege through the interrupt gate
		int 0x20
	}

	printf("1. Read the data at address 0:\n");
	printf("*NULL = 0x%x \n\n", x);

	printf("2. Remap address 0 to a new physical page\n\n");

	printf("3. Read the data at address 0 again:\n");
	printf("*NULL = 0x%x \n\n", y);

	printf("4. Flush the TLB \n\n");

	printf("5. Read the data at address 0 once more:\n");
	printf("*NULL = 0x%x \n", z);

	return 0;
}

Step 2: Set the interrupt gate descriptor

First, look up the entry address of the PageOnNull function in the disassembly view.
[Figure: function entry address]
From that address, the interrupt gate descriptor is: 0040ee00`00081030

Use WinDbg to write the interrupt gate descriptor into IDT[0x20]:
[Figure: setting the interrupt gate descriptor]

kd> eq 8003f500 0040ee00`00081030

Step 3: Continue the program

Resume execution in WinDbg so that the virtual machine keeps running, then let the program continue from the breakpoint.

Result:
[Figure: program output]
The experiment succeeded!

Experiment summary

  1. After x is assigned, address 0 is remapped to a new physical page before y is assigned, yet x and y print the same value.
  2. After CR3 is reloaded, address 0 is not remapped again, yet z prints the new value.
  3. This is because, before CR3 is reloaded, the first access to address 0 (for x) writes the mapping between that linear address and its physical address into the TLB. When y is assigned, the TLB record has not been updated, so the original physical page is still the one accessed.

Exercise 2: The purpose of global pages

Omitted (to be added)

Exercise 3: The purpose of the INVLPG instruction

Omitted (to be added)

Origin blog.csdn.net/qq_41988448/article/details/102736062