Homemade OS Minimalist Tutorial: The Most Difficult Hello World in History

"Don't worry about having nothing to do after you finish writing an OS, because you will never finish writing it"

Since this is the first article in the series that actually gets into the technical details, there may be a bit more rambling than usual. Before you start, please review (or preview) the two previous articles:

"OS Tutorial 0: The Most Hardcore Explanation of the Computer Startup Process on the Whole Internet"

"OS Tutorial 1: How Hard Is It to Write an Operating System"

Also, I redid the cover for the series in this issue. It looks a little high-tech, hey~

Preface

In the previous article, "Homemade OS Minimalist Tutorial 1: How Hard Is It to Write an Operating System", I drew on my own experience of falling into this pit as a complete novice and climbing to a slightly less complete novice, and talked about my mental journey on the road to a homemade operating system.

Some people fulfilled this wish when they were young. Some completed an operating system by following the labs of a university operating systems course. Others, out of curiosity or a thirst for the underlying principles, slowly ground one out after work. And of course, some people have been thinking about it all their lives but never got around to it.

Whether you write an operating system yourself or study the implementation of one in depth, I think it is a path you must take to become a master. It is a hurdle you will have to cross sooner or later, so why not start with me now? Right~

Things to know before you start

  1. Don't rely solely on this series. You cannot write an operating system just by following along with it. I will only post the core code, explain the core ideas, and point out the pits I fell into along the way. For many details you will need to read more thorough material, but I will tell you what to read.

  2. No assembly or C syntax. This is part of the minimalism. Many books and tutorials are bulky because they mix operating-system knowledge with assembly and C language basics. It is a bit like a TV series stuffed with advertisements, and it makes the main storyline harder to follow.

  3. Grasp the big picture and let the small things go, but also dig into the details. It sounds contradictory. The reason is that an operating system has to be understood both at a very macroscopic level and in extreme detail; it reaches from the sky all the way down to the ground. If you want to absorb a chapter in one bathroom break, focus on the big picture and don't worry about every detail. But if you are stuck on a detail, for example you cannot go on because you don't know the exact data structure of the global descriptor table, don't be stingy with your time. Spend an afternoon or a whole day on the global descriptor table. It is not wasted time; it will only save you time in the long run.

After all that preparation, we finally arrive at the first chapter of almost every tutorial: hello world

I don’t know why, the word abandon comes to my mind...

What this hello world has to do

The goal is to take the operating system code you wrote, which is just a string of binary 0101..., write it to a certain position on a hard disk, insert the hard disk into a computer, press the power button, and finally see the string hello world on a black screen.

Broken down, the process is:

  1. Take the hard drive out of the computer

  2. Starting from the first sector of the hard disk, burn 0100001000100111... into it

  3. Put the hard drive back in

  4. Press the power button

  5. Hello world appears on the screen

But doing this for real is expensive, so we will simulate every step in software. The first psychological obstacle to overcome here: please believe that the two are the same.

We use the bochs virtual machine in place of a real computer, an unformatted virtual hard disk file in place of a real hard disk, and the dd command in place of the burning process. What you see in the end looks like this.

Friendly reminder: installing and configuring bochs, what a virtual hard disk file is, and how to install and use the dd command are things you need to look up yourself. This is to keep the main thread clear (and because I am lazy). It is not easy, either. Getting bochs installed and configured tortured me for a long time.

Review the computer startup process

The hello world of a homemade operating system may be the hardest hello world program there is. The difficulty lies not in the amount of code but in understanding the boot process. Fortunately, my earlier article "The Most Hardcore Explanation of the Computer Startup Process on the Whole Internet" already paves the way. I will be lazy and ask readers who don't understand the process to read it, but here is a short summary of that article: three prerequisites and four jumps.

Let me start with the three prerequisites. I have to assume you know these, otherwise I would need to start from protons, neutrons and atoms. The three prerequisites are:

  1. The memory is the place where data is stored. Given an address signal, the memory can return the data corresponding to the address.

  2. The way the CPU works is to constantly fetch instructions from memory and execute them.

  3. Which address the CPU fetches from is determined by the value in a register. That value either keeps advancing by 1 (in an abstract sense) or is set explicitly by a jump instruction.
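To make the three prerequisites concrete, here is a toy model of them in Python. It is only a sketch: the three-instruction "instruction set" (NOP, JMP, HALT) and the function name run are made up for illustration and have nothing to do with real x86 encoding.

```python
# Toy model of the three prerequisites. The instruction set (NOP/JMP/HALT)
# is invented for illustration; it is not x86.
def run(memory, pc, max_steps=10):
    """Fetch from memory[pc], execute, update pc; return the addresses visited."""
    trace = []
    for _ in range(max_steps):
        op, arg = memory[pc]      # prerequisite 1: give an address, get data back
        trace.append(pc)          # prerequisite 2: fetch an instruction and execute it
        if op == "HALT":
            break
        if op == "JMP":
            pc = arg              # prerequisite 3: a jump sets the register directly
        else:
            pc += 1               # prerequisite 3: otherwise the abstract "+1"
    return trace

memory = {
    0: ("NOP", None),
    1: ("JMP", 5),
    5: ("NOP", None),
    6: ("HALT", None),
}
print(run(memory, pc=0))  # [0, 1, 5, 6]
```

The trace shows the register walking forward one step at a time until a jump overwrites it, which is all the "four jumps" below rely on.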

With these three prerequisites in hand, the next step is to memorize the four key jumps after power-on. They were decided long ago by the folks at Intel and the BIOS vendors. There is no deeper reason; just remember them:

  1. When you press the power button, the CPU forces the PC register to 0xffff0, which is the entry address of the BIOS program (first jump)

  2. The instruction at the entry address is a jump to 0xfe05b, where execution continues (second jump)

  3. After some hardware checks, the last step of the BIOS is to load (copy) the contents of the boot sector to memory address 0x7c00 and jump there (third jump)

  4. The main job of the boot sector code is to load the operating system kernel and jump to where it was loaded (fourth jump)
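The addresses in these jumps are physical addresses. In real mode the CPU forms them from a segment:offset pair as segment * 16 + offset. The sketch below applies that rule; the helper name physical and the particular segment:offset pairs are my own picks for illustration, not something taken from the BIOS.

```python
def physical(segment, offset):
    """Real-mode address translation: physical = segment * 16 + offset."""
    return (segment << 4) + offset

# Several segment:offset pairs can name the same physical address.
print(hex(physical(0xF000, 0xFFF0)))  # 0xffff0, the BIOS entry address
print(hex(physical(0xF000, 0xE05B)))  # 0xfe05b, the target of the second jump
print(hex(physical(0x0000, 0x7C00)))  # 0x7c00, where the boot sector is loaded
print(hex(physical(0x07C0, 0x0000)))  # 0x7c00 again, written a different way
```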

Hands-on: implementing the boot sector code

The first three jumps are carried out by logic hard-wired into the CPU and code hard-coded in the BIOS. We don't need to care about them, and you must not get hung up on them, otherwise you will never reach the hello world step...

What we are going to write is the boot sector code, the part responsible for the fourth jump. Its real job is to load the operating system kernel, but since this is a hello world program, we only make the boot sector code print a string to the screen~

After all that preparation, things speed up from here!

Step 1: Create a new file boot.s

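; The BIOS loads the boot sector to this location in memory,
; so the address offset has to be set accordingly
section mbr vstart=0x7c00

; Write data directly to video memory
mov ax,0xb800 ; this is the first instruction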
mov gs,ax
mov byte [gs:0x00],'h'
mov byte [gs:0x02],'e'
mov byte [gs:0x04],'l'
mov byte [gs:0x06],'l'
mov byte [gs:0x08],'o'
mov byte [gs:0x0a],' '
mov byte [gs:0x0c],'w'
mov byte [gs:0x0e],'o'
mov byte [gs:0x10],'r'
mov byte [gs:0x12],'l'
mov byte [gs:0x14],'d'

jmp $

; The last two of the 512 bytes are the boot sector signature
times 510-($-$$) db 0
db 0x55,0xaa

The code is easy to understand. It has three main parts.

Start: section mbr vstart=0x7c00

Since the BIOS will eventually load this code from the boot sector of the hard disk to address 0x7c00 in memory, section mbr vstart=0x7c00 declares that offset. Without it, any variable addresses and jump targets computed by the assembler would be wrong.
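The effect of the vstart=0x7c00 setting is plain arithmetic: a byte that sits at some offset inside boot.bin ends up at 0x7c00 plus that offset once the BIOS has copied the sector. A minimal sketch, where the function name runtime_address and the label offset 0x49 are hypothetical values chosen for illustration:

```python
VSTART = 0x7C00  # where the BIOS copies the boot sector

def runtime_address(file_offset, vstart=VSTART):
    """Address a byte of boot.bin has in memory once the sector is loaded."""
    return vstart + file_offset

# Hypothetical label sitting 0x49 bytes into boot.bin
print(hex(runtime_address(0x49)))            # 0x7c49, what the code needs
print(hex(runtime_address(0x49, vstart=0)))  # 0x49, wrong once loaded at 0x7c00
```

As far as I can tell, this particular hello world would survive without the setting, because jmp $ is a relative jump and the writes go to fixed video memory offsets. It starts to matter as soon as the code refers to its own labels by absolute address.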

End: db 0x55,0xaa

The last two bytes, 0x55 0xaa, are the boot sector signature. If the sector does not end with these two bytes, the BIOS will not treat it as a boot sector. It is then just the first sector of the hard disk, and its contents will not be loaded.
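The check the BIOS performs can be sketched in a few lines of Python. This is a model of the rule described above, not BIOS code, and the name is_bootable is made up for illustration.

```python
SECTOR_SIZE = 512

def is_bootable(sector: bytes) -> bool:
    """True if the sector is 512 bytes long and ends with the 0x55 0xaa signature."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xaa"

blank = bytes(SECTOR_SIZE)            # all zeros, no signature
signed = bytes(510) + b"\x55\xaa"     # zeros plus the signature
print(is_bootable(blank), is_bootable(signed))  # False True
```

You could run the same check on the first 512 bytes of your own boot.bin before handing it to the virtual machine.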

Middle: mov byte [gs:0x00],...

The code in the middle is what produces the final effect. In the article on the startup process we covered the memory layout in real mode, so we know that the range 0xB8000 - 0xBFFFF is where the text-mode video memory of the graphics card is mapped. Writing to this range is equivalent to writing to the graphics card's own memory, and that is how text gets onto the screen. Loading 0xb800 into gs makes [gs:0x00] refer to physical address 0xB8000, because in real mode the segment value is multiplied by 16.
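Why do the offsets in the listing go 0x00, 0x02, 0x04 and so on instead of 0, 1, 2? In text mode each character cell takes two bytes: the character code first, then an attribute byte for color. The sketch below computes the offsets, assuming the usual 80-column text mode; the names cell_address and offsets_for are mine, for illustration only.

```python
VIDEO_BASE = 0xB8000   # start of text-mode video memory
COLUMNS = 80           # assumption: standard 80x25 text mode

def cell_address(row, col):
    """Physical address of the character byte for the cell at (row, col)."""
    return VIDEO_BASE + 2 * (row * COLUMNS + col)

def offsets_for(text, row=0, col=0):
    """(offset from 0xB8000, character) pairs for text starting at (row, col)."""
    start = cell_address(row, col) - VIDEO_BASE
    return [(start + 2 * i, ch) for i, ch in enumerate(text)]

# Regenerate the mov lines of boot.s from the string itself
for off, ch in offsets_for("hello world"):
    print(f"mov byte [gs:0x{off:02x}],'{ch}'")
```

The loop prints the same eleven mov instructions as the listing, from [gs:0x00],'h' through [gs:0x14],'d'. The odd offsets in between are the attribute bytes, which the listing leaves untouched.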

As for how the graphics card turns the data in its memory into small bright dots on the screen, don't ask me, I don't know. If that makes you more curious than the code does, you are probably cut out to be a hardware engineer, hehe~

Step 2: Compile it

nasm -o boot.bin boot.s

I said I wouldn't explain assembly, and when I say I won't, I really won't, haha~

Step 3: Create a virtual disk image and fill the first sector

Use the bximage tool that comes with bochs to create an unformatted virtual disk image 60 MB in size

bximage -mode=create -hd=60 -q os.raw

Write the freshly assembled binary file boot.bin into the first sector of the disk image with the dd command, which is equivalent to burning data onto a disk. The conv=notrunc option matters: without it, dd truncates os.raw to the 512 bytes it just wrote.

dd if=boot.bin of=os.raw bs=512 count=1 conv=notrunc
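If dd feels like a black box, here is roughly what this step does, sketched in Python: overwrite the first 512 bytes of the image in place and leave the rest of the file alone. The function name write_first_sector is hypothetical, and the demo runs on small throwaway files rather than a real 60 MB image.

```python
import os
import tempfile

def write_first_sector(image_path, boot_path, sector_size=512):
    """Copy one sector from boot_path over the start of image_path, in place."""
    with open(boot_path, "rb") as boot:
        sector = boot.read(sector_size)
    if len(sector) != sector_size:
        raise ValueError(f"expected {sector_size} bytes, got {len(sector)}")
    # "r+b" opens for update without truncating, so the rest of the image survives
    with open(image_path, "r+b") as image:
        image.seek(0)
        image.write(sector)

# Demo on throwaway files (a 4-sector image stands in for os.raw)
with tempfile.TemporaryDirectory() as tmp:
    boot_path = os.path.join(tmp, "boot.bin")
    image_path = os.path.join(tmp, "os.raw")
    with open(boot_path, "wb") as f:
        f.write(bytes(510) + b"\x55\xaa")
    with open(image_path, "wb") as f:
        f.write(bytes(4 * 512))
    write_first_sector(image_path, boot_path)
    image_size = os.path.getsize(image_path)
    with open(image_path, "rb") as f:
        first_sector = f.read(512)

print(image_size, first_sector[-2:].hex())  # 2048 55aa
```

The image keeps its full size and only its first sector changes, which is the property the disk image needs to stay usable.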

Since it is an unformatted virtual hard disk file, we can open os.raw in a binary editor and see that its beginning is exactly the same as the boot.bin we assembled, namely the following bytes:

Of course, boot.bin is only 512 bytes while os.raw is 60 MB, but their first 512 bytes are identical. Note that the last two bytes of the sector are 55 AA, the boot sector signature written in our code. The run of non-zero data at the beginning is the assembled machine code. The zeros in between mean nothing: no machine instructions and no data landed in that area, so it is just padding.

Of course, you don't have to get this binary by writing assembly, assembling it, and writing it with dd. You could also type the bytes of this file in directly, one at a time, on the keyboard.
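To show that the file really is nothing but bytes, here is a sketch that builds the 512-byte sector in Python with no assembler involved. The opcode bytes are my own hand-assembly of the listing above for 16-bit mode. I expect them to match what nasm emits, but treat that as an assumption and compare the result against your own boot.bin. The function name build_boot_sector is made up for illustration.

```python
def build_boot_sector(text="hello world"):
    """Hand-assemble the hello world boot sector as raw bytes (16-bit x86)."""
    code = bytearray()
    code += bytes([0xB8, 0x00, 0xB8])   # mov ax,0xb800
    code += bytes([0x8E, 0xE8])         # mov gs,ax
    for i, ch in enumerate(text):
        off = 2 * i                     # every cell is 2 bytes wide
        # 0x65 = gs prefix, 0xC6 0x06 = mov byte [disp16],imm8
        code += bytes([0x65, 0xC6, 0x06, off & 0xFF, off >> 8, ord(ch)])
    code += bytes([0xEB, 0xFE])         # jmp $ (jump to itself)
    if len(code) > 510:
        raise ValueError("code does not fit in one boot sector")
    padding = bytes(510 - len(code))    # times 510-($-$$) db 0
    return bytes(code) + padding + bytes([0x55, 0xAA])

sector = build_boot_sector()
print(len(sector), sector[:5].hex(), sector[-2:].hex())  # 512 b800b88ee8 55aa
```

The machine code takes 73 bytes here, and everything after it up to the signature is zero padding, which matches the long run of zeros described above.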

And if you burned this binary data into the first sector of a real hard disk and plugged it into a real computer, the hello world string would appear on the screen when you pressed the power button. Yes, it's that simple and crude.

Now we have a virtual hard disk file os.raw with the "operating system" burned onto it, and the bochs virtual machine to play the computer. All that is left is to insert the disk and press the power button.

Step 4: Start it with bochs

First, specify the disk in the bochs configuration file bochs.properties so that the bochs virtual machine can read the virtual disk file. This is equivalent to plugging the disk into a real machine.

ata0-master: type=disk, path="os.raw", mode=flat, cylinders=121, heads=16, spt=63
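The cylinders, heads and spt (sectors per track) values in this line describe the disk geometry, and multiplied together with the 512-byte sector size they give the capacity the virtual machine expects. A quick sanity check, with the helper name chs_sectors chosen by me for illustration:

```python
SECTOR_SIZE = 512

def chs_sectors(cylinders, heads, spt):
    """Total number of sectors described by a cylinders/heads/spt geometry."""
    return cylinders * heads * spt

sectors = chs_sectors(cylinders=121, heads=16, spt=63)
size_bytes = sectors * SECTOR_SIZE
print(sectors, size_bytes)  # 121968 62447616
```

That is a little under 60 MiB (60 * 1024 * 1024 = 62914560 bytes), presumably because the geometry only counts whole cylinders. If you create an image of a different size, these three numbers need to change with it.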

Then start the virtual machine with the bochs command, which is equivalent to pressing the computer power button.

bochs -f bochs.properties

And then I saw the picture below

As for what it looks like running on a real machine, please picture that yourself

Summary flowchart

Everything we did above is captured in the following picture

To sum up, it really is simple: assemble boot.s, written in assembly language, into the pure binary file boot.bin, write that file into the first sector of the disk (the virtual hard disk file os.raw generated by bochs), and finally start the computer with bochs.

After the computer starts, the BIOS loads the binary data we just wrote to the first sector of the disk into the 512 bytes of memory beginning at 0x7c00, and then a jump instruction transfers control to 0x7c00, where execution begins.
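This load-and-jump step can be modeled in a few lines of Python. It is a sketch of the behavior described here, not of any real BIOS: memory is a 1 MiB byte array, the disk is a bytes object, and the function name bios_boot is hypothetical.

```python
LOAD_ADDRESS = 0x7C00
SECTOR_SIZE = 512

def bios_boot(disk: bytes, memory: bytearray) -> int:
    """Copy the first sector of disk to 0x7c00 and return the address to jump to."""
    sector = disk[:SECTOR_SIZE]
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise RuntimeError("no bootable sector found")
    memory[LOAD_ADDRESS:LOAD_ADDRESS + SECTOR_SIZE] = sector
    return LOAD_ADDRESS

memory = bytearray(1 << 20)  # 1 MiB of real-mode address space
# A 2-sector stand-in disk: jmp $ (eb fe), padding, signature, one empty sector
disk = b"\xeb\xfe" + bytes(508) + b"\x55\xaa" + bytes(512)
entry = bios_boot(disk, memory)
print(hex(entry), memory[entry:entry + 2].hex())  # 0x7c00 ebfe
```

After the call, the bytes at 0x7c00 in memory are the first bytes of the disk, which is exactly where the CPU starts fetching once it jumps there.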

The first instruction is the machine code assembled from mov ax,0xb800, and execution continues from there. The instructions that follow write data into the memory range 0xB8000 - 0xBFFFF that is mapped to the graphics card, and the graphics card lights up small bright dots on our screen according to the data in that range, so that we see

  hello world  

In closing

If you have followed along this far and can reproduce this hello world in various ways of your own, give yourself a like!

Why is this hello world hard? From my own experience, the difficulty comes down to one piece of theory, one piece of practice, and one psychological barrier.

The theory is the computer startup process, which mixes together knowledge of CPU hardware, computer organization, BIOS conventions and more. You can sort this part out with the article "The Most Hardcore Explanation of the Computer Startup Process on the Whole Internet".

The practice is setting up the environment. This part is a huge headache, a real mountain. But once you are proficient, it is just a tool. If you don't get along with bochs, qemu is fine. If Windows doesn't work well for you, use linux. If nasm annoys you, use another assembler. In short, when it is hard it is a mountain you cannot climb, and when it is easy it is just a tool. I hope you will explore this part yourself, and don't give up after hitting a wall a few times.

The psychological barrier is that a virtual machine never quite feels like a real machine. At the beginning I kept agonizing: can this thing run on a real machine? I used to install systems from a CD, so can I write this to a CD? Everyone else's virtual machine is VirtualBox, so can I boot it with VirtualBox? At home, people install a system by booting a PE environment from a USB stick and then choosing a disk to install to, so why is what we are doing so different from that?

Please don't worry about these questions yet. It is like asking how to do integration and differentiation before you can add, subtract, multiply and divide. Take it one step at a time, hold those questions back for now, and trust that every piece of knowledge will be used later. By the time you get further, either you will be able to build that industrial-grade setup yourself, or you will find that those questions no longer matter. You will feel that the time you spent agonizing at the start would have been better spent accepting things and moving on.

Then again, some valuable knowledge is gained precisely by taking detours. So, who cares~

That's it for hello world. This program is where the dream begins, and the world beyond it is even more exciting. See you in the next class!

(Finish)

Monday is very decadent, Thursday is very hard-core "low concurrency programming"

Origin blog.csdn.net/coderising/article/details/111713993