A brief primer on assemblers, compilers, and interpreters


A brief look at the historical evolution of programming.

-- Erik O'shaughnessy (Author)

In the early days of computing, hardware was very expensive and programmers were cheap. These cheap programmers didn't even have the title "programmer"; the role was usually filled by mathematicians or electrical engineers. Early computers were used to solve complex mathematical problems quickly, so mathematicians were a natural fit for "programming" jobs.

What is a program?

First, a little background. Computers can't do anything on their own; any action they take requires a program to guide them. You can think of a program as a very precise recipe that takes an input and produces a corresponding output. Each step in the recipe is an instruction that operates on data. That sounds complicated, but you probably know what the following statement means:

1 + 2 = 3

The plus sign is the "instruction", and the numbers 1 and 2 are the data. In mathematics, the equal sign means both sides of the equation are "equivalent", but in most programming languages applying the equal sign to a variable means "assignment". If a computer executes the statement above, it will store the result of the addition (namely 3) somewhere in memory.
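As a small sketch (using Python purely for illustration), the difference between mathematical equality and programming assignment looks like this:

```python
x = 1 + 2        # '=' is assignment: compute 1 + 2, store the result in x
print(x)         # prints 3
print(x == 3)    # '==' tests equality, like the mathematical equal sign: True
```

Here the result really is stored somewhere in memory, under the name x, and can be read back later.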

Computers know how to do arithmetic on numbers and how to move data around in memory. We won't go deep into memory here; just know that memory generally falls into two categories: "fast but small" and "slow but large". CPU registers are very fast to read and write but hold very little, like a shorthand note. Main memory usually has plenty of space, but its read and write speeds are far slower than the registers'. As a program runs, the CPU keeps moving the data it needs from main memory into registers, and then moves the results back into main memory.

Assemblers

Computers were expensive at the time, and labor was cheap. Programmers spent a lot of time translating handwritten mathematical expressions into instructions the computer could execute. The first computers had terrible user interfaces; some had nothing more than toggle switches on the front panel. The switches represented the 0s and 1s of a single memory "cell". The programmer would configure a cell, choose a storage location, and then commit the cell to memory. It was a time-consuming and error-prone process.

Programmers Betty Jean Jennings (left) and Fran Bilas (right) in action

Then an electrical engineer decided his time was precious and wrote a program that could convert a human-readable, recipe-like input into a computer-readable version. This was the first "assembler", and it was quite controversial at the time. The owners of these expensive machines didn't want to waste computing resources on a task that people could already do (albeit slowly and with errors). Over time, however, it became clear that using an assembler was faster and more accurate than writing machine language by hand, and the amount of "real work" the computer did increased.

Although the assembler was a big improvement over flipping bits on a machine's front panel, this way of programming was still highly specialized. The addition example above would look more or less like this in assembly language:

01 MOV R0, 1
02 MOV R1, 2
03 ADD R0, R1, R2
04 MOV 64, R0
05 STO R2, R0

Each line is a computer instruction: a shorthand name for the instruction, followed by the data the instruction operates on. This little program first "moves" the value 1 into register R0, then moves 2 into register R1. Line 03 adds the values in the two registers, R0 and R1, and stores the result in register R2. Finally, lines 04 and 05 determine where the result should be placed in main memory (here, address 64). Managing where data lives in memory is one of the most time-consuming and error-prone parts of programming.
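To make those five steps concrete, here is a toy simulation in Python (purely illustrative; the three registers and the tiny 128-cell memory are assumptions of the sketch, not any real machine):

```python
# Toy model of the machine: three registers and a small main memory.
registers = {"R0": 0, "R1": 0, "R2": 0}
memory = [0] * 128  # an assumed, tiny main memory

registers["R0"] = 1                                   # 01 MOV R0, 1
registers["R1"] = 2                                   # 02 MOV R1, 2
registers["R2"] = registers["R0"] + registers["R1"]   # 03 ADD R0, R1, R2
registers["R0"] = 64                                  # 04 MOV 64, R0
memory[registers["R0"]] = registers["R2"]             # 05 STO R2, R0

print(memory[64])  # prints 3: the sum now lives at main-memory address 64
```

Notice how much of the program is bookkeeping: only one line does arithmetic; the rest just shuttles values between registers and memory.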

Compilers

Assemblers were already far superior to hand-writing computer instructions, but early programmers still longed to write programs the way they were accustomed to: like mathematical formulas. This need drove the development of high-level compiled languages. Some are now a thing of the past; others are still in use today. ALGOL, for example, is history, but languages like Fortran and C continue to solve real problems.

The genealogy tree of the ALGOL and Fortran programming languages

These "high-level" languages let programmers write programs in a simpler way. In C, our addition program would look like this:

int x;
x = 1 + 2;

The first statement describes a piece of memory the program will use: a block the size of an integer, named x. The second statement is the addition, albeit written "backwards". A C programmer would read it as "x is assigned the value of 1 plus 2". Note that the programmer does not need to decide where in memory x will live; the compiler takes care of that.

A new kind of program, called a "compiler", converts programs written in a high-level language into assembly language, and an assembler then converts the assembly into a machine-readable program. This combination of programs is often called a "toolchain", because the output of one program becomes the input of the next.

The advantage of a compiled language over assembly shows up when migrating a program from one computer to another of a different model or brand. In the early days of computing, many companies, including IBM, DEC, Texas Instruments, UNIVAC, and Hewlett-Packard, built many different kinds of computer hardware. These computers had little in common beyond needing to be plugged into a power source. Their memory and CPU architectures differed considerably, and it often took people years to translate programs from one computer to another.

With high-level languages, we only need to port the compiler toolchain to the new platform. As long as a compiler is available, programs written in a high-level language can be recompiled on the new computer with at most minor modifications. The compilation of high-level languages was a truly revolutionary achievement.

The IBM PC XT, released in 1983, was an early example of falling hardware prices.

Life got much better for programmers. Expressing the problem they wanted to solve in a high-level language made things far easier. Thanks to advances in semiconductor technology and the invention of the integrated circuit, the price of computer hardware dropped dramatically. Computers became faster, more capable, and much cheaper. At some point (perhaps the late 1980s) things flipped, and programmers became more valuable than the hardware they used.

Interpreters

Over time, a new way of programming emerged. A special program called an "interpreter" can read a program and convert it into computer instructions for immediate execution. Like a compiler, an interpreter reads a program and converts it into an intermediate form. But unlike a compiler, the interpreter executes that intermediate form directly. An interpreted language goes through this process every time the program runs; a compiled program is compiled once, and after that the computer only has to execute the compiled machine instructions each time.

Incidentally, this is what makes interpreted programs feel slower. But modern computers are so fast that most people can't tell the difference between compiled and interpreted programs.
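Python's built-in compile() and exec() functions let us sketch this read-translate-execute cycle in miniature (an illustration of the idea, not how every interpreter is built internally):

```python
# The interpreter's job, in miniature: read source text,
# translate it into an intermediate form, and execute it immediately.
source = "x = 1 + 2"
code = compile(source, "<example>", "exec")  # source -> intermediate form
namespace = {}
exec(code, namespace)                        # execute the intermediate form now
print(namespace["x"])  # prints 3
```

Every time this script runs, the translation happens again, which is exactly the per-run cost described above.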

Interpreted programs (sometimes called "scripts") are even easier to port to different hardware platforms. Because a script contains no machine-specific instructions, the same version of the program can run on many different computers without any modification. Of course, the interpreter must first be ported to the new machine.

A very popular interpreted language is Perl. Our addition problem, expressed completely in Perl, would look like this:

$x = 1 + 2

Although this program looks similar to the C version and behaves the same, it lacks the statement that declares the variable. There are other differences (beyond the scope of this article), but notice how close we have come to writing a computer program the way a mathematician writes an expression with pen and paper.

Virtual machines

The newest way to run programs is the virtual machine (often abbreviated VM). There are two kinds: system virtual machines and process virtual machines. Both provide a level of abstraction over the "real" computing hardware, though their scopes differ. A system virtual machine is software that stands in for an entire physical machine, while a process virtual machine is designed to execute a program in a "system-independent" way. In that sense, a process virtual machine (the kind I mean from here on) is similar in scope to an interpreter: the program is first compiled into an intermediate form, and the virtual machine then executes that intermediate form.

The main difference between a virtual machine and an interpreter is that the virtual machine defines a virtual CPU with a virtual instruction set. With this layer of abstraction, we can write front-end tools that compile programs in different languages into a form the virtual machine accepts. Perhaps the most popular and best-known virtual machine is the Java Virtual Machine (JVM). In the 1990s the JVM supported only the Java language, but today it runs many popular programming languages, including Scala, Jython, JRuby, Clojure, and Kotlin, to name a few. There are other, less common examples I won't cover here. I also only recently learned that Python, my favorite language, is not purely an interpreted language: it is compiled to bytecode that runs on a virtual machine!
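You can see the Python virtual machine's instruction set for yourself with the standard dis module. The exact opcode names vary between Python versions, so treat the output as illustrative:

```python
import dis

def add():
    x = 1 + 2
    return x

# Disassemble the function into the bytecode the Python VM executes.
# The listing shows virtual instructions such as LOAD_CONST and
# STORE_FAST, the VM's counterparts to the MOV/ADD/STO of a real CPU.
dis.dis(add)
```

The resemblance to the assembly listing earlier in the article is no accident: a process VM is, in effect, a CPU implemented in software.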

Virtual machines continue the historical trend of letting programmers solve problems in domain-specific languages while needing to know less and less about any specific computing platform.

That's it

I hope you enjoyed this short article explaining how software works behind the scenes. Are there other topics you would like me to discuss next? Let me know in the comments.


via: https://opensource.com/article/19/5/primer-assemblers-compilers-interpreters

Author: Erik O'Shaughnessy | Topic selection: lujun9972 | Translator: chen-ni | Proofreader: wxy

This article was originally translated by LCTT and proudly presented by Linux China.
