Basic data storage structure of embedded C programs

One, the five major memory partitions

Memory is divided into five areas: the stack, the heap, the free storage area, the global/static storage area, and the constant storage area.

1. Stack area (stack): a last-in, first-out (LIFO) storage area for variables that the compiler allocates when they are needed and releases automatically when they are no longer needed. It usually holds local variables, function parameters, and so on.

2. Heap area (heap): memory blocks allocated with malloc and related functions. The compiler does not release them; the application releases them with free. If the programmer does not release a block, the operating system normally reclaims it after the program ends.

3. Free storage area: memory blocks allocated with new in C++. It is very similar to the heap, but each new is generally paired with a delete.

4. Global/static storage area: global variables and static variables are allocated in the same block of memory. In C, global variables are further divided into initialized and uninitialized ones; C++ makes no such distinction, and they all occupy the same memory area.

5. Constant storage area: a special storage area that holds constants, which must not be modified (although they can be modified by illegitimate means, and there are many ways to do so).

From another point of view, memory is mainly divided into the code segment, the data segment, the heap, and the stack. The code segment holds the program code and is read-only. The data segment stores global variables, static variables, constants, and so on. The heap holds memory obtained from malloc or new, other variables are stored on the stack, and the space between the heap and the stack is unallocated and can be used by either as they grow. The memory of the data segment is not released until the program finishes. To call a function, the program first finds the entry address of the function, allocates stack space for the formal parameters and temporary variables, copies the actual arguments into the formal parameters, and pushes them onto the stack; the stack is popped after the function returns. String constants are generally placed in the data segment, and identical string constants are usually stored only once.

Two, the storage areas of a C program

1. To turn C source code (text files) into an executable program (a binary file), the code goes through three stages: compilation, assembly, and linking. Compilation translates each C source file into assembly code, assembly translates the assembly code into binary machine code, and linking combines the machine code files generated from each source file into one file.

2. After a C program is compiled and linked, a single file is produced, which consists of several parts. Several other parts are created when the program runs. Each part represents a different storage area:

1) Code segment (Code or Text)

The code segment consists of the machine code executed by the program. In C, program statements are compiled into machine code. While the program runs, the CPU's program counter points to each machine instruction in the code segment in turn, and the processor executes them one after another.

2) Read-only data segment (RO data)

The read-only data segment holds data that the program uses but never changes. This data is used in a way similar to a look-up table. Since it never needs to change, it only needs to be placed in read-only memory.

3) Initialized read-write data segment (RW data)

Initialized data are variables that are declared in the program with an initial value. These variables occupy memory space. When the program runs, they must be located in a readable and writable memory area and must already hold their initial values so that the program can read and write them.

4) Uninitialized data segment (BSS)

Uninitialized data are variables that are declared in the program without an initial value. These variables do not need to occupy memory space before the program runs.

5) Heap (heap)

Heap memory only exists while the program is running and is generally allocated and released by the programmer. On a system with an operating system, the operating system can reclaim the memory after the program (for example, a process) ends if the program did not free it.

6) Stack (stack)

Stack memory only exists while the program is running. Variables used inside a function, function parameters, and return values use stack space, which is allocated and released automatically by compiler-generated code.

3. The code segment, read-only data segment, read-write data segment, and uninitialized data segment belong to the static area, while the heap and stack belong to the dynamic area. The code segment, read-only data segment, and read-write data segment are generated at link time, the uninitialized data segment is set up when the program is initialized, and the heap and stack are allocated and released while the program runs.

4. A C program exists in two states: image and runtime. The image produced by compiling and linking contains only the code segment (Text), the read-only data segment (RO Data), and the read-write data segment (RW Data). Before the program runs, the uninitialized data segment (BSS) is created dynamically, and the heap and stack areas are also created dynamically at run time.

1) Generally speaking, each part of a static image file is called a section, and each part at run time is called a segment. When the distinction does not matter, both are referred to as segments.

2) After a C program is compiled and linked, a code segment (Text), a read-only data segment (RO Data), and a read-write data segment (RW Data) are generated. At run time, in addition to these three areas, there are also the uninitialized data segment (BSS), the heap, and the stack.
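As a rough illustration of the storage areas listed above, the following sketch places one object in each of them. The names (ro_table, rw_buffer, bss_buffer, bss_is_zeroed, copy_via_heap) are made up for this example, and the segment named in each comment is where a typical toolchain puts the object; the exact placement depends on the compiler and linker script.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical objects, one per static storage area. */
const char ro_table[] = "lookup";  /* read-only data segment (RO Data)        */
char rw_buffer[16] = "abc";        /* initialized read-write segment (RW Data) */
char bss_buffer[64];               /* uninitialized segment (BSS)              */

/* Returns 1 if every byte of bss_buffer is zero. C guarantees that static
   objects without an initializer start out as zero. */
int bss_is_zeroed(void)
{
    for (size_t i = 0; i < sizeof bss_buffer; i++) {
        if (bss_buffer[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Uses the stack (local array) and the heap (malloc) to copy ro_table.
   Returns the copied length, or -1 if the heap allocation fails. */
int copy_via_heap(void)
{
    char local[16];                     /* stack: exists only during the call */
    char *heap = malloc(sizeof local);  /* heap: exists until free()          */
    if (heap == NULL) {
        return -1;
    }
    strcpy(heap, ro_table);
    strcpy(local, heap);
    free(heap);
    return (int)strlen(local);
}
```

With GCC-style toolchains, the size utility or the linker map file reports how many bytes ended up in text, data, and bss, which is a convenient way to check these placements on a real target.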

Three, the segments of a C program

1. Classification of segments

The object code generated by each source program will contain all the information and functions that the source program needs to express. The generation of each segment in the object code is as follows:

1) Code segment (Code)

The code segment is generated from the functions in the program. Every statement of a function is ultimately compiled and assembled into binary machine code.

2) Read-only data segment (RO Data)

The read-only data segment is generated from the data used in the program. This data does not need to change while the program runs, so the compiler places it in the read-only part. Certain C constructs generate read-only data.

2. Read-only data segment (RO Data)

The following cases generate read-only data.

1) Read-only global variables

Defining the global variable const char a[100]="abcdefg" generates a read-only data area of 100 bytes and initializes it with the string "abcdefg". If it is defined as const char a[]="abcdefg" with no size specified, an 8-byte read-only data area is generated according to the length of the "abcdefg" string (7 characters plus the terminator).

2) Read-only local variables

For example, the variable const char b[100]="9876543210" defined inside a function. It is initialized in the same way as a global variable. (On typical toolchains a non-static local array lives on the stack and only its initializer is stored in read-only data; declare it static const to keep the array itself in the read-only data segment.)

3) Constants used in the program

For example, printf("information\n") contains a string constant, and the compiler automatically places the constant "information\n" in the read-only data area.

Note: const char a[100]={"ABCDEFG"} defines a 100-byte data area, but only the first 8 bytes (7 characters and the '\0' terminator) are initialized explicitly. The remaining bytes are filled with zeros, but they cannot be written by the program and are effectively wasted. Therefore, data in the read-only data segment should generally be fully initialized.
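The size difference between the two definitions can be checked with sizeof. This is a small sketch; the names a_fixed, a_sized, and the three helper functions are invented for the example.

```c
#include <stddef.h>

/* Hypothetical read-only arrays matching the two forms discussed above. */
const char a_fixed[100] = "ABCDEFG";  /* 100 bytes; the unused tail is zero */
const char a_sized[]    = "abcdefg";  /* 8 bytes: 7 characters plus '\0'    */

size_t fixed_size(void) { return sizeof a_fixed; }
size_t sized_size(void) { return sizeof a_sized; }

/* Counts the zero bytes in a_fixed: the terminator plus the padding that
   the program can never put to use. */
size_t fixed_zero_bytes(void)
{
    size_t zeros = 0;
    for (size_t i = 0; i < sizeof a_fixed; i++) {
        if (a_fixed[i] == '\0') {
            zeros++;
        }
    }
    return zeros;
}
```

Here fixed_size() is 100 and sized_size() is 8, and 93 of the 100 bytes of a_fixed are zeros. On a small microcontroller that is read-only memory spent on nothing, which is why leaving the size out is usually the better choice for constant strings.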

3. Read and write data segment (RW Data)

The read-write data segment is the part of the data area in the object file that can be both read and written; it is sometimes also called the initialized data segment. Like the code segment and the read-only data segment, it belongs to the static area of the program, but it is writable.

1) Initialized global variables

For example, outside any function, define the global variable char a[100]="abcdefg"

2) Initialized local static variables

For example, define static char b[100]="9876543210" inside a function. Data and arrays that are defined with static and initialized inside a function are compiled into the read-write data segment.

Note:

The characteristic of the read-write data area is that its variables must be initialized in the program. If a variable is only defined and given no initial value, it does not go into the read-write data area but into the uninitialized data area (BSS). If a global variable (a variable defined outside any function) is given the static modifier, as in static char a[100], it can only be used inside its own file and cannot be used by other files.

4. Uninitialized Data Segment (BSS)

The uninitialized data segment is often called BSS (short for Block Started by Symbol). Like the read-write data segment, it belongs to the static data area, but the data in this segment has no initial value. It is therefore only recorded in the object file and does not actually occupy a segment there; the segment is created at run time. Because the uninitialized data segment is only created during the initialization phase at run time, its size does not affect the size of the object file.

5. Issues to watch when using variables in a C program:

1. Variables defined inside a function body are usually on the stack. The program does not need to manage them; the compiler handles them.

2. Memory allocated by functions such as malloc, calloc, and realloc is on the heap. The program must make sure it is freed after use, otherwise a memory leak occurs.

3. Global variables defined outside all function bodies, and variables with the static modifier whether inside or outside a function, are stored in the global area (static area).

4. Variables defined with const (global and static ones in particular) are placed in the read-only data area of the program.

Note:

C allows static variables to be defined. A static variable defined inside a function body is visible only inside that function. A static variable defined outside all function bodies is visible only in its own file and cannot be used in other source files. Global variables without the static modifier can be used in other source files. These distinctions are enforced at compile and link time: if a variable is not used as required, the compiler or linker reports an error. Both static and non-static global variables are placed in the global (static) area of the program.
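The lifetime and visibility rules for static variables can be seen in a short sketch. The names file_private_count, next_id, and calls_so_far are hypothetical and exist only for this illustration.

```c
/* File-scope static: lives in the static area and is visible only inside
   this source file (internal linkage). */
static int file_private_count = 0;

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int id = 0;  /* static local: initialized once, keeps its value */
    int scratch = 0;    /* automatic local: a fresh stack copy every call  */

    scratch++;          /* always 1 here, because scratch starts over      */
    id += scratch;      /* accumulates across calls                        */
    file_private_count++;
    return id;
}

/* Accessor, because other files cannot name file_private_count directly. */
int calls_so_far(void)
{
    return file_private_count;
}
```

Three calls to next_id() return 1, 2, and 3, even though scratch is reset each time. Another source file that declares extern int file_private_count; would fail at link time, which is the visibility restriction described above.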

Four, the use of segments in a program

The global area (static area) in C language actually corresponds to the following segments:

Read-only data segment: RO Data

Read and write data segment: RW Data

Uninitialized data segment: BSS Data

Generally speaking, a global variable that is simply defined without an initial value is in the uninitialized data area (BSS). If the variable is initialized, it is in the initialized data area (RW Data), and a variable with the const modifier is placed in the read-only area (RO Data).

For example:

const char ro[] = "this is a readonlydata"; // Read-only data segment: ro is stored in the read-only data segment, and the contents of the ro array cannot be changed.

char rw1[] = "this is global readwrite data"; // Initialized read-write data segment: the contents of the array rw1 can be changed. The string is copied into rw1 as its value; rw1 is not given the address of the literal "this is global readwrite data". The literal itself cannot be changed, because literal constants are placed in the read-only data segment.

char bss_1[100]; // Uninitialized data segment

const char *ptrconst = "constant data"; // "constant data" is placed in the read-only data segment. ptrconst points to the address where "constant data" is stored, so the content it points to cannot be changed. The value of ptrconst itself (the address it holds) can be changed, because ptrconst is stored in the read-write data segment.

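Example explanation:

#include <stdlib.h>
#include <string.h>

int main( ){
    short b; // b is on the stack and occupies 2 bytes
    char a[100]; // 100 bytes are reserved on the stack; the value of a is the address of the first byte
    char s[] = "abcde"; // s is on the stack and occupies 6 bytes; the literal "abcde" itself is in read-only data storage and occupies 6 bytes. s is an address constant whose value cannot be changed, so s++ is an error.
    char *p1; // p1 is on the stack and occupies 4 bytes (on a 32-bit target)
    char *p2 = "123456"; // "123456" is in read-only data storage and occupies 7 bytes. p2 is on the stack. The content p2 points to cannot be changed, but the address held in p2 can, so p2++ is allowed.
    static char bss_2[100]; // local static, uninitialized data segment
    static int c = 0; // local static, initialized data area
    p1 = (char *)malloc(10 * sizeof(char)); // the allocated memory is on the heap
    strcpy(p1, "xxx"); // "xxx" is in read-only data storage and occupies 4 bytes
    free(p1); // release the memory that p1 points to
    return 0;
}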

Notes:

1. The read-only data segment contains the const data defined in the program (such as const char ro[]) as well as literal data used in the program, such as "123456". For both const char ro[] and const char *ptrconst, the memory involved is located in the read-only data area, and its contents must not be modified. The difference is that the value of ro cannot be modified in the program, whereas the value of ptrconst itself can. Rewriting the latter in the following form also prevents the value of ptrconst itself from being modified:

const char * const ptrconst = "const data";

2. The read-write data segment includes initialized global variables such as rw1[] and initialized local static variables (for example, a static char rw2[] defined inside a function). The difference between them is whether the name is visible only inside the function or throughout the file at compile time. For rw1, adding static controls whether the variable can be accessed from other files of the program: with the static modifier, rw1 cannot be used in other C source files. This affects only compiling and linking; with or without static, rw1 is placed in the read-write data segment. rw2 is a local static variable and is placed in the read-write data area; without the static modifier its meaning changes completely, and it becomes a local variable in stack space rather than a static variable.

3. Uninitialized data segment: bss_1[100] and bss_2[100] in the examples above are in the uninitialized data segment. The difference is that the former is a global variable that can be used in all files, while the latter is a local static variable that is only used inside the function. Because no initializer follows the definition, the size of the area must be given explicitly, and the compiler increases the length of the BSS accordingly.

4. The stack space holds the variables used inside the function, such as short b and char a[100], and the value of the variable p1 in char *p1.

1) The memory that p1 points to is on the heap. Heap memory can only be used while the program is running, but a heap block (such as the memory p1 points to) can be handed to other functions, for example as a return value, for further processing.

2) The stack space is mainly used for the storage of the following three types of data:

a. Dynamic variables inside the function

b. The parameters of the function

c. The return value of the function

3) The main use of stack space is for the automatic variables inside a function. Their space is reserved when the function starts and is reclaimed automatically when the function exits. Look at an example:

#include <stdio.h>

int main( ){
    char *p = "tiger";
    p[1] = 'I';
    p++;
    printf("%s\n", p);
}

The program compiles, but running it typically reports: Segmentation fault

Analysis:

char *p = "tiger"; The system reserves space on the stack (4 bytes on a 32-bit target) to store the value of p. "tiger" is stored in the read-only storage area, so the content of "tiger" cannot be changed. *p = "tiger" is an address assignment, so p points into the read-only storage area, and changing the content p points to causes a segmentation fault. But because p itself is stored on the stack, the value of p can be changed, so p++ is correct.
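One way to avoid the fault is to copy the literal into writable memory before changing it. The sketch below assumes a caller-supplied buffer; the function name make_modified_copy is invented for the example.

```c
#include <string.h>

/* Copies the literal "tiger" into the caller's writable buffer and changes
   the copy instead of the literal. Returns 0 on success, or -1 if the
   buffer is missing or too small. */
int make_modified_copy(char *out, size_t out_size)
{
    const char *literal = "tiger";  /* points into read-only data */

    if (out == NULL || out_size < strlen(literal) + 1) {
        return -1;
    }
    strcpy(out, literal);  /* the copy lives in the caller's memory */
    out[1] = 'I';          /* safe: writes to the copy              */
    return 0;
}
```

Called with a char buf[8] on the stack, the function leaves "tIger" in buf. Declaring the pointer as const char * also lets the compiler reject an accidental write through it at compile time instead of failing at run time.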

Five, the use of const

1. Introduction:

const is a C keyword that qualifies a variable so that it is not allowed to be changed. Using const improves the robustness of a program to a certain extent. In addition, when reading other people's code, a clear understanding of the role const plays helps in understanding their programs.

2. const variables and constants

1) A variable qualified with const: its value is stored in the read-only data segment and cannot be changed. It is called a read-only variable.

Its form is const int a=5; afterwards a can be used in place of 5.

2) Constant: it is also stored in the read-only data segment, and its value cannot be changed. Examples are "abc" and 5.

3. const variables and what const restricts. Let's look at an example first:

#include <stdio.h>

typedef char* pStr;

int main( ){
    char string[6] = "tiger";
    const char *p1 = string;
    const pStr p2 = string;
    p1++;
    p2++;
    printf("p1=%s\np2=%s\n", p1, p2);
}

Compiling the program produces the error message

error: increment of read-only variable ‘p2’

1) The basic form of const is: const char m;

// m is immutable

2) Replace m in form 1 with *pm: const char *pm;

// *pm is immutable, but pm itself is variable, so p1++ is correct.

3) Replace char in form 1 with a new type: const newType m;

// m is immutable. pStr in the example is a new type, so p2 is immutable and p2++ is an error.
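The two declarations restrict different things, which a compilable sketch makes concrete. The typedef is the one from the example; the function names upcase_first and second_char are made up, and second_char assumes its argument holds at least one character.

```c
#include <ctype.h>

typedef char *pStr;  /* same typedef as in the example above */

/* const pStr means char *const: the pointer is fixed, the target writable. */
void upcase_first(char *text)
{
    const pStr p2 = text;
    /* p2++;  would not compile: p2 itself is read-only */
    p2[0] = (char)toupper((unsigned char)p2[0]);  /* allowed: writes the target */
}

/* const char * means the target is read-only, the pointer can move.
   Assumes text holds at least one character before the terminator. */
char second_char(const char *text)
{
    const char *p1 = text;
    p1++;             /* allowed: p1 itself is variable   */
    /* *p1 = 'x';  would not compile: *p1 is read-only */
    return *p1;
}
```

For a writable array holding "tiger", upcase_first changes it to "Tiger", and second_char returns 'i'. The commented-out lines are the ones each declaration forbids.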

4. const and pointers

In a type declaration, const marks something as constant. It can be written in the following two ways:

1) const in front

const int nValue; // nValue is const

const char *pContent; // *pContent is const, pContent is variable

const (char *) pContent; // pContent is const, *pContent is variable

char *const pContent; // pContent is const, *pContent is variable

const char * const pContent; // pContent and *pContent are both const

2) const after the type, equivalent to the declarations above

int const nValue; // nValue is const

char const *pContent; // *pContent is const, pContent is variable

(char *) const pContent; // pContent is const, *pContent is variable

char* const pContent; // pContent is const, *pContent is variable

char const* const pContent; // pContent and *pContent are both const

Explanation: Using const together with pointers is a very common source of confusion in C. The following two rules help:

1) Draw a line through the * sign. If const is to the left of *, const qualifies the object the pointer points to, that is, the pointer points to a constant; if const is to the right of *, const qualifies the pointer itself, that is, the pointer itself is constant. Reading the declarations above with this rule makes their meaning clear at a glance.

2) For const (char *): because char * is treated as a whole, it acts like a single type (such as char), so const makes the pointer itself const. The parenthesized form is only a notation here, not valid C syntax; in real code the same effect is obtained with a typedef such as pStr.
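The three pointer forms map naturally onto function parameters. The following sketch uses hypothetical helpers (count_char, fill, first_char), one for each placement of const.

```c
#include <stddef.h>

/* const left of *: the characters are read-only, the pointer may move. */
size_t count_char(const char *text, char wanted)
{
    size_t n = 0;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == wanted) {
            n++;
        }
    }
    return n;
}

/* const right of *: the pointer is fixed, the characters are writable. */
void fill(char *const buffer, size_t size, char value)
{
    for (size_t i = 0; i < size; i++) {
        buffer[i] = value;  /* allowed; buffer++ would not compile */
    }
}

/* const on both sides: neither the pointer nor the characters can change. */
char first_char(const char *const text)
{
    return text[0];
}
```

count_char("banana", 'a') returns 3. The first form is the most common one in interfaces, since it promises the caller that the function will not modify the data passed in.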

Six, data, idata, xdata, pdata, code

In terms of data storage types, the 8051 series has on-chip and off-chip program memory and on-chip and off-chip data memory. The on-chip data memory is further divided into a directly addressed area and an indirectly addressed area. These correspond to the code, data, xdata, and idata types, plus the pdata type defined for the characteristics of the 51 series. Using different memories gives different execution efficiency. When writing a C51 program, it is best to specify the storage type of each variable, which helps improve execution efficiency (this is described in more detail later). Unlike ANSI C, C51 also has the SMALL, COMPACT, and LARGE memory models. The models correspond to different hardware configurations and produce different compilation results.

The difference between data, idata, xdata, and pdata in the 51 series:

data: refers to the first 128 bytes of RAM, 0x00-0x7f, which can be read and written directly with the accumulator (ACC). It is the fastest and generates the smallest code.

idata: refers to the first 256 bytes of RAM, 0x00-0xff. The first 128 bytes are exactly the same as those of data; only the access method differs. idata is accessed in a way similar to pointers in C. The assembly statement is: mov ACC, @Rx. (A minor addition: idata works well with pointer access in C.)

xdata: external extended RAM, generally the external 0x0000-0xffff space, accessed through DPTR.

pdata: the lower 256 bytes of external extended RAM. The address appears on A0-A7, and movx ACC, @Rx is used to read and write it. This type is rather special, and C51 seems to have a bug with it, so it is recommended to use it sparingly. It also has its advantages, but the specific usage is an intermediate topic and is not covered here.

What is the function of code in unsigned char code table[] in C for a single-chip microcomputer?

code tells the compiler that the data being defined should be placed in ROM (the program storage area) and cannot be changed after it is written. It corresponds to MOVC addressing in assembly. Because C has no other way to state whether data is stored in ROM or RAM, this keyword is added to take the place of the assembly instruction; correspondingly, data means the variable is stored in RAM.

A program can be divided simply into a code (program) area and a data area. The code area cannot be changed at run time. The data area holds global variables and temporary variables, which change constantly. The CPU reads instructions from the code area and operates on the data in the data area, so it does not matter what medium the code area is stored on. Just as early computer programs were stored on cards, the code area can be placed in ROM, in RAM, or in flash (running from flash is much slower, mainly because reading flash takes longer than reading RAM), so a common approach is to store the program in flash and load it into RAM to run. The data area has no such choice: it must be placed in RAM, because data in ROM cannot be changed.

How is bdata used?

If a program needs 8 or more bit variables, assigning values to all 8 at once is inconvenient, and a bit array cannot be defined. There is only one method:

char bdata MODE;

sbit MODE_7 = MODE^7;

sbit MODE_6 = MODE^6;

sbit MODE_5 = MODE^5;

sbit MODE_4 = MODE^4;

sbit MODE_3 = MODE^3;

sbit MODE_2 = MODE^2;

sbit MODE_1 = MODE^1;

sbit MODE_0 = MODE^0;

This defines the 8 bit variables MODE_n.

These are definition statements that use a special data type of Keil C. Remember that it must be sbit.

You cannot write bit MODE_0 = MODE^0;

Written like that, as an assignment statement in C, ^ is treated as an XOR operation.
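bdata and sbit are Keil C51 extensions and do not compile with a standard C compiler. As a portable analogue (not the Keil mechanism itself), the same idea of one byte that is addressable both as a whole and bit by bit can be sketched with masks. All names here (mode, MODE_BIT, mode_load, and so on) are hypothetical.

```c
#include <stdint.h>

/* One byte that plays the role of MODE in the bdata example. */
static uint8_t mode;

/* Mask for bit n (0..7), the portable stand-in for MODE^n. */
#define MODE_BIT(n) ((uint8_t)(1u << (n)))

/* Assign all 8 flags at once, as with MODE = value. */
void mode_load(uint8_t all) { mode = all; }

/* Read all 8 flags at once. */
uint8_t mode_value(void) { return mode; }

/* Set, clear, and test a single flag, as with MODE_n = 1, MODE_n = 0,
   and if (MODE_n). */
void mode_set(unsigned n)   { mode = (uint8_t)(mode | MODE_BIT(n)); }
void mode_clear(unsigned n) { mode = (uint8_t)(mode & ~MODE_BIT(n)); }
int  mode_test(unsigned n)  { return (mode & MODE_BIT(n)) != 0; }
```

After mode_load(0x81), bits 7 and 0 are set; mode_set(3) followed by mode_clear(7) leaves 0x09. On an 8051 the bdata version is cheaper, because the hardware has dedicated bit instructions, while this version compiles to ordinary AND and OR operations.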

Compared with the RAM in the microcontroller, flash is an external device as far as access is concerned. Although it is physically located inside the microcontroller, it sits outside the internal RAM, just as xdata is placed outside the internal RAM.

The variable int a is defined in internal RAM, xdata int a is defined in external RAM or flash, and uchar code a is defined in flash.

uchar code duma[]={0x3f, 0x06, 0x5b, 0x4f, 0x66, 0x6d, 0x7d, 0x07, 0x7f, 0x6f, 0x40, 0x00}; // Segment codes for a common-cathode digital tube: the values to be output on port P2.

If uchar aa[5] is defined, the contents of aa are stored in the data storage area (RAM), and the value of each array element can be modified while the program is running. After power-off, the data in aa is lost.

If uchar code bb[5] is defined, the contents of bb are stored in the program storage area (such as flash). The value of each element of bb can only be changed when the chip is programmed and cannot be modified while the program is running, and the data in bb does not disappear after power-off.
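In standard C, const plays the role that code plays in C51: on most embedded toolchains a static const table is placed in flash or ROM, though the placement is up to the linker. The sketch below reuses the segment values from the duma table above; the names seg_table, SEG_BLANK_INDEX, and digit_to_segments are invented, and treating the last entry (0x00) as a blank display is an assumption of this example.

```c
#include <stdint.h>

/* Segment codes copied from the duma table above. Entries 0..9 are the
   digits; the final 0x00 entry is assumed here to mean a blank display. */
static const uint8_t seg_table[] = {
    0x3f, 0x06, 0x5b, 0x4f, 0x66, 0x6d, 0x7d, 0x07, 0x7f, 0x6f, 0x40, 0x00
};

#define SEG_BLANK_INDEX 11u

/* Looks up the segment pattern for a decimal digit. Out-of-range input
   maps to the blank pattern instead of reading past the table. */
uint8_t digit_to_segments(unsigned digit)
{
    if (digit > 9u) {
        return seg_table[SEG_BLANK_INDEX];
    }
    return seg_table[digit];
}
```

On the target, the returned byte would be written to the port that drives the display. The range check matters more on a microcontroller than on a desktop system, because an out-of-bounds read from flash returns unrelated data without any fault.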

Seven, the difference between heap and stack in C

After a C program is compiled and linked, a binary image file is formed. At run time the program consists of the stack, the heap, the data segment (made up of three parts: the read-only data segment, the initialized read-write data segment, and the uninitialized data segment, BSS), and the code segment, as shown in the following figure:

[figure not reproduced]

1. Stack area (stack): allocated and released automatically by the compiler. It stores function parameter values, local variables, and similar values, and it operates like the stack data structure.

2. Heap area (heap): generally allocated and released by the programmer. If the programmer does not release it, a memory leak can result. Note that this heap is not the heap data structure; its allocation scheme resembles a linked list.

3. Program code area: store the binary code of the function body.

4. Data segment: consists of three parts:

1) Read-only data segment:

The read-only data segment holds data that the program uses but never changes. This data is used in a way similar to a look-up table. Since it never needs to change, it only needs to be placed in read-only memory. Variables qualified with const and literal constants used in the program are generally stored in the read-only data segment.

2) Initialized read-write data segment:

Initialized data are variables that are declared in the program with an initial value. These variables occupy memory space. When the program runs, they must be located in a readable and writable memory area and must already hold their initial values so that the program can read and write them. In a program these are generally initialized global variables and initialized static local variables (variables declared static with an initializer).

3) Uninitialized data segment (BSS):

Uninitialized data are variables that are declared in the program without an initial value. These variables do not need to occupy memory space before the program runs. Like the read-write data segment, the BSS belongs to the static data area, but its data has no initial value. The uninitialized data segment is only created during the initialization phase at run time, so its size does not affect the size of the object file. In a program these are generally uninitialized global variables and uninitialized static local variables.

The difference between heap and stack

1. How to apply

(1) Stack (stack): allocated automatically by the system. For example, when a local variable int b; is declared in a function, the system automatically reserves space for b on the stack.

(2) Heap (heap): the programmer must request it (by calling malloc, realloc, or calloc), specify the size, and release it. Memory leaks occur easily.

e.g.: char *p;

p = (char *)malloc(sizeof(char)); // However, p itself is on the stack.
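A slightly fuller sketch of the request-and-release pattern follows. The function name copy_string is hypothetical; the point is that the heap block outlives the function that allocated it, so the caller takes over the responsibility to free it.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of text, or NULL if the allocation fails.
   The caller owns the returned block and must pass it to free(). */
char *copy_string(const char *text)
{
    size_t size = strlen(text) + 1;  /* characters plus the terminator */
    char *copy = malloc(size);       /* the pointer variable is on the stack,
                                        the block it points to is on the heap */
    if (copy == NULL) {
        return NULL;                 /* always check: malloc can fail */
    }
    memcpy(copy, text, size);
    return copy;
}
```

A typical call is char *s = copy_string("abc"); followed, after use, by free(s);. Forgetting the free is exactly the leak described above, and on a microcontroller with a few kilobytes of RAM even a small leak in a loop exhausts the heap quickly.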

2. Application size limit

1) Stack: under Windows, the stack is a data structure that grows toward lower addresses and is a contiguous memory area (its growth direction is opposite to the direction of increasing memory addresses). The stack size is fixed. If the requested space exceeds the remaining stack space, a stack overflow is reported.

2) Heap: the heap is a data structure that grows toward higher addresses (the same direction as increasing memory addresses) and is a non-contiguous memory area. This is because the system uses a linked list to record free memory addresses, which are naturally non-contiguous, and the list is traversed from low addresses to high addresses. The size of the heap is limited by the virtual memory available on the computer system.

3. System response:

1) Stack: as long as the remaining stack space is larger than the requested space, the system provides memory to the program; otherwise an exception is reported indicating stack overflow.

2) Heap: first of all, the operating system keeps a linked list that records free memory addresses. When the system receives a request from a program, it traverses the list to find the first heap node whose space is larger than the requested space, removes that node from the free list, and allocates the node's space to the program. In addition, on most systems the size of the allocation is recorded at the start of the memory block, so that the free statement in the code can release the memory correctly. The size of the node found may not exactly equal the requested size, in which case the system automatically puts the excess part back into the free list.
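The first-fit search, the size header, and the splitting of an oversized node can be sketched in a toy allocator over a fixed array. This is only an illustration under simplifying assumptions, not how any particular C library implements malloc: the names toy_alloc and toy_free, the arena size, and the block layout are all invented, and freed blocks are merged only with the blocks that follow them.

```c
#include <stddef.h>

#define ARENA_UNITS 64u  /* arena size, counted in header-sized units */

/* Each block starts with a header; the union forces payload alignment. */
typedef union block {
    struct {
        union block *next;  /* next block in address order          */
        size_t units;       /* payload size in units, header not counted */
        int free;           /* 1 if the block is available          */
    } info;
    max_align_t align;
} block_t;

static block_t arena[ARENA_UNITS];
static block_t *head = NULL;

/* First fit: walk the list and take the first free block that is big
   enough, splitting off the unused tail as a new free block. */
void *toy_alloc(size_t bytes)
{
    if (head == NULL) {  /* first call: one big free block */
        head = &arena[0];
        head->info.next = NULL;
        head->info.units = ARENA_UNITS - 1u;
        head->info.free = 1;
    }
    if (bytes == 0 || bytes > sizeof arena) {
        return NULL;
    }
    size_t units = (bytes + sizeof(block_t) - 1u) / sizeof(block_t);
    for (block_t *b = head; b != NULL; b = b->info.next) {
        if (!b->info.free || b->info.units < units) {
            continue;
        }
        if (b->info.units > units + 1u) {  /* room for header + payload */
            block_t *rest = b + 1 + units;
            rest->info.next = b->info.next;
            rest->info.units = b->info.units - units - 1u;
            rest->info.free = 1;
            b->info.next = rest;
            b->info.units = units;
        }
        b->info.free = 0;
        return b + 1;  /* payload starts right after the header */
    }
    return NULL;  /* no free block is large enough */
}

/* The header just before the payload tells free how large the block is. */
void toy_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    block_t *b = (block_t *)ptr - 1;
    b->info.free = 1;
    while (b->info.next != NULL && b->info.next->info.free) {
        b->info.units += b->info.next->info.units + 1u;  /* absorb next */
        b->info.next = b->info.next->info.next;
    }
}
```

Because toy_free does not merge with a free block in front of it, a sequence of allocations and frees can leave several small free blocks side by side, which is a miniature version of the fragmentation problem that affects real heaps.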

Explanation: on the heap, frequent allocation and release (malloc/free, or new/delete in C++) inevitably breaks up the memory space, producing a large number of fragments and reducing program efficiency. The stack does not have this problem.

4. Application efficiency

1) The stack is allocated automatically by the system, which is fast, but the programmer cannot control it.

2) The heap is memory allocated with malloc. It is generally slower and prone to fragmentation, but it is the most flexible to use.

5. Storage content in the heap and stack

1) Stack: when a function is called, the first thing pushed onto the stack is the address of the next statement in the calling function (for example, main), followed by the function's parameters, which are pushed from right to left, and then the function's local variables. Note: static variables are not pushed onto the stack.

When the function call ends, the local variables are popped first, then the parameters, and finally the top of the stack points to the address that was stored first, which is the next instruction in the calling function. The program continues executing from that point.

2) Heap: generally, a small header at the start of each heap block records the size of the block.

6. Access efficiency

1) Pointer to a literal: char *s1="hellowtigerjibo"; the string is determined at compile time.

2) Array on the stack: char s1[]="hellowtigerjibo"; the array is filled in at run time. Accessing the array is faster than going through the pointer: in the underlying assembly the pointer value must first be loaded into a register (for example, edx on x86) before the character can be read, while the array is read directly from the stack.

Supplement:

The stack is a data structure provided by the machine itself. The computer supports the stack at the lowest level: a dedicated register holds the stack address, and there are dedicated instructions for pushing and popping, which makes the stack relatively efficient. The heap is provided by the C/C++ function library, and its mechanism is quite complicated. For example, to allocate a block of memory, the library function searches the heap for a free block of sufficient size. If there is not enough space (perhaps because of too much fragmentation), it may call a system function to enlarge the program's data segment, so that there is a chance to obtain enough memory, and then return. Clearly, the heap is much less efficient than the stack.

7. Distribution method:

1) The heap is always allocated dynamically; there is no statically allocated heap.

2) The stack has two allocation methods: static allocation and dynamic allocation. Static allocation is done by the compiler, as with the allocation of local variables. Dynamic allocation is done with the alloca function, but dynamic allocation on the stack differs from the heap: the memory is released automatically when the function returns, without any manual release.

Origin blog.csdn.net/weixin_41114301/article/details/132287404