The difference between file IO and standard IO

First, let's understand what file I/O and standard I/O are:

File I/O: File I/O is also called unbuffered I/O. Unbuffered here means that there is no user-space buffer: every read and write invokes a system call into the kernel (the kernel may still cache the data itself). It is generally referred to as low-level I/O: the basic I/O service provided by the operating system, tied to the OS and specific to Linux and other Unix-like platforms.

Standard I/O: Standard I/O is the I/O model defined by ANSI C. It is provided by the standard library and declared in the stdio.h header, so it is reasonably portable. The standard I/O library handles many details for you, such as allocating buffers and performing I/O in optimally sized blocks. Standard I/O provides three types of buffering:

(1) Fully buffered: The actual I/O operation is performed only when the standard I/O buffer is full.
(2) Line buffered: The standard I/O library performs the actual I/O when a newline character is encountered in input or output.
(3) Unbuffered: The I/O is performed immediately, without buffering. stderr is typically unbuffered, so that error messages appear as soon as possible.
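The buffering mode of a stream can be chosen with the standard setvbuf function. The following is a minimal sketch; the helper name set_buffering is made up for illustration, and setvbuf has to be called before any other operation is performed on the stream.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: switch a stream to the requested buffering mode.
 * mode is one of _IOFBF (fully buffered), _IOLBF (line buffered)
 * or _IONBF (unbuffered).
 * Returns 0 on success and nonzero on failure, like setvbuf itself. */
int set_buffering(FILE *fp, int mode)
{
    assert(fp != NULL);
    /* A NULL buffer asks the library to allocate one itself;
     * BUFSIZ is only a suggested size. */
    return setvbuf(fp, NULL, mode, BUFSIZ);
}
```

For example, set_buffering(stdout, _IONBF) called at the very start of a program would make every printf reach the terminal immediately, at the cost of more system calls.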

Second, the difference between the two

File I/O, also known as low-level I/O, follows the POSIX standards and is supported on any POSIX-compliant operating system. Standard I/O, also called high-level I/O, follows the ANSI C standard and can be used wherever a standard I/O library is available in the development environment. (Linux uses glibc, a superset of the standard C library that contains not only the functions defined in ANSI C but also those defined in the POSIX standard. Therefore, both standard I/O and file I/O are available on Linux.)

When reading and writing files through file I/O, each operation executes the corresponding system call. The advantage of this approach is that the actual file is read and written directly; the disadvantage is that frequent system calls increase system overhead. Standard I/O can be regarded as a buffering mechanism layered on top of file I/O: data is read from and written to the buffer first, and the actual file is accessed only when necessary, which reduces the number of system calls.
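The effect of the buffer can be observed by asking the kernel for the file size before and after a flush. This is only a sketch: the helper names are invented, fileno and fstat are POSIX rather than ANSI C, and how much data is still held in the buffer before the flush depends on the implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file behind fp as the kernel sees it,
 * or -1 on error. */
long kernel_size(FILE *fp)
{
    struct stat st;
    assert(fp != NULL);
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Writes 5 bytes through standard I/O and returns the file size seen by
 * the kernel after fflush(). *before receives the size seen before the
 * flush; it is typically 0 because the bytes are still sitting in the
 * user-space buffer. Returns -1 on error. */
long write_then_flush(FILE *fp, long *before)
{
    if (fputs("hello", fp) == EOF)
        return -1;
    *before = kernel_size(fp);
    if (fflush(fp) != 0)        /* hands the buffered bytes to the kernel */
        return -1;
    return kernel_size(fp);
}
```

With file I/O there is nothing to flush: a successful write has already handed the bytes to the kernel when it returns.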

In file I/O, an open file is represented by a file descriptor, which can refer to different types of files such as regular files, device files, and pipes. In standard I/O, an open file is represented by a FILE stream, which is most often used to access regular files.
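On POSIX systems a stream is a wrapper around a descriptor, and fdopen can attach a stream to a descriptor that is already open, including one that is not a regular file. The sketch below sends a short message through a pipe with write and reads it back with fgets; the function name pipe_roundtrip is made up, and the message is assumed to be small enough to fit in the pipe's buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write msg into a pipe with write(), then read one
 * line back through a FILE stream wrapped around the reading end.
 * Returns 0 on success, -1 on failure. */
int pipe_roundtrip(const char *msg, char *out, int outsize)
{
    int fds[2];
    assert(msg != NULL && out != NULL && outsize > 0);
    if (pipe(fds) != 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = write(fds[1], msg, len);
    close(fds[1]);                  /* the reader will now see end of file */
    if (n != (ssize_t)len) {
        close(fds[0]);
        return -1;
    }

    FILE *in = fdopen(fds[0], "r"); /* wrap the descriptor in a stream */
    if (in == NULL) {
        close(fds[0]);
        return -1;
    }
    char *got = fgets(out, outsize, in);
    fclose(in);                     /* also closes fds[0] */
    return got != NULL ? 0 : -1;
}
```

The reverse direction exists as well: fileno(fp) returns the descriptor underlying a stream.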

3. Next, look at the functions they use

 

Operation   Standard I/O                                File I/O (low-level I/O)
Open        fopen, freopen, fdopen                      open
Close       fclose                                      close
Read        getc, fgetc, getchar, fgets, gets, fread    read
Write       putc, fputc, putchar, fputs, puts, fwrite   write

1. fopen and open

Standard I/O uses the fopen function to open a file:

FILE *fopen(const char *path, const char *mode);

Here path is the file name and mode is a string specifying how the file is opened, such as "r", "w", "w+", or "a". You can append the letter b to open the file in binary mode (on *nix systems there is only one file type, so it makes no difference). On success, fopen returns a FILE pointer; on failure, it returns NULL. The FILE pointer does not point to the actual file but to a structure holding information about the file, including the buffer used for it.

File IO uses the open function to open a file:

int open(const char *name, int how);

As with fopen, name is the file name string, and how specifies the open mode: O_RDONLY (read-only), O_WRONLY (write-only), O_RDWR (read and write), and others; see man 2 open. On success, open returns a non-negative integer called a file descriptor, which is quite different from the FILE pointer of standard I/O. On failure it returns -1, again unlike standard I/O, which returns NULL.
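The two failure conventions can be put side by side. This is a minimal sketch with made-up helper names (can_fopen, can_open); both only test whether a file can be opened for reading.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if path can be opened for reading with
 * fopen(), 0 otherwise. Failure is signalled by a NULL pointer. */
int can_fopen(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* The same check with open(): failure is signalled by -1. */
int can_open(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

In both cases errno describes the reason for a failure, so perror(path) can be used to print a readable message.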

2. fclose and close

As the counterpart of opening a file, standard I/O uses fclose to close it; just pass in the file pointer. fclose returns 0 on success and EOF otherwise.
For example:

if (fclose(fp) != 0)
    printf("Error in closing file\n");

File I/O uses close to close a file opened by open. It is similar to fclose: it returns 0 on success, but returns -1 instead of EOF when an error occurs. As elsewhere in C, errors are reported through return codes.

3. Reading files: getc, fscanf, fgets and read

For reading files with standard I/O, you can use getc to read one character at a time, or gets and fgets to read a line at a time (up to the first newline character encountered). gets reads only from standard input and takes a single parameter, the destination array. It does not check whether the array can hold the characters read, so it may overflow the buffer; it is not recommended and was removed from the language in C11. fgets takes three parameters:

char *fgets(char *s, int size, FILE *stream);

The first parameter, as with gets, is the address where the input is stored. The second is an integer giving the size of that buffer: fgets reads at most size - 1 characters and then appends a terminating null character. The last parameter is the file pointer of the file to read from. Finally there is fscanf, which is like scanf except that it takes an extra parameter specifying the file to operate on, for example fscanf(fp,"%s",words).

File I/O uses the read function to read a file opened by open. Its prototype is as follows:

ssize_t read(int fd, void *buf, size_t qty);

Here fd is the file descriptor returned by open, buf is the destination buffer for the data, and qty is the number of bytes to read. On success, read returns the number of bytes read (less than or equal to qty); on error it returns -1.

4. Detecting the end of the file

In standard I/O, getc returns the special value EOF when a read reaches the end of the file, and fgets returns NULL at end of file. For the *nix read function, the situation is different. read tries to read the number of bytes given by qty, but it may return fewer bytes than you asked for, and once the end of the file has been reached, a further read returns 0.
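The two end-of-file conventions lead to slightly different loops. The following sketch counts the bytes in a file both ways; the function names and the 4096-byte buffer size are arbitrary choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Count bytes with standard I/O: getc() returns EOF at end of file.
 * The result is stored in an int, not a char, so that EOF stays
 * distinct from every valid byte value. Returns -1 on a read error. */
long count_with_getc(FILE *fp)
{
    assert(fp != NULL);
    long total = 0;
    int ch;
    while ((ch = getc(fp)) != EOF)
        total++;
    return ferror(fp) ? -1 : total;
}

/* Count bytes with file I/O: read() returns 0 at end of file
 * and -1 on error. */
long count_with_read(int fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;             /* n may be smaller than sizeof buf */
    return n == 0 ? total : -1;
}
```

Since getc also returns EOF on a read error, count_with_getc checks ferror afterwards to tell the two cases apart.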

5. Writing files: putc, fputs, fprintf and write

Mirroring the functions for reading, standard I/O uses putc to write a character, for example:

putc(ch, fp);

The first argument is a character and the second is a file pointer. fputs is similar:

fputs(buf, fp);

The only change is that the first parameter is the address of a string. fprintf is like printf with an extra parameter specifying the file to write to, for example:

fprintf(stdout,"Hello %s.\n","dennis");

Remember that fscanf and fprintf take the FILE pointer as the first parameter, while putc and fputs take it as the second.

The write function is provided in file IO for writing files, and the prototype is similar to read:

ssize_t write(int fd, const void *buf, size_t amt);

fd is the file descriptor, buf points to the data in memory to be written, and amt is the number of bytes to write. On success, write returns the number of bytes actually written; comparing this result with amt tells you whether everything was written. On failure, it returns -1.
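Because write may write fewer bytes than requested, code that must write everything usually loops. The helper below is a sketch of that common pattern; the name write_all is made up, and retrying on EINTR is one reasonable policy rather than the only one.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all amt bytes have been
 * written. Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t amt)
{
    assert(buf != NULL);
    const char *p = buf;
    while (amt > 0) {
        ssize_t n = write(fd, p, amt);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;
        }
        p += n;                 /* skip the part that was written */
        amt -= (size_t)n;
    }
    return 0;
}
```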

6. Random access: fseek(), ftell() and lseek()

Standard I/O uses fseek and ftell for random access to files. First, look at the fseek prototype:

int fseek(FILE *stream, long offset, int whence);

The first parameter is the file pointer. The second is an offset of type long, giving the distance to move from the starting point. The third is a mode that specifies the starting point; stdio.h defines the following mode constants:

SEEK_SET start of file 
SEEK_CUR current position 
SEEK_END end of file

Here are a few example calls:
fseek(fp, 0L, SEEK_SET); // go to the beginning of the file
fseek(fp, 0L, SEEK_END); // go to the end of the file
fseek(fp, 2L, SEEK_CUR); // move forward 2 bytes from the current position

The ftell function returns the current position in the file as a long, as in the following calls:

fseek(fp, 0L, SEEK_END); // go to the end
long last = ftell(fp);   // return the current position

At this point last holds the size in bytes of the file that fp refers to.
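The same idea can be wrapped in a small function that also restores the original position. This is a sketch with a made-up name (stream_size); it assumes a stream on which seeking to the end is supported, which is the case for regular files on POSIX systems.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: size in bytes of the file behind fp, or -1 on
 * error. The current position is saved first and restored afterwards. */
long stream_size(FILE *fp)
{
    assert(fp != NULL);
    long saved = ftell(fp);
    if (saved == -1L)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    long size = ftell(fp);      /* position at the end == size in bytes */
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Because ftell returns a long, this approach cannot report sizes that do not fit in a long on the platform in question.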

As with standard I/O, *nix systems provide lseek as the counterpart of fseek. Its prototype is as follows:

off_t lseek(int fildes, off_t offset, int whence);

fildes is a file descriptor, offset is again an offset, and whence again specifies the starting point. The difference lies in the return value: on success, lseek returns the resulting position measured from the beginning of the file (fseek simply returns 0); on failure it returns -1. whence takes the same values as for fseek: SEEK_SET, SEEK_CUR, and SEEK_END, which traditionally correspond to the integers 0, 1, and 2.
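Since lseek returns the resulting offset, seeking to the end of a file yields its size directly. The following is a minimal sketch; the function name file_size is invented for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: size in bytes of the file at path, found by
 * seeking to its end. Returns -1 if the file cannot be opened or the
 * seek fails. */
off_t file_size(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);  /* resulting offset == file size */
    close(fd);
    return end;                          /* (off_t)-1 if lseek failed */
}
```

A call such as lseek(fd, 0, SEEK_CUR) moves nothing and simply reports the current offset, which makes it the file I/O counterpart of ftell.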

 

4. System calls and library functions

So far we have been discussing the difference between file I/O and standard I/O. In fact, it can be said that the file I/O functions are system calls, while the standard I/O functions are library functions. See the picture below:

 

POSIX: Portable Operating System Interface

ANSI: American National Standards Institute

1. System call

The operating system is responsible for managing and allocating all computer resources. To serve applications better, it provides a set of special interfaces: system calls. Through these interfaces, user programs can use the services provided by the operating system kernel, for example to allocate memory, create processes, or communicate between processes.

Why are programs not allowed to access computer resources directly? Because it is not safe. In microcontroller development, no operating system is required, so developers can write code that accesses the hardware directly. In 32-bit embedded systems, an operating system is usually running, and the way programs access resources has changed. Operating systems generally support multitasking, that is, multiple programs can run at the same time. If programs were allowed to access system resources directly, they could interfere with one another and cause many problems. Therefore, the operating system is responsible for managing and allocating all hardware and software resources. When a program needs a resource (for example, to allocate memory or to read and write a serial port), the request must go through the operating system: the user program sends a service request, and the operating system executes the relevant code to handle it.

       The interface through which a user program makes a request to the operating system is a system call. All operating systems provide system call interfaces, but different operating systems provide different system call interfaces. The Linux system call interface is very streamlined, and it inherits the most basic and useful parts of the Unix system calls. These system calls can be roughly divided into process control, inter-process communication, file system control, storage management, network management, socket control, user management, etc. according to their functions.

2. Library functions

A library function can be regarded as a wrapper around system calls. System calls are specific to an operating system (Linux, Windows, and so on), so calling them directly hurts the portability of a program; library functions are used instead. Take the C library: as long as it is installed on the system, functions such as printf() and scanf() can be used. The C library in effect translates the system's functions so that our applications can call them.

3. User programming interface API

As mentioned earlier, a program can access various resources through the system call interface. In actual development, however, programs usually do not use the system call interface directly but go through the user programming interface (API). Why not use the system call interface directly?

The reasons are as follows:

1) The functionality of the system call interface is very basic and cannot meet all the needs of a program.

2) The system call interfaces of different operating systems are not compatible, so porting a program takes a great deal of work.

Put simply, the user programming interface is the set of functions in the various libraries (the most important being the C library). To improve development efficiency, the C library implements many commonly needed functions for programmers to call. Programmers do not have to write this code themselves and can call library functions directly to get basic functionality, which improves code reuse. Using the user programming interface has another advantage: programs are portable. The C library is implemented on almost all operating systems, so a program usually only needs to be recompiled to run under another operating system.

The implementation of the user programming interface (API) usually relies on the system call interface. For example, the API function fork(), which creates a process, corresponds to the sys_fork() system call in kernel space. Many API functions carry out their work through multiple system calls, and some API functions make no system calls at all.
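The relationship between an API function and a system call can be seen with a simpler call than fork. The sketch below is Linux-specific and rests on the assumption that glibc's syscall() function and the SYS_getpid constant are available; it asks the kernel for the process ID directly and should return the same value as the getpid() library wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (Linux-specific): obtain the process ID through
 * the generic syscall() entry point instead of the getpid() wrapper. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);            /* this system call does not fail */
    return pid;
}
```

In ordinary code the wrapper getpid() is preferable, because it is portable to every POSIX system, whereas system call numbers belong to one particular kernel.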

The user programming interface (API) in Linux follows the POSIX standard, the most widely used application programming interface standard in the Unix world. POSIX is a family of standards developed jointly by IEEE and ISO/IEC. Based on Unix practice and experience at the time, it describes the operating system's system call programming interface (in fact, the API) so that applications can be ported between operating systems at the source code level. These interfaces are implemented mainly by the C library (libc).


Origin blog.csdn.net/psq1508690245/article/details/115205191