Python Lesson (character encoding / file operations)
Character Encoding
# Character encoding applies to text.
# Character encoding only concerns text files.
# Entering text in an editor (input) and displaying it (output) are two separate processes.
# A character encoding table is the correspondence between characters and numbers.

ASCII
    Represents each English character with 8 binary bits (1 byte).
GBK
    Represents a Chinese character with 2 bytes and an English letter with 1 byte.
    2 bytes can distinguish at most 65,536 (2^16) values.
Unicode
    Represents all characters in a unified way, described here as 2 bytes per character
    (characters outside the Basic Multilingual Plane need more than 2 bytes).
    Drawbacks of storing text in this fixed-width form:
    1. It wastes storage space.
    2. It increases the amount of I/O, which reduces program efficiency (fatal).

# When Unicode data in memory is saved to the hard disk, it is encoded as utf-8.
# utf-8 shrinks an English character from the original 2 bytes to 1 byte.
# utf-8 grows a Chinese character from the original 2 bytes to 3 bytes.
# On today's computers, text in memory is Unicode and text on the hard disk is utf-8.

Two characteristics of Unicode
    1. Whatever characters the user types, Unicode is compatible with the characters of every nation.
    2. Unicode has a mapping to each of the other national encodings, so data stored on disk in those encodings can be read into memory as Unicode.

# Memory to hard disk: Unicode data in memory >>> encode >>> utf-8 binary data
# Hard disk to memory: utf-8 binary data on disk >>> decode >>> Unicode data in memory

# To make sure text is not garbled, decode a file with the same encoding it was saved with.
# Python 2 reads a .py file as a text file using ASCII by default.
# Python 3 reads a .py file as a text file using utf-8 by default.
# File header: # coding: utf-8
# The header can take effect because every encoding supports English characters, so the header line can be read correctly before the file's encoding is known.
# In Python 3, strings (str) are Unicode by default.
# The PyCharm terminal uses utf-8; the Windows terminal uses gbk (on Chinese-language systems).
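The byte counts in the notes above can be checked directly. The following is a minimal sketch, not part of the original lesson: the helper `encoded_size` is a hypothetical name, and `utf-16-le` is used here only as a stand-in for the "2 bytes per character" in-memory form the notes describe.

```python
# Compare how many bytes one English and one Chinese character need
# in each encoding. "utf-16-le" stands in for the fixed 2-byte form
# described in the notes (an assumption made for illustration).

def encoded_size(text, encoding):
    """Return the number of bytes `text` occupies once encoded."""
    return len(text.encode(encoding))

sizes = {
    (char, enc): encoded_size(char, enc)
    for char in ("a", "上")
    for enc in ("utf-16-le", "utf-8", "gbk")
}

for (char, enc), size in sizes.items():
    print(f"{char!r} in {enc}: {size} byte(s)")
```

The English letter drops from 2 bytes to 1 byte in utf-8, while the Chinese character grows from 2 bytes to 3 bytes, which matches the trade-off described above.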
encode()    # encodes Unicode (str) into binary data for storage and transmission, e.g. utf-8
            # the result is of type bytes (a byte string); you can treat it as binary data
decode()    # decodes binary data (e.g. utf-8 data from the hard disk) back into Unicode (str)

x = '上'
res1 = x.encode('utf-8')      # encode Unicode into utf-8 binary data for storage and transmission
print(res1)                   # b'\xe4\xb8\x8a'
                              # bytes type (byte string); you can treat it as binary data
res2 = res1.decode('utf-8')   # decode the utf-8 binary data back into Unicode
print(res2)                   # 上

# Supplement: byte layout of a short string of Chinese characters followed by the letter a (7 bytes in total)
# 1 byte  | 1 byte  | 1 byte  | 1 byte  | 1 byte  | 1 byte  | 1 byte
# 1+7 bit | 1+7 bit | 1+7 bit | 1+7 bit | 1+7 bit | 1+7 bit | 1+7 bit
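To illustrate why the encoding used to decode must match the one used to encode, here is a small sketch. The helper `decode_or_report` is a hypothetical name introduced for this example; it only wraps the standard `bytes.decode` call.

```python
# Round trip: decoding with the same encoding restores the original text.
# Mismatch: decoding GBK bytes as utf-8 fails instead of returning the text.

def decode_or_report(data, encoding):
    """Decode `data`, or return a short message if the bytes do not fit `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<cannot decode as {encoding}: {exc.reason}>"

text = "上"
utf8_bytes = text.encode("utf-8")   # 3 bytes
gbk_bytes = text.encode("gbk")      # 2 bytes

print(decode_or_report(utf8_bytes, "utf-8"))  # same encoding: text restored
print(decode_or_report(gbk_bytes, "gbk"))     # same encoding: text restored
print(decode_or_report(gbk_bytes, "utf-8"))   # mismatch: reports a decode error
```

A mismatch does not always raise an error. Some byte sequences happen to be valid in the wrong encoding and decode silently into unrelated characters, which is the garbled text the notes warn about.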
File Handling
# What is a file?
# A file is the interface the operating system provides so that users can easily operate complex hardware (the hard disk).
# Why manipulate files?
# People and applications need to save data permanently.

# Opening a file
f = open('file path', 'mode')   # f is the file handle

# File open modes (text mode is the default):
r    read-only mode [the default; the file must exist, otherwise an exception is raised]
w    write-only mode [not readable; the file is created if it does not exist; if it exists, its contents are emptied]
a    append mode, write-only [not readable; the file is created if it does not exist; if it exists, new content is added at the end]
t    text mode; specify the encoding parameter when using it (if omitted, the operating system's default encoding is used)
b    binary mode; the encoding parameter must not be specified
# The mode parameter may be omitted; the default is rt (read-only text).
# t may also be omitted, because text mode is the default.

# File methods
f.read()         # read all of the content; the cursor moves to the end of the file
f.readline()     # read a single line; the cursor moves to the start of the next line
f.readlines()    # read every line and store the lines in a list
f.write('1111\n222\n')                     # text mode; you need to write the line breaks yourself
f.write('1111\n222\n'.encode('utf-8'))     # b mode; you need to write the line breaks yourself
f.writelines(['333\n', '444\n'])           # text mode
f.writelines([bytes('333\n', encoding='utf-8'), '444\n'.encode('utf-8')])   # b mode
f.readable()     # whether the file is readable
f.writable()     # whether the file is writable
f.closed         # whether the file is closed

# To avoid forgetting f.close(), the with keyword is recommended; it manages the context for us
with open('a.txt', 'w') as f:
    f.encoding   # this attribute does not exist if the file is opened in b mode
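The modes and methods listed above can be combined in one short, runnable sketch. It is only an illustration under stated assumptions: the scratch file lives in a temporary directory created with `tempfile.mkdtemp()`, and the file name `a.txt` and the variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "a.txt")

# Text mode, write-only: the file is created and newlines are written explicitly.
with open(path, "wt", encoding="utf-8") as f:
    f.write("1111\n222\n")
    f.writelines(["333\n", "444\n"])

# Text mode, read-only: readline() consumes one line, readlines() returns the rest.
with open(path, "rt", encoding="utf-8") as f:
    first_line = f.readline()
    remaining = f.readlines()

# Binary append mode: no encoding parameter, so the text is encoded by hand.
with open(path, "ab") as f:
    f.write("上\n".encode("utf-8"))

# Binary read mode returns bytes rather than str.
with open(path, "rb") as f:
    raw = f.read()

print(first_line, remaining, raw)
print(f.closed)  # the with block has already closed the file
```

Each `with` block closes its file on exit, so no explicit `f.close()` call is needed.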
END