From file operations to character encoding

Relative and absolute paths

Absolute path

An absolute path is the full path to a file, starting from the root directory of your computer.

E:\下载内容\谷歌浏览器下载\ASSSD_6751.zip

Relative path

A relative path gives the location of a file relative to the current folder.

./ASSSD_6751.zip
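As a rough sketch of the difference, the standard pathlib module can tell the two kinds of path apart. The file name is borrowed from the example above and does not need to exist; the assumption here is that "current folder" means the directory Python is run from.

```python
from pathlib import Path

# Relative path from the example above (the file does not have to exist).
relative = Path("./ASSSD_6751.zip")

# Joining it to the current working directory yields an absolute path.
absolute = Path.cwd() / relative

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True
print(absolute.name)           # ASSSD_6751.zip
```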

Basic file operations

Read the contents of the file

Python opens a file with the built-in open() function; the file path is passed as its first argument.

file_path = 'read.txt'  # path relative to the current working directory
r = open(file_path, 'r')  # 'r' is read-only mode
text = r.read()
print(text)  # Hello World!
r.close()
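It is easy to forget close(), so a common alternative is the with statement, which closes the file automatically. This is only a sketch: the temporary folder is an assumption made so that the example runs on its own.

```python
import os
import tempfile

# Create a throwaway read.txt so the example is self-contained.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "read.txt")
with open(file_path, "w") as w:
    w.write("Hello World!")

# The with block closes the file when it ends, even if read() raises.
with open(file_path, "r") as r:
    text = r.read()

print(text)      # Hello World!
print(r.closed)  # True
```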

Writing to a file works in a similar way

file_path = 'read.txt'  # in 'w' mode the file is created if it does not exist
w = open(file_path, 'w')  # 'w' is write-only mode and overwrites the existing content
w.write('Hi,my name is Gredae')  # the file now contains: Hi,my name is Gredae
w.close()

Another way to write is append mode

file_path = 'read.txt'  # in 'a' mode the file is created if it does not exist
w = open(file_path, 'a')  # 'a' is append mode and does not overwrite the existing content
w.write('Hi,my name is Gredae')
w.close()
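To see the difference between 'w' and 'a' side by side, here is a small sketch. The temporary folder and the strings written are assumptions chosen for illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "read.txt")

with open(file_path, "w") as f:   # creates the file
    f.write("first")

with open(file_path, "w") as f:   # 'w' again: the old content is discarded
    f.write("second")

with open(file_path, "a") as f:   # 'a': new text goes after the old content
    f.write(" third")

with open(file_path, "r") as f:
    content = f.read()

print(content)  # second third
```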

But if the file we read contains Chinese text, an error is raised

We read read.txt, which now contains Chinese text

file_path = 'read.txt'
r = open(file_path,'r')
print(r.read())
r.close()

UnicodeDecodeError: 'gbk' codec can't decode byte 0x8c in position 14: illegal multibyte sequence

An encoding error is reported. Looking back at the file's encoding, we find that it was saved as UTF-8.

Now look again at what the error says: the 'gbk' codec can't decode byte 0x8c. So the file was being decoded as GBK.

It's like a person from Fujian and a person from Guangdong each speaking their own dialect: however hard they listen, how could they understand each other?
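The mismatch can be reproduced without any file at all. The sketch below encodes a sample string as UTF-8 and then tries to decode the bytes as GBK. Depending on the exact bytes, the wrong codec either raises UnicodeDecodeError or silently produces garbled text, so the code handles both cases.

```python
original = "你好,世界!"

# This is what ends up on disk when the file is saved as UTF-8.
raw = original.encode("utf-8")

# Decoding the same bytes with the wrong codec.
try:
    wrong = raw.decode("gbk")
except UnicodeDecodeError as exc:
    wrong = None
    print("GBK decoding failed:", exc.reason)

# Decoding with the codec that was used for saving.
right = raw.decode("utf-8")

print(right == original)  # True
print(wrong == original)  # False
```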

To solve this problem we need to talk about character encoding.

Character Encoding

When we write something in Notepad, the text is first held in memory (if you don't believe it, try closing Notepad while you are still writing: it will ask whether you want to save). Only when we press save is the content in memory written to the hard disk. But the encoding used in memory, Unicode, is not the same as the encoding of the file on the hard disk. So the Unicode text has to be transcoded into the encoding we choose for the file, for example UTF-8 or GBK.
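In Python terms, a str is the Unicode text in memory and bytes is what gets stored on disk. A minimal sketch of the transcoding step, using a two-character sample string chosen for illustration:

```python
text = "你好"  # Unicode text in memory

# The same text turned into bytes with two different encodings.
utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

print(utf8_bytes)  # b'\xe4\xbd\xa0\xe5\xa5\xbd'
print(gbk_bytes)   # b'\xc4\xe3\xba\xc3'
print(len(utf8_bytes), len(gbk_bytes))  # 6 4

# Each byte sequence turns back into the same text with its own codec.
print(utf8_bytes.decode("utf-8") == gbk_bytes.decode("gbk"))  # True
```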

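That is why the error above appeared when reading the file: no encoding was given, so Python fell back to the system default, which on this machine was GBK, while the file had been saved as UTF-8. When we read a file, we just need to tell Python which encoding to use.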

file_path = 'read.txt'
r = open(file_path,'r',encoding='UTF-8')
print(r.read())  # 你好,世界!
r.close()
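The 'gbk' in the earlier error message was never written in the code; when no encoding argument is given, open() picks a platform-dependent default. The sketch below prints that default on the current machine. The result differs between systems, so no expected output is shown, and the temporary file is an assumption made to keep the example self-contained.

```python
import locale
import os
import tempfile

# The encoding preferred by the current locale (for example a GBK code page
# on a Chinese-language Windows system, or UTF-8 on most Linux systems).
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# A file opened without an encoding argument reports the encoding it chose.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "read.txt")
with open(file_path, "w") as f:
    implicit_encoding = f.encoding
print(implicit_encoding)
```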

You can also set the encoding when writing a file, but you must then read it back with the same encoding that was used to save it. UTF-8 is the usual choice.
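A minimal round trip, assuming UTF-8 on both sides and a temporary folder so the example can run anywhere:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "read.txt")

# Save with an explicit encoding...
with open(file_path, "w", encoding="utf-8") as w:
    w.write("你好,世界!")

# ...and read back with the same one.
with open(file_path, "r", encoding="utf-8") as r:
    text = r.read()

print(text == "你好,世界!")  # True
```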

In practice you rarely need to think about character encoding; it usually only comes up when an error like the one above appears. Just remember: read a file with the same encoding it was stored with, and encoding errors will not occur.

Origin www.cnblogs.com/Gredae/p/11317462.html