Review of Yesterday

What is a file

A file is a virtual unit that the operating system provides for storing data.

Steps for working with a file

1. Locate the path to the file (file_path)
2. Open the file (open)
3. Read or modify the file (read / write)
4. Save the changes (flush)
5. Close the file (close)
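
The five steps above can be sketched roughly as follows. The file name demo.txt and the temporary directory are assumptions made only so the sketch runs anywhere; they are not part of the original notes.

```python
import os
import tempfile

# 1. Locate the path to the file (a temporary directory keeps the sketch self-contained)
file_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 2. Open the file
f = open(file_path, "w", encoding="utf-8")
# 3. Modify the file
f.write("hello")
# 4. Save the changes: flush pushes buffered data out to the operating system
f.flush()
# 5. Close the file (close also flushes, so step 4 is optional right before it)
f.close()

# Open the file again in read mode to check what was stored
f = open(file_path, "r", encoding="utf-8")
content = f.read()
f.close()
print(content)
```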

Opening a file: 3 modes + 2 formats

Modes

1. w: write only; empties the file first, then writes
2. r: read only; cannot write
3. a: append; writes are added to the end of the file

Formats

1. b: binary
2. t: text (the default)
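
A small sketch of how the modes and formats combine, assuming a scratch file called modes.txt in a temporary directory (both hypothetical):

```python
import os
import tempfile

file_path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# w: empties the file (or creates it), then writes
with open(file_path, "w", encoding="utf-8") as f:
    f.write("first")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("second")  # the earlier "first" is gone

# a: keeps the existing content and writes at the end
with open(file_path, "a", encoding="utf-8") as f:
    f.write(" third")

# r combined with t (text): read returns a str
with open(file_path, "rt", encoding="utf-8") as f:
    text = f.read()

# r combined with b (binary): read returns bytes, and no encoding is given
with open(file_path, "rb") as f:
    raw = f.read()

print(text)
print(raw)
```

Text mode decodes the bytes for you, while binary mode hands back the bytes untouched, which is what you want for images or other non-text data.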

Not recommended

1. r+: readable and writable
2. a+: readable and writable (writes are appended)
3. w+: readable and writable (empties the file first)
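
The sketch below shows how each combined mode behaves on a scratch file (plus.txt is a hypothetical name). Reads and writes share one file position, which is easy to get wrong and may be why these modes are discouraged here.

```python
import os
import tempfile

file_path = os.path.join(tempfile.mkdtemp(), "plus.txt")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("abcdef")

# r+: the content is kept; writing starts at position 0 and overwrites in place
with open(file_path, "r+", encoding="utf-8") as f:
    f.write("XY")
    f.seek(0)  # move back to the start before reading
    after_r_plus = f.read()

# a+: the content is kept; every write goes to the end of the file
with open(file_path, "a+", encoding="utf-8") as f:
    f.write("!")
    f.seek(0)
    after_a_plus = f.read()

# w+: the file is emptied as soon as it is opened
with open(file_path, "w+", encoding="utf-8") as f:
    after_w_plus = f.read()

print(after_r_plus, after_a_plus, repr(after_w_plus))
```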

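The with context manager

file_path = 'test.txt'  # example path; the file must already exist
f = open(file_path)
f.read()
f.close()  # without with, the file has to be closed by hand
# with closes the file automatically
with open(file_path) as f:
    f.read()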
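
To check that with really closes the file, including when the block fails, a minimal sketch (the scratch file with_demo.txt and the simulated error are assumptions for illustration):

```python
import os
import tempfile

file_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(file_path, "w", encoding="utf-8") as f:
    f.write("data")
# Leaving the block has already closed the file
closed_after_with = f.closed

# The file is also closed when the block raises an exception
try:
    with open(file_path, encoding="utf-8") as g:
        raise ValueError("simulated failure")
except ValueError:
    pass
closed_after_error = g.closed

print(closed_after_with, closed_after_error)
```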

Crawler principle

A browser gets content by sending a request; a crawler gets the same content by simulating the request a browser would send.

Crawler workflow

1. Send a request (fill in a URL)
2. Get the response content
3. Filter out the data you need
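
The three steps could be wired together roughly like this. The function names send_request, filter_data and crawl are hypothetical, as are the sample HTML and the example.com URL; the fetch step is passed in as a parameter so the filtering can be tried without network access.

```python
import re


def send_request(url):
    """Steps 1 and 2: send the request and return the page text."""
    import requests  # third-party; imported here so the filter step runs without it
    res = requests.get(url, timeout=10)
    return res.text


def filter_data(html):
    """Step 3: keep only the values of src attributes."""
    return re.findall(r'src\s*=\s*"(.*?)"', html)


def crawl(url, fetch=send_request):
    """Run the whole workflow; fetch can be swapped out for testing."""
    return filter_data(fetch(url))


# Offline demonstration with a canned page instead of a real request
sample_html = '<img id = "blogLogo" src = "http://www.baidu.com" alt="logo">'
links = crawl("http://example.com", fetch=lambda url: sample_html)
print(links)
```

Calling crawl(url) with no fetch argument would send a real request through requests.get.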

Using the requests module

import requests
url = 'http://www.baidu.com'  # example URL
res = requests.get(url)
# text
res.text
# binary stream
res.content

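The re module

import re
# re.S lets . match newlines as well, so a pattern can span the whole text
data = '<img id = "blogLogo" src = "http://www.baidu.com" alt="返回主页">'
# filter the needed content out of the text
re.findall('src = "(.*?)"', data)
# wrap whatever you need in (.*?); .*? covers 80%-90% of cases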
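
A short sketch of why the non-greedy .*? is usually the right choice, and of what re.S changes. The sample strings are made up for illustration.

```python
import re

data = '<img id = "blogLogo" src = "http://www.baidu.com" alt="home">'

# Non-greedy: stops at the first closing quote after src
lazy = re.findall('src = "(.*?)"', data)

# Greedy: runs on to the last quote in the string and swallows the alt attribute
greedy = re.findall('src = "(.*)"', data)

# Without re.S the dot does not match a newline, so nothing is found
multiline = '<p>line one\nline two</p>'
without_s = re.findall('<p>(.*?)</p>', multiline)

# With re.S the dot also matches the newline
with_s = re.findall('<p>(.*?)</p>', multiline, re.S)

print(lazy)
print(greedy)
print(without_s)
print(with_s)
```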

Origin www.cnblogs.com/793564949liu/p/11425948.html