Learning Python (5): Reading and Writing Text Files and Processing Text Data

1. Opening a file

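Data file: sketch.txt

Before a program can process data, it has to open the file that holds the data.

First, import the os module.

os.getcwd()  # get the current working directory

os.chdir()   # switch the current working directory to the folder that contains the data file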

>>> import os
>>> os.getcwd()
'/home/mwx'
>>> os.chdir('/home/mwx/HeadFirstPython/chapter3')
>>> os.getcwd()                        # check again to confirm the working directory is now the data folder
'/home/mwx/HeadFirstPython/chapter3'
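>>> data=open('sketch.txt')         # open the data file and bind the file object to 'data'
>>> print(data.readline(),end='')  # read and print the first line of the file
Man: Is this the right room for an argument?
>>> data.seek(0) # seek(0) returns to the start of the file; tell() only reports the current position and does not move it
0
>>> # Caution: the for loop already reads one line per pass, and data.readline() then reads another,
>>> # so this loop prints only every other line (see the output below). Use print(each_line,end='') to print every line.
>>> for each_line in data:
	print(data.readline(),end='')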

	
Other Man: I've told you once.
Other Man: Yes I have.
Other Man: Just now.
Other Man: Yes I did!
Other Man: I'm telling you, I did!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Other Man: Just the five minutes. Thank you.
Man: You most certainly did not!
Man: Oh no you didn't!
Man: Oh no you didn't!
Man: Oh look, this isn't an argument!
Other Man: Yes it is!
(pause)
Other Man: No it isn't!
Other Man: It is NOT!
Other Man: No I didn't!
Other Man: No no no!
Other Man: Nonsense!
(pause)
Man: Yes it is!
>>> data.close()
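
As a minimal sketch of the same steps without the skipped lines, the loop below prints the loop variable itself and uses a with block so the file is closed automatically. The sample text and the temporary file are assumptions that stand in for sketch.txt so the example runs on its own.

```python
import os
import tempfile

# Hypothetical sample data standing in for sketch.txt.
sample = "Man: You didn't!\nOther Man: I'm telling you, I did!\n(pause)\nMan: You did not!\n"

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, encoding='utf-8') as tmp:
    tmp.write(sample)
    path = tmp.name

lines = []
with open(path, encoding='utf-8') as data:   # the file is closed automatically on exit
    first = data.readline()                  # read the first line
    data.seek(0)                             # go back to the start of the file
    for each_line in data:                   # the loop variable already holds the line
        lines.append(each_line)
        print(each_line, end='')

os.remove(path)  # clean up the temporary file
```

Because the loop never calls readline() itself, every line of the file is visited exactly once.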

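2. Using split()

Python's split() slices a string at a given separator. If a maximum number of splits is given, the string is split at most that many times, so the result has at most that many plus one substrings.

2.1 Syntax

str.split(sep=None, maxsplit=-1)

# sep -- the separator; by default any run of whitespace, including spaces, newlines (\n) and tabs (\t).   maxsplit -- the maximum number of splits; -1 (the default) means no limit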
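
A short sketch of how the second argument changes the result, using a line from sketch.txt that contains two colons. The variable names are only illustrative.

```python
line = "Other Man: Now let's get one thing quite clear: I most definitely told you!"

all_parts = line.split(':')        # split at every ':'
two_parts = line.split(':', 1)     # split at the first ':' only
words = "a  b\tc\n".split()        # no separator: split on runs of whitespace

print(len(all_parts))   # 3
print(two_parts)
print(words)            # ['a', 'b', 'c']
```

Limiting the split to one keeps any later ':' inside the spoken text, which is why the processing code in the next section passes 1 as the second argument.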

3. Processing the data

# Change the ':' in each line to ' said:'


>>> import os
>>> os.getcwd()
'/home/mwx'
>>> os.chdir('/home/mwx/HeadFirstPython/chapter3')
>>> os.getcwd()
'/home/mwx/HeadFirstPython/chapter3'
>>> data=open('sketch.txt')
>>> for each_line in data:
          (role,line_spoken)=each_line.split(':',1)
          print(role,end='')
          print(' said:',end='')
          print(line_spoken,end='')

          
Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
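# The next line raises an error: the original text there is "(pause)", which contains no ':'. split(':',1) then returns a single item, and unpacking it into two names fails with ValueError.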
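
The failure can be reproduced in isolation. This is a small sketch with the "(pause)" line hard-coded; it shows that split() itself succeeds and that the error comes from the tuple assignment.

```python
each_line = "(pause)\n"
parts = each_line.split(':', 1)   # no ':' present, so the list has a single item
print(parts)

try:
    (role, line_spoken) = parts   # two names, but only one value to unpack
except ValueError as err:
    message = str(err)
    print('ValueError:', message)
```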

4. Error handling

  • Add extra logic to deal with the problem before it happens
for each_line in data:
	if each_line.find(':')!=-1:  # find() returns -1 when ':' is not found
		(role,line_spoken)=each_line.split(':',1)
		print(role,end='')
		print(' said:',end='')
		print(line_spoken,end='')


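  • Let the error happen, watch for it, and recover from the run-time error
for each_line in data:
	try:
		(role,line_spoken)=each_line.split(':',1)
		print(role,end='')
		print(' said:',end='')
		print(line_spoken,end='')
	except:
		pass # depending on the situation, it is sometimes fine to simply let the error go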
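
A bare except also hides unrelated problems such as a misspelled name, so a narrower variant may be preferable. The sketch below catches only ValueError; the helper name say() and the sample lines are hypothetical and exist only for this illustration.

```python
def say(each_line):
    """Return 'role said: text', or None when the line has no ':' (hypothetical helper)."""
    try:
        (role, line_spoken) = each_line.split(':', 1)
    except ValueError:          # only the unpacking failure is ignored
        return None
    return role + ' said:' + line_spoken

sample_lines = [
    "Man: Oh look, this isn't an argument!\n",
    "(pause)\n",
    "Other Man: Yes it is!\n",
]
results = [say(each_line) for each_line in sample_lines]
for result in results:
    if result is not None:
        print(result, end='')
```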


5. Some error checks and common exceptions

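  • os.path.exists('sketch.txt')  # check whether the file exists

ValueError: the data does not match the expected format.

IOError: the data cannot be accessed (for example, the file has been moved or renamed). In Python 3 this is an alias of OSError.

AttributeError: raised when an attribute or method that does not exist is accessed.

EOFError: raised when input() reaches end-of-file without reading any data.

ImportError: raised when importing a module fails.

IndexError: raised when a sequence index is out of range.

KeyError: raised when a key that is not in the dictionary is used.

NameError: raised when a name that has not been defined is used.

TabError: raised when indentation mixes tabs and spaces inconsistently.

ZeroDivisionError: raised when the divisor is zero.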
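
The existence check and the file-related exception can be combined, as in the following sketch. The helper name read_first_line() and the file name passed to it are assumptions made for this example.

```python
import os

def read_first_line(path):
    """Hypothetical helper: return the first line of path, or None if it cannot be read."""
    if not os.path.exists(path):
        print('The data file is missing:', path)
        return None
    try:
        with open(path, encoding='utf-8') as data:
            return data.readline()
    except OSError as err:    # IOError is an alias of OSError in Python 3
        print('File error:', err)
        return None

missing = read_first_line('no_such_file_for_this_example.txt')
```

The try block is still useful after the check, because the file can disappear or be unreadable between the check and the call to open().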



Reposted from blog.csdn.net/mao_jonah/article/details/79462267