Python Basics (10): Persistence and Strings

Contents

I. Object Persistence

1. Flat files
2. pickle
3. shelve

II. Strings

1. Overview
2. Character encodings
3. Built-in functions
4. Type conversion
5. BOM handling

I. Object Persistence

1. Flat files

Text files

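import ast

scores = [88, 99, 77, 55]

def write_scores():
    with open('data.list.txt', 'w', encoding='utf8') as f:
        f.write(str(scores))
    print('complete')

def read_scores():
    with open('data.list.txt', 'r', encoding='utf8') as f:
        # Parse the stored text back into a list; unlike eval, literal_eval accepts only literals
        lst = ast.literal_eval(f.read())

    lst[0] = 99
    print(lst)

if __name__ == '__main__':
    write_scores()  # create the file first so read_scores() has something to read
    read_scores()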
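The flat-file approach above depends on turning the stored text back into a Python object. The following is a minimal sketch of that round trip without touching the disk, assuming the file content is simply `str(scores)`. The helper `is_safe_literal` is a hypothetical name introduced for illustration; it shows that `ast.literal_eval` rejects anything that is not a plain literal.

```python
import ast

scores = [88, 99, 77, 55]
text = str(scores)                 # the same text write_scores() would store
restored = ast.literal_eval(text)  # parses literals only, never runs code

def is_safe_literal(source):
    """Hypothetical helper: True if source parses as a plain Python literal."""
    try:
        ast.literal_eval(source)
        return True
    except (ValueError, SyntaxError):
        return False

print(restored)
print(is_safe_literal("[1, 2, 3]"))
print(is_safe_literal("__import__('os').getcwd()"))
```

A function call is not a literal, so the last check fails instead of executing the expression, which is the main reason to prefer `ast.literal_eval` over `eval` for data read from a file.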

2. pickle

pickle
(1) Serializing to bytes

import pickle
person = {'name':'tom','age':20}
s = pickle.dumps(person) # serialize the object to a bytes object
s
b'\x80\x03}q\x00(X\x04\x00\x00\x00nameq\x01X\x03\x00\x00\x00tomq\x02X\x03\x00\x00\x00ageq\x03K\x14u.'
p = pickle.loads(s) # deserialize the object from bytes
p
{'name': 'tom', 'age': 20}

(2) Serializing an object to a file

with open('pickle_db', 'wb') as f:  # the with block closes the file, flushing the data to disk
    pickle.dump(person, f)
with open('pickle_db', 'rb') as f:
    p = pickle.load(f)
p
{'name': 'tom', 'age': 20}
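The file-based round trip above can be sketched as a self-contained script. This version assumes a temporary directory (so no `pickle_db` file is left behind) and uses `with` blocks so both file handles are closed; the variable names are illustrative only.

```python
import os
import pickle
import tempfile

person = {'name': 'tom', 'age': 20}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'pickle_db')

    # Pickle files must be opened in binary mode
    with open(path, 'wb') as f:
        pickle.dump(person, f)

    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == person)  # same content
print(restored is person)  # but a separate object rebuilt from the bytes
```

Unpickling can execute arbitrary code, so `pickle.load` should only be used on data from a trusted source.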

3. shelve

shelve

import shelve
scores = [99,88,77]
student = {'name':'Mike','age':20}
db = shelve.open('shelve_student')
db['s'] = student
db['scores'] = scores
len(db)
2
temp_student = db['s']
temp_student
{'name': 'Mike', 'age': 20}
db['scores']
[99, 88, 77]
del db['scores'] # delete the entry
len(db)
1
db.close() # close the shelf so changes are written to disk

import shelve

class Student:
    def __init__(self,name,age):
        self.name = name
        self.age = age

    def __str__(self):
        return self.name

def write_shelve():
    s = Student('Tom', 20)  # create the object here so it exists before it is stored
    db = shelve.open('shelve_student_db')
    db['s'] = s
    db.close()

def read_shelve():
    db = shelve.open('shelve_student_db')
    st = db['s']
    print(st)
    print(st.name)
    db.close()

if __name__ == '__main__':
    write_shelve()  # store the object before reading it back
    read_shelve()
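As a compact variant of the shelve examples above, the sketch below uses `shelve.open` as a context manager, which closes the shelf automatically. It assumes a temporary directory and a hypothetical shelf name `shelve_demo`, and reopens the shelf to show that the data really was persisted.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'shelve_demo')

    # First session: store two entries, then close the shelf
    with shelve.open(path) as db:
        db['s'] = {'name': 'Mike', 'age': 20}
        db['scores'] = [99, 88, 77]

    # Second session: the entries are still there after reopening
    with shelve.open(path) as db:
        keys = sorted(db.keys())
        student = db['s']
        del db['scores']
        remaining = len(db)

print(keys)
print(student)
print(remaining)
```

A shelf behaves like a dictionary whose values are pickled, so the keys must be strings and the values must be picklable.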

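II. Strings

1. Overview

Types
(1) str: text string
(2) bytes: bytes
(3) bytearray: byte array
Character encoding architecture
(1) Character set: assigns a numeric code (a code point) to each character so that it can be represented in memory
(2) Encoding: converting characters to their raw byte form
(3) Decoding: converting raw bytes back to characters according to an encoding name
String storage
(1) Encoding applies only when text is written to a file or passed through some intermediate medium
(2) In memory, text is always stored in decoded form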
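The relationship between the three types and the encode/decode steps can be sketched as follows. This is a minimal illustration, assuming UTF-8 as the encoding; the variable names are made up for the example.

```python
text = '优品课堂'               # str: decoded text, as held in memory
raw = text.encode('utf-8')     # bytes: encoded form, suitable for files or the network
back = raw.decode('utf-8')     # decoding restores the original text
buf = bytearray(raw)           # bytearray: a mutable copy of the same bytes

print(type(text).__name__, len(text))  # length counted in characters
print(type(raw).__name__, len(raw))    # length counted in bytes
print(type(buf).__name__, len(buf))
print(back == text)
```

The character count and the byte count differ because each of these characters takes three bytes in UTF-8.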

2. Character encodings

ASCII: one byte per character, code points 0-127
latin-1: one byte per character; adds code points 128-255

chr(223)
'ß'

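UTF-16: 2 bytes per character (4 bytes for characters outside the Basic Multilingual Plane), plus a 2-byte BOM as a byte-order marker
UTF-32: 4 bytes per character
UTF-8: variable length. Code points 0-127 take a single byte, 128-2047 take two bytes, and code points above 2047 take 3-4 bytes; every byte of a multi-byte sequence has a value in 128-255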
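The size rules above can be checked directly by encoding a few sample characters. In this sketch the sample characters are chosen for illustration (one from each UTF-8 length class), and the `-le` codec variants are used so that no BOM is included in the per-character counts.

```python
# One sample per UTF-8 length class: ASCII, Latin-1 range, CJK, emoji
samples = ['A', '\u00df', '\u4f18', '\U0001F600']

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        len(ch.encode('utf-8')),
        len(ch.encode('utf-16-le')),
        len(ch.encode('utf-32-le')),
    )

utf8_lengths = [len(ch.encode('utf-8')) for ch in samples]
utf16_lengths = [len(ch.encode('utf-16-le')) for ch in samples]
utf32_lengths = [len(ch.encode('utf-32-le')) for ch in samples]

# The plain 'utf-16' and 'utf-32' codecs prepend a BOM when encoding
bom16 = len('A'.encode('utf-16')) - len('A'.encode('utf-16-le'))
bom32 = len('A'.encode('utf-32')) - len('A'.encode('utf-32-le'))
print(bom16, bom32)
```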

3. Built-in functions

ord: returns the code point of a character

ord('A')
65

chr: returns the character for a code point

chr(104)
'h'

str.encode: converts characters to their raw byte form

s1 = 'ABCD'
s1.encode('ASCII')
b'ABCD'
s2 = '优品课堂'
type(s2)
<class 'str'>
s2.encode('UTF-8')
b'\xe4\xbc\x98\xe5\x93\x81\xe8\xaf\xbe\xe5\xa0\x82'
s2.encode('UTF-16')

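bytes.decode: converts raw bytes to characters according to an encoding name (the default encoding for files depends on the platform; on Chinese-locale Windows it is typically GBK, not UTF-8)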

b1 = b'\xe4\xbc\x98\xe5\x93\x81\xe8\xaf\xbe\xe5\xa0\x82'
b1.decode('utf-8')
'优品课堂'
b1.decode('utf-16')
'볤\ue598膓꿨\ue5be芠'
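Decoding with the wrong codec does not always produce garbled text as in the UTF-16 case above; with a stricter codec it raises an error instead. The sketch below illustrates this with the `ascii` codec and the standard `errors` argument of `bytes.decode`; the variable names are illustrative.

```python
b1 = '优品课堂'.encode('utf-8')

# Strict decoding (the default) fails on bytes the codec cannot represent
try:
    b1.decode('ascii')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' substitutes U+FFFD for every undecodable byte
replaced = b1.decode('ascii', errors='replace')

# errors='ignore' silently drops undecodable bytes
ignored = b1.decode('ascii', errors='ignore')

print(strict_failed)
print(len(replaced))
print(len(ignored))
```

Both `replace` and `ignore` lose information, so they are best reserved for cases where the correct encoding really is unknown.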

4. Type conversion

bytes: immutable (cannot be changed in place)
(1) Literal: b''
(2) Encoding a string: str.encode()
(3) Constructor: bytes()

bytes('优品课堂','utf8')
b'\xe4\xbc\x98\xe5\x93\x81\xe8\xaf\xbe\xe5\xa0\x82'

bytearray: mutable (can be changed in place)

ba = bytearray('abc','utf8')
ba.append(93)
ba
bytearray(b'abc]')
ba.decode('utf8')
'abc]'
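The difference between the two types can be seen by attempting the same in-place change on each. This is a small sketch with illustrative variable names.

```python
data = bytes('abc', 'utf8')

# bytes does not support item assignment
try:
    data[0] = 65
    bytes_mutable = True
except TypeError:
    bytes_mutable = False

ba = bytearray(data)
ba[0] = 65          # replace 'a' with 'A' in place
ba.append(93)       # append ']'
ba.extend(b'!')     # append several bytes at once

result = bytes(ba)  # convert back to an immutable bytes object
print(bytes_mutable)
print(result)
```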

5. BOM handling

Skipping the BOM to avoid garbled text

open('data3.txt','w',encoding='utf-8-sig').write('youpinketang')
12
open('data3.txt','r',encoding='utf-8-sig').read()
'youpinketang'
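To show what the `utf-8-sig` codec actually does, the sketch below writes the same text with a BOM and then reads it back three ways. It assumes a temporary directory rather than the working directory; the variable names are illustrative.

```python
import codecs
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data3.txt')

    # utf-8-sig writes a 3-byte BOM before the text
    with open(path, 'w', encoding='utf-8-sig') as f:
        f.write('youpinketang')

    with open(path, 'rb') as f:
        raw = f.read()

    # Plain utf-8 keeps the BOM as a leading U+FEFF character
    with open(path, 'r', encoding='utf-8') as f:
        with_bom = f.read()

    # utf-8-sig strips the BOM when reading
    with open(path, 'r', encoding='utf-8-sig') as f:
        without_bom = f.read()

print(raw.startswith(codecs.BOM_UTF8))
print(len(with_bom), len(without_bom))
```

Reading with `utf-8-sig` is also safe for files that have no BOM, since the codec only strips it when present.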

mangogogo321
