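Strings: Character Contents, Methods, and Bytes

This article surveys the kinds of characters a Python string can contain and the methods available for working with strings, and briefly covers the relationship between characters and bytes.

I. What a string can contain

Letters (upper and lower case) + digits (binary, octal, decimal, hexadecimal) + punctuation + spaces + line breaks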

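# Letters (upper and lower case) + digits (binary, octal, decimal, hexadecimal) + punctuation + spaces + line breaks
import string
print(dir(string))  # names defined in the string module
print(dir(str))     # attributes and methods of the str type
print(string.ascii_letters)  # ASCII letters: a-z followed by A-Z
print(string.digits)         # decimal digits 0-9
print(string.hexdigits)      # hexadecimal digits 0-9, a-f, A-F
print(string.octdigits)      # octal digits 0-7
print(string.punctuation)    # ASCII punctuation characters (the space is not included)
print(string.whitespace)     # space, tab, newline, carriage return, vertical tab, form feed
print(string.printable)      # printable ASCII: digits + letters + punctuation + whitespace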
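As a quick check of how these constants relate to each other, the sketch below compares them directly. It relies only on the standard string module; the variable names are chosen for illustration.

```python
import string

# printable is the concatenation of the four groups below, in this order
rebuilt = string.digits + string.ascii_letters + string.punctuation + string.whitespace
same_as_printable = rebuilt == string.printable

# the space character belongs to whitespace, not to punctuation
space_in_punctuation = ' ' in string.punctuation
space_in_whitespace = ' ' in string.whitespace

print(same_as_printable, space_in_punctuation, space_in_whitespace)
print(len(string.printable))  # number of printable ASCII characters
```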

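II. Overview of string methods

1. Basic operations

Building and searching strings: + (concatenation), * (repetition), in (membership test)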

print('a'*3 + 'b')
print('a' in 'aaab')

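2. Built-in methods

Main categories: predicates, case conversion, encoding to bytes, splitting, searching, and formatted output.

# List the methods available on a string; dir() and help() are the tools to reach for
L = [x for x in dir('python') if '__' not in x]
print(L)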

There are two ways to call a method:

str.func('str1', opt)  # opt: optional arguments

'str1'.func(opt)

Commonly used methods: join, split, strip, replace, zfill, capitalize, upper, lower, endswith, find, title, format
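To make the two call styles concrete, here is a minimal sketch using center as the example method; any other string method would behave the same way. The variable names are illustrative only.

```python
# Bound call: the string itself is the receiver
bound = 'young'.center(11, '*')

# Call through the type: the string is passed as the first argument
unbound = str.center('young', 11, '*')

print(bound, unbound, bound == unbound)

# The second style is convenient when a plain function is needed, e.g. with map
shouted = list(map(str.upper, ['a', 'b', 'c']))
print(shouted)
```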

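Predicates: letters, digits, case, printability, start and end

Letters and digits: isalpha (letters only), isalnum (letters and digits), isnumeric (any numeric character, including fractions and numerals such as 五), isdigit (digit characters, including superscripts such as ²)
Character sets: isascii (every character is ASCII), isdecimal (every character is a decimal digit)
Case: islower, isupper, istitle

Whitespace, printability, identifiers: isspace, isprintable, isidentifier (valid Python identifier)

Start and end: startswith, endswith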

"10".isnumeric()
'abc'.endswith('c')
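The three numeric predicates are easy to confuse, so the following sketch runs them side by side on a few sample characters. The samples are chosen for illustration; each later predicate accepts everything the earlier one accepts.

```python
samples = ['5', '²', '½', '五']

# Each tuple is (isdecimal, isdigit, isnumeric)
results = {ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric()) for ch in samples}

for ch, flags in results.items():
    print(repr(ch), flags)
```

Only '5' passes all three; the superscript passes isdigit and isnumeric; the fraction and the Chinese numeral pass isnumeric alone.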

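Case conversion

capitalize (first character upper case, the rest lower case), title (first letter of each word upper case), upper (all upper case), lower (all lower case)
casefold (aggressive lower-casing for case-insensitive comparison), swapcase (swap upper and lower case)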

'young'.capitalize()
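The difference between lower and casefold only shows up with certain non-ASCII letters. A small sketch, using the German ß as an assumed example:

```python
a = 'Straße'
b = 'STRASSE'

# lower() keeps 'ß' as it is, so the two strings still differ
lower_equal = a.lower() == b.lower()

# casefold() expands 'ß' to 'ss', so the comparison succeeds
casefold_equal = a.casefold() == b.casefold()

print(lower_equal, casefold_equal)
print('Hello World'.swapcase())
```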

Stripping whitespace

lstrip (strip the start), rstrip (strip the end), strip (strip both ends), expandtabs (replace \t with spaces)

'   young'.lstrip()
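The strip family also accepts an optional argument naming the characters to remove. The sketch below illustrates that this argument is treated as a set of characters, not as a prefix or suffix; the sample strings are illustrative.

```python
padded = '   young  '
stripped = padded.strip()            # whitespace removed from both ends

# Every leading/trailing character found in the set 'cmowz.' is removed
domain = 'www.example.com'.strip('cmowz.')

no_leading_zeros = '000123000'.lstrip('0')

# A tab advances to the next multiple of the tab size (4 here)
expanded = 'a\tb'.expandtabs(4)

print(repr(stripped), domain, no_leading_zeros, repr(expanded))
```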

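Splitting

split, rsplit, splitlines, partition (split at the first occurrence of the separator into a 3-tuple), rpartition (the same, at the last occurrence)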

'abc,efd'.split(',')
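A side-by-side sketch of the splitting methods on one sample string may help show how they differ, in particular split with a limit versus partition.

```python
text = 'a,b,c'

all_parts = text.split(',')        # split at every separator
first_cut = text.split(',', 1)     # at most one split, from the left
last_cut = text.rsplit(',', 1)     # at most one split, from the right

head = text.partition(',')         # 3-tuple around the first separator
tail = text.rpartition(',')        # 3-tuple around the last separator
not_found = 'abc'.partition(',')   # separator missing: two empty strings follow

lines = 'x\ny\r\nz'.splitlines()   # handles both \n and \r\n

print(all_parts, first_cut, last_cut, head, tail, not_found, lines)
```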

Finding an index, counting, and replacing

find, rfind, index, rindex, count, replace

'asfdd'.find('d')
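The main practical difference between find and index is what happens when the substring is missing. A minimal sketch on the same sample string:

```python
s = 'asfdd'

first = s.find('d')      # index of the first 'd'
last = s.rfind('d')      # index of the last 'd'
missing = s.find('z')    # find signals "not found" with -1

# index behaves like find but raises ValueError when nothing is found
try:
    s.index('z')
    index_raised = False
except ValueError:
    index_raised = True

occurrences = s.count('d')
replaced_once = s.replace('d', 'D', 1)   # third argument limits the number of replacements

print(first, last, missing, index_raised, occurrences, replaced_once)
```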

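Encoding to bytes

encode (the counterpart on bytes is decode)

# Encoding and decoding a Chinese character
s = "中"
b = s.encode("gb2312")  # encode to bytes with GB2312 (non-ASCII bytes print as \x hex escapes)
print(b)
b2 = b'\xd6\xd0'  # the GB2312 byte sequence for "中"
s2 = b2.decode("gb2312")  # decode the GB2312 bytes back to a str
print(s2)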
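The same character produces different bytes under different encodings, and decoding with the wrong codec fails. The sketch below compares UTF-8 with GB2312 and shows the errors parameter; the variable names are illustrative.

```python
s = '中'

utf8_bytes = s.encode('utf-8')    # three bytes in UTF-8
gb_bytes = s.encode('gb2312')     # two bytes in GB2312

# Decoding GB2312 bytes as UTF-8 is invalid and raises UnicodeDecodeError
try:
    gb_bytes.decode('utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# errors='replace' substitutes '?' for characters the codec cannot represent
ascii_fallback = s.encode('ascii', errors='replace')

print(utf8_bytes, gb_bytes, decode_failed, ascii_fallback)
```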

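Formatted output

format, ljust(10, "*") (pad to a fixed width), rjust, center, zfill (pad with zeros on the left)

'abdc'.ljust(10,'*')  # pad to width 10: the text is left-aligned and the right side is filled with asterisks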

Joining (turning a list or other iterable of strings into one string)

join (called on the separator string)

L = [str(i) for i in range(1,10)]
','.join(L)
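join only accepts strings, which is why the list above is built with str(i). A short sketch of what happens otherwise, with illustrative names:

```python
numbers = [1, 2, 3]

# Joining non-string items raises TypeError
try:
    ','.join(numbers)
    join_failed = False
except TypeError:
    join_failed = True

# Convert each item first, then join
joined = ','.join(str(n) for n in numbers)

# split reverses join for the same separator
round_trip = joined.split(',')

print(join_failed, joined, round_trip)
```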

Character translation

str.maketrans, str.translate, format_map

trans_dict = str.maketrans('abc','123')
str.translate('abccab',trans_dict)
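Beyond the two-argument form shown above, maketrans also accepts a third argument of characters to delete, or a single dict. The sketch below covers those variants and adds a format_map call; the sample data is illustrative.

```python
# Two equal-length strings: map a->1, b->2, c->3
table = str.maketrans('abc', '123')
mapped = 'abccab'.translate(table)

# Third argument: characters to delete
table_del = str.maketrans('abc', '123', 'x')
mapped_del = 'axbxc'.translate(table_del)

# Single dict: values may be replacement strings or None (delete)
table_dict = str.maketrans({'a': 'A', 'b': None})
mapped_dict = 'abc'.translate(table_dict)

# format_map takes a mapping directly instead of keyword arguments
sentence = '{name} weighs {weight}'.format_map({'name': 'Alex', 'weight': 200})

print(mapped, mapped_del, mapped_dict, sentence)
```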

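Summary of format

Presentation types: b, o, d, x, X, c, n (like d, but locale-aware), e, E, f, F, g, G, %
Alignment: <, ^, >, = (padding goes between the sign and the digits)
Width and precision
Nested fields such as {:{}} are numbered automatically in the order of their opening braces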
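Nested replacement fields let the width, precision, fill, or alignment come from arguments. The following is a minimal sketch; the names width, precision, fill, and align are assumptions made for the example.

```python
width = 8
precision = 2

# Automatic numbering: the outer field is 0, the nested fields are 1 and 2
auto = '{:{}.{}f}'.format(3.14159, width, precision)

# Named nested fields read more clearly when several options are combined
named = '{value:{fill}{align}{width}}'.format(value='cat', fill='*', align='^', width=7)

print(repr(auto), repr(named))
```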


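# format: field positions
print("Hello {},your weight is {}.".format('Alex',200)) # automatic numbering, in order
print("Hello {1},your weight is {0}.".format('Alex',200)) # explicit indices (the two values are swapped here)
print("Hello {name},your weight is {weight}.".format(name='Alex',weight=200)) # keyword fields
print("Hello {weight:3.1f},your weight is {0}.".format('Alex',weight=200.55)) # keyword field with a format spec, mixed with a positional field
# format: number presentation types
print("bin: {0:b}, oct: {0:o}, hex: {0:x},int: {0:d}".format(12)) # binary, octal, hexadecimal, decimal
print("flo: {0:.1f} exp: {0:.3e}, per: {0:.3%}".format(10.22))  # fixed-point, scientific notation, percentage
# format: width with alignment and fill: < ^ > =
print("{:5d}".format(12))  # integers are right-aligned by default
print("{:>5d}".format(12)) # right-aligned, padded with spaces on the left
print("{:>05d}".format(12)) # right-aligned, padded with zeros on the left
print("{:^10.3f}".format(12))  # centered, formatted as a float
print("{:<05d}".format(12))  # left-aligned, padded with zeros on the right
print("{:=8.3f}".format(-12.234))  # padding placed between the sign and the digits
# strings have similar built-in methods: ljust, rjust, center
print("{:*^5}".format("cat"))  # centered, padded with asterisks on both sides
print("{:*^5}".format(123))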

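III. Strings and bytes

Encoding and decoding

Encoding: from characters (a str) to bytes, with str.encode()

Decoding: from bytes back to characters (a str), with bytes.decode()

A related pair works on single characters: ord(char) returns a character's code point, and chr(num) returns the character for a code point.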
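To keep code points and bytes apart, the sketch below takes one character through both round trips: ord/chr for the code point, encode/decode for the bytes. UTF-8 is assumed as the encoding for the example.

```python
ch = '中'

# Code point round trip: character <-> integer
code_point = ord(ch)
same_char = chr(code_point) == ch

# Byte round trip: str <-> bytes (the result depends on the encoding chosen)
encoded = ch.encode('utf-8')
byte_values = list(encoded)          # a bytes object is a sequence of integers 0-255
decoded = encoded.decode('utf-8')

print(code_point, hex(code_point), same_char)
print(encoded, byte_values, decoded)
```

The code point is a single number, while the encoded form here is three separate bytes.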

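ASCII and Unicode are two widely used character sets. Encoding maps their characters to bytes, and decoding is the reverse.

ASCII: 128 code points, numbered 0-127 (string.printable holds the 100 printable ones)

for i in range(128):
    print(i, repr(chr(i)))

Unicode: covers the characters of all languages, including Chinese and Japanese

A Python string is a sequence of Unicode code points. Because Unicode covers the characters of every language, it provides one uniform way to represent text; encoding turns a string into a sequence of bytes, which is usually displayed in hexadecimal.

Common encodings: UTF-8 (a Unicode encoding) and GB2312 (a Chinese national standard that predates Unicode, not a Unicode encoding)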

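Similarities and differences between str and bytes methods

Methods are listed alphabetically in three str/bytes column pairs. A blank cell means the type has no method of that name. encode and decode are inverses of each other, and fromhex and hex exist only on bytes; they share rows with str methods only because the lists are alphabetical.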

str	bytes	str	bytes	str	bytes
capitalize	capitalize	isdigit	isdigit	rfind	rfind
casefold		isidentifier		rindex	rindex
center	center	islower	islower	rjust	rjust
count	count	isnumeric		rpartition	rpartition
encode	decode	isprintable		rsplit	rsplit
endswith	endswith	isspace	isspace	rstrip	rstrip
expandtabs	expandtabs	istitle	istitle	split	split
find	find	isupper	isupper	splitlines	splitlines
format	fromhex	join	join	startswith	startswith
format_map	hex	ljust	ljust	strip	strip
index	index	lower	lower	swapcase	swapcase
isalnum	isalnum	lstrip	lstrip	title	title
isalpha	isalpha	maketrans	maketrans	translate	translate
isascii	isascii	partition	partition	upper	upper
isdecimal		replace	replace	zfill	zfill
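A comparison like the table above can be regenerated with dir and set arithmetic. This is a minimal sketch; the helper public_names is hypothetical, and the exact output depends on the Python version in use.

```python
def public_names(tp):
    """Return the names on a type that do not start with an underscore."""
    return {name for name in dir(tp) if not name.startswith('_')}

str_names = public_names(str)
bytes_names = public_names(bytes)

str_only = sorted(str_names - bytes_names)      # methods found only on str
bytes_only = sorted(bytes_names - str_names)    # methods found only on bytes
shared = sorted(str_names & bytes_names)        # method names both types offer

print('str only:  ', str_only)
print('bytes only:', bytes_only)
print('shared:    ', len(shared), 'names')
```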

For an explanation of bytes, see the separate article on bytes.