Chapter 8: Strings in Python
1. The string interning mechanism
1.1 Strings
A string is a basic data type in Python: an immutable sequence of characters.
1.2 What is string interning
- Interning keeps only one copy of each identical, immutable string. Distinct values are stored in the string intern pool. When an equal string is created later, Python does not allocate new memory; it assigns the address of the existing string to the newly created variable.
code demo
"""
String interning
"""
a = 'Python'
b = "Python"
c = '''Python'''
print(a, id(a))  # all three memory addresses are the same
print(b, id(b))
print(c, id(c))
1.3 Cases in which interning applies (interactive mode)
- Strings of length 0 or 1
- Strings that look like identifiers (letters, digits, and underscores only)
- Strings are interned only at compile time, not at run time
- Integers in the range [-5, 256] are cached in a similar way
The intern function in sys forces two equal strings to point to the same object.
PyCharm optimizes strings, so a script run in PyCharm may show identical ids where interactive mode does not.
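The compile-time versus run-time distinction can be tried out with a small sketch. It builds two equal strings at run time (the value 'abc%' is an arbitrary choice that does not look like an identifier) and then passes both through sys.intern. Whether the two uninterned objects share an id is a CPython implementation detail, so no result is claimed for that line.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them.
# The '%' keeps the value from looking like an identifier.
a = ''.join(['abc', '%'])
b = ''.join(['abc', '%'])

print(a == b)  # True: the values are equal
print(a is b)  # identity here is an implementation detail

# sys.intern returns the single pooled copy for a given value.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True
```

Interned strings can be compared by identity, which is why sys.intern is sometimes used for strings that are compared very often, such as dictionary keys.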
2. Common string operations
2.1 String search operations
Common methods
Method | Effect |
---|---|
index() | Returns the position of the first occurrence of the substring; raises ValueError if the substring does not exist |
rindex() | Returns the position of the last occurrence of the substring; raises ValueError if the substring does not exist |
find() | Returns the position of the first occurrence of the substring; returns -1 if the substring does not exist |
rfind() | Returns the position of the last occurrence of the substring; returns -1 if the substring does not exist |
code demo
s = 'hello,hello'
print('1.', s.index('lo'))
print('2.', s.find('lo'))
print('3.', s.rindex('lo'))
print('4.', s.rfind('lo'))
# print('5.', s.index('lo0'))  # raises ValueError
print('6.', s.find('lo0'))
print('7.', s.rfind('lo0'))
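The difference between the two failure modes matters when the substring may be absent. A minimal sketch, where locate is a hypothetical helper name and not part of Python:

```python
s = 'hello,hello'


def locate(text, sub):
    """Return the index of sub in text, or None if it is absent (hypothetical helper)."""
    pos = text.find(sub)
    return None if pos == -1 else pos


print(locate(s, 'lo'))   # 3
print(locate(s, 'lo0'))  # None

# index() signals a missing substring with an exception instead of -1.
try:
    s.index('lo0')
except ValueError as exc:
    print('index() raised ValueError:', exc)
```

Checking for -1 explicitly avoids a common slip: -1 is also a valid index (the last character), so using the raw result of find() to index the string can silently give a wrong answer.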
2.2 Case conversion of strings
Each conversion produces a new string object.
Even when the converted string has the same value as the original, in CPython its id is generally still different.
Method | Effect |
---|---|
upper() | Converts all characters in the string to uppercase |
lower() | Converts all characters in the string to lowercase |
swapcase() | Converts all uppercase letters to lowercase and all lowercase letters to uppercase |
capitalize() | Converts the first character to uppercase and the rest to lowercase |
title() | Converts the first character of each word to uppercase and the remaining letters of each word to lowercase |
code demo
s = 'hello,python'
print('0.', s, id(s))
a = s.upper()
print('1.', a, id(a))
b = s.lower()
print('2.', b, id(b))
s2 = 'hello,Python'
c = s2.swapcase()
print('3.', c)
d = s2.title()
print('4.', d)
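The demo above does not exercise capitalize(), so here is a short sketch comparing it with title() and swapcase() on one input. The sample string is an arbitrary choice for illustration.

```python
s = 'hello,PYTHON world'

print(s.capitalize())  # Hello,python world
print(s.title())       # Hello,Python World
print(s.swapcase())    # HELLO,python WORLD

# title() treats any run of letters as a word, so punctuation starts a new word.
print("they're".title())  # They'Re
```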
2.3 String alignment operations
Method | Effect |
---|---|
center() | Centers the string. The first parameter specifies the width and the second specifies the fill character (optional, default is a space). If the width is not greater than the actual length, the original string is returned |
ljust() | Left-aligns the string. The first parameter specifies the width and the second specifies the fill character (optional, default is a space). If the width is not greater than the actual length, the original string is returned |
rjust() | Right-aligns the string. The first parameter specifies the width and the second specifies the fill character (optional, default is a space). If the width is not greater than the actual length, the original string is returned |
zfill() | Right-aligns the string and pads the left with 0. This method accepts only one parameter, the width. If the width is less than or equal to the length of the string, the original string is returned |
code demo
s = 'hello,Python'
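print('Original:', s)
print('Centered:', s.center(20, '*'))
print('Centered:', s.center(10, '*'))  # width is smaller than len(s), so s is returned unchanged
print('Left-aligned:', s.ljust(20, '*'))
print('Right-aligned:', s.rjust(20, '*'))
print('Zero-filled:', s.zfill(20))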
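A compact sketch of the same methods with results shown via repr so the padding is visible. The sample values are arbitrary; the last line illustrates that zfill() keeps a leading sign in front of the zeros.

```python
s = 'Python'

print(repr(s.center(10)))      # '  Python  '
print(repr(s.ljust(10, '.')))  # 'Python....'
print(repr(s.rjust(10, '.')))  # '....Python'
print(s.center(3, '*'))        # Python (width too small, nothing is added)

print('42'.zfill(6))   # 000042
print('-42'.zfill(6))  # -00042
```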
2.4 Splitting strings
Method | Effect |
---|---|
split() | Splits starting from the left side of the string. The default separator is whitespace, and the return value is a list. The sep parameter specifies the separator and the maxsplit parameter specifies the maximum number of splits. Once maxsplit splits have been made, the rest of the string is kept as a single element |
rsplit() | Splits starting from the right side of the string. The default separator is whitespace, and the return value is a list. The sep parameter specifies the separator and the maxsplit parameter specifies the maximum number of splits. Once maxsplit splits have been made, the rest of the string is kept as a single element |
code demo
s = 'hello world Python'
lst = s.split()
print(lst)
s1 = 'hello|world|Python'
print(s1.split(sep='|'))
print(s1.split(sep='|', maxsplit=1))
print('-------------------------------')
'''rsplit() splits starting from the right'''
print(s.rsplit())
print(s1.rsplit('|'))
print(s1.rsplit(sep='|', maxsplit=1))
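split() and rsplit() only differ once maxsplit cuts the process short. A small sketch with made-up sample strings, which also shows how the default whitespace separator differs from an explicit single space:

```python
path = 'usr|local|lib|python'

print(path.split('|', maxsplit=1))   # ['usr', 'local|lib|python']
print(path.rsplit('|', maxsplit=1))  # ['usr|local|lib', 'python']

messy = '  a  b '
print(messy.split())     # ['a', 'b']
print(messy.split(' '))  # ['', '', 'a', '', 'b', '']
```

With no separator given, runs of whitespace count as one separator and leading or trailing whitespace is dropped; with an explicit ' ', every single space is a boundary, which produces the empty strings.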
2.5 String test operations
Method | Effect |
---|---|
isidentifier() | Determines whether the string is a legal identifier |
isspace() | Determines whether the string consists entirely of whitespace characters (such as carriage return, line feed, horizontal tab) |
isalpha() | Determines whether the string consists entirely of letters |
isdecimal() | Determines whether the string consists entirely of decimal digits |
isnumeric() | Determines whether the string consists entirely of numeric characters |
isalnum() | Determines whether the string consists entirely of letters and digits |
code demo
s = 'abc%'
s1 = 'hellopython'
print(s.isidentifier()) # False
print(s1.isidentifier()) # True
print('\t'.isspace()) # True
print('abc'.isalpha()) # True
print('abc1'.isalpha()) # False
print('张三'.isalpha()) # True
print('123'.isdecimal()) # True
print('123四'.isdecimal()) # False
print('123'.isnumeric()) # True
print('123四'.isnumeric()) # True
print('IIIIIIIV'.isnumeric()) # False
print('abc123'.isalnum()) # True
print('123张'.isalnum()) # True
print('123!'.isalnum()) # False
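The gap between isdecimal() and isnumeric() is easiest to see side by side. In this sketch the sample characters are my own choices: '\u00bd' is the fraction one half and '\u2163' is the single-character Roman numeral four, while 'IV' is just two ASCII letters.

```python
samples = ['123', '四', '\u00bd', '\u2163', 'IV']

for text in samples:
    # Columns: the text, isdecimal(), isnumeric()
    print(repr(text), text.isdecimal(), text.isnumeric())

results = {text: (text.isdecimal(), text.isnumeric()) for text in samples}
```

Only '123' passes isdecimal(). The Chinese numeral, the fraction, and the Roman numeral character pass isnumeric() alone, and the ASCII letters 'IV' pass neither, which matches the False result for 'IIIIIIIV' in the demo above.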
2.6 Other common string operations
Function | Method | Effect |
---|---|---|
String replacement | replace() | The first parameter specifies the substring to be replaced, and the second specifies the string to replace it with. The method returns the string obtained after replacement; the original string does not change. An optional third parameter specifies the maximum number of replacements |
String joining | join() | Combines the strings in a list or tuple into one string |
code demo
s = 'hello,Python'
print(s.replace('Python', 'Java'))
s1 = 'hello,Python,Python,Python'
print(s1.replace('Python', 'Java', 2))
lst = ['hello', 'java', 'Python']
print('|'.join(lst))
print(''.join(lst))
t = ('hello', 'Java', 'Python')
print(''.join(t))
print('*'.join('Python'))
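join() only accepts strings, so other element types have to be converted first. A minimal sketch with an arbitrary list of integers:

```python
numbers = [1, 2, 3]

# Joining non-string items fails.
try:
    ','.join(numbers)
except TypeError as exc:
    print('join() raised TypeError:', exc)

# Convert each item to str first.
joined = ','.join(str(n) for n in numbers)
print(joined)             # 1,2,3
print(joined.split(','))  # ['1', '2', '3']

# The third argument of replace() limits the number of replacements.
print('a-b-c'.replace('-', '+', 1))  # a+b-c
```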
3. Comparing strings
- Operators
  > >= < <= == !=
- Comparison rules
  - The first characters of the two strings are compared first. If they are equal, the next characters are compared, and so on, until two characters differ. The result of comparing those two characters is the result of comparing the two strings, and the remaining characters are not compared. If one string is a prefix of the other, the longer string is the greater one.
- Comparison principle
  - When two characters are compared, their ordinal values (code points) are compared. The built-in function ord returns the ordinal value of a given character.
  - The counterpart of ord is the built-in function chr, which returns the character for a given ordinal value.
code demo
print('apple' > 'app')  # True
print('apple' > 'banana')  # False, equivalent to 97 > 98, which is False
print(ord('a'), ord('b'))
print(ord('魏'))
print(chr(97), chr(98))
print(chr(39759))
'''
Difference between == and is
== compares whether the values are equal
is compares whether the ids are equal
'''
a = b = 'Python'
c = 'Python'
print(a == b) # True
print(b == c) # True
print(a is b) # True
print(a is c) # True
print(id(a)) # 2204259933168
print(id(b)) # 2204259933168
print(id(c)) # 2204259933168
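The comparison rules can be written out by hand to check the reasoning. The function less_than below is a hypothetical helper for illustration, not how Python implements the operator internally; it walks both strings, compares the first differing pair with ord, and falls back to length when one string is a prefix of the other.

```python
def less_than(a, b):
    """Hand-written version of a < b for strings (illustrative helper)."""
    for x, y in zip(a, b):
        if x != y:
            # The first differing pair decides the result.
            return ord(x) < ord(y)
    # No difference found: the shorter string (a prefix) is the smaller one.
    return len(a) < len(b)


pairs = [('apple', 'app'), ('apple', 'banana'), ('Zebra', 'apple'), ('abc', 'abc')]
for a, b in pairs:
    print(a, b, less_than(a, b), a < b)
```

Note that 'Zebra' < 'apple' is True because uppercase letters have smaller ordinal values than lowercase ones (ord('Z') is 90, ord('a') is 97).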
4. String slicing
Strings are an immutable type:
They do not support adding, deleting, or modifying characters in place
Slicing produces new objects
code demo
s = 'hello,Python'
s1 = s[:5]  # no start position is given, so the slice starts at 0
s2 = s[6:]  # no end position is given, so the slice runs to the last element of the string
s3 = '!'
newstr = s1 + s3 + s2
print(s1)
print(s2)
print(newstr)
print('--------------------')
print(id(s))
print(id(s1))
print(id(s2))
print(id(s3))
print(id(newstr))
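print('------------------slice[start:end:step]-------------------------')
print(s[1:5:1])  # from index 1 up to 5 (5 excluded), step 1
print(s[::2])  # starts at 0 by default, no end given so it runs to the last element, step 2 (indices are 2 apart)
print(s[::-1])  # the step is negative, so by default it starts at the last element and ends at the first
print(s[-6::1])  # from index -6 to the last element of the string, step 1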
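Two consequences of the points above are worth trying directly: assignment to an index fails because strings are immutable, and a slice with out-of-range bounds returns an empty or shortened string instead of raising. A minimal sketch:

```python
s = 'hello,Python'

# Item assignment is not supported on str.
try:
    s[0] = 'H'
except TypeError as exc:
    print('TypeError:', exc)

# Build a modified copy from slices instead.
capitalized = 'H' + s[1:]
print(capitalized)  # Hello,Python

# Slices tolerate out-of-range bounds; plain indexing does not.
print(repr(s[100:]))  # ''
print(s[-6:])         # Python
print(s[::-1])        # nohtyP,olleh
```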
5. Formatting strings
Three ways to format strings
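code demo
"""Method 1: % placeholders"""
name = '小米'
age = 20
print('My name is %s and I am %d years old' % (name, age))
"""Method 2: {} placeholders"""
print('My name is {0} and I am {1} years old'.format(name, age))
"""Method 3: f-strings"""
print(f'My name is {name} and I am {age} years old')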
Width and precision
print('%10d' % 99)  # 10 is the width
print('%f' % 3.1415926)
# keep three decimal places
print('%.3f' % 3.1415926)
# set width and precision together: total width 10, three decimal places
print('%10.3f' % 3.1415926)
print('{0}'.format(3.1415936))
print('{0:.3}'.format(3.1415936))  # .3 means three significant digits in total
print('{0:.3f}'.format(3.1415936))  # .3f means three decimal places
# set width and precision together: total width 10, three decimal places
print('{0:10.3f}'.format(3.1415926))
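The width and precision demo covers % and format(); the same format specification also works inside an f-string. A short sketch, where the variable names pi, width, and places are my own:

```python
pi = 3.1415926
width, places = 10, 3

a = '%10.3f' % pi
b = '{0:10.3f}'.format(pi)
c = f'{pi:10.3f}'
print(repr(a), repr(b), repr(c))  # all three are '     3.142'

# Width and precision can themselves come from variables.
d = f'{pi:{width}.{places}f}'
print(repr(d))  # '     3.142'

print(f'{pi:.3}')  # 3.14 (three significant digits, not three decimals)
```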
6. String encoding conversion
- Why string encoding conversion is needed
- How to encode and decode
  - Encoding: convert a string to binary data (bytes)
  - Decoding: convert bytes data into a string
This will be applied in the web-crawler part.
code demo
s = '天涯共此时'
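# Encode
print(s.encode(encoding='GBK'))  # in GBK, one Chinese character takes 2 bytes
print(s.encode(encoding='UTF-8'))  # in UTF-8, one Chinese character takes 3 bytes
# Decode (the decoding format must be the same as the encoding format)
# byte holds binary data (a bytes object)
byte = s.encode(encoding='GBK')
print(byte.decode(encoding='GBK'))
# print(byte.decode(encoding='UTF-8'))  # raises UnicodeDecodeError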
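The byte counts and the mismatched-decoding error can be checked with a short sketch. It reuses the sample string from the demo; the variable names gbk and utf8 are my own.

```python
s = '天涯共此时'

gbk = s.encode('GBK')
utf8 = s.encode('UTF-8')
print(len(s), len(gbk), len(utf8))  # 5 10 15

# Decoding with the matching codec round-trips the text.
print(gbk.decode('GBK') == s)  # True

# Decoding with the wrong codec raises UnicodeDecodeError.
try:
    gbk.decode('UTF-8')
except UnicodeDecodeError as exc:
    print('decode failed:', exc.reason)

# errors='replace' substitutes U+FFFD instead of raising; the text is still wrong.
lossy = gbk.decode('UTF-8', errors='replace')
print(lossy)
```

Using errors='replace' only hides the failure, so it suits logging or display of untrusted input better than data that must be preserved exactly.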