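Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/YandiLu/article/details/84753729
I. Removing duplicates from a list
1. Deduplicating with a loop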
list_1 = [5,5,1,4,4,6,7,8,1]
new_list = []
for i in list_1:
    # Append a value only the first time it is seen
    if i not in new_list:
        new_list.append(i)
print(new_list)
Result: [5, 1, 4, 6, 7, 8]. The elements keep the order in which they first appeared in the original list.
2. Deduplicating with set()
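The same order-preserving result can also be obtained without an explicit loop. The following is a minimal sketch, assuming Python 3.7 or later (where dict keeps insertion order) and hashable list elements. The helper name dedup_keep_order is hypothetical and does not come from the original post.

```python
def dedup_keep_order(items):
    """Return a new list without duplicates, keeping first occurrences in order."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order
    return list(dict.fromkeys(items))


list_1 = [5, 5, 1, 4, 4, 6, 7, 8, 1]
new_list = dedup_keep_order(list_1)
print(new_list)  # [5, 1, 4, 6, 7, 8]
```

The loop version tests membership against a list on every iteration, so it slows down noticeably on long lists; the dict-based version avoids that repeated scan.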
list_1 = [5,5,1,4,4,6,7,8,1]
new_list = list(set(list_1))
print(new_list)
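Result: [1, 4, 5, 6, 7, 8]. The output looks sorted here, but that is only a side effect of how CPython stores small integers in a set. A set is unordered, so the original order is lost and a sorted result is not guaranteed.
II. Removing duplicates from a DataFrame
1. Deduplicating with unique()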
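Since a set makes no promise about iteration order, it is safer to request the desired order explicitly. The sketch below reuses list_1 and shows two options: sorted() for a guaranteed ascending result, and sorting by each value's first index to recover the original order. The names sorted_list and original_order_list are introduced here only for illustration.

```python
list_1 = [5, 5, 1, 4, 4, 6, 7, 8, 1]

# Guaranteed ascending order, regardless of how the set iterates
sorted_list = sorted(set(list_1))
print(sorted_list)  # [1, 4, 5, 6, 7, 8]

# Recover the order of first appearance by sorting on each value's first index
original_order_list = sorted(set(list_1), key=list_1.index)
print(original_order_list)  # [5, 1, 4, 6, 7, 8]
```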
import pandas as pd
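data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6], 'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})
# Unique values of the name column, in order of first appearance
print(data['name'].unique())
# Alternative with NumPy (returns the unique values sorted):
# import numpy as np
# np.unique(data.score)
2. Deduplicating with frame.drop_duplicates()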
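To make the two calls above concrete, here is a small sketch on the same data. It relies on Series.unique() returning values in order of first appearance and np.unique() returning them sorted. The nunique() call and the variable names are additions for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6],
                     'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})

# Unique names, in the order they first appear in the column
names = data['name'].unique()
print(list(names))  # ['Tom', 'John', 'june']

# Unique scores via NumPy, returned in ascending order
scores_sorted = np.unique(data['score'])
print(scores_sorted.tolist())  # [1, 2, 3, 5, 6]

# Number of distinct names
n_names = data['name'].nunique()
print(n_names)  # 3
```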
import pandas as pd
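data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6], 'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})
# Each call returns a new DataFrame and, by default, keeps the first occurrence of each duplicate
print(data.drop_duplicates(['name']))
print(data.drop_duplicates(['score']))
print(data.drop_duplicates(['name', 'score']))
The three results are as follows: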
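To check these results locally, the rows kept by each call can be inspected through their index labels. This is only a sketch on the same data; the variable names (by_name, by_score, by_both, last_by_name) and the keep='last' variation are additions for illustration and do not appear in the original post.

```python
import pandas as pd

data = pd.DataFrame({'score': [1, 2, 3, 1, 5, 6],
                     'name': ['Tom', 'John', 'june', 'Tom', 'John', 'june']})

# Rows are compared only on the listed columns; the first occurrence is kept
by_name = data.drop_duplicates(['name'])
by_score = data.drop_duplicates(['score'])
by_both = data.drop_duplicates(['name', 'score'])

print(by_name.index.tolist())   # [0, 1, 2]
print(by_score.index.tolist())  # [0, 1, 2, 4, 5]
print(by_both.index.tolist())   # [0, 1, 2, 4, 5]

# keep='last' retains the last occurrence of each name instead of the first
last_by_name = data.drop_duplicates(['name'], keep='last')
print(last_by_name.index.tolist())  # [3, 4, 5]
```

Note that drop_duplicates() leaves data itself unchanged unless inplace=True is passed, so the result has to be assigned to a variable to be reused.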