Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/winycg/article/details/82803106
Removing duplicate rows:
>>> import pandas as pd
>>> a = pd.DataFrame({'a':[1,1,2], 'b':[1,1,3]})
>>> a
   a  b
0  1  1
1  1  1
2  2  3
>>> a.drop_duplicates()  # returns a new DataFrame; a itself is unchanged
   a  b
0  1  1
2  2  3
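As a minimal sketch of how the call above behaves, the following example checks that `drop_duplicates()` leaves the original frame untouched and tries the `subset` and `keep` parameters. The variable names (`df`, `deduped`, `mask`, `last_kept`, `renumbered`) are illustrative and not part of the original post.

```python
import pandas as pd

# Same data as in the session above.
df = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 1, 3]})

# drop_duplicates() returns a new DataFrame; df keeps all three rows.
deduped = df.drop_duplicates()

# duplicated() flags every row that repeats an earlier row.
mask = df.duplicated()

# subset restricts the comparison to the listed columns;
# keep='last' retains the last occurrence instead of the first.
last_kept = df.drop_duplicates(subset=['a'], keep='last')

# The surviving rows keep their original index labels (0 and 2 here);
# reset_index(drop=True) renumbers them from 0 if that is preferred.
renumbered = deduped.reset_index(drop=True)

print(deduped)
print(mask.tolist())
print(last_kept)
print(renumbered)
```

To modify the frame itself rather than work on a copy, assign the result back, for example `df = df.drop_duplicates()`.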
Summary statistics (computed on the deduplicated data):
>>> a.drop_duplicates().describe()
              a         b
count  2.000000  2.000000
mean   1.500000  2.000000
std    0.707107  1.414214
min    1.000000  1.000000
25%    1.250000  1.500000
50%    1.500000  2.000000
75%    1.750000  2.500000
max    2.000000  3.000000
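The `count` of 2 in the output above corresponds to the two rows left after removing duplicates; the original three-row frame would report a count of 3. The sketch below compares the two cases and checks that `std` is the sample standard deviation (divisor n - 1). The variable names are illustrative and not taken from the original post.

```python
import math

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 1, 3]})
deduped = df.drop_duplicates()

# describe() on the original frame counts all three rows.
stats_all = df.describe()

# describe() on the deduplicated frame counts only the two unique rows.
stats_dedup = deduped.describe()

print(stats_all.loc['count'])
print(stats_dedup.loc['count'])

# std is the sample standard deviation (ddof=1):
# for b = [1, 3], sqrt(((1 - 2)**2 + (3 - 2)**2) / (2 - 1)) = sqrt(2).
print(math.isclose(stats_dedup.loc['std', 'b'], math.sqrt(2)))
```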