Copyright notice: https://blog.csdn.net/Dorothy_Xue/article/details/83931954
1. .groupby()
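Lets you slice, dice, and summarize a dataset in a natural way. It splits a pandas object by one or more keys (which can be functions, arrays, or DataFrame column names).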
>>>import numpy as np
>>>import pandas as pd
>>>df=pd.DataFrame({'key1':['a','a','b','b','a'],
...    'key2':['one','two','one','two','one'],
...    'data1':np.random.randn(5),
...    'data2':np.random.randn(5)})
>>>df
      data1     data2 key1 key2
0 -0.410673  0.519378    a  one
1 -2.120793  0.199074    a  two
2  0.642216 -0.143671    b  one
3  0.975133 -0.592994    b  two
4 -1.017495 -0.530459    a  one
# Group by key1 and compute the mean of the data1 column
>>>grouped=df['data1'].groupby(df['key1'])
>>>grouped.mean()
key1
a -1.182987
b 0.808674
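# Group by key1 and key2 together: pass the keys as a list, and call mean() (not means())
>>>means=df['data1'].groupby([df['key1'],df['key2']]).mean()
>>>means
key1  key2
a     one    -0.714084
      two    -2.120793
b     one     0.642216
      two     0.975133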
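
The keys do not have to be Series pulled out of the DataFrame. As a rough sketch, grouping by column name and by a function might look like the following. The fixed numbers here are made up so the results can be checked by hand (they are not the random data used earlier), and the variable names are only for illustration.

```python
import pandas as pd

# Small fixed dataset (illustrative values, not the random data from the post).
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['one', 'two', 'one', 'two', 'one'],
    'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
    'data2': [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Group by a column name instead of passing the Series itself.
by_name = df.groupby('key1')['data1'].mean()

# Group by several column names; the result has a two-level index.
by_two = df.groupby(['key1', 'key2'])['data1'].mean()

# Group by a function: it is called on each index label (0 to 4 here),
# so even and odd row labels end up in separate groups.
by_func = df['data1'].groupby(lambda i: i % 2).sum()

print(by_name)
print(by_two)
print(by_func)
```

Passing column names is usually shorter than passing `df['key1']`, and it gives the same groups.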
2. .groupby().apply()
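First split the data into groups, then apply the function passed to apply to each group; pandas combines the per-group results into a single object.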
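
A minimal sketch of this split-then-apply pattern, assuming a small made-up dataset. The helpers `value_range` and `top_n` are hypothetical names introduced only for this example; they are not part of pandas.

```python
import pandas as pd

# Illustrative data (made up for this example).
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
})

def value_range(s):
    # Hypothetical helper: spread between the largest and smallest value in a group.
    return s.max() - s.min()

def top_n(s, n=2):
    # Hypothetical helper: the n largest values in a group.
    return s.nlargest(n)

# apply() calls the function once per group and combines the results.
spread = df.groupby('key1')['data1'].apply(value_range)

# Extra keyword arguments given to apply() are forwarded to the function.
top2 = df.groupby('key1')['data1'].apply(top_n, n=2)

print(spread)
print(top2)
```

When the function returns a scalar, the result has one row per group; when it returns a Series, the group key becomes an extra index level.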
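3. .loc[] and .iloc[]
loc: selects rows by row label (note that both are indexers used with square brackets, not methods called with parentheses)
iloc: selects rows by integer position (row number, starting from 0)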
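
The difference is easiest to see when the row labels are not integers. A small sketch, assuming a made-up DataFrame with string labels:

```python
import pandas as pd

# Illustrative DataFrame with string row labels (made up for this example).
df = pd.DataFrame(
    {'data1': [1.0, 2.0, 3.0]},
    index=['x', 'y', 'z'],
)

row_by_label = df.loc['y']   # the row whose label is 'y'
row_by_pos = df.iloc[1]      # the second row (positions start at 0)

# Label slices include both endpoints; positional slices exclude the end.
label_slice = df.loc['x':'y']   # rows 'x' and 'y'
pos_slice = df.iloc[0:1]        # row 'x' only

print(row_by_label)
print(label_slice)
print(pos_slice)
```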
For details, see the following blog post:
https://blog.csdn.net/hecongqing/article/details/61927615
To be continued