Learning content
Grouped computation in pandas
Tips: the steps of a grouped aggregation
1) Split (group)
2) Apply
3) Combine
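The three steps above can be sketched by hand and compared with the built-in aggregation. This is only an illustration: the column names "key" and "value" and the sample data are assumptions, not taken from the notes.

```python
import pandas as pd

# Illustrative data (hypothetical, not from the original notes)
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})

# 1) Split: iterating a groupby yields (group key, sub-DataFrame) pairs
pieces = {key: part for key, part in df.groupby("key")}

# 2) Apply: reduce each piece to a single number
sums = {key: part["value"].sum() for key, part in pieces.items()}

# 3) Combine: collect the per-group results into one Series
manual = pd.Series(sums, name="value")

# The built-in aggregation performs all three steps in one call
builtin = df.groupby("key")["value"].sum()

print(manual.to_dict())
print(builtin.to_dict())
```

Both approaches should give the same per-group totals; the manual version just makes each stage visible.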
Key points
1. Basic grouping
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(10, 20, (4, 2)),
                  index=['a', 'b', 'c', 'd'],
                  columns=["one", "two"])
print(df)
# Grouping with a key like this can only group by a column's values
print(df["one"].groupby(df['two']))
   one  two
a   11   12
b   16   14
c   15   16
d   12   13
<pandas.core.groupby.generic.SeriesGroupBy object at 0x00000297773CDC50>
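# Mean of "one" for each distinct value of "two"; df is random, so keys and values differ from run to run
print(df["one"].groupby(df['two']).mean())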
two
11 14
14 16
15 15
17 14
Name: one, dtype: int32
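As a hedged aside, the same grouping can usually be written by passing the column name to the DataFrame's groupby and then selecting the column to aggregate. The sketch below uses a fixed random seed (an assumption added here so the run is repeatable) and checks that the two spellings agree.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption for repeatability
df = pd.DataFrame(rng.integers(10, 20, (4, 2)),
                  index=['a', 'b', 'c', 'd'],
                  columns=["one", "two"])

# Key passed as a Series, as in the example above
by_series = df["one"].groupby(df["two"]).mean()

# Key passed as a column name, then the column to aggregate is selected
by_name = df.groupby("two")["one"].mean()

print(by_series)
print(by_series.equals(by_name))
```

The second form is often easier to read when the key is a column of the same DataFrame.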
2. Grouping with a dictionary
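import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(10, 20, (4, 3)),
                  index=['a', 'b', 'c', 'd'],
                  columns=["one", "two", 'three'])
df.iloc[1, 1:3] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works everywhere
# Each row label is looked up in the dictionary to get its group key
mapping = {'a': 'red', 'b': 'red', 'c': 'blue', 'd': 'white'}
grouped = df.groupby(mapping)  # groups rows (axis=0), which is the default
print(grouped.sum())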
       one   two  three
blue    15  10.0   10.0
red     35  12.0   18.0
white   17  13.0   12.0
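To make the result easier to check by hand, here is a small sketch of the same dictionary grouping with fixed values instead of random ones. The numbers are made up for illustration; the mapping is the one used above.

```python
import numpy as np
import pandas as pd

# Fixed illustrative values (hypothetical), with one missing entry in row 'b'
df = pd.DataFrame({"one": [1, 2, 3, 4],
                   "two": [10.0, np.nan, 30.0, 40.0]},
                  index=['a', 'b', 'c', 'd'])
mapping = {'a': 'red', 'b': 'red', 'c': 'blue', 'd': 'white'}

# Rows 'a' and 'b' both map to 'red', so they are summed together
result = df.groupby(mapping).sum()
print(result)
```

Rows 'a' and 'b' land in the same 'red' group, and the missing value in row 'b' is skipped by sum() rather than turning the group total into NaN.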
3. Grouping with a function
import pandas as pd
import numpy as np
def group_by(idx):
    # Called once for each row label; the return value is used as the group key
    print(idx)
    return idx
df = pd.DataFrame(np.random.randint(10, 20, (4, 3)),
                  index=['a', 'b', 'c', 'd'],
                  columns=["one", "two", 'three'])
print(df)
print(df.groupby(group_by).size())
   one  two  three
a   17   10     14
b   10   16     10
c   12   16     14
d   17   17     13
a
b
c
d
a 1
b 1
c 1
d 1
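In the example above the function returns each label unchanged, so every row ends up in its own group. A sketch with a function that maps several labels to the same key may show the idea more clearly. The labels, the values, and the helper name first_letter are assumptions made for illustration.

```python
import pandas as pd

# Illustrative data (hypothetical labels and values)
df = pd.DataFrame({"one": [1, 2, 3, 4]},
                  index=["apple", "avocado", "banana", "blueberry"])

def first_letter(label):
    # Called once per index label; labels sharing a first letter share a group
    return label[0]

sizes = df.groupby(first_letter).size()          # number of rows per group
totals = df.groupby(first_letter)["one"].sum()   # sum of "one" per group

print(sizes)
print(totals)
```

Here "apple" and "avocado" fall into group "a", and "banana" and "blueberry" into group "b".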
4. Grouping by index level
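df.groupby(level="key", axis= )
# level can name either the first (outer) or the second (inner) index level. The axis must match the index that the level belongs to (axis=0 for a level of the row index, axis=1 for a level of the column index); otherwise the grouping fails.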
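A minimal sketch of level-based grouping on a two-level row index follows. The level names "region" and "item" and the data are assumptions chosen for illustration; the axis argument is left at its default because the levels belong to the row index.

```python
import pandas as pd

# Two-level row index with hypothetical level names
index = pd.MultiIndex.from_tuples(
    [("north", "a"), ("north", "b"), ("south", "a"), ("south", "b")],
    names=["region", "item"])
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=index)

# Group by the first (outer) level, referenced by name
by_region = df.groupby(level="region").sum()

# Group by the second (inner) level, referenced by position
by_item = df.groupby(level=1).sum()

print(by_region)
print(by_item)
```

A level can be given either by name or by integer position, so level="item" and level=1 refer to the same level in this sketch.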