Pandas DataFrame Tutorial

1. Creating a DataFrame

>>> import pandas as pd
>>> val = [[1,3,3,4],[5,6,7,8],[1,1,1,1],[2,3,2,3]]
>>> cols = ['A','B','C','D']
>>> indx= ['i1','i2','i3','i4']
>>> df = pd.DataFrame(val,columns=cols)              # create a DataFrame with the default integer index
>>> print(df)
   A  B  C  D
0  1  3  3  4
1  5  6  7  8
2  1  1  1  1
3  2  3  2  3
>>> df = pd.DataFrame(val,columns=cols,index=indx)  # create a DataFrame with explicit row labels
>>> df
    A  B  C  D
i1  1  3  3  4
i2  5  6  7  8
i3  1  1  1  1
i4  2  3  2  3
>>> print(df.values)                                 # convert the DataFrame to a NumPy array
[[1 3 3 4]
 [5 6 7 8]
 [1 1 1 1]
 [2 3 2 3]]
>>> print(df[['A','B']].values)
[[1 3]
 [5 6]
 [1 1]
 [2 3]]
>>> df.columns                                        # get the column labels; tolist() converts them to a list
Index(['A', 'B', 'C', 'D'], dtype='object')
>>> df.columns.tolist()
['A', 'B', 'C', 'D']
>>> df.index                                          # get the row labels; tolist() converts them to a list
Index(['i1', 'i2', 'i3', 'i4'], dtype='object')
>>> df.index.tolist()
['i1', 'i2', 'i3', 'i4']
>>>
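
A DataFrame can also be built column by column from a dict, which is often easier to read than a list of rows. The sketch below rebuilds the same table that way; the name df_from_dict is only illustrative. It also uses to_numpy(), which the pandas documentation recommends over .values for converting to an array.

```python
import pandas as pd

# Same data as above, organised by column instead of by row.
data = {
    'A': [1, 5, 1, 2],
    'B': [3, 6, 1, 3],
    'C': [3, 7, 1, 2],
    'D': [4, 8, 1, 3],
}
df_from_dict = pd.DataFrame(data, index=['i1', 'i2', 'i3', 'i4'])
print(df_from_dict)

# to_numpy() converts the DataFrame to a 2-D NumPy array.
arr = df_from_dict.to_numpy()
print(arr.shape)
```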

2. Iterating over a DataFrame

>>> for row in df.index:
...     print(df.loc[row, ['A','B']])               # iterate over the rows, selecting columns A and B at the same time
...
A    1
B    3
Name: i1, dtype: int64
A    5
B    6
Name: i2, dtype: int64
A    1
B    1
Name: i3, dtype: int64
A    2
B    3
Name: i4, dtype: int64
>>>
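
Looking up each row with loc inside a loop works, but itertuples() is a common alternative that yields one namedtuple per row. The following is a minimal sketch on the same data; the list name pairs is only for illustration.

```python
import pandas as pd

val = [[1, 3, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1], [2, 3, 2, 3]]
df = pd.DataFrame(val, columns=['A', 'B', 'C', 'D'],
                  index=['i1', 'i2', 'i3', 'i4'])

pairs = []
# Select columns A and B first, then iterate row by row.
# Each row is a namedtuple; row.Index holds the row label.
for row in df[['A', 'B']].itertuples():
    pairs.append((row.Index, row.A, row.B))
    print(row.Index, row.A, row.B)
```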

3. Querying and indexing a DataFrame

>>> df
    A  B  C  D
i1  1  3  3  4
i2  5  6  7  8
i3  1  1  1  1
i4  2  3  2  3
>>> df.loc['i1','A']             # loc takes [row, column] separated by a comma; both are labels
1
>>> df.loc['i1':'i3','A':'C']    # label slices; both endpoints are included
    A  B  C
i1  1  3  3
i2  5  6  7
i3  1  1  1
>>>
>>> df.loc[['i1','i3'],['A','C']]  # multiple labels, passed as lists
    A  C
i1  1  3
i3  1  1
>>> df
    A  B  C  D
i1  1  3  3  4
i2  5  6  7  8
i3  1  1  1  1
i4  2  3  2  3
>>> df.iloc[1,2]                   # iloc indexes by integer row and column position
7
>>> df.iloc[0:2,1:2]               # position slices; the end position is excluded
    B
i1  3
i2  6
>>> df.iloc[[0,1],[1,2]]           # multiple positions, passed as lists
    B  C
i1  3  3
i2  6  7
>>>
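
The two slicing styles differ in how they treat the end of a slice: loc includes the end label, while iloc excludes the end position. The sketch below selects the same block both ways and then filters rows with a boolean mask, which is the usual way to select rows by column value. The variable names are illustrative.

```python
import pandas as pd

val = [[1, 3, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1], [2, 3, 2, 3]]
df = pd.DataFrame(val, columns=['A', 'B', 'C', 'D'],
                  index=['i1', 'i2', 'i3', 'i4'])

# Label slice: rows i1 through i2 and columns B through C, ends included.
by_label = df.loc['i1':'i2', 'B':'C']
# Position slice: rows 0-1 and columns 1-2, end positions excluded.
by_position = df.iloc[0:2, 1:3]
print(by_label.equals(by_position))

# Boolean mask: keep only the rows where column A equals 1.
rows_a1 = df[df['A'] == 1]
print(rows_a1)
```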

4. Modifying a DataFrame

>>> df
    A  B  C  D
i1  1  3  3  4
i2  5  6  7  8
i3  1  1  1  1
i4  2  3  2  3
>>> df['A']=100                   # overwrite an entire column
>>> df
      A  B  C  D
i1  100  3  3  4
i2  100  6  7  8
i3  100  1  1  1
i4  100  3  2  3
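>>> df = pd.DataFrame(val,columns=cols,index=indx)  # reset to the original data
>>> df.loc['i1','A']=222           # overwrite a single value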
>>> df
      A  B  C  D
i1  222  3  3  4
i2    5  6  7  8
i3    1  1  1  1
i4    2  3  2  3
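>>> df = pd.DataFrame(val,columns=cols,index=indx)  # reset to the original data
>>> df.loc[df['A']==1,'B'] = 10    # set B to 10 in every row where A equals 1
>>> df
    A   B  C  D
i1  1  10  3  4
i2  5   6  7  8
i3  1  10  1  1
i4  2   3  2  3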
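
Conditional updates can combine several conditions with & (and) or | (or), as long as each condition is wrapped in parentheses. The sketch below works on a copy so the original stays untouched; the names updated and mask are only illustrative.

```python
import pandas as pd

val = [[1, 3, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1], [2, 3, 2, 3]]
df = pd.DataFrame(val, columns=['A', 'B', 'C', 'D'],
                  index=['i1', 'i2', 'i3', 'i4'])

# Work on a copy so that df itself is not modified.
updated = df.copy()

# Rows where A equals 1 AND C is greater than 1 (only i1 qualifies).
mask = (updated['A'] == 1) & (updated['C'] > 1)
updated.loc[mask, 'B'] = 10
print(updated)
```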

5. Adding rows to a DataFrame

>>> df = pd.DataFrame(val,columns=cols)              # recreate df with the default integer index
>>> df
   A  B  C  D
0  1  3  3  4
1  5  6  7  8
2  1  1  1  1
3  2  3  2  3
>>> temp = pd.DataFrame(val,columns=cols)            # a second DataFrame with the same contents
>>> temp
   A  B  C  D
0  1  3  3  4
1  5  6  7  8
2  1  1  1  1
3  2  3  2  3
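>>> # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
>>> result = pd.concat([df,temp],ignore_index=True)  # append rows; ignore_index=True renumbers the index sequentially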
>>> result
   A  B  C  D
0  1  3  3  4
1  5  6  7  8
2  1  1  1  1
3  2  3  2  3
4  1  3  3  4
5  5  6  7  8
6  1  1  1  1
7  2  3  2  3
>>> df.loc[4]=[7,8,9,10]                             # assigning to a new index label appends a row
>>> df
   A  B  C   D
0  1  3  3   4
1  5  6  7   8
2  1  1  1   1
3  2  3  2   3
4  7  8  9  10
>>>
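
When several rows need to be added, one common approach is to collect them in a list and combine everything with a single pd.concat call, since growing a DataFrame one row at a time can be slow for large inputs. This is a minimal sketch; the values in new_rows are made up for illustration.

```python
import pandas as pd

val = [[1, 3, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1], [2, 3, 2, 3]]
df = pd.DataFrame(val, columns=['A', 'B', 'C', 'D'])

# Hypothetical new rows, one dict per row keyed by column name.
new_rows = [
    {'A': 7, 'B': 8, 'C': 9, 'D': 10},
    {'A': 11, 'B': 12, 'C': 13, 'D': 14},
]

# Build one DataFrame from the new rows and concatenate once.
result = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
print(result)
```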

Learning pandas: sorting Series and DataFrames: https://blog.csdn.net/u014662865/article/details/59058039

Selecting rows from a DataFrame based on the values in a column: https://www.cnblogs.com/to-creat/p/7724562.html
