How exactly is axis defined in Python? Does it really refer to the rows or the columns of a DataFrame? Consider the following code:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
...                   columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3
If we call df.mean(axis=1), we get the row-wise mean:
>>> df.mean(axis=1)
0    1.0
1    2.0
2    3.0
dtype: float64
However, if we call df.drop(name, axis=1), we actually drop a column, not a row:
>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3
What does the axis parameter really mean in pandas, NumPy, and SciPy?
The essence of the problem:
The confusion comes from a misreading of what axis refers to. df.mean(axis=1) takes the mean across all the columns within each row; it does not keep one mean per column. A simple way to remember this is that axis=0 means moving along the rows (down), and axis=1 means moving along the columns (across). The axis acts like an adverb describing the direction in which the method is applied.
In other words:
- Use axis=0 to apply the method down the row labels (the index), within each column
- Use axis=1 to apply the method across the column labels, within each row
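The two rules above can be checked with a short sketch that reuses the DataFrame from the question. The variable names col_means and row_means are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
                  columns=["col1", "col2", "col3", "col4"])

# axis=0: move down the rows, so the rows are collapsed
# and one value remains per column.
col_means = df.mean(axis=0)

# axis=1: move across the columns, so the columns are collapsed
# and one value remains per row.
row_means = df.mean(axis=1)

print(col_means.tolist())  # [2.0, 2.0, 2.0, 2.0]
print(row_means.tolist())  # [1.0, 2.0, 3.0]
```

Note that the result of axis=0 is indexed by the column labels, while the result of axis=1 is indexed by the row labels.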
The following figure shows the meaning of axis 0 and axis 1 in a DataFrame:
![](https://upload-images.jianshu.io/upload_images/2233157-b77105789e36c847.jpg?imageMogr2/auto-orient/strip%7CimageView2/2/w/652)
Also, keep in mind that pandas follows NumPy's usage of the keyword axis, which the NumPy glossary explains as follows:
Axes are used to define properties for arrays with more than one dimension. Two-dimensional data has two axes: the 0th axis runs vertically down the rows, and the 1st axis runs horizontally along the columns.
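As a small illustration of the glossary definition, the sketch below uses a plain NumPy array holding the same values as the DataFrame in the question. Reducing along an axis removes that axis from the shape of the result.

```python
import numpy as np

a = np.array([[1, 1, 1, 1],
              [2, 2, 2, 2],
              [3, 3, 3, 3]])  # shape (3, 4): 3 rows, 4 columns

# Summing along axis 0 runs down the rows; axis 0 (length 3) disappears.
sum_axis0 = a.sum(axis=0)

# Summing along axis 1 runs across the columns; axis 1 (length 4) disappears.
sum_axis1 = a.sum(axis=1)

print(sum_axis0.shape, sum_axis0.tolist())  # (4,) [6, 6, 6, 6]
print(sum_axis1.shape, sum_axis1.tolist())  # (3,) [4, 8, 12]
```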
So the first example in the question, df.mean(axis=1), computes the mean along the horizontal direction, across the columns, which yields one value per row. The second example, df.drop(name, axis=1), looks for name among the column labels, which lie along the horizontal axis, and deletes the matching column(s).
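For drop, the axis says which set of labels to search. The sketch below contrasts axis=1 with axis=0 on the DataFrame from the question, and also shows the columns= and index= keywords that DataFrame.drop accepts as a more explicit spelling. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
                  columns=["col1", "col2", "col3", "col4"])

# axis=1: "col4" is looked up among the column labels.
dropped_col = df.drop("col4", axis=1)

# axis=0: 0 is looked up among the row labels (the index).
dropped_row = df.drop(0, axis=0)

# Equivalent calls that name the label set directly.
same_col = df.drop(columns="col4")
same_row = df.drop(index=0)

print(dropped_col.columns.tolist())  # ['col1', 'col2', 'col3']
print(dropped_row.index.tolist())    # [1, 2]
```

Using columns= or index= avoids having to remember which axis number goes with which set of labels.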