Unexpected behavior of pd.DataFrame.drop on a DataFrame with MultiIndex columns

PotentialLime :

I've been getting unexpected behavior after dropping a column from a multi-index dataframe in pandas.

I need to get the outermost (level=0) columns for a multi-index dataframe after dropping a column. To get the level=0 columns, I used:

df.columns.levels[0]

However, even after dropping a specific column from the original dataframe and assigning the result to a new dataframe, I still get the same elements in the level-0 list instead of the updated column list.

For example:

INPUT: df
Box       '1'                  '2'                   '3'
Latency   code latency  loc    code latency  loc    code latency  loc
0         9170.  948.    L.    8170.  328.    R.    9160.  238.    L.
1         7540   1501.   R     9170.  9028.   L.    7170.   94.    L.
INPUT: df.columns.levels[0]
Out: Index(['1', '2', '3'], dtype='object', name='Box Number')



dropped_df = df.drop('2', axis=1, level=0)
INPUT: dropped_df.columns.levels[0]
Out: Index(['1', '2', '3'], dtype='object', name='Box Number')


INPUT: dropped_df
Out: 
Box       '1'                  '3'                  
Latency   code latency  loc    code latency  loc
0         9170.  948.    L.    9160.  238.    L.
1         7540   1501.   R     7170.   94.    L.

I'm not sure if this is a bug or if I'm doing something wrong... Why is the updated dataframe (dropped_df) returning the same columns as the original dataframe even when the output of the updated df shows that the dataframe has been changed? Is the original dataframe being cached (copied) somewhere?

Any help / pointers would be appreciated!

NOTE: I'm using Python 3.6.8 / pandas 0.25.0

EDIT 1: The columns are string type, so it is not a matter of incorrect types affecting behavior.

Celius Stingher :

After some investigation with the sample code you gave, I tried:

dropped_df.columns.levels[1] = dropped_df.columns.levels[1]

I got the following error:

TypeError: 'FrozenList' does not support mutable operations.

According to the pandas documentation, and as stated in this answer:

The construct is used to represent a MultiIndex levels,labels, and names. The point of it is to prevent modification of these thru attributes and force the use of methods (e.g. set_levels()). As the state of these cannot be changed independent (for level/labels), but must be changed together.
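As a minimal sketch of what the quote describes: assigning into levels is rejected, while set_levels() returns a new MultiIndex. The index below is a hypothetical stand-in for the one in the question, and the replacement labels are made up for illustration.

```python
import pandas as pd

# Hypothetical two-level column index, loosely modelled on the question.
columns = pd.MultiIndex.from_product(
    [["1", "2", "3"], ["code", "latency", "loc"]],
    names=["Box", "Latency"],
)

# Item assignment on the FrozenList of levels is rejected with a TypeError.
try:
    columns.levels[1] = ["a", "b", "c"]
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

# set_levels() is the supported route; it returns a new MultiIndex and
# leaves the original one untouched. The new labels are illustrative only.
renamed = columns.set_levels(["c", "lat", "l"], level=1)

print(type(assignment_error).__name__)
print(list(columns.levels[1]))
print(list(renamed.levels[1]))
```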

This explains why dropped_df.columns.levels still returns the frozen (original) values, which do not match what we see when simply displaying dropped_df: drop removes the columns from the frame but leaves the stored levels untouched.
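If what you need is the level-0 labels that are still in use, here is a small sketch of two options I would try: get_level_values(0).unique(), or rebuilding the index with MultiIndex.remove_unused_levels(). The frame below is a made-up placeholder with the same column layout as in the question, not the original data.

```python
import pandas as pd

# Placeholder frame with the same column layout as in the question;
# the cell values are dummies, not the asker's data.
columns = pd.MultiIndex.from_product(
    [["1", "2", "3"], ["code", "latency", "loc"]],
    names=["Box", "Latency"],
)
df = pd.DataFrame([[0] * 9, [1] * 9], columns=columns)

dropped_df = df.drop("2", axis=1, level=0)

# The stored levels still list every label the index was built with.
stale = list(dropped_df.columns.levels[0])

# Option 1: read the labels that are actually present in the columns.
in_use = list(dropped_df.columns.get_level_values(0).unique())

# Option 2: rebuild the index so that levels only holds labels in use.
dropped_df.columns = dropped_df.columns.remove_unused_levels()
pruned = list(dropped_df.columns.levels[0])

print(stale, in_use, pruned)
```

Option 1 leaves the index as it is and only changes how you read it; option 2 replaces the column index, which is handy if other code keeps reading columns.levels.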
