pandas.factorize

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.factorize.html

Maps identical nominal (categorical) values in a Series to the same integer index.

pandas.factorize(values, sort=False, na_sentinel=-1, size_hint=None, dropna=True)

Encode the object as an enumerated type or categorical variable.

This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available as both a top-level function pandas.factorize(), and as a method Series.factorize() and Index.factorize().
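As a minimal sketch of the two call styles described above (the sample array is made up for illustration), the top-level function and the Series method produce the same codes:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original documentation.
values = np.array(["b", "b", "a", "c", "b"], dtype=object)

# Top-level function: codes are assigned in order of first appearance.
codes, uniques = pd.factorize(values)
print(codes)    # [0 0 1 2 0]
print(uniques)  # ['b' 'a' 'c']

# The same operation as a Series method; uniques is then an Index.
s_codes, s_uniques = pd.Series(values).factorize()
print(s_codes)  # [0 0 1 2 0]
```

Identical values always receive the same code, which is what makes the result usable as a numeric stand-in for a nominal column.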

Parameters

values : sequence

A 1-D sequence. Sequences that aren’t pandas objects are coerced to ndarrays before factorization.

sort : bool, default False

Sort uniques and shuffle codes to maintain the relationship.
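The effect of sort can be seen in a small sketch (sample data is illustrative): the uniques come back sorted, and the codes are renumbered so that each code still points at the right unique value.

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original documentation.
values = np.array(["b", "b", "a", "c", "b"], dtype=object)

# Default: uniques keep the order of first appearance.
codes, uniques = pd.factorize(values)
print(codes, uniques)  # [0 0 1 2 0] ['b' 'a' 'c']

# sort=True: uniques are sorted and codes are remapped accordingly.
sorted_codes, sorted_uniques = pd.factorize(values, sort=True)
print(sorted_codes, sorted_uniques)  # [1 1 0 2 1] ['a' 'b' 'c']
```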

na_sentinel : int, default -1

Value used in codes to mark missing values (“not found”).

size_hint : int, optional

Hint to the hashtable sizer.

Returns

codes : ndarray

An integer ndarray that’s an indexer into uniques. uniques.take(codes) will have the same values as values.
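The round trip mentioned above can be checked with a short sketch (sample data is illustrative and contains no missing values):

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original documentation.
values = np.array(["b", "b", "a", "c", "b"], dtype=object)
codes, uniques = pd.factorize(values)

# Each code is a position in uniques, so take() rebuilds the original values.
restored = uniques.take(codes)
print(restored)  # ['b' 'b' 'a' 'c' 'b']
```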

uniques : ndarray, Index, or Categorical

The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.
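To make the three return types concrete, here is a small sketch that factorizes the same illustrative data wrapped in a NumPy array, a Series, and a Categorical:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the original documentation.
data = ["a", "c", "a"]

# Plain ndarray input: uniques is a 1-D ndarray.
_, from_array = pd.factorize(np.array(data, dtype=object))

# pandas object (Series) input: uniques is an Index.
_, from_series = pd.factorize(pd.Series(data))

# Categorical input: uniques is a Categorical.
_, from_categorical = pd.factorize(pd.Categorical(data))

print(type(from_array).__name__)        # ndarray
print(type(from_categorical).__name__)  # Categorical
```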

Note

Even if there’s a missing value in values, uniques will not contain an entry for it.
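A short sketch of this behaviour, relying only on the default settings (the sample data is illustrative): the missing entry gets the sentinel code -1 and does not appear in uniques.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value (None).
values = np.array(["b", None, "a", "b"], dtype=object)

codes, uniques = pd.factorize(values)
print(codes)    # [ 0 -1  1  0]
print(uniques)  # ['b' 'a']
```

Because -1 is not a valid position in uniques, rows with that code should be filtered out or handled separately before indexing back into uniques.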

See also

cut

Discretize continuous-valued array.

unique

Find the unique values in an array.

Reposted from blog.csdn.net/fu_jian_ping/article/details/108018959