Knowing a column of indexes how to get column of their values in Dataframe?

Question:

I have a DataFramedf = pd.DataFrame(np.random.randint(-100, 100, 100).reshape(10, -1), columns=list(range(10)), index=list('abcdefghij')) and I found some indexes using method .idxmin(axis=1) and getting Seriesdf_ind =

  • a 5
  • b 8
  • c 2
  • d 0
  • e 7
  • f 0
  • g 4
  • h 6
  • i 5
  • j 2

How to get column of their values and than add (insert) it to df?

df.insert(11, "values", df[df_ind]) gives me a 10×10 DataFrame, but not 1×10 DataFrame

Asked By: ddddd

||

Answers:

If you want to slice the values, you can use a modified indexing lookup:

# first reindex the indexer to ensure
# having the correct final order
idx, cols = pd.factorize(df_ind.reindex(df.index))

# then perform indexing lookup
df['values'] = df.reindex(columns=cols).to_numpy()[np.arange(len(df)), idx]

Output:

    0   1   2   3   4   5   6   7   8   9  values
a  72 -53  17  92 -33  95   3 -91 -79 -64      95
b -13 -30 -12  40 -42  93 -61 -13  74 -12      74
c -19  65 -75 -23 -28 -91  48  15  97 -21     -75
d  75  92 -18  -1  77 -71  47  47  42  67      75
e -68  93 -91  85  27 -68 -69  51  63  14      51
f  83 -72 -66  28  28  64 -47  33 -62 -83      83
g -21  32   5 -58  86 -69  20 -99 -35  69      86
h -43 -65   2  19 -89  74 -18  -9  28  42     -18
i  -1 -47  40  21  70 -16 -32 -94  96 -53     -16
j  27  31   0  80 -22  43  48  86 -77  41       0

Output if the original DataFrame was in a different order:

    0   1   2   3   4   5   6   7   8   9  values
c -19  65 -75 -23 -28 -91  48  15  97 -21     -75
e -68  93 -91  85  27 -68 -69  51  63  14      51
h -43 -65   2  19 -89  74 -18  -9  28  42     -18
d  75  92 -18  -1  77 -71  47  47  42  67      75
g -21  32   5 -58  86 -69  20 -99 -35  69      86
i  -1 -47  40  21  70 -16 -32 -94  96 -53     -16
b -13 -30 -12  40 -42  93 -61 -13  74 -12      74
a  72 -53  17  92 -33  95   3 -91 -79 -64      95
j  27  31   0  80 -22  43  48  86 -77  41       0
f  83 -72 -66  28  28  64 -47  33 -62 -83      83

Reproducible df_ind:

df_ind = pd.Series({'a': 5, 'b': 8, 'c': 2, 'd': 0, 'e': 7, 'f': 0, 'g': 4, 'h': 6, 'i': 5, 'j': 2})
Answered By: mozway
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.