Pandas: How to create a new column by randomly selecting from other columns?

Question:

Example df:

   A B C
0  X 9 0
1  5 7 5
2  5 6 Y

The expected output is shown below. Each value in column D is randomly selected from column A, B, or C of the same row.

   A B C D
0  X 9 0 X(random from column A)
1  5 7 5 7(random from column B)
2  5 6 Y Y(random from column C)
Asked By: smelling lady


Answers:

You can use numpy advanced indexing with numpy.random.randint:

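import numpy as np
# Pair each row index with a random column index drawn from [0, number of columns)
df['D'] = df.to_numpy()[np.arange(df.shape[0]),
                        np.random.randint(0, df.shape[1], df.shape[0])]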

Example:

   A  B  C  D
0  X  9  0  X   # A
1  5  7  5  7   # B
2  5  6  Y  5   # A
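
For reference, here is a self-contained sketch of the same idea applied to the example df from the question. It is one possible variant, not the only way to do it: it assumes NumPy's Generator API (np.random.default_rng) in place of np.random.randint, and the names source_cols, rng, and D_source are illustrative rather than part of the original answer. Selecting the source columns explicitly also means that re-running the code does not start sampling from D itself.

```python
import numpy as np
import pandas as pd

# Example frame from the question (mixed strings and integers)
df = pd.DataFrame({"A": ["X", 5, 5], "B": [9, 7, 6], "C": [0, 5, "Y"]})

# Hypothetical helper names, chosen for this sketch
source_cols = ["A", "B", "C"]
rng = np.random.default_rng(0)  # the seed is arbitrary, only for reproducibility

values = df[source_cols].to_numpy()   # object array of shape (n_rows, n_cols)
rows = np.arange(len(df))             # 0, 1, ..., n_rows - 1
cols = rng.integers(0, values.shape[1], size=len(df))  # one column index per row

# Advanced indexing pairs rows[i] with cols[i], giving one value per row
df["D"] = values[rows, cols]
# Optional: record which column each value came from
df["D_source"] = np.array(source_cols)[cols]

print(df)
```

Because the column indices are drawn independently for each row, the same column can be picked more than once.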

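If you want exactly one value from each column, use a random permutation with numpy.random.choice. Because the sampling is done without replacement, this only works when the DataFrame has no more rows than columns: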

df['D'] = df.to_numpy()[np.arange(df.shape[0]),
                        np.random.choice(np.arange(df.shape[1]),
                                         df.shape[0], replace=False)]

Example:

   A  B  C  D
0  X  9  0  9   # B
1  5  7  5  5   # A
2  5  6  Y  Y   # C
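
The permutation variant can be sketched in the same self-contained way. This version assumes Generator.permutation as a substitute for np.random.choice(..., replace=False), and adds an explicit check for the case of more rows than columns, where no column index can be assigned without repeats. The names source_cols and rng are again illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question
df = pd.DataFrame({"A": ["X", 5, 5], "B": [9, 7, 6], "C": [0, 5, "Y"]})

source_cols = ["A", "B", "C"]  # hypothetical name, chosen for this sketch
values = df[source_cols].to_numpy()
n_rows, n_cols = values.shape

# Without replacement there must be at least as many columns as rows
if n_rows > n_cols:
    raise ValueError("need at least as many source columns as rows")

rng = np.random.default_rng()
# Shuffle the column indices and keep one distinct index per row
cols = rng.permutation(n_cols)[:n_rows]

df["D"] = values[np.arange(n_rows), cols]
print(df)
```

With three rows and three columns, every column contributes exactly one value to D.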
Answered By: mozway