Pandas: How to create a new column by randomly selecting from other columns?
Question:
Example df:
   A  B  C
0  X  9  0
1  5  7  5
2  5  6  Y
Expected output is shown below. Each value in column D is randomly selected from column A, B, or C of the same row.
   A  B  C  D
0  X  9  0  X  (randomly selected from column A)
1  5  7  5  7  (randomly selected from column B)
2  5  6  Y  Y  (randomly selected from column C)
Answers:
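You can use NumPy advanced indexing with numpy.random.randint to draw one random column index per row. The draws are independent, so the same column can be picked for more than one row: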
import numpy as np
# Pair each row index with a randomly drawn column index
df['D'] = df.to_numpy()[np.arange(df.shape[0]),
                        np.random.randint(0, df.shape[1], df.shape[0])]
Example:
   A  B  C  D
0  X  9  0  X  # A
1  5  7  5  7  # B
2  5  6  Y  5  # A
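For reference, here is a self-contained sketch of the same indexing trick on the example frame. It is not the answer's exact code: it uses NumPy's newer Generator API (np.random.default_rng) instead of np.random.randint, and the fixed seed, the source_cols list, and the extra D_source column are assumptions added only to make the result repeatable and easy to inspect.

```python
import numpy as np
import pandas as pd

# The example frame from the question (mixed strings and integers).
df = pd.DataFrame({"A": ["X", 5, 5], "B": [9, 7, 6], "C": [0, 5, "Y"]})
source_cols = ["A", "B", "C"]  # hypothetical name: columns to draw from

rng = np.random.default_rng(0)  # fixed seed is an assumption, for repeatable runs

values = df[source_cols].to_numpy()   # 2-D object array, shape (n_rows, n_cols)
row_idx = np.arange(len(df))          # 0, 1, ..., n_rows - 1
# One independent column index per row; repeats are allowed.
col_idx = rng.integers(0, len(source_cols), size=len(df))

# Advanced indexing pairs row_idx[i] with col_idx[i] for every i.
df["D"] = values[row_idx, col_idx]
# Illustrative extra column: which source column each value came from.
df["D_source"] = np.array(source_cols)[col_idx]
print(df)
```

Restricting the lookup to source_cols keeps the approach safe to re-run after D has already been added, since D itself is then never a candidate.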
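If you want each column to be used at most once (exactly once when the frame has as many rows as columns, as here), draw the column indices without replacement using numpy.random.choice. This raises a ValueError if the frame has more rows than columns: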
# replace=False guarantees no column index is drawn twice
df['D'] = df.to_numpy()[np.arange(df.shape[0]),
                        np.random.choice(np.arange(df.shape[1]),
                                         df.shape[0], replace=False)]
Example:
   A  B  C  D
0  X  9  0  9  # B
1  5  7  5  5  # A
2  5  6  Y  Y  # C
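The no-repeat variant could be wrapped in a small helper, sketched below. The function name pick_distinct_columns and its parameters are hypothetical, and it swaps np.random.choice(..., replace=False) for Generator.permutation, which should give an equivalent draw without replacement. The explicit check makes the rows-versus-columns limitation visible instead of relying on NumPy's own error.

```python
import numpy as np
import pandas as pd

def pick_distinct_columns(df, source_cols, rng):
    """Return one value per row, never drawing from the same column twice.

    Hypothetical helper: requires len(df) <= len(source_cols).
    """
    n_rows, n_cols = len(df), len(source_cols)
    if n_rows > n_cols:
        raise ValueError("need at least as many columns as rows to avoid repeats")
    # A shuffled list of column indices, truncated to one per row.
    col_idx = rng.permutation(n_cols)[:n_rows]
    values = df[source_cols].to_numpy()
    return values[np.arange(n_rows), col_idx], col_idx

df = pd.DataFrame({"A": ["X", 5, 5], "B": [9, 7, 6], "C": [0, 5, "Y"]})
rng = np.random.default_rng(0)  # fixed seed is an assumption, for repeatable runs

picked, col_idx = pick_distinct_columns(df, ["A", "B", "C"], rng)
df["D"] = picked
print(df)
```

With three rows and three columns, col_idx is a full permutation of 0, 1, 2, so every column contributes exactly one value.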