How to randomly choose rows of a pandas DataFrame to update

Question:

I am a beginner in Python and I have a pandas DataFrame that I want to change as follows:

10% of rows of column "review" must be changed by adding a prefix
90% of rows of column "review" must be unchanged

To change all rows of "review" I can use:
X_test["modified_review"] = " abc " + X_test["review"]

And to select 10% of the rows I can use:
X_test.sample(frac=0.1)

But I don’t know how to combine the two snippets to modify only the selected rows.

Please help!

Asked By: SLA


Answers:

You can sample 10% of the index labels at random and update only the corresponding rows (df here stands for your X_test):

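# start from an unchanged copy of the column
df["modified_review"] = df["review"]

# pick 10% of the index labels at random, then prefix only those rows
rand_ids = df.index.to_series().sample(frac=0.1)
df.loc[rand_ids, "modified_review"] = " abc " + df.loc[rand_ids, "modified_review"]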
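
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The 20-row sample data and the random_state value are assumptions made up for illustration, since the real X_test is not shown in the question. It also takes the labels from X_test.sample(...).index, which should be equivalent to sampling the index directly as long as the index labels are unique.

```python
import pandas as pd

# Hypothetical sample data; the real X_test from the question is not shown.
X_test = pd.DataFrame({"review": [f"review {i}" for i in range(20)]})

# Start with an unchanged copy of the column.
X_test["modified_review"] = X_test["review"]

# Pick 10% of the row labels at random.
# random_state is an assumption, added only to make the run reproducible.
rand_ids = X_test.sample(frac=0.1, random_state=0).index

# Add the prefix to the sampled rows only; all other rows stay as they were.
X_test.loc[rand_ids, "modified_review"] = " abc " + X_test.loc[rand_ids, "review"]

# Count how many rows were actually changed (10% of 20 rows).
n_changed = int((X_test["modified_review"] != X_test["review"]).sum())
print(n_changed)
```

If the index contains duplicate labels, .loc would select every row sharing a sampled label, so more than 10% of the rows could change. Calling reset_index(drop=True) first is one way to avoid that.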
Answered By: Mustafa Aydın