Unable to assign different values in each cell of a column in dataframe, containing 99,000 records

Question

I want to replace the values greater than 70 in column CT_feat7, but my loop only changes them up to roughly index 59,000. After that, I have to run the iteration again, starting from a different index value.

Please explain why this happens. Is there a better way?

[Image: dataset before replacement]

After I run this code:

for index,j in enumerate(df['CT_feat7']):
  if j>70:
    df.loc[index,'CT_feat7'] = 11+random.random()

the values are changed only up to index 59180, so I then have to continue from there with a second loop:

i,j = 59180,2
while i <= 99195:
  if df.loc[i,'CT_feat7']>70:
    df.loc[i,'CT_feat7'] = j
    j+=0.1
    if j>12:
      j=2
  i+=1

Asked By: Shasi Kumar

Answer 1

I think this happens because enumerate() yields positions rather than index labels, so it is not the right iterator to pair with .loc. Try:

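import random
# .items() yields (label, value) pairs, so index is a real label that .loc understands
for index, j in df['CT_feat7'].items():
  if j > 70:
    df.loc[index, 'CT_feat7'] = 11 + random.random()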

enumerate() works on the first ~59,000 rows because that is (I suspect) how many rows are in df. enumerate() iterates over the values j in the passed Series, and for each j the accompanying index is the position of j in the Series, ranging from 0 to the length of the Series minus 1. When indexing with .loc, however, you must give the label (not the position) of the item(s) you want. If the index labels of df run up to about 99,000 with gaps, the positions stop matching the labels, and the rows with larger labels are never reached. See this answer for more information.
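As for the "better way" part of the question, here is a minimal sketch of a vectorized replacement that avoids the loop entirely. The toy frame, its gapped index, and the seeded NumPy generator are assumptions made for illustration; the real df is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: the index labels have gaps, as can happen
# after rows were filtered out of a larger dataset.
df = pd.DataFrame(
    {"CT_feat7": [5.0, 80.0, 75.0, 10.0, 90.0]},
    index=[0, 1, 2, 7, 9],
)

# enumerate() would produce positions 0..4, but the labels are 0, 1, 2, 7, 9,
# so the two only agree for the first few rows.
positions = list(range(len(df)))
labels = list(df.index)

# Boolean mask of the rows to replace; .loc aligns it by label,
# so no explicit index bookkeeping is needed.
mask = df["CT_feat7"] > 70

# One random value in [11, 12) per selected row (seed chosen arbitrarily).
rng = np.random.default_rng(0)
df.loc[mask, "CT_feat7"] = 11 + rng.random(int(mask.sum()))
```

Because the mask is built from the column itself, this works no matter what the index labels look like, and it is usually much faster than a Python-level loop on a frame of this size.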

Answered By: evces
