Python: iterate over the first 50 elements, run a test on them, then the next 50, and so on until the last 50, and print the result
Question:
I am trying to run a test each time the iteration has covered a block of 50 elements in a list: the first 50 elements, then the next 50, and so on, based on certain conditions. My list contains 630000 elements fed from df_. This is my attempt:
For a dataframe: df_
distance
0.5
10.4
0.5
14.4
0.15
100.4
0.25
12.4
mylist_data = list()
mylist1_data = list()
for index, row in df_.iterrows():
    mylist = (row.distance)
    mylist_data.append(mylist)
    mylist1 = (row.day_night)
    mylist1_data.append(mylist1)
    if (len(mylist_data)== 50):
        xmean = np.mean(mylist_data)
        ymean = np.mean(mylist1_data)
        :
        :
        print(index)
Thanks for your immense help!
Answers:
How about this?
groups = df_.groupby(pd.cut(df_.index, int(630_000 / 50)))
for interval, sub_df in groups:
    xmean = sub_df['distance'].mean()
    ymean = sub_df['day_night'].mean()
    print(f'doing my test for indices {sub_df.index[0]} : {sub_df.index[-1]}')
Here, each group gives you the sub-dataframe directly (and you don't have to iterate through rows, which is very inefficient). pd.cut returns a "categorical array-like object representing the respective bin" for each row of df_. It takes the number of bins as an argument: int(630_000 / 50).
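As a possible variation on the approach above, here is a minimal sketch that groups by row position instead of binning the index values. pd.cut splits the range of index values into equal-width bins, so it relies on the index running evenly from 0 to n-1; integer-dividing the row position by the chunk size gives exactly 50 rows per group regardless of the index, with a shorter final group if the length is not a multiple of 50. The 230-row random dataframe, the name chunk_size, and the printed summary are assumptions for illustration only; the actual test from the question is not shown there, so it is left as a placeholder.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_: 230 rows, so the last chunk is shorter than 50.
rng = np.random.default_rng(0)
df_ = pd.DataFrame({
    "distance": rng.uniform(0, 100, size=230),
    "day_night": rng.integers(0, 2, size=230),
})

chunk_size = 50  # assumed name; the question uses blocks of 50 rows

# Row position // chunk_size is 0 for rows 0-49, 1 for rows 50-99, and so on.
chunk_ids = np.arange(len(df_)) // chunk_size

# One row of means per chunk, computed without iterating over individual rows.
chunk_means = df_.groupby(chunk_ids)[["distance", "day_night"]].mean()

# If the test needs the whole sub-dataframe, loop over the groups instead.
for chunk_id, sub_df in df_.groupby(chunk_ids):
    xmean = sub_df["distance"].mean()
    ymean = sub_df["day_night"].mean()
    # Placeholder for the real test, which is elided in the question.
    print(f"rows {sub_df.index[0]} : {sub_df.index[-1]} "
          f"xmean={xmean:.2f} ymean={ymean:.2f}")
```

If only the per-chunk means are needed, chunk_means already holds them in a single dataframe and the loop can be skipped.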