Returning unique values from a function in a for loop
Question:
I have a logic problem that I’m struggling to get my head around. I’m currently processing some data, specimen by specimen. Each specimen has a data frame of raw data associated with it. A different number of specimens will be processed each time (i.e., one run of the code could process two specimens, another four, another just one). It currently gives me an output for each specimen, but I want to be able to return certain values from the function to perform further calculations (I take averages over all the specimens, etc., later). So far, a snippet of my code looks like this:
import process_data
import rafu
import pandas as pd

# spec_1, spec_2, spec_3 are placeholders for the real specimen ids (strings)
specimen_ids = [spec_1, spec_2, spec_3]
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

for y in range(len(specimen_ids)):
    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_ids[y])
    # If data is there, process it
    if dcpd_df is not None:
        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(
            dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)
        specimen_df_csv.to_csv(directory + '\\' + specimen_ids[y] + '.csv')
        summary_df.to_csv(directory + '\\' + specimen_ids[y] + ' Summary.csv', index=False)
My problem here is with the two outputs delta_k_values and crack_length_values: I need them to be associated with that specimen id for future calculations (right now, my code just overwrites each specimen’s values). Is there a way I can attach a unique specimen identifier to them? I’ve heard of eval, but I’m not sure if it’s the right way to go. Any help would be great, cheers!
Answers:
If you want to keep track of the specimen id for each delta_k_values and each crack_length_values, a dict might work.
I would also iterate directly over specimen_ids, since you don’t use y other than for indexing into specimen_ids.
import process_data
import rafu
import pandas as pd
import os.path

# spec_1, spec_2, spec_3 are placeholders for the real specimen ids
specimen_ids = [spec_1, spec_2, spec_3]
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

# One entry per specimen id, filled in as each specimen is processed
delta_k_values = {}
crack_length_values = {}

for specimen_id in specimen_ids:
    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)
    # If data is there, process it
    if dcpd_df is not None:
        (specimen_df_csv, summary_df,
         delta_k_values[specimen_id],
         crack_length_values[specimen_id]) = process_data.main(
            dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)
        specimen_df_csv.to_csv(os.path.join(directory, f"{specimen_id}.csv"))
        summary_df.to_csv(os.path.join(directory, f"{specimen_id} Summary.csv"), index=False)
(I’ve altered the path/file name bit, but it amounts to the same. Both will work; using os.path.join can be a bit safer with regard to directory separators.)
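To illustrate the separator point, here is a small sketch of what os.path.join produces. The specimen id "spec_1" is a hypothetical value, assumed to be a string; the separator in the printed paths depends on the operating system the code runs on.

```python
import os.path

directory = "some_folder"
specimen_id = "spec_1"  # hypothetical id, assumed to be a string

# os.path.join inserts the separator for the current platform,
# so no hand-written backslash is needed
csv_path = os.path.join(directory, f"{specimen_id}.csv")
summary_path = os.path.join(directory, f"{specimen_id} Summary.csv")

print(csv_path)
print(summary_path)
```

Because the separator is never typed by hand, there is also no backslash to escape inside a string literal.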
Then once done, you can iterate over the dicts, e.g. like this:
for specimen_id, value in delta_k_values.items():
    print(specimen_id, ':', value)
or access a value directly if you know the id:
specific_value = delta_k_values[known_id]
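Since the question mentions averaging over all specimens later, here is a minimal sketch of how the per-specimen dict might feed that step. The function fake_process and its numbers are hypothetical stand-ins for process_data.main, whose real return types are not shown in the question; the values are assumed here to be plain lists of floats.

```python
from statistics import mean


def fake_process(specimen_id):
    """Hypothetical stand-in for process_data.main.

    Returns a list of floats, or None to mimic a specimen with no data.
    """
    made_up = {"spec_1": [10.0, 12.0], "spec_2": [20.0, 22.0], "spec_3": None}
    return made_up[specimen_id]


specimen_ids = ["spec_1", "spec_2", "spec_3"]
delta_k_values = {}

for specimen_id in specimen_ids:
    result = fake_process(specimen_id)
    # Only specimens that produced data get an entry
    if result is not None:
        delta_k_values[specimen_id] = result

# Mean per specimen, then the mean of those per-specimen means
per_specimen_mean = {sid: mean(vals) for sid, vals in delta_k_values.items()}
overall_mean = mean(list(per_specimen_mean.values()))

print(per_specimen_mean)
print(overall_mean)
```

If the real values turn out to be pandas objects or arrays rather than lists, the same dict structure still applies; only the averaging step would change.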
It sounds like you need a dict keyed on the specimen id:
import process_data
import rafu
import pandas as pd

# spec_1, spec_2, spec_3 are placeholders for the real specimen ids (strings)
specimen_ids = [spec_1, spec_2, spec_3]
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

# Maps each specimen id to a (delta_k_values, crack_length_values) tuple
values = {}

for specimen_id in specimen_ids:
    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)
    # If data is there, process it
    if dcpd_df is not None:
        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(
            dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)
        values[specimen_id] = delta_k_values, crack_length_values
        specimen_df_csv.to_csv(directory + '\\' + specimen_id + '.csv')
        summary_df.to_csv(directory + '\\' + specimen_id + ' Summary.csv', index=False)
You can now access the saved values like this (given a specimen_id):
delta_k_values, crack_length_values = values[specimen_id]
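One thing to watch for with this approach: a specimen whose data was missing never gets an entry, so looking it up by key would raise a KeyError. Below is a minimal sketch of a guarded lookup; the ids and numbers are made up and stand in for whatever process_data.main really returns.

```python
# Made-up results: spec_2 is absent, as if its data could not be read
values = {
    "spec_1": ([10.0, 12.0], [0.1, 0.2]),
    "spec_3": ([30.0, 32.0], [0.3, 0.4]),
}

missing = []
for specimen_id in ["spec_1", "spec_2", "spec_3"]:
    entry = values.get(specimen_id)  # returns None instead of raising KeyError
    if entry is None:
        missing.append(specimen_id)
        continue
    # Unpack the stored tuple back into its two parts
    delta_k_values, crack_length_values = entry
    print(specimen_id, delta_k_values, crack_length_values)

print("No stored values for:", missing)
```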