Returning unique values from a function in a for loop

Question

I have a logic problem that I’m struggling to get my head around. I’m currently processing some data, specimen by specimen. Each specimen has a data frame of raw data associated with it. A different number of specimens is processed on each run (i.e., one run of the code could process two specimens, another four, another just one). The code currently gives me an output for each specimen, but I want to be able to return certain values from the function to perform further calculations (I take averages over all the specimens, etc., later). So far, a snippet of my code looks like this:

import process_data
import rafu
import pandas as pd

specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1,2,3]
cycle_fall_values = [3,2,1]
stage_type = 'cycling'
target_growths = [0.1,0.2,0.3]


for y in range(len(specimen_ids)):

    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_ids[y])

    # If data is there, process it
    if dcpd_df is not None:

        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)

        # '\\' is one escaped backslash; a bare '\' would escape the closing quote
        specimen_df_csv.to_csv(directory + '\\' + specimen_ids[y] + '.csv')

        summary_df.to_csv(directory + '\\' + specimen_ids[y] + ' Summary.csv', index=False)

My problem here is with the two outputs delta_k_values and crack_length_values: I need them to be associated with that specimen id for future calculations (right now, my code just overwrites each specimen’s values on every iteration). Is there a way I can attach a unique specimen identifier to them? I’ve heard of eval, but I’m not sure if it’s the right way to go. Any help would be great, cheers!

Asked By: Murray Ross

Answer 1

If you want to keep track of the specimen id for each delta_k_values and each crack_length_values, a dict might work.

I would also iterate directly over the specimen_ids, since you don’t use y other than indexing into specimen_ids.

import process_data
import rafu
import pandas as pd

import os.path

specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1,2,3]
cycle_fall_values = [3,2,1]
stage_type = 'cycling'
target_growths = [0.1,0.2,0.3]

delta_k_values = {}
crack_length_values = {}
for specimen_id in specimen_ids:

    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)

    # If data is there, process it
    if dcpd_df is not None:

        # Store the last two return values under this specimen's id
        specimen_df_csv, summary_df, delta_k_values[specimen_id], crack_length_values[specimen_id] = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)

        specimen_df_csv.to_csv(os.path.join(directory, f"{specimen_id}.csv"))

        summary_df.to_csv(os.path.join(directory, f"{specimen_id} Summary.csv"), index=False)

(I’ve altered the way the path and file name are built, but it amounts to the same thing. String concatenation also works as long as the backslash is written as '\\'; os.path.join is a bit safer because it supplies the correct directory separator for you.)
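To illustrate the point about separators, here is a minimal sketch of two ways to build the output paths without typing a separator by hand. The folder and specimen names are just the placeholders from the question, and pathlib is offered as an alternative that the answer itself does not use.

```python
import os.path
from pathlib import Path

directory = "some_folder"   # placeholder folder name from the question
specimen_id = "spec_1"      # placeholder specimen id

# os.path.join inserts the separator that is correct for the current OS
csv_path = os.path.join(directory, f"{specimen_id}.csv")

# pathlib alternative: the / operator joins path components
summary_path = Path(directory) / f"{specimen_id} Summary.csv"
```

Both results can be passed to to_csv, which accepts strings as well as path objects.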

Once the loop has finished, you can iterate over the dicts, for example:

for specimen_id, value in delta_k_values.items():
    print(specimen_id, ':', value)

or access a value directly if you know the id:

specific_value = delta_k_values[known_id]
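Since the question mentions taking averages over all specimens later, the following is a rough, self-contained sketch of how the two dicts might be used for that. The function fake_process and its numbers are hypothetical stand-ins for rafu.dcpd_data_input and process_data.main, which are not shown in the question; it also assumes each returned value is a list of numbers.

```python
from statistics import mean

def fake_process(specimen_id):
    """Hypothetical stand-in for the real processing; returns None if no data."""
    made_up_results = {
        "spec_1": ([10.0, 12.0], [0.1, 0.2]),
        "spec_2": ([14.0, 16.0], [0.3, 0.4]),
    }
    return made_up_results.get(specimen_id)

specimen_ids = ["spec_1", "spec_2", "spec_3"]
delta_k_values = {}
crack_length_values = {}

for specimen_id in specimen_ids:
    result = fake_process(specimen_id)
    if result is None:
        continue  # no data for this specimen, so no dict entry is created
    delta_k_values[specimen_id], crack_length_values[specimen_id] = result

# Mean delta K per specimen, still keyed by specimen id
mean_delta_k = {sid: mean(vals) for sid, vals in delta_k_values.items()}

# Mean delta K over every value from every processed specimen
overall_mean_delta_k = mean(v for vals in delta_k_values.values() for v in vals)
```

Specimens with no data simply never get a key, so later calculations only see the specimens that were actually processed.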

Answered By: 9769953

Answer 2

It sounds like you need a dict keyed on the specimen id:

import process_data
import rafu
import pandas as pd

specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1,2,3]
cycle_fall_values = [3,2,1]
stage_type = 'cycling'
target_growths = [0.1,0.2,0.3]

values = {}

for specimen_id in specimen_ids:
    # Read in dcpd data for this specimen
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)

    # If data is there, process it
    if dcpd_df is not None:

        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)

        # Keep both results together as a tuple under this specimen's id
        values[specimen_id] = delta_k_values, crack_length_values

        # '\\' is one escaped backslash; a bare '\' would escape the closing quote
        specimen_df_csv.to_csv(directory + '\\' + specimen_id + '.csv')

        summary_df.to_csv(directory + '\\' + specimen_id + ' Summary.csv', index=False)

You can now access the saved values like this (given a specimen_id):

delta_k_values, crack_length_values = values[specimen_id]
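As a possible variation on the tuple-valued dict, the sketch below stores each pair in a small NamedTuple so the two fields can be read by name as well as by unpacking. SpecimenResult and the numbers are hypothetical and only for illustration; the answer itself uses plain tuples.

```python
from typing import NamedTuple

class SpecimenResult(NamedTuple):
    """Hypothetical container for the two per-specimen outputs."""
    delta_k_values: list
    crack_length_values: list

# Made-up values standing in for what the processing loop would store
values = {
    "spec_1": SpecimenResult([10.0, 12.0], [0.1, 0.2]),
    "spec_2": SpecimenResult([14.0, 16.0], [0.3, 0.4]),
}

# Tuple unpacking works exactly as with a plain tuple
delta_k, crack_length = values["spec_1"]

# Named access is convenient when looping over every specimen
final_crack_lengths = {
    sid: result.crack_length_values[-1] for sid, result in values.items()
}

# .get returns None for a specimen that was skipped, instead of raising KeyError
missing = values.get("spec_3")
```

A plain tuple is enough for two values; the named fields mainly help if more outputs are added later.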

Answered By: quamrana
