Iteration and index dropping using general logic in Python


I’ve been working on this code for a few days. I need to iterate through a set of CSV files and, using general logic, find the indexes (rows) that don’t have the same number of columns as index 2, then strip them out of the new CSV. I’ve gotten the code to this point, but I’m stuck on how to use slicing to strip the broken index.

Say each index in file A is supposed to have 10 columns, and for some reason index 2,000 is logged with only 7 columns. What is the best way to get the code to strip index 2,000 out of the new CSV?

#Comments to the right
for f in TD_files:                                                  #FOR ALL TREND FILES:
    with open(f,newline='',encoding='latin1') as g:                 #open file as read
        r = csv.reader((line.replace('\0','') for line in g))       #declare read variable for list while stripping nulls
        data = [line for line in r]                                 #set list to all data in file 
        for j in range(0,len(data)):                                #set up data variable
            if data[j][2] != data[j-1][2] and j != 0:               #compare index j2 and j2-1
                print('Index Not Equal')                            #print debug
        data[0] = TDmachineID                                       #add machine ID line
        data[1] = trendHeader                                       #add trend header line
    with open(f,'w',newline='') as g:                               #open file as write
        w = csv.writer(g)                                           #declare write variable

[Image: The Index To Strip]

Asked By: Kyle Lucas



Since you loop through the whole data set anyway, I would strip the null characters in the same list comprehension that checks the row length. It looks cleaner to me and works the same.

with open(f, newline='', encoding='latin1') as g:
    raw_data = csv.reader(g)
    data = [[elem.replace('\0', '') for elem in line] for line in raw_data if len(line)==10]
    data[0] = TDmachineID
    data[1] = trendHeader 
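
The snippet above hard-codes 10 columns, while the question asks for general logic that compares every row against index 2. A minimal sketch of that idea follows, assuming the rows are already loaded into a list of lists. The function name `drop_malformed_rows` and its `reference_index` parameter are hypothetical and not part of the original code. Rows before the reference index are kept untouched, on the assumption that they are the machine ID and header lines that get replaced afterwards.

```python
def drop_malformed_rows(rows, reference_index=2):
    """Keep rows whose column count matches rows[reference_index].

    Rows before reference_index are passed through unchanged (assumed
    to be the machine ID and header lines from the question).
    """
    expected = len(rows[reference_index])
    head = rows[:reference_index]
    body = [row for row in rows[reference_index:] if len(row) == expected]
    return head + body


# Small illustration with made-up data: the row with two columns is dropped.
rows = [
    ["machine id"],
    ["h1", "h2", "h3"],
    ["1", "2", "3"],
    ["4", "5"],
    ["6", "7", "8"],
]
cleaned = drop_malformed_rows(rows)
print(len(rows) - len(cleaned))  # 1
```

With this approach `data[0] = TDmachineID` and `data[1] = trendHeader` still overwrite the original first two lines rather than real data rows.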

old answer:
You could add a condition to your list comprehension so that only rows of length 10 are added to your data.

with open(f,newline='',encoding='latin1') as g:
    r = csv.reader((line.replace('\0','') for line in g))
    data = [line for line in r if len(line)==10] #add condition to check if the line is added to your data
    data[0] = TDmachineID
    data[1] = trendHeader 
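
Neither snippet shows the filtered rows being written back, and the question's code creates the writer without calling it. A rough sketch of the full round trip is below, assuming the file is overwritten in place as in the question. The function name `clean_csv` and its parameters are hypothetical, and the machine ID and header replacement is left out for brevity.

```python
import csv


def clean_csv(path, expected_columns, encoding="latin1"):
    """Rewrite the CSV at `path`, keeping only rows with expected_columns fields."""
    # Read: strip NUL characters from each raw line before csv parses it.
    with open(path, newline="", encoding=encoding) as src:
        reader = csv.reader(line.replace("\0", "") for line in src)
        data = [row for row in reader if len(row) == expected_columns]

    # Write: overwrite the same file with the filtered rows.
    with open(path, "w", newline="", encoding=encoding) as dst:
        csv.writer(dst).writerows(data)

    # Return the number of rows kept, which is handy for a quick sanity check.
    return len(data)
```

Inside the question's loop this would be called as `clean_csv(f, 10)` for each `f` in `TD_files`.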
Answered By: Rabinzel