How to sum the values at the same position across tuples returned in a for loop in Python?

Question:

I wrote a function that iterates over 200 files and calculates the number of transitions and transversions in a DNA sequence. Now I want to sum the first column of this for loop's output, and separately sum the second column.

This is the output I get, repeated 200 times because I have 200 files. I want the sum of the first column (0+1+1+1+1+…) and of the second column (1+0+0+0+…).

(0, 1) (1, 0) (1, 0) (1, 0) (1, 0) (1, 0) (1, 0) (1, 0) (1, 0) (1, 0) (0, 1) (1, 0) (0, 1) (1, 0) (0, 1) (1, 0)

I tried printing the function's result as a list and then summing the lists, but those lists are never assigned to a variable; they are only printed on each loop iteration, so I couldn’t sum them.

print([dna_comparison(wild_type= f, mut_seq= h)])

Result:

[(0, 1)]
[(1, 0)]
[(1, 0)]
[(1, 0)]
[(1, 0)]
[(1, 0)]
[(1, 0)]
[(1, 0)]

Answers:

As the other answers show, you have multiple solutions to this.

I think the most convenient way to handle this is with numpy, especially if you also use these tuples for further processing.

For example, if t is your collection of tuples, you can convert it to a numpy.array and index it like a matrix, i.e. by row and column:

import numpy as np

t = [(0, 1), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0)]
t_array = np.array(t)
t_array[:, 1]  # second column

>>> array([1, 0, 0, 0, 0, 0, 0, 0])

At this point you can simply sum the elements by column:

t_array.sum(axis=0)
>>> array([7, 1])
Answered By: Luca Clissa

I think a list comprehension could be used in this case.

If you put all these tuple pairs into a list like this (probably by using a for loop):

outputs = [(0, 1), (1, 0), (1, 0), ....]

you could do something like

sum_totals = ( sum([x[0] for x in outputs]), sum([x[1] for x in outputs]) )

and sum_totals will be a tuple of the form (sum of first column, sum of second column).
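As a minimal sketch of the whole flow, assuming the results are appended to a list inside the loop: `dna_comparison` below is a hypothetical stand-in (the asker's real function is not shown), and the sequence pairs are made up in place of the 200 files.

```python
# Hypothetical stand-in for the asker's dna_comparison(); the real one is not shown.
PURINES = {"A", "G"}

def dna_comparison(wild_type, mut_seq):
    """Return (transitions, transversions) for two equal-length sequences."""
    transitions = transversions = 0
    for a, b in zip(wild_type, mut_seq):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # a change between the two classes is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions

# Made-up (wild type, mutant) pairs standing in for the contents of the files
pairs = [("ACGT", "ACGA"), ("ACGT", "GCGT"), ("ACGT", "ATGT")]

outputs = []
for f, h in pairs:
    # Store each result instead of only printing it
    outputs.append(dna_comparison(wild_type=f, mut_seq=h))

sum_totals = (sum(x[0] for x in outputs), sum(x[1] for x in outputs))
print(outputs)     # [(0, 1), (1, 0), (1, 0)]
print(sum_totals)  # (2, 1)
```

The sums here use generator expressions rather than list comprehensions, which gives the same result without building temporary lists.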

Answered By: Hobanator

This can be achieved with the code below:

arr = [[(0, 1)],
[(1, 0)],
[(1, 0)],
[(1, 0)],
[(1, 0)],
[(1, 0)],
[(1, 0)],
[(1, 0)]]
firstcolumn = 0
secondcolumn = 0  
for i in arr:
    for v in i:
        print(v[0], ' ', v[1])
        firstcolumn = firstcolumn + v[0]
        secondcolumn = secondcolumn + v[1]
        
print(firstcolumn, secondcolumn)
#         7             1
Answered By: Himanshu Joshi

I don’t think it’s clear enough how you are getting the data. I mean, they are tuples, but how are we reading those tuples?

You say you are reading the tuples from different files, so I guess they aren’t already in a list.

In that case you can keep running totals as you read them. For example:

import random

data_number = 200  # Simulating n number of data in files
wild_type: int = 0
mut_seq: int = 0
for _ in range(data_number):
    data = (random.randint(0, 1), random.randint(0, 1))  # Simulating the tuple reading from a file
    wild_type += data[0]
    mut_seq += data[1]

print(f'wild_type {wild_type} times. mut_seq {mut_seq} times.')
Answered By: Asi

One solution uses itertools.accumulate, which I think is pretty and clean:

from itertools import accumulate

your_list =  [(0, 1), (1, 0), (1, 0), ....]

*_, sum_ = accumulate(your_list, lambda x,y: (x[0]+y[0],x[1]+y[1]))
print(sum_)
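To see why the `*_, sum_ = ...` unpacking works, here is a small sketch with made-up data: `accumulate` yields every running total, and the starred target swallows all but the last one.

```python
from itertools import accumulate

# Made-up sample data in place of the real results
your_list = [(0, 1), (1, 0), (1, 0), (1, 0)]

# Each step adds the next tuple to the running total, element by element
running = list(accumulate(your_list, lambda x, y: (x[0] + y[0], x[1] + y[1])))
print(running)  # [(0, 1), (1, 1), (2, 1), (3, 1)]

# The starred name collects everything except the final running total
*_, sum_ = running
print(sum_)  # (3, 1)
```

Note that unpacking like this raises a ValueError on an empty list, since there is no last element to bind to `sum_`.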

Less clean, more Python magic, and only really relevant for code golf, but it needs no imports:

tuple(map(sum, zip(*your_list)))
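The one-liner is easier to follow when split into steps. In this sketch (with made-up data), `zip(*your_list)` transposes the rows into columns, and `sum` is then applied to each column:

```python
# Made-up sample data in place of the real results
your_list = [(0, 1), (1, 0), (1, 0), (1, 0)]

# Unpacking with * passes each tuple as a separate argument to zip,
# which regroups the values by position (first items, second items)
columns = list(zip(*your_list))
print(columns)  # [(0, 1, 1, 1), (1, 0, 0, 0)]

# Sum each column and pack the results back into a tuple
totals = tuple(map(sum, columns))
print(totals)  # (3, 1)
```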
Answered By: Christian Sloper