Python: Replace two for loops with the fastest way to sum the elements

Question:

I have a list of 5 elements (in practice it could be 50,000). I want to sum every pair of elements from that list, including each element paired with itself, and create a DataFrame from the results, so I wrote the following code:

import pandas as pd

x = list(range(1, 5))
t = []
for i in x:
    for j in x:
        # one row per ordered pair: (first, second, their sum)
        t.append((i, j, i + j))

df = pd.DataFrame(t)

The code above produces the correct results, but it takes very long to run when the list has more elements. I am looking for the fastest way to do the same thing.

Asked By: Kallol

||

Answers:

A list comprehension can make this faster. Use t = [(i, j, i + j) for i in x for j in x] in place of the nested for loops: a comprehension is usually faster than an equivalent loop that calls t.append() on every iteration, although it still visits every pair. Here is the updated code:

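import pandas as pd

x = list(range(1, 5))
# same (i, j, i + j) rows as the nested loops, built in one expression
t = [(i, j, i + j) for i in x for j in x]

df = pd.DataFrame(t)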
Answered By: iihsan
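
A quick way to check the speed claim on your own machine is to time both versions with timeit. This is only a sketch: the helper names with_loop and with_comprehension and the input size of 300 are made up for illustration, and the measured times will vary with the machine and Python version.

```python
import timeit

x = list(range(1, 301))  # hypothetical size, large enough to measure


def with_loop(values):
    # the original approach: nested loops with append
    t = []
    for i in values:
        for j in values:
            t.append((i, j, i + j))
    return t


def with_comprehension(values):
    # the same rows built with a list comprehension
    return [(i, j, i + j) for i in values for j in values]


# both versions should build exactly the same list
assert with_loop(x) == with_comprehension(x)

loop_time = timeit.timeit(lambda: with_loop(x), number=10)
comp_time = timeit.timeit(lambda: with_comprehension(x), number=10)
print(f"nested loops:       {loop_time:.3f} s")
print(f"list comprehension: {comp_time:.3f} s")
```

Either way the work is still proportional to len(x)**2, so the comprehension only trims per-iteration overhead.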

Option 1: the combinations can be obtained with a cross join through pandas' merge() (how='cross', available since pandas 1.2), without explicit loops:

import numpy as np
import pandas as pd

x = np.arange(1, 5+1)
# a cross join pairs every value of x with every value of y
df = pd.DataFrame(x, columns=['x']).merge(pd.Series(x, name='y'), how='cross')
df['sum'] = df.x.add(df.y)
print(df)
    x  y  sum
0   1  1    2
1   1  2    3
2   1  3    4
3   1  4    5
4   1  5    6
5   2  1    3
6   2  2    4
...

Option 2: with itertools.product()

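import itertools
import pandas as pd

num = 5
# product() yields every ordered pair (i, j); the columns are named 0 and 1
df = pd.DataFrame(itertools.product(range(1, num + 1), range(1, num + 1)))
df['sum'] = df[0].add(df[1])
print(df)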
Answered By: Алексей Р
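
For comparison, the same table can also be sketched with NumPy alone, building each column as a whole array instead of pairing values one at a time. This is not taken from either answer; it assumes the same sample values 1..5, and the names left and right are made up for illustration.

```python
import numpy as np
import pandas as pd

x = np.arange(1, 6)  # same sample values as in the answers above: 1..5

# repeat gives 1,1,1,1,1,2,2,...; tile gives 1,2,3,4,5,1,2,...
left = np.repeat(x, len(x))
right = np.tile(x, len(x))

# the sum column is computed in one vectorized addition
df = pd.DataFrame({"x": left, "y": right, "sum": left + right})
print(df.head())
```

Whichever method is used, the result has len(x)**2 rows. For 50,000 elements that is 2.5 billion rows, and three columns of 8-byte integers would need roughly 60 GB, so the full table may not fit in memory at that size.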