Creating Permutations from DataFrame without Repetition
Question:
I’ve searched for a solution to this problem but haven’t found anything that addresses it specifically.
My dataframe is structured like this:
   column_1  column_2  column_3
a         2         3         7
b         9         4         3
c         1         5         2
I want to find all permutations of the above dataframe without repeating rows or columns in each individual permutation.
The preceding isn’t super clear, so here is the output I’m trying to achieve:
Out: [(2,4,2),(2,5,3),(9,3,2),(9,5,7),(1,3,3),(1,4,7)]
In other words, I expect n! results, one for each way of assigning a distinct row to every column.
The solution I tried was:
from itertools import product
permutations = list(product(df['column_1'], df['column_2'], df['column_3']))
print(permutations)
This returns n^n combinations.
Any help is appreciated. Thanks!
Answers:
You can use the permutations function from the itertools package. This gives you the row labels to use for each column.
from itertools import permutations
indices = list(permutations('abc', 3))
print(indices)
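The label tuples are not yet the values the question asks for. A minimal sketch of one way to finish the job, assuming the DataFrame from the question (rebuilt here as df) and pairing the i-th label of each tuple with the i-th column; the name result is only illustrative:

```python
from itertools import permutations

import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame(
    {"column_1": [2, 9, 1], "column_2": [3, 4, 5], "column_3": [7, 3, 2]},
    index=["a", "b", "c"],
)

# Each permutation of the row labels assigns one distinct row to each column,
# so no row or column is repeated within a single tuple.
result = [
    tuple(int(df.at[row, col]) for row, col in zip(rows, df.columns))
    for rows in permutations(df.index)
]
print(result)
# [(2, 4, 2), (2, 5, 3), (9, 3, 2), (9, 5, 7), (1, 3, 3), (1, 4, 7)]
```

Label-based lookup with df.at is easy to read but does one scalar lookup per cell, so it is probably only a good fit for small frames.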
You can use itertools.permutations on the row indices together with NumPy indexing:
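from itertools import permutations
import numpy as np
# every ordering of the row positions 0..n-1
idx = list(permutations(range(len(df))))
# for permutation i, take row idx[i][j] from column j
df.to_numpy()[idx, np.arange(df.shape[1])].tolist()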
output:
[[2, 4, 2], [2, 5, 3], [9, 3, 2], [9, 5, 7], [1, 3, 3], [1, 4, 7]]
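The indexing works through broadcasting: the list of permutations becomes an array of shape (n!, n), np.arange(n) is broadcast against it, and entry (i, j) of the result is the value at row idx[i][j], column j. A small sketch of that, assuming the values from the question in a plain NumPy array with as many rows as columns (the names arr, rows, cols, picked, and manual are illustrative), which also compares the n! count against the n^n tuples from product:

```python
from itertools import permutations, product
from math import factorial

import numpy as np

# Values from the question; rows a, b, c and columns column_1..column_3.
arr = np.array([[2, 3, 7], [9, 4, 3], [1, 5, 2]])
n = arr.shape[0]

rows = np.array(list(permutations(range(n))))  # shape (n!, n): row position per column
cols = np.arange(n)                            # shape (n,): broadcast across every permutation

picked = arr[rows, cols]                       # picked[i, j] == arr[rows[i, j], j]

# The same selection spelled out with explicit loops, for comparison.
manual = [
    [int(arr[r, c]) for c, r in enumerate(perm)]
    for perm in permutations(range(n))
]

n_product = len(list(product(*arr.T.tolist())))  # what the original attempt produces

print(picked.shape)   # (6, 3), i.e. n! rows
print(n_product)      # 27, i.e. n**n tuples
```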