Finding unique combinations of two sequences (strings) in Python
Question:
Hello, I have a question.
I have two lists like this:
list1 = ['G','C','A','T','C','A']
list2 = ['G','A','*','T','AC','A']
I want to find, in Python, all possible combinations of these two sequences, choosing at each position the element from either list.
For example, the result would be:
(GCATCA), (GA*TACA), (GAATCA), (GA*TCA), (GC*TCA), (GC*TACA), (GCATACA), (GAATACA)
Thank you for your answers.
Answers:
First, let's get the pair of elements at every position:
pairs = zip(list1, list2)
Now let's make a combination for every choice at every position.
To do so, we will use itertools.product.
from itertools import product
combinations = product(*zip(list1, list2))
Let's get rid of all duplicate combinations.
unique_combinations = set(combinations)
Finally, let's join each combination into a single string:
all_combinations = [''.join(bases) for bases in unique_combinations]
Here all_combinations will be the desired output (the order may differ, since it comes from a set):
['GC*TCA', 'GC*TACA', 'GCATCA', 'GAATACA', 'GCATACA', 'GA*TACA', 'GAATCA', 'GA*TCA']
Of course, you can get it in a single line:
[''.join(bases) for bases in set(product(*zip(list1, list2)))]
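Putting the steps above together, a self-contained version might look like the following sketch. The list() and sorted() calls are additions that are not part of the answer: list() lets us count the raw tuples, and sorted() only makes the printed order reproducible, because a set has no defined order.

```python
from itertools import product

list1 = ['G', 'C', 'A', 'T', 'C', 'A']
list2 = ['G', 'A', '*', 'T', 'AC', 'A']

# One (list1[i], list2[i]) pair per position.
pairs = list(zip(list1, list2))

# Pick one element from each pair in every possible way.
combinations = list(product(*pairs))

# Pairs with equal elements, such as ('G', 'G'), yield repeated tuples;
# converting to a set drops the repeats.
unique_combinations = set(combinations)

# Join each tuple into one string; sorted() only fixes the display order.
all_combinations = sorted(''.join(bases) for bases in unique_combinations)

print(len(combinations), len(all_combinations))
print(all_combinations)
```

The first print shows how many tuples product generates before and after removing duplicates.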
Just like Jorge Luis's answer, but deduplicating earlier, at each position. That way we produce only the 8 combinations directly, instead of producing 64 and throwing away 56 of them.
from itertools import product
list1 = ['G','C','A','T','C','A']
list2 = ['G','A','*','T','AC','A']
result = {*map(''.join, product(*map(set, zip(list1, list2))))}
print(result)
Output (Attempt This Online!):
{'GAATCA', 'GA*TCA', 'GA*TACA', 'GC*TACA',
'GCATACA', 'GC*TCA', 'GCATCA', 'GAATACA'}
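The 64 versus 8 figures mentioned in this answer can be checked by multiplying the number of choices at each position. This is only an illustrative sketch; the names naive_count and dedup_count are made up for the example, and math.prod assumes Python 3.8 or later.

```python
from itertools import product
from math import prod

list1 = ['G', 'C', 'A', 'T', 'C', 'A']
list2 = ['G', 'A', '*', 'T', 'AC', 'A']

pairs = list(zip(list1, list2))

# Without per-position deduplication, every position offers two choices.
naive_count = prod(len(pair) for pair in pairs)

# With it, positions where both lists agree offer a single choice.
dedup_count = prod(len(set(pair)) for pair in pairs)

# The product over the deduplicated positions has exactly dedup_count tuples.
generated = list(product(*map(set, pairs)))

print(naive_count, dedup_count, len(generated))
```

Three of the six positions differ between the two lists, so only those three contribute a factor of 2.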
Alternatively, without itertools:
list1 = ['G','C','A','T','C','A']
list2 = ['G','A','*','T','AC','A']
result = {''}
for pair in zip(list1, list2):
    result = {
        r + p
        for p in set(pair)
        for r in result
    }
print(result)
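The same loop can be wrapped in a function, sketched below. The name unique_combinations and the support for more than two sequences are assumptions made for illustration, not part of the answer. The second call shows why the result is kept in a set even after deduplicating each position: different choices can still join into the same string when elements have different lengths.

```python
def unique_combinations(*sequences):
    """Hypothetical helper: build every string obtained by picking, at each
    position, the element from one of the given sequences."""
    result = {''}
    # zip stops at the shortest sequence, so extra trailing items are ignored.
    for choices in zip(*sequences):
        # Extend every partial string with every distinct choice here.
        result = {r + c for c in set(choices) for r in result}
    return result


list1 = ['G', 'C', 'A', 'T', 'C', 'A']
list2 = ['G', 'A', '*', 'T', 'AC', 'A']
print(sorted(unique_combinations(list1, list2)))

# 'A' + 'CA' and 'AC' + 'A' both give 'ACA', so only 3 strings remain.
print(sorted(unique_combinations(['A', 'CA'], ['AC', 'A'])))
```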