Create all possible variations for a string with hyphens
Question:
I have a list of hyphenated strings, e.g.:
myList = ['mother-in-law', 'co-operation', 'sixty-nine-eighty-ninths']
For every element of this list, I want to create all the variations in which a hyphen remains between at least two of its tokens and the remaining hyphens are replaced by spaces:
mother-in law
mother in-law
sixty-nine eighty ninths
sixty-nine-eighty ninths
sixty nine-eighty-ninths
sixty-nine eighty-ninths
sixty nine-eighty ninths
sixty nine eighty-ninths
...
I tried the solution from this question (Create variations of a string), but I can't figure out how to adapt it:
from itertools import combinations

myList = ['mother-in-law', 'co-operation', 'sixty-nine-eighty-ninths']
for e in myList:
    for i in range(len(e.split("-"))):
        for indices in combinations(range(len(e.split("-"))), i):
            print(''.join([e.split("-")[x] if x in indices else '-' for x in range(len(e))]))
This is what I get:
-------------
mother------------
-in-----------
--law----------
motherin-----------
mother-law----------
-inlaw----------
------------
co-----------
-operation----------
------------------------
sixty-----------------------
-nine----------------------
--eighty---------------------
---ninths--------------------
sixtynine----------------------
sixty-eighty---------------------
sixty--ninths--------------------
-nineeighty---------------------
-nine-ninths--------------------
--eightyninths--------------------
sixtynineeighty---------------------
sixtynine-ninths--------------------
sixty-eightyninths--------------------
-nineeightyninths--------------------
Answers:
It might be a little easier to just make your own generator to produce the combinations. This can be done in a very readable way with a recursive generator so long as your strings aren’t gigantic enough to run into stack limits:
def hyphenCombos(s):
    head, _, rest = s.partition('-')
    if len(rest) == 0:
        yield head
    else:
        for c in hyphenCombos(rest):
            yield f'{head}-{c}'
            yield f'{head} {c}'

s = 'sixty-nine-eighty-ninths'
list(hyphenCombos(s))
Result:
['sixty-nine-eighty-ninths',
'sixty nine-eighty-ninths',
'sixty-nine eighty-ninths',
'sixty nine eighty-ninths',
'sixty-nine-eighty ninths',
'sixty nine-eighty ninths',
'sixty-nine eighty ninths',
'sixty nine eighty ninths']
With that, you can use it in a comprehension or pass it to other itertools functions to do whatever you need:
from itertools import chain

myList = ['mother-in-law', 'co-operation', 'sixty-nine-eighty-ninths']
list(chain.from_iterable(hyphenCombos(s) for s in myList))
# or variations...
# [list(hyphenCombos(s)) for s in myList]
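Note that the generator above also yields the variant in which every hyphen has become a space (the last item in the result), which the question seems to want excluded. A minimal sketch of one way to filter it out follows; the wrapper name hyphen_combos_keep_hyphen is hypothetical and not part of the original answer:

```python
def hyphenCombos(s):
    # Same recursive generator as in the answer above.
    head, _, rest = s.partition('-')
    if len(rest) == 0:
        yield head
    else:
        for c in hyphenCombos(rest):
            yield f'{head}-{c}'
            yield f'{head} {c}'


def hyphen_combos_keep_hyphen(s):
    # Hypothetical helper: keep only variants that still contain a hyphen,
    # which drops the single all-spaces variant.
    return [v for v in hyphenCombos(s) if '-' in v]


variants = hyphen_combos_keep_hyphen('mother-in-law')
print(variants)
# ['mother-in-law', 'mother in-law', 'mother-in law']
```

The filter relies on the tokens themselves containing no hyphens, which holds here because the string was split on every hyphen.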
Looking a little bit through the tools that itertools provides, I found product could be most useful here. It lets us go through all the possibilities of having a space or a dash between two words.
from itertools import product, zip_longest

my_list = ['mother-in-law', 'co-operation', 'sixty-nine-eighty-ninths']
symbols = ' ', '-'
for string in my_list:
    string_split = string.split('-')
    for symbols_product in product(symbols, repeat=len(string_split)-1):
        if '-' not in symbols_product:
            continue
        rtn = ""
        for word, symbol in zip_longest(string_split, symbols_product, fillvalue=''):
            rtn += word + symbol
        print(rtn)
    print()
Also, I’m skipping the iterations where there’s no dash between any two words, as per your request.
Output:
mother in-law
mother-in law
mother-in-law
co-operation
sixty nine eighty-ninths
sixty nine-eighty ninths
sixty nine-eighty-ninths
sixty-nine eighty ninths
sixty-nine eighty-ninths
sixty-nine-eighty ninths
sixty-nine-eighty-ninths
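If you would rather collect the variants than print them, the same product-based idea can be wrapped in a function. This is only a sketch: the name hyphen_variants and its symbols parameter are my own assumptions, not something from the question.

```python
from itertools import product


def hyphen_variants(string, symbols=(' ', '-')):
    # Hypothetical helper: return every variant of `string` in which
    # at least one separator between tokens is still a hyphen.
    words = string.split('-')
    results = []
    # One separator is needed for each gap between adjacent words.
    for seps in product(symbols, repeat=len(words) - 1):
        if '-' not in seps:
            continue  # skip the variant with no hyphen at all
        pieces = [words[0]]
        for sep, word in zip(seps, words[1:]):
            pieces.append(sep + word)
        results.append(''.join(pieces))
    return results


print(hyphen_variants('mother-in-law'))
# ['mother in-law', 'mother-in law', 'mother-in-law']
```

For a string with n tokens there are 2**(n-1) ways to fill the gaps, and exactly one of them has no hyphen, so the function returns 2**(n-1) - 1 variants (7 for 'sixty-nine-eighty-ninths').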