String Replacement Combinations

Question:

So I have a string ‘1xxx1’ and I want to replace a certain number (maybe all maybe none) of x’s with a character, let’s say ‘5’. I want all possible combinations (…maybe permutations) of the string where x is either substituted or left as x. I would like those results stored in a list.

So the desired result would be

>>> myList = GenerateCombinations('1xxx1', '5')
>>> print myList
['1xxx1','15xx1','155x1','15551','1x5x1','1x551','1xx51']

Obviously I’d like it to be able to handle strings of any length with any amount of x’s as well as being able to substitute any number. I’ve tried using loops and recursion to figure this out to no avail. Any help would be appreciated.

Asked By: Hoopdady


Answers:

How about:

from itertools import product

def filler(word, from_char, to_char):
    options = [(c,) if c != from_char else (from_char, to_char) for c in word]
    return (''.join(o) for o in product(*options))

which gives

>>> filler("1xxx1", "x", "5")
<generator object <genexpr> at 0x8fa798c>
>>> list(filler("1xxx1", "x", "5"))
['1xxx1', '1xx51', '1x5x1', '1x551', '15xx1', '15x51', '155x1', '15551']

(Note that your expected output seems to be missing 15x51.)

Basically, first we make a list of every possible target for each letter in the source word:

>>> word = '1xxx1'
>>> from_char = 'x'
>>> to_char = '5'
>>> [(c,) if c != from_char else (from_char, to_char) for c in word]
[('1',), ('x', '5'), ('x', '5'), ('x', '5'), ('1',)]

And then we use itertools.product to get the Cartesian product of these possibilities and join the results together.
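To make the Cartesian-product step concrete, here is a minimal sketch that looks at the raw tuples before they are joined. It reuses the names from the answer; the variables raw and results are only for illustration.

```python
from itertools import product

word, from_char, to_char = "1xxx1", "x", "5"

# One tuple of choices per position; fixed characters get a single choice.
options = [(c,) if c != from_char else (from_char, to_char) for c in word]

# product(*options) yields one tuple per way of picking a choice at each position.
raw = list(product(*options))
results = ["".join(t) for t in raw]

print(raw[1])        # ('1', 'x', 'x', '5', '1')
print(len(results))  # 8, i.e. 2 ** (number of x's)
```

The rightmost position varies fastest, which is why '1xx51' comes second in the output.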

For bonus points, modify to accept a dictionary of replacements. :^)
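One possible take on that exercise, as a sketch only: the name filler_multi and the shape of the mapping (a character mapped to an iterable of substitutes) are assumptions, not part of the answer above.

```python
from itertools import product

def filler_multi(word, replacements):
    """Yield every variant of word.

    replacements maps a character to an iterable of substitutes
    (hypothetical shape chosen for this sketch). Each character may
    also stay as it is.
    """
    # Each position offers the original character plus any substitutes.
    options = [(c, *replacements.get(c, ())) for c in word]
    return ("".join(o) for o in product(*options))

print(list(filler_multi("ax", {"a": "4@", "x": "5"})))
# ['ax', 'a5', '4x', '45', '@x', '@5']
```

With {"x": "5"} as the mapping, this should behave like filler(word, "x", "5").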

Answered By: DSM

Generate the candidate values for each possible position – even if there is only one candidate for most positions – then create a Cartesian product of those values.

In the OP’s example, the candidates are ['x', '5'] for any position where an 'x' appears in the input; for each other position, the candidates are a list with a single possibility (the original letter). Thus:

def candidates(letter):
    return ['x', '5'] if letter == 'x' else [letter]

Then we can produce the patterns by producing a list of candidates for positions, using itertools.product, and combining them:

from itertools import product

def combine(candidate_list):
    return ''.join(candidate_list)

def patterns(data):
    all_candidates = [candidates(element) for element in data]
    for result in product(*all_candidates):
        yield combine(result)

Let’s test it:

>>> list(patterns('1xxx1'))
['1xxx1', '1xx51', '1x5x1', '1x551', '15xx1', '15x51', '155x1', '15551']

Notice that the algorithm in the generator is fully general; all that varies is the detail of how to generate candidates and how to process results. For example, suppose we want to replace "placeholders" within a string – then we need to split the string into placeholders and non-placeholders, and have a candidates function that generates all the possible replacements for placeholders, and the literal string for non-placeholders.

For example, with this setup:

keywords = {'wouldyou': ["can you", "would you", "please"], 'please': ["please", "ASAP"]}

template = '((wouldyou)) give me something ((please))'

First we would split the template, for example with a regular expression:

import re

def tokenize(t):
    return re.split(r'(\(\(.*?\)\))', t)

This tokenizer yields an empty string at the start or end whenever the template begins or ends with a placeholder, but this doesn’t cause a problem:

>>> tokenize(template)
['', '((wouldyou))', ' give me something ', '((please))', '']
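A quick way to see why the empty strings are harmless, sketched with a split pattern whose literal parentheses are escaped: joining the tokens restores the template exactly, and filtering the empty strings out is optional. The name non_empty is only for illustration.

```python
import re

template = '((wouldyou)) give me something ((please))'

# The capturing group keeps the matched placeholders in the split result.
tokens = re.split(r'(\(\(.*?\)\))', template)

# Empty tokens contribute nothing when joined, so the round trip is exact.
assert ''.join(tokens) == template

# Dropping them is possible but not required.
non_empty = [t for t in tokens if t]
print(non_empty)
# ['((wouldyou))', ' give me something ', '((please))']
```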

To generate replacements, we can use something like:

def candidates(part):
    if part.startswith('((') and part.endswith('))'):
        return keywords.get(part[2:-2], [part[2:-2]])
    else:
        return [part]

That is: placeholder-parts are identified by the parentheses, stripped of those parentheses, and looked up in the dictionary.

Trying it with the other existing definitions:

>>> list(patterns(tokenize(template)))
['can you give me something please', 'can you give me something ASAP', 'would you give me something please', 'would you give me something ASAP', 'please give me something please', 'please give me something ASAP']

To generalize patterns properly, rather than relying on the global functions combine and candidates, we should use dependency injection: pass them in as parameters, which makes patterns a higher-order function. Thus:

from itertools import product

def patterns(data, candidates, combine):
    all_candidates = [candidates(element) for element in data]
    for result in product(*all_candidates):
        yield combine(result)

Now the same core code handles any problem of this shape; only the two injected functions change. Examples might look like:

def euler_51(s):
    for pattern in patterns(
        s,
        lambda letter: ['x', '5'] if letter == 'x' else [letter],
        ''.join
    ):
        print(pattern)

euler_51('1xxx1')

or

def replace_in_template(template, replacement_lookup):
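    # Split on ((name)) placeholders; the capturing group keeps them as tokens.
    tokens = re.split(r'(\(\(.*?\)\))', template)
    return list(patterns(
        tokens, 
        lambda part: (
            # Use the lookup passed in, not a global dictionary.
            replacement_lookup.get(part[2:-2], [part[2:-2]])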
            if part.startswith('((') and part.endswith('))')
            else [part]
        ),
        ''.join
    ))

replace_in_template(
    '((wouldyou)) give me something ((please))',
    {
        'wouldyou': ["can you", "would you", "please"],
        'please': ["please", "ASAP"]
    }
)
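
As one more illustration of the injected form, here is a sketch applied to a different problem: generating upper/lower case variants of a string. The helper case_variants is hypothetical; patterns is repeated so the snippet runs on its own.

```python
from itertools import product

def patterns(data, candidates, combine):
    all_candidates = [candidates(element) for element in data]
    for result in product(*all_candidates):
        yield combine(result)

def case_variants(s):
    # Hypothetical helper: letters may be lower or upper case; others stay fixed.
    return list(patterns(
        s,
        lambda ch: [ch.lower(), ch.upper()] if ch.isalpha() else [ch],
        ''.join,
    ))

print(case_variants('a1b'))
# ['a1b', 'a1B', 'A1b', 'A1B']
```
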
Answered By: Karl Knechtel