Generating all possible combinations of characters in a string

Question:

Say I have a string list:

li = ['a', 'b', 'c']

I would like to construct a new list such that each entry of the new list is a concatenation of a selection of 3 entries in the original list. Note that each entry can be chosen repeatedly:

new_li=['abc', 'acb', 'bac', 'bca', 'cab', 'cba', 'aab', 'aac',....'aaa', 'bbb', 'ccc']

The brute-force way is to write a 3-fold nested for loop and append each 3-character combination to the new list. Is there a more Pythonic way to do this? Thanks.

Update:
Later I will convert the new list into a set, so the order does not matter anyway.

Asked By: James LT

||

Answers:

This looks like a job for itertools.product.

import itertools

def foo(l):
     yield from itertools.product(*([l] * 3)) 

for x in foo('abc'):
     print(''.join(x))

aaa
aab
aac
aba
abb
abc
aca
acb
acc
baa
bab
bac
bba
bbb
bbc
bca
bcb
bcc
caa
cab
cac
cba
cbb
cbc
cca
ccb
ccc

yield from is available from Python 3.3 onwards. For older versions, yield within a loop:

def foo(l):
     for i in itertools.product(*([l] * 3)) :
         yield i
Answered By: cs95

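A clean way to get all combinations with repetition (the Cartesian product of the list with itself) is to use itertools.product with the repeat argument. Here repeat is the len of the iterable, which matches the required length of 3 only because this list has three entries (that’s where it differs from the other answer):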

from itertools import product
li = ['a', 'b', 'c']
for comb in product(li, repeat=len(li)):
    print(''.join(comb))

or if you want the result as list:

>>> combs = [''.join(comb) for comb in product(li, repeat=len(li))]
>>> combs
['aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa', 
 'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab', 
 'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']

It’s a bit cleaner to use the repeat argument than to multiply and unpack the list you have manually.
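
As a quick check, here is a minimal sketch comparing the two spellings. The helper name all_strings and its explicit length parameter are assumptions made for illustration; they are not part of either answer.

```python
from itertools import product

def all_strings(chars, length):
    """Return every string of `length` characters drawn from `chars`, repetition allowed."""
    return [''.join(t) for t in product(chars, repeat=length)]

li = ['a', 'b', 'c']

# The "multiply and unpack" spelling from the other answer.
unpacked = [''.join(t) for t in product(*([li] * 3))]

# Both spellings yield the same strings in the same order.
assert all_strings(li, 3) == unpacked
print(len(all_strings(li, 3)))  # 27
```

Passing the length explicitly keeps the call correct when the desired string length differs from the number of available characters.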

Answered By: MSeifert

import itertools

repeat = int(input("Enter length: "))

def password():
    def foo(l):
        # One copy of the alphabet per position in the string
        yield from itertools.product(*([l] * repeat))

    for x in foo('abcdefghijklmnopqrstuvwxyz'):
        # you could also use string.ascii_lowercase or ["a","b","c"]
        print(''.join(x))

password()
Answered By: Santosh Kumar

An alternate approach using list comprehension:

li = ['a', 'b', 'c']

new_li = [a+b+c for a in li for b in li for c in li]
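
Since the question mentions converting the result into a set, the same idea can be sketched as a set comprehension. The name new_set is an assumption for illustration, and itertools.product is used here only to cross-check the result.

```python
from itertools import product

li = ['a', 'b', 'c']

# Same three nested loops, collected directly into a set.
new_set = {a + b + c for a in li for b in li for c in li}

# Cross-check against itertools.product with repeat=3.
assert new_set == {''.join(t) for t in product(li, repeat=3)}
print(len(new_set))  # 27
```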
Answered By: linda

I’ll show you a way to do this without any libraries so that you can understand the logic behind how to achieve it.

First, we need to understand how to achieve all combinations mathematically.

Let’s take a look at the pattern of every possible combination of characters ranging from a-b with a length of ‘1’.

a
b

There is not much to see yet, but notice that each character in the list appears exactly once. Let’s increase our string length to ‘2’ and see what pattern emerges.

aa
ab
ba
bb

Looking at this pattern, we see a new column has been added. The far right column is the same as in the first example, with each character appearing once in a row, but the sequence now repeats. The column on the far left has 2 of each character in a row. Could it be that for every new column added, one more of each character is added? Let’s find out by increasing the string length to ‘3’.

aaa
aab
aba
abb
baa
bab
bba
bbb

We can see the two columns on the right have stayed the same, and the new column on the left has 4 of each character in a row! Not what we were expecting. So the number of repeated characters doesn’t increase by 1 for each column. Instead, it increases by powers of 2.

The first column from the right, with ‘1’ of each character in a row: 2 ^ 0 = 1

The second column, with ‘2’ of each character in a row: 2 ^ 1 = 2

The third column, with ‘4’ of each character in a row: 2 ^ 2 = 4

So with each new column added, the number of times each character repeats in a column is determined by its position, with the first column on the right being x ^ 0, then x ^ 1, then x ^ 2… and so on.

But what is x? In the example I gave x = 2. But is it always 2? Let’s take a look.

Here is every possible combination of characters from the range a-c with a length of ‘2’.

aa
ab
ac
ba
bb
bc
ca
cb
cc

In the first column on the right, each character still appears only once before the sequence repeats. This is because the first column on the right always corresponds to x ^ 0, and any nonzero number to the power of 0 is 1. In the second column, however, we see 3 of each character in a row. So if x ^ 1 describes the second column, then x = 3. Comparing the first example, with a range of a-b (2 characters), to this one, with a range of a-c (3 characters), it seems that x is always the number of characters used in your combinations.
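
The pattern can be checked with a short sketch. It uses itertools only to produce the rows for inspection, which goes against the no-library spirit of this answer, and the helper name run_lengths is an assumption for illustration.

```python
from itertools import groupby, product

def run_lengths(chars, length):
    """For each column, left to right, measure how many times the first
    character repeats before the column switches to the next character."""
    rows = [''.join(t) for t in product(chars, repeat=length)]
    result = []
    for col in range(length):
        column = [row[col] for row in rows]
        # groupby yields consecutive runs of equal items; take the first run.
        first_run = next(len(list(group)) for _, group in groupby(column))
        result.append(first_run)
    return result

print(run_lengths('ab', 3))   # [4, 2, 1]
print(run_lengths('abc', 2))  # [3, 1]
```

In both cases the counts are powers of the number of characters, read from right to left.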

With this first pattern recognised, we can start building a function that works out what each column should contain. If we want to build every combination of characters from the range a-b with a string length of 3, the function needs to know that the characters in each column repeat as follows: [4, 2, 1].

So we create a function that returns a list of numbers, one per column, giving how many times each character repeats in that column based on its position. We do this using powers.

Remember: each column should have x ^ y of each character in a row, where x is the number of characters being used (2 for the range a-b) and y is the column position, with the first column on the right being column number 0.

Example:

For combinations of the characters [‘a’, ‘b’] with a string length of 3, each set has 4 a’s and 4 b’s in the far left column, 2 a’s and 2 b’s in the next column, and 1 a and 1 b in the last column.

To return these counts as a list ordered by column, [4, 2, 1], we can do this:

def getCharPower(stringLength, charRange):
    charpowers = []
    for x in range(0, stringLength):
        charpowers.append(len(charRange)**(stringLength - x - 1))
    return charpowers
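
For comparison, the same calculation can be sketched as a list comprehension. The name char_powers is an assumption for illustration; the results should match getCharPower for the same arguments.

```python
def char_powers(string_length, char_range):
    """Return, left to right, how many times each character repeats in each column."""
    n = len(char_range)
    # The leftmost column uses the highest power, the rightmost uses n ** 0 == 1.
    return [n ** (string_length - i - 1) for i in range(string_length)]

print(char_powers(3, ['a', 'b']))       # [4, 2, 1]
print(char_powers(4, ['a', 'b']))       # [8, 4, 2, 1]
print(char_powers(2, ['a', 'b', 'c']))  # [3, 1]
```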

With the above function – if we want to create every possible combination of characters that range from a-b (2) and have a string length of 4, like so

aaaa
aaab
aaba
aabb
abaa
abab
abba
abbb
baaa
baab
baba
babb
bbaa
bbab
bbba
bbbb

then the columns hold 8, 4, 2 and 1 of each character in a row, so we want the list [8, 4, 2, 1]. With a stringLength of 4 and a charRange of ['a', 'b'], our function returns exactly [8, 4, 2, 1].

So now all we have to do is print out each character a number of times that depends on its column, using the values from our returned list.

In order to do this, though, we need to find out how many times each set is printed in its column. Take a look at the first column on the right of the previous combination example. Although a and b are each printed only once per set, the set repeats 7 more times (8 in total). If the string were only 3 characters long, it would repeat a total of 4 times.

This is because the length of our strings determines how many combinations there are in total. The formula is x ^ y = a, where x is the number of characters in our range, y is the length of the string and a is the total number of combinations possible within those specifications.
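
The formula and the column pattern can be combined in a different formulation from the one this answer builds: treat the row number as a number written in base x. This is a minimal sketch under that assumption, and the name nth_combination is hypothetical.

```python
def nth_combination(index, chars, length):
    """Build row number `index` (starting at 0) of the combination table."""
    n = len(chars)
    # Column k, counted from the right starting at 0, changes every n ** k rows.
    picked = [chars[(index // n ** k) % n] for k in range(length)]
    # picked is ordered right to left, so reverse it before joining.
    return ''.join(reversed(picked))

chars = ['a', 'b']
total = len(chars) ** 3  # x ^ y = a
rows = [nth_combination(i, chars, 3) for i in range(total)]
print(total)  # 8
print(rows)   # ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
```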

So to finalise this problem, our solution is to figure out

  1. How many of each character go into each set in each column
  2. How many times to repeat each set in each column

The first point has already been solved by our previously created function.
The second point can be solved by calculating the total number of combinations, len(charRange) ^ stringLength. Then, in a loop, we keep adding sets of characters to a column until a (the total number of possible combinations) has been reached in that column. Run that for each column and you have your result.

Here is the function that solves this

def Generator(stringLength, charRange):
    workbench = []
    results = []
    charpowers = getCharPower(stringLength, charRange)
    for x in range(0, stringLength):
        while len(workbench) < len(charRange)**stringLength:
            for char in charRange:
                for z in range(0, charpowers[x]):
                    workbench.append(char)
        results.append(workbench)
        workbench = []
    results = ["".join(result) for result in list(zip(*results))]
    return results

That function will return every possible combination of the characters you provide, at the string length you provide.
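
A usage sketch follows. It repeats the two functions so that it runs on its own, with comments added, and imports itertools only to check the result against itertools.product.

```python
from itertools import product  # used only to check the result

def getCharPower(stringLength, charRange):
    charpowers = []
    for x in range(0, stringLength):
        charpowers.append(len(charRange) ** (stringLength - x - 1))
    return charpowers

def Generator(stringLength, charRange):
    workbench = []
    results = []
    charpowers = getCharPower(stringLength, charRange)
    for x in range(0, stringLength):
        # Fill one column: repeat each character charpowers[x] times
        # until the column holds one entry per combination.
        while len(workbench) < len(charRange) ** stringLength:
            for char in charRange:
                for z in range(0, charpowers[x]):
                    workbench.append(char)
        results.append(workbench)
        workbench = []
    # Read across the columns to form each string.
    return ["".join(result) for result in zip(*results)]

combos = Generator(3, ['a', 'b'])
print(combos)  # ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
assert combos == [''.join(t) for t in product('ab', repeat=3)]
```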

A much simpler way of approaching this problem would be to write one nested for loop per character position.

So to create every possible combination of characters ranging from a-b with a length of 2

characters = ['a', 'b']
for charone in characters:
    for chartwo in characters:
        print(charone+chartwo)

Although this is a lot simpler, it is limited. This code only prints every combination with a length of 2. To create longer strings, we would have to manually add another for loop each time we wanted to change the length. The functions I provided before this code, however, will produce the combinations for whatever string length you give them, which makes them adaptable to any length without using any libraries.
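
One way to keep the nested-loop idea without hard-coding the depth is recursion, where each call plays the role of one loop. This is a sketch of a different technique from the column-based functions above, and the name all_combos is an assumption for illustration.

```python
def all_combos(chars, length):
    """Yield every string of `length` characters from `chars`, repetition allowed."""
    if length == 0:
        # Base case: exactly one string of length 0, the empty string.
        yield ''
        return
    for first in chars:
        # Each recursion level stands in for one nested for loop.
        for rest in all_combos(chars, length - 1):
            yield first + rest

print(list(all_combos(['a', 'b'], 2)))  # ['aa', 'ab', 'ba', 'bb']
```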

Answered By: cap1hunna