Repeating letters like Excel columns?
Question:
I want to create a list of strings that resemble the column letters in Microsoft Excel. For example, after 26 columns, the next columns become AA, AB, AC, etc.
I have tried using the modulus operator, but I just end up with AA, BB, CC, etc.:
import string
passes_through_alphabet = 0
for num, col in enumerate([_ for _ in range(40)]):
    if num % 26 == 0:
        passes_through_alphabet += 1
    excel_col = string.ascii_uppercase[num%26] * passes_through_alphabet
    print(num, excel_col)
0 A
1 B
2 C
3 D
...
22 W
23 X
24 Y
25 Z
26 AA
27 BB
28 CC
...
Answers:
You can use itertools.product for this:
import string
import itertools
list(itertools.product(string.ascii_uppercase, repeat=2))
Output:
[('A', 'A'), ('A', 'B'), ('A', 'C'), ('A', 'D'), ...
Combining this with the first set of letters, and joining the pairs results in:
list(
    itertools.chain(
        string.ascii_uppercase,
        (''.join(pair) for pair in itertools.product(string.ascii_uppercase, repeat=2))
    ))
Output:
['A', 'B', 'C', .. 'AA', 'AB', 'AC' .. 'ZZ']
To generalize, we define a generator that builds up bigger and bigger products. Note that yield from is only available in Python 3.3+, but you can use a for loop to yield each item if you're on Python 2.
def excel_cols():
    n = 1
    while True:
        yield from (''.join(group) for group in itertools.product(string.ascii_uppercase, repeat=n))
        n += 1

list(itertools.islice(excel_cols(), 28))
Output:
['A', 'B', 'C', ... 'X', 'Y', 'Z','AA', 'AB']
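For Python 2, where yield from is not available, the same generator could be written with an explicit loop, as the note above suggests. This is only a minimal sketch; the name excel_cols_py2 is made up for illustration and is not part of the original answer.

```python
import itertools
import string

def excel_cols_py2():
    # Hypothetical name: same idea as excel_cols, but without `yield from`.
    n = 1
    while True:
        # Yield every n-letter combination, then move on to n + 1 letters.
        for group in itertools.product(string.ascii_uppercase, repeat=n):
            yield ''.join(group)
        n += 1

first = list(itertools.islice(excel_cols_py2(), 28))
print(first[-3:])  # ['Z', 'AA', 'AB']
```

The inner for loop does exactly what yield from does here: it forwards each item of the inner iterable to the caller.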
This generator function will work with arbitrary alphabets:
import string

def labels(alphabet=string.ascii_uppercase):
    assert len(alphabet) == len(set(alphabet))  # make sure every letter is unique
    s = [alphabet[0]]
    while 1:
        yield ''.join(s)
        l = len(s)
        for i in range(l-1, -1, -1):
            if s[i] != alphabet[-1]:
                s[i] = alphabet[alphabet.index(s[i])+1]
                s[i+1:] = [alphabet[0]] * (l-i-1)
                break
        else:
            s = [alphabet[0]] * (l+1)
> x = labels(alphabet='ABC')
> print([next(x) for _ in range(20)])
['A', 'B', 'C', 'AA', 'AB', 'AC', 'BA', 'BB', 'BC', 'CA', 'CB', 'CC', 'AAA', 'AAB', ... ]
It generates the next string from the current one:
- a) Find the first character from the back that is not the last in the alphabet, e.g. != 'Z'
  b) Increment it: set it to the next alphabet letter
  c) Reset all following characters to the first alphabet character
- If no such incrementable character was found, start over with all first alphabet letters, increasing the length by 1
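The steps above can also be sketched as a single-step function that takes one label and returns its successor. The helper name next_label is an assumption for illustration; it mirrors the loop body of the generator rather than replacing it.

```python
import string

def next_label(label, alphabet=string.ascii_uppercase):
    # Hypothetical helper: returns the label that follows `label`.
    chars = list(label)
    # Scan from the back for a character that can still be incremented.
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] != alphabet[-1]:
            # Increment it and reset everything after it to the first letter.
            chars[i] = alphabet[alphabet.index(chars[i]) + 1]
            chars[i + 1:] = [alphabet[0]] * (len(chars) - i - 1)
            return ''.join(chars)
    # Every character was the last letter: start over, one character longer.
    return alphabet[0] * (len(chars) + 1)

print(next_label('AZ'))  # BA
print(next_label('ZZ'))  # AAA
```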
One can write a more readable/comprehensible function at the cost of a (much) larger memory footprint, especially if many labels are generated:
from collections import deque
import string

def labels(alphabet=string.ascii_uppercase):
    agenda = deque(alphabet)
    while agenda:
        s = agenda.popleft()
        yield s
        agenda.extend(s + c for c in alphabet)  # queue every one-letter extension of s
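To make the memory cost concrete, the sketch below is a variant of the queue-based generator that also reports how many labels are waiting in the queue. The function name labels_with_queue_size is hypothetical, and it assumes each popped label is followed by all of its one-letter extensions being queued.

```python
from collections import deque
import itertools

def labels_with_queue_size(alphabet):
    # Hypothetical variant for illustration: yields (label, pending queue length).
    agenda = deque(alphabet)
    while agenda:
        s = agenda.popleft()
        # Queue every one-letter extension of the label just taken.
        agenda.extend(s + c for c in alphabet)
        yield s, len(agenda)

sizes = [size for _, size in itertools.islice(labels_with_queue_size('ABC'), 5)]
print(sizes)  # [5, 7, 9, 11, 13]
```

Each label removes one entry and adds len(alphabet) new ones, so with the full 26-letter alphabet the queue grows by 25 entries per label produced.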
Based on this answer: https://stackoverflow.com/a/182009/6591347
def num_to_excel_col(n):
    if n < 1:
        raise ValueError("Number must be positive")
    result = ""
    while True:
        if n > 26:
            n, r = divmod(n - 1, 26)
            result = chr(r + ord('A')) + result
        else:
            return chr(n + ord('A') - 1) + result
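The conversion works because column names behave like a base-26 numbering with no zero digit, which is why 1 is subtracted before each division. Below is a compact sketch of the same idea with a few checks at the boundaries; the name col_name is made up for illustration and is not taken from the linked answer.

```python
def col_name(n):
    # Hypothetical compact variant of the 1-based number-to-column conversion.
    if n < 1:
        raise ValueError("Number must be positive")
    name = ""
    while n > 0:
        # Shift to 0-based before dividing so that 26 maps to 'Z', not 'A0'.
        n, r = divmod(n - 1, 26)
        name = chr(ord('A') + r) + name
    return name

print([col_name(n) for n in (1, 26, 27, 28, 702, 703)])
# ['A', 'Z', 'AA', 'AB', 'ZZ', 'AAA']
```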
My solution:
itertools.chain(*[itertools.product(map(chr, range(65,91)), repeat=i) for i in range(1, 10)])
Note the magic number 10: the range stops just before it, so column names have at most 9 letters.
Explanation:
First, create the letters A-Z:
map(chr, range(65,91))
Then use product to create the combinations (lengths run from 1 to 9):
itertools.product(map(chr, range(65,91)), repeat=i)
Finally, concatenate all those iterators into a single iterator using itertools.chain. Each item is a tuple of letters, so join it to get a string.
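Since product yields tuples of letters rather than strings, a joined version might look like the sketch below. The limit max_len is an assumed value chosen for illustration, and string.ascii_uppercase stands in for map(chr, range(65, 91)).

```python
import itertools
import string

max_len = 3  # assumed upper bound on the number of letters, for illustration

# Lazily produce 'A'..'Z', then 'AA'..'ZZ', and so on up to max_len letters.
cols = (
    ''.join(letters)
    for length in range(1, max_len + 1)
    for letters in itertools.product(string.ascii_uppercase, repeat=length)
)

first = list(itertools.islice(cols, 30))
print(first[24:])  # ['Y', 'Z', 'AA', 'AB', 'AC', 'AD']
```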
I think this would be easier:
cols = (
    [chr(x) for x in range(65, 91)]
    + [chr(x) + chr(y) for x in range(65, 91) for y in range(65, 91)]
    + [chr(x) + chr(y) + chr(z) for x in range(65, 91) for y in range(65, 91) for z in range(65, 91)]
)