How to get itertools.product to generate strings instead of tuples of chars and then brace each string with additional strings?
Question:
I don’t know much about Python data structures, but basically I’m generating every two-letter sequence of lowercase letters (the Cartesian product, repeats included):
import itertools, string
k = itertools.product(string.ascii_lowercase, repeat = 2)
list(k)
[('a', 'a'), ('a', 'b'), ('a', 'c'), ..., ('z', 'x'), ('z', 'y'), ('z', 'z')]
I need to perform 2 operations on the generator k that will preserve it as a generator if possible:
1) Concatenate each tuple so list(k) would return:
["aa", "ab", ..., "zy", "zz"]
2) Brace each string with additional strings "str1" and "str2", so after step 1) and step 2) list(k) would return:
["str1aastr2", "str1abstr2", ..., "str1zystr2", "str1zzstr2"]
How should I proceed to get a generator like that, so I can feed it to Scrapy’s start_urls?
Answers:
You can use a generator expression to do the complete action. Not sure I understand the need to keep it as a generator if you immediately call list() on it:
>>> import itertools as it
>>> from string import ascii_lowercase
>>> k = ('str1{}str2'.format(''.join(s)) for s in it.product(ascii_lowercase, repeat=2))
>>> next(k)
'str1aastr2'
>>> list(k)
['str1abstr2', 'str1acstr2', 'str1adstr2', 'str1aestr2', ...]
Note: 'str1aastr2' is missing from the list because the earlier next(k) call had already consumed it.
Or a slightly different construct:
>>> k = (f'str1{c1}{c2}str2' for c1, c2 in it.product(ascii_lowercase, repeat=2))
>>> next(k)
'str1aastr2'
>>> next(k)
'str1abstr2'
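If the prefix, suffix, or length need to vary, the same idea can be wrapped in a small helper. This is only a sketch: the function name braced_products and its parameters are made up for illustration and are not part of itertools or Scrapy. It joins each tuple with map and then braces the result, staying lazy the whole way:

```python
import itertools
import string


def braced_products(prefix, suffix, alphabet=string.ascii_lowercase, repeat=2):
    """Lazily yield prefix + letters + suffix for each tuple in the product.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # map() is lazy in Python 3, so no tuple is joined until it is requested.
    joined = map("".join, itertools.product(alphabet, repeat=repeat))
    return (f"{prefix}{letters}{suffix}" for letters in joined)


urls = braced_products("str1", "str2")

# islice takes a few items without building the full list.
first_three = list(itertools.islice(urls, 3))
print(first_three)  # ['str1aastr2', 'str1abstr2', 'str1acstr2']
```

As with next(k) above, the three items taken by islice are consumed, so a later list(urls) would start at 'str1adstr2'.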
You can create a new generator that will produce the desired values:
import itertools
import string

def g():
    start = ["str1"]
    end = ["str2"]
    for item in itertools.product(string.ascii_lowercase, repeat=2):
        # item is a tuple such as ('a', 'b'); join it between the two braces
        yield "".join(start + list(item) + end)
Example:
>>> gen = g()
>>> list(gen)[:10]
['str1aastr2', 'str1abstr2', 'str1acstr2', 'str1adstr2', 'str1aestr2', 'str1afstr2', 'str1agstr2', 'str1ahstr2', 'str1aistr2', 'str1ajstr2']
After gen = g() you’ve got a generator object that you can use with Scrapy.
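One caveat: a generator object can only be iterated once. I have not checked how many times Scrapy reads start_urls, so treat the following as a precaution rather than a requirement. If whatever consumes the values might loop over them more than once, a small class with an __iter__ method gives a fresh generator on every pass while still producing values lazily. The class name BracedProducts is hypothetical, chosen just for this sketch:

```python
import itertools
import string


class BracedProducts:
    """Re-iterable, lazy source of prefix + letters + suffix strings.

    Hypothetical helper for illustration; not part of itertools or Scrapy.
    """

    def __init__(self, prefix, suffix, repeat=2):
        self.prefix = prefix
        self.suffix = suffix
        self.repeat = repeat

    def __iter__(self):
        # Each call to iter() starts a brand-new pass over the product.
        for item in itertools.product(string.ascii_lowercase, repeat=self.repeat):
            yield self.prefix + "".join(item) + self.suffix


urls = BracedProducts("str1", "str2")

first_pass = list(urls)
second_pass = list(urls)   # not empty: __iter__ created a new generator

gen = iter(urls)
exhausted_once = list(gen)
exhausted_twice = list(gen)  # empty: a plain generator is single-use

print(len(first_pass), len(second_pass), len(exhausted_twice))  # 676 676 0
```

A plain generator such as g() is enough if the values are only read once.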