Find permutations which also match other constraints
Question:
Given the following list:
dbset = [[{'id': '10556', 'nation': 'France', 'worth': '70'}], [{'id': '14808', 'nation': 'France', 'worth': '65'}], [{'id': '11446', 'nation': 'Ghana', 'worth': '69'}], [{'id': '11419', 'nation': 'France', 'worth': '69'}], [{'id': '11185', 'nation': 'Ghana', 'worth': '69'}], [{'id': '1527', 'nation': 'Ghana', 'worth': '64'}], [{'id': '12714', 'nation': 'Moldova', 'worth': '67'}], [{'id': '2855', 'nation': 'Moldova', 'worth': '63'}], [{'id': '9620', 'nation': 'Moldova', 'worth': '71'}]]
I know how to find all permutations of length 4 with:
from itertools import permutations
perms = permutations(dbset,4)
Now comes the part where I struggle: I want each nation to appear at most 2 times in a permutation, and I also want the total worth to be above 300.
—Update—
I managed to get this working for the limited sample with limited permutations. The real data set, however, has over 16000 records and the permutation size is 11. It is still on its first run, with 2 criteria: avg worth = 80 and nation occurrence <= 5.
It’s been running for over an hour now… any way to improve?
Answers:
What you want is as simple as:
perms = list(permutations(dbset,4))
out = [x for x in perms if CONDITION]
The second condition is as simple as:
out = [x for x in perms if sum(int(country[0]["worth"]) for country in x) > 300]
Note that this will be empty in your case: the maximum worth of any record is 71, while four records would need to average more than 75 (300/4) to exceed 300.
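To check that claim against the sample data, a quick sketch: even the four largest worth values in the question's list do not reach 300. The names worths and best_total are just illustrative.

```python
# Worth values of the nine sample records from the question.
worths = [70, 65, 69, 69, 69, 64, 67, 63, 71]

# Best case for a group of four: take the four largest values.
best_total = sum(sorted(worths, reverse=True)[:4])
print(best_total)  # 279, which is below 300
```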
I’ll let you figure out how to implement the first condition; it is very similar. Note: the "and" operator is your friend!
Write a function that tests your conditions. Then filter the permutations using it.
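from collections import Counter
from itertools import permutations

def valid_permutation(perm):
    # Reject the permutation unless its total worth is above 300.
    if sum(int(p['worth']) for p in perm) <= 300:
        return False
    # Accept only if the most frequent nation appears at most twice.
    counts = Counter(p['nation'] for p in perm)
    return counts.most_common(1)[0][1] <= 2

# Each dbset entry is a one-element list, so unwrap it first.
perms = filter(valid_permutation, permutations([d[0] for d in dbset], 4))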
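Neither test depends on the order of the records, so one option for the larger run mentioned in the update may be to filter combinations instead of permutations: every valid combination of size k stands for k! valid orderings, so far fewer candidates need to be checked. The following is only a sketch on the question's sample data. The helper valid_group and its min_total and max_per_nation parameters are made up for illustration, and the threshold is lowered to 270 so that the sample produces results.

```python
from collections import Counter
from itertools import combinations, permutations
from math import factorial

# Sample records from the question, unwrapped from their one-element lists.
records = [
    {'id': '10556', 'nation': 'France', 'worth': '70'},
    {'id': '14808', 'nation': 'France', 'worth': '65'},
    {'id': '11446', 'nation': 'Ghana', 'worth': '69'},
    {'id': '11419', 'nation': 'France', 'worth': '69'},
    {'id': '11185', 'nation': 'Ghana', 'worth': '69'},
    {'id': '1527', 'nation': 'Ghana', 'worth': '64'},
    {'id': '12714', 'nation': 'Moldova', 'worth': '67'},
    {'id': '2855', 'nation': 'Moldova', 'worth': '63'},
    {'id': '9620', 'nation': 'Moldova', 'worth': '71'},
]

def valid_group(group, min_total=270, max_per_nation=2):
    """Hypothetical helper: order-independent check of both conditions."""
    if sum(int(r['worth']) for r in group) <= min_total:
        return False
    counts = Counter(r['nation'] for r in group)
    return max(counts.values()) <= max_per_nation

# Filter the unordered groups, then compare with filtering every ordering.
valid_combos = [c for c in combinations(records, 4) if valid_group(c)]
valid_perms = [p for p in permutations(records, 4) if valid_group(p)]

# Each valid combination accounts for 4! = 24 valid permutations.
print(len(valid_perms) == factorial(4) * len(valid_combos))
```

If the ordered results are really needed, they can be generated afterwards from each valid combination with permutations(combo). This alone is unlikely to make choosing 11 out of 16000 records feasible by brute force; that would probably need pruning while the group is being built.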