Finding number of occurrences of items in a dictionary of lists
Question:
I have two supplied bits of information: A dictionary of transactions and a list of the unique items in the transactions.
transactions = {
"T1": ["A", "B", "C", "E"],
"T2": ["A", "D", "E"],
"T3": ["B", "C", "E"],
"T4": ["B", "C", "D", "E"],
"T5": ["B", "D", "E"]
}
items = ["A", "B", "C", "D", "E"]
What I need to do is find the number of occurrences of these items in the transactions. I created a dictionary that has keys representing the unique items and the value for each key initialized to 0, but I am unsure of how to update these values to represent the number of occurrences in the transactions.
occurr = dict()
for x in items:
    occurr[x] = 0
This is my occurrences dictionary which yields the output:
{'A': 0, 'B': 0, 'C': 0, 'D': 0, 'E': 0}
The final dictionary should look like:
{'A': 2, 'B': 4, 'C': 3, 'D': 3, 'E': 5}
as ‘A’ occurs 2 times in the transactions, ‘B’ occurs 4 times, etc.
Answers:
You can use Counter, for example:
from collections import Counter
transactions = {
"T1": ["A", "B", "C", "E"],
"T2": ["A", "D", "E"],
"T3": ["B", "C", "E"],
"T4": ["B", "C", "D", "E"],
"T5": ["B", "D", "E"]
}
c = Counter()
for dd in transactions.values():
    c.update(dd)
print(c) # or c.items(), c.keys() or c.values()
# Result: Counter({'E': 5, 'B': 4, 'C': 3, 'D': 3, 'A': 2})
# Note that the result is a subclass of dict
This will count the frequency of all values in transactions. If you need to restrict the result to the keys present in items, then filter for that.
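One way that filtering could look, as a minimal sketch: it assumes items is the list from the question and relies on Counter returning 0 for a key it has never seen. The name occurr is borrowed from the question; nothing here is the only way to do it.

```python
from collections import Counter

transactions = {
    "T1": ["A", "B", "C", "E"],
    "T2": ["A", "D", "E"],
    "T3": ["B", "C", "E"],
    "T4": ["B", "C", "D", "E"],
    "T5": ["B", "D", "E"],
}
items = ["A", "B", "C", "D", "E"]

# Count every value that appears in any transaction.
c = Counter()
for dd in transactions.values():
    c.update(dd)

# Keep only the keys listed in items, in that order.
# Counter returns 0 for a missing key, so an item that never occurs gets 0.
occurr = {item: c[item] for item in items}
print(occurr)  # {'A': 2, 'B': 4, 'C': 3, 'D': 3, 'E': 5}
```

The result is a plain dict ordered like items, which matches the output format asked for in the question.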
Alternatively, flatten the transaction list values into a single list, and count that in one call. For example:
flatlist = [item for sublist in transactions.values() for item in sublist]
print(Counter(flatlist))
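If building the intermediate list is not wanted, the same flattening can probably be done lazily with itertools.chain.from_iterable. This is a sketch of that variant, not part of the original answer; the name counts is chosen here for illustration.

```python
from collections import Counter
from itertools import chain

transactions = {
    "T1": ["A", "B", "C", "E"],
    "T2": ["A", "D", "E"],
    "T3": ["B", "C", "E"],
    "T4": ["B", "C", "D", "E"],
    "T5": ["B", "D", "E"],
}

# chain.from_iterable yields the items of each list in turn,
# so Counter consumes them without a temporary flat list.
counts = Counter(chain.from_iterable(transactions.values()))
print(counts["E"])  # 5
```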
You are heading in the right direction. You need to iterate over the values of the dictionary.
occurr = dict()
for x in items:
    occurr[x] = 0
for transaction in transactions.values():
    for item in transaction:
        occurr[item] += 1
Alternatively, you can concatenate all the lists into a single list and call collections.Counter:
import collections
# Use a separate name so the original items list is not overwritten.
all_items = [item for transaction in transactions.values() for item in transaction]
print(collections.Counter(all_items))
Try:
transactions = {
"T1": ["A", "B", "C", "E"],
"T2": ["A", "D", "E"],
"T3": ["B", "C", "E"],
"T4": ["B", "C", "D", "E"],
"T5": ["B", "D", "E"],
}
items = ["A", "B", "C", "D", "E"]
out = {}
for transaction in transactions.values():
    for v in transaction:
        out[v] = out.get(v, 0) + 1
# Keep only the requested items; default to 0 for any that never occur.
out = {k: out.get(k, 0) for k in items}
print(out)
Prints:
{'A': 2, 'B': 4, 'C': 3, 'D': 3, 'E': 5}