Don't worry, I just misindented
Question:
I have this dataset after filtering:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected].
I want to count the number of emails for each user using a dictionary.
I used this code (this data is called emails):
    dic=dict()
    lst=list()
    handle = open('mbox-short.txt')
    for words in handle:
        if words.startswith("From "):
            g=words.split()
            emails=g[1]
            lst.append(emails)
            for x in lst:
                dic[x]=dic.get(x,0)+1
    print(dic)
And these are the results:
{'[email protected]': 34, '[email protected]': 37, '[email protected]': 79, '[email protected]': 46, '[email protected]': 47, '[email protected]': 53, '[email protected]': 15, '[email protected]': 13, '[email protected]': 12, '[email protected]': 38, '[email protected]': 4}
Why is the count so high compared to the actual number of emails, and how can I fix this?
Sorry for the mess; it is my first time posting here.
Answers:
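The second for loop must be outside of the first one. 🙂 When it runs inside the first loop, the whole list is counted again every time an address is appended, which inflates the totals.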
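    dic=dict()
    lst=list()
    handle = open('mbox-short.txt')
    for words in handle:
        # Collect the sender address from every "From " line.
        if words.startswith("From "):
            g=words.split()
            emails=g[1]
            lst.append(emails)
    # Count only once, after the whole file has been read.
    for x in lst:
        dic[x]=dic.get(x,0)+1
    print(dic)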
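To see why the nested loop inflates the counts, here is a small sketch that runs both versions on a few made-up lines. The sample lines, the addresses, and the function names `count_nested` and `count_once` are hypothetical and only stand in for `mbox-short.txt`. With n matching lines, the nested version adds 1 + 2 + ... + n = n(n+1)/2 in total, which matches the question: 27 addresses are listed, and the reported counts sum to 378 = 27 * 28 / 2.

```python
from collections import Counter

# Hypothetical sample lines standing in for mbox-short.txt (not the real data).
lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "Subject: hello",
    "From bob@example.com Fri Jan  4 18:10:48 2008",
    "From: bob@example.com",  # "From:" header, not matched by startswith("From ")
    "From alice@example.com Fri Jan  4 16:10:39 2008",
]


def count_nested(lines):
    """Buggy version: the counting loop runs again after every append."""
    counts = {}
    senders = []
    for line in lines:
        if line.startswith("From "):
            senders.append(line.split()[1])
            for sender in senders:  # re-counts the whole list each time
                counts[sender] = counts.get(sender, 0) + 1
    return counts


def count_once(lines):
    """Fixed version: each matching line is counted exactly once."""
    counts = {}
    for line in lines:
        if line.startswith("From "):
            sender = line.split()[1]
            counts[sender] = counts.get(sender, 0) + 1
    return counts


nested = count_nested(lines)
fixed = count_once(lines)
# Alternative using the standard library's Counter; same result as count_once.
counter = Counter(line.split()[1] for line in lines if line.startswith("From "))

print(nested)
print(fixed)
print(sum(nested.values()), sum(fixed.values()))
```

Note that `count_once` does not need the intermediate list at all: the dictionary can be updated directly inside the `if` block, which also makes this kind of indentation slip harder to reintroduce.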