Don't worry I just misindented

Question

I have this dataset after filtering:

[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected].

I want to count the number of emails for each user using a dictionary.
I used the following code (the data is called emails):

dic = dict()
lst = list()
handle = open('mbox-short.txt')
for words in handle:
    if words.startswith("From "):
        g = words.split()
        emails = g[1]
        lst.append(emails)
        for x in lst:
            dic[x] = dic.get(x, 0) + 1
print(dic)

These are the results:

{'[email protected]': 34, '[email protected]': 37, '[email protected]': 79, '[email protected]': 46, '[email protected]': 47, '[email protected]': 53, '[email protected]': 15, '[email protected]': 13, '[email protected]': 12, '[email protected]': 38, '[email protected]': 4}

Why is the count so high compared to the actual number of emails, and how can I fix this?
Sorry for the mess; it is my first time posting here.

Asked By: Mamdouh Dabjan

Answer 1

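The second for loop must be outside the first one. As written, every time a line starting with "From " is found, the whole list collected so far is counted again, so addresses added earlier are counted many times over. 🙂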

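dic = dict()
lst = list()

# Collect the sender address from every line that starts with "From ".
with open('mbox-short.txt') as handle:
    for words in handle:
        if words.startswith("From "):
            g = words.split()
            emails = g[1]
            lst.append(emails)

# Count once, after the whole file has been read.
for x in lst:
    dic[x] = dic.get(x, 0) + 1
print(dic)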
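
As a possible alternative, the intermediate list can be skipped by counting each address as soon as it is read. The sketch below is not part of the original answer: the function name count_senders and the sample lines (with made-up example.com addresses) are assumptions for illustration, and collections.Counter from the standard library stands in for the dic.get(x, 0) + 1 pattern.

```python
from collections import Counter


def count_senders(lines):
    """Count how often each address appears on lines starting with 'From '."""
    counts = Counter()
    for line in lines:
        # "From " (with a trailing space) skips header lines such as "From:".
        if line.startswith("From "):
            parts = line.split()
            if len(parts) > 1:  # guard against a bare "From " line
                counts[parts[1]] += 1  # count each address exactly once per line
    return dict(counts)


# Hypothetical sample lines standing in for the contents of the mbox file.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "From bob@example.com Fri Jan  4 18:10:48 2008",
    "From alice@example.com Fri Jan  4 16:10:39 2008",
]
print(count_senders(sample))
```

To use it on the real file, an open file object can be passed in place of the sample list, for example count_senders(open('mbox-short.txt')), since a file is also an iterable of lines.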

Answered By: Behzad Aslani Avilaq
