Turning a collections Counter into a dictionary

Question:

I have a Counter produced by this call:

Counter(df.email_address)

It returns each email address together with the number of times it occurs:

Counter({nan: 1618, '[email protected]': 265, '[email protected]': 1})

I want to treat it like a dictionary and build a pandas DataFrame from it with two columns: one for the email addresses and one for the associated counts.

I tried with:

dfr = repeaters.from_dict(repeaters, orient='index')

but I got the following error:

AttributeError: 'Counter' object has no attribute 'from_dict'

This makes me think that a Counter is not a dictionary, even though it looks like one. Any idea how to turn it into a DataFrame?

Asked By: Blue Moon

||

Answers:

d = {}
cnt = Counter(df.email_address)
for key, value in cnt.items():
    d[key] = value

EDIT

Or, as @Trif Nefzger suggested:

d = dict(Counter(df.email_address))
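
Both forms give the same result. A minimal sketch, assuming a small made-up list of addresses in place of df.email_address (the sample values are hypothetical):

```python
from collections import Counter

# Hypothetical sample data standing in for df.email_address
emails = ["a@example.com", "b@example.com", "a@example.com", "a@example.com"]

cnt = Counter(emails)

# Counter subclasses dict, so it already supports the dict interface
is_dict = isinstance(cnt, dict)

# dict() makes a plain-dict copy with the same keys and counts
d = dict(cnt)

print(is_dict)
print(d)
```

Since the Counter already behaves like a dict, the copy is only needed when some code insists on the exact dict type.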
Answered By: doru

Alternatively you could use pd.Series.value_counts, which returns a Series object.

df.email_address.value_counts(dropna=False)

Sample output:

[email protected]    2
[email protected]    1
NaN        1
dtype: int64

This is not exactly what you asked for but looks like what you’d like to achieve.
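
If the two-column DataFrame is the real goal, the Series from value_counts can be reshaped into one. This is only a sketch under assumptions: the sample frame and the column name "count" are made up for illustration.

```python
import pandas as pd

# Hypothetical sample column standing in for df.email_address
df = pd.DataFrame(
    {"email_address": ["a@example.com", None, "a@example.com", "b@example.com"]}
)

counts = (
    df["email_address"]
    .value_counts(dropna=False)    # Series: index = address, values = counts
    .rename_axis("email_address")  # name the index so it becomes a named column
    .reset_index(name="count")     # move the index into a regular column
)
print(counts)
```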

Answered By: ldirer

As ajcr noted in the comments, from_dict is a method of DataFrame, not of Counter, so you can write the following to achieve your goal:

from collections import Counter
import pandas as pd

repeaters = Counter({"nan": 1618, '[email protected]': 265, '[email protected]': 1})

dfr = pd.DataFrame.from_dict(repeaters, orient='index')
print(dfr)

Output:

[email protected]     1
nan                           1618
[email protected]            265
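
The result above keeps the addresses in the index. To get them as a regular column next to the counts, one possible follow-up is sketched below; the sample counts and the column names "email_address" and "count" are assumptions, not part of the original code.

```python
from collections import Counter
import pandas as pd

# Hypothetical counts standing in for Counter(df.email_address)
repeaters = Counter({"a@example.com": 265, "b@example.com": 1})

dfr = (
    pd.DataFrame.from_dict(repeaters, orient="index", columns=["count"])
    .rename_axis("email_address")  # give the index a name
    .reset_index()                 # turn the index into a regular column
)
print(dfr)
```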
Answered By: omri_saadon

Not sure why there are so many convoluted approaches.

  1. Counter is a dict subclass, so you can pass it to anything that expects a dict.
class Counter(dict):
    '''Dict subclass for counting hashable items...
  2. If you really want to convert a Counter to a dict:
>>> d1 = dict(cntr)
>>> d1
{nan: 1618, '[email protected]': 265, '[email protected]': 1}
>>> 
>>> 
>>> d2 = {k: v for k, v in cntr.items()}
>>> d2
{nan: 1618, '[email protected]': 265, '[email protected]': 1}
>>> 
  3. To create a pandas DataFrame from a Counter, use pandas.DataFrame.from_dict(). It takes a dict shaped in one of two ways:
    • {'col_name1': [r1c1, r2c1, ...], 'col_name2': [r1c2, r2c2, ...], ...} (the default, orient='columns') OR
    • {'row_id1': [r1c1, r1c2, ...], 'row_id2': [r2c1, r2c2, ...], ...} (with orient='index')

where rNcM is the value in the Nth row and Mth column.
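
The two dict shapes can be sketched side by side. The row labels, column names, and values here are hypothetical and only illustrate the layout:

```python
import pandas as pd

# Column-oriented (default orient="columns"): keys are column names
by_columns = pd.DataFrame.from_dict({"col_a": [1, 2], "col_b": [3, 4]})

# Row-oriented (orient="index"): keys are row labels
by_rows = pd.DataFrame.from_dict(
    {"row_1": [1, 3], "row_2": [2, 4]},
    orient="index",
    columns=["col_a", "col_b"],
)

print(by_columns)
print(by_rows)
```

Both frames hold the same cell values; only the way the dict describes them differs.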

>>> from collections import Counter
>>> cntr = Counter({float('nan'): 1618, '[email protected]': 265, '[email protected]': 1})
>>> cntr
Counter({nan: 1618, '[email protected]': 265, '[email protected]': 1})
>>> 
>>> import pandas as pd
>>> pdf = pd.DataFrame.from_dict({'emails': cntr.keys(), 'repeatation_count': cntr.values()})
>>> print(pdf.to_string())
                         emails  repeatation_count
0                           NaN               1618
1           [email protected]                265
2  [email protected]                  1
>>> 
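
If the rows should come out ordered by count, another option is to feed Counter.most_common() to the DataFrame constructor. A minimal sketch, assuming made-up addresses and the column names "email_address" and "count":

```python
from collections import Counter
import pandas as pd

# Hypothetical counts standing in for Counter(df.email_address)
cntr = Counter({"a@example.com": 265, "b@example.com": 1, "c@example.com": 1618})

# most_common() returns (key, count) pairs sorted by count, highest first
pdf = pd.DataFrame(cntr.most_common(), columns=["email_address", "count"])
print(pdf)
```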
Answered By: Kashyap