GroupBy results to dictionary of lists

Question:

I have an Excel sheet that looks like this:

Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3

I'm looking to extract that data, group it by Column1, and build a dictionary that maps each Column1 value to the list of its Column3 values, like this:

{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}

This is my code so far:

excel = pandas.read_excel(r"e:test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
myTable = excel.groupby("Column1").groups
print myTable

However, my output looks like this:

{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}

Thanks!

Asked By: SuperDougDougy

||

Answers:

According to the docs, GroupBy.groups:

is a dict whose keys are the computed unique groups and corresponding
values being the axis labels belonging to each group.
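
To make the quoted behaviour concrete, here is a minimal sketch that rebuilds the question's data in memory (the DataFrame construction and the variable names are assumptions for illustration, not the asker's actual spreadsheet) and shows that groups holds row labels rather than Column3 values:

```python
import pandas as pd

# Sample data copied from the question; built in memory instead of read from Excel.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# .groups maps each group key to the index labels of the rows in that group.
groups = df.groupby("Column1").groups

# Convert each Index object to a plain list so the labels are easy to inspect.
row_labels = {key: list(labels) for key, labels in groups.items()}
print(row_labels)
```

With the default integer index, the labels for group 1 are rows 1, 2 and 3, which matches the output shown in the question.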

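In other words, groups gives you the row labels of each group, not the values, which is why your output contains row numbers. If you want the values themselves, group by 'Column1', select 'Column3', and pass list to apply so that each group's values are collected into a list.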

You can then convert it to a dict as desired:

In [5]:

dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

(Note: the trailing L in your output marks a Python 2 long integer; have a look at this SO question for the details.)
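
For anyone who wants to run this without the spreadsheet, a self-contained sketch follows. The in-memory DataFrame stands in for the Excel data and the name values_by_group is just an illustrative choice:

```python
import pandas as pd

# Sample data copied from the question; built in memory instead of read from Excel.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Group on Column1, keep only Column3, and collapse each group into a list.
# The result is a Series indexed by the Column1 values.
lists_per_group = df.groupby("Column1")["Column3"].apply(list)

# dict() turns that Series into {group key: list of Column3 values}.
values_by_group = dict(lists_per_group)
print(values_by_group)
```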

Answered By: EdChum

You could group by Column1, select Column3, call apply(list), and then call to_dict on the result:

In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

Or use a dict comprehension over the groups:

In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
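
The comprehension works because iterating over the grouped column yields (key, Series) pairs. A small sketch of that, assuming the question's data rebuilt in memory (the variable names are illustrative only):

```python
import pandas as pd

# Sample data copied from the question; built in memory instead of read from Excel.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

grouped = df.groupby("Column1")["Column3"]

# Each iteration step gives the group key and a Series of that group's values.
from_comprehension = {key: list(values) for key, values in grouped}

# The apply(list).to_dict() route should build the same mapping.
from_apply = grouped.apply(list).to_dict()

print(from_comprehension)
print(from_comprehension == from_apply)
```

Both routes should give the same dictionary, so the choice between them is mostly a matter of taste.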
Answered By: Zero