Determine number of unique values of one column for each value of another column

Question:

I have:

df = pd.DataFrame({"A": [1,2,3,4,5,6,7,8], "B": [1,1,2,2,3,3,4,4], "C": [1,1,1,1,2,3,2,2] })

    A   B   C
0   1   1   1
1   2   1   1
2   3   2   1
3   4   2   1
4   5   3   2
5   6   3   3
6   7   4   2
7   8   4   2

I would like to know, for each value b of column B, how many unique values c of column C there that are in rows where B=b.

So I’d like a series that tells me something like {1:1, 2:2, 3:2, 4:1} meaning that, for example, when B=3, there are two unique values of C (namely 2 and 3).

How do I do this?

Asked By: Smithey

||

Answers:

Try this:

df.groupby("B")["C"].nunique()

> B
  1    1
  2    1
  3    2
  4    1
  Name: C, dtype: int64
Answered By: ewz93
df.groupby("B")["C"].nunique().to_dict()

output:

{1: 1, 2: 1, 3: 2, 4: 1}

how does it work?

  1. every time you want to calculate something in one column based on values in another, groupby is coming… so use it and pass all values grouped.
  2. what do you want? number of unique values in C… so use ["C"].nunique() that return number of unique values.
  3. and at last, you want dict, so convert your result to_dict()
Answered By: MoRe
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.