How to count specific rows?

Question:

I have an example PySpark dataframe:

X Y Z DATE
23 41 63 2016-01-01
23 41 5 2016-01-01
23 41 75 2016-01-01
23 41 46 2016-12-01
23 41 23 2016-12-01
27 41 5 2016-01-01
27 41 75 2016-01-01
27 41 85 2016-01-01
27 41 71 2016-01-01

What I want is to count the rows that share the same X, Y and DATE values and store that count in a new column.

The final dataframe should look like this:

X Y Z DATE SUM
23 41 63 2016-01-01 3
23 41 5 2016-01-01 3
23 41 75 2016-01-01 3
23 41 46 2016-12-01 2
23 41 23 2016-12-01 2
27 41 5 2016-01-01 4
27 41 75 2016-01-01 4
27 41 85 2016-01-01 4
27 41 71 2016-01-01 4
Asked By: dawid2312


Answers:

This might help (assuming you mistyped and wanted a count instead of a sum):

from pyspark.sql.functions import count
from pyspark.sql.window import Window

df = df.withColumn("Count", count("*").over(Window.partitionBy("X", "Y", "DATE")))
Answered By: TrimPeachu
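
For intuition, the grouped count that the window function computes can be sketched in plain Python without a Spark session, using the sample rows from the question. This is only an illustration of the logic (count rows per (X, Y, DATE) key, then attach that count to every row), not a substitute for the PySpark answer:

```python
from collections import Counter

# Sample rows mirroring the question's dataframe: (X, Y, Z, DATE)
rows = [
    (23, 41, 63, "2016-01-01"),
    (23, 41, 5,  "2016-01-01"),
    (23, 41, 75, "2016-01-01"),
    (23, 41, 46, "2016-12-01"),
    (23, 41, 23, "2016-12-01"),
    (27, 41, 5,  "2016-01-01"),
    (27, 41, 75, "2016-01-01"),
    (27, 41, 85, "2016-01-01"),
    (27, 41, 71, "2016-01-01"),
]

# Count rows per (X, Y, DATE) key -- the analogue of Window.partitionBy
counts = Counter((x, y, date) for x, y, _, date in rows)

# Attach the count to every row -- the analogue of withColumn("Count", ...)
result = [(x, y, z, date, counts[(x, y, date)]) for x, y, z, date in rows]
```

Running this reproduces the expected column: 3 for (23, 41, 2016-01-01), 2 for (23, 41, 2016-12-01), and 4 for (27, 41, 2016-01-01).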