How to sort by count with groupBy in a Spark DataFrame

Question:

I want to sort this count column in descending order, but I keep getting a ‘NoneType’ object is not callable error. How can I add a sort to this so I won’t get the error?

from pyspark.sql.functions import hour
hour = checkin.groupBy(hour("date").alias("hour")).count().show()


Asked By: user12755836


Answers:

.show() returns None, so you can’t chain any DataFrame method after it. Remove it and use orderBy to sort the resulting DataFrame:

from pyspark.sql.functions import hour, col
hour = checkin.groupBy(hour("date").alias("hour")).count().orderBy(col('count').desc())

Or:

from pyspark.sql.functions import hour, desc
checkin.groupBy(hour("date").alias("hour")).count().orderBy(desc('count')).show()
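
For reference, here is a minimal self-contained sketch you can run to verify the descending sort; the sample rows are invented for illustration, since the real checkin data isn’t shown:

from pyspark.sql import SparkSession
from pyspark.sql.functions import hour, desc, to_timestamp

spark = SparkSession.builder.getOrCreate()

# toy stand-in for the question's checkin DataFrame
checkin = spark.createDataFrame(
    [("2021-01-01 09:15:00",), ("2021-01-01 09:45:00",), ("2021-01-01 17:30:00",)],
    ["date"],
).withColumn("date", to_timestamp("date"))

# hour 9 appears twice and hour 17 once, so 9 sorts to the top
checkin.groupBy(hour("date").alias("hour")).count().orderBy(desc("count")).show()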
Answered By: Psidom

.show() returns None, so either assign the DataFrame (without .show()) to a variable and then call .show() on it, or skip the assignment entirely, as @Psidom suggests.
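
For example, a sketch of the assign-then-show pattern (the name hourly_counts is my own; reusing the name hour, as the question does, rebinds it to None after .show(), which is exactly what triggers ‘NoneType’ object is not callable on the next run):

from pyspark.sql.functions import hour, desc

# Assign the DataFrame first (no .show() in the chain), then display it.
# A name other than `hour` avoids shadowing the imported hour() function.
hourly_counts = checkin.groupBy(hour("date").alias("hour")).count().orderBy(desc("count"))
hourly_counts.show()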

BTW, @Psidom’s answer doesn’t work for me. The code below works with Spark 3.2 or above; I haven’t checked earlier versions.

from pyspark.sql.functions import hour

checkin.groupBy(hour("date").alias("hour")).count().orderBy('count', ascending=False).show()
Answered By: Lorenzo Cazador