How to get the most repeated elements in a dataframe/array

Question:

I compiled a list of the top artists for each year over 14 years, and I want the top 7 artists for those 14 years combined. My idea was to gather them all in a dataframe and then pick the most repeated artists across those years, but it didn’t work out.

import billboard
import pandas as pd

# Collecting the top 7 artists across the 14 years
artists = []
year = 2020
while year >= 2006:
    TAChart = billboard.ChartData('Top-Artists', year = year)
    artists.append(str(TAChart))
    year -= 1

len(artists)
Artists = pd.DataFrame(artists)
n = 7
Artists.value_counts().index.tolist()[:n]
Asked By: YamenAly

||

Answers:

You’re very close – you just need to flatten your list of lists into a single list, then call value_counts:

artists_flat = [a for lst in artists for a in lst]
pd.Series(artists_flat).value_counts().head(n)

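Your current code counts occurrences of entire charts (each converted to a single string) rather than individual artists. The flattening step assumes each element of artists is a list of artist names; with artists.append(str(TAChart)) each element is one string, so the comprehension would iterate over its characters instead.
Also, note that I used head(n) rather than slicing the index list, so the counts stay attached to the names and a tie at the nth place is easy to spot. head(n) itself still returns exactly n rows; nlargest(n, keep='all') returns every entry tied at the nth count.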
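
As a rough, self-contained sketch of the flatten-then-count idea: the artist names and the artists_by_year variable below are made-up stand-ins for the real chart data, which would need to be a list of name lists rather than chart strings.

```python
import pandas as pd

# Hypothetical stand-in for the per-year charts:
# one list of artist names per year (not real chart data).
artists_by_year = [
    ["Drake", "Taylor Swift", "Adele"],
    ["Drake", "Adele", "Rihanna"],
    ["Drake", "Taylor Swift", "Rihanna"],
]

# Flatten the list of lists into a single list of names.
artists_flat = [name for year_list in artists_by_year for name in year_list]

# Count how often each name appears, most frequent first.
counts = pd.Series(artists_flat).value_counts()

n = 2
# Exactly n rows, with the counts kept next to the names.
top_n = counts.head(n)
# Every artist whose count ties with the nth largest is kept too.
top_n_with_ties = counts.nlargest(n, keep="all")
```

Here three artists share second place, so top_n has two rows while top_n_with_ties has four.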

Answered By: doppy

You can try something like this:

import pandas as pd

# Create a pandas DataFrame from the list
artist_list = ['AB', 'B', 'B', 'A', 'A', 'D', 'C', 'B']
df = pd.DataFrame({'Artist': artist_list})

# Create a new DataFrame holding the counts under an explicit column name.
# value_counts() returns the count of each unique value, most frequent first.
# Passing name= to to_frame() avoids relying on the default column name,
# which differs between pandas versions.
df1 = df['Artist'].value_counts().to_frame(name='Count')

# Out[df1]:
#     Count
# B       3
# A       2
# AB      1
# D       1
# C       1
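
For the plain list/array case mentioned in the title, the same count can be sketched without pandas using collections.Counter from the standard library. This is an alternative technique, not part of the approach above, and it reuses the same sample values purely for illustration.

```python
from collections import Counter

# Same sample values as in the DataFrame example.
artist_list = ['AB', 'B', 'B', 'A', 'A', 'D', 'C', 'B']

# Counter tallies each distinct value; most_common(k) returns the k
# most frequent (value, count) pairs, highest count first.
counts = Counter(artist_list)
top_two = counts.most_common(2)

print(top_two)  # [('B', 3), ('A', 2)]
```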
Answered By: Panagiotis Erodotou