How can I plot some graphics from data in a given dataset?

Question:

I have a dataset containing data on COVID cases. The link is as follows

I have three questions waiting to be answered:

  • Pie chart of the top 10 countries with the highest number of cases and deaths per million.
  • Stackplot visualization of cases and death rates by continent.
  • Visualization of the number of coronavirus cases of all countries in March 2020 on a daily basis with a line chart.

I tried many times, but I couldn't get anywhere. I used pandas to create the DataFrame, but I couldn't draw these three charts. I can get the results I want by filtering and grouping, but I cannot turn them into plots. Thanks for the help, everyone. (I can share more details if needed.)

Asked By: Tolga Dönmez

||

Answers:

Preface:

The community will help with your issues, but it expects some effort from you: this is not a code-writing service. Please take a few minutes to take the Tour and review How to Ask. Then update your question to include sample data, table definitions (DDL scripts), the expected results for that data, and what you have tried, all as text, not images. Also describe clearly what you are attempting and where you are stuck.

That said, I found the task interesting, so here it is.

1:

import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import seaborn.objects as so

url = r'https://covid.ourworldindata.org/data/owid-covid-data.csv'
with open('covid_data.csv', 'wb') as f:
    f.write(requests.get(url).content)

df = pd.read_csv('covid_data.csv')
df['date'] = pd.to_datetime(df['date'])

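# Rows without a continent are aggregates (World, income groups, ...), so drop them to keep countries only.
# Selecting the two numeric columns before max() avoids aggregating unrelated text columns.
group_location_max = df.dropna(subset=['continent']).groupby('location')[['total_cases_per_million', 'total_deaths_per_million']].max()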
top10_total_cases_per_milliion = group_location_max['total_cases_per_million'].sort_values(ascending=False).head(10)
top10_total_deaths_per_milliion = group_location_max['total_deaths_per_million'].sort_values(ascending=False).head(10)

def make_autopct(values):
    def my_autopct(pct):
        total = sum(values)
        val = int(round(pct*total/100.0))
        return '{p:.2f}%  ({v:d})'.format(p=pct,v=val)
    return my_autopct

vals1 = top10_total_cases_per_milliion.values
vals2 = top10_total_deaths_per_milliion.values
ax1 = top10_total_cases_per_milliion.plot.pie(figsize=(10, 9), autopct=make_autopct(vals1), explode=np.ones((10))*0.1)
ax1.yaxis.set_label_coords(-0.15, 0.5)
plt.show()

ax2 = top10_total_deaths_per_milliion.plot.pie(figsize=(10, 9), autopct=make_autopct(vals2), explode=np.ones((10))*0.1)
ax2.yaxis.set_label_coords(-0.15, 0.5)
plt.tight_layout()
plt.show()
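
To check the selection logic without downloading the full file, the same idea can be tried on a tiny made-up table. This is only a sketch: the `toy` frame, the `top_n_peak` helper and the output file name are hypothetical and not part of the dataset or the code above, and both pies are drawn side by side with `plt.subplots` instead of in two separate figures.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical miniature of the real table: two rows per country.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B", "C", "C"],
    "total_cases_per_million": [10.0, 30.0, 5.0, 50.0, 1.0, 20.0],
    "total_deaths_per_million": [0.1, 0.9, 0.2, 0.4, 0.0, 0.6],
})

def top_n_peak(frame, column, n):
    """Per-location maximum of a column, limited to the n largest."""
    return frame.groupby("location")[column].max().nlargest(n)

top_cases = top_n_peak(toy, "total_cases_per_million", 2)
top_deaths = top_n_peak(toy, "total_deaths_per_million", 2)

# One figure, two pies: cases on the left, deaths on the right.
fig, (ax_cases, ax_deaths) = plt.subplots(1, 2, figsize=(10, 5))
ax_cases.pie(top_cases.values, labels=top_cases.index, autopct="%.1f%%")
ax_cases.set_title("Cases per million")
ax_deaths.pie(top_deaths.values, labels=top_deaths.index, autopct="%.1f%%")
ax_deaths.set_title("Deaths per million")
fig.tight_layout()
fig.savefig("top_pies.png")
```

With the real data, `toy` would be replaced by `df` (after dropping the rows without a continent) and `n` by 10.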

2:

total_cases_slice = df[['date', 'continent', 'total_cases']].dropna()
total_deaths_slice = df[['date', 'continent', 'total_deaths']].dropna()

# so.Agg() aggregates with the mean by default, so each band is the average of the
# per-country totals on that date; pass so.Agg('sum') for continent-wide totals.
s1 = so.Plot(total_cases_slice, x='date', y='total_cases', color='continent').add(so.Area(alpha=.5), so.Agg(), so.Stack()).layout(size=(8, 4))
s2 = so.Plot(total_deaths_slice, x='date', y='total_deaths', color='continent').add(so.Area(alpha=.5), so.Agg(), so.Stack()).layout(size=(8, 4))

s1.save('s1.png', bbox_inches='tight')
s2.save('s2.png', bbox_inches='tight')
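
If a plain matplotlib stackplot is preferred over `seaborn.objects`, a rough equivalent is to pivot the data to one column per continent and pass it to `Axes.stackplot`. The sketch below uses a small made-up frame (`toy`) and sums over countries, which is an assumption about what "cases by continent" should mean; the file name is hypothetical too.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical miniature: cumulative cases for three countries on two continents.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-01", "2020-03-01",
                            "2020-03-02", "2020-03-02", "2020-03-02"]),
    "continent": ["Europe", "Europe", "Asia", "Europe", "Europe", "Asia"],
    "total_cases": [10, 5, 20, 15, 8, 30],
})

# One row per date, one column per continent, summed over countries.
wide = toy.pivot_table(index="date", columns="continent",
                       values="total_cases", aggfunc="sum").fillna(0)

fig, ax = plt.subplots(figsize=(8, 4))
# stackplot expects one row per series, hence the transpose.
ax.stackplot(wide.index, wide.T.values, labels=list(wide.columns), alpha=0.5)
ax.legend(loc="upper left")
ax.set_ylabel("total_cases")
fig.autofmt_xdate()
fig.savefig("stack_cases.png", bbox_inches="tight")
```

The same pivot with `values="total_deaths"` would give the deaths chart.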

3:

# Use ge (>=) rather than gt (>) so that 1 March itself is included.
total_cases_march = df[df.date.ge('2020-03-01') & df.date.le('2020-03-31') & df.continent.notna()][['date', 'location', 'total_cases']]
s3 = sns.lineplot(data=total_cases_march, x='date', y='total_cases', hue='location')
plt.legend(bbox_to_anchor=(2.04, 1), loc="upper right")
for tick in s3.get_xticklabels():
    tick.set_rotation(45)
plt.show()
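
Note that `total_cases` is cumulative, so the lines above show running totals. If "on a daily basis" means new cases per day, one option is to pivot to one column per country and difference consecutive days. The sketch below does that on a made-up frame (`toy`, with a hypothetical output file name). The real file appears to include a `new_cases` column as well, which could be plotted directly, but that is an assumption worth checking against the CSV.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical miniature: cumulative cases for two countries over three days.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "location": ["A"] * 3 + ["B"] * 3,
    "total_cases": [1, 4, 9, 2, 2, 7],
})

# Keep March 2020 only (both endpoints inclusive).
march = toy[toy["date"].between("2020-03-01", "2020-03-31")]

# One column per country, then day-over-day differences.
wide = march.pivot(index="date", columns="location", values="total_cases")
daily = wide.diff()  # the first day is NaN because it has no previous day

# With roughly 200 countries a legend is unreadable, so it is switched off here.
ax = daily.plot(figsize=(8, 4), legend=False)
ax.set_ylabel("new cases per day")
ax.figure.autofmt_xdate()
ax.figure.savefig("march_daily.png", bbox_inches="tight")
```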

Answered By: Sergey Sakharovskiy