Geopandas – plot chart with continent data

Question:

I am trying to plot the data on the continents with Geopandas.

I have the following pivot table of the number of tickets logged, aggregated by region:

    Number of Tickets
region
Africa            370
Americas         1130
Asia              873
Europe            671
Oceania           445

My ticket list dataframe holds the cases logged from each country. Each country is mapped to a region and a continent, so every logged ticket has a country, region and continent assigned.

To be able to plot the data, I merge the Geopandas dataframe (country geometries) with my ticket dataframe on 3-letter country codes and make sure that the resulting dataframe is a geodataframe:

tickets_region = pd.merge(world, tickets, left_on='ISO_A3', right_on='code-3')

type(tickets_region)
geopandas.geodataframe.GeoDataFrame
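
A minimal sketch of this merge step, using hypothetical toy data rather than the real files. The square "countries", the `ticket_id` column and the continent names are assumptions made only for illustration; the sketch calls `merge` on the GeoDataFrame itself, which is the documented way to keep the GeoDataFrame type.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical stand-in for the country geometries.
world = gpd.GeoDataFrame(
    {"ISO_A3": ["AAA", "BBB"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Hypothetical ticket list: one row per ticket.
tickets = pd.DataFrame(
    {
        "ticket_id": [1, 2, 3],
        "code-3": ["AAA", "AAA", "BBB"],
        "continent": ["North", "North", "North"],
    }
)

# Calling merge on the GeoDataFrame keeps the geometry column and the type.
tickets_region = world.merge(tickets, left_on="ISO_A3", right_on="code-3")

print(type(tickets_region).__name__)  # GeoDataFrame
print(len(tickets_region))            # 3: one row, and one geometry copy, per ticket
```

If the ticket table has one row per ticket, each country polygon is repeated once per ticket, so the merged frame can be far larger than the country layer.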

I try to plot the data with the following code:

fig, ax = plt.subplots()
ax = tickets_region.plot('continent', cmap='Reds',scheme='headtailbreaks')
ax.tick_params(left=False, labelleft=False, bottom=False, labelbottom=False)
plt.title('Number of Tickets by Continent')
plt.box(False)
plt.show()

However, this code block never finishes: it eats up memory and CPU cycles until I press Ctrl-C to interrupt it. The same code works with 'code-3' (the 3-letter country codes).

I assume this is because the 'continent' geography is not defined in the GeoJSON file, but I expected Python to fill it in by adding up the number of tickets. My expectation evidently has a flaw somewhere, but I cannot find it.

Any ideas on how I can make the continent plot work?

Thank you.

Edit: The "world" dataframe is the GeoJSON file downloaded from https://datahub.io/core/geo-countries

Asked By: tbalci


Answers:

You can use the dissolve() method of the GeoPandas GeoDataFrame, which merges the geometries within each group and aggregates the remaining columns; see the GeoPandas documentation on dissolve for details. Your code can be modified like this:

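import matplotlib.pyplot as plt

# Merge the country polygons into one shape per continent and sum the ticket counts.
# Only the needed columns are kept so that 'sum' is not applied to text columns
# (this assumes the geometry column is named 'geometry').
tickets_region = tickets_region[['continent', 'Number of Tickets', 'geometry']].dissolve(by='continent', aggfunc='sum')

fig, ax = plt.subplots()
# Pass ax so the map is drawn on the axes created above rather than on a new figure.
# The scheme argument requires the mapclassify package.
tickets_region.plot(column='Number of Tickets', cmap='Reds', scheme='headtailbreaks', ax=ax)
ax.tick_params(left=False, labelleft=False, bottom=False, labelbottom=False)
plt.title('Number of Tickets by Continent')
plt.box(False)
plt.show()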
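
To see what dissolve() does without the real data, here is a small sketch on hypothetical toy geometries. The four square "countries", their codes and the continent names "North" and "South" are invented for illustration; only the column names 'continent', 'code-3' and 'Number of Tickets' come from the question.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: four square "countries" in two "continents".
countries = gpd.GeoDataFrame(
    {
        "code-3": ["AAA", "BBB", "CCC", "DDD"],
        "continent": ["North", "North", "South", "South"],
        "Number of Tickets": [10, 20, 5, 7],
    },
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, -2, 1, -1), box(1, -2, 2, -1)],
)

# Keep only the grouping column, the value to sum, and the geometry,
# so that "sum" is never applied to text columns such as the country code.
by_continent = countries[["continent", "Number of Tickets", "geometry"]].dissolve(
    by="continent", aggfunc="sum"
)

# One row per continent: merged geometry plus summed ticket count.
print(by_continent["Number of Tickets"])
```

The result is indexed by continent, with one merged polygon and one summed count per row, which is the shape that plot(column='Number of Tickets') needs.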
Answered By: Pierre-Loic

I recently used this thread for an analysis. My data lacked the continent geometries needed to plot the map, so I imported an existing dataset, dissolved it by continent and merged it with my data. Here is the import and dissolve code:

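import geopandas as gpd

# Note: geopandas.datasets was deprecated in GeoPandas 0.14 and removed in 1.0,
# so this line needs an older GeoPandas release.
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).drop(['gdp_md_est'], axis=1)
# Merge the countries into one geometry per continent, aggregating the other columns with sum.
world = world.dissolve(by='continent', aggfunc='sum')
# d is not defined in this snippet; it is assumed to be a DataFrame indexed by continent name.
world = world.merge(d, how='inner', left_on='continent', right_index=True)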
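
The same dissolve-then-merge pattern can be sketched on toy data that needs no download. Everything here is hypothetical: the square "countries", the continent names, and the contents of `d`. The sketch also swaps in `left_index=True` for `left_on='continent'`, since after dissolve() the continent name is the index of the dissolved frame.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical toy layer: four square "countries" in two "continents".
countries = gpd.GeoDataFrame(
    {"continent": ["North", "North", "South", "South"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, -2, 1, -1), box(1, -2, 2, -1)],
)

# Geometry-only dissolve: one row per continent, indexed by continent name.
continents = countries.dissolve(by="continent")

# Hypothetical stand-in for d: values indexed by continent name.
d = pd.DataFrame(
    {"Number of Tickets": [30, 12, 99]},
    index=pd.Index(["North", "South", "Elsewhere"], name="continent"),
)

# Inner join on the two indexes; "Elsewhere" has no geometry and is dropped.
merged = continents.merge(d, how="inner", left_index=True, right_index=True)

print(merged["Number of Tickets"])
```

Because the join is an inner join, continent names must match exactly between the two tables; any name present on only one side silently disappears from the map.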

The Kaggle notebook is available at https://www.kaggle.com/code/pavfedotov/gtc-map

Answered By: Pavel Fedotov