Seaborn plots incorrect data

Question:

I’m using pandas to handle my dataset and seaborn to create a plot for it, specifically a bivariate KDE plot.

The dataset contains the coordinates and power of lightning bolts.

dataset

When plotting the data, it comes out as if all of it has a power of ~0, while in reality the dataset doesn’t even contain any row with power < 1.3.

plot

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# middleEast is the full DataFrame, loaded earlier (not shown)
df = middleEast.head(1000)

sns.jointplot(x='Longitude', y='Power (J)', data=df, kind='kde', bw=0.5)

plt.show()

That’s my code. If I use the first 10 rows instead of 1000, it works fine. But again, there is no way that the data in the first 1000 rows is anything close to what is being shown in the plot.

I also tried kdeplot() instead of jointplot(); the plots look the same.

Please help!

Asked By: Chefi


Answers:

If you look carefully, you’ll see that your axes have been scaled. They are in scientific notation, so the power axis, for example, carries a multiplier of 1e6. At that scale, it’s no surprise that your values are plotted right up against the x-axis.

Still, there’s clearly something odd going on to get such strong scaling; you probably have some outlier data points (that this works for the first ten data points but not the first thousand supports this conclusion). You should find the maximum and minimum values in each DataFrame column and also test for missing (null/nan) values, which could mess up your plot.
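As a rough sketch of that check: the snippet below builds a small synthetic DataFrame as a hypothetical stand-in for your data (the values, the injected outlier and the missing value are all made up for illustration), then reports the minimum, maximum and null count for each column. The 10 × 99th-percentile cutoff used to flag suspicious rows is an arbitrary assumption, not a rule; pick whatever threshold makes sense for your measurements.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: ordinary rows plus
# one extreme outlier and one missing value (all values invented).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Longitude": rng.uniform(30.0, 60.0, size=1000),
    "Power (J)": rng.uniform(1.3, 50.0, size=1000),
})
df.loc[500, "Power (J)"] = 2.5e6   # a single outlier is enough to stretch the axis
df.loc[700, "Power (J)"] = np.nan  # a missing value

cols = ["Longitude", "Power (J)"]

# Minimum, maximum and number of nulls per column (min/max skip NaN by default).
summary = df[cols].agg(["min", "max"]).T
summary["n_null"] = df[cols].isna().sum()
print(summary)

# Flag rows far above the bulk of the data.
# The factor of 10 is an arbitrary choice for this illustration.
q99 = df["Power (J)"].quantile(0.99)
suspects = df[df["Power (J)"] > 10 * q99]
print(suspects)
```

On your own data you would skip the synthetic setup and run the summary directly on df. If the maximum is orders of magnitude above a high quantile, those few rows are what is forcing the axis scale.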

Answered By: Paddy Alton