x axis gets transformed to floats

Question:

I am trying to plot my data grouped by year, and for each year I want to count the number of users. Below, I have just converted the date column from float to integer.

This is my plot: [image]

If you look at the x-axis, the year ticks seem to have become floats, and the ticks are 0.5 apart.

How do I make the ticks purely integers?


Changing the groupby gives the same result: [image]


The ticks are still two spaces apart after converting the year column to strings:

df['year'] = df['year'].astype(str)

[image]

Asked By: jxn

||

Answers:

import matplotlib.pyplot as plt

# Use min and max to get the range of years to use in axis ticks
year_min = df['year'].min()
year_max = df['year'].max()

df['year'] = df['year'].astype(str) # Prevents conversion to float

# range() excludes its stop value, so add 1 to include year_max
plt.xticks(range(year_min, year_max + 1, 1)) # Sets plot ticks to years within range
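One detail worth checking in the tick range: Python's range() stops before its stop value, so the last year only gets a tick if the stop is year_max + 1. A minimal sketch, using made-up years rather than the asker's data:

```python
# Hypothetical year values, for illustration only
years = [2010, 2011, 2012, 2013, 2014]
year_min, year_max = min(years), max(years)

# range() excludes the stop value, so the last year is dropped here
exclusive = list(range(year_min, year_max))

# Adding 1 to the stop value keeps the last year
inclusive = list(range(year_min, year_max + 1))

print(exclusive)  # [2010, 2011, 2012, 2013]
print(inclusive)  # [2010, 2011, 2012, 2013, 2014]
```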

Hope this helps!

Answered By: AmourK

The expectation that using integer data will lead a matplotlib axis to show only integers is not justified. In the end, every axis is a numeric float axis.

The ticks and labels are determined by locators and formatters. And matplotlib does not know that you want to plot only integers.
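To see this on a concrete axis, here is a minimal sketch that inspects which locator and formatter a default linear axis ends up with. The data values and the choice of the non-interactive Agg backend are assumptions made only so the snippet runs anywhere:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Illustrative integer data
years = [2010, 2011, 2012, 2013, 2014]
counts = [1000, 2200, 3890, 5600, 8000]

fig, ax = plt.subplots()
ax.plot(years, counts)

# The locator decides where ticks go, the formatter decides how they are labeled
locator_name = type(ax.xaxis.get_major_locator()).__name__
formatter_name = type(ax.xaxis.get_major_formatter()).__name__
tick_positions = ax.get_xticks()

print(locator_name)    # AutoLocator
print(formatter_name)  # ScalarFormatter
print(tick_positions)  # positions picked by the locator, not restricted to integers
```

Neither object looks at the dtype of the plotted data, which is why integer input alone does not produce integer ticks.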

Some possible solutions:

Tell the default locator to use integers

The default locator is an AutoLocator, which accepts a parameter integer. So you may set this parameter to True:

ax.locator_params(integer=True)

Example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

ax = data.plot(x="year",y="count")
ax.locator_params(integer=True)

plt.show()

Using a fixed locator

You may tick only the years present in the dataframe by using ax.set_xticks().

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

data.plot(x="year",y="count")
plt.gca().set_xticks(data["year"].unique())
plt.show()

Convert year to date

You may convert the year column to dates. For dates, much nicer tick labeling takes place automatically.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.DataFrame({"year" : [2010,2011,2012,2013,2014],
                     "count" :[1000,2200,3890,5600,8000] })

data["year"] = pd.to_datetime(data["year"].astype(str), format="%Y")
ax = data.plot(x="year",y="count")

plt.show()

In all cases you would get something like this:

[image]

A solution that worked for me was to first convert the column to int and then, in a second step, to a string:

df['year'] = df['year'].astype(int)
df['year'] = df['year'].astype(str)

This is more or less a "quick and dirty" workaround that avoids using a locator.
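The intermediate int step matters when the column starts out as float, because converting a float straight to a string keeps the trailing ".0". A small sketch with a made-up Series (not the asker's dataframe):

```python
import pandas as pd

# Hypothetical float year column, for illustration only
years = pd.Series([2010.0, 2011.0, 2012.0])

# Converting floats directly to strings keeps the decimal part
direct = years.astype(str).tolist()

# Converting to int first drops the decimal part before the string conversion
via_int = years.astype(int).astype(str).tolist()

print(direct)   # ['2010.0', '2011.0', '2012.0']
print(via_int)  # ['2010', '2011', '2012']
```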

Answered By: dlg_