Date ticks not separated properly on x-axis

Question:

I have the following data:

         Date  Chemical
10 2021-11-20        21
11 2021-11-26        19
12 2021-11-26        31
13 2021-11-26        32
14 2021-11-27        31
0  2021-12-06        21
6  2021-12-16        23
7  2021-12-16        24
8  2021-12-16        23
9  2021-12-16        25
1  2022-03-07        26
2  2022-03-08        28
3  2022-03-08        29
4  2022-03-08        28
5  2022-03-09        26

I plot the Chemical column on the y-axis against Date on the x-axis:

maindf.boxplot('Chemical', 'Date')
plt.xticks(rotation=40)
plt.show()

I get the following plot:

[image: box plot of Chemical grouped by Date]

The x-axis shows the dates as equidistant rather than spaced according to the time between them. 2021-11-26 and 2021-11-27 should be close together, while 2021-12-16 and 2022-03-07 should be far apart.

Where is the problem, and how can it be corrected? Thanks for your help.

Asked By: rnso


Answers:

Matplotlib treats each box as a categorical entry and spaces the boxes equally. To space them by date instead, use the positions parameter of matplotlib's boxplot. For this, compute how many days each entry is from the earliest date (20-Nov-2021); the full range is 109 days in your case. Adding those day counts as a position column and passing them as positions will give you the plot below…

This code…

import pandas as pd
import matplotlib.pyplot as plt

# maindf is the DataFrame from the question (columns: Date, Chemical)
fig, ax = plt.subplots(figsize=(25, 6))
maindf['Date'] = pd.to_datetime(maindf['Date'])
maindf.sort_values('Date', inplace=True)
# Add a position column: number of days (as int) between each entry and the earliest date
maindf['position'] = (maindf['Date'] - maindf.Date.min()).dt.days
# Pass one position per distinct date so the boxes are spaced by elapsed days
maindf.boxplot('Chemical', 'Date', positions=maindf.position.unique(), ax=ax)

plt.xticks(rotation=40, ha='right')
plt.show()

…will give you this plot

Note that the figure is quite wide because the data spans 109 days (max date minus min date).

[image: box plot with boxes positioned by days since the earliest date]

There is a large gap in the middle because of the gap in the data. If you don’t need this big gap, you can change the values in maindf.position manually to suit your needs.
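To check the arithmetic behind the position column, here is a minimal sketch that uses only the standard library. It recomputes the day offsets for the distinct dates in the question's data; the helper name day_offsets is hypothetical and only used for illustration.

```python
from datetime import date

# Distinct dates taken from the question's data
dates = [
    date(2021, 11, 20), date(2021, 11, 26), date(2021, 11, 27),
    date(2021, 12, 6), date(2021, 12, 16),
    date(2022, 3, 7), date(2022, 3, 8), date(2022, 3, 9),
]

def day_offsets(dates):
    """Return the number of days between each distinct date and the earliest one."""
    first = min(dates)
    return [(d - first).days for d in sorted(set(dates))]

offsets = day_offsets(dates)
print(offsets)  # [0, 6, 7, 16, 26, 107, 108, 109]
```

The jump from 26 to 107 is the large gap seen in the plot, and the last value is the 109-day span mentioned above.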

If you also want the tick dates to be equidistant from one another, you will need to set the ticks yourself. I have used 11 dates (10 intervals) as an example, and have also changed the date labels to DD-MM-YYYY format. The changes are below.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime

fig, ax = plt.subplots(figsize=(25, 6))
ax.yaxis.grid(True)  # must be called after ax has been created
maindf = pd.read_excel('myinput.xlsx', 'Sheet60')
maindf['Date'] = pd.to_datetime(maindf['Date'])
maindf.sort_values('Date', inplace=True)
maindf['position'] = (maindf['Date'] - maindf.Date.min()).dt.days
maindf.boxplot('Chemical', 'Date', positions=maindf.position.unique(), ax=ax)

# Build 11 evenly spaced date labels between the earliest and latest date
my_xticklabels = []
for i in range(11):
    s = maindf.Date.min() + i * (maindf.Date.max() - maindf.Date.min()) / 10
    my_xticklabels.append(datetime.datetime.strftime(s, "%d-%m-%Y"))

# 11 tick positions covering the 109-day span (0 to 109)
ax.set_xticks(np.linspace(0, 109, 11))
ax.set_xticklabels(my_xticklabels, fontsize=14)

plt.xticks(rotation=40, ha='right')
plt.show()

Plot:

[image: box plot with 11 evenly spaced date ticks in DD-MM-YYYY format]
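The tick calculation above hard-codes the 109-day span. As a rough sketch, the same positions and labels can be derived from the first and last date, shown here with the standard library only so it runs without the Excel file. The function name even_date_ticks and its parameters are hypothetical, not part of pandas or matplotlib.

```python
from datetime import date, timedelta

def even_date_ticks(start, end, n_ticks):
    """Return (positions, labels) for n_ticks evenly spaced ticks from start to end.

    Positions are day offsets from start; labels use DD-MM-YYYY and are
    truncated to whole days, since date arithmetic ignores fractional days.
    """
    span = (end - start).days
    positions = [span * i / (n_ticks - 1) for i in range(n_ticks)]
    labels = [(start + timedelta(days=p)).strftime("%d-%m-%Y") for p in positions]
    return positions, labels

# Earliest and latest dates from the question's data
positions, labels = even_date_ticks(date(2021, 11, 20), date(2022, 3, 9), 11)
print(positions[0], positions[-1])
print(labels[0], labels[-1])
```

The resulting lists could then be passed to ax.set_xticks(positions) and ax.set_xticklabels(labels) in place of the hard-coded np.linspace(0, 109, 11) call.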

Answered By: Redox