How to set the scope of data displayed in a simple plotly bar graph

Question:

I’m working through my understanding of plotly/dash and immediately ran into a problem I can’t find an answer to. I broke my code down to its simplest form to isolate the problem.

There are 10 float values for x and 10 for y:

import plotly.express as px

x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12,
     1548.03]
y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]
fig = px.bar(x=x, y=y)
fig.show()

This yields: enter image description here

This is obviously wrong. Is feeding it float values that aren’t perfectly sequential in nature causing plotly to misinterpret where to draw its bars? I don’t even know how to phrase my question here. I just want to see 10 bars next to each other, scaled to fit my screen.

Right now it looks like it’s plotting my float values on a sequential timeline, and since my numbers aren’t sequential I wind up with these gaps.

Asked By: Philip C.

Answers:

You want something like this?

enter image description here

Code:

import plotly.express as px

x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12,  1548.03]
y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]
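# The first positional argument of px.bar is data_frame, not x;
# text=x labels each bar with its x value.
fig = px.bar(x, y=y, text=x)
fig.show()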
Answered By: u1234x1234

Posting this as an answer instead of a comment so that I can add a plot and sorted data.

When the x values are scaled to a reasonable range, the plot looks as expected:

import plotly.express as px

x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12, 
1548.03]
y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]
x_scaled = [(i - 1548)*100 for i in x]
print(f"scaled x:\n{x_scaled}")

fig = px.bar(x=x_scaled,y=y)
fig.show()

x_scaled.sort()
print(f"scaled and sorted x: \n{x_scaled}")
scaled x:
[35.999999999989996, 34.999999999990905, 31.999999999993634, 30.999999999994543,
 29.999999999995453, 25.99999999999909, 25.0, 17.000000000007276, 11.999999999989086,
 2.9999999999972715]

enter image description here

scaled and sorted x: 
[2.9999999999972715, 11.999999999989086, 17.000000000007276, 25.0, 25.99999999999909,
 29.999999999995453, 30.999999999994543, 31.999999999993634, 34.999999999990905,
 35.999999999989996]

To sort x together with y using a pandas DataFrame:

import pandas as pd

zipped = list(zip(x, y))

df = pd.DataFrame(zipped, columns=['X', 'Y'])
df_sorted = df.sort_values(by='X')
df_sorted

             X        Y
    9  1548.03   0.2000
    8  1548.12  29.7094
    7  1548.17   0.0200
    6  1548.25   4.3028
    5  1548.26   3.6085
    4  1548.30  18.9993
    3  1548.31   7.5905
    2  1548.32   4.5658
    1  1548.35   2.7585
    0  1548.36  36.9467


If that’s still not what you expect and the answer from u1234x1234 doesn’t fit either, you may want to describe in more detail what you expect.

Answered By: MagnusO_O

Another interpretation that may meet "I just want to see 10 bars next to each other, scaled to fit my screen." including some background:

Converting the x values to string before plotting gives:

import plotly.express as px

x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12, 
1548.03]
x_string = list(map(str, x))
print(x_string)

y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]

fig = px.bar(x=x_string,y=y)
fig.show()
['1548.36', '1548.35', '1548.32', '1548.31', '1548.3', '1548.26', '1548.25',
 '1548.17', '1548.12', '1548.03']

enter image description here

After the conversion to strings, the x axis list carries the same information for plotting as e.g. ['Joe','Jane','Julia','Alfons',...], so the numerical information is "removed".
Without the numerical information, the common (and sensible) way to plot is to place the strings one after the other on the x axis, in the order given.

You can then even mix string "numbers" with "normal" strings; try the following as an x axis:

x = ['1548.36', 'Julia', '1548.32', 'Zaphod', 'Joe', '1548.26', '1548.25',
 '1548.17', '1548.12', '1548.03']

Since the numerical information is "removed" by the string conversion, you have to take care of any sorting in advance if that’s intended.

If you want the x axis to still be sorted according to the initial numeric order, a pandas DataFrame can be used, for example:

import plotly.express as px
import pandas as pd

x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12, 
1548.03]
y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]

zipped = list(zip(x, y))
df = pd.DataFrame(zipped, columns=['X', 'Y'])
df_sorted = df.sort_values(by='X')
df_sorted['X'] = df_sorted['X'].astype(str)

fig = px.bar(df_sorted, x='X', y='Y')
fig.show()

enter image description here
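
As a possible alternative to the string conversion, here is a minimal sketch that keeps the floats and instead asks plotly to treat the x axis as categorical via `fig.update_xaxes(type="category")`. The helper names `sorted_pairs` and `make_category_bar` are made up for this illustration, and plotly is imported inside the function so the sorting part can be tried on its own.

```python
def sorted_pairs(x, y):
    """Sort the (x, y) pairs by x so the bars appear in ascending x order."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    return xs, ys


def make_category_bar(x, y):
    """Build a bar figure whose x axis is categorical (hypothetical helper)."""
    import plotly.express as px  # imported lazily; only needed for plotting

    xs, ys = sorted_pairs(x, y)
    fig = px.bar(x=xs, y=ys)
    # Treat the x values as category labels instead of positions on a number line.
    fig.update_xaxes(type="category")
    return fig


x = [1548.36, 1548.35, 1548.32, 1548.31, 1548.3, 1548.26, 1548.25, 1548.17, 1548.12,
     1548.03]
y = [36.9467, 2.7585, 4.5658, 7.5905, 18.9993, 3.6085, 4.3028, 0.02, 29.7094, 0.2]

xs, ys = sorted_pairs(x, y)
print(xs)
```

Calling `make_category_bar(x, y).show()` should then draw the ten bars next to each other in ascending x order, without the data itself having been turned into strings.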


Maybe another angle for the explanation, let’s plot a sine function:

enter image description here

However, note that it’s actually not a continuous sine function but a sampled one, with the dots just connected by a line.

Code and scatter plot:

import numpy as np
import random
import plotly.express as px
import plotly.graph_objects as go

x = np.linspace(0, 10, 100)
y = np.sin(x)

fig = px.line(x=x,y=y)
fig.show()

fig = px.scatter(x=x,y=y)
fig.show()

enter image description here

Note: Every plot is sampled. You may think of it like this:

  • To "plot", a colored dot needs to be placed.
    • A continuous function has infinitely many points, so actually plotting it would require placing infinitely many dots.
  • For plots that look like a continuous line, the samples are just connected. Given the dpi and the human eye and brain, it looks continuous when the sampling rate is high enough.
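
To put a rough number on "high enough", the sketch below measures how far the straight line between two neighbouring samples strays from the real sine halfway between them. The function name `max_midpoint_error` and the sample counts are assumptions chosen for illustration, and only the standard library is used.

```python
import math


def max_midpoint_error(n_samples, start=0.0, stop=10.0):
    """Largest gap between sin(x) and the straight line joining neighbouring
    samples, measured halfway between each pair of samples."""
    step = (stop - start) / (n_samples - 1)
    worst = 0.0
    for k in range(n_samples - 1):
        a = start + k * step
        b = a + step
        mid = (a + b) / 2
        line_value = (math.sin(a) + math.sin(b)) / 2  # what the connected line shows
        worst = max(worst, abs(math.sin(mid) - line_value))
    return worst


# Fewer samples give a visibly "kinked" line, more samples a smooth-looking one.
for n in (10, 25, 100):
    print(n, round(max_midpoint_error(n), 4))
```

The error shrinks quickly as the number of samples grows, which is why 100 connected points already look like a smooth curve on screen.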

To mess with the sampling let’s remove some of those sampling points and plot again:

random.seed(1424836)
random_index = random.sample(range(0,100), 75)
x_reduced = np.delete(x, random_index)
y_reduced = np.delete(y, random_index)

fig1 = px.scatter(x=x_reduced,y=y_reduced)
fig2 = px.line(x=x_reduced,y=y_reduced)

fig = go.Figure(data = fig1.data + fig2.data)
fig.show()

enter image description here

Due to the ‘distribution’ of the points along the x axis, that still looks kinda like a sine function.

But when we now do the string conversion and plot again (so no distribution, just one "number string" after the next), it looks like something different:

# numbers to strings, floats with 2 digits after the decimal point
#        (resolution doesn't matter, but the x axis labels are more readable)
x_reduced_string = ["%.2f" % i for i in x_reduced]
print(x_reduced_string)


fig1 = px.scatter(x=x_reduced_string,y=y_reduced)
fig2 = px.line(x=x_reduced_string,y=y_reduced)

fig = go.Figure(data = fig1.data + fig2.data)
fig.show()
['0.20', '0.30', '0.81', '1.21', '1.31', '1.52', '1.62', '2.42', '2.63', '3.33', '4.55', '5.45', '5.76', '5.86', '5.96', '6.67', '7.88', '8.38', '8.48', '8.59', '8.69', '9.09', '9.39', '9.49', '9.80']

enter image description here
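
The difference can also be seen without plotting. The sketch below takes the first few values from the printed list above and compares their numeric spacing with the spacing they get as categories, modelled here (as an assumption for illustration) as the positions 0, 1, 2, ... in list order.

```python
# First few x values from the printed list above, as numbers.
x_numeric = [0.20, 0.30, 0.81, 1.21, 1.31, 1.52, 1.62, 2.42]

# On a numeric axis the distance between neighbours is their actual difference.
numeric_gaps = [round(b - a, 2) for a, b in zip(x_numeric, x_numeric[1:])]

# As strings the values are only labels; each one takes the next slot in order.
x_labels = ["%.2f" % v for v in x_numeric]
category_positions = list(range(len(x_labels)))
category_gaps = [b - a for a, b in zip(category_positions, category_positions[1:])]

print("numeric gaps: ", numeric_gaps)
print("category gaps:", category_gaps)
```

The numeric gaps vary from point to point, while every category gap is the same, so the shape of the sine is stretched and squeezed accordingly.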

The same effect shown in a series of bar plots of the above data:

enter image description here
enter image description here
enter image description here

So keep that in mind when generating the bar plot as you intended in your question.
When you do, I’d recommend explicitly highlighting or mentioning it, because others who see numbers on the x axis will probably expect the default plot.

Answered By: MagnusO_O