python: cumulative density plot
Question:
I have the following dataframe:
df =
Time_to_event event
0 0 days 443
1 1 days 226
2 2 days 162
3 3 days 72
4 4 days 55
5 5 days 30
6 6 days 36
7 7 days 18
8 8 days 15
9 9 days 14
10 10 days 21
11 11 days 13
12 12 days 10
13 13 days 10
14 14 days 8
I want to produce a cumulative density plot of the number of events per day, i.e. a running sum. For example: 0 days = 443, 1 day = 443 + 226 = 669, etc.
I am currently trying this code:
stat = "count" # or proportion
sns.histplot(df, stat=stat, cumulative=True, alpha=.4)
but I come up with a pretty terrible plot:
If I could also come up with a line instead of bars that would be awesome!
Answers:
You can try a combination of pandas.Series.cumsum and seaborn.lineplot:
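import matplotlib.pyplot as plt
import seaborn as sns
# running total of events up to and including each day
df["cumsum"] = df["event"].cumsum()
plt.figure(figsize=(6,4))
sns.lineplot(x="Time_to_event", y="cumsum", data=df)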
Output: [image: line plot of the cumulative sum]
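For anyone who wants to reproduce this without the original data file, here is a minimal self-contained sketch. It rebuilds the dataframe from the numbers in the question and assumes that Time_to_event is a timedelta column; the column names days, cum_count and cum_proportion are made up for this example. It also computes the cumulative proportion, since the question mentions stat = "proportion" as an option.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Data copied from the question; Time_to_event is ASSUMED to be a timedelta column.
events = [443, 226, 162, 72, 55, 30, 36, 18, 15, 14, 21, 13, 10, 10, 8]
df = pd.DataFrame({
    "Time_to_event": pd.to_timedelta(list(range(len(events))), unit="D"),
    "event": events,
})

# Plain integer days are easier to put on an axis than timedeltas.
df["days"] = df["Time_to_event"].dt.days

# Running total of events, and the same total as a share of all events.
df["cum_count"] = df["event"].cumsum()
df["cum_proportion"] = df["cum_count"] / df["event"].sum()

# Draw the running total as a line with a marker per day.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(df["days"], df["cum_count"], marker="o")
ax.set_xlabel("Days to event")
ax.set_ylabel("Cumulative events")
# Call plt.show() to display the figure when running this as a script.
```

Plotting df["cum_proportion"] instead of df["cum_count"] gives the same curve scaled so that it ends at 1.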
I think the values you want to plot are:
xvalues=df["Time_to_event"]
yvalues=df["event"].cumsum()
The code could look like this:
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("test.txt")
print(df.columns)
print(df)
plt.bar(df["Time_to_event"], df["event"].cumsum())
# replace plt.bar with plt.plot to draw a line instead of bars
plt.show()