peak integration using trapz but not starting from baseline and visualization

Question:

I am trying to integrate signal peaks using numpy.trapz or scipy.trapz. I do manually select the range of a peak min and max. the problem I am facing is that I think these functions integerates the AUC till the baseline. Sometimes my signal start or end is above the baseline. I am open to use a baseline correction method, but I need to visualize the integrated area without any correction to make sure they are correct.

image source: https://terpconnect.umd.edu/~toh/spectrum/Integration.html

Reference Here

An Example to generate data

import numpy
import peakutils
from peakutils.plot import plot as pplot
from matplotlib import pyplot


centers = (30.5, 72.3)
x = numpy.linspace(0, 120, 121)
y = (peakutils.gaussian(x, 5, centers[0], 3) +
    peakutils.gaussian(x, 7, centers[1], 10) +
    numpy.random.rand(x.size))
pyplot.figure(figsize=(10,6))
pyplot.plot(x, y)
pyplot.title("Data with noise")

enter image description here

Asked By: PPenton

||

Answers:

To get the area under the peak you can interpolate the curve using scipy.interpolate.make_interp_spline which has an integrate method. This will be the entire area under the curve between x=a and x=b, so you will then want to subtract the area of the trapezoid created by connecting points at x=a and x=b and the x-axis. In the code I share below I assume f(a) >= 0 and f(b) >= 0.

Note: I added another peak and reduced the noise level to make the endpoints more prominent/obvious.

import numpy as np
import peakutils
import matplotlib.pyplot as plt
from scipy.signal import find_peaks
from scipy.interpolate import make_interp_spline

plt.close("all")

rng = np.random.default_rng(42)

centers = (30.5, 72.3, 112.1)
x = np.arange(0, 120)
y = (peakutils.gaussian(x, 5, centers[0], 3)
     + peakutils.gaussian(x, 7, centers[1], 10)
     + peakutils.gaussian(x, 6, centers[2], 5)
     + rng.random(x.size)*0.2)
     
peak_indices, _ = find_peaks(-y, prominence=4)

fig, ax = plt.subplots()
ax.plot(x, y, label="Data")
ax.plot(x[peak_indices], y[peak_indices], "x", color="r", label="End points")
# ax.title("Data with noise")


x1, x2 = x[peak_indices]

interp = make_interp_spline(x, y, k=1)
m = (interp(x2) - interp(x1))/(x2 - x1)
x = np.linspace(x1, x2, 200)
ax.fill_between(x, interp(x), m*(x - x1) + interp(x1), alpha=0.5, 
                label="Desired area")
ax.fill_between(x, m*(x - x1) + interp(x1), alpha=0.5, 
                label="Trapezoid area")

ax.legend()

# Integral under the peak minus the area of the trapezoid connecting the two
# ends of the peak.
# NOTE: EDGE CASE OF PEAK ENDS GOING BELOW THE X-AXIS IS NOT BEING CONSIDERED
integral = interp.integrate(x1, x2) - 0.5*(x2-x1)*(interp(x1) + interp(x2))
print(integral)  # 163.8743118056535

Answered By: jared