How can I tell that the standard deviation of the slope is underestimated for correlated data?

Question:

I wrote a function that generates a colored noise process (Gauss-Markov) for different correlation lengths T. Below is a plot of the realizations, with sample interval Δt = 1 s and length N = 2000.
[Plot: Gauss-Markov process for different T values]
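
The question does not include the generator code, but a minimal sketch is easy to reconstruct, assuming a first-order Gauss-Markov process, i.e. the discrete AR(1) recursion x[k] = phi*x[k-1] + w[k] with phi = exp(-Δt/T) (the helper name gauss_markov and its arguments are my own, not the asker's):

import numpy as np

def gauss_markov(T, dt=1.0, n=2000, sigma=1.0, rng=None):
    """Sample a first-order Gauss-Markov (AR(1)) process with
    correlation time T [s], sample interval dt [s] and n samples."""
    rng = np.random.default_rng() if rng is None else rng
    phi = np.exp(-dt / T)                    # one-step correlation coefficient
    w_std = sigma * np.sqrt(1.0 - phi**2)    # keeps the process variance at sigma**2
    x = np.empty(n)
    x[0] = rng.normal(0.0, sigma)            # start in the stationary distribution
    for k in range(1, n):
        x[k] = phi * x[k - 1] + rng.normal(0.0, w_std)
    return x

Drawing x[0] from the stationary distribution avoids a transient at the start of each realization.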

Next, I used np.polyfit to fit a line to each realization. In total, I generated 300 realizations for each of the 7 T values (300×7 fits), then computed the std and mean of the slopes for each T. The resulting table is the following:

[Table: std and mean of the slopes for each T]
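
For reference, the experiment described above can be reproduced with a loop like the following sketch (reusing the hypothetical gauss_markov helper from above; the seven T values are placeholders, since the originals appear only in the plot):

import numpy as np

t = np.arange(2000)                        # time axis, dt = 1 s
T_values = [1, 5, 10, 50, 100, 500, 1000]  # placeholder correlation lengths
rng = np.random.default_rng(0)

for T in T_values:
    slopes = [np.polyfit(t, gauss_markov(T, n=2000, rng=rng), 1)[0]
              for _ in range(300)]          # slope of a degree-1 fit, 300 realizations
    print(f"T = {T:5d}: mean slope = {np.mean(slopes):+.2e}, "
          f"std of slopes = {np.std(slopes):.2e}")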

I understand that a line fitted to a signal with high correlation length will resemble the Gauss-Markov sequence more closely, but each realization will deviate a lot from the others, so the std of the slopes for the high-T signals will be large.

For low T values (little correlation), I think the many fast oscillations will average out, so the std of the slopes will be lower than the std of the slopes of the 300 realizations with high correlation. For many of the low-T realizations the fitted slope will even be close to 0.

My question is, looking at the table and plot, how can I tell that the std of the slopes is underestimated for the correlated data?

Asked By: Playstation_waifu


Answers:

This is more of a stats question than a coding question.

Nevertheless, an answer is this:

The standard deviation (std) is the root-mean-square deviation from the mean, and the standard error of an estimate such as a fitted slope is proportional to 1 / sqrt(n) for independent samples, so a larger n should reduce the std for similar deviations. The catch is the independence assumption: correlated samples carry less information each, so the effective number of independent samples is smaller than n, and any formula that plugs in the raw n underestimates the std.
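
One way to quantify this (my addition, using the standard large-n approximation for an AR(1) process with one-step correlation phi = exp(-Δt/T)) is the effective sample size n_eff ≈ n(1 - phi)/(1 + phi):

import numpy as np

n, dt = 2000, 1.0
for T in (1, 10, 100, 1000):              # example correlation lengths
    phi = np.exp(-dt / T)                 # AR(1) one-step correlation
    n_eff = n * (1 - phi) / (1 + phi)     # effective number of independent samples
    print(f"T = {T:5d}: phi = {phi:.4f}, n_eff ≈ {n_eff:8.1f} out of {n}")

As T grows, n_eff collapses, so a std computed as if all n samples were independent is far too optimistic; the same effect drives the std of the fitted slope.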

Also, you should not confuse the slope with the std of the slopes. It is true that the mean slope might be lower, yet the std, which measures the spread of the slopes across realizations, could at the same time be higher.

Notice the difference: [figure contrasting the slope of a single fit with the spread (std) of slopes across fits]
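
A direct way to see the underestimation in code: np.polyfit(..., cov=True) returns a covariance matrix whose slope variance is derived under the assumption of independent residuals. Comparing its square root with the empirical std of the slopes across realizations (again a sketch, reusing the hypothetical gauss_markov helper) shows the two agree for small T and diverge badly for large T:

import numpy as np

t = np.arange(2000)
rng = np.random.default_rng(1)

for T in (1.0, 100.0):                    # weakly vs. strongly correlated
    formula_se, mc_slopes = [], []
    for _ in range(300):
        x = gauss_markov(T, n=2000, rng=rng)
        coeffs, cov = np.polyfit(t, x, 1, cov=True)
        mc_slopes.append(coeffs[0])
        formula_se.append(np.sqrt(cov[0, 0]))   # slope std assuming white noise
    print(f"T = {T:6.1f}: formula std ≈ {np.mean(formula_se):.2e}, "
          f"empirical std = {np.std(mc_slopes):.2e}")

If the per-fit formula std is much smaller than the std observed across realizations, the white-noise assumption is violated and the reported std is underestimated.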

Answered By: D.L