Does the preprocessing of one algorithm change the conditions of the experiment?

Question:

As an example, suppose we have two algorithms that use the same dataset and the same train/test split:

1 – uses k-NN and returns the accuracy;

2 – applies preprocessing before k-NN, plus a few additional steps, and then returns the accuracy.

Although the preprocessing is part of algorithm 2, I’ve been told that we cannot compare these two methods because the preprocessing changes the conditions of the experiment.
Since the preprocessing belongs only to algorithm 2, I believe the conditions have not changed.

Which statement is the correct one?

Asked By: melson

Answers:

It depends on what you are comparing.

  • if you compare the two methods "with preprocessing allowed", then you don’t include the preprocessing in the measurement, and in principle you should test several (identical) queries;

  • if you compare them "with no preprocessing allowed", then you include everything in the measurement.
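To make the two options concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data, the min-max scaling used as the "preprocessing", and the helper names (`knn_predict`, `fit_scaler`, `apply_scaler`, `make_points`) are hypothetical and do not come from the question. It also assumes that, besides accuracy, the quantity being measured is wall-clock time. Both algorithms see the same train/test split and the same queries; the only difference is whether the preprocessing step is counted.

```python
import math
import random
import time
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points that k-NN labels correctly."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def fit_scaler(X):
    """Learn each feature's minimum and range from the training data only."""
    columns = list(zip(*X))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # avoid division by zero
    return lows, spans


def apply_scaler(X, scaler):
    """Rescale every row with the minimum and range learned by fit_scaler."""
    lows, spans = scaler
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in X]


def make_points(rng, n):
    """Hypothetical toy data: feature 0 carries the class, feature 1 is large-scale noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([label + rng.gauss(0, 0.3), rng.gauss(0, 100)])
        y.append(label)
    return X, y


rng = random.Random(0)
train_X, train_y = make_points(rng, 200)
test_X, test_y = make_points(rng, 100)  # the same split is used by both algorithms

# Algorithm 1: plain k-NN.
start = time.perf_counter()
acc_1 = accuracy(train_X, train_y, test_X, test_y)
time_1 = time.perf_counter() - start

# Algorithm 2: preprocessing (timed separately), then k-NN on the same queries.
start = time.perf_counter()
scaler = fit_scaler(train_X)
train_S = apply_scaler(train_X, scaler)
prep_time = time.perf_counter() - start

start = time.perf_counter()
acc_2 = accuracy(train_S, train_y, apply_scaler(test_X, scaler), test_y)
query_time_2 = time.perf_counter() - start

print(f"accuracy: plain={acc_1:.2f}, preprocessed={acc_2:.2f}")
print(f"time, preprocessing excluded: {time_1:.4f}s vs {query_time_2:.4f}s")
print(f"time, everything included:    {time_1:.4f}s vs {prep_time + query_time_2:.4f}s")
```

The scaler is fitted on the training data only and then applied unchanged to the test queries, so the test set stays the same for both algorithms. Which of the two timing lines is the fair comparison depends on whether preprocessing is "allowed" in the sense above.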

Answered By: Yves Daoust