KeyError: "None of [Index([''], dtype='object')] are in the [columns]"

Question:

I am trying to implement my first anomaly detection with IsolationForest, but unfortunately it does not succeed.

I have a .csv file with different network parameters like ip.ttl, frame.len, etc.

# Read in the data
quelle = pd.read_csv('./x.csv')
pdf=quelle.to_numpy()
print(quelle.columns)

Index([';ip.proto;ttl;frame.len;ip.src;ip.dst;ip.len;ip.flags;eth.src;eth.dst;eth.type;vlan.id;udp.port'], dtype='object')

print(quelle.shape)

(1658, 1)

But when I try to fit the IsolationForest model on a single column such as ip.ttl or frame.len, I get an error:

model=IsolationForest(n_estimators=50, max_samples='auto',contamination=float(0.1),max_features=1.0)
model.fit(quelle[['frame.len']])

KeyError: "None of [Index(['frame.len'], dtype='object')] are in the [columns]"

Where is my mistake?

Thanks in advance

Asked By: Anna


Answers:

The dataframe has many datapoints but only a single column.

print(quelle.shape)
(1658, 1)

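pd.read_csv splits fields on commas by default and does not detect the delimiter on its own. Your file is separated by semicolons, so each line was read as one long string and the whole header ended up as the name of a single column. That is why quelle[['frame.len']] raises a KeyError: no column with that name exists.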
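A quick way to see this is to parse the same text twice. The snippet below is only a sketch: the three data rows are made up for illustration (only the header names come from the question), and io.StringIO stands in for the real file.

```python
import io
import pandas as pd

# Hypothetical sample; only the header names are taken from the question.
sample = (
    ";ip.proto;ttl;frame.len\n"
    "0;17;64;60\n"
    "1;6;128;1514\n"
    "2;17;64;342\n"
)

# Default separator is a comma, so every line becomes one field.
default_read = pd.read_csv(io.StringIO(sample))
print(default_read.shape)            # (3, 1)

# With the semicolon separator the columns are split as intended.
semicolon_read = pd.read_csv(io.StringIO(sample), sep=";")
print(list(semicolon_read.columns))  # ['Unnamed: 0', 'ip.proto', 'ttl', 'frame.len']
```

The 'Unnamed: 0' column comes from the leading semicolon in the header, which suggests the first field is a saved row index.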

To solve this, specify the delimiter when reading the file:

pd.read_csv('./x.csv', sep=';')
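Put together with the model from the question, it could look roughly like the sketch below. The data is generated here purely for illustration, index_col=0 is an assumption based on the leading semicolon in your header, and random_state is added only to make the run repeatable.

```python
import io
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical stand-in for x.csv: 40 small frames and one large one.
rows = [f"{i};64;{60 + i % 5}" for i in range(40)] + ["40;64;1514"]
sample = ";ttl;frame.len\n" + "\n".join(rows) + "\n"

# sep=';' splits the columns; index_col=0 treats the unnamed first field as the index.
quelle = pd.read_csv(io.StringIO(sample), sep=";", index_col=0)
print(quelle.shape)  # (41, 2)

model = IsolationForest(n_estimators=50, max_samples="auto",
                        contamination=0.1, max_features=1.0, random_state=0)
model.fit(quelle[["frame.len"]])

# predict returns 1 for inliers and -1 for outliers.
labels = model.predict(quelle[["frame.len"]])
```

For the real file, replace io.StringIO(sample) with './x.csv'.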
Answered By: RaidasGrisk