DecisionTreeClassifier TypeError: fit() missing 1 required positional argument: 'y'

Question:

While using Jupyter notebook I never had this problem with the fit() function.
But with this code I do:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

data = pd.read_csv('train.csv')
test_data = pd.read_csv('test.csv')

X = data.drop(columns=['Survived'])
y = data['Survived']

model = DecisionTreeClassifier
model.fit(X, y)
prediction = model.predict(test_data)
prediction

The train.csv and test.csv files were read successfully by pandas (I inspected X and y in Jupyter).

The output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~AppDataLocalTempipykernel_232003416706318.py in <module>
      9 
     10 model = DecisionTreeClassifier
---> 11 model.fit(X, y)
     12 prediction = model.predict(test_data)
     13 prediction

TypeError: fit() missing 1 required positional argument: 'y'

How do I fix this bug?

The data used:
https://www.kaggle.com/competitions/titanic/data?select=train.csv

Asked By: Code7G

Answers:

Either your dataset does not have the column 'Survived' at that level, or data.drop() does an in-place removal, or possibly there is a third, spookier alternative.
Either way, this behaviour is caused by supplying a None argument to a function that is not prepared for it; Python just treats it as not being supplied at all.

Answered By: Harambo

Fix for the error (missing parentheses, not a syntax error):

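First of all, the error you encountered can be fixed by adding parentheses so that the model is actually instantiated: model = DecisionTreeClassifier(). Without the parentheses, model refers to the class itself rather than an instance, so model.fit(X, y) passes X in place of self and y in place of X, which leaves y missing.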
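To make the difference between the class and an instance concrete, here is a minimal sketch. The small DataFrame is a made-up stand-in for the Titanic data, and the names model_class and error_message are only illustrative. The exact exception raised by the unbound call may differ between scikit-learn versions (a TypeError in the version from the question, possibly an AttributeError in newer releases), so the sketch catches both.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in for the Titanic features and labels (assumption, not real data)
X = pd.DataFrame({"Pclass": [1, 3, 2, 3], "Fare": [71.3, 7.9, 13.0, 8.1]})
y = pd.Series([1, 0, 1, 0], name="Survived")

# Bug: no parentheses, so this name refers to the class, not an instance
model_class = DecisionTreeClassifier
try:
    # X is bound to `self` and y to `X`, so nothing is left for `y`
    model_class.fit(X, y)
    error_message = None
except (TypeError, AttributeError) as exc:
    # The exception type depends on the installed scikit-learn version
    error_message = str(exc)

# Fix: call the class to create an estimator instance, then fit it
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
prediction = model.predict(X)
```

After the fix, model is a fitted estimator and predict() works as in the notebook.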

After adding the parentheses, the code will raise another error, because your X data has multiple columns with string values (for example Name and Sex). When training a model, the algorithm (DecisionTreeClassifier) only accepts numerical feature values in X, so those columns have to be encoded or dropped first. Please see this link for more details.
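One possible way to handle the string columns is sketched below. It is not the only option. The rows are made up but shaped like the Titanic files. The feature subset, the prepare() helper, and the choice to fill missing ages with the training median are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up rows shaped like train.csv / test.csv (assumption, not real data)
data = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0],
})
test_data = pd.DataFrame({
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
    "Age": [30.0, None],
})

# Hypothetical subset of columns to train on
features = ["Pclass", "Sex", "Age"]

def prepare(frame, age_fill):
    """Return a purely numeric copy of the selected feature columns."""
    # One-hot encode the string column into 0/1 indicator columns
    out = pd.get_dummies(frame[features], columns=["Sex"], dtype=int)
    # Replace missing ages with a value computed from the training data
    out["Age"] = out["Age"].fillna(age_fill)
    return out

age_median = data["Age"].median()
X = prepare(data, age_median)
y = data["Survived"]

# Align test columns with the training columns in case a category is absent
X_test = prepare(test_data, age_median).reindex(columns=X.columns, fill_value=0)

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
prediction = model.predict(X_test)
```

The same preparation has to be applied to both the training and the test data, which is why it sits in one helper function here.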

Answered By: George