"n_classes * n_clusters_per_class must be smaller or equal 2 ** n_informative" error in make_classification function

Question:

I am generating data in Python with this call:

import sklearn.datasets
X, Y = sklearn.datasets.make_classification(n_classes=3, n_features=20, n_redundant=0, n_informative=1,
                                            n_clusters_per_class=1)

but I get this error and can't figure out how to avoid it:

ValueError: n_classes * n_clusters_per_class must be smaller or equal 2 ** n_informative

Could someone help, please?

Asked By: user7195968


Answers:

You can increase n_informative to 2.

Answered By: Rachel Kogan

The docstring says that the clusters are placed on the corners of a hypercube whose dimension is n_informative. For n_informative=1 the hypercube is a line segment, which has only 2 corner points, so at most 2 clusters can be placed. This is purely a constraint of the generation algorithm.
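
To make the corner-counting argument concrete, here is a minimal sketch that enumerates the corners of a hypercube for a few dimensions. The helper name hypercube_corners and the ±1 coordinates are illustrative assumptions, not scikit-learn's internal code; the only point is that a d-dimensional hypercube has 2 ** d corners.

```python
from itertools import product


def hypercube_corners(n_informative):
    """Return every corner of an n_informative-dimensional hypercube.

    Hypothetical helper for illustration; coordinates are fixed at -1/+1.
    """
    return list(product([-1, 1], repeat=n_informative))


for dim in (1, 2, 3):
    corners = hypercube_corners(dim)
    # The number of corners doubles with every extra informative feature.
    print(dim, len(corners), corners)
```

With one informative feature there are only two corners available, which is why three single-cluster classes cannot be placed.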

Answered By: Quickbeam2k1

The following rule must hold:

(n_classes * n_clusters_per_class) ≤ (2 ** n_informative)

With the values from the question, this becomes:
(3 * 1) ≤ (2 ** 1), i.e. 3 ≤ 2

That is false, so the constraint is not satisfied. One option is to increase n_informative to 2, which gives 3 ≤ 4 and satisfies the rule.
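
As a rough sketch of how the rule can be applied, the snippet below computes the smallest n_informative that satisfies it and then repeats the call from the question with that value. The helper min_n_informative is a hypothetical name introduced here, and random_state=0 is added only to make the run reproducible; make_classification and its parameters are the real scikit-learn API.

```python
from sklearn.datasets import make_classification


def min_n_informative(n_classes, n_clusters_per_class):
    """Smallest n_informative with n_classes * n_clusters_per_class <= 2 ** n_informative.

    Hypothetical helper; uses integer arithmetic to avoid floating-point log2.
    """
    n_clusters = n_classes * n_clusters_per_class
    # (k - 1).bit_length() equals ceil(log2(k)) for k >= 1.
    return max(1, (n_clusters - 1).bit_length())


n_informative = min_n_informative(n_classes=3, n_clusters_per_class=1)

X, Y = make_classification(
    n_classes=3,
    n_features=20,
    n_redundant=0,
    n_informative=n_informative,  # 2 here, so 3 * 1 <= 2 ** 2 holds
    n_clusters_per_class=1,
    random_state=0,
)
print(n_informative, X.shape, Y.shape)
```

Any larger n_informative would also work, as long as n_informative + n_redundant + n_repeated does not exceed n_features.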

For the meaning of each parameter, see the make_classification documentation.

Answered By: Hrisav Bhowmick