sklearn KMeans fails with "'NoneType' object has no attribute 'split'" on a non-empty array

Question:

KMeans from sklearn has suddenly stopped working, and I can't tell what I am doing wrong. Has anyone run into this problem, or does anyone know how I can fix it?

from sklearn.cluster import KMeans

kmeanModel = KMeans(n_clusters=k, random_state=0)
kmeanModel.fit(allLocations)

allLocations looks like this:

array([[12.40236   , 51.38086   ],
       [12.40999   , 51.38494   ],
       [12.40599   , 51.37284   ],
       [12.28692   , 51.32039   ],
       [12.41349   , 51.34443   ], ...])

and allLocations.dtype gives dtype('float64').

The scikit-learn version is 1.0.2, the NumPy version is 1.22.2, and I am using Jupyter Notebook.

The error says:

'NoneType' object has no attribute 'split'

The full traceback looks like this:

AttributeError                            Traceback (most recent call last)
<ipython-input-30-db8e8220c8b9> in <module>
     12 for k in K:
     13     kmeanModel = KMeans(n_clusters=k, random_state=0)
---> 14     kmeanModel.fit(allLocations)
     15     distortions.append(kmeanModel.inertia_)
     16 #Plotting the distortions
    
~\anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py in fit(self, X, y, sample_weight)
   1169         if self._algorithm == "full":
   1170             kmeans_single = _kmeans_single_lloyd
-> 1171             self._check_mkl_vcomp(X, X.shape[0])
   1172         else:
   1173             kmeans_single = _kmeans_single_elkan
    
~\anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py in _check_mkl_vcomp(self, X, n_samples)
   1026         active_threads = int(np.ceil(n_samples / CHUNK_SIZE))
   1027         if active_threads < self._n_threads:
-> 1028             modules = threadpool_info()
   1029             has_vcomp = "vcomp" in [module["prefix"] for module in modules]
   1030             has_mkl = ("mkl", "intel") in [
    
~\anaconda3\lib\site-packages\sklearn\utils\fixes.py in threadpool_info()
    323         return controller.info()
    324     else:
--> 325         return threadpoolctl.threadpool_info()
    326 
    327 
    
~\anaconda3\lib\site-packages\threadpoolctl.py in threadpool_info()
    122     In addition, each module may contain internal_api specific entries.
    123     """
--> 124     return _ThreadpoolInfo(user_api=_ALL_USER_APIS).todicts()
    125 
    126 
    
~\anaconda3\lib\site-packages\threadpoolctl.py in __init__(self, user_api, prefixes, modules)
    338 
    339             self.modules = []
--> 340             self._load_modules()
    341             self._warn_if_incompatible_openmp()
    342         else:
    
~\anaconda3\lib\site-packages\threadpoolctl.py in _load_modules(self)
    371             self._find_modules_with_dyld()
    372         elif sys.platform == "win32":
--> 373             self._find_modules_with_enum_process_module_ex()
    374         else:
    375             self._find_modules_with_dl_iterate_phdr()
    
~\anaconda3\lib\site-packages\threadpoolctl.py in _find_modules_with_enum_process_module_ex(self)
    483 
    484                 # Store the module if it is supported and selected
--> 485                 self._make_module_from_path(filepath)
    486         finally:
    487             kernel_32.CloseHandle(h_process)
    
~\anaconda3\lib\site-packages\threadpoolctl.py in _make_module_from_path(self, filepath)
    513             if prefix in self.prefixes or user_api in self.user_api:
    514                 module_class = globals()[module_class]
--> 515                 module = module_class(filepath, prefix, user_api, internal_api)
    516                 self.modules.append(module)
    517 
    
~\anaconda3\lib\site-packages\threadpoolctl.py in __init__(self, filepath, prefix, user_api, internal_api)
    604         self.internal_api = internal_api
    605         self._dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
--> 606         self.version = self.get_version()
    607         self.num_threads = self.get_num_threads()
    608         self._get_extra_info()
    
~\anaconda3\lib\site-packages\threadpoolctl.py in get_version(self)
    644                              lambda: None)
    645         get_config.restype = ctypes.c_char_p
--> 646         config = get_config().split()
    647         if config[0] == b"OpenBLAS":
    648             return config[1].decode("utf-8")
    
AttributeError: 'NoneType' object has no attribute 'split'
Asked By: kitty
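
Editor's note on the last traceback frame: lines 644 to 646 of threadpoolctl.py look up a version-query function with a fallback of `lambda: None` and then call `.split()` on whatever comes back. If the result is None, that call raises exactly this AttributeError. The sketch below only imitates that pattern with a stand-in object; `FakeLibrary`, `get_version_unguarded` and `get_version_guarded` are hypothetical names for illustration, not threadpoolctl code, and the guarded variant is just one possible way to avoid the crash, not necessarily what newer releases do.

```python
class FakeLibrary:
    """Hypothetical stand-in for a loaded library without openblas_get_config."""


def get_version_unguarded(dynlib):
    # Same shape as the failing frame: the fallback returns None.
    get_config = getattr(dynlib, "openblas_get_config", lambda: None)
    config = get_config().split()  # raises AttributeError when the result is None
    if config[0] == b"OpenBLAS":
        return config[1].decode("utf-8")
    return None


def get_version_guarded(dynlib):
    # Illustrative alternative: check for None before splitting.
    get_config = getattr(dynlib, "openblas_get_config", lambda: None)
    raw = get_config()
    if raw is None:
        return None
    config = raw.split()
    if config and config[0] == b"OpenBLAS":
        return config[1].decode("utf-8")
    return None


try:
    get_version_unguarded(FakeLibrary())
except AttributeError as exc:
    print("unguarded:", exc)

print("guarded:", get_version_guarded(FakeLibrary()))
```

This suggests the problem sits in how the native BLAS library is inspected rather than in `allLocations` or in KMeans itself.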


Answers:

I started getting the same error recently. It might have had something to do with a macOS upgrade from Sierra to Catalina, but I found that KMeans was failing when n_clusters = 1. In the following code, I changed my range to range(2, 10) instead of range(1, 10), and it started working.

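from sklearn.cluster import KMeans

distortions = []
K = range(2, 10)  # start at 2 clusters instead of 1

for k in K:
    kmeanModel = KMeans(n_clusters=k)
    kmeanModel.fit(data)  # data: the array of samples being clustered
    distortions.append(kmeanModel.inertia_)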
Answered By: user18391552

Downgrading NumPy to 1.21.4 made it work again.

Answered By: kitty

Upgrade threadpoolctl to version >3.

This works for all versions of numpy.
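
To see which threadpoolctl version the current environment has, something like the following might help. It is a minimal sketch using only the standard library; `major_version` and `needs_upgrade` are made-up helper names, and the threshold of 3 is taken from this answer rather than from the threadpoolctl changelog.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.2.0'."""
    return int(version_string.split(".")[0])


def needs_upgrade(version_string, minimum_major=3):
    """True if the version is older than the given major series."""
    return major_version(version_string) < minimum_major


try:
    installed = version("threadpoolctl")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("threadpoolctl is not installed in this environment")
elif needs_upgrade(installed):
    print(f"threadpoolctl {installed} found; try: pip install --upgrade threadpoolctl")
else:
    print(f"threadpoolctl {installed} found; already at major version 3 or newer")
```

In a Jupyter notebook, restart the kernel after upgrading so the new version is actually imported.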

Answered By: quanty

I upgraded threadpoolctl from version 2.2.0 to version 3.1.0, and this solved the issue.

Answered By: Florian Lalande

Neither of these options worked for me on macOS Ventura:
Upgrading threadpoolctl from version 2.2.0 to version 3.1.0
Downgrading NumPy to 1.21.4

I solved the issue by moving from Jupyter Notebook to Google Colab.
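
If an upgrade or downgrade seems to have no effect, one thing worth ruling out is that the notebook kernel uses a different Python environment than the one where the package was changed. The sketch below prints the interpreter path and calls `threadpoolctl.threadpool_info()` directly, which is the call that fails in the question's traceback. `check_threadpool_info` is a hypothetical helper name, and this is only a diagnostic, not a fix.

```python
import sys


def check_threadpool_info():
    """Return a short status string describing whether threadpool_info() works."""
    try:
        import threadpoolctl
    except ImportError:
        return "threadpoolctl is not importable from this interpreter"
    try:
        info = threadpoolctl.threadpool_info()
    except AttributeError as exc:
        # Same failure mode as in the question's traceback.
        return f"threadpool_info() failed: {exc}"
    version = getattr(threadpoolctl, "__version__", "unknown")
    return f"threadpoolctl {version} OK, {len(info)} native libraries detected"


print(sys.executable)  # the interpreter the kernel is actually running
print(check_threadpool_info())
```

If the reported version is still the old one, the install probably went into another environment.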

Answered By: mumutuxin