Will scikit-learn utilize GPU?

Question:

Reading the k-means implementation in TensorFlow (http://learningtensorflow.com/lesson6/) and in scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html), I’m struggling to decide which implementation to use.

scikit-learn is installed as part of the tensorflow Docker container, so I can use either implementation.

Reason to use scikit-learn:

scikit-learn contains less boilerplate than the tensorflow
implementation.

Reason to use tensorflow:

If running on an Nvidia GPU, the algorithm will be run in parallel. I’m not sure whether scikit-learn will utilize all available GPUs.

Reading https://www.quora.com/What-are-the-main-differences-between-TensorFlow-and-SciKit-Learn

TensorFlow is more low-level; basically, the Lego bricks that help
you to implement machine learning algorithms whereas scikit-learn
offers you off-the-shelf algorithms, e.g., algorithms for
classification such as SVMs, Random Forests, Logistic Regression, and
many, many more. TensorFlow shines if you want to implement
deep learning algorithms, since it allows you to take advantage of
GPUs for more efficient training.

This statement reinforces my assertion that "scikit-learn contains less boilerplate than the tensorflow implementation", but it also suggests that scikit-learn will not utilize all available GPUs.
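For reference, the scikit-learn side really is only a few lines. This is a minimal sketch with made-up toy data and an arbitrary n_clusters=2, not the actual workload:

import numpy as np
from sklearn.cluster import KMeans

# Toy data with two obvious clusters.
X = np.array([[1., 2.], [1., 4.], [1., 0.],
              [10., 2.], [10., 4.], [10., 0.]])

# Fit and inspect the result; this runs on the CPU only.
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
print(kmeans.cluster_centers_)
print(kmeans.labels_)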

Asked By: blue-sky


Answers:

TensorFlow only uses the GPU if it is built against CUDA and cuDNN. By default it does not use the GPU, especially if it is running inside Docker, unless you use nvidia-docker and an image with built-in GPU support.
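As a quick sanity check inside the container, you can ask TensorFlow which devices it can see (a minimal sketch; the call below is the TensorFlow 2.x API, older 1.x builds expose tf.test.is_gpu_available() instead):

import tensorflow as tf

# An empty list means this build/container has no usable CUDA + cuDNN GPU.
print(tf.config.list_physical_devices('GPU'))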

Scikit-learn is not intended to be used as a deep-learning framework and it does not provide any GPU support.

Why is there no support for deep or reinforcement learning / Will there be support for deep or reinforcement learning in scikit-learn?

Deep learning and reinforcement learning both require a rich
vocabulary to define an architecture, with deep learning additionally
requiring GPUs for efficient computing. However, neither of these fit
within the design constraints of scikit-learn; as a result, deep
learning and reinforcement learning are currently out of scope for
what scikit-learn seeks to achieve.

Extracted from http://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-support-for-deep-or-reinforcement-learning-in-scikit-learn

Will you add GPU support in scikit-learn?

No, or at least not in the near future. The main reason is that GPU
support will introduce many software dependencies and introduce
platform specific issues. scikit-learn is designed to be easy to
install on a wide variety of platforms. Outside of neural networks,
GPUs don’t play a large role in machine learning today, and much
larger gains in speed can often be achieved by a careful choice of
algorithms.

Extracted from http://scikit-learn.org/stable/faq.html#will-you-add-gpu-support

Answered By: Ivan De Paz Centeno

I’m experimenting with a drop-in solution (h2o4gpu) to take advantage of GPU acceleration, in particular for KMeans.

Try this:

from h2o4gpu.solvers import KMeans  # GPU-backed, scikit-learn-compatible solver
#from sklearn.cluster import KMeans  # original CPU import, kept for comparison

As of now, version 0.3.2 still doesn’t have .inertia_, but I think it’s on their TODO list.
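For context, a minimal sketch of how the swapped import is meant to be used, assuming h2o4gpu keeps the scikit-learn estimator API (the toy data and n_clusters value are made up for illustration):

import numpy as np
from h2o4gpu.solvers import KMeans

# Toy data; in practice this would be your own feature matrix.
X = np.array([[1., 2.], [1., 4.], [10., 2.], [10., 4.]], dtype=np.float32)

# Same call pattern as sklearn.cluster.KMeans, but backed by the GPU solver.
model = KMeans(n_clusters=2).fit(X)
print(model.cluster_centers_)
print(model.predict(X))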

EDIT: I haven’t tested it yet, but scikit-cuda seems to be getting traction.

EDIT: RAPIDS is really the way to go here.

Answered By: martin

From my experience, I use the Intel(R) Extension for Scikit-learn (sklearnex) package to utilize the GPU for some sklearn algorithms.

The code I use:

import numpy as np
import dpctl  # required for SYCL device selection / GPU offload
from sklearnex import patch_sklearn, config_context

# Patch scikit-learn so supported estimators use the accelerated implementations.
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

# Offload the computation to the first GPU device.
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
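Because the patched estimator keeps the normal scikit-learn interface, the fitted object exposes the usual attributes afterwards, e.g.:

print(clustering.labels_)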

Source: oneAPI and GPU support in Intel(R) Extension for Scikit-learn

Answered By: M.Vu