Commit f202b14

rewrite docs after CPU/GPU split (#3097)
1 parent 29ec7ad commit f202b14

8 files changed

Lines changed: 245 additions & 369 deletions

doc/sources/about.rst

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
1+
.. Copyright contributors to the oneDAL project
2+
..
3+
.. Licensed under the Apache License, Version 2.0 (the "License");
4+
.. you may not use this file except in compliance with the License.
5+
.. You may obtain a copy of the License at
6+
..
7+
.. http://www.apache.org/licenses/LICENSE-2.0
8+
..
9+
.. Unless required by applicable law or agreed to in writing, software
10+
.. distributed under the License is distributed on an "AS IS" BASIS,
11+
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
.. See the License for the specific language governing permissions and
13+
.. limitations under the License.
14+
.. include:: substitutions.rst
15+
16+
=====================
17+
About the |sklearnex|
18+
=====================
19+
20+
The |sklearnex| is a free and open-source software accelerator built on top of the |sklearn| and the :external+onedal:doc:`oneDAL <index>` (|onedal|) libraries.
21+
22+
It mostly works by replacing selected calls to algorithms in |sklearn| with calls to the |onedal| library, which offers more optimized versions of the same routines (see :doc:`algorithms`). The optimizations in the |onedal| in turn are achieved by leveraging SIMD instructions and exploiting cache structures of modern hardware, along with using the `oneMKL <https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html>`__ library for linear algebra operations in place of the `OpenBLAS <https://www.openmathlib.org/OpenBLAS/>`__ library used by default by |sklearn|.
23+
24+
Unlike other libraries in the Python ecosystem, classes and functions in the |sklearnex| are not just :external+sklearn:doc:`scikit-learn-compatible <developers/develop>`, but rather are built on top of |sklearn| itself: they inherit from its classes directly, define the same attributes that the stock version of |sklearn| would define for each estimator, and reuse most of scikit-learn's estimator methods where appropriate.
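The inheritance pattern described above can be pictured with a toy sketch. Nothing below is real |sklearn| or |sklearnex| code: ``StockEstimator``, ``AcceleratedEstimator``, ``_backend_supports`` and the ``backend_`` attribute are hypothetical stand-ins, used only to show a subclass that reuses the parent's methods and falls back to the parent's ``fit`` when an input is not supported.

```python
class StockEstimator:
    """Hypothetical stand-in for a stock estimator class."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        self.backend_ = "stock"  # illustrative marker, not a real attribute
        return self

    def predict(self, X):
        # Centers the inputs using the fitted mean.
        return [x - self.mean_ for x in X]


class AcceleratedEstimator(StockEstimator):
    """Hypothetical subclass: same attributes, inherited predict()."""

    def _backend_supports(self, X):
        # Assumption for illustration: only float inputs are "accelerated".
        return all(isinstance(x, float) for x in X)

    def fit(self, X):
        if not self._backend_supports(X):
            # Unsupported input: defer entirely to the parent class.
            return super().fit(X)
        # Stand-in for a call into an optimized routine.
        self.mean_ = sum(X) / len(X)
        self.backend_ = "accelerated"
        return self


fast = AcceleratedEstimator().fit([1.0, 2.0, 3.0])
slow = AcceleratedEstimator().fit([1, 2, 3])
print(fast.backend_, slow.backend_, isinstance(fast, StockEstimator))
```

Because the subclass is a genuine subclass, ``isinstance`` checks and inherited methods such as ``predict`` keep working unchanged in this sketch.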
25+
26+
The |sklearnex| is regularly tested for API compatibility and for correctness against |sklearn|'s own test suite (see :ref:`conformance_tests` for more information), and can be easily swapped in place of the stock |sklearn| library by :doc:`patching <patching>` it.
27+
28+
The |sklearnex| aims to be compatible with the last 3 minor releases of |sklearn| available at any given time, in addition to the 1.0 release as a special case. It ensures this compatibility by offering different code routes according to the |sklearn| version encountered at runtime. For example, if a given attribute of a class is removed in version 1.x of |sklearn|, the |sklearnex| will not set that attribute when running with |sklearn| >=1.x, but will still do so when running with |sklearn| <1.x, in order to guarantee full API compatibility.
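The version-dependent code routes described above can be sketched roughly as follows. This is a minimal illustration and not the library's actual implementation: the helpers ``parse_version`` and ``version_at_least``, the class ``ExampleEstimator``, the attribute ``legacy_attr_`` and the cutoff version ``1.4`` are all hypothetical.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract the leading (major, minor) integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def version_at_least(installed: str, required: str) -> bool:
    """Compare versions numerically, so that 1.10 is newer than 1.4."""
    return parse_version(installed) >= parse_version(required)


class ExampleEstimator:
    # Hypothetical: pretend 'legacy_attr_' was removed upstream in 1.4.
    REMOVED_IN = "1.4"

    def fit(self, sklearn_version: str):
        self.coef_ = [0.0]
        # Only set the attribute for versions that still define it.
        if not version_at_least(sklearn_version, self.REMOVED_IN):
            self.legacy_attr_ = "kept for API compatibility"
        return self


old = ExampleEstimator().fit("1.3.2")
new = ExampleEstimator().fit("1.5.0")
print(hasattr(old, "legacy_attr_"), hasattr(new, "legacy_attr_"))
```

In real code the version would be read from the installed package rather than passed as an argument. It is a parameter here only to keep the sketch self-contained.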
29+
30+
Performance of the |sklearnex| is regularly measured and compared against that of other libraries using public and synthetic datasets through `sklbench <https://github.com/IntelPython/scikit-learn_bench>`__, which is also free and fully open-source.
31+
32+
Initially developed by Intel as the Intel Extension for Scikit-learn*, the |sklearnex| and the |onedal| are now projects under the `UXL Foundation <https://uxlfoundation.org>`__ umbrella, and can be built from source to provide accelerated routines for other platforms such as ARM and RISC-V - see :doc:`building-from-source` for more information.

doc/sources/array_api.rst

Lines changed: 9 additions & 3 deletions
@@ -32,12 +32,18 @@ on GPU without moving the data from host to device.
3232
be :external+sklearn:doc:`enabled in scikit-learn <modules/array_api>`, which requires either changing
3333
global settings or using a :doc:`config_context <config-contexts>`.
3434

35+
.. hint::
36+
Executing computations on GPUs has additional dependencies, particularly on package ``scikit-learn-intelex-gpu`` - see
37+
:doc:`oneapi-gpu` for details.
38+
3539
When passing array API inputs whose data is on a SYCL-enabled device (e.g. an Intel GPU), as
3640
supported for example by `PyTorch <https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html>`__
37-
and |dpnp|, if array API support is enabled and the requested operation (e.g. call to ``.fit()`` / ``.predict()``
41+
and |dpnp|, if array API support is enabled, :doc:`GPU dependencies <oneapi-gpu>` are available, and the
42+
requested operation (e.g. call to ``.fit()`` / ``.predict()``
3843
on the estimator class being used) is :ref:`supported on device/GPU <sklearn_algorithms_gpu>`, computations
39-
will be performed on the device where the data lives, without involving any data transfers. Note that all of
40-
the inputs (e.g. ``X`` and ``y`` passed to ``.fit()`` methods) must be allocated on the same device for this to
44+
will be performed on the device where the data lives, without involving any data transfers.
45+
46+
Note that all of the inputs (e.g. ``X`` and ``y`` passed to ``.fit()`` methods) must be allocated on the same device for this to
4147
work. If the requested operation is not supported on the device where the data lives, then it will either fall
4248
back to |sklearn|, or to an accelerated CPU version from the |sklearnex| when supported - these are controllable
4349
through options ``allow_sklearn_after_onedal`` (default is ``True``) and ``allow_fallback_to_host`` (default is

doc/sources/building-from-source.rst

Lines changed: 4 additions & 2 deletions
@@ -24,6 +24,8 @@ The |sklearnex| predominantly functions as a frontend to the |onedal| by leverag
2424

2525
.. note:: Python packages ``dal`` (conda) and ``daal`` (PyPI) provide the same components, but due to naming availability in these repositories, they are distributed under different names.
2626

27+
.. note:: When installing the |onedal| through ``pip`` or ``conda``, the files required for running on GPU are contained in a separate package ``dal-gpu`` / ``daal-gpu``, while the standalone installers, APT/YUM packages and others have the required files for GPU in the same package.
28+
2729
As a library, the |sklearnex| consists of a Python codebase with Python extension modules written in C++ and Cython, with some of those modules being optional. These extension modules require compilation before being used, for which a C++ compiler along with other dependencies is required. In the case of GPU-related modules, a SYCL compiler (such as `Intel's DPC++ <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html>`__) is required, and in the case of distributed mode, whether on CPU or on GPU, an MPI backend is required, such as `Intel MPI <https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html>`__.
2830

2931
The extension modules are as follows:
@@ -246,9 +248,9 @@ The following environment variables can be used to control setup aspects:
246248
- ``DALROOT``: sets the |onedal| path.
247249
- ``MKLROOT``: path to the oneMKL runtime libraries, which are used for the DPC module. This variable is optional and only has an effect when using the option ``abs-rpath`` on Linux* (see the rest of this page for details).
248250
- ``MPIROOT``: sets the path to the MPI library. If this variable is not set but ``I_MPI_ROOT`` is found, will use ``I_MPI_ROOT`` instead. Not used when using ``NO_DIST=1``.
249-
- ``NO_DIST``: set to '1', 'yes' or alike to build without support for distributed mode.
251+
- ``NO_DIST``: set to '1', 'yes' or alike to build without support for distributed mode. Note that distributed mode in the ``sklearnex`` module requires building with DPC++ support.
250252
- ``NO_STREAM``: set to '1', 'yes' or alike to build without support for streaming mode.
251-
- ``NO_DPC``: set to '1', 'yes' or alike to build without support of oneDAL DPC++ interfaces.
253+
- ``NO_DPC``: set to '1', 'yes' or alike to build without support of the |onedal| DPC++ interfaces (GPU). Note that building the DPC++ component of this library (the default) also requires the DPC++ components of the |onedal| (packages ``dal-gpu`` / ``daal-gpu`` if installing it from ``conda`` or ``pip``).
252254
- ``MAKEFLAGS``: the last ``-j`` flag determines the number of threads for building the onedal extension. It will default to the number of CPU threads when not set.
253255

254256
.. note:: The ``-j`` flag in the ``MAKEFLAGS`` environment variable is superseded in ``setup.py`` modes which support the ``--parallel`` and ``-j`` command line flags.
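The ``MAKEFLAGS`` rule above (the last ``-j`` flag wins, otherwise default to the number of CPU threads) can be illustrated with a short sketch. This is not the code used by ``setup.py``: the function name ``jobs_from_makeflags`` and the regular expression are assumptions made for this example.

```python
import os
import re


def jobs_from_makeflags(makeflags=None):
    """Return the job count from the last '-j' flag, else the CPU count."""
    if makeflags is None:
        makeflags = os.environ.get("MAKEFLAGS", "")
    # Accept both '-j8' and '-j 8'; each match captures the number.
    matches = re.findall(r"(?:^|\s)-j\s*(\d+)", makeflags)
    if matches:
        # Several '-j' flags may be present: the last one takes effect.
        return int(matches[-1])
    # No '-j' flag found: fall back to the number of CPU threads.
    return os.cpu_count() or 1


print(jobs_from_makeflags("-j2 -k -j8"))
```

Passing the string explicitly, as in the call above, avoids depending on the environment of the process that runs the sketch.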

doc/sources/distributed-mode.rst

Lines changed: 0 additions & 18 deletions
@@ -38,13 +38,6 @@ via the ``impi_rt`` / ``impi-rt`` python/conda package) and the |mpi4py| python
3838

3939
conda install -c conda-forge mpi4py mpi=*=impi
4040

41-
.. tab:: From Intel's conda channel
42-
::
43-
44-
conda install -c https://software.repos.intel.com/python/conda/ -c conda-forge --override-channels mpi4py mpi=*=impi
45-
46-
.. warning:: Packages from the Intel channel are meant to be compatible with dependencies from ``conda-forge``, and might not work correctly in environments that have packages installed from the ``anaconda`` channel.
47-
4841
.. tab:: From PyPI
4942
::
5043

@@ -64,22 +57,11 @@ via the ``impi_rt`` / ``impi-rt`` python/conda package) and the |mpi4py| python
6457

6558
conda install -c conda-forge impi_rt mpi=*=impi
6659

67-
.. tab:: From Intel's conda channel
68-
::
69-
70-
conda install -c https://software.repos.intel.com/python/conda/ -c conda-forge --override-channels impi_rt mpi=*=impi
71-
7260
.. tab:: From PyPI
7361
::
7462

7563
pip install impi-rt
7664

77-
.. tab:: From Intel's pip Index
78-
::
79-
80-
pip install --index-url https://software.repos.intel.com/python/pypi impi-rt
81-
82-
8365
Using other MPI backends that are not MPICH-compatible (e.g. OpenMPI) requires building |sklearnex| from source with that backend, and using an |mpi4py| built with that same backend.
8466

8567

doc/sources/index.rst

Lines changed: 111 additions & 95 deletions
@@ -14,43 +14,131 @@
1414
1515
.. include:: substitutions.rst
1616

17+
.. toctree::
18+
:caption: Getting Started
19+
:hidden:
20+
:maxdepth: 3
21+
22+
Quick Start <self>
23+
installation.rst
24+
about.rst
25+
26+
.. toctree::
27+
:caption: Documentation topics
28+
:hidden:
29+
:maxdepth: 4
30+
31+
patching.rst
32+
algorithms.rst
33+
oneapi-gpu.rst
34+
config-contexts.rst
35+
array_api.rst
36+
serialization.rst
37+
distributed-mode.rst
38+
distributed_daal4py.rst
39+
non-scikit-algorithms.rst
40+
non_sklearn_d4p.rst
41+
model_builders.rst
42+
logistic_model_builder.rst
43+
input-types.rst
44+
verbose.rst
45+
parallelism.rst
46+
preview.rst
47+
deprecation.rst
48+
49+
.. toctree::
50+
:caption: daal4py
51+
:hidden:
52+
53+
about_daal4py.rst
54+
daal4py.rst
55+
56+
.. toctree::
57+
:caption: Development guides
58+
:hidden:
59+
60+
building-from-source.rst
61+
tests.rst
62+
contribute.rst
63+
contributor-reference.rst
64+
ideas.rst
65+
66+
.. toctree::
67+
:caption: Performance
68+
:hidden:
69+
:maxdepth: 2
70+
71+
guide/acceleration.rst
72+
73+
.. toctree::
74+
:caption: Learn
75+
:hidden:
76+
:maxdepth: 2
77+
78+
Tutorials & Case Studies <tutorials.rst>
79+
Medium Blogs <blogs.rst>
80+
81+
.. toctree::
82+
:caption: More
83+
:hidden:
84+
:maxdepth: 2
85+
86+
support.rst
87+
code-of-conduct.rst
88+
license.rst
89+
90+
.. toctree::
91+
:caption: Examples
92+
:hidden:
93+
:maxdepth: 3
94+
95+
samples.rst
96+
kaggle.rst
97+
1798
.. _index:
1899

19-
###########
20-
|sklearnex|
21-
###########
100+
Introduction
101+
============
22102

23103
|sklearnex| is a **free software AI accelerator** designed to deliver up to **100X** faster performance for your existing |sklearn| code.
24104
The software acceleration is achieved with vector instructions, AI hardware-specific memory optimizations, threading, and optimizations.
25105

26-
.. rubric:: Designed for Data Scientists and Framework Designers
27-
28-
29-
Use |sklearnex|, to:
106+
Benefits:
30107

31-
* Speed up training and inference by up to 100x with equivalent mathematical accuracy
32-
* Benefit from performance improvements across different x86-64 CPUs and Intel(R) GPUs (including iGPUs)
33-
* Integrate the extension into your existing |sklearn| applications without code modifications
34-
* Enable and disable the extension with a couple of lines of code or at the command line
108+
* Speed up training and inference by up to 100x with equivalent mathematical accuracy.
109+
* Benefit from performance improvements across different hardware configurations, including :doc:`GPUs <oneapi-gpu>` and :doc:`multi-GPU <distributed-mode>` configurations.
110+
* Integrate the extension into your existing |sklearn| applications without code modifications.
111+
* Continue to use the open-source |sklearn| API.
112+
* Enable and disable the extension with a couple of lines of code or at the command line.
35113

36114
.. image:: _static/scikit-learn-acceleration.PNG
37115
:width: 800
38116

117+
(`Benchmarks code <https://github.com/IntelPython/scikit-learn_bench>`__)
39118

40-
These performance charts use benchmarks that you can find in the `scikit-learn bench repository <https://github.com/IntelPython/scikit-learn_bench>`_.
119+
See :doc:`about` for more information.
41120

121+
Quick Install
122+
=============
42123

43-
Supported Algorithms
44-
--------------------
124+
.. tabs::
125+
126+
.. tab:: From PyPI
127+
128+
.. code-block::
129+
130+
pip install scikit-learn-intelex
131+
132+
.. tab:: From conda-forge
45133

46-
See all of the :ref:`sklearn_algorithms`.
134+
.. code-block::
47135
136+
conda install -c conda-forge scikit-learn-intelex --override-channels
48137
49-
Optimizations
50-
-------------
138+
See :doc:`installation` for more details.
51139

52-
Enable CPU Optimizations
53-
************************
140+
Example Usage
141+
=============
54142

55143
.. tabs::
56144
.. tab:: By patching
@@ -66,6 +154,8 @@ Enable CPU Optimizations
66154
[8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
67155
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
68156
157+
See :doc:`patching` for more details.
158+
69159
.. tab:: Without patching
70160
.. code-block:: python
71161
@@ -77,8 +167,8 @@ Enable CPU Optimizations
77167
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
78168
79169
80-
Enable GPU optimizations
81-
************************
170+
Running on GPUs
171+
===============
82172

83173
Note: executing on GPU has `additional system software requirements <https://www.intel.com/content/www/us/en/developer/articles/system-requirements/intel-oneapi-dpcpp-system-requirements.html>`__ - see :doc:`oneapi-gpu`.
84174

@@ -150,77 +240,3 @@ Note: executing on GPU has `additional system software requirements <https://www
150240
151241
152242
See :ref:`oneapi_gpu` for other ways of executing on GPU.
153-
154-
155-
.. toctree::
156-
:caption: Getting Started
157-
:hidden:
158-
:maxdepth: 3
159-
160-
quick-start.rst
161-
samples.rst
162-
kaggle.rst
163-
164-
.. toctree::
165-
:caption: Documentation topics
166-
:hidden:
167-
:maxdepth: 4
168-
169-
patching.rst
170-
algorithms.rst
171-
oneapi-gpu.rst
172-
config-contexts.rst
173-
array_api.rst
174-
serialization.rst
175-
distributed-mode.rst
176-
distributed_daal4py.rst
177-
non-scikit-algorithms.rst
178-
non_sklearn_d4p.rst
179-
model_builders.rst
180-
logistic_model_builder.rst
181-
input-types.rst
182-
verbose.rst
183-
parallelism.rst
184-
preview.rst
185-
deprecation.rst
186-
187-
.. toctree::
188-
:caption: daal4py
189-
:hidden:
190-
191-
about_daal4py.rst
192-
daal4py.rst
193-
194-
.. toctree::
195-
:caption: Development guides
196-
:hidden:
197-
198-
building-from-source.rst
199-
tests.rst
200-
contribute.rst
201-
contributor-reference.rst
202-
ideas.rst
203-
204-
.. toctree::
205-
:caption: Performance
206-
:hidden:
207-
:maxdepth: 2
208-
209-
guide/acceleration.rst
210-
211-
.. toctree::
212-
:caption: Learn
213-
:hidden:
214-
:maxdepth: 2
215-
216-
Tutorials & Case Studies <tutorials.rst>
217-
Medium Blogs <blogs.rst>
218-
219-
.. toctree::
220-
:caption: More
221-
:hidden:
222-
:maxdepth: 2
223-
224-
support.rst
225-
code-of-conduct.rst
226-
license.rst
