diff --git a/doc/sources/about.rst b/doc/sources/about.rst
Lines changed: 32 additions & 0 deletions
diff --git a/doc/sources/array_api.rst b/doc/sources/array_api.rst
Lines changed: 9 additions & 3 deletions
diff --git a/doc/sources/building-from-source.rst b/doc/sources/building-from-source.rst
Lines changed: 4 additions & 2 deletions
diff --git a/doc/sources/distributed-mode.rst b/doc/sources/distributed-mode.rst
Lines changed: 0 additions & 18 deletions
diff --git a/doc/sources/index.rst b/doc/sources/index.rst
Lines changed: 111 additions & 95 deletions
@@ -0,0 +1,32 @@
+.. Copyright contributors to the oneDAL project
+..
+.. Licensed under the Apache License, Version 2.0 (the "License");
+.. you may not use this file except in compliance with the License.
+.. You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+.. include:: substitutions.rst
+
+=====================
+About the |sklearnex|
+=====================
+
+The |sklearnex| is a free and open-source software accelerator built on top of the |sklearn| and the :external+onedal:doc:`oneDAL <index>` (|onedal|) libraries.
+
+It mostly works by replacing selected calls to algorithms in |sklearn| with calls to the |onedal| library, which offers more optimized versions of the same routines (see :doc:`algorithms`). The optimizations in the |onedal| in turn are achieved by leveraging SIMD instructions and exploiting cache structures of modern hardware, along with using the `oneMKL <https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html>`__ library for linear algebra operations in place of the `OpenBLAS <https://www.openmathlib.org/OpenBLAS/>`__ library used by default by |sklearn|.
+
+Unlike other libraries in the Python ecosystem, classes and functions in the |sklearnex| are not just :external+sklearn:doc:`scikit-learn-compatible <developers/develop>`, but rather are built on top of |sklearn| itself by inheriting from its classes directly, defining the same attributes that the stock version of |sklearn| would define for each estimator, and reusing most of scikit-learn's estimator methods where appropriate.
+
+The |sklearnex| is regularly tested for API compatibility and for correctness against |sklearn|'s own test suite (see :ref:`conformance_tests` for more information), and can be easily swapped in place of the stock |sklearn| library by :doc:`patching <patching>` it.
+
+The |sklearnex| aims to be compatible with the last 3 minor releases of |sklearn| available at any given time, in addition to the 1.0 release as a special case, and ensures this compatibility by offering different code routes according to the |sklearn| version encountered at runtime. For example, if a given attribute of a class is removed in version 1.x of |sklearn|, the |sklearnex| will not set that attribute when running with |sklearn| >=1.x, but will still do so when running with |sklearn| <1.x, in order to guarantee full API compatibility.
+
+Performance of the |sklearnex| is regularly measured and compared against that of other libraries using public and synthetic datasets through `sklbench <https://github.com/IntelPython/scikit-learn_bench>`__, which is also free and fully open-source.
+
+Initially developed by Intel as the Intel Extension for Scikit-learn*, the |sklearnex| and the |onedal| are now projects under the `UXL Foundation <https://uxlfoundation.org>`__ umbrella, and can be built from source to provide accelerated routines for other platforms such as ARM and RISC-V - see :doc:`building-from-source` for more information.
@@ -32,12 +32,18 @@ on GPU without moving the data from host to device.
     be :external+sklearn:doc:`enabled in scikit-learn <modules/array_api>`, which requires either changing
     global settings or using a :doc:`config_context <config-contexts>`.
 
+.. hint::
+    Executing computations on GPUs has additional dependencies, particularly the ``scikit-learn-intelex-gpu`` package - see
+    :doc:`oneapi-gpu` for details.
+
 When passing array API inputs whose data is on a SYCL-enabled device (e.g. an Intel GPU), as
 supported for example by `PyTorch <https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html>`__
-and |dpnp|, if array API support is enabled and the requested operation (e.g. call to ``.fit()`` / ``.predict()``
+and |dpnp|, if array API support is enabled, :doc:`GPU dependencies <oneapi-gpu>` are available, and the
+requested operation (e.g. call to ``.fit()`` / ``.predict()``
 on the estimator class being used) is :ref:`supported on device/GPU <sklearn_algorithms_gpu>`, computations
-will be performed on the device where the data lives, without involving any data transfers. Note that all of
-the inputs (e.g. ``X`` and ``y`` passed to ``.fit()`` methods) must be allocated on the same device for this to
+will be performed on the device where the data lives, without involving any data transfers.
+
+Note that all of the inputs (e.g. ``X`` and ``y`` passed to ``.fit()`` methods) must be allocated on the same device for this to
 work. If the requested operation is not supported on the device where the data lives, then it will either fall
 back to |sklearn|, or to an accelerated CPU version from the |sklearnex| when supported - these are controllable
 through options ``allow_sklearn_after_onedal`` (default is ``True``) and ``allow_fallback_to_host`` (default is
 
@@ -24,6 +24,8 @@ The |sklearnex| predominantly functions as a frontend to the |onedal| by leverag
 
 .. note:: Python packages ``dal`` (conda) and ``daal`` (PyPI) provide the same components, but due to naming availability in these repositories, they are distributed under different names.
 
+.. note:: When installing the |onedal| through ``pip`` or ``conda``, the files required for running on GPU are contained in a separate package (``dal-gpu`` / ``daal-gpu``), while the standalone installers, APT/YUM packages and others have the required files for GPU in the same package.
+
 As a library, the |sklearnex| consists of a Python codebase with Python extension modules written in C++ and Cython, with some of those modules being optional. These extension modules require compilation before being used, for which a C++ compiler along with other dependencies is required. In the case of GPU-related modules, a SYCL compiler (such as `Intel's DPC++ <https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html>`__) is required, and in the case of distributed mode, whether on CPU or on GPU, an MPI backend is required, such as `Intel MPI <https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html>`__.
 
 The extension modules are as follows:
@@ -246,9 +248,9 @@ The following environment variables can be used to control setup aspects:
 - ``DALROOT``: sets the |onedal| path.
 - ``MKLROOT``: path to the oneMKL runtime libraries, which are used for the DPC module. This variable is optional and only has an effect when using the option ``abs-rpath`` on Linux* (see the rest of this page for details).
 - ``MPIROOT``: sets the path to the MPI library. If this variable is not set but ``I_MPI_ROOT`` is found, will use ``I_MPI_ROOT`` instead. Not used when using ``NO_DIST=1``.
-- ``NO_DIST``: set to '1', 'yes' or alike to build without support for distributed mode.
+- ``NO_DIST``: set to '1', 'yes' or alike to build without support for distributed mode. Note that distributed mode in the ``sklearnex`` module requires building with DPC++ support.
 - ``NO_STREAM``: set to '1', 'yes' or alike to build without support for streaming mode.
-- ``NO_DPC``: set to '1', 'yes' or alike to build without support of oneDAL DPC++ interfaces.
+- ``NO_DPC``: set to '1', 'yes' or alike to build without support of the |onedal| DPC++ interfaces (GPU). Note that building the DPC++ component (default) of this library also requires the DPC++ components of the |onedal| (packages ``dal-gpu`` / ``daal-gpu`` if installing it from ``conda`` or ``pip``).
 - ``MAKEFLAGS``: the last `-j` flag determines the number of threads for building the onedal extension. It will default to the number of CPU threads when not set.
 
 .. note:: The ``-j`` flag in the ``MAKEFLAGS`` environment variable is superseded in ``setup.py`` modes which support the ``--parallel`` and ``-j`` command line flags.
 
@@ -38,13 +38,6 @@ via the ``impi_rt``  / ``impi-rt`` python/conda package) and the |mpi4py| python
 
                 conda install -c conda-forge mpi4py mpi=*=impi
 
-        .. tab:: From Intel's conda channel
-            ::
-
-                conda install -c https://software.repos.intel.com/python/conda/ -c conda-forge --override-channels mpi4py mpi=*=impi
-
-            .. warning:: Packages from the Intel channel are meant to be compatible with dependencies from ``conda-forge``, and might not work correctly in environments that have packages installed from the ``anaconda`` channel.
-
         .. tab:: From PyPI
             ::
 
@@ -64,22 +57,11 @@ via the ``impi_rt``  / ``impi-rt`` python/conda package) and the |mpi4py| python
 
                 conda install -c conda-forge impi_rt mpi=*=impi
 
-        .. tab:: From Intel's conda channel
-            ::
-
-                conda install -c https://software.repos.intel.com/python/conda/ -c conda-forge --override-channels impi_rt mpi=*=impi
-
         .. tab:: From PyPI
             ::
 
                 pip install impi-rt
 
-        .. tab:: From Intel's pip Index
-            ::
-
-                pip install --index-url https://software.repos.intel.com/python/pypi impi-rt
-
-
   Using other MPI backends that are not MPICH-compatible (e.g. OpenMPI) requires building |sklearnex| from source with that backend, and using an |mpi4py| built with that same backend.
 
 
 
@@ -14,43 +14,131 @@
 
 .. include:: substitutions.rst
 
+.. toctree::
+   :caption: Getting Started
+   :hidden:
+   :maxdepth: 3
+
+   Quick Start <self>
+   installation.rst
+   about.rst
+
+.. toctree::
+   :caption: Documentation topics
+   :hidden:
+   :maxdepth: 4
+
+   patching.rst
+   algorithms.rst
+   oneapi-gpu.rst
+   config-contexts.rst
+   array_api.rst
+   serialization.rst
+   distributed-mode.rst
+   distributed_daal4py.rst
+   non-scikit-algorithms.rst
+   non_sklearn_d4p.rst
+   model_builders.rst
+   logistic_model_builder.rst
+   input-types.rst
+   verbose.rst
+   parallelism.rst
+   preview.rst
+   deprecation.rst
+
+.. toctree::
+   :caption: daal4py
+   :hidden:
+
+   about_daal4py.rst
+   daal4py.rst
+
+.. toctree::
+   :caption: Development guides
+   :hidden:
+
+   building-from-source.rst
+   tests.rst
+   contribute.rst
+   contributor-reference.rst
+   ideas.rst
+
+.. toctree::
+   :caption: Performance
+   :hidden:
+   :maxdepth: 2
+
+   guide/acceleration.rst
+
+.. toctree::
+   :caption: Learn
+   :hidden:
+   :maxdepth: 2
+
+   Tutorials & Case Studies <tutorials.rst>
+   Medium Blogs <blogs.rst>
+
+.. toctree::
+   :caption: More
+   :hidden:
+   :maxdepth: 2
+
+   support.rst
+   code-of-conduct.rst
+   license.rst
+
+.. toctree::
+   :caption: Examples
+   :hidden:
+   :maxdepth: 3
+
+   samples.rst
+   kaggle.rst
+
 .. _index:
 
-###########
-|sklearnex|
-###########
+Introduction
+============
 
 |sklearnex| is a **free software AI accelerator** designed to deliver up to **100X** faster performance for your existing |sklearn| code.
 The software acceleration is achieved with vector instructions, AI hardware-specific memory optimizations, threading, and optimizations.
 
-.. rubric:: Designed for Data Scientists and Framework Designers
-
-
-Use |sklearnex|, to:
+Benefits:
 
-* Speed up training and inference by up to 100x with equivalent mathematical accuracy
-* Benefit from performance improvements across different x86-64 CPUs and Intel(R) GPUs (including iGPUs)
-* Integrate the extension into your existing |sklearn| applications without code modifications
-* Enable and disable the extension with a couple of lines of code or at the command line
+* Speed up training and inference by up to 100x with equivalent mathematical accuracy.
+* Benefit from performance improvements across different hardware configurations, including :doc:`GPUs <oneapi-gpu>` and :doc:`multi-GPU <distributed-mode>` configurations.
+* Integrate the extension into your existing |sklearn| applications without code modifications.
+* Continue to use the open-source |sklearn| API.
+* Enable and disable the extension with a couple of lines of code or at the command line.
 
 .. image:: _static/scikit-learn-acceleration.PNG
   :width: 800
 
+(`Benchmarks code <https://github.com/IntelPython/scikit-learn_bench>`__)
 
-These performance charts use benchmarks that you can find in the `scikit-learn bench repository <https://github.com/IntelPython/scikit-learn_bench>`_.
+See :doc:`about` for more information.
 
+Quick Install
+=============
 
-Supported Algorithms
---------------------
+.. tabs::
+
+    .. tab:: From PyPI
+
+        .. code-block::
+
+            pip install scikit-learn-intelex
+
+    .. tab:: From conda-forge
 
-See all of the :ref:`sklearn_algorithms`.
+        .. code-block::
 
+            conda install -c conda-forge scikit-learn-intelex --override-channels
 
-Optimizations
--------------
+See the full :doc:`installation instructions <installation>` for more details.
 
-Enable CPU Optimizations
-************************
+Example Usage
+=============
 
 .. tabs::
    .. tab:: By patching
@@ -66,6 +154,8 @@ Enable CPU Optimizations
                        [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
          clustering = DBSCAN(eps=3, min_samples=2).fit(X)
 
+      See :doc:`patching` for more details.
+
    .. tab:: Without patching
       .. code-block:: python
 
@@ -77,8 +167,8 @@ Enable CPU Optimizations
          clustering = DBSCAN(eps=3, min_samples=2).fit(X)
 
 
-Enable GPU optimizations
-************************
+Running on GPUs
+===============
 
 Note: executing on GPU has `additional system software requirements <https://www.intel.com/content/www/us/en/developer/articles/system-requirements/intel-oneapi-dpcpp-system-requirements.html>`__ - see :doc:`oneapi-gpu`.
 
@@ -150,77 +240,3 @@ Note: executing on GPU has `additional system software requirements <https://www
 
 
 See :ref:`oneapi_gpu` for other ways of executing on GPU.
-
-
-.. toctree::
-   :caption: Getting Started
-   :hidden:
-   :maxdepth: 3
-
-   quick-start.rst
-   samples.rst
-   kaggle.rst
-
-.. toctree::
-   :caption: Documentation topics
-   :hidden:
-   :maxdepth: 4
-
-   patching.rst
-   algorithms.rst
-   oneapi-gpu.rst
-   config-contexts.rst
-   array_api.rst
-   serialization.rst
-   distributed-mode.rst
-   distributed_daal4py.rst
-   non-scikit-algorithms.rst
-   non_sklearn_d4p.rst
-   model_builders.rst
-   logistic_model_builder.rst
-   input-types.rst
-   verbose.rst
-   parallelism.rst
-   preview.rst
-   deprecation.rst
-
-.. toctree::
-   :caption: daal4py
-   :hidden:
-
-   about_daal4py.rst
-   daal4py.rst
-
-.. toctree::
-   :caption: Development guides
-   :hidden:
-
-   building-from-source.rst
-   tests.rst
-   contribute.rst
-   contributor-reference.rst
-   ideas.rst
-
-.. toctree::
-   :caption: Performance
-   :hidden:
-   :maxdepth: 2
-
-   guide/acceleration.rst
-
-.. toctree::
-   :caption: Learn
-   :hidden:
-   :maxdepth: 2
-
-   Tutorials & Case Studies <tutorials.rst>
-   Medium Blogs <blogs.rst>
-
-.. toctree::
-   :caption: More
-   :hidden:
-   :maxdepth: 2
-
-   support.rst
-   code-of-conduct.rst
-   license.rst