
Commit 3a1d1a6

Updated tutorials
1 parent f71003c commit 3a1d1a6

6 files changed

Lines changed: 56 additions & 26 deletions


docs/tutorials/basic_usage.md

Lines changed: 4 additions & 4 deletions
@@ -6,7 +6,7 @@ We first load two datasets of graphs and then compare how similar these datasets
 ## Loading Datasets of Graphs
 
 Polygraph comes with the most commonly used datasets in graph generation.
-We provide documentation for all provided datasets in [here](../datasets/index.md).
+We provide documentation for all provided datasets [here](../datasets/index.md).
 Below, we load two synthetic datasets and inspect an element from one of them:
 
 ```python
@@ -17,7 +17,7 @@ sbm = SBMGraphDataset("val")
 print(planar[0]) # PyG object: Data(edge_index=[2, 354], num_nodes=64)
 ```
 
-This will download the dataseets to your device and cache them. You may specify the download location by setting the environment variable `POLYGRAPH_CACHE_DIR`.
+This will download the datasets to your device and cache them. You may specify the download location by setting the environment variable `POLYGRAPH_CACHE_DIR`.
 All datasets in `polygraph` contain PyTorch-geometric objects.
 However, we may also access the graphs as NetworkX objects as follows:
 
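As a side note on the `POLYGRAPH_CACHE_DIR` variable in the hunk above, here is a minimal sketch of setting it from Python. Only the variable name and the dataset class come from the diff; the path is illustrative, and we assume the variable is read when a dataset is first constructed:

```python
import os

# Illustrative cache path; set before any dataset is constructed, assuming
# polygraph reads POLYGRAPH_CACHE_DIR at download time.
os.environ["POLYGRAPH_CACHE_DIR"] = "/tmp/polygraph_cache"

from polygraph.datasets import PlanarGraphDataset

planar = PlanarGraphDataset("val")  # downloads into the cache directory above
```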
@@ -28,7 +28,7 @@ print(planar_nx[0]) # (Networkx) Graph with 64 nodes and 177 edges
 
 ## Comparing Distributions of Graphs
 When evaluating graph generative models, we want to compare a set of *generated* graphs to a set of *reference* graphs (typically the test set).
-In `polygraph`, we provide various different metrics to quantify how close these two sets of graphs are.
+In `polygraph`, we provide various different metrics to quantify how similar these two sets of graphs are.
 We usually pass collections of NetworkX graphs to metrics.
 Below, we demonstrate how a set of these metrics, combined in the [`MMD2CollectionGaussianTV`][polygraph.metrics.MMD2CollectionGaussianTV] benchmark may be computed:
 
@@ -44,7 +44,7 @@ print(benchmark.compute(generated)) # Dictionary of different metrics
 
 We discuss available metrics [in the next tutorial](metrics_overview.md).
 
-All metrics are evaluated in a similar fashion:
+All metrics are evaluated in a similar fashion, as defined by the common [interface](../api_reference/metrics/interface.md):
 
 - We first initialize a metric object via `benchmark = MMD2CollectionGaussianTV(reference)`. This fits the metric to the `reference` set, caching data that is required in later computations
 - We then compute the metric against the generated set via `benchmark.compute(generated)`
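Put together, the two-step pattern described in these bullets reads roughly as follows (a minimal sketch assembled from the calls already shown in this diff):

```python
from polygraph.datasets import PlanarGraphDataset, SBMGraphDataset
from polygraph.metrics import MMD2CollectionGaussianTV

reference = PlanarGraphDataset("val").to_nx()  # reference set
generated = SBMGraphDataset("val").to_nx()     # stand-in for model samples

# Step 1: initialization fits the metric to the reference set and caches data
benchmark = MMD2CollectionGaussianTV(reference)

# Step 2: compute the metric against the generated set
results = benchmark.compute(generated)  # dictionary of metric values
```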

docs/tutorials/custom_metrics.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# Defining Custom Metrics
+
+## Custom Graph Descriptors
+
+## Custom Graph Kernels
+
+## Building Benchmarks

docs/tutorials/evaluating_models.md

Whitespace-only changes.

docs/tutorials/metrics_overview.md

Lines changed: 43 additions & 21 deletions
@@ -9,16 +9,16 @@ For convenience, `polygraph` allows metrics that follow this interface to be bun
 
 ```python
 from polygraph.metrics import MetricCollection
-from polygraph.metrics.gran import RBFOrbitMMD2, ClassifierOrbitMetric
+from polygraph.metrics import MMD2CollectionRBF, MMD2CollectionGaussianTV
 from polygraph.datasets import PlanarGraphDataset, SBMGraphDataset
 
 reference_graphs = PlanarGraphDataset("val").to_nx()
 generated_graphs = SBMGraphDataset("val").to_nx()
 
 metrics = MetricCollection(
     metrics={
-        "rbf_orbit": RBFOrbitMMD2(reference_graphs=reference_graphs),
-        "classifier_orbit": ClassifierOrbitMetric(reference_graphs=reference_graphs),
+        "rbf_mmd": MMD2CollectionRBF(reference_graphs),
+        "tv_mmd": MMD2CollectionGaussianTV(reference_graphs),
     }
 )
 print(metrics.compute(generated_graphs)) # Dictionary of metrics
@@ -28,19 +28,31 @@ We now proceed to give a high-level overview over the different types of metrics
 
 ## Maximum Mean Discrepancy
 
-[Maximum Mean Discrepancy (MMD)](../api_reference/metrics/mmd.md) is the most commonly used approach for comparing graph distributions.
+[Maximum Mean Discrepancy (MMD)](../api_reference/metrics/mmd.md) is the predominant method for comparing graph distributions.
 The two distributions are embedded in a reproducing kernel Hilbert space (RKHS) and their distance is then computed in this space.
 
-To construct an MMD metric, one must choose two components:
+In `polygraph`, we bundle the most commonly used MMD metrics in two benchmark classes: [`MMD2CollectionGaussianTV`][polygraph.metrics.MMD2CollectionGaussianTV] and [`MMD2CollectionRBF`][polygraph.metrics.MMD2CollectionRBF]. These benchmarks may be evaluated in the following fashion:
+
+```python
+from polygraph.datasets import PlanarGraphDataset, SBMGraphDataset
+from polygraph.metrics import MMD2CollectionGaussianTV, MMD2IntervalCollectionGaussianTV
+
+reference = PlanarGraphDataset("val").to_nx()
+generated = SBMGraphDataset("val").to_nx()
+
+# Evaluate the benchmark with point estimates
+benchmark = MMD2CollectionGaussianTV(reference)
+print(benchmark.compute(generated)) # {'orbit': 1.067700488335175, 'clustering': 0.32549637224264394, 'degree': 0.3375409762261701, 'spectral': 0.0830197437100697}
+```
+
+For more details on these collections we refer to the documentation on the [Gaussian TV metrics](../metrics/gaussian_tv_mmd.md) and [RBF metrics](../metrics/rbf_mmd.md).
+
+Polygraph also allows you to construct custom MMD metrics. To construct an MMD metric, one must choose two components:
 
 - [Descriptor](../api_reference/utils/graph_descriptors.md) - A function that transforms graphs into vectorial descriptions
-- [Kernel](../api_reference/utils/graph_kernels.md) - A kernel function operating on these vectors produced by the descriptor
+- [Kernel](../api_reference/utils/graph_kernels.md) - A kernel function operating on the vectors produced by the descriptor
 
 We implement a large number of different descriptors and kernels in `polygraph`.
-For convenience, we provide commonly used combinations of kernels, descriptors, and estimators, based on [classical descriptors](../metrics/gran.md) or [gnn features](../metrics/gin.md).
-We recommend using these standardized implementations to ensure fair and comparable evaluations.
-
-However, you may also construct custom MMD metrics, combining kernels and descriptors as you like.
 An MMD metric operating on orbit counts with a linear kernel may thus be constructed in the following fashion:
 
 ```python
@@ -58,20 +70,33 @@ metric = DescriptorMMD2(
 
 The MMD may be computed via a biased estimator (`"biased"`) or via an unbiased one (`"umve"`).
 In the large sample size limit, the two should converge to the same value. However, at low sample sizes the differences may be substantial.
-In practice the biased estimator is oftentimes used.
+In practice the biased estimator is oftentimes used. We refer to the documentation of the [base MMD classes](../api_reference/metrics/mmd.md).
 
 
 !!! warning
     MMD metrics that are computed with different estimators, metrics, or kernels lie on different scales and are not comparable to each other.
 
 ## PolyGraphScore
 
-The [PolyGraphScore metric](../api_reference/metrics/polygraphscore.md) operates in a similar fashion as MMD metrics. However, it aims to make metrics comparable across graph descriptors and produces interpretable values between 0 and 1.
-The PolyGraphScore is typically computed for several graph descriptors and produces a summary metric for these descriptors.
+The [PolyGraphScore metric](../api_reference/metrics/polygraphscore.md) compares two graph distributions by determining how well they can be distinguished by a binary classifier.
+It aims to make metrics comparable across graph descriptors and produces interpretable values between 0 and 1.
+The PolyGraphScore is computed for several graph descriptors and produces a summary metric for these descriptors.
 This summary metric is an estimated lower bound on a probability metric that is intrinsic to the graph distributions and independent of the descriptors
 
+We provide [`PGS5`][polygraph.metrics.PGS5], a standardized version of the PolyGraphScore that combines 5 different graph descriptors:
+
 ```python
 from polygraph.datasets import PlanarGraphDataset, SBMGraphDataset
+from polygraph.metrics import PGS5
+
+metric = PGS5(reference_graphs=PlanarGraphDataset("test").to_nx())
+metric.compute(SBMGraphDataset("test").to_nx()) # {'polygraphscore': 0.999301797449604, 'polygraphscore_descriptor': 'degree', 'subscores': {'orbit': 0.9986018004713674, 'clustering': 0.9933180272388359, 'degree': 0.999301797449604, 'spectral': 0.9690467491487502, 'gin': 0.9984711185804029}}
+```
+
+As with MMD metrics, you may also construct custom PolyGraphScore variants using other graph descriptors, evaluation metrics, or binary classification approaches.
+E.g., you may construct the following metric
+
+```python
 from polygraph.utils.graph_descriptors import OrbitCounts, SparseDegreeHistogram
 from polygraph.metrics.base import PolyGraphScore
 
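For background on the biased/unbiased estimator discussion in the hunk above, the squared population MMD for distributions $P$, $Q$ and kernel $k$ is the standard quantity (general definition, not part of this commit):

$$
\mathrm{MMD}^2(P, Q) = \mathbb{E}_{x, x' \sim P}\,k(x, x') + \mathbb{E}_{y, y' \sim Q}\,k(y, y') - 2\,\mathbb{E}_{x \sim P,\, y \sim Q}\,k(x, y).
$$

The biased estimator replaces each expectation with an average over all sample pairs, including the diagonal terms $k(x_i, x_i)$, while the unbiased estimator omits the diagonal. Since the diagonal contributes on the order of $1/n$, the two agree in the large-sample limit but can differ substantially at small $n$, which is exactly the behavior the hunk describes.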
@@ -81,12 +106,13 @@ metric = PolyGraphScore(
         "orbit": OrbitCounts(),
         "degree": SparseDegreeHistogram(),
     },
-    classifier="tabpfn",
-    variant="jsd"
+    classifier="logistic",
+    variant="informedness",
 )
-metric.compute(SBMGraphDataset("test").to_nx())
+metric.compute(SBMGraphDataset("test").to_nx()) # {'polygraphscore': 0.9, 'polygraphscore_descriptor': 'orbit', 'subscores': {'orbit': 0.9, 'degree': 0.9}}
 ```
 
+We refer to the [API reference](../api_reference/metrics/polygraphscore.md) for further details.
 
 ## Validity, Uniqueness, Novelty
 
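A note on the `variant="informedness"` argument added in this hunk: informedness is a standard binary-classification statistic, also known as Youden's J. The identity below is general background rather than something stated in the commit:

$$
\text{informedness} = \mathrm{TPR} + \mathrm{TNR} - 1.
$$

For a balanced two-sample problem, the informedness of the Bayes-optimal classifier equals the total variation distance between the two distributions, which is consistent with the tutorial's description of the score as an estimated lower bound on an intrinsic probability metric.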
@@ -100,7 +126,6 @@ To determine novelty, this metric must be passed the training set on which the g
 
 ```python
 from polygraph.metrics import VUN
-from polygraph.datasets import PlanarGraphDataset, SBMGraphDataset
 
 train = PlanarGraphDataset("train").to_nx()
 generated = SBMGraphDataset("val").to_nx()
@@ -109,11 +134,8 @@ metric = VUN(
     train_graphs=train, # Pass the training set to determine novelty
     validity_fn=PlanarGraphDataset.is_valid
 )
-print(metric.compute(generated)) # Dictionary containing fraction of unique/novel/valid graphs (all combinations)
+print(metric.compute(generated)) # {'unique': 1.0, 'novel': 1.0, 'unique_novel': 1.0, 'valid': 0.0, 'valid_unique_novel': 0.0, 'valid_novel': 0.0, 'valid_unique': 0.0}
 ```
 
 All synthetic datasets in the `polygraph` package provide a static `is_valid` function.
 If no validity function is available for your dataset, `validity_fn` may be set to `None`. In this case, only the fraction of unique and novel graphs is computed.
-
-
-## Uncertaingy Quantification
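Since the tutorial notes that `validity_fn` can be any validity function (or `None`), here is a sketch of plugging in a custom check. The connectivity criterion and the assumption that any callable mapping a NetworkX graph to a bool is accepted are illustrative, not confirmed by the diff:

```python
import networkx as nx

from polygraph.datasets import PlanarGraphDataset
from polygraph.metrics import VUN

# Hypothetical validity criterion for illustration: accept connected graphs.
def is_connected(g: nx.Graph) -> bool:
    return g.number_of_nodes() > 0 and nx.is_connected(g)

train = PlanarGraphDataset("train").to_nx()
metric = VUN(train_graphs=train, validity_fn=is_connected)
# With validity_fn=None, only the unique/novel fractions would be reported.
```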
docs/tutorials/uncertainty_quantification.md

Lines changed: 1 addition & 0 deletions

@@ -0,0 +1 @@
+# Uncertainty Quantification

mkdocs.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ nav:
   - Tutorials:
      - Basic Usage: tutorials/basic_usage.md
      - Metrics Overview: tutorials/metrics_overview.md
-     - Evaluating Models: tutorials/evaluating_models.md
+     - Uncertainty Quantification: tutorials/uncertainty_quantification.md
      - Defining Custom Metrics: tutorials/custom_metrics.md
   - Base API Reference:
      - polygraph.metrics:
