Commit 27f36fb

Use TabPFN v2.5 weights as default classifier
Switch default_classifier() to use TabPFNClassifier.create_default_for_version with ModelVersion.V2_5 instead of the plain constructor (which uses v2.0). Update README to reflect the new default and document the license differences between v2.0 (Apache 2.0 + attribution, commercial OK) and v2.5 (non-commercial). Add logistic regression as a lightweight alternative.
1 parent a4b248f commit 27f36fb

2 files changed

Lines changed: 19 additions & 8 deletions

README.md

Lines changed: 12 additions & 6 deletions
@@ -142,24 +142,30 @@ print(pgd.compute(generated)) # {'pgd': ..., 'pgd_descriptor': ..., 'subscores':
 
 `pgd_descriptor` provides the best descriptor used to report the final score.
 
-By default, PGD uses TabPFN v2 weights. To use TabPFN v2.5 weights instead, pass a custom classifier. The v2.5 weights are hosted on a gated Hugging Face repository ([Prior-Labs/tabpfn_2_5](https://huggingface.co/Prior-Labs/tabpfn_2_5)) and require authentication:
+By default, PGD uses TabPFN v2.5 weights. The v2.5 weights are hosted on a gated Hugging Face repository ([Prior-Labs/tabpfn_2_5](https://huggingface.co/Prior-Labs/tabpfn_2_5)) and require authentication:
 
 ```bash
 pip install huggingface_hub
 huggingface-cli login
 ```
 
-Then:
+Alternatively, you can use TabPFN v2.0 weights, which are licensed under the Prior Labs License (Apache 2.0 with an additional attribution clause) and permit commercial use. The v2.5 weights, in contrast, use a non-commercial license that prohibits commercial and production use without a separate enterprise license from Prior Labs:
 
 ```python
 from tabpfn import TabPFNClassifier
 from polygraph.metrics import StandardPGD
 
-classifier = TabPFNClassifier.create_default_for_version(
-    "v2.5", device="auto", n_estimators=4
-)
+classifier = TabPFNClassifier(device="auto", n_estimators=4)
 pgd = StandardPGD(reference, classifier=classifier)
-print(pgd.compute(generated))
+```
+
+A logistic regression classifier can also be used as a lightweight alternative, although it yields a looser bound in practice:
+
+```python
+from sklearn.linear_model import LogisticRegression
+from polygraph.metrics import StandardPGD
+
+pgd = StandardPGD(reference, classifier=LogisticRegression())
 ```
 
 #### Validity, uniqueness and novelty

polygraph/metrics/base/polygraphdiscrepancy.py

Lines changed: 7 additions & 2 deletions
@@ -62,6 +62,7 @@
 from sklearn.preprocessing import StandardScaler
 from packaging.version import Version
 from tabpfn import TabPFNClassifier
+from tabpfn.classifier import ModelVersion
 
 from polygraph import GraphType
 from polygraph.metrics.base.interface import GenerationMetric
@@ -107,7 +108,7 @@ def default_classifier() -> TabPFNClassifier:
     """Create the default TabPFN classifier used by PGD.
 
     Returns:
-        A TabPFNClassifier with default settings (auto device, 4
+        A TabPFNClassifier with v2.5 weights (auto device, 4
         estimators). Requires ``tabpfn >= 2.0.9``.
     """
     tabpfn_ver = Version(version("tabpfn"))
@@ -116,7 +117,11 @@ def default_classifier() -> TabPFNClassifier:
             "TabPFN >= 2.0.9 is required. "
             "Install with `pip install 'tabpfn>=2.0.9'`."
         )
-    return TabPFNClassifier(device="auto", n_estimators=4)
+    return TabPFNClassifier.create_default_for_version(
+        ModelVersion.V2_5,
+        device="auto",
+        n_estimators=4,
+    )
 
 
 class PolyGraphDiscrepancyResult(TypedDict):
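
For readers choosing between the three classifiers this commit documents, the selection could be sketched roughly as follows. This is a minimal illustration, not code from the repository: `choose_backend`, `build_classifier`, and the backend names are hypothetical helpers introduced here. The TabPFN calls mirror those shown in the diff, and the claim that the plain constructor loads v2.0 weights is taken from the commit message and may depend on the installed `tabpfn` version.

```python
from typing import Any

# Hypothetical backend names, introduced only for this sketch.
BACKENDS = ("logistic_regression", "tabpfn_v2", "tabpfn_v2.5")


def choose_backend(commercial_use: bool, lightweight: bool = False) -> str:
    """Name a classifier backend for a use case (hypothetical helper)."""
    if lightweight:
        return "logistic_regression"
    # Per the README text in this commit, v2.5 weights are non-commercial,
    # while v2.0 weights permit commercial use.
    return "tabpfn_v2" if commercial_use else "tabpfn_v2.5"


def build_classifier(backend: str) -> Any:
    """Instantiate the classifier for a backend name.

    Imports are deferred so that only the chosen backend must be installed.
    """
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")

    if backend == "logistic_regression":
        from sklearn.linear_model import LogisticRegression

        return LogisticRegression()

    from tabpfn import TabPFNClassifier

    if backend == "tabpfn_v2":
        # Plain constructor; the commit message states this uses v2.0 weights.
        return TabPFNClassifier(device="auto", n_estimators=4)

    # Same call as the new default_classifier() in this commit.
    from tabpfn.classifier import ModelVersion

    return TabPFNClassifier.create_default_for_version(
        ModelVersion.V2_5, device="auto", n_estimators=4
    )
```

The result can then be passed as `StandardPGD(reference, classifier=build_classifier(...))`, as in the README snippets above. Selecting `"tabpfn_v2.5"` still requires the Hugging Face login described in the README.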
