Commit 496fe73

Implemented requested changes in docs
1 parent a604e55 commit 496fe73

2 files changed

Lines changed: 17 additions & 10 deletions

docs/basics.md

Lines changed: 6 additions & 3 deletions
@@ -284,8 +284,12 @@ model.print_topics()
 
 #### Datamapplot *(clustering models only)*
 
-You can interactively explore clusters using `datamapplot` directly in Turftopic!
-You will first have to install `datamapplot` for this to work.
+You can interactively explore clusters using [datamapplot](https://github.com/TutteInstitute/datamapplot) directly in Turftopic!
+You will first have to install `datamapplot` for this to work:
+
+```bash
+pip install turftopic[datamapplot]
+```
 
 ```python
 from turftopic import ClusteringTopicModel
@@ -309,7 +313,6 @@ fig
 </figure>
 
 
-
 #### topicwizard
 
 Turftopic integrates with [topicwizard](https://github.com/x-tabdeveloping/topicwizard), a package for interactive topic visualization.

docs/clustering.md

Lines changed: 11 additions & 7 deletions
@@ -23,19 +23,19 @@ model = ClusteringTopicModel(dimensionality_reduction=TSNE())
 
 It is common practice to reduce the dimensionality of the embeddings before clustering them.
 This is to avoid the curse of dimensionality, an issue, which many clustering models are affected by.
-Dimensionality reduction by default is done with **TSNE** in Turftopic,
+Dimensionality reduction by default is done with [**TSNE**](https://scikit-learn.org/stable/modules/manifold.html#t-distributed-stochastic-neighbor-embedding-t-sne) in Turftopic,
 but users are free to specify the model that will be used for dimensionality reduction.
 
 !!! tip "Use openTSNE for better performance!"
-By default, a scikit-learn implementation is used, but if you have the openTSNE package installed on your system, Turftopic will automatically use it.
+By default, a scikit-learn implementation is used, but if you have the [openTSNE](https://github.com/pavlin-policar/openTSNE) package installed on your system, Turftopic will automatically use it.
 You can potentially speed up your clustering topic models by multiple orders of magnitude.
 ```bash
-pip install opentsne
+pip install turftopic[opentsne]
 ```
 
 ??? note "What reduction model should I choose?"
 Our knowledge about the impacts of choice of dimensionality reduction is limited, and has not yet been explored in the literature.
-Top2Vec and BERTopic both use UMAP, which has a number of desirable properties over alternatives (arranging data points into cluster-like structures, better preservation of global structure than TSNE, speed).
+Top2Vec and BERTopic both use [UMAP](https://umap-learn.readthedocs.io/en/latest/basic_usage.html), which has a number of desirable properties over alternatives (arranging data points into cluster-like structures, better preservation of global structure than TSNE, speed).
 
 ### Clustering
 
@@ -47,7 +47,7 @@ model = ClusteringTopicModel(clustering=HDBSCAN())
 ```
 
 After reducing the dimensionality of the embeddings, they are clustered with a clustering model.
-Turftopic uses **HDBSCAN** as its default.
+Turftopic uses [**HDBSCAN**](https://scikit-learn.org/stable/modules/clustering.html#hdbscan) as its default.
 
 ??? note "What clustering model should I choose?"
 Some clustering models are capable of discovering the number of clusters in the data (HDBSCAN, DBSCAN, OPTICS, etc.).
@@ -183,8 +183,12 @@ model.reset_topics()
 
 ### Visualization
 
-You can interactively explore clusters using `datamapplot` directly in Turftopic!
-You will first have to install `datamapplot` for this to work.
+You can interactively explore clusters using [datamapplot](https://github.com/TutteInstitute/datamapplot) directly in Turftopic!
+You will first have to install `datamapplot` for this to work:
+
+```bash
+pip install turftopic[datamapplot]
+```
 
 ```python
 from turftopic import ClusteringTopicModel
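
Note on the openTSNE tip in docs/clustering.md: the docs say Turftopic picks up openTSNE automatically when it is installed, but the diff does not show how that detection works. The snippet below is only a minimal sketch of how such an optional-dependency check could look, using the standard library alone. The function name `pick_tsne_backend` and its return values are hypothetical and are not part of Turftopic's API.

```python
import importlib.util


def pick_tsne_backend(preferred: str = "openTSNE", fallback: str = "sklearn") -> str:
    """Return the name of the first importable module, or "none".

    Hypothetical helper for illustration; Turftopic's own logic may differ.
    """
    for name in (preferred, fallback):
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is not None:
            return name
    return "none"


if __name__ == "__main__":
    # Prints whichever backend is importable in the current environment.
    print(pick_tsne_backend())
```

Running this before and after `pip install turftopic[opentsne]` is one way to check whether the extra made openTSNE importable in the current environment.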
