Commit c041b90

Update README.md
1 parent 7117f23 commit c041b90

1 file changed

Lines changed: 8 additions & 12 deletions

README.md

@@ -6,7 +6,7 @@
 
 ## Features
 - Novel transformer-based topic models:
-- Semantic Signal Separation - S³ (paper in progress ⏳)
+- Semantic Signal Separation - S³ 🧭
 - KeyNMF 🔑
 - GMM
 - Implementations of existing transformer-based topic models
@@ -159,14 +159,10 @@ topicwizard.visualize(corpus, model=model)
 
 Alternatively you can use the [Figures API](https://x-tabdeveloping.github.io/topicwizard/figures.html) in topicwizard for individual HTML figures.
 
-## Models
-
-| Model | Description | Usage |
-| - | - | - |
-| KeyNMF | Non-negative Matrix Factorization enhanced with keyword extraction using sentence embeddings | `model = KeyNMF(n_components=10).fit(corpus)` |
-| GMM | Gaussian Mixture Model over contextual embeddings + post-hoc term importance estimation | `model = GMM(n_components=10).fit(corpus)` |
-|| Separates semantic signals, aka. axes of semantics in a corpus using independent component analysis. | `model = SemanticSignalSeparation(n_components=10).fit(corpus)` |
-| Autoencoding Models | Learn topics using amortized variational inference enhanced by contextual representations. | `model = AutoEncodingTopicModel(n_components=10, combined=False).fit(corpus)` |
-| Clustering Models | Clusters semantic embeddings, and estimates term importances for clusters. | `model = ClusteringTopicModel(feature_importance="ctfidf").fit(corpus)` |
-
-For extensive comparison see our [Model Overview](https://x-tabdeveloping.github.io/turftopic/model_overview/).
+## References
+- Kardos, M., Kostkan, J., Vermillet, A., Nielbo, K., Enevoldsen, K., & Rocca, R. (2024, June 13). $S^3$ - Semantic Signal separation. arXiv.org. https://arxiv.org/abs/2406.09556
+- Grootendorst, M. (2022, March 11). BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv.org. https://arxiv.org/abs/2203.05794
+- Angelov, D. (2020, August 19). Top2VEC: Distributed representations of topics. arXiv.org. https://arxiv.org/abs/2008.09470
+- Bianchi, F., Terragni, S., & Hovy, D. (2020, April 8). Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence. arXiv.org. https://arxiv.org/abs/2004.03974
+- Bianchi, F., Terragni, S., Hovy, D., Nozza, D., & Fersini, E. (2021). Cross-lingual Contextualized Topic Models with Zero-shot Learning. In Proceedings of the 16th Conference of the European
+- Chapter of the Association for Computational Linguistics: Main Volume (pp. 1676–1683). Association for Computational Linguistics.
