Commit 308806f

Updated Readme
1 parent b22ef1d commit 308806f

1 file changed

Lines changed: 26 additions & 31 deletions

README.md

@@ -16,21 +16,42 @@
 - Streamlined scikit-learn compatible API 🛠️
 - Easy topic interpretation 🔍
 - Automated topic naming with LLMs
+- Topic modeling with keyphrases :key:
+- Lemmatization and Stemming
 - Visualization with [topicwizard](https://github.com/x-tabdeveloping/topicwizard) 🖌️
 
 > This package is still work in progress and scientific papers on some of the novel methods are currently undergoing peer-review. If you use this package and you encounter any problem, let us know by opening relevant issues.
 
-## New in version 0.11.0: Chinese Topic Modeling :cn:
+## New in version 0.11.0: Vectorizers Module
 
-You can now readily apply Turftopic models to Chinese topic modeling thanks to newly added utilities.
+You can now use a set of custom vectorizers for topic modeling over **phrases**, as well as **lemmata** and **stems**.
 
-```bash
-pip install turftopic[jieba]
+```python
+from turftopic import KeyNMF
+from turftopic.vectorizers.spacy import NounPhraseCountVectorizer
+
+model = KeyNMF(
+    n_components=10,
+    vectorizer=NounPhraseCountVectorizer("en_core_web_sm"),
+)
+model.fit(corpus)
+model.print_topics()
 ```
 
+| Topic ID | Highest Ranking |
+| - | - |
+| 0 | atheists, atheism, atheist, belief, beliefs, theists, faith, gods, christians, abortion |
+| 1 | alt atheism, usenet alt atheism resources, usenet alt atheism introduction, alt atheism faq, newsgroup alt atheism, atheism faq resource txt, alt atheism groups, atheism, atheism faq intro txt, atheist resources |
+| 2 | religion, christianity, faith, beliefs, religions, christian, belief, science, cult, justification |
+| 3 | fanaticism, theism, fanatism, all fanatism, theists, strong theism, strong atheism, fanatics, precisely some theists, all theism |
+| 4 | religion foundation darwin fish bumper stickers, darwin fish, atheism, 3d plastic fish, fish symbol, atheist books, atheist organizations, negative atheism, positive atheism, atheism index |
+| | ... |
+
+Turftopic now also comes with a Chinese vectorizer for easier use.
+
 ```python
 from turftopic import KeyNMF
-from turftopic.chinese import default_chinese_vectorizer
+from turftopic.vectorizers.chinese import default_chinese_vectorizer
 
 model = KeyNMF(10, vectorizer=default_chinese_vectorizer(), encoder="BAAI/bge-small-zh-v1.5")
 model.fit(corpus)
@@ -45,32 +66,6 @@ model.print_topics()
 | 3 | 股, 下跌, 上涨, 震荡, 板块, 大盘, 股指, 涨幅, 沪, 反弹 |
 | | ... |
 
-### New in version 0.10.0: Datamapplot cluster visualization
-
-You can interactively explore clusters using `datamapplot` directly in Turftopic!
-You will first have to install `datamapplot` for this to work.
-
-```python
-from turftopic import ClusteringTopicModel
-from turftopic.namers import OpenAITopicNamer
-
-model = ClusteringTopicModel(feature_importance="centroid")
-model.fit(corpus)
-
-namer = OpenAITopicNamer("gpt-4o-mini")
-model.rename_topics(namer)
-
-fig = model.plot_clusters_datamapplot()
-fig.save("clusters_visualization.html")
-fig
-```
-> If you are not running Turftopic from a Jupyter notebook, make sure to call `fig.show()`. This will open up a new browser tab with the interactive figure.
-
-<figure>
-<img src="docs/images/cluster_datamapplot.png" width="70%" style="margin-left: auto;margin-right: auto;">
-<figcaption>Interactive figure to explore cluster structure in a clustering topic model.</figcaption>
-</figure>
-
 
 ## Basics [(Documentation)](https://x-tabdeveloping.github.io/turftopic/)
 [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/x-tabdeveloping/turftopic/blob/main/examples/basic_example_20newsgroups.ipynb)
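
The README text added in this commit mentions topic modeling over **lemmata** and **stems**, but only shows the noun-phrase vectorizer. As a rough illustration of why normalizing tokens before counting matters, here is a minimal standard-library sketch. It is not Turftopic's implementation: `toy_stem` is a hypothetical, deliberately crude suffix stripper standing in for a real stemmer or lemmatizer, and `count_terms` is a hypothetical stand-in for a count vectorizer.

```python
import re
from collections import Counter

# Hypothetical suffix list for illustration only; real stemmers use
# far more careful, language-specific rules.
SUFFIXES = ("ists", "ism", "ist", "s")


def toy_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def count_terms(documents, normalize=None):
    """Count terms over all documents, optionally normalizing each token first."""
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z]+", doc.lower()):
            counts[normalize(token) if normalize else token] += 1
    return counts


docs = [
    "Atheists discuss atheism and beliefs.",
    "An atheist questions one belief.",
]

raw = count_terms(docs)                          # one entry per surface form
stemmed = count_terms(docs, normalize=toy_stem)  # inflected forms merged

print(sorted(raw))
print(sorted(stemmed))
```

On this toy input, "atheists", "atheism", and "atheist" end up under a single vocabulary entry once stemming is applied, so the vocabulary shrinks and related forms no longer compete as separate terms in a topic description.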
