Commit 12402f9

Added doc page for seeded modeling
1 parent c66ec54 commit 12402f9

2 files changed

Lines changed: 60 additions & 0 deletions

docs/seeded.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Seeded Topic Modeling

When investigating a set of documents, you might already have an idea of which aspects you would like to explore.
Some models can account for this by taking seed phrases or words.
In Turftopic, this is currently only possible with KeyNMF, but support will likely be extended to other models in the future.

In [KeyNMF](../keynmf.md), you can use a free-text seed phrase to describe the aspect from which you want to investigate your corpus.
The model then extracts only topics that are relevant to your research question.

In this example we investigate the 20Newsgroups corpus from three different aspects:

```python
from sklearn.datasets import fetch_20newsgroups

from turftopic import KeyNMF

# Load the full corpus, stripping metadata that would otherwise leak into the topics
corpus = fetch_20newsgroups(
    subset="all",
    remove=("headers", "footers", "quotes"),
).data

# Five topics, restricted to the aspect described by the seed phrase
model = KeyNMF(5, seed_phrase="<your seed phrase>")
model.fit(corpus)

model.print_topics()
```


=== "`'Is the death penalty moral?'`"

    | Topic ID | Highest Ranking |
    | - | - |
    | 0 | morality, moral, immoral, morals, objective, morally, animals, society, species, behavior |
    | 1 | armenian, armenians, genocide, armenia, turkish, turks, soviet, massacre, azerbaijan, kurdish |
    | 2 | murder, punishment, death, innocent, penalty, kill, crime, moral, criminals, executed |
    | 3 | gun, guns, firearms, crime, handgun, firearm, weapons, handguns, law, criminals |
    | 4 | jews, israeli, israel, god, jewish, christians, sin, christian, palestinians, christianity |

=== "`'Evidence for the existence of god'`"

    | Topic ID | Highest Ranking |
    | - | - |
    | 0 | atheist, atheists, religion, religious, theists, beliefs, christianity, christian, religions, agnostic |
    | 1 | bible, christians, christian, christianity, church, scripture, religion, jesus, faith, biblical |
    | 2 | god, existence, exist, exists, universe, creation, argument, creator, believe, life |
    | 3 | believe, faith, belief, evidence, blindly, believing, gods, believed, beliefs, convince |
    | 4 | atheism, atheists, agnosticism, belief, arguments, believe, existence, alt, believing, argument |

=== "`'Operating system kernels'`"

    | Topic ID | Highest Ranking |
    | - | - |
    | 0 | windows, dos, os, microsoft, ms, apps, pc, nt, file, shareware |
    | 1 | ram, motherboard, card, monitor, memory, cpu, vga, mhz, bios, intel |
    | 2 | unix, os, linux, intel, systems, programming, applications, compiler, software, platform |
    | 3 | disk, scsi, disks, drive, floppy, drives, dos, controller, cd, boot |
    | 4 | software, mac, hardware, ibm, graphics, apple, computer, pc, modem, program |

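Each of the three result tables above corresponds to a single seed phrase, so the comparison amounts to fitting one model per phrase. The following is a minimal sketch of how that could be scripted, not the exact script behind the tables. `fit_seeded_models`, `make_model` and `compare_aspects_on_20newsgroups` are hypothetical helper names, not part of Turftopic. Only `KeyNMF(5, seed_phrase=...)`, `fit` and `print_topics` are taken from the example above.

```python
from typing import Callable, Dict, Iterable, List


def fit_seeded_models(
    corpus: List[str],
    seed_phrases: Iterable[str],
    make_model: Callable[[str], object],
) -> Dict[str, object]:
    """Fit one fresh model per seed phrase and return them keyed by phrase.

    Hypothetical helper: `make_model` must build an unfitted model
    (with a `fit` method) for the given seed phrase.
    """
    models = {}
    for phrase in seed_phrases:
        model = make_model(phrase)  # a new model for every aspect
        model.fit(corpus)
        models[phrase] = model
    return models


def compare_aspects_on_20newsgroups() -> None:
    """Run the three-aspect comparison (downloads data, so it is not called on import)."""
    # Heavy imports are kept local so the helper above stays dependency-free.
    from sklearn.datasets import fetch_20newsgroups
    from turftopic import KeyNMF

    corpus = fetch_20newsgroups(
        subset="all",
        remove=("headers", "footers", "quotes"),
    ).data
    seed_phrases = [
        "Is the death penalty moral?",
        "Evidence for the existence of god",
        "Operating system kernels",
    ]
    models = fit_seeded_models(
        corpus,
        seed_phrases,
        lambda phrase: KeyNMF(5, seed_phrase=phrase),
    )
    for phrase, model in models.items():
        print(f"Seed phrase: {phrase!r}")
        model.print_topics()
```

Calling `compare_aspects_on_20newsgroups()` fits three separate models, which can take a while on the full corpus. The extracted topics may also differ from the tables above depending on the encoder and library version.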

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ nav:
 - Interpreting and Visualizing Models: model_interpretation.md
 - Modifying and Finetuning Models: finetuning.md
 - Saving and Loading Models: persistence.md
+- Seeded Topic Modeling: seeded.md
 - Dynamic Topic Modeling: dynamic.md
 - Online Topic Modeling: online.md
 - Hierarchical Topic Modeling: hierarchical.md
