docs/dynamic.md
8 additions & 19 deletions
@@ -4,41 +4,30 @@ If you want to examine the evolution of topics over time, you will need a dynami
> Note that regular static models can also be used to study the evolution of topics and information dynamics, but they can't capture changes in the topics themselves.
-## Theory
+## Models
-A number of different conceptualizations can be used to study evolving topics in corpora, for instance:
-
-1. One can imagine topic representations to be governed by a Brownian Markov Process (random walk), in such a case the evolution is part of the model itself.
-In layman's terms you describe the evolution of topics directly in your generative model by expecting the topic representations to be sampled from Gaussian noise around the last time step.
-Sometimes researchers will also refer to such models as _state-space_ approaches.
-This is the approach that the original [DTM paper](https://mimno.infosci.cornell.edu/info6150/readings/dynamic_topic_models.pdf) utilizes.
-Along with [this paper](https://arxiv.org/pdf/1709.00025.pdf) on Dynamic NMF.
-2. You can fit one underlying statistical model over the entire corpus, and then do post-hoc term importance estimation per time slice.
-This is [what BERTopic does](https://maartengr.github.io/BERTopic/getting_started/topicsovertime/topicsovertime.html).
-3. You can fit one model per time slice, and then use some aggregation procedure to merge the models.
-This approach is used in the Dynamic NMF in [this paper](https://www.cambridge.org/core/journals/political-analysis/article/exploring-the-political-agenda-of-the-european-parliament-using-a-dynamic-topic-modeling-approach/BBC7751778E4542C7C6C69E6BF954E4B).
-
-Developing such approaches takes a lot of time and effort, and we have plans to add dynamic modeling capabilities to all models in Turftopic.
-For now only models of the second kind are on our list of things to do, and dynamic topic modeling has been implemented for GMM, and will soon be implemented for Clustering Topic Models.
-For more theoretical background, see the page on [GMM](GMM.md).
+In Turftopic you can currently use three different topic models for modeling topics over time:
+1. [ClusteringTopicModel](clustering.md), where an overall model is fitted on the whole corpus, and then term importances are estimated over time slices.
+2. [GMM](GMM.md), similarly to clustering models, term importances are reestimated per time slice
+3. [KeyNMF](KeyNMF.md), an overall decomposition is done, then using coordinate descent, topic-term-matrices are recalculated based on document-topic importances in the given time slice.
## Usage

Dynamic topic models in Turftopic have a unified interface.
To fit a dynamic topic model you will need a corpus that has been annotated with timestamps.
The timestamps need to be Python `datetime` objects, but pandas `Timestamp` objects are also supported.
-Models that have dynamic modeling capabilities (currently, `GMM` and `ClusteringTopicModel`) have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
+Models that have dynamic modeling capabilities (`KeyNMF`, `GMM` and `ClusteringTopicModel`) have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
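
To make the idea of fitting "over time" more concrete, here is a minimal, standard-library sketch of the scheme the changed docs describe for clustering models and GMM: topic assignments come from one model fitted on the whole corpus, and term statistics are then recomputed inside each time slice. This is not Turftopic's implementation. The helpers `bin_timestamps` and `term_counts_per_slice`, the equal-width binning, the raw term counts used as a stand-in for term importance, and the toy `topic_labels` are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def bin_timestamps(timestamps: list[datetime], n_bins: int) -> list[int]:
    """Assign each timestamp to one of n_bins equal-width time slices (hypothetical helper)."""
    start, end = min(timestamps), max(timestamps)
    width = (end - start) / n_bins  # a timedelta: the length of one slice
    slices = []
    for ts in timestamps:
        # timedelta / timedelta yields a float; guard against a zero-length range.
        index = int((ts - start) / width) if width else 0
        # The latest timestamp would fall just past the last slice, so clamp it.
        slices.append(min(index, n_bins - 1))
    return slices


def term_counts_per_slice(
    corpus: list[str], topic_labels: list[int], slices: list[int]
) -> dict[tuple[int, int], Counter]:
    """Count terms per (time slice, topic) pair while keeping topic assignments fixed."""
    counts: dict[tuple[int, int], Counter] = {}
    for doc, topic, time_slice in zip(corpus, topic_labels, slices):
        counts.setdefault((time_slice, topic), Counter()).update(doc.lower().split())
    return counts


corpus = ["cats purr loudly", "dogs bark loudly", "cats nap", "dogs fetch sticks"]
timestamps = [
    datetime(2020, 1, 1),
    datetime(2020, 6, 1),
    datetime(2021, 3, 1),
    datetime(2021, 12, 31),
]
# Assumed output of a topic model fitted once on the whole corpus.
topic_labels = [0, 1, 0, 1]

slices = bin_timestamps(timestamps, n_bins=2)
importance = term_counts_per_slice(corpus, topic_labels, slices)
```

In Turftopic itself this bookkeeping happens inside `fit_transform_dynamic()`, which is given the corpus together with the timestamps; the exact signature and binning options should be checked against the library's API reference rather than inferred from this sketch.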