Commit 61293a2

Merge branch 'main' of https://github.com/x-tabdeveloping/turftopic into main
2 parents 3290413 + a37354d commit 61293a2

78 files changed

Lines changed: 254 additions & 24579 deletions

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# creates the documentation on pushes it to the gh-pages branch
+name: Documentation
+
+on:
+  pull_request:
+    branches: [main]
+  push:
+    branches: [main]
+
+
+permissions:
+  contents: write
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+
+      - name: Dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install turftopic[pyro-ppl,docs]
+
+      - name: Build and Deploy
+        if: github.event_name == 'push'
+        run: mkdocs gh-deploy --force
+
+      - name: Build
+        if: github.event_name == 'pull_request'
+        run: mkdocs build

docs/clustering.md

Lines changed: 5 additions & 0 deletions
@@ -188,6 +188,11 @@ top2vec = ClusteringTopicModel(
 Theoretically the model descriptions above should result in the same behaviour as the other two packages, but there might be minor changes in implementation.
 We do not intend to keep up with changes in Top2Vec's and BERTopic's internal implementation details indefinitely.
 
+### _(Optional)_ 5. Dynamic Modeling
+
+Clustering models are also capable of dynamic topic modeling. This happens by fitting a clustering model over the entire corpus, as we expect that there is only one semantic model generating the documents.
+To gain temporal representations for topics, the corpus is divided into equal, or arbitrarily chosen time slices, and then term importances are estimated using Soft-c-TF-IDF, c-TF-IDF, or distances from cluster centroid for each of the time slices separately. When distance from cluster centroids is used to estimate topic importances in dynamic modeling, cluster centroids are computed based on documents and terms present within a given time slice.
+
 ## Considerations
 
 ### Strengths
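
The procedure described in the added "Dynamic Modeling" section (cluster once over the whole corpus, then estimate term importances per time slice) can be sketched roughly as below. This is an illustrative sketch only, not Turftopic's implementation: the helper names `assign_time_slices`, `c_tf_idf` and `dynamic_term_importance`, the whitespace tokenizer, and the toy data are all hypothetical, the cluster labels are assumed to come from a model already fitted on the full corpus, and the weighting uses the commonly cited c-TF-IDF form `tf * log(1 + A / f)` as an assumption.

```python
import math
from collections import Counter, defaultdict
from datetime import datetime


def assign_time_slices(timestamps, n_slices):
    """Map each timestamp to the index of an equal-width time slice."""
    start, end = min(timestamps), max(timestamps)
    width = (end - start) / n_slices
    slices = []
    for ts in timestamps:
        # A zero-width range (all timestamps equal) collapses into slice 0.
        index = int((ts - start) / width) if width else 0
        # The latest timestamp sits on the upper edge; clamp it into the last slice.
        slices.append(min(index, n_slices - 1))
    return slices


def c_tf_idf(docs, labels):
    """Simplified c-TF-IDF: tf(term, cluster) * log(1 + A / f(term))."""
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        # Hypothetical tokenizer: lowercase and split on whitespace.
        term_counts[label].update(doc.lower().split())
    total = Counter()
    for counts in term_counts.values():
        total.update(counts)
    # A: average number of tokens per cluster; f(term): count across all clusters.
    avg_words = sum(total.values()) / len(term_counts)
    return {
        label: {
            term: tf * math.log(1 + avg_words / total[term])
            for term, tf in counts.items()
        }
        for label, counts in term_counts.items()
    }


def dynamic_term_importance(docs, labels, timestamps, n_slices):
    """Estimate per-cluster term importances separately within each time slice."""
    slices = assign_time_slices(timestamps, n_slices)
    result = {}
    for current in sorted(set(slices)):
        idx = [i for i, s in enumerate(slices) if s == current]
        result[current] = c_tf_idf(
            [docs[i] for i in idx], [labels[i] for i in idx]
        )
    return result


# Toy corpus; the labels stand in for clusters found on the entire corpus.
docs = ["cats purr softly", "dogs bark loudly", "cats chase mice", "dogs fetch sticks"]
labels = [0, 1, 0, 1]
timestamps = [
    datetime(2020, 1, 1),
    datetime(2020, 6, 1),
    datetime(2021, 6, 1),
    datetime(2021, 12, 31),
]

slices = assign_time_slices(timestamps, n_slices=2)
importance = dynamic_term_importance(docs, labels, timestamps, n_slices=2)
```

Here `importance[slice_index][cluster_label]` is a term-to-score mapping, so the same cluster can carry different top terms in different slices while cluster membership itself stays fixed.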

docs/dynamic.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ Dynamic topic models in Turftopic have a unified interface.
 To fit a dynamic topic model you will need a corpus, that has been annotated with timestamps.
 The timestamps need to be Python `datetime` objects, but pandas `Timestamp` object are also supported.
 
-Models that have dynamic modeling capabilities have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
+Models that have dynamic modeling capabilities (currently, `GMM` and `ClusteringTopicModel`) have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
 
 ```python
 from datetime import datetime

pyproject.toml

Lines changed: 4 additions & 8 deletions
@@ -18,17 +18,13 @@ torch = "^2.1.0"
 scipy = "^1.10.0"
 rich = "^13.6.0"
 pyro-ppl = { version = "^1.8.0", optional = true }
+mkdocs = { version = "^1.5.2", optional = true }
+mkdocs-material = { version = "^9.5.12", optional = true }
+mkdocstrings = { version = "^0.24.0", extras = ["python"], optional = true }
 
 [tool.poetry.extras]
 pyro-ppl = ["pyro-ppl"]
-
-[tool.poetry.group.docs]
-optional = true
-
-[tool.poetry.group.docs.dependencies]
-mkdocs = "^1.5.2"
-mkdocs-material = "^9.5.12"
-mkdocstrings = { version = "^0.24.0", extras = ["python"] }
+docs = ["mkdocs", "mkdocs-material", "mkdocstrings"]
 
 [build-system]
 requires = ["poetry-core"]
