Commit 44847ac
Added results and method on perplexity testing
1 parent de1d511 commit 44847ac

2 files changed, 25 additions & 4 deletions

File tree

papers/topeax/main.pdf (245 KB, binary file not shown)

papers/topeax/main.typ (25 additions & 4 deletions)
@@ -1,5 +1,6 @@
 #show title: set text(size: 18pt)
 #show title: set align(left)
+#show figure.caption: set align(left)
 
 #set text(
   size: 12pt,
@@ -214,12 +215,32 @@ Ideally a model should both have high intrinsic and extrinsic coherence, and thu
 estimate of topic quality: $accent(C, -) = sqrt(C_("in") dot C_("ex"))$.
 In addition an aggregate metric of topic quality can be calculated by taking the geometric mean of coherence and diversity $I = sqrt(accent(C, -) dot d)$.
 
-== Robustness checks
+== Sensitivity to Perplexity
 
-+ Hyperparameters (perplexity)
-+ Corpus Subsampling
+Both TSNE and UMAP have a hyperparameter that determines how many neighbours of a given point are considered when generating lower-dimensional projections; this hyperparameter is usually referred to as _perplexity_.
+Both methods are known to be sensitive to the choice of this hyperparameter, and depending on it, structures that do not exist in the higher-dimensional feature space might appear in the projection (cite Distill article and "Understanding UMAP").
+To see how this affects the Topeax algorithm, and how robust it is to this hyperparameter in comparison with other clustering topic models, I fitted each model to the 20 Newsgroups corpus using `all-MiniLM-L6-v2` with `perplexities=[2, 5, 30, 50, 100]`.
+This choice of values was inspired by (cite Distill). Each model was evaluated on the metrics outlined above.
+
+== Subsampling Invariance
 
 = Results
 
-== Cluster Recovery
 
+== Perplexity
+
+Metrics of quality and the number of topics across perplexity values are displayed in @perplexity_robustness.
+Topeax converges very early in the number of topics, remaining stable from `perplexity=5` onwards, while its quality metrics converge at around `perplexity=30`. In light of this, `perplexity=50` is a sound recommendation and default value.
+Meanwhile, BERTopic converges at around `perplexity=50` and has the lowest performance on all metrics. Top2Vec does not seem to converge at all for the tested values and is the most unstable, although it does improve with larger values of the hyperparameter.
+Keep in mind that while BERTopic and Top2Vec improve with higher values, their default is set at `perplexity=15`, which, according to these evaluations, seems rather unreasonable.
+
+
+#figure(
+  image("figures/perplexity_robustness.png", width: 100%),
+  caption: [Clustering models' performance at different perplexity values.\
+    _Left_: Fowlkes-Mallows Index at different perplexity values,
+    _Middle_: Topic Interpretability Score at different perplexity values,
+    _Right_: Number of topics at each perplexity value against the gold label.
+  ],
+) <perplexity_robustness>
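The two geometric-mean metrics referenced in the diff above can be written out directly. This is a minimal sketch; the function names are mine, not the paper's:

```python
import math

def aggregate_coherence(c_in: float, c_ex: float) -> float:
    # C-bar: geometric mean of intrinsic and extrinsic coherence.
    return math.sqrt(c_in * c_ex)

def interpretability(c_in: float, c_ex: float, diversity: float) -> float:
    # I = sqrt(C-bar * d): geometric mean of aggregate coherence and diversity.
    return math.sqrt(aggregate_coherence(c_in, c_ex) * diversity)

# Example with made-up metric values:
c_bar = aggregate_coherence(0.64, 0.49)  # sqrt(0.3136) = 0.56
score = interpretability(0.64, 0.49, 0.81)
```

The geometric mean (rather than arithmetic) ensures a model cannot compensate for near-zero coherence with high diversity, or vice versa.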

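The perplexity sweep described in the added method section could be orchestrated roughly as below; `fit_and_evaluate` is a hypothetical stand-in for fitting one of the three models (Topeax, BERTopic, Top2Vec) on precomputed `all-MiniLM-L6-v2` embeddings and scoring it, not the paper's actual harness:

```python
PERPLEXITIES = [2, 5, 30, 50, 100]  # values from the method section

def sweep(fit_and_evaluate, perplexities=PERPLEXITIES):
    # Fit and score one model at every perplexity value,
    # returning a {perplexity: metrics-dict} mapping.
    return {p: fit_and_evaluate(perplexity=p) for p in perplexities}

# Usage with a dummy evaluator standing in for a real model fit:
results = sweep(lambda perplexity: {"n_topics": 20, "fmi": 0.5})
```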