Use faster algorithm for topological sort #20790

Merged
JukkaL merged 5 commits into master from optimize-topsort
Feb 12, 2026


@JukkaL
Collaborator

@JukkaL JukkaL commented Feb 12, 2026

In a large codebase, up to 9% of CPU was used in topsort when doing a small incremental run. This should make it significantly faster (I've verified that it's faster at least when using synthetic data).

Use Kahn's algorithm, since it's O(V + E) rather than O(depth * V) for the original algorithm.

Description of the algorithm: https://www.geeksforgeeks.org/dsa/topological-sorting-indegree-based-solution/
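
For readers who want the idea without following the link, here is a minimal sketch of Kahn's algorithm. It is not the code from this PR: the function name `kahn_topsort` and the input shape (a dict mapping each node to the set of nodes it depends on) are assumptions made for illustration. Each node is dequeued once and each edge is visited once, which is where the O(V + E) bound comes from.

```python
from collections import deque


def kahn_topsort(deps: dict[str, set[str]]) -> list[str]:
    """Return nodes so that every node comes after all of its dependencies.

    `deps` maps each node to the set of nodes it depends on (assumed shape).
    """
    # Collect every node, including ones that only appear as a dependency.
    nodes = set(deps)
    for targets in deps.values():
        nodes |= targets

    # indegree[n] counts the dependencies of n that are not yet emitted.
    indegree = {n: 0 for n in nodes}
    dependents: dict[str, list[str]] = {n: [] for n in nodes}
    for node, targets in deps.items():
        for dep in targets:
            if dep != node:  # ignore self-dependencies
                indegree[node] += 1
                dependents[dep].append(node)

    # Start with the nodes that have no dependencies at all.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order: list[str] = []
    while queue:
        node = queue.popleft()
        order.append(node)
        # Emitting `node` satisfies one dependency of each of its dependents.
        for user in dependents[node]:
            indegree[user] -= 1
            if indegree[user] == 0:
                queue.append(user)

    # Any node left with a positive in-degree is part of a cycle.
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order
```

The relative order of independent nodes is unspecified in this sketch, so callers should rely only on the "dependencies first" property.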

perf_compare.py showed a small improvement in self check performance, but the difference is below the noise floor. This will likely mostly help with larger codebases.

Keep the old topsort function around for now, so that we can test that the new and old functions behave identically in tests. I'll remove the old one afterwards.
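
One way such an equivalence check could look is a randomized differential test. The sketch below is hypothetical and is not the test added in this PR: `layers_by_peeling`, `layers_by_kahn`, `random_dag`, and `check_same_layers` are illustrative names, and it assumes both functions yield batches (sets) of nodes that can be processed together, with every node present as a key in the input dict.

```python
import random
from collections.abc import Callable, Iterator

Graph = dict[int, set[int]]
LayerFn = Callable[[Graph], Iterator[set[int]]]


def layers_by_peeling(deps: Graph) -> Iterator[set[int]]:
    """Reference: repeatedly strip nodes with no remaining deps, O(depth * V)."""
    remaining = {n: set(d) - {n} for n, d in deps.items()}
    while remaining:
        ready = {n for n, d in remaining.items() if not d}
        if not ready:
            raise ValueError("graph has a cycle")
        yield ready
        remaining = {n: d - ready for n, d in remaining.items() if n not in ready}


def layers_by_kahn(deps: Graph) -> Iterator[set[int]]:
    """Candidate: in-degree counting in the style of Kahn's algorithm, O(V + E)."""
    pending = {n: len(d - {n}) for n, d in deps.items()}
    dependents: dict[int, list[int]] = {n: [] for n in deps}
    for node, targets in deps.items():
        for dep in targets:
            if dep != node:  # ignore self-dependencies
                dependents[dep].append(node)
    ready = {n for n, count in pending.items() if count == 0}
    emitted = 0
    while ready:
        yield ready
        emitted += len(ready)
        next_ready: set[int] = set()
        for node in ready:
            for user in dependents[node]:
                pending[user] -= 1
                if pending[user] == 0:
                    next_ready.add(user)
        ready = next_ready
    if emitted != len(deps):
        raise ValueError("graph has a cycle")


def random_dag(rng: random.Random, size: int, edge_prob: float) -> Graph:
    """Build a DAG in which node i may only depend on lower-numbered nodes."""
    return {i: {j for j in range(i) if rng.random() < edge_prob} for i in range(size)}


def check_same_layers(old: LayerFn, new: LayerFn, trials: int = 200, seed: int = 0) -> None:
    """Assert that both functions produce identical layers on random DAGs."""
    rng = random.Random(seed)
    for _ in range(trials):
        graph = random_dag(rng, size=rng.randint(0, 30), edge_prob=rng.random())
        expected = list(old(graph))
        actual = list(new(graph))
        assert actual == expected, (graph, expected, actual)


check_same_layers(layers_by_peeling, layers_by_kahn)
```

Using a fixed seed keeps failures reproducible, and varying both graph size and edge density covers sparse, dense, and empty graphs in one run.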

I used a coding agent to assist with this, but I did the implementation in multiple small increments.

@JukkaL JukkaL requested a review from ilevkivskyi February 12, 2026 14:59
Member

@ilevkivskyi ilevkivskyi left a comment


Interesting. I guess this will also help with ordering large SCCs (like in case of torch) where this may be called multiple times.

 from mypy.errors import Errors
 from mypy.fscache import FileSystemCache
-from mypy.graph_utils import strongly_connected_components, topsort
+from mypy.graph_utils import strongly_connected_components, topsort, topsort2
Member

If this will be around for a while, we may choose a better name, otherwise this is fine.

Collaborator Author

I'm planning to rename the new one to topsort and drop the old one pretty soon, once I've measured the performance in our large codebase (probably within a week or so).

@github-actions
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@JukkaL JukkaL merged commit ba2374f into master Feb 12, 2026
23 checks passed
@JukkaL JukkaL deleted the optimize-topsort branch February 12, 2026 15:35
2 participants