    Repositories list

    • Python
      Other
      88000 Updated Apr 23, 2026
    • DeepPrune

      Public
      🌿 DeepPrune: Parallel Scaling without Inter-trace Redundancy
      Python
      MIT License
      02100 Updated Apr 20, 2026
    • OpenSAE

      Public
      Python
      MIT License
      14600 Updated Apr 12, 2026
    • 0000 Updated Apr 6, 2026
    • Python
      MIT License
      2600 Updated Apr 3, 2026
    • VerIF

      Public
      [EMNLP 2025] Verification Engineering for RL in Instruction Following
      Python
      Apache License 2.0
      25335 Updated Mar 30, 2026
    • Code for paper "WildReward: Learning Reward Models from In-the-Wild Human Interactions"
      Python
      Apache License 2.0
      12200 Updated Feb 26, 2026
    • Data and code for the paper: Finding Safety Neurons in Large Language Models
      Jupyter Notebook
      MIT License
      02540 Updated Jan 29, 2026
    • Python
      MIT License
      1618210 Updated Dec 5, 2025
    • AgentIF

      Public
      [NIPS 2025 DB Spotlight] AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
      Python
      13131 Updated Dec 1, 2025
    • BGPO

      Public
      Boundary-Guided Policy Optimization for Memory-Efficient RL of Diffusion Large Language Models
      Python
      0800 Updated Oct 14, 2025
    • Linguistic-SAE

      Public archive
      Python
      2100 Updated Sep 11, 2025
    • LLMAEL

      Public
      [CIKM 2025] LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
      Python
      11710 Updated Sep 6, 2025
    • ReaRAG

      Public
      ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation
      Python
      22600 Updated Aug 24, 2025
    • 0410 Updated Jul 23, 2025
    • RM-Bench

      Public
      [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
      Python
      38130 Updated Jul 18, 2025
    • Python
      21610 Updated Jun 25, 2025
    • Python
      53410 Updated Jun 18, 2025
    • [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
      Python
      MIT License
      613000 Updated Jun 11, 2025
    • AtomR

      Public
      [KDD 2025] AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning
      Jupyter Notebook
      21600 Updated May 27, 2025
    • MMGeoLM

      Public
      Python
      0910 Updated May 27, 2025
    • Crab

      Public
      [CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
      Python
      01700 Updated May 23, 2025
    • Python
      01400 Updated Apr 14, 2025
    • [ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
      Python
      MIT License
      02200 Updated Mar 29, 2025
    • MRCEval

      Public
      MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark
      Python
      MIT License
      0400 Updated Mar 12, 2025
    • OmniEvent

      Public
      A comprehensive, unified and modular event extraction toolkit.
      Python
      MIT License
      38406104 Updated Dec 18, 2024
    • ADELIE

      Public
      [EMNLP2024] Aligning Large Language Models on Information Extraction
      Python
      25510 Updated Nov 4, 2024
    • KB-Plugin

      Public
      [EMNLP2024] KB-Plugin: A Plug-and-play Framework for Large Language Models to Induce Programs over Low-resourced Knowledge Bases
      Python
      1900 Updated Oct 16, 2024
    • The data and source code for the paper "MoocRadar: A Fine-grained and Multi-aspect Knowledge Repository for Improving Cognitive Student Modeling in MOOCs"
      Python
      46060 Updated Oct 7, 2024
    • DICE

      Public
      DICE: Detecting In-distribution Data Contamination with LLM's Internal State
      Python
      MIT License
      01100 Updated Sep 21, 2024