A curated list of resources for Vibe Researching -- AI-driven research automation across the scientific discovery pipeline, from literature review to publication.
Vibe Researching extends the concept of "vibe coding" (conversational, AI-assisted programming) to the entire research workflow. This collection covers tools, papers, and frameworks spanning six research pipeline stages (Survey -> Ideation -> Experiment -> Analysis -> Writing -> Promotion) and six autonomy levels (L0 Manual -> L5 Fully Autonomous).
Legend: NEW = published/released 2025-2026 | OSS = open-source | FREE = free to use
- What is Vibe Researching?
- Navigation Guide
- Survey & Literature Review
- Ideation & Hypothesis Generation
- Experiment & Implementation
- Analysis & Interpretation
- Writing & Publishing
- Promotion & Dissemination
- Cross-Cutting Tools
- Related Resources
- Contributing
- License
Vibe Researching is the application of AI agents with skills to the full research pipeline, enabling researchers to:
- Automate literature reviews -- 80-90% time savings with tools like Elicit, Rayyan
- Generate code from academic papers -- DeepCode beats human experts on ICML 2024 paper reproduction (75.9% vs 72.4%)
- Run autonomous experiments -- A-Lab: 10x data collection speedup, 71% success rate
- Produce complete research papers -- AI Scientist-v2: first peer-reviewed AI-generated paper, 2025
Key Paper: Vibe Researching as Wolf Coming (arXiv:2602.22401, Feb 2026) -- Introduces the vibe researching paradigm with a cognitive task framework
Community: VibeX 2026 Workshop -- 1st International Workshop on Vibe Coding and Vibe Researching @ EASE 2026
Manual work is too slow. Fully automated AI is too generic. Vibe Researching is the new frontier.
This list is organized by the research pipeline (Survey -> Ideation -> Experiment -> Analysis -> Writing -> Promotion). Each stage has tools and papers at different autonomy levels:
| Level | Name | Human Role | AI Role | Example |
|---|---|---|---|---|
| L1 | Assist | Drives all decisions | Suggests | Code completion, grammar checking |
| L2 | Partial | Defines task, validates | Executes specific subtasks | Automated screening, figure generation |
| L3 | Conditional | Sets goals, approves at checkpoints | Handles full workflows | End-to-end literature review |
| L4 | High | Monitors for failures | Handles extended workflows | Self-driving labs |
| L5 | Full | None (reviewer/consumer) | Operates independently | AI Scientist: idea -> paper |
| Level | Survey | Ideation | Experiment | Analysis | Writing | Promotion |
|---|---|---|---|---|---|---|
| L1 (Assist) | 8 | 2 | 12 | 3 | 6 | 1 |
| L2 (Partial) | 6 | 3 | 10 | 7 | 2 | 0 |
| L3 (Conditional) | 2 | 2 | 8 | 4 | 1 | 0 |
| L4 (High) | 1 | 1 | 12 | 3 | 0 | 0 |
| L5 (Full) | 0 | 1 | 6 | 1 | 1 | 0 |
| Total | 17 | 9 | 48 | 18 | 10 | 1 |
| Metric | Value | Source/Year |
|---|---|---|
| Total resources | 107 (45 papers, 62 tools) | This survey, Mar 2026 |
| Developer AI adoption | 92% of US developers use AI coding tools daily | 2025 industry data |
| Code generation share | 41% of all code globally AI-generated (256B lines) | Google, 2024 |
| Lab automation market | $7.84B -> $14.78B (CAGR 6.55%) | 2024-2034 forecast |
| Multimodal AI market | $391B -> ~$2T (CAGR 35.9%) | 2025-2030 forecast |
| Key milestone | AI Scientist-v2: first peer-reviewed AI paper | ICLR 2025 |
Tools and papers for literature search, systematic review, knowledge extraction, and gap analysis. (17 items)
Papers:
- Vibe Coding in Practice: Motivations, Challenges, and a Future Outlook NEW (arXiv:2510.00328, Sep 2025) -- Grey-literature review analyzing 101 practitioner sources with 518 firsthand behavioral accounts. Reveals a speed-quality trade-off paradox: 62% are motivated by speed, but the resulting code is often "fast but flawed". Tags: empirical study, grey literature, developer experience.
Tools:
- Google Scholar FREE -- Classic academic search engine with citation tracking, h-index metrics, and researcher profiles. Free access to academic literature across disciplines. De facto standard for citation counts. L1 Assist.
- Semantic Scholar FREE -- AI-powered academic search from the Allen Institute for AI. Features citation contexts, paper recommendations, TLDR summaries, and influential-citation analysis. Covers 200M+ papers. L1 Assist.
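Semantic Scholar also exposes a free Academic Graph API, which makes literature search easy to script. A minimal sketch follows; the endpoint and parameter names reflect the public Graph API, but treat them as assumptions and check the current documentation (and rate limits) before relying on this.

```python
# Minimal sketch of scripting a literature search against the Semantic
# Scholar Academic Graph API. Endpoint/parameter names follow the public
# Graph API docs; verify them against current documentation.
import json
import urllib.parse
import urllib.request

BASE = "https://api.semanticscholar.org/graph/v1/paper/search"

def build_search_url(query, fields=("title", "year", "abstract"), limit=5):
    """Compose a paper-search URL requesting specific result fields."""
    params = urllib.parse.urlencode(
        {"query": query, "fields": ",".join(fields), "limit": limit}
    )
    return f"{BASE}?{params}"

def search_papers(query, limit=5):
    """Fetch matching papers (requires network access)."""
    with urllib.request.urlopen(build_search_url(query, limit=limit)) as resp:
        return json.load(resp).get("data", [])

print(build_search_url("vibe coding", limit=3))
```

Separating URL construction from the fetch keeps the sketch testable offline; swap `search_papers` in once you are online and within the API's rate limits.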
Tools:
- Elicit -- AI-powered systematic review platform. 125M+ papers, 545K clinical trials. 80% time savings for systematic reviews, high precision for preliminary searches. Automated data extraction and evidence synthesis. L2 Partial. Commercial.
- Rayyan -- AI study screening for systematic reviews. 90% reduction in screening time, ML-based study suggestion, automatic deduplication. Widely used in biomedical systematic reviews. L2 Partial. Freemium.
- Consensus -- AI academic search engine. 200M+ peer-reviewed papers, evidence-based answers with citation support, automated data synthesis. Focuses on extracting research findings as direct answers. L2 Partial. Freemium.
- Scite -- Smart citation analysis evaluating how publications are cited: supporting, contrasting, or mentioning. Reveals citation context beyond simple counts. L2 Partial. Commercial.
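Automatic deduplication of the kind Rayyan performs can be sketched in a few lines: normalize titles, then bucket records whose normalized titles collide. The normalization and grouping below are purely illustrative, not Rayyan's actual algorithm.

```python
# Toy citation deduplication in the spirit of Rayyan's automatic dedup:
# normalize titles (case, punctuation, whitespace), then bucket records
# whose normalized titles collide. Illustrative only.
import re
from collections import defaultdict

def normalize(title: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    title = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(title.split())

def find_duplicates(records):
    """Return groups of records that appear to be the same paper."""
    buckets = defaultdict(list)
    for record in records:
        buckets[normalize(record["title"])].append(record)
    return [group for group in buckets.values() if len(group) > 1]

refs = [
    {"id": 1, "title": "Vibe Coding: A Survey"},
    {"id": 2, "title": "VIBE CODING -- a survey"},
    {"id": 3, "title": "Self-Driving Labs"},
]
print(find_duplicates(refs))  # ids 1 and 2 collapse into one group
```

Production screeners add fuzzy matching on authors, year, and DOI on top of this exact-key bucketing.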
Tools:
- Paperguide NEW -- Fully automated systematic review: research question -> search -> screening -> report generation. Improved by Deep Research integration (June 2025). Competitive features at a lower price than SciSpace. L3 Conditional. Commercial.
- SciSpace -- 10M+ researchers (2026). Upload a PDF, chat with the paper, semantic search, automated summarization, reference management, AI copilot. Comprehensive research-assistant platform. L3 Conditional. Freemium.
| Finding | Detail |
|---|---|
| Time savings | 80-90% demonstrated by Elicit and Rayyan in systematic review workflows |
| Adoption pattern | Commercial-first: tools emerged 2021-2024, academic evaluation papers followed 2024-2026 |
| Gap | No Level 4-5 tools for autonomous literature synthesis exist yet |
Tools and papers for research question formulation, hypothesis generation, and novelty checking. (9 items -- under-represented stage)
Tools:
- Brainstorming Assistants -- GPT-powered conversational agents for research question ideation and problem formulation. General-purpose LLMs (ChatGPT, Claude) serve as interactive brainstorming partners. L1 Assist.
- Hypothesis Generators -- Template-based tools for structured research question development using PICO/FINER frameworks. L1 Assist.
- Gap Analysis Tools -- AI-powered literature analysis for identifying research gaps and understudied areas by analyzing citation patterns and topic coverage. L2 Partial.
Papers:
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (arXiv:2408.06292, 2024, Sakana AI) -- Original autonomous research system producing ML conference-quality LaTeX papers, achieves 'Weak Accept' ratings. Uses Semantic Scholar for automated citation retrieval. Foundational paper for AI-driven scientific discovery.
- The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search NEW (arXiv:2504.08066, 2025, Sakana AI) -- First AI-generated paper accepted through peer review (ICLR 2025 workshop). Removes human templates, uses progressive agentic tree search, generalizes across ML domains. Milestone: exceeded the human acceptance threshold.
Tools:
- Sakana AI - AI Scientist OSS -- (GitHub | 5k+ stars) End-to-end autonomous research: idea generation -> code implementation -> experiment execution -> LaTeX paper. Open-source reference implementation. L5 Full.
Papers:
- Evaluating Sakana's AI Scientist for Autonomous Research NEW (arXiv:2502.14297, 2025, ACM SIGIR Forum) -- Critical evaluation revealing a 42% experiment failure rate due to coding errors and poor novelty assessments (misclassifying established concepts as novel). Important counterpoint to optimistic claims. Evaluation paper.
| Finding | Detail |
|---|---|
| Critical gap | Ideation stage under-represented (9 items total, mostly Level 1-2) |
| Need | Standalone autonomous hypothesis generation tools (Level 4-5), novelty checking systems |
| AI Scientist limitation | 42% failure rate requires human review of all outputs |
Tools and papers for protocol design, code implementation, laboratory execution, and data collection. Largest category (48 items).
Papers:
- A Survey of Vibe Coding with Large Language Models NEW (arXiv:2510.12399, Dec 2025) -- Synthesizes vibe coding practices into 5 development models: Unconstrained Automation, Iterative Conversational Collaboration, Planning-Driven, Test-Driven, Context-Enhanced. Survey paper. Tags: taxonomy, development models.
- Vibe coding: programming through conversation with artificial intelligence NEW (arXiv:2506.23253, Oct 2025) -- First empirical study analyzing 8+ hours of curated video. Found vibe coding redistributes expertise toward context management and rapid code evaluation. Empirical study. Tags: think-aloud, expertise.
- Academic Vibe Coding: Opportunities for Accelerating Research in an Era of Resource Constraint NEW (arXiv:2508.00952, Aug 2025) -- Structured, prompt-driven code generation embedded in reproducible workflows to compress the idea-to-analysis timeline and reduce staffing pressure. Position paper. Tags: academic research, reproducibility.
- Vibe Coding for UX Design NEW (arXiv:2509.10652, Sep 2025) -- UX professionals find benefits (productivity, cognitive offloading, learning, creativity) but note unsuitability for production-level tasks with complex back-end systems. User study. Tags: UX, front-end.
- Vibe Coding: Toward an AI-Native Paradigm for Semantic and Intent-Driven Programming NEW (arXiv:2510.17842, Oct 2025) -- Proposes an AI-native programming paradigm focused on semantic, intent-driven development, moving beyond syntax-level assistance. Theory paper. Tags: paradigm, semantic programming.
Tools:
| Tool | Type | Key Metric | Pricing |
|---|---|---|---|
| GitHub Copilot | Commercial | 20M users, 90% of Fortune 100 | $10-39/mo |
| Cursor | Commercial | 4.9/5 ratings, $60M Series A | Free-$20/mo |
| Claude Code | Commercial | 46% "most loved" (2026) | Subscription |
| Tabnine | Commercial | Air-gapped support, privacy-first | Freemium |
| Continue OSS | Open Source | GitHub 15k+ stars | Free |
| Aider OSS | Open Source | GitHub 20k+ stars, ~75% success | Free |
Tool Details:
- GitHub Copilot -- Pioneering inline autocomplete (launched 2021). 20M users (mid-2025), powers 90% of Fortune 100. Supports Claude 3 Sonnet, Gemini 2.5 Pro. Agent Mode introduced 2025. L1 Assist. Commercial: $10/mo individual, $19/mo Business, $39/mo Enterprise.
- Cursor -- Developed by Anysphere (2023), $60M Series A (Aug 2024). Averages 4.9/5 user ratings across 2025-2026 roundups. Top-ranked vibe-coding tool in independent benchmarks. L1 Assist. Commercial: Free tier, $20/mo Pro.
- Claude Code NEW -- 46% "most loved" rating (early 2026) vs Cursor 19% and GitHub Copilot 9%. Terminal-native agentic coding from Anthropic. L1 Assist. Included in Claude subscription.
- Tabnine -- Privacy-focused pioneer. Local/private-cloud/VPC deployment with air-gapped support. Enterprise Context Engine learns org architecture. Best for regulated industries. L1 Assist. Freemium + Enterprise.
- Continue OSS -- (GitHub | 15k+ stars) Open-source, developer-first coding assistant. Flexible model selection, extensible architecture. L1 Assist. Free.
- Aider OSS -- (GitHub | 20k+ stars) Repository-level agent handling 50k+ LOC codebases with ~75% success rate. Multi-file refactors, debugging loops, scoped task execution. L2 Partial. Free.
Papers:
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning (arXiv:2504.17192, 2025) -- Multi-agent LLM framework: planning -> analysis -> generation. 77% user preference, 83% practical utility on PaperBench, performance on par with author-released repos. Benchmark paper.
- DeepCode: Open Agentic Coding NEW (arXiv:2512.07921, 2025) -- 73.5% success vs PaperCoder's 51.1% (+22.4 points) on PaperBench; beats human experts on ICML 2024 paper reproduction (75.9% vs 72.4%). Agentic framework for Paper2Code, Text2Web, and Text2Backend. SOTA on PaperBench.
Tools:
- PaperCoder / Paper2Code OSS -- (2k+ stars) Multi-agent framework turning ML papers into operational code. 77% user preference, performance on par with author-released repos (no statistical difference vs Oracle). L2 Partial.
- DeepCode OSS NEW -- (1k+ stars) Open agentic coding for Paper2Code, Text2Web, and Text2Backend. 73.5% success on PaperBench; beats human experts on ICML 2024 reproduction (75.9% vs 72.4%). L2 Partial.
- Devin -- Built by competitive-programming champions (Cognition), backed by Founders Fund. $4B valuation. Autonomous agent with integrated dev environment, terminal, tests, browser. L3 Conditional. Commercial.
Papers:
- Automatic Prompt Engineer (APE) (Zhou et al., 2022) -- (GitHub) Instruction generation as natural language synthesis, black-box optimization. Discovered better zero-shot CoT prompt than 'Let's think step by step' (+3% improvement). Foundational paper for prompt optimization.
- Is It Time To Treat Prompts As Code? NEW (arXiv:2507.03620, 2025) -- Multi-use-case study of DSPy prompt optimization. On a prompt-evaluation-criterion task, accuracy improved from 46.2% to 64.0% (+17.8 points). Empirical study.
Tools:
| Tool | Stars | Key Feature | Type |
|---|---|---|---|
| LangChain | 90k+ | Most complete production stack | OSS |
| LangGraph | 30k+ | Graph-based stateful agents | OSS |
| AutoGen | 30k+ | Conversational multi-agent | OSS |
| CrewAI | 25k+ | Role-based, 40% Fortune 500 | OSS |
| DSPy | 15k+ | Programmatic prompt optimization | OSS |
Tool Details:
- LangChain OSS -- (GitHub | 90k+ stars) Most complete stack for production agent workflows. Evolved from a chaining library into a full orchestration platform. Best-in-class observability via LangSmith. L3 Conditional.
- LangGraph OSS -- (GitHub | 30k+ stars) Modern production interface for LangChain. Graph abstraction for stateful multi-agent apps with branching, state machines, error handling, and checkpointing. L3 Conditional.
- LangSmith -- Best-in-class observability for agents. Traces every LLM call, tool invocation, and chain step (latency, tokens, errors). Standard production runtime for debugging and monitoring. L3 Conditional. Commercial.
- CrewAI OSS -- (GitHub | 25k+ stars) Role-based multi-agent workflows. 40% of Fortune 500 companies use CrewAI agents. Fastest prototyping (minutes). Structured memory with RAG. L3 Conditional.
- AutoGen OSS -- (GitHub | 30k+ stars) Microsoft's conversational multi-agent model. Agents communicate via natural language. Async task execution, human-in-the-loop oversight. L3 Conditional.
- DSPy OSS -- (GitHub | 15k+ stars) Stanford framework (2023). Abstracts prompts into modular Python (Signatures, Modules) and programmatically creates and refines them. Up to +17.8 points accuracy improvement. L3 Conditional.
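The "prompts as modular programs" idea behind DSPy can be illustrated without the framework: a signature declares what a prompt consumes and produces, and an optimizer searches instruction candidates against a metric. All class and function names below are illustrative stand-ins, not the real DSPy API.

```python
# Framework-free sketch of the "prompts as programs" pattern DSPy
# popularized. Signature/optimize are invented names, NOT DSPy's API.
from dataclasses import dataclass

@dataclass
class Signature:
    """Typed description of a prompt: input fields, output field, instruction."""
    inputs: list
    output: str
    instruction: str = "Answer the question."

    def render(self, **values) -> str:
        """Turn the signature plus concrete inputs into prompt text."""
        lines = [self.instruction]
        lines += [f"{name}: {values[name]}" for name in self.inputs]
        lines.append(f"{self.output}:")
        return "\n".join(lines)

def optimize(sig, candidates, metric, devset):
    """Keep the instruction that scores best on the dev set -- a toy
    stand-in for DSPy's programmatic prompt optimizers."""
    best, best_score = sig, float("-inf")
    for instruction in [sig.instruction, *candidates]:
        trial = Signature(sig.inputs, sig.output, instruction)
        score = sum(metric(trial, ex) for ex in devset) / len(devset)
        if score > best_score:
            best, best_score = trial, score
    return best

qa = Signature(inputs=["question"], output="answer")
tuned = optimize(
    qa,
    candidates=["Let's think step by step.", "Answer in one word."],
    # Toy metric: reward instructions that elicit step-by-step reasoning.
    metric=lambda sig, ex: float("step" in sig.instruction.lower()),
    devset=[{"question": "What is 2 + 2?"}],
)
print(tuned.render(question="What is 2 + 2?"))
```

The point is the separation of concerns: the task declaration is stable while the instruction text becomes a tunable parameter, which is what enables gains like the +17.8-point result cited above.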
Papers:
- Self-Driving Laboratories for Chemistry and Materials Science (Chemical Reviews, 2024) -- Comprehensive review: SDLs automate experimental workflows with autonomous planning, accelerating chemistry and materials discovery. Demonstrates 10x data collection speedup. Review paper. Venue: Chemical Reviews (IF 54.3).
- Autonomous 'self-driving' laboratories: a review of technology NEW (Royal Society Open Science, 2025) -- The most capable SDLs automate the entire scientific method: hypothesis generation, experimental design, execution, data analysis, conclusion drawing, and hypothesis updating. Review paper.
- A Survey of AI Scientists NEW (arXiv:2510.23045, 2025) -- Comprehensive survey of autonomous research agents and hierarchical multi-agent architectures with meta-orchestrators spawning domain specialists. Survey paper.
Tools:
| Tool | Autonomy | Domain | Key Result |
|---|---|---|---|
| A-Lab | L5 | Materials | 71% success, 10x speedup |
| ChemCrow | L4 | Chemistry | 18 expert tools |
| Coscientist | L4 | Chemistry | <4 min protocol design |
| FutureHouse | L4-5 | Biology | 4 specialized agents |
| Edison Scientific | L5 | Multi | $70M seed, 79.4% accuracy |
| Agent Laboratory | L4 | ML | 84% cost reduction |
Tool Details:
- A-Lab (Berkeley Lab) -- Fully autonomous solid-state synthesis. 41/58 DFT-predicted materials synthesized in 17 days, 71% success, minimal human intervention. 10x data collection speedup. Published in Nature. L5 Full.
- ChemCrow -- LLM for chemical tasks with 18 expert-designed tools. Finds molecules, plans synthetic routes, executes synthesis on cloud robotic platforms. L4 High.
- Coscientist -- LLM-driven system for autonomous chemical experiment design, planning, robotic control. Successfully optimized Nobel Prize-winning palladium-catalyzed cross-couplings without human intervention (<4 min protocol design). Published in Nature. L4 High.
- AlabOS NEW -- Autonomous Laboratory Operating System. Reconfigurable workflow management for autonomous materials labs; orchestrates instruments, robots, and AI planners. L4 High.
- FutureHouse -- Nonprofit AI-for-science lab (SF, launched Sep 2023) building an AI Scientist with world models. Specialized agents: Crow, Falcon (literature search), Phoenix, Owl (experimental design). L4-5.
- Edison Scientific NEW -- FutureHouse spinout. $70M seed at a $250M valuation (co-led by Spark Capital and Triatomic Capital). Kosmos platform reads 1,500 papers and runs 42k lines of analysis per run. Beta users report 6 months of work compressed into 1 day, with 79.4% conclusion accuracy. L5 Full. Commercial.
- Agent Laboratory NEW -- o1-preview-driven pipeline with an 84% reduction in research expenses. Supports literature review -> experiment -> report writing as an end-to-end pipeline. L4 High.
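At their core, the systems above run a closed loop: propose an experiment, execute it, learn from the result, repeat. The sketch below is a toy, stdlib-only version of that loop; the simulated instrument and the greedy-with-exploration planner are invented for illustration (real SDL planners use Bayesian optimization, domain models, and robotic hardware).

```python
# Toy closed-loop "self-driving lab": propose a condition, run the
# (simulated) experiment, record the result, repeat. Everything here
# is an illustrative stand-in for real SDL components.
import random

def run_experiment(x: float) -> float:
    """Stand-in instrument: yield peaks at x = 0.7, unknown to the planner."""
    return 1.0 - (x - 0.7) ** 2

def propose(history, rng):
    """Mostly refine near the best-known point; sometimes explore."""
    if not history or rng.random() < 0.3:
        return rng.uniform(0.0, 1.0)
    best_x, _ = max(history, key=lambda h: h[1])
    return min(1.0, max(0.0, best_x + rng.gauss(0.0, 0.05)))

def autonomous_loop(budget=40, seed=0):
    """Run the propose -> execute -> learn loop; return the best result."""
    rng = random.Random(seed)
    history = []
    for _ in range(budget):
        x = propose(history, rng)
        history.append((x, run_experiment(x)))
    return max(history, key=lambda h: h[1])

best_x, best_y = autonomous_loop()
print(f"best condition x={best_x:.2f}, simulated yield={best_y:.3f}")
```

The "budget" parameter is where SDLs win: because the loop runs unattended, the 10x data-collection speedups reported for A-Lab translate directly into more iterations of this cycle.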
Tools:
- CodeRabbit -- AI code review assistant. Reports AI-generated code has 1.7x more issues, 2.74x more security vulnerabilities. Achieves 4x faster PR merge times. Essential for quality assurance in vibe coding workflows. L2 Partial. Commercial.
| Finding | Detail |
|---|---|
| Developer adoption | 92% of US developers use AI coding tools daily (2025) |
| Code generation | 41% of all code globally AI-generated, 256B lines in 2024 |
| Quality concerns | 1.7x more issues, 2.74x more security vulnerabilities (CodeRabbit 2025) |
| Productivity paradox | Developers felt 20% faster but took 19% longer after debugging (Stack Overflow 2025) |
| Materials discovery | Years -> weeks compression with self-driving laboratories |
| Market | Lab automation $7.84B (2024) -> $14.78B (2034), CAGR 6.55% |
Tools and papers for statistical analysis, visualization, results interpretation, and reproducibility. (18 items)
Papers:
- SciRep: Framework for Reproducibility NEW (arXiv:2503.07080, 2025) -- Configuration, execution, and packaging of computational experiments. 89% success vs 61% for Docker/Singularity baselines. Addresses the reproducibility crisis in ML research. Benchmark paper.
- ReproSchema NEW (JMIR, 2025) -- Schema-centric survey design for reproducible data collection: reusable assessments, validation/conversion tools. Published in a peer-reviewed medical-informatics journal. Venue: JMIR (IF 7.4).
Tools:
| Tool | Stars | Focus | Type |
|---|---|---|---|
| Docker | 100k+ | Standard containerization | OSS |
| Nextflow | 25k+ | Bioinformatics pipelines | OSS |
| Snakemake | 20k+ | Declarative workflows | OSS |
| MLflow | 18k+ | MLOps platform | OSS |
| DVC | 13k+ | Data version control | OSS |
| Apptainer | 10k+ | HPC containerization | OSS |
| Weights & Biases | 8k+ | ML experiment tracking | Freemium |
Tool Details:
- SciRep NEW -- 89% success rate vs 61% for baseline tools. Automated configuration, execution, and packaging of computational experiments into reproducible artifacts. L2 Partial.
- Docker OSS -- (GitHub | 100k+ stars) Industry-standard container platform for reproducible scientific computing environments. Foundation for most reproducibility workflows. L2 Partial. Free.
- Singularity / Apptainer OSS -- (GitHub | 10k+ stars) HPC-friendly containerization for scientific workflows. Runs without root access on shared computing clusters. L2 Partial. Free.
- Snakemake OSS -- (GitHub | 20k+ stars) Declarative workflow management for bioinformatics and data science. Python-based rule definitions, automatic parallelization. L2 Partial. Free.
- Nextflow OSS -- (GitHub | 25k+ stars) Scalable workflow orchestration for bioinformatics pipelines. Supports Docker, Singularity, and cloud execution. Active nf-core community. L2 Partial. Free.
- DVC (Data Version Control) OSS -- (GitHub | 13k+ stars) Git-like version control for datasets and ML models. Tracks data pipelines, enables experiment reproducibility. L2 Partial. Free.
- MLflow OSS -- (GitHub | 18k+ stars) Open-source MLOps platform: experiment tracking, project packaging, model registry, deployment. L2 Partial. Free.
- Weights & Biases -- (GitHub | 8k+ stars) ML experiment tracking, dataset versioning, model registry, hyperparameter sweeps. Industry-leading visualization. L2 Partial. Freemium.
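All of these tools implement variants of one pattern: bind every result to the exact code, data, and parameters that produced it. A stdlib-only sketch of that run-manifest idea follows; the function names are invented for illustration, and MLflow and DVC provide hardened versions of the same mechanism.

```python
# Stdlib sketch of the run-manifest pattern that MLflow/DVC automate:
# fingerprint the input data, record parameters and metrics, and write
# a JSON artifact tying the result to what produced it.
import hashlib
import json
import platform
import tempfile
from pathlib import Path

def fingerprint(data: bytes) -> str:
    """Content hash standing in for dataset/code versioning."""
    return hashlib.sha256(data).hexdigest()[:12]

def log_run(run_dir: Path, params: dict, metrics: dict, data: bytes) -> Path:
    """Write an auditable manifest for one experiment run."""
    manifest = {
        "params": params,
        "metrics": metrics,
        "data_hash": fingerprint(data),
        "python": platform.python_version(),
    }
    out = run_dir / "manifest.json"
    out.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return out

run_dir = Path(tempfile.mkdtemp())
path = log_run(run_dir, {"lr": 0.01, "seed": 42}, {"accuracy": 0.93}, b"toy-dataset")
print(json.loads(path.read_text())["data_hash"])
```

Because the manifest is content-addressed, re-running with identical data and parameters yields an identical hash, which is the basic check behind automated reproducibility audits.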
| Finding | Detail |
|---|---|
| Reproducibility crisis | Only 19.5% of top ML conference papers (2024) provide code |
| SciRep advantage | 89% success vs 61% for Docker/Singularity |
| 2026 vision | Automated reproducibility as byproduct of AI-assisted research |
Tools and papers for manuscript drafting, citation management, figure/table creation, and formatting. (10 items)
Tools:
- Grammarly -- Grammar, style, and tone suggestions. Widely used academic writing assistant with clarity improvements and plagiarism detection. L1 Assist. Freemium.
- QuillBot -- Paraphrasing and grammar correction. Flow workspace for end-to-end writing process. Reliable for rephrasing, checking grammar, summarizing. L1 Assist. Freemium.
- LanguageTool OSS -- Open-source grammar checker supporting 30+ languages. Free alternative to commercial tools; self-hostable for privacy. L1 Assist. Free.
Tools:
- Jenni AI -- Research and academic writing assistant. Strong for citations, outlines, academic formatting. Specialized for research-driven writing tasks with inline citation support. L2 Partial. Commercial.
- Yomu AI -- Excels at paragraph development and PDF interaction. Quick paper gist understanding for staying current with literature. L2 Partial. Commercial.
- SciSpace (writing mode) -- 10M+ researchers (2026). Upload PDF, chat with paper. Semantic search, automated summarization, reference management, AI copilot for writing. L2 Partial. Freemium.
Papers:
- The AI Scientist-v2 NEW (arXiv:2504.08066, 2025, Sakana AI) -- First AI-generated paper accepted through peer review (ICLR 2025 workshop). Removes human templates, uses progressive agentic tree search. Milestone paper. Venue: ICLR Workshop.
Tools:
- Sakana AI - AI Scientist OSS -- (GitHub | 5k+ stars) End-to-end paper generation: hypothesis -> code -> experiment -> LaTeX paper. First system to produce peer-review-accepted output. L5 Full.
| Finding | Detail |
|---|---|
| Gap | Level 3-4 tools missing (between Jenni AI L2 and AI Scientist-v2 L5) |
| Need | Automated draft generation with human refinement (RAG for review papers) |
| Milestone | AI Scientist-v2: first AI paper to exceed human acceptance threshold (2025) |
Tools and papers for presentation slides, video/audio narration, social media, and homepage creation. (1 item -- critical gap)
Tools:
- Slide Templates -- AI-suggested layouts and content generation for academic presentations. Basic automation of slide creation from paper content. L1 Assist.
| Finding | Detail |
|---|---|
| CRITICAL GAP | Promotion stage severely under-researched: 1 tool, 0 papers (96% gap vs Experiment 48 items) |
| Missing L2-3 | Automated slide generation, poster creation from papers |
| Missing L4-5 | Video narration, social media automation, conference talk generation |
| Potential additions | Beautiful.ai, Tome, Gamma (slides); Descript, Synthesia (video); Buffer, Hootsuite (social media) |
General-purpose platforms, multimodal AI, and infrastructure supporting multiple research stages.
Papers:
- AI for Science 2025 NEW (Nature, 2025) -- Interdisciplinary knowledge graphs, RL-driven closed-loop systems, and interactive AI interfaces for scientific theory refinement. Venue: Nature.
Tools:
- NVIDIA BioNeMo -- LLM platform for biology (announced fall 2022), later expanded into a full open development platform. Lab-in-a-loop workflows for biology and drug discovery: RNA structure, molecular synthesis, toxicity prediction. Commercial.
- Eli Lilly x NVIDIA Partnership -- $1B 5-year strategic partnership. AI co-innovation lab (SF Bay Area). Generative AI for drug discovery. Partnership.
- NSF RAISE -- Research in AI for Science and Engineering. Democratizing AI access for researchers through infrastructure and funding. Government initiative.
- National AI Research Resources -- NSF infrastructure for democratizing AI access to computational resources for the research community. Government initiative.
Tools:
- InternVL3-78B OSS -- (5k+ stars) 72.2 MMMU (open-source record). Vision-language model for multimodal scientific tasks, including figure understanding and data extraction. Open-source.
- Qwen2.5-VL-32B-Instruct OSS -- (10k+ stars) Vision-language instruction tuning (Alibaba). Strong multimodal understanding and generation capabilities. Open-source.
- GLM-4.5V -- MoE architecture, 3D-RoPE, 72.2 MMMU (tied with InternVL3 for the top score). Advanced vision-language capabilities. Commercial.
- MaCBench NEW -- Benchmark for chemistry/materials tasks: data extraction, experimental execution, results interpretation. Venue: Nature Computational Science.
| Finding | Detail |
|---|---|
| Market size | Multimodal AI: $391B (2025) -> ~$2T (2030), CAGR 35.9% |
| Strategic investments | $1B+ partnerships emerging (NVIDIA-Eli Lilly) |
| Timeline compression | Discovery cycles compressed from years to weeks with AI + autonomous experimentation |
- Vibe Researching as Wolf Coming NEW (arXiv:2602.22401, Feb 2026) -- Introduces the vibe researching concept via scholar-skill (a 26-skill plugin for Claude Code covering the full pipeline, idea -> submission). Cognitive task framework: codifiability x tacit knowledge. AI agents excel at speed, coverage, and scaffolding but struggle with theoretical originality.
- VibeX 2026 Workshop -- 1st International Workshop on Vibe Coding and Vibe Researching @ EASE 2026. Mixed audience of junior/senior researchers and practitioners. Focus on autonomous AI agents in SE research.
- Awesome Machine Learning -- Comprehensive ML resources across languages and frameworks
- Awesome Deep Learning -- Deep learning papers, tutorials, and resources
- Awesome AI for Science -- AI applications in scientific research across domains
| Conference | Relevance |
|---|---|
| ICLR | AI Scientist-v2 first peer-reviewed AI paper (2025 workshop) |
| EASE 2026 | Hosts VibeX 2026 workshop on vibe coding and vibe researching |
| NeurIPS | PaperBench benchmark papers, ML reproducibility initiatives |
| ICML | DeepCode benchmark (75.9% success on ICML 2024 reproduction) |
Contributions welcome! Please read the contribution guidelines first.
- Add a resource: Submit a pull request with the new tool/paper. Include: title, URL, description (1-2 sentences), and classification (Stage, Level).
- Update existing entries: Corrections, updated metrics (GitHub stars, user counts), or additional information.
- Report gaps: Identify missing categories, tools, or papers via Issues.
- Papers: Must be peer-reviewed, arXiv preprints, or published in reputable venues. Include arXiv ID or DOI.
- Tools: Must be actively maintained (not archived), have clear documentation, and be relevant to research automation.
- Descriptions: Concise (1-2 sentences), factual (no marketing language), include quantitative metrics when available.
| Level | Definition | Human Role | Example |
|---|---|---|---|
| L1 (Assist) | AI suggests, human drives | Decides all logic | Code completion, grammar |
| L2 (Partial) | AI executes specific subtasks | Defines task, validates | Automated screening |
| L3 (Conditional) | AI handles workflows with checkpoints | Sets goals, approves | End-to-end literature review |
| L4 (High) | AI handles extended workflows | Monitors for failures | Self-driving labs |
| L5 (Full) | AI operates independently | None | AI Scientist: idea -> paper |
Stages: Survey, Ideation, Experiment, Analysis, Writing, Promotion
See CONTRIBUTING.md for the full classification decision tree and formatting standards.
To the extent possible under law, the contributors have waived all copyright and related rights to this work. See LICENSE for details.
This list was compiled through systematic web research (March 2026) covering academic databases (arXiv, Google Scholar, Semantic Scholar), GitHub repositories, and commercial platforms. The survey identified 107 resources (45 papers, 62 tools) across 10 research categories.
Maintained by: OpenLAIR | Version: 2.0 (2026-03-16) | Last Updated: March 2026
Citation: If you use this resource in your research, please cite:
@misc{awesome-vibe-researching,
title={Awesome Vibe Researching: A Curated List of AI-Driven Research Automation Resources},
author={Dingjie Song and Lichao Sun},
year={2026},
howpublished={\url{https://github.com/OpenLAIR/awesome-vibe-researching}},
note={107 resources across 6 research stages and 5 autonomy levels}
}

