papers_methods_analysis.yaml
- short_name: PassRateConstraint
  title: 'Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance
    in LLM-based Issue Resolution'
  authors: Kai Yu, Zhenhao Zhou, Junhao Zeng, Ying Wang, Xueying Du, Zhiqiang Yuan,
    Junwei Liu, Ziyu Zhou, Yujia Wang, Chong Wang, Xin Peng
  year: '2026'
  venue: arXiv preprint arXiv:2604.05955
  month: 2026-04
  links:
    arxiv: https://arxiv.org/abs/2604.05955
- short_name: ContextBench
  title: 'ContextBench: A Benchmark for Context Retrieval in Coding Agents'
  authors: Han Li, Letian Zhu, Bohan Zhang, Rili Feng, Jiaming Wang, Yue Pan, Earl
    T. Barr, Federica Sarro, Zhaoyang Chu, He Ye
  year: '2026'
  venue: arXiv preprint arXiv:2602.05892
  month: 2026-02
  links:
    arxiv: https://arxiv.org/abs/2602.05892
    github: https://github.com/EuniAI/ContextBench
    huggingface: https://huggingface.co/datasets/Contextbench/ContextBench
    website: https://contextbench.github.io/
- short_name: SWEnergy
  title: 'SWEnergy: An Empirical Study on Energy Efficiency in Agentic Issue Resolution
    Frameworks with SLMs'
  authors: Arihant Tripathy, Ch Pavan Harshit, Karthik Vaidhyanathan
  year: '2025'
  venue: ICSE Workshops 2026
  month: 2025-12
  links:
    arxiv: https://arxiv.org/abs/2512.09543v2
- short_name: Failures analysis
  title: An Empirical Study on Failures in Automated Issue Solving
  authors: Simiao Liu, Fang Liu, Liehao Li, Xin Tan, Yinghao Zhu, Xiaoli Lian, Li
    Zhang
  year: '2025'
  venue: arXiv preprint arXiv:2509.13941
  month: 2025-09
  links:
    arxiv: https://arxiv.org/abs/2509.13941
- short_name: Security analysis
  title: How Safe Are AI-Generated Patches? A Large-scale Study on Security Risks
    in LLM and Agentic Automated Program Repair on SWE-bench
  authors: Amirali Sajadi, Kostadin Damevski, Preetha Chatterjee
  year: '2025'
  venue: arXiv preprint arXiv:2507.02976
  month: 2025-07
  links:
    arxiv: https://arxiv.org/abs/2507.02976
- short_name: Dissecting the SWE-Bench Leaderboards
  title: 'Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures
    of LLM- and Agent-Based Repair Systems'
  authors: Matias Martinez, Xavier Franch
  year: '2025'
  venue: arXiv preprint arXiv:2506.17208
  month: 2025-06
  links:
    arxiv: https://arxiv.org/abs/2506.17208
- short_name: GSO
  title: 'GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents'
  authors: Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen,
    Ion Stoica
  year: '2025'
  venue: arXiv preprint arXiv:2505.23671
  month: 2025-05
  links:
    arxiv: https://arxiv.org/abs/2505.23671
- short_name: Strong-Weak Model Collaboration
  title: An Empirical Study on Strong-Weak Model Collaboration for Repo-level Code
    Generation
  authors: Shubham Gandhi, Atharva Naik, Yiqing Xie, Carolyn Rose
  year: '2025'
  venue: arXiv preprint arXiv:2505.20182
  month: 2025-05
  links:
    arxiv: https://arxiv.org/abs/2505.20182
- short_name: Agents in the Wild
  title: Agents in the Wild
  authors: LogicStar-AI, SRILab
  year: '2025'
  venue: '-'
  month: 2025-05
  links:
    website: https://insights.logicstar.ai/
- short_name: SeaView
  title: 'SeaView: Software Engineering Agent Visual Interface for Enhanced Workflow'
  authors: Timothy Bula, Saurabh Pujar, Luca Buratti, Mihaela Bornea, Avirup Sil
  year: '2025'
  venue: arXiv preprint arXiv:2504.08696
  month: 2025-04
  links:
    arxiv: https://arxiv.org/abs/2504.08696
- short_name: Beyond final code
  title: 'Beyond Final Code: A Process-Oriented Error Analysis of Software Development
    Agents in Real-World GitHub Scenarios'
  authors: Zhi Chen, Wei Ma, Lingxiao Jiang
  year: '2025'
  venue: arXiv preprint arXiv:2503.12374
  month: 2025-03
  links:
    arxiv: https://arxiv.org/abs/2503.12374
- short_name: Overthinking
  title: 'The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic
    Tasks'
  authors: Alejandro Cuadron, Dacheng Li, Wenjie Ma, Xingyao Wang, Yichuan Wang, Siyuan
    Zhuang, Shu Liu et al.
  year: '2025'
  venue: arXiv preprint arXiv:2502.08235
  month: 2025-02
  links:
    arxiv: https://arxiv.org/abs/2502.08235
- short_name: Evaluating software development agents
  title: 'Evaluating Software Development Agents: Patch Patterns, Code Quality, and
    Issue Complexity in Real-World GitHub Scenarios'
  authors: Zhi Chen, Lingxiao Jiang
  year: '2024'
  venue: 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering
    (SANER)
  month: 2024-10
  links:
    arxiv: https://arxiv.org/abs/2410.12468v2
    doi: http://dx.doi.org/10.1109/SANER64311.2025.00068
- short_name: Context Retrieval
  title: On The Importance of Reasoning for Context Retrieval in Repository-Level
    Code Editing
  authors: Alexander Kovrigin, Aleksandra Eliseeva, Yaroslav Zharov, Timofey Bryksin
  year: '2024'
  venue: arXiv preprint arXiv:2406.04464
  month: 2024-06
  links:
    arxiv: https://arxiv.org/abs/2406.04464