Code agents move toward repository-level execution and verification loops
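This week's strongest thread is code agents entering real-world software engineering. The question has shifted from "can it write code" to "can it understand the repository, carry the work through execution, and then use a verification loop to prove it broke nothing." RAIM focuses on repository-level feature addition, which requires first locating insertion points, then comparing multiple candidate designs, and finally assessing impact. BeyondSWE extends the task set to cross-repository work, dependency migration, and generating a repository from documentation, which directly exposes how low current agents' success rates are on complex tasks. Echo chains retrieval, generation, execution, and verification into a closed loop, moving a step closer to real development workflows.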
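To make the closed-loop idea concrete, here is a minimal sketch of a retrieve, generate, execute, verify cycle in which a failed check is fed back into the next round. It is not Echo's implementation or any of the cited systems: `closed_loop`, `Attempt`, and the `toy_*` stand-ins are hypothetical names invented for illustration, and the "agent" is a stub that corrects itself once it sees failure feedback.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    patch: str    # candidate code produced in this round
    passed: bool  # whether the verification step accepted it
    log: str      # execution output, reused as feedback on failure


def closed_loop(
    task: str,
    retrieve: Callable[[str, str], List[str]],
    generate: Callable[[str, List[str], str], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[Attempt]:
    """Run retrieve -> generate -> execute/verify until a patch passes."""
    history: List[Attempt] = []
    feedback = ""
    for _ in range(max_rounds):
        context = retrieve(task, feedback)        # gather repository context
        patch = generate(task, context, feedback) # propose a change
        passed, log = execute(patch)              # run it and check the result
        history.append(Attempt(patch, passed, log))
        if passed:
            break
        feedback = log                            # close the loop on failure
    return history


# Hypothetical stand-ins for a retriever, a model, and a test runner.
def toy_retrieve(task: str, feedback: str) -> List[str]:
    return ["def add(a, b): ..."]


def toy_generate(task: str, context: List[str], feedback: str) -> str:
    # The first draft is wrong on purpose; it is fixed once feedback arrives.
    return "return a + b" if feedback else "return a - b"


def toy_execute(patch: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(f"def add(a, b):\n    {patch}", namespace)  # "execute" the patch
    result = namespace["add"](2, 3)
    if result == 5:                                   # "verify" the behavior
        return True, "ok"
    return False, f"add(2, 3) returned {result}, expected 5"


if __name__ == "__main__":
    for attempt in closed_loop("implement add", toy_retrieve, toy_generate, toy_execute):
        print(attempt)
```

The design point is that verification output is an input to the next round, not just a final score; a real agent would replace the stubs with repository search, a model call, and a sandboxed build and test run.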
Representative sources
- Closing the Loop – Optimizing the Agentic SDLC — btraut
- Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition — Mingwei Liu; Zhenxi Chen; Zheng Pei; Zihao Wang; Yanlin Wang; Zibin Zheng
- Graduate from Single-Session Coding: My Full Agentic Coding Workflow — btraut
- RepoLaunch: Automating Build&Test Pipeline of Code Repositories on ANY Language and ANY Platform — Kenan Li; Rongzhi Li; Linghao Zhang; Qirui Jin; Liao Zhu; Xiaosong Huang; …
- BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? — Guoxin Chen; Fanzhe Meng; Jiale Zhao; Minghao Li; Daixuan Cheng; Huatong Song; …
- A Scalable Benchmark for Repository-Oriented Long-Horizon Conversational Context Management — Yang Liu; Li Zhang; Fang Liu; Ping Lin; Xinyi Li