---
kind: trend
trend_doc_id: 284
granularity: day
period_start: '2026-03-08T00:00:00'
period_end: '2026-03-09T00:00:00'
topics:
- code-generation
- agent-memory
- agent-security
- human-in-the-loop
- software-engineering
run_id: materialize-outputs
aliases:
- recoleta-trend-284
tags:
- recoleta/trend
- topic/code-generation
- topic/agent-memory
- topic/agent-security
- topic/human-in-the-loop
- topic/software-engineering
language_code: zh-CN
---

# Structured Code Intelligence, Long-Running Agents, and Agent Security Moving Upstream

## Overview
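Today's material sends one consistent signal: AI systems are moving from "can generate" to "can ship". Code, agents, security, and research workflows are all turning toward structural constraints, long-running operation, and human gatekeeping.

Key observations:

- Code tasks lean more heavily on structured knowledge. Progress here is not just "bigger models"; it comes from explicitly bringing version relationships, program graphs, and evolution paths into generation and judgment.
- Agents are starting to be designed like real production systems. The focus becomes memory, auditing, rollback, asynchronous operation, and when humans step in.
- Security concerns move upstream.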

## Clusters

### Structured code reasoning replaces plain-text generation

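Code agents are shifting from "generate directly" to "build structure first, then generate". The strongest signal comes from API-evolution migration: KCoEvo uses a knowledge graph to represent cross-version relationships explicitly, then has the model generate code along the migration path, and multiple models improve substantially on hard version migrations. In the same vein, work on patch correctness assessment finds that graph-structured representations are more robust. Together these suggest that, for code tasks, structured program information is replacing plain-text prompts and ordinary retrieval as the main lever for reliability.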
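
To make the "build structure first, then generate" idea concrete, here is a minimal sketch of looking up a migration path in a version graph and turning it into prompt context. It is not KCoEvo's implementation: the graph contents, the symbol names (`lib.load`, `lib.read`, `lib.io.read`), and both helper functions are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical API-evolution graph. Each node is (symbol, version); each edge
# records how a symbol changed in a later version. All names are invented.
EVOLUTION_EDGES = {
    ("lib.load", "1.0"): [(("lib.read", "2.0"), "renamed")],
    ("lib.read", "2.0"): [(("lib.io.read", "3.0"), "moved")],
}


def migration_path(start, target_version):
    """Breadth-first search from `start` to any node at `target_version`.

    Returns a list of (old_node, relation, new_node) steps, or None if the
    graph contains no route to the target version.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node[1] == target_version:
            return path
        for nxt, relation in EVOLUTION_EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None


def render_prompt_context(path):
    """Format the path as plain lines that could be prepended to a prompt."""
    return "\n".join(
        f"{old[0]}@{old[1]} --{rel}--> {new[0]}@{new[1]}" for old, rel, new in path
    )


path = migration_path(("lib.load", "1.0"), "3.0")
print(render_prompt_context(path))
```

The design point is that the model receives an explicit chain of version relationships instead of having to infer them from retrieved text.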

#### Representative sources
- [KCoEvo: A Knowledge Graph Augmented Framework for Evolutionary Code Generation](../Inbox/2026-03-08--kcoevo-a-knowledge-graph-augmented-framework-for-evolutionary-code-generation.md) — Jiazhen Kang; Yuchen Lu; Chen Jiang; Jinrui Liu; Tianhao Zhang; Bo Jiang; …
- [On the Effectiveness of Code Representation in Deep Learning-Based Automated Patch Correctness Assessment](../Inbox/2026-03-08--on-the-effectiveness-of-code-representation-in-deep-learning-based-automated-patch-correctness-assessment.md) — Quanjun Zhang; Chunrong Fang; Haichuan Hu; Yuan Zhao; Weisong Sun; Yun Yang; …


### Agents enter the "long-running + supervisable" phase

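The focus of agent systems is shifting from "can it call tools" to "can it work for a long time without going off the rails". One thread is memory: the survey identifies the write-manage-read loop and cross-session evaluation as core topics, and notes that memory directly affects task completion rates. The other thread is runtime form: Oly turns long-running CLI agents into background services that can be supervised asynchronously, while Claude Custom Chat wraps a "self-modifying tool" in snapshot rollback and a restricted scope. Both indicate that the next stage of competition is about sustained operation, rollback, and auditability, not only the quality of a single answer.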
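
The write-manage-read loop mentioned above can be sketched in a few lines. This is an illustrative toy, not a design taken from the survey: the `MemoryStore` class, its capacity-based eviction, and the word-overlap ranking are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy agent memory with separate write, manage, and read steps."""

    capacity: int = 3
    items: list = field(default_factory=list)

    def write(self, text):
        # Write: append the raw observation without filtering.
        self.items.append(text)

    def manage(self):
        # Manage: drop exact duplicates (keeping first occurrence order),
        # then keep only the most recent `capacity` entries.
        deduped = list(dict.fromkeys(self.items))
        self.items = deduped[-self.capacity:]

    def read(self, query, k=1):
        # Read: rank entries by how many lowercase words they share with the query.
        words = set(query.lower().split())
        ranked = sorted(
            self.items,
            key=lambda t: len(words & set(t.lower().split())),
            reverse=True,
        )
        return ranked[:k]


store = MemoryStore(capacity=3)
for note in [
    "user prefers pytest",
    "repo uses poetry",
    "user prefers pytest",
    "deploy target is staging",
    "ci runs on linux",
]:
    store.write(note)
store.manage()
print(store.read("which deploy target"))
```

A real system would likely replace the word-overlap ranking with embedding search and the eviction rule with summarization, but the three-step shape stays the same.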

#### Representative sources
- [Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers](../Inbox/2026-03-08--memory-for-autonomous-llm-agents-mechanisms-evaluation-and-emerging-frontiers.md) — Pengfei Du
- [Oly – Run AI agents, close your terminal, intervene when it needed from anywhere](../Inbox/2026-03-08--oly-run-ai-agents-close-your-terminal-intervene-when-it-needed-from-anywhere.md) — binwen
- [Claude Custom Chat – customize your Claude Code extension](../Inbox/2026-03-08--claude-custom-chat-customize-your-claude-code-extension.md) — mattiagaggi


### Agent security moves from prompt defense to data-flow governance

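As agents gain access to files, code, external services, and cross-tool permissions, the security boundary moves noticeably upstream. The security analysis article uses real incidents to show that the combination of high-privilege agents, prompt injection, and external communication amplifies supply-chain and lateral-movement risk. On the academic side, AgentRaft quantifies the problem further: in a large-scale environment of real tools, data over-exposure is a systemic phenomenon. The trend is clear: agent security is no longer only a model-alignment problem, but a problem of least privilege, data-flow auditing, and toolchain governance.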
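
As a rough illustration of least privilege plus data-flow auditing at a tool boundary, the sketch below strips fields a tool does not need and records what was blocked. It is not AgentRaft's detection method; `over_exposed_fields`, `guarded_call`, and the `send_email` tool are hypothetical names made up for this example.

```python
def over_exposed_fields(payload, allowed_fields):
    """Return the payload keys that the tool is not allowed to receive."""
    return sorted(set(payload) - set(allowed_fields))


def guarded_call(tool, payload, allowed_fields, audit_log):
    """Call `tool` with only the allowed fields and log what was withheld."""
    blocked = over_exposed_fields(payload, allowed_fields)
    minimal = {k: v for k, v in payload.items() if k in allowed_fields}
    audit_log.append(
        {"tool": tool.__name__, "sent": sorted(minimal), "blocked": blocked}
    )
    return tool(**minimal)


def send_email(to, subject):
    # Hypothetical tool: stands in for any external side effect.
    return f"sent to {to}: {subject}"


audit_log = []
result = guarded_call(
    send_email,
    {"to": "ops@example.com", "subject": "build failed", "api_key": "secret"},
    allowed_fields={"to", "subject"},
    audit_log=audit_log,
)
print(result)
print(audit_log)
```

Keeping the allow-list outside the model's control is the point of the design: the agent can propose any payload, but only the declared fields reach the tool.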

#### Representative sources
- [AI Assistants Are Moving the Security Goalposts](../Inbox/2026-03-08--ai-assistants-are-moving-the-security-goalposts.md) — todsacerdoti
- [AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents](../Inbox/2026-03-08--agentraft-automated-detection-of-data-over-exposure-in-llm-agents.md) — Yixi Lin; Jiangrong Wu; Yuhong Nan; Xueqiang Wang; Xinyuan Zhang; Zibin Zheng


### Human-AI collaborative workflows gain priority over "full automation"

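Several sources point in the same pragmatic direction: AI is no longer chasing fully automated replacement, but entering a production phase of "human gatekeeping + pipeline automation". HLER, applied to economics research, places humans at the topic-selection and final-approval nodes, which markedly reduces infeasible research questions. Prompt-engineering practice likewise emphasizes evaluation sets, structured constraints, and debugging information, while the remote-testing study shows that distributed software processes increasingly depend on documentation, automation, and traceable collaboration. Overall, the industry is embedding AI into existing workflows rather than prizing autonomy for its own sake.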
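
The two human checkpoints described for HLER (topic selection and final approval) can be pictured as gates around an automated step. The sketch below is a generic illustration under that assumption, not HLER's actual pipeline; `run_pipeline` and the callback names are hypothetical, and the lambdas stand in for a real person's decisions.

```python
def run_pipeline(candidates, approve_topic, draft, approve_final):
    """Run an automated drafting step between two human approval gates."""
    accepted = []
    for topic in candidates:
        if not approve_topic(topic):  # human gate 1: is the question worth pursuing?
            continue
        report = draft(topic)  # automated step (an agent pipeline in practice)
        if approve_final(report):  # human gate 2: sign off on the result
            accepted.append(report)
    return accepted


reports = run_pipeline(
    candidates=["wage effects of remote work", "question without available data"],
    approve_topic=lambda topic: "without available data" not in topic,
    draft=lambda topic: f"Draft report: {topic}",
    approve_final=lambda report: report.startswith("Draft report"),
)
print(reports)
```

Placing the first gate before any drafting is what saves effort: infeasible questions are rejected before the automated stages spend compute on them.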

#### Representative sources
- [HLER: Human-in-the-Loop Economic Research via Multi-Agent Pipelines for Empirical Discovery](../Inbox/2026-03-08--hler-human-in-the-loop-economic-research-via-multi-agent-pipelines-for-empirical-discovery.md) — Chen Zhu; Xiaolu Wang
- [State-of-the-Art Prompting for AI Agents (2025)](../Inbox/2026-03-08--state-of-the-art-prompting-for-ai-agents-2025.md) — walterbell
- [Regression Testing in Remote and Hybrid Software Teams: An Exploratory Study of Processes, Tools, and Practices](../Inbox/2026-03-08--regression-testing-in-remote-and-hybrid-software-teams-an-exploratory-study-of-processes-tools-and-practices.md) — Juliane Pascoal; Cleytton Magalhaes; Ronnie de Souza Santos