---
kind: trend
trend_doc_id: 65
granularity: day
period_start: '2026-03-04T00:00:00'
period_end: '2026-03-05T00:00:00'
topics:
- robotics
- vla
- memory
- benchmark
- continual-learning
- dexterous-manipulation
- dual-arm
run_id: materialize-outputs
aliases:
- recoleta-trend-65
tags:
- recoleta/trend
- topic/robotics
- topic/vla
- topic/memory
- topic/benchmark
- topic/continual-learning
- topic/dexterous-manipulation
- topic/dual-arm
language_code: zh-CN
---

# Robotics research shifts toward memory evaluation, structured control, and large-scale benchmarks

## Overview
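The day's robotics research is tightly focused. The keyword is not simply "bigger models" but a clearer breakdown of where capability comes from: memory, benchmarks, structured control, and continual learning.

Key observations:
- Memory is the clearest theme, but the research focus has moved from "give the model history" to "which tasks need which kind of memory."
- Benchmark building continues to accelerate. One line of work scales up simulation, while another starts to fill in unified real-world evaluation.
- Structural priors matter again. Both dual-arm and dexterous-hand work replace end-to-end entangled control with more composable representations.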

## Clusters

### Robot memory moves from capability slogan to systematic evaluation and layered design

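The strongest thread of the day is that robots need memory, but memory is not a single module. RoboMME first standardizes memory evaluation and shows that different tasks depend on different memory representations and injection methods. MEM then turns this into a runnable system: short-term video memory handles detail, and long-term language memory tracks task progress. Together, the two papers move the discussion from "whether to add memory" to "assigning memory types by task."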
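
The split between a short-term visual memory and a long-term language memory can be illustrated with a minimal sketch. This is not MEM's implementation: the class `TwoScaleMemory`, its methods, and the `short_horizon` parameter are hypothetical names chosen for illustration, and frames are stand-in strings where a real system would hold image features.

```python
from collections import deque


class TwoScaleMemory:
    """Hypothetical two-scale memory; an illustration, not the MEM architecture."""

    def __init__(self, short_horizon=4):
        # Short-term store: only the most recent frames are kept for fine detail.
        self.frames = deque(maxlen=short_horizon)
        # Long-term store: compact text notes that record task progress.
        self.progress_notes = []

    def observe(self, frame):
        """Add a new frame; the oldest frame is dropped once the buffer is full."""
        self.frames.append(frame)

    def note(self, text):
        """Record a short language summary of a completed step."""
        self.progress_notes.append(text)

    def context(self):
        """Assemble what a policy would condition on at this step."""
        return {
            "recent_frames": list(self.frames),
            "progress": "; ".join(self.progress_notes),
        }
```

The design point is that the two stores age differently: frames are evicted after a few steps, while text notes persist for the whole episode at low cost.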

#### Representative sources
- [RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies](../Inbox/2026-03-04--robomme-benchmarking-and-understanding-memory-for-robotic-generalist-policies.md) — Yinpei Dai; Hongze Fu; Jayjun Lee; Yuejiang Liu; Haoran Zhang; Jianing Yang; …
- [MEM: Multi-Scale Embodied Memory for Vision Language Action Models](../Inbox/2026-03-04--mem-multi-scale-embodied-memory-for-vision-language-action-models.md) — Marcel Torne; Karl Pertsch; Homer Walke; Kyle Vedder; Suraj Nair; Brian Ichter; …


### Large-scale benchmarks expand at both the simulation and real-world ends

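The second thread is that generalist robots need larger, more unified training and evaluation grounds. RoboCasa365 scales up tasks, scenes, and demonstrations together, and uses a unified protocol to measure multi-task training, pretraining gains, and lifelong learning. ManipulationNet brings the view back to the real world, emphasizing standardized object sets, a submission process, and centralized review, in an attempt to build a comparable, verifiable benchmark for real-world manipulation.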

#### Representative sources
- [RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots](../Inbox/2026-03-04--robocasa365-a-large-scale-simulation-framework-for-training-and-benchmarking-generalist-robots.md) — Soroush Nasiriany; Sepehr Nasiriany; Abhiram Maddukuri; Yuke Zhu
- [ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning](../Inbox/2026-03-04--manipulationnet-an-infrastructure-for-benchmarking-real-world-robot-manipulation-with-physical-skill-challenges-and-embodied-multimodal-reasoning.md) — Yiting Chen; Kenneth Kimble; Edward H. Adelson; Tamim Asfour; Podshara Chanrungmaneekul; Sachin Chitta; …


### Structured action representations begin to replace monolithic black-box control

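The third thread is that generality no longer depends only on bigger models; it also depends on better structural inductive biases. SkillVLA decomposes dual-arm manipulation into reusable single-arm skills that communicate on demand, addressing the near-total failure on unseen skill pairings. SAT rewrites dexterous-hand actions as joint-organized 3D structured sequences, so the same model transfers across hand morphologies more naturally. Both reduce unnecessary coupling in the action representation.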
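
The combinatorial problem behind unseen skill pairings can be made concrete with a small counting sketch. The skill names and the helper `unseen_pairs` are hypothetical and are not taken from SkillVLA; the sketch only shows how quickly the set of untrained left/right pairings grows relative to the pairings covered by demonstrations.

```python
from itertools import product


def unseen_pairs(left_skills, right_skills, seen):
    """Return every (left, right) skill pairing that is absent from the seen set."""
    all_pairs = set(product(left_skills, right_skills))
    seen = set(seen)
    unknown = seen - all_pairs
    if unknown:
        raise ValueError(f"seen pairs use unknown skills: {sorted(unknown)}")
    return sorted(all_pairs - seen)


# Hypothetical single-arm skills for each arm.
left = ["pick", "hold", "pour"]
right = ["open", "stir", "wipe"]

# Three demonstrated pairings already cover every individual skill once.
demonstrated = [("pick", "open"), ("hold", "stir"), ("pour", "wipe")]

held_out = unseen_pairs(left, right, demonstrated)
```

With three skills per arm there are nine pairings, so the three demonstrations above leave six pairings unseen even though each single-arm skill appears in the data. A policy that only memorizes joint pairings has nothing to reuse for those six, which is the gap that skill reuse is meant to close.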

#### Representative sources
- [SkillVLA: Tackling Combinatorial Diversity in Dual-Arm Manipulation via Skill Reuse](../Inbox/2026-03-04--skillvla-tackling-combinatorial-diversity-in-dual-arm-manipulation-via-skill-reuse.md) — Xuanran Zhai; Zekai Huang; Longyan Wu; Qianyou Zhao; Qiaojun Yu; Jieji Ren; …
- [Structural Action Transformer for 3D Dexterous Manipulation](../Inbox/2026-03-04--structural-action-transformer-for-3d-dexterous-manipulation.md) — Xiaohan Lei; Min Wang; Bohong Weng; Wengang Zhou; Houqiang Li


### Pretrained VLAs show stronger resistance to forgetting in continual learning

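Continual learning yields a cautiously optimistic conclusion: forgetting in large-scale pretrained VLAs is less severe than expected. The work shows that simple experience replay lets Pi0 and GR00T keep forgetting low across multiple LIBERO suites, clearly better than small models trained from scratch. This suggests that future skill-library expansion may rely more on a pretrained base plus a small amount of replay than on complex anti-forgetting techniques.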
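
Experience replay and a forgetting score can be sketched in a few lines. Both functions are illustrative assumptions rather than the paper's code: `replay_batch`, `replay_ratio`, and `average_forgetting` are hypothetical names, and the forgetting score follows one common definition (best earlier success on a task minus its final success), which may differ from the metric the authors report.

```python
import random


def replay_batch(new_task_data, replay_buffer, batch_size, replay_ratio, rng):
    """Build one training batch that mixes current-task and replayed samples."""
    # Cap the replayed share by what the buffer actually holds.
    n_replay = min(int(batch_size * replay_ratio), len(replay_buffer))
    n_new = batch_size - n_replay
    batch = rng.sample(replay_buffer, n_replay)
    batch += [rng.choice(new_task_data) for _ in range(n_new)]
    rng.shuffle(batch)
    return batch


def average_forgetting(success):
    """success[i][j] is the success rate on task j after training through task i.

    Row i must contain entries for tasks 0..i. The final task is skipped
    because it has no earlier checkpoint to compare against.
    """
    final = success[-1]
    drops = []
    for j in range(len(final) - 1):
        peak = max(success[i][j] for i in range(j, len(success) - 1))
        drops.append(peak - final[j])
    return sum(drops) / len(drops) if drops else 0.0
```

A score near zero means earlier tasks are retained after later training, and a larger positive score means more was lost. The `replay_ratio` parameter is the knob the paragraph alludes to: the claim is that a small value may already be enough for a pretrained base.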

#### Representative sources
- [Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning](../Inbox/2026-03-04--pretrained-vision-language-action-models-are-surprisingly-resistant-to-forgetting-in-continual-learning.md) — Huihan Liu; Changyeon Kim; Bo Liu; Minghuan Liu; Yuke Zhu
- [RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots](../Inbox/2026-03-04--robocasa365-a-large-scale-simulation-framework-for-training-and-benchmarking-generalist-robots.md) — Soroush Nasiriany; Sepehr Nasiriany; Abhiram Maddukuri; Yuke Zhu
