An "automated data generation + online correction" data factory for robot teams
Build a "robot data factory middleware": use a small collection policy to explore in parallel and generate candidate trajectories, score and filter them with a multimodal verifier, and then add inference-time guidance during replay or real execution to form a closed loop from data generation to deployment. The first wedge is not training a general large model, but serving robot teams that already have a small amount of demonstrations but lack data scaling capability.
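As a rough illustration of the collect, score, and filter stages described above, here is a minimal Python sketch. Everything in it is a placeholder assumption: `Trajectory`, `collect_candidates`, `toy_verifier`, `filter_by_verifier`, and the 0.5 threshold are hypothetical names and values, the "collector" just samples random actions, and the "verifier" is a toy heuristic standing in for a multimodal model.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    task: str
    actions: List[float]
    score: float = 0.0  # filled in by the verifier


def collect_candidates(task: str, n_rollouts: int, horizon: int,
                       rng: random.Random) -> List[Trajectory]:
    """Stand-in for parallel rollouts of a small collection policy.

    Assumption: a real collector would step a simulator or robot;
    here each rollout is just a list of random actions.
    """
    return [
        Trajectory(task, [rng.uniform(-1.0, 1.0) for _ in range(horizon)])
        for _ in range(n_rollouts)
    ]


def toy_verifier(traj: Trajectory) -> float:
    """Placeholder for a multimodal verifier.

    Assumption: scores are in [0, 1]; this toy version simply
    prefers trajectories with small average action magnitude.
    """
    mean_abs = sum(abs(a) for a in traj.actions) / len(traj.actions)
    return 1.0 - mean_abs


def filter_by_verifier(candidates: List[Trajectory],
                       verifier: Callable[[Trajectory], float],
                       threshold: float) -> List[Trajectory]:
    """Score every candidate and keep those at or above the threshold."""
    kept = []
    for traj in candidates:
        traj.score = verifier(traj)
        if traj.score >= threshold:
            kept.append(traj)
    return kept


def run_round(task: str, n_rollouts: int = 32, horizon: int = 8,
              threshold: float = 0.5,
              seed: int = 0) -> Tuple[List[Trajectory], List[Trajectory]]:
    """One generation round: collect candidates, then filter them."""
    rng = random.Random(seed)
    candidates = collect_candidates(task, n_rollouts, horizon, rng)
    kept = filter_by_verifier(candidates, toy_verifier, threshold)
    return candidates, kept


if __name__ == "__main__":
    candidates, kept = run_round("stacking")
    print(f"kept {len(kept)} of {len(candidates)} candidate trajectories")
```

In a real deployment the kept trajectories would feed target-policy training, and the threshold would need tuning against how many noisy samples the policy can tolerate.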
The previous pain point was that automatically generated data was too noisy, and failed trajectories could steer the policy off course. The evidence now shows that with only 4 seed demonstrations, average success rate can rise from 22.18% to 68.57% (a gain of about 46 percentage points), while inference-time guidance can further substantially boost both success and safety. That suggests the chain of "bootstrapping from few demonstrations, automated data scaling, and online correction" is, for the first time, complete enough to build on.
What changed is that automated sampling is no longer limited to coarse data expansion: there are now parallelizable small-model collectors, large-model video verifiers, and a no-retraining inference-time guidance layer that can connect low-quality trajectory filtering with execution-time correction.
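To make "no-retraining inference-time guidance" concrete, the sketch below adjusts a policy's proposed action at execution time without touching the policy itself. It is a generic potential-field-style correction chosen for illustration, not the method of any specific system named in this brief. The 2D setup, `guided_action`, `safe_radius`, and `gain` are all hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def guided_action(position: Vec, proposed: Vec, obstacle: Vec,
                  safe_radius: float = 0.2, gain: float = 0.5) -> Vec:
    """Nudge a proposed 2D displacement away from an obstacle.

    Assumption: the policy outputs a displacement, and the only
    guidance signal is distance to a single known obstacle.
    The policy weights are never modified.
    """
    next_x = position[0] + proposed[0]
    next_y = position[1] + proposed[1]
    dx = next_x - obstacle[0]
    dy = next_y - obstacle[1]
    dist = math.hypot(dx, dy)

    # Outside the safe radius (or exactly on the obstacle, where the
    # push direction is undefined) the action passes through unchanged.
    if dist >= safe_radius or dist == 0.0:
        return proposed

    # Push along the obstacle-to-next-position direction, scaled by
    # how deep the next position sits inside the safe radius.
    push = gain * (safe_radius - dist) / dist
    return (proposed[0] + push * dx, proposed[1] + push * dy)


if __name__ == "__main__":
    # Obstacle far away: the action is unchanged.
    print(guided_action((0.0, 0.0), (0.1, 0.0), (5.0, 5.0)))
    # Obstacle directly ahead: the forward step is shortened.
    print(guided_action((0.0, 0.0), (0.1, 0.0), (0.2, 0.0)))
```

The design point is that the correction wraps the policy output, so the same guardrail can sit behind any collector or target policy.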
Find 2–3 manipulation teams that have at most 10 demonstrations per task, and run pilots on grasping, stacking, and opening/closing tasks. Compare four groups: manual data expansion, automated collection only, automated collection + verification, and automated collection + verification + inference-time guidance. The goal is to verify within two weeks whether the cost per successful sample can drop by at least 50%.
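The pilot's pass/fail check can be sketched as a small calculation. The `ArmResult` type, the choice of manual data expansion as the baseline, and all numbers in the usage example are illustrative assumptions, not measured results.

```python
from dataclasses import dataclass


@dataclass
class ArmResult:
    name: str
    total_cost: float        # e.g. operator time plus compute, in one unit
    successful_samples: int  # samples that passed the success criterion

    @property
    def cost_per_success(self) -> float:
        """Cost per successful sample; infinite if nothing succeeded."""
        if self.successful_samples == 0:
            return float("inf")
        return self.total_cost / self.successful_samples


def meets_target(baseline: ArmResult, candidate: ArmResult,
                 min_reduction: float = 0.5) -> bool:
    """True if the candidate cuts cost per success by at least min_reduction."""
    base = baseline.cost_per_success
    if base == 0.0 or base == float("inf"):
        return False  # no meaningful baseline to compare against
    reduction = 1.0 - candidate.cost_per_success / base
    return reduction >= min_reduction


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    manual = ArmResult("manual data expansion", 1000.0, 100)
    full_loop = ArmResult("collection + verification + guidance", 800.0, 200)
    print(manual.cost_per_success, full_loop.cost_per_success)
    print(meets_target(manual, full_loop))
```

Running the same check for each of the four groups against one shared baseline keeps the comparison consistent across teams and tasks.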
- Seed2Scale: A Self-Evolving Data Engine for Embodied AI via Small to Large Model Synergy and Multimodal Evaluation: With very few seed demonstrations, the closed loop of "small-model collection + large-model verification + target-policy learning" can significantly improve success rates, showing that automated data generation plus quality filtering has reached a viable starting point for productization.
- OmniGuide: Universal Guidance Fields for Enhancing Generalist Robot Policies: Inference-time guidance can significantly improve success rate and safety without retraining or adding robot data, making it well suited as an online guardrail and correction layer after automated collection.