A development-process data reconstruction toolchain for code-agent training
Give code-intelligence teams a “development trajectory data factory”: a toolchain that reconstructs existing repositories and CI records into process samples covering requirements, localization, reading, editing, debugging, and validation, and emits them in formats usable for training, offline evaluation, and replay auditing.
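As a rough illustration of what one such process sample could look like, here is a minimal sketch of a trajectory record serialized as one JSON line. The class names (`Step`, `Trajectory`), the field names, and the choice of JSON Lines are assumptions made for this example, not a format defined by the proposal or by either cited paper; only the six step kinds come from the text above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

# Step kinds mirror the phases named in the text; everything else is hypothetical.
STEP_KINDS = ("requirement", "localize", "read", "edit", "debug", "validate")


@dataclass
class Step:
    kind: str          # one of STEP_KINDS
    target: str        # illustrative: a file path, test id, or issue id
    payload: str = ""  # illustrative: diff text, log excerpt, or note

    def __post_init__(self) -> None:
        if self.kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {self.kind}")


@dataclass
class Trajectory:
    repo: str
    source_commit: str
    steps: List[Step] = field(default_factory=list)

    def to_jsonl_line(self) -> str:
        # asdict() recurses into the nested Step dataclasses.
        return json.dumps(asdict(self), sort_keys=True)


def from_jsonl_line(line: str) -> Trajectory:
    """Rebuild a Trajectory from one JSON line, re-validating each step kind."""
    raw = json.loads(line)
    steps = [Step(**s) for s in raw.pop("steps")]
    return Trajectory(steps=steps, **raw)
```

Keeping each trajectory on a single line makes the same file usable as a training shard, as an offline evaluation set, and as an append-only log that can be replayed step by step during an audit.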
Two things were previously missing: a scalable way to construct process data, and a training objective that can verify step quality. Both have now appeared at once, so “process data” is no longer just a research concept and can become a dedicated data layer for enterprise coding assistants.
On one side, Understanding by Reconstruction shows that roughly 4B tokens of development trajectories can be reverse-synthesized from about 300k repositories and that they improve long-context and code capabilities; on the other, ExecVerify shows that intermediate execution states can be white-box verified and used directly as rewards for reinforcement learning, instead of training the model merely to imitate explanation text.
Select 20–50 internal repositories with complete issue, PR, and CI records, and build a minimal reconstruction first: generate file-read orders, edit sequences, and trajectories that run from failing tests to fixed tests. Then use those trajectories to train a small patch ranker or localizer, and compare it against a baseline trained only on repository snapshots.
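The failing-test-to-fixed-test step and the baseline comparison can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `CIRun` record, its fields, and the helper names are hypothetical and stand in for whatever the internal CI system exposes; it assumes runs are already in commit order, and it treats files changed after the first failing run as the fix attempt. Recall@k is used here as one plausible localizer metric, not one prescribed by the proposal.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class CIRun:
    """Hypothetical, simplified CI record for one commit."""
    commit: str
    changed_files: Tuple[str, ...]
    failed_tests: FrozenSet[str]


def fail_to_fix_span(runs: Sequence[CIRun], test: str) -> Optional[List[CIRun]]:
    """Runs from the first failure of `test` through the first later run where it passes.

    Returns None if the test never fails or is never fixed.
    """
    start = None
    for i, run in enumerate(runs):
        if start is None and test in run.failed_tests:
            start = i
        elif start is not None and test not in run.failed_tests:
            return list(runs[start:i + 1])
    return None


def edit_sequence(span: Sequence[CIRun]) -> List[str]:
    """Files edited after the first failing run, in commit order, deduplicated."""
    seen: Set[str] = set()
    order: List[str] = []
    for run in span[1:]:  # assumption: span[0] is the run that exposed the failure
        for path in run.changed_files:
            if path not in seen:
                seen.add(path)
                order.append(path)
    return order


def recall_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> float:
    """Fraction of gold files that appear in the top-k ranked files."""
    if not gold:
        return 0.0
    return len(set(ranked[:k]) & gold) / len(gold)
```

With the edited files of each span as the gold set, the trajectory-trained localizer and the snapshot-only baseline can be scored with the same `recall_at_k` call on held-out spans, which keeps the comparison limited to the training data rather than the metric.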
- Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining: Repository-snapshot training is being supplanted by “reconstructed development trajectories,” which suggests that methods for building process data for agent training or evaluation are becoming clear and can operate at scale.
- ExecVerify: White-Box RL with Verifiable Stepwise Rewards for Code Execution Reasoning: Step-level verifiable rewards have already been shown to significantly improve code-execution reasoning and transfer to code generation, indicating that intermediate-state supervision is starting to have direct product value.