Recoleta Item Note

MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

domain-specific-agent, rag, code-generation, scientific-simulation, tool-augmented-llm

Summary

MOOSEnger is a domain-specific AI agent for the MOOSE multiphysics simulation ecosystem, designed to convert natural-language requirements into runnable MOOSE input files. It combines retrieval-augmented generation, deterministic prechecks, and closed-loop validation with the MOOSE runtime in the loop, and raises the end-to-end execution pass rate from 0.06 for an LLM-only baseline to 0.90.

Problem

MOOSE ".i" input files use the strict, hierarchical HIT syntax, with many components and detailed rules, which makes it hard for newcomers to write correct, runnable configurations quickly.
A general-purpose LLM used on its own to generate this kind of domain-specific language readily produces formatting errors, broken syntactic structure, hallucinated object type names, and solver or configuration errors that only surface at runtime.
This matters because the first runnable example in multiphysics modeling is often the starting point for subsequent tuning, debugging, and scientific analysis; a high first-round failure rate slows both engineering and research work.

Approach

A core-plus-domain architecture separates general agent infrastructure from MOOSE-specific capabilities: the core layer handles configuration, tool registration, retrieval, persistence, and evaluation, while the MOOSE plugin layer handles HIT parsing, input file ingestion, type repair, and execution tools.
RAG is used to retrieve curated documentation, examples, and input files; for MOOSE ".i" files, it uses structure-preserving chunking based on HIT syntax blocks rather than ordinary text chunking, increasing the chance of retrieving the correct block.
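To illustrate what structure-preserving chunking could look like, here is a minimal sketch that splits a HIT input into one chunk per top-level block, so a retriever indexes a whole [Mesh] or [Variables] block rather than an arbitrary window of text. The function name chunk_hit_blocks, the regular expressions, and the returned dictionary layout are assumptions made for this example; they are not taken from the paper, and a real implementation would presumably use the HIT parser rather than line matching.

```python
import re

# Assumed header forms: [Name] or the older [./name] to open, [] or [../] to close.
OPEN = re.compile(r"^\[(?:\./)?([^\]/][^\]]*)\]$")
CLOSE = re.compile(r"^\[(?:\.\./)?\]$")


def chunk_hit_blocks(text):
    """Return one chunk per top-level HIT block, keeping nested sub-blocks intact."""
    chunks, current, name, depth = [], [], None, 0
    for line in text.splitlines():
        stripped = line.split("#", 1)[0].strip()  # ignore trailing comments
        if CLOSE.match(stripped):
            if depth == 0:
                continue  # stray closer; repairing it is out of scope here
            depth -= 1
            current.append(line)
            if depth == 0:  # the top-level block just ended
                chunks.append({"block": name, "text": "\n".join(current)})
                current, name = [], None
            continue
        match = OPEN.match(stripped)
        if match:
            if depth == 0:
                name = match.group(1)  # remember the top-level block name
            depth += 1
        if depth > 0:
            current.append(line)  # lines outside any block are dropped
    return chunks


SAMPLE = """\
[Mesh]
  type = GeneratedMesh
  dim = 2
[]

[Variables]
  [u]
    order = FIRST
  []
[]
"""

for chunk in chunk_hit_blocks(SAMPLE):
    print(chunk["block"], len(chunk["text"].splitlines()))
```

Because each chunk carries its block name, that name can be stored as retrieval metadata. Lines outside any block and unterminated blocks are silently dropped in this sketch.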
A deterministic input-precheck pipeline is added after generation: it cleans hidden formatting contamination, repairs malformed HIT structures through a bounded, grammar-constrained loop, and corrects invalid object/type names using context-conditioned similarity search over the application syntax registry.
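The type-name correction step can be pictured with a small sketch. It assumes a tiny hand-written registry excerpt keyed by block name and uses string similarity from Python's difflib as a stand-in; the paper's actual similarity measure and registry format are not given in this summary, so REGISTRY, repair_type, and the cutoff value are illustrative only.

```python
import difflib

# Hypothetical excerpt of an application syntax registry: block -> valid type names.
REGISTRY = {
    "Kernels": ["Diffusion", "TimeDerivative", "HeatConduction", "BodyForce"],
    "BCs": ["DirichletBC", "NeumannBC", "FunctionDirichletBC"],
    "Executioner": ["Steady", "Transient"],
}


def repair_type(block, name, cutoff=0.6):
    """Return a valid type name for `block`, or None if nothing is close enough.

    Candidates are restricted to the enclosing block (the "context"), so a
    boundary-condition name is never suggested inside [Kernels].
    """
    candidates = REGISTRY.get(block, [])
    if name in candidates:
        return name  # already valid, nothing to repair
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(repair_type("Kernels", "HeatConducton"))
print(repair_type("BCs", "DirichletBoundary"))
print(repair_type("Executioner", "DirichletBoundary"))
```

Returning None instead of forcing a match keeps the step deterministic and conservative: a name with no close candidate in its block is left for a later stage to handle.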
The MOOSE executable is placed into the loop through an MCP-backed or local backend: it first validates and then optionally performs a smoke test, feeding solver errors and logs back to the agent for iterative “verify-and-correct” repair.
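The verify-and-correct cycle described above can be sketched as a bounded loop. The validate and revise callables below are placeholders for the execution backend and the agent's repair step; their signatures, the toy validator, and its error message are invented for this example and are not real MOOSE output or a documented MOOSEnger API.

```python
def verify_and_correct(draft, validate, revise, max_rounds=3):
    """Validate a draft input up to max_rounds times, revising after each failure.

    validate(text) -> (ok, log); revise(text, log) -> new text.
    Returns (final_text, ok, history).
    """
    text, history = draft, []
    for round_no in range(1, max_rounds + 1):
        ok, log = validate(text)
        history.append((round_no, ok, log))
        if ok:
            return text, True, history
        if round_no < max_rounds:  # only revise if another check will follow
            text = revise(text, log)
    return text, False, history


# Toy stand-ins (assumptions): a real backend would run the MOOSE executable.
def toy_validate(text):
    if "[Executioner]" not in text:
        return False, "missing required block: Executioner"
    return True, ""


def toy_revise(text, log):
    if "Executioner" in log:
        return text + "\n[Executioner]\n  type = Steady\n[]\n"
    return text


final_text, ok, history = verify_and_correct("[Mesh]\n[]\n", toy_validate, toy_revise)
print(ok, len(history))
```

Capping the number of rounds keeps the loop from cycling indefinitely on an input the agent cannot repair, and the history gives an evaluation harness the per-round logs.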
Built-in evaluation covers both retrieval quality (faithfulness, answer relevancy, context precision/recall) and end-to-end execution success rate, using actual execution results to measure whether the system truly generated usable inputs.

Results

Evaluation was conducted on a benchmark covering 175 prompts, with tasks spanning 7 MOOSE physics families: diffusion, transient heat conduction, solid mechanics, porous flow, incompressible Navier–Stokes, phase field, and plasticity.
MOOSEnger achieves an end-to-end execution pass rate of 0.90, versus 0.06 for the LLM-only baseline: an absolute improvement of 0.84, or 15 times the baseline rate.
The paper mainly attributes this improvement to the combination of three mechanisms: retrieval augmentation, deterministic precheck repair, and involving the MOOSE runtime in validation and iterative error correction.
The paper also claims the system can evaluate RAG faithfulness, relevancy, and precision/recall, but the provided excerpt does not include the specific values for these metrics.

Link

http://arxiv.org/abs/2603.04756v2
