SUREFlow

State-space Uncertainty-aware REsidual Flow Matching for robust robot manipulation. A Mamba-backbone VLA policy that predicts both action velocities and input-dependent uncertainty — selectively refining unreliable action dimensions during inference without environment feedback.

IROS ’26 · SUBMITTED · FLOW MATCHING · MAMBA · VLA


Flow Matching · Mamba SSM · VLA · Uncertainty Estimation · LIBERO · LIBERO-PRO · PyTorch · Franka Panda

```

## TL;DR

SUREFlow is a generative robot-manipulation policy that closes the gap between **diffusion/flow** action models and **reliable execution** during long rollouts.

- **92.6 %** average success rate on **LIBERO**, outperforming the Mamba-based MaIL baseline by **+34.3 %**.
- **~50 %** success rate on **LIBERO-PRO** with only **179 M parameters**, comparable to 3–7 B VLAs.
- Built on a **Mamba** backbone (state-space sequence modeling, linear-time inference).
- Adds **input-dependent uncertainty** + **residual refinement** without environment feedback.

## What Problem It Solves

Generative VLA policies (diffusion / flow matching) have advanced robot manipulation, but they often **wobble under noise**, partial observability, and stochastic initial conditions. Tiny velocity errors **accumulate over long rollouts**, eroding success rates.

Existing diffusion- and flow-based policies typically assume **homoscedastic residuals**: they treat prediction error as equally uncertain everywhere and ignore that some action dimensions are inherently harder to predict than others. The result is brittle one-shot predictions, error accumulation, and unreliable extended-horizon control.

## How It Works

```{raw} html

Figure: SUREFlow overview, closed-loop residual refinement. Uncertain action dimensions are refined in a closed loop through internal residual updates during inference, without external feedback. Right: LIBERO results versus state-of-the-art baselines.

```

SUREFlow combines three ideas into one lightweight policy:

1. **Conditional flow matching**: learns a velocity field that transports Gaussian noise toward expert action distributions, conditioned on multi-view RGB observations, robot proprioception, and language task embeddings.
2. **Uncertainty-aware Residual Flow (URFlow)**: an auxiliary head predicts input-dependent variance over the velocity field. During inference, this signal **selectively re-refines** only the unreliable action dimensions through internal residual updates, with no environment feedback or planner required.
3. **Memory-Guided Action Decoder (MGAD)**: re-attends learnable action queries to multimodal memory representations, improving temporal conditioning and structured action generation.

All three modules sit on top of a single **Mamba** state-space backbone that is linear-time, scalable, and far lighter than transformer-based 3–7 B VLAs.

## Results

```{raw} html

92.6% LIBERO · avg success rate

+34.3% vs Mamba MaIL baseline

179M parameters · lightweight

~50% LIBERO-PRO · matches 3-7B VLAs

```

## Why It's Interesting

- **No environment feedback needed at inference.** Refinement happens entirely inside the policy using its own uncertainty signal, which is practical for real robots where feedback loops are expensive.
- **State-space backbone instead of giant transformers.** With 179 M parameters, SUREFlow reaches success rates on LIBERO-PRO comparable to 3–7 B VLAs.
- **Probabilistic regularization preserves the flow-matching objective.** It adds robustness without breaking the underlying generative formulation.

## My Contribution

Co-author with **Md Tanvir Islam**, Sangmoon Lee, and Sangtae Ahn at **Kyungpook National University**. I contributed to the flow-matching policy architecture, the Mamba-backbone integration, the evaluation pipeline on LIBERO / LIBERO-PRO, and the ablations that quantify the impact of URFlow and MGAD.

```{raw} html
```
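To make the URFlow idea from How It Works concrete, here is a minimal toy sketch in plain Python. It Euler-integrates a velocity field from noise toward an action, then applies extra residual updates only to the dimensions whose predicted variance exceeds a threshold. Everything in it is an assumption for illustration: `toy_policy`, `integrate`, `refine`, the constants, the threshold, and the update rule itself are hypothetical stand-ins, not SUREFlow's network or its actual refinement procedure.

```python
import random

# Hypothetical stand-in for a trained policy. All values are illustrative
# and are NOT taken from SUREFlow.
TARGET = [0.5, -0.2, 0.8, 0.1]        # "expert" action the flow should reach
SCALE = [1.0, 1.0, 0.6, 1.0]          # dim 2 systematically under-predicts velocity
VARIANCE = [0.01, 0.01, 0.50, 0.01]   # variance head flags dim 2 as unreliable


def toy_policy(x, t):
    """Return (velocity, variance) for action sample x at flow time t in [0, 1)."""
    velocity = [s * (a - xi) / (1.0 - t) for s, a, xi in zip(SCALE, TARGET, x)]
    return velocity, list(VARIANCE)


def integrate(x, num_steps=10):
    """Euler-integrate dx/dt = v(x, t) from t = 0 (noise) toward t = 1 (action)."""
    dt = 1.0 / num_steps
    for k in range(num_steps):
        velocity, _ = toy_policy(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, velocity)]
    return x


def refine(x, threshold=0.1, passes=3, t_ref=0.9):
    """Apply extra residual updates only where predicted variance is high."""
    for _ in range(passes):
        velocity, variance = toy_policy(x, t_ref)
        x = [
            # Low-variance dimensions are passed through untouched.
            xi + (1.0 - t_ref) * vi if var > threshold else xi
            for xi, vi, var in zip(x, velocity, variance)
        ]
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in TARGET]   # start from Gaussian noise
coarse = integrate(noise)                       # one-shot flow rollout
refined = refine(coarse)                        # uncertainty-gated refinement

coarse_err = [abs(a - c) for a, c in zip(TARGET, coarse)]
refined_err = [abs(a - r) for a, r in zip(TARGET, refined)]
print("error before refinement:", coarse_err)
print("error after refinement: ", refined_err)
```

In this toy setup only the flagged dimension changes during refinement; the others are left exactly as the base integration produced them, which is the selective behaviour described above. A real policy would replace `toy_policy` with the learned velocity and variance heads, and no environment query appears anywhere in the loop.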