SUREFlow#

SUBMITTED · IROS 2026 co-author with Md Tanvir Islam et al. · KNU

SUREFlow

State-space Uncertainty-aware REsidual Flow Matching for robust robot manipulation. A Mamba-backbone VLA policy that predicts both action velocities and input-dependent uncertainty — selectively refining unreliable action dimensions during inference without environment feedback.

FLOW MATCHING MAMBA · VLA
Flow Matching Mamba SSM VLA Uncertainty Estimation LIBERO LIBERO-PRO PyTorch Franka Panda
SUREFlow architecture diagram

TL;DR#

SUREFlow is a generative robot-manipulation policy that closes the gap between diffusion/flow action models and reliable execution during long rollouts.

  • 92.6 % average success rate on LIBERO — outperforms the Mamba-based MaIL baseline by +34.3 %.

  • ~50 % success rate on LIBERO-PRO with only 179 M parameters — comparable to 3–7 B VLAs.

  • Built on a Mamba backbone (state-space sequence modeling, linear-time inference).

  • Adds input-dependent uncertainty + residual refinement without environment feedback.

What Problem It Solves#

Generative VLA policies (diffusion / flow matching) advanced robot manipulation, but they often wobble under noise, partial observability, and stochastic initial conditions. Tiny velocity errors accumulate over long rollouts, eroding success rates.

Existing diffusion- and flow-based policies typically assume homoscedastic residuals — they ignore that some action dimensions are inherently harder to predict than others. The result: brittle one-shot predictions, error accumulation, and unreliable extended-horizon control.

How It Works#

SUREFlow overview · closed-loop residual refinement
Closed-loop refinement of uncertain action dimensions via internal residual updates during inference, without external feedback. Right: LIBERO results vs SOTA baselines.

SUREFlow combines three ideas into one lightweight policy:

  1. Conditional flow matching — learns a velocity field that transports Gaussian noise toward expert action distributions, conditioned on multi-view RGB observations, robot proprioception, and language task embeddings.

  2. Uncertainty-aware Residual Flow (URFlow) — an auxiliary head predicts input-dependent variance over the velocity field. During inference, this signal selectively re-refines only the unreliable action dimensions through internal residual updates — no environment feedback or planner required.

  3. Memory-Guided Action Decoder (MGAD) — re-attends learnable action queries to multimodal memory representations, improving temporal conditioning and structured action generation.

All three modules live on top of a single Mamba state-space backbone — linear-time, scalable, and far lighter than transformer-based 3–7 B VLAs.

Results#

92.6% LIBERO · avg success rate
+34.3% vs Mamba MaIL baseline
179M parameters · lightweight
~50% LIBERO-PRO · matches 3-7B VLAs

Why It’s Interesting#

  • No environment feedback needed at inference. Refinement happens entirely inside the policy using its own uncertainty signal — practical for real robots where feedback loops are expensive.

  • State-space backbone instead of giant transformers. SUREFlow gets foundation-model-level performance at a tiny fraction of the parameter count.

  • Probabilistic regularization preserves the flow-matching objective. Adds robustness without breaking the underlying generative formulation.

My Contribution#

Co-author with Md Tanvir Islam, Sangmoon Lee, and Sangtae Ahn at Kyungpook National University. I contributed to the flow-matching policy architecture, the Mamba-backbone integration, evaluation pipeline on LIBERO / LIBERO-PRO, and the ablations that quantify the impact of URFlow and MGAD.