Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision

1 KTH Royal Institute of Technology, 2 Carnegie Mellon University

Accepted to CoRL 2024

Abstract

We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differentiable map between the cloth's state space and the image space. This enables the use of gradient-based optimization techniques to refine inaccurate state estimates using only RGB supervision. Our experiments demonstrate that Cloth-Splatting not only improves state estimation accuracy over current baselines but also reduces convergence time by ~85%.

Method Overview

We propose Cloth-Splatting, a method to estimate 3D states of cloth from RGB supervision by combining 3D Gaussian Splatting (GS) with an action-conditioned dynamics model. The key idea of our method is to represent the 3D state of the cloth as a mesh and create a differentiable mapping between the cloth state space and the observation space using GS. This is achieved by populating the mesh faces with 3D Gaussians and expressing their positions relative to the mesh vertices. Given this mapping, we can address the problem of estimating the 3D state of the cloth using a prediction-update framework akin to Bayesian filtering. Starting with a previous state estimate and a known robotic action, Cloth-Splatting predicts the next state using a learned dynamics model of the cloth (left, yellow). This prediction is then updated using RGB observations (right, green) by minimizing the rendering loss provided by GS, which refines the state estimate using visual cues such as texture and geometry.
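To make the prediction-update idea concrete, the following is a minimal toy sketch in plain Python, not the authors' implementation. It assumes each Gaussian center is a fixed barycentric combination of the vertices of its mesh face, which is one simple way to express Gaussian positions relative to the mesh. The learned dynamics model is replaced by a hypothetical stand-in that only translates the grasped vertex, and the GS rendering loss is replaced by a squared error on Gaussian centers. All function names and parameters (`gaussian_centers`, `predict`, `loss_and_grad`, `update`, `lr`, `steps`) are illustrative assumptions.

```python
# Toy sketch only: names and simplifications are assumptions, not the paper's code.

def gaussian_centers(vertices, faces, bary):
    """Place each Gaussian at a barycentric combination of its face's vertices."""
    centers = []
    for face_idx, weights in bary:
        tri = faces[face_idx]
        centers.append([sum(w * vertices[v][k] for w, v in zip(weights, tri))
                        for k in range(3)])
    return centers

def predict(vertices, grasped, action):
    """Stand-in for the learned dynamics model: translate the grasped vertex."""
    out = [list(v) for v in vertices]
    out[grasped] = [p + a for p, a in zip(out[grasped], action)]
    return out

def loss_and_grad(vertices, faces, bary, observed):
    """Squared error on Gaussian centers (a proxy for the rendering loss)
    and its gradient with respect to the mesh vertices."""
    centers = gaussian_centers(vertices, faces, bary)
    grad = [[0.0] * 3 for _ in vertices]
    loss = 0.0
    for (face_idx, weights), c, o in zip(bary, centers, observed):
        for k in range(3):
            r = c[k] - o[k]
            loss += r * r
            # The center is linear in the vertices, so the chain rule
            # distributes the residual to each vertex by its weight.
            for w, v in zip(weights, faces[face_idx]):
                grad[v][k] += 2.0 * r * w
    return loss, grad

def update(vertices, faces, bary, observed, lr=0.5, steps=200):
    """Refine the predicted vertices by plain gradient descent."""
    verts = [list(v) for v in vertices]
    for _ in range(steps):
        _, grad = loss_and_grad(verts, faces, bary, observed)
        verts = [[p - lr * g for p, g in zip(v, gv)]
                 for v, gv in zip(verts, grad)]
    return verts

# One triangle carrying three Gaussians.
faces = [(0, 1, 2)]
bary = [(0, (0.6, 0.2, 0.2)), (0, (0.2, 0.6, 0.2)), (0, (0.2, 0.2, 0.6))]
previous = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
true_next = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.1, 0.3]]

observed = gaussian_centers(true_next, faces, bary)              # "observation"
predicted = predict(previous, grasped=2, action=(0.0, 0.0, 0.3))  # prediction step
estimate = update(predicted, faces, bary, observed)               # update step
```

In this toy setting the prediction misses the sideways motion of the grasped vertex, and the update step recovers it because the loss gradient flows from the Gaussian centers back to the mesh vertices. In the actual method the observation is an RGB image and the gradient passes through the differentiable GS renderer rather than through a direct comparison of centers.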

Real World Results

Video Results

BibTeX

@inproceedings{
  longhini2024clothsplatting,
  title={Cloth-Splatting: 3D State Estimation from {RGB} Supervision for Deformable Objects},
  author={Alberta Longhini and Marcel B{\"u}sching and Bardienus Pieter Duisterhof and Jens Lundell and Jeffrey Ichnowski and M{\r{a}}rten Bj{\"o}rkman and Danica Kragic},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024},
  url={https://openreview.net/forum?id=WmWbswjTsi}
}

Acknowledgments

This work was supported by the Swedish Research Council; the Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation; the European Research Council (ERC-884807); and the Center for Machine Learning and Health (CMLH). The computations were enabled by the Pittsburgh Supercomputing Center and by the Berzelius resource provided by the Knut and Alice Wallenberg Foundation at the Swedish National Supercomputer Centre.