Ground truth and the corresponding reconstructions by our method for the "pulling" and "cutting" clips.
Reconstructing deformable tissues from endoscopic stereo videos is essential in many downstream surgical applications. However, existing methods suffer from slow inference speed, which greatly limits their practical use.
In this paper, we introduce EndoGaussian, a real-time surgical scene reconstruction framework that builds on 3D Gaussian Splatting. Our framework represents dynamic surgical scenes as canonical Gaussians and a time-dependent deformation field, which predicts Gaussian deformations at novel timestamps. Thanks to the efficient Gaussian representation and parallel rendering pipeline, our framework renders significantly faster than previous methods. In addition, we design the deformation field as the combination of a lightweight voxel encoding and an extremely tiny MLP, allowing for efficient Gaussian tracking with a minor rendering burden. Furthermore, we design a holistic Gaussian initialization method that fully leverages the surface distribution prior by searching for informative points across the input image sequence.
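The split into canonical Gaussians plus a time-dependent deformation field can be illustrated with a small NumPy sketch. This is not the EndoGaussian implementation: the class `ToyDeformationField`, the dense nearest-neighbour grid, the layer sizes, and the random weights are all assumptions made for illustration, and only position offsets are predicted, whereas a full model would also deform other Gaussian attributes.

```python
import numpy as np

rng = np.random.default_rng(0)


class ToyDeformationField:
    """Hypothetical stand-in: a coarse (x, y, z, t) feature grid plus a tiny MLP."""

    def __init__(self, grid_res=8, time_res=4, feat_dim=4, hidden=16):
        self.grid_res = grid_res
        self.time_res = time_res
        # Feature grid indexed by voxel and time bin (random here, learned in practice).
        self.grid = rng.normal(0.0, 0.1, size=(grid_res, grid_res, grid_res, time_res, feat_dim))
        # One hidden layer is enough to show the "tiny MLP" idea.
        self.w1 = rng.normal(0.0, 0.1, size=(feat_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, size=(hidden, 3))

    def encode(self, xyz, t):
        # Nearest-neighbour lookup; assumes xyz in [0, 1]^3 and t in [0, 1].
        idx = np.clip((xyz * self.grid_res).astype(int), 0, self.grid_res - 1)
        t_idx = min(int(t * self.time_res), self.time_res - 1)
        return self.grid[idx[:, 0], idx[:, 1], idx[:, 2], t_idx]

    def __call__(self, xyz, t):
        features = self.encode(xyz, t)
        hidden = np.maximum(features @ self.w1, 0.0)  # ReLU
        return hidden @ self.w2  # per-Gaussian position offset


def deform(canonical_xyz, field, t):
    """Move canonical Gaussian centres to their positions at time t."""
    return canonical_xyz + field(canonical_xyz, t)


canonical_xyz = rng.random((5, 3))  # five canonical Gaussian centres
field = ToyDeformationField()
print(deform(canonical_xyz, field, t=0.5).shape)
```

The point of the design is that the canonical set is stored once, and rendering a new timestamp only costs a feature lookup and a small matrix product per Gaussian.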
Experiments on public endoscope datasets demonstrate that our method achieves real-time rendering (195 FPS, a 100x gain) while maintaining state-of-the-art reconstruction quality (35.925 PSNR) and the fastest training speed (within 2 min/scene), showing significant promise for intraoperative surgery applications.
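For readers unfamiliar with the quality metric, PSNR is defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the rendered and reference images. The helper below is a minimal sketch of that standard definition; the function name `psnr`, the flat-list input, and the default value range of [0, 1] are assumptions, and the evaluation protocol behind the reported numbers (for example any masking) is not described here.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01.
print(psnr([0.0, 0.0, 0.0], [0.1, 0.1, 0.1]))
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold reduction in mean squared error.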
Using EndoGaussian, you can perform deformable surgical scene reconstruction at nearly 200 FPS and 35+ PSNR.
Illustration of the proposed EndoGaussian framework, which consists of a) Holistic Gaussian Initialization, b) Voxel-based Gaussian Tracking, and c) Optimization.
Rendered results of EndoGaussian compared with prior SOTA methods on surgical scene reconstruction. The rendering FPS, training time, and image quality in PSNR are provided.
@article{liu2024endogaussian,
title={EndoGaussian: Gaussian Splatting for Deformable Surgical Scene Reconstruction},
author={Liu, Yifan and Li, Chenxin and Yang, Chen and Yuan, Yixuan},
journal={arXiv preprint arXiv:2401.12561},
year={2024}
}
A pioneering exploration into high-fidelity medical video generation on endoscopy scenes.
An innovative enhancement of U-Net for medical image tasks using Kolmogorov-Arnold Network (KAN).