SegviGen
Repurposing 3D Generative Model for Part Segmentation
Lin Li2*
Haoran Feng3*
Zehuan Huang1†
Haohua Chen1
Wenbo Nie1
Shaohua Hou1
Keqing Fan1
Pan Hu1
Sheng Wang4
Buyu Li4
Lu Sheng1✉
1Beihang University
2Renmin University of China
3Tsinghua University
4OriginArk
*Equal Contribution
†Project Lead
✉Corresponding Author

Thanks to Hao Xu for sharing the Blender source files that helped us build this demo.

TL;DR: A 3D part segmentation framework that repurposes the structured priors in pretrained native 3D generative models, supports multiple segmentation tasks, and achieves state-of-the-art results with minimal supervision.
We introduce SegviGen, a framework that repurposes native 3D generative models for 3D part segmentation. Existing pipelines either lift strong 2D priors into 3D via distillation or multi-view mask aggregation, often suffering from cross-view inconsistency and blurred boundaries, or explore native 3D discriminative segmentation, which typically requires large-scale annotated 3D data and substantial training resources. In contrast, SegviGen leverages the structured priors encoded in pretrained 3D generative models to induce segmentation through distinctive part colorization, establishing a novel and efficient framework for part segmentation. Specifically, SegviGen encodes an input 3D asset and predicts part-indicative colors on the active voxels of a geometry-aligned reconstruction. It supports interactive part segmentation, full segmentation, and full segmentation with 2D guidance in a unified framework. Extensive experiments show that SegviGen improves over the prior state of the art by 40% on interactive part segmentation and by 15% on full segmentation, while using only 0.32% of the training data. This demonstrates that pretrained 3D generative priors transfer effectively to 3D part segmentation, enabling strong performance with limited supervision.
Interactive Demo
Segmentation | Interactive Segmentation

SegviGen supports interactive segmentation, where users specify sparse spatial points to indicate the target region and the system isolates the corresponding part accordingly. Click on the cards to view extracted GLB files.

Segmentation | Full Segmentation

SegviGen supports full-part 3D segmentation, decomposing a single 3D model into complete part-level components. Click on the cards to view extracted GLB files.

Segmentation | Full Segmentation with 2D Guidance

SegviGen also supports full-part 3D segmentation conditioned on 2D segmentation maps, which provide explicit spatial priors for user-guided, more precise, and customizable part-level segmentation. Click on the cards to view extracted GLB files.

Methodology

Pipeline of the method

SegviGen reformulates 3D part segmentation as part-wise colorization on a structured 3D representation: instead of predicting discrete part IDs, it predicts part-indicative voxel colors that align naturally with the pretrained generative prior. Different segmentation settings are instantiated by how the colorization target is constructed. For interactive segmentation, user-provided clicks specify the target part and we use a binary colorization. For full segmentation, each part is assigned a distinct color using a randomly sampled palette, and we further use multiple random color assignments during training to prevent the model from overfitting to any specific color-part correspondence. For 2D-guided full segmentation, a rendered 2D part-color map provides explicit guidance, and the 3D colorization is generated to be consistent with the 2D assignments.
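The colorization targets above can be sketched in a few lines. This is a minimal NumPy illustration with our own helper names; the binary black/white scheme and the continuous random palette are assumptions beyond what the text specifies:

```python
import numpy as np

def random_part_colorization(part_ids: np.ndarray, num_parts: int, rng=None):
    """Full-segmentation target: assign each part a distinct random RGB color.

    part_ids: (N,) integer part label per active voxel.
    Returns (N, 3) float colors in [0, 1]. A fresh palette is sampled on
    every call, so repeated sampling during training exposes the model to
    many different color-part correspondences (continuous random colors
    make palette collisions vanishingly unlikely).
    """
    rng = rng or np.random.default_rng()
    palette = rng.random((num_parts, 3))  # one random RGB color per part
    return palette[part_ids]

def binary_colorization(part_ids: np.ndarray, target_part: int):
    """Interactive-segmentation target: target part white, everything else black."""
    mask = (part_ids == target_part).astype(np.float32)
    return np.stack([mask, mask, mask], axis=-1)
```

Resampling the palette per training example is the design choice that keeps the model tied to "parts get distinct colors" rather than to any fixed color semantics.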

Given an input asset, we encode it into a geometry latent z that anchors generation to the underlying shape and fixes the active-voxel support; the colorization target is encoded as a color latent y, and generation proceeds by denoising a noisy state yt conditioned on z. User interaction is injected via point embeddings: each click is encoded as a 3D coordinate token paired with a shared learnable point feature, and the resulting point embeddings are appended (with zero-padding to a fixed length) to provide a unified conditioning interface across tasks. For 2D guidance, the 2D segmentation map is encoded into guidance tokens and injected through cross-attention, enabling controllable granularity and palette behavior. Finally, a task embedding is fused with the timestep embedding to make the model explicitly task-aware under a shared parameterization.
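The click-conditioning interface can be sketched as follows. This is a hypothetical NumPy mock-up: the slot count, feature width, and the random stand-in for the learnable shared feature are all assumptions, not values from the paper:

```python
import numpy as np

MAX_CLICKS = 8   # assumed fixed padding length
COORD_DIM = 3
FEAT_DIM = 64    # assumed feature width

# Stand-in for the shared learnable point feature (a trained parameter in practice).
shared_point_feature = np.random.default_rng(0).standard_normal(FEAT_DIM)

def build_point_embeddings(clicks: np.ndarray) -> np.ndarray:
    """Pack user clicks into a fixed-length conditioning tensor.

    clicks: (k, 3) array of 3D click coordinates, k <= MAX_CLICKS.
    Returns (MAX_CLICKS, 3 + FEAT_DIM): each used row is [xyz | shared
    feature]; unused slots stay zero, giving every task the same-shaped
    conditioning input regardless of how many clicks were provided.
    """
    k = clicks.shape[0]
    tokens = np.zeros((MAX_CLICKS, COORD_DIM + FEAT_DIM))
    tokens[:k, :COORD_DIM] = clicks
    tokens[:k, COORD_DIM:] = shared_point_feature  # broadcast to all used rows
    return tokens
```

Zero-padding to a fixed length is what lets the no-click tasks (full segmentation, 2D-guided segmentation) share the same conditioning interface: they simply pass an all-zero tensor.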

Citation

If you find our work useful, please consider citing:

@article{li2026segvigenrepurposing3dgenerative,
  title={SegviGen: Repurposing 3D Generative Model for Part Segmentation},
  author={Lin Li and Haoran Feng and Zehuan Huang and Haohua Chen and Wenbo Nie and Shaohua Hou and Keqing Fan and Pan Hu and Sheng Wang and Buyu Li and Lu Sheng},
  journal={arXiv preprint arXiv:2603.16869},
  year={2026},
}

The website template is borrowed from TRELLIS.