PhD Thesis Defense - Yuzhong Huang
Tue, Dec 03, 2024 @ 09:00 AM - 11:00 AM
Thomas Lord Department of Computer Science
University Calendar
Committee Members: Fred Morstatter (Chair), Yue Wang, Aiichiro Nakano, Antonio Ortega
Title: Semantic Structure in Understanding and Generation of the 3D World
Abstract:
The ability to understand, generate, and modify 3D environments is foundational for applications such as virtual reality, autonomous driving, and generative AI tools. However, existing methods typically represent scenes as non-semantic point clouds, which capture only geometric information without semantic context. This limitation creates a significant gap in both interpretability and performance when compared to methods that leverage semantic information. Moreover, non-semantic approaches often struggle to scale effectively as complexity increases, underscoring the importance of incorporating semantic structures to enhance scalability and adaptability.
This dissertation addresses these limitations by introducing methods that emphasize controllable semantic structures in 3D understanding, generation, and editing. First, to improve 3D scene understanding, we propose plane-aware techniques, such as planar priors and plane-splatting volume rendering, which provide explicit geometric and semantic representations. These methods enable more accurate and interpretable reconstructions compared to traditional point-cloud-based approaches. Second, for 3D content generation, we develop an orientation-conditioned diffusion model, which allows precise control over the alignment and orientation of generated objects, enhancing flexibility and user interaction. Third, to facilitate intuitive editing of 3D environments, we introduce a method for projecting text-guided 2D segmentation maps onto 3D models, bridging the gap between semantic understanding and user-driven modification.
These contributions collectively address the semantic and performance gaps in 3D reconstruction and generation, demonstrating that the integration of semantic information not only improves interpretability and precision but also enables models to scale more effectively for complex applications. By combining controllable semantic structures with geometric understanding, this dissertation advances the state of the art in 3D vision and generation, paving the way for more scalable, interpretable, and interactive 3D workflows.
===============================
Time: Tuesday, December 3, 2024, 9:00 AM to 11:00 AM
Location: GCS | LL2 | SB-09
Zoom Link: https://usc.zoom.us/j/97579926743
Location: Ginsburg Hall (GCS) - SB-09
Audiences: Everyone Is Invited
Contact: Julia Mittenberg-Beirao
Event Link: https://usc.zoom.us/j/97579926743