Image credit: ©2024 ByteDance Inc. All rights reserved.
In the ever-evolving landscape of immersive technologies, Shaohui Jiao and team have pioneered a new approach with their geometry-enhanced 3D Gaussian Splatting (GS) method. Driven by the goal of creating more immersive livestreaming applications that blend traditional broadcasting with XR 3D content, their method not only retains the high-quality rendering and efficiency of GS but also introduces enhanced geometric accuracy and a tailored deferred rendering pipeline. Read on to learn more about this SIGGRAPH 2024 Posters contribution.
SIGGRAPH: What inspired you to develop the geometry-enhanced 3D Gaussian Splatting (GS) method, and how does it improve upon traditional techniques?
Shaohui Jiao (SJ): Our initial goal was to create a more immersive livestreaming application that combines traditional broadcasting with XR 3D content, such as virtual scenes. To achieve this, virtual scenes with photorealistic, real-time rendering are required, and one efficient way to acquire them is reconstruction. 3D Gaussian Splatting (GS) has gained popularity for its efficiency and high-fidelity novel view synthesis, but it still faces limitations compared to traditional mesh-based pipelines, particularly in supporting deferred rendering and in the geometric accuracy needed for post-processing tasks like relighting.
This motivated us to develop a geometry-enhanced GS method. Compared to traditional mesh-based pipelines, our approach retains the advantages of GS (e.g., high-quality rendering and rasterization efficiency) while introducing normal attributes and depth optimization to enhance geometric accuracy. Additionally, we designed a deferred rendering pipeline tailored for GS, enabling dynamic relighting, shadow mapping, and compatibility with commercial engines like Unity and Unreal Engine. These improvements address the challenges of integrating GS into practical workflows while maintaining real-time performance.
SIGGRAPH: Explain the key steps in your deferred rendering pipeline and how it ensures high-precision depth and normal information for real-time complex illumination effects.
SJ: Our deferred rendering pipeline consists of two core modules: Geometry-enhanced GS and deferred lighting. To ensure high-quality geometry for lighting calculations, we enhanced the 3D Gaussian Splatting (GS) representation by introducing normal attributes and optimizing depth through differentiable rendering. During training, we first modified the rasterization process to support the rendering of depth and normals. Rendered depth maps are then used to compute pseudo-normals via local gradient analysis, which supervise the learning of GS normal attributes through a loss function. Crucially, this self-supervised approach requires no additional supervision data or external modules.
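For readers who want a concrete picture of the pseudo-normal step, a minimal PyTorch-style sketch is given below. The pinhole intrinsics, the central-difference scheme, and the cosine-style loss are illustrative assumptions, not the exact formulation or rasterizer interface used in the poster.

```python
import torch
import torch.nn.functional as F

def depth_to_pseudo_normals(depth, fx, fy, cx, cy):
    """Back-project a rendered depth map to camera space and estimate
    per-pixel pseudo-normals from local gradients (finite differences).
    depth: (H, W) tensor; fx, fy, cx, cy: pinhole intrinsics (assumed)."""
    H, W = depth.shape
    v, u = torch.meshgrid(
        torch.arange(H, device=depth.device, dtype=depth.dtype),
        torch.arange(W, device=depth.device, dtype=depth.dtype),
        indexing="ij",
    )
    # Camera-space points: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    pts = torch.stack([X, Y, depth], dim=-1)            # (H, W, 3)

    # Central differences of the point map along the image axes
    dx = pts[:, 2:, :] - pts[:, :-2, :]                 # (H, W-2, 3)
    dy = pts[2:, :, :] - pts[:-2, :, :]                 # (H-2, W, 3)
    dx = F.pad(dx.permute(2, 0, 1), (1, 1, 0, 0)).permute(1, 2, 0)
    dy = F.pad(dy.permute(2, 0, 1), (0, 0, 1, 1)).permute(1, 2, 0)

    # Pseudo-normal = normalized cross product of the two tangents
    n = torch.cross(dx, dy, dim=-1)
    return F.normalize(n, dim=-1, eps=1e-8)

def normal_supervision_loss(rendered_normals, pseudo_normals):
    """Cosine-style loss pulling the rendered GS normal attributes toward
    the depth-derived pseudo-normals (self-supervised, no extra data)."""
    cos = (rendered_normals * pseudo_normals).sum(dim=-1)
    return (1.0 - cos).mean()
```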
We further applied smoothness constraints to depth and normals to mitigate artifacts. Additionally, we extended the rasterization pipeline to support orthographic projection (via Jacobian matrix adjustments), enabling accurate directional lighting. These steps ensure precise depth/normal reconstruction while maintaining real-time performance for complex illumination effects.
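The smoothness constraints could be imagined as a simple edge-aware regularizer applied to the rendered depth and normal maps, as in the sketch below; the loss form, weights, and edge-sensitivity parameter are assumptions for illustration rather than the authors' exact terms.

```python
import torch

def edge_aware_smoothness(pred, image, gamma=10.0):
    """Edge-aware smoothness on a rendered map (depth or normals):
    penalize local gradients, downweighted at image edges so that
    legitimate depth discontinuities are not blurred away.
    pred:  (H, W, C) rendered depth (C=1) or normal (C=3) map
    image: (H, W, 3) ground-truth RGB used for the edge weights
    gamma: assumed edge-sensitivity hyperparameter."""
    # Gradients of the prediction along x and y
    dp_dx = (pred[:, 1:, :] - pred[:, :-1, :]).abs()
    dp_dy = (pred[1:, :, :] - pred[:-1, :, :]).abs()

    # Image gradients -> weights close to 0 at strong edges
    di_dx = (image[:, 1:, :] - image[:, :-1, :]).abs().mean(-1, keepdim=True)
    di_dy = (image[1:, :, :] - image[:-1, :, :]).abs().mean(-1, keepdim=True)
    w_x = torch.exp(-gamma * di_dx)
    w_y = torch.exp(-gamma * di_dy)

    return (w_x * dp_dx).mean() + (w_y * dp_dy).mean()

# Example usage (weights assumed): add to the photometric loss, e.g.
# loss = l1_rgb + 0.05 * edge_aware_smoothness(depth[..., None], gt_rgb) \
#               + 0.05 * edge_aware_smoothness(normals, gt_rgb)
```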
SIGGRAPH: What are the unique advantages of your method in rendering speed and image quality, especially in handling complex lighting scenarios?
SJ: For rendering performance, our method preserves the efficiency of the original GS rasterization pipeline without introducing significant computational overhead. The deferred rendering stages are optimized for modern GPUs, ensuring real-time performance even with complex lighting.
For image quality, we enforce geometric constraints (e.g., depth and normal regularization) to reduce artifacts like floaters and texture holes, ensuring consistency across novel views. The deferred rendering pipeline further enables advanced effects like dynamic shadows, directional lighting, and post-processing (e.g., LUT filters), enhancing visual realism.
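To make the deferred stage concrete, here is a rough sketch of a single directional-light shading pass over a GS "G-buffer" of color and normals, with an optional shadow-map visibility term; the buffer layout, the Lambertian light model, and all parameter names are illustrative assumptions rather than the production shader.

```python
import torch.nn.functional as F

def deferred_directional_light(color, normal, light_dir, light_rgb,
                               shadow=None, ambient=0.1):
    """Minimal deferred shading over a GS 'G-buffer' (all names assumed):
    color:  (H, W, 3) splatted radiance, reused here as a stand-in albedo
    normal: (H, W, 3) rendered GS normals (unit length, world space)
    light_dir: (3,) direction toward the light; light_rgb: (3,) intensity
    shadow: optional (H, W, 1) visibility in [0, 1] from a shadow-map test."""
    l = F.normalize(light_dir, dim=0)
    # Per-pixel Lambertian term, optionally attenuated by shadow visibility
    ndotl = (normal * l).sum(dim=-1, keepdim=True).clamp_min(0.0)
    if shadow is not None:
        ndotl = ndotl * shadow
    shading = ambient + ndotl * light_rgb
    return (color * shading).clamp(0.0, 1.0)
```

Post-processing such as LUT filters would then run on the shaded image as an ordinary screen-space pass.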
In complex lighting scenarios, our pipeline outperforms prior GS renderers by accurately handling occlusion and multi-light interactions, achieving photorealistic results.
SIGGRAPH: How does your method enhance the visual experience in XR/VR applications? Can you provide an example of its use in live broadcasting?
SJ: Our method enhances XR/VR experiences by enabling real-time relighting and seamless blending of reconstructed 3D scenes with virtual assets. For live broadcasting, hosts can broadcast from any virtual environment:
- The background is reconstructed using our GS method and dynamically relit.
- The foreground (e.g., the host) is captured as a mesh and fused with the background in real time (see the compositing sketch below).
- This setup allows artists to adjust lighting (e.g., directional/volumetric lights) and apply post-effects during livestreams, creating immersive virtual studios without prebaked lighting.
This approach was demonstrated in virtual livestreaming applications (Fig. 4b), where our method achieved high-fidelity rendering at 30+ FPS on consumer devices.
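One plausible way to realize the real-time fusion mentioned above is per-pixel depth compositing of the two passes, assuming the mesh renderer and the GS renderer output color and depth in the same camera space. The sketch below illustrates that idea; it is our reading of the setup, not the exact production pipeline.

```python
import torch

def composite_by_depth(bg_rgb, bg_depth, fg_rgb, fg_depth, fg_alpha):
    """Fuse a foreground mesh pass (e.g., the captured host) with the
    relit GS background using a per-pixel depth test.
    bg_rgb/fg_rgb: (H, W, 3); bg_depth/fg_depth: (H, W);
    fg_alpha: (H, W) foreground coverage mask in [0, 1] (assumed input)."""
    fg_in_front = (fg_depth < bg_depth).to(fg_rgb.dtype)
    alpha = (fg_alpha * fg_in_front).unsqueeze(-1)      # (H, W, 1)
    return alpha * fg_rgb + (1.0 - alpha) * bg_rgb
```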
SIGGRAPH: What planned improvements or extensions do you foresee for this method, and how do you envision its impact on industries such as gaming or architectural visualization?
SJ: In terms of future directions, we foresee:
- Material Decomposition: Separating albedo, material properties, and environmental lighting from captured images to enable more physically based relighting.
- Geometric Enhancement: Integrating 2D GS priors or deep learning-based constraints to further refine geometry reconstruction.
- AIGC Integration: Leveraging generative AI for scene reconstruction from sparse or single-view inputs, reducing capture costs.
In terms of industry impact, we envision:
- Gaming: Real-time, photorealistic environments with dynamic lighting could replace prebaked assets, enhancing interactivity.
- Architectural Visualization: Clients could explore relightable 3D reconstructions of spaces under varying lighting conditions, improving design validation.
- XR/MR: Our pipeline’s compatibility with commercial engines (Unity/Unreal) lowers barriers to adopting GS technology in mainstream applications.
By addressing these challenges, we aim to make GS-based rendering a standard tool for high-quality, real-time 3D content creation.
Do you have visionary research just like this? Consider submitting it to the SIGGRAPH 2025 Posters program by 24 April.

Wang Shuo graduated with a master’s degree from Zhejiang University. He is currently a researcher at ByteDance. His research interests primarily focus on 3D reconstruction, AIGC, and VR/AR.

Cong Xie graduated from Beijing Normal University with a master’s degree. He is interested in computer graphics, digital geometry processing, and general-purpose GPU computing. He is currently working at ByteDance, developing 3D content generation and free-viewpoint video systems.

Shengdong Wang is currently a researcher at ByteDance. He graduated from Heilongjiang University. His research interests primarily focus on computer graphics, game engine architecture, and real-time rendering technologies. At ByteDance, he is dedicated to advancing these fields through industrial applications, bridging the gap between theoretical frameworks and practical implementation.

Shaohui Jiao received her PhD from the Institute of Software, Chinese Academy of Sciences. Her research interests include computer graphics, 3D reconstruction, deep learning, and high-performance computing. Currently, Shaohui is a researcher at ByteDance, developing next-generation 3D video technologies, including free-viewpoint video, volumetric video, and AI-generated video content.