Abstract: With the rapid development of immersive media, virtual reality, and digital twin technologies, the automated generation of high-fidelity, interactive 3D scenes has become an important research topic in computer vision and graphics. Existing 3D reconstruction and generation methods generally suffer from core bottlenecks: insufficient photorealism, limited content diversity, and difficulty supporting real-time interactive editing. To address these problems, this paper proposes a unified framework that fuses neural radiance fields with diffusion generative models to achieve high-fidelity interactive 3D scene synthesis. The framework uses neural radiance fields as a geometric backbone to precisely capture fine-grained geometric structure and photometric detail, and diffusion models as generative priors to enhance texture realism, enrich the semantic diversity of content, and support user-controllable editing; an interaction-aware module additionally provides decoupled representations of geometry, appearance, and illumination, preserving global consistency under real-time manipulation. Extensive quantitative experiments and user studies on multiple standard benchmark datasets show that the framework achieves state-of-the-art performance on core metrics including rendering fidelity, semantic controllability, and interactive responsiveness. The proposed method offers a robust and general technical solution for next-generation applications such as virtual reality, immersive gaming, digital heritage preservation, and intelligent design.
Keywords: neural rendering; diffusion models; 3D scene generation; high-fidelity rendering; virtual reality

CLC number: TP391
Document code:

High-Fidelity Interactive 3D Scene Generation via Neural Rendering and Diffusion Models

XIANG Xingren1, LV Jian1, PAN Weijie2, HU Tao3

1. Guizhou University; 2. Guizhou Provincial Data Circulation and Transaction Service Center; 3. Guiyang College

Abstract: With the rapid advancement of immersive media, virtual reality, and digital twin technologies, the automated generation of high-fidelity, interactive 3D scenes has become a pivotal research challenge in computer vision and graphics. Existing 3D reconstruction and generation approaches are generally hindered by limited photorealism, insufficient content diversity, and inadequate support for real-time interactive editing. To address these limitations, we propose a unified framework that integrates neural radiance fields with diffusion-based generative modeling for high-fidelity interactive 3D scene synthesis. The framework leverages neural radiance fields as a geometric backbone to accurately capture fine-grained structural geometry and photometric detail, while employing diffusion models as generative priors to enrich texture realism, enhance semantic diversity, and enable user-controllable editing. An interaction-aware module further supports decoupled representations of geometry, appearance, and illumination, thereby maintaining global coherence during real-time manipulation. Extensive quantitative experiments and user studies on multiple standard benchmarks demonstrate that the proposed framework achieves state-of-the-art performance across rendering fidelity, semantic controllability, and interactive responsiveness. The method provides a robust and versatile solution for next-generation applications in virtual reality, immersive gaming, digital heritage preservation, and intelligent design systems.
Keywords: neural rendering; diffusion models; 3D scene generation; high-fidelity rendering; virtual reality
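The abstract's pipeline uses a neural radiance field as its geometric backbone, whose images are formed by the standard NeRF volume-rendering (alpha compositing) step. As a minimal, illustrative sketch of that compositing step only (this is the textbook formulation, not the authors' implementation; function and variable names are assumptions):

```python
import numpy as np

def volume_render(densities, colors, deltas):
    """Standard NeRF-style volume rendering along a single ray.

    densities: (N,) non-negative volume densities sigma_i at the samples
    colors:    (N, 3) per-sample RGB values
    deltas:    (N,) distances between adjacent samples along the ray
    Returns the composited RGB color for the ray.
    """
    # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance T_i: probability the ray reaches sample i unoccluded
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    # Each sample contributes weight_i = T_i * alpha_i of its color
    weights = alphas * trans
    return (weights[:, None] * colors).sum(axis=0)

# A fully opaque first sample dominates the composited color.
densities = np.array([10.0, 0.0, 0.0])
colors = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
deltas = np.ones(3)
rgb = volume_render(densities, colors, deltas)
```

In the framework described above, the diffusion model would act as a prior on such rendered images, e.g. by backpropagating a score-based guidance loss through the rendering weights; that coupling is sketched here only conceptually, as the abstract does not specify the exact mechanism.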