Sound Field Reproduction Method Based on Multi-Scale Wavelet Convolution and Attention Fusion
Tian XuHua, Wang Zhen, Xue WeiWei, Guo WenQiang, Zhao YingKe
Shaanxi University of Science and Technology
Abstract: Sound field reproduction is important for immersive audio and spatial acoustics applications. However, because of the limited spatial sampling of loudspeaker and microphone arrays, traditional sound pressure matching methods are susceptible to spatial aliasing at high frequencies, while existing end-to-end deep learning methods for sound field reproduction still fall short in frequency-band modeling and high-frequency stability. To address these issues, this paper proposes an end-to-end sound field reproduction method based on multi-scale wavelet convolution and attention fusion. The method introduces multi-scale wavelet convolution and an attention fusion mechanism into an end-to-end network framework, so that the inverse acoustic transfer characteristics of different frequency bands are modeled in a decoupled way and then fused adaptively. This design expands the receptive field, strengthens the representation of high-frequency features, and improves the stability of the spectral feature representation. Experimental results show that, compared with traditional sound pressure matching and existing end-to-end deep learning sound field reproduction models, the proposed method reduces the average reproduction error by more than 3 dB at frequencies above the spatial Nyquist frequency of the array. It also yields a more uniform spatial distribution of the reproduction error and a more concentrated energy distribution of the loudspeaker driving signals.
Keywords: Sound field reproduction; Wavelet convolution; Attention fusion; Sub-band convolution
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)
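
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of conventional sound pressure matching with Tikhonov regularization, together with the usual rule of thumb for the spatial Nyquist frequency of a uniformly spaced array. It is not the proposed network and is not taken from the paper: the free-field point-source model, the array geometry, the speed of sound, the regularization weight, and all function names are assumptions chosen for illustration.

```python
import numpy as np

C = 343.0  # assumed speed of sound in air, m/s


def spatial_nyquist_hz(spacing_m, c=C):
    """Rule-of-thumb aliasing limit for a uniform array: f = c / (2 * spacing)."""
    return c / (2.0 * spacing_m)


def green(src, rcv, freq_hz, c=C):
    """Free-field transfer matrix G[m, l] from source l to receiver m (assumed model)."""
    k = 2.0 * np.pi * freq_hz / c
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)


def pressure_matching(G, p_des, reg=1e-3):
    """Regularized least squares: d = (G^H G + reg * I)^-1 G^H p_des."""
    n_spk = G.shape[1]
    A = G.conj().T @ G + reg * np.eye(n_spk)
    return np.linalg.solve(A, G.conj().T @ p_des)


def reproduction_error_db(G, d, p_des):
    """Normalized reproduction error 10*log10(||G d - p||^2 / ||p||^2)."""
    num = np.linalg.norm(G @ d - p_des) ** 2
    return 10.0 * np.log10(num / np.linalg.norm(p_des) ** 2)


def demo(freq_hz, n_spk=16, spacing_m=0.1):
    """Hypothetical setup: line array on y = 0, 5 x 5 control grid, virtual source behind the array."""
    x = (np.arange(n_spk) - (n_spk - 1) / 2.0) * spacing_m
    speakers = np.stack([x, np.zeros(n_spk)], axis=1)
    gx, gy = np.meshgrid(np.linspace(-0.25, 0.25, 5), np.linspace(1.0, 1.5, 5))
    points = np.stack([gx.ravel(), gy.ravel()], axis=1)
    virtual = np.array([[0.0, -1.0]])
    G = green(speakers, points, freq_hz)
    p_des = green(virtual, points, freq_hz)[:, 0]  # desired pressure at control points
    d = pressure_matching(G, p_des)
    return d, reproduction_error_db(G, d, p_des)


if __name__ == "__main__":
    f_nyq = spatial_nyquist_hz(0.1)
    for f in (0.5 * f_nyq, 1.5 * f_nyq):
        _, err = demo(f)
        print(f"{f:7.1f} Hz: error {err:6.2f} dB")
```

Running the script prints the normalized error at one frequency below and one above the assumed aliasing limit, which gives a rough baseline to set against the abstract's claim of an improvement of more than 3 dB above the spatial Nyquist frequency. The actual numbers depend entirely on the assumed geometry and regularization weight.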


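Copyright: Software Engineering (软件工程) Magazine
Address: No. 2 Xinxiu Street, Hunnan District, Shenyang, Liaoning Province. Postal code: 110179
Tel: 0411-84767887 Fax: 0411-84835089 Email: semagazine@neusoft.edu.cn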