End-to-End Image Compression Method Based on Hybrid Spatial-Frequency and Linear Attention
Xie Guojia (谢国嘉), Zhang Weichuan (章为川), Yang Junpo (杨俊坡)
Shaanxi University of Science and Technology
Abstract: Existing learned image compression methods struggle to balance computational efficiency, global modeling capability, and structural fidelity. To address this, an end-to-end Hybrid Spatial-frequency Linear Attention Image Compression (HSLAIC) method is proposed. The method builds a hybrid spatial-frequency linear attention module in which a bidirectional RWKV mechanism replaces the traditional Transformer to reduce computational complexity. It also introduces a Spatial-Frequency Modulated Attention (SFMA) module that combines frequency-domain amplitude modulation with a spatial large-kernel gating strategy, jointly improving long-range dependency capture and local texture preservation. On the Kodak dataset, the method reduces BD-Rate by 15.84% relative to VTM-9.1, and its PSNR and MS-SSIM both exceed those of mainstream methods such as ELIC. These results verify that the method achieves better rate-distortion performance and generalization ability at low computational complexity.
Keywords: Image Compression  Hybrid Spatial-Frequency Linear Attention  Spatial-Frequency Modulated Attention  RWKV  Context Modeling
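
The abstract mentions frequency-domain amplitude modulation without giving details. As a rough illustration of the general idea only, the NumPy sketch below scales the Fourier amplitude of a 2-D feature map while keeping its phase, then transforms back to the spatial domain. It is not the authors' SFMA module: the function name `modulate_amplitude` and the hand-chosen `gain` are assumptions made for this example, whereas in a learned codec the gain would presumably be produced by trainable layers.

```python
import numpy as np


def modulate_amplitude(feature, gain):
    """Scale the Fourier amplitude of a real 2-D map and keep its phase.

    `gain` is a hypothetical modulation factor: either a scalar or an
    array broadcastable to the half-spectrum shape (H, W // 2 + 1).
    """
    # Real-input 2-D FFT; only the non-redundant half-spectrum is returned.
    spectrum = np.fft.rfft2(feature)

    # Split the spectrum into amplitude and phase.
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Modulate the amplitude only, then recombine with the original phase.
    modulated = (amplitude * gain) * np.exp(1j * phase)

    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(modulated, s=feature.shape)


# Example: keep only the zero-frequency (DC) component of a small map,
# which leaves a constant map equal to the input's mean value.
feature_map = np.arange(12, dtype=float).reshape(3, 4)
dc_only_gain = np.zeros((3, 3))
dc_only_gain[0, 0] = 1.0
smoothed = modulate_amplitude(feature_map, dc_only_gain)
```

Because the gain acts on amplitude alone, the positions of structures encoded in the phase are left untouched; a gain of 1 everywhere returns the input unchanged, which makes the operation easy to sanity-check.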


Copyright: Software Engineering (软件工程) Magazine