Software Engineering

A Detection Algorithm for Small Objects in Remote Sensing

Wang Honglin, Yang Can

School of Computer Science, Nanjing University of Information Science and Technology

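Abstract: Small-object detection in UAV remote sensing imagery remains challenging because target scales vary widely, backgrounds are complex, and targets occupy few pixels. To address these problems, this paper proposes SM-YOLO, an improved small-object detection method based on YOLO11 that combines a Swin Transformer module with a windowed MLP-Mixer structure to strengthen multiscale feature modeling while preserving computational efficiency. In the backbone, the Swin module is embedded into the YOLO structure; replacing the activation function and improving the padding-mask mechanism adapt the module to the YOLO network and allow adaptive handling of multiscale inputs. A layer based on fixed convolutional filters is also introduced in the shallow backbone stages to enhance target edge and shape information. In addition, a denoising convolution kernel based on Gaussian modeling is built into the improved C3k2 module, further raising the model's robustness to noise. In the feature fusion stage, a BiFPN structure based on weighted concatenation strengthens cross-scale feature interaction, and a windowed MLP-Mixer with positional encoding reinforces local context modeling while keeping computation under control. Experiments on the VisDrone2019[1] dataset show that SM-YOLO achieves significantly higher detection accuracy than the baseline YOLO11 models. Comparative experiments show that mAP50 improves by 7.0% over YOLO11n and by 6.2% over YOLO11s while real-time detection performance is maintained. These results confirm the effectiveness and practicality of SM-YOLO for small-object detection in UAV remote sensing.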

Keywords: remote sensing imagery; lightweight model; object detection; attention mechanism; YOLO11

CLC number: Document code:

Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)

SM-YOLO: A Detection Algorithm for Small Objects in Remote Sensing Images

Wang Honglin¹, Yang Can²

1. Nanjing University of Information Science and Technology, School of Artificial Intelligence, Nanjing; 2. Nanjing University of Information Science and Technology, School of Computer and Cyberspace Security

Abstract: Small-object detection in unmanned aerial vehicle (UAV) imagery remains challenging because of large target scale variations, complex backgrounds, and the low pixel occupancy of targets. To address these issues, this paper proposes SM-YOLO, an improved small-object detector built on YOLO11. By integrating a Swin Transformer module and a windowed MLP-Mixer, SM-YOLO enhances multiscale feature modeling while maintaining computational efficiency. In the backbone, Swin blocks are embedded into the YOLO architecture; activation-function replacement and an improved padding-mask mechanism are introduced to better adapt the Swin blocks to the YOLO design and to enable adaptive handling of multiscale inputs. A fixed convolutional filtering layer is further inserted at shallow backbone stages to strengthen edge and shape cues. Moreover, the C3k2 module is redesigned by incorporating a Gaussian-modeling-based denoising convolution kernel, improving robustness to noise. For feature fusion, SM-YOLO adopts a BiFPN structure with weighted concatenation to strengthen cross-scale interactions, and introduces a windowed MLP-Mixer with positional encoding to enhance local contextual modeling under controlled computation. Experiments on the VisDrone2019[1] dataset show that SM-YOLO significantly outperforms the baseline YOLO11 models, boosting mAP50 by 7.0% for YOLO11n and 6.2% for YOLO11s while retaining real-time inference.

Keywords: Remote sensing imagery; Lightweight model; Object detection; Attention mechanism; YOLO11
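
The abstract mentions a BiFPN structure with weighted concatenation but does not give its formula. As an illustration only, the sketch below assumes the "fast normalized fusion" weighting commonly used with BiFPN (non-negative learnable weights divided by their sum plus a small epsilon) and applies each weight to its feature map before concatenating along the channel axis. The function names, the `eps` value, and the toy feature maps are hypothetical and are not taken from the paper; real feature maps would be tensors resized to a common resolution.

```python
from typing import List, Sequence

# One feature map, stored as channels x positions (spatial dims flattened).
Feature = List[List[float]]


def normalized_weights(raw: Sequence[float], eps: float = 1e-4) -> List[float]:
    """Clip raw weights at zero (ReLU), then divide by their sum plus eps."""
    clipped = [max(0.0, w) for w in raw]
    total = sum(clipped) + eps
    return [w / total for w in clipped]


def weighted_concat(features: Sequence[Feature],
                    raw_weights: Sequence[float]) -> Feature:
    """Scale each feature map by its normalized weight, then concatenate
    all maps along the channel axis (assumed reading of 'weighted concat')."""
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per feature map")
    n_pos = len(features[0][0])
    fused: Feature = []
    for feat, w in zip(features, normalized_weights(raw_weights)):
        for channel in feat:
            if len(channel) != n_pos:
                raise ValueError("feature maps must share spatial size")
            fused.append([w * v for v in channel])
    return fused


# Toy inputs: a 1-channel map and a 2-channel map over two positions.
shallow = [[1.0, 2.0]]
deep = [[4.0, 4.0], [0.0, 8.0]]
fused = weighted_concat([shallow, deep], raw_weights=[1.0, 3.0])
print(len(fused), "channels after fusion")
```

Unlike the summation used in the original BiFPN, concatenation keeps the channels of each scale separate, so the output channel count is the sum of the input channel counts; the weights only rescale each branch's contribution.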
