Software Engineering

Cite this article:

LIN Daoyang, CHEN Danwei. Face Anti-Spoofing Method Based on Multi-modal Semantic Alignment[J]. Software Engineering, 2026, 29(2): 62-65.

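Face Anti-Spoofing Method Based on Multi-modal Semantic Alignment

LIN Daoyang, CHEN Danwei

(School of Computer Science, Software, and Cybersecurity, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China)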
1392170875@qq.com; chendw@njupt.edu.cn

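Abstract: To address the limited generalization of existing face anti-spoofing (FAS) methods when they encounter unknown attack types, this paper proposes the VISTA (Vision-Integrated Semantic Text Alignment) model. VISTA fuses the visual and language modalities through multimodal semantic alignment, and uses contrastive learning and text-supervised modulation to separate genuine from spoofed faces in a shared embedding space. Experiments on cross-domain datasets show that VISTA reduces the half total error rate (HTER) to as low as 3.36% and raises the area under the curve (AUC) to as high as 99.44%, outperforming traditional methods and other deep learning methods, with particularly strong results on unseen spoof types and acquisition environments. The method offers a new approach to cross-domain FAS that removes the dependence of traditional methods on domain labels.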
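The idea of separating genuine from spoofed faces in a shared embedding space can be illustrated with a minimal sketch. It is not the VISTA implementation: the abstract does not specify the encoders, the text prompts, or the temperature, so the toy vectors, the `classify` helper, and the value `temperature=0.07` below are all assumptions made for illustration. The sketch only shows the general pattern of scoring an image embedding against one text embedding per class by cosine similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(image_emb, text_embs, temperature=0.07):
    """Softmax over temperature-scaled cosine similarities.

    image_emb: embedding of a face image (hypothetical encoder output).
    text_embs: dict mapping a class label to the embedding of its text
               prompt (hypothetical encoder output).
    temperature: assumed scaling factor, not taken from the paper.
    """
    logits = {k: cosine(image_emb, t) / temperature for k, t in text_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - top) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


# Toy 3-dimensional embeddings standing in for real encoder outputs.
text_embs = {"real": [1.0, 0.0, 0.0], "spoof": [0.0, 1.0, 0.0]}
probs = classify([0.9, 0.1, 0.0], text_embs)
print(max(probs, key=probs.get))
```

In a contrastive setup of this kind, training would pull each image embedding toward the text embedding of its own class and push it away from the other, so the same similarity scores serve for both the loss and the prediction.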

Keywords: face anti-spoofing; contrastive learning; cross-domain generalization; prompt learning

CLC number: TP391.4 Document code: A

Face Anti-Spoofing Method Based on Multi-modal Semantic Alignment

LIN Daoyang, CHEN Danwei

(School of Computer Science, Software, and Cybersecurity, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
1392170875@qq.com; chendw@njupt.edu.cn

Abstract: To address the insufficient generalization capability of existing face anti-spoofing (FAS) methods against unknown attack types, this paper proposes VISTA (Vision-Integrated Semantic Text Alignment), a model that improves cross-domain generalization through multimodal semantic alignment. By integrating the visual and linguistic modalities with contrastive learning and text-supervised modulation, VISTA distinguishes genuine from spoofed faces in a shared embedding space. Experimental results on cross-domain datasets show a half total error rate (HTER) as low as 3.36% and an area under the curve (AUC) as high as 99.44%, outperforming both traditional methods and other deep learning approaches. Notably, VISTA performs well on unseen spoof types and in unseen acquisition environments. This approach provides a new solution for cross-domain FAS by removing the reliance on domain labels inherent in conventional methods.
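For readers unfamiliar with the HTER metric quoted above, the following is a minimal sketch of how it is commonly computed: the mean of the false acceptance rate (spoofs accepted as genuine) and the false rejection rate (genuine faces rejected) at a fixed decision threshold. The scores, labels, and threshold here are made-up illustration values, not data from the paper, and the paper's threshold-selection protocol is not stated in the abstract.

```python
def hter(scores, labels, threshold):
    """Half total error rate at a fixed threshold.

    scores: higher values mean "more likely genuine".
    labels: 1 for genuine faces, 0 for spoof attacks.
    """
    genuine = [s for s, y in zip(scores, labels) if y == 1]
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    if not genuine or not spoof:
        raise ValueError("need at least one genuine and one spoof sample")
    # False acceptance rate: spoof samples scored at or above the threshold.
    far = sum(s >= threshold for s in spoof) / len(spoof)
    # False rejection rate: genuine samples scored below the threshold.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return (far + frr) / 2


# Illustrative values only.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
result = hter(scores, labels, threshold=0.5)
print(f"HTER = {result:.4f}")
```

In this toy example one of three genuine samples is rejected and one of three spoof samples is accepted, so both error rates are 1/3 and the HTER is also 1/3.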

Keywords: face anti-spoofing; contrastive learning; cross-domain generalization; prompt learning
