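| Abstract: To address the insufficient generalization of existing face anti-spoofing (FAS) methods against unknown spoofing types, the VISTA (Vision-Integrated Semantic Text Alignment) model is proposed. Through multimodal semantic alignment, it fuses the visual and language modalities and uses contrastive learning and text-supervised modulation to distinguish genuine from spoofed faces in a shared embedding space. Experimental results on cross-domain datasets show that VISTA lowers the half total error rate (HTER) to as little as 3.36% and raises the area under the curve (AUC) to as much as 99.44%, outperforming traditional methods and other deep learning methods, with particularly strong performance on unseen spoofing types and acquisition environments. The method offers a new solution for cross-domain FAS and removes the dependence of traditional methods on domain labels. |
| Keywords: face anti-spoofing; contrastive learning; cross-domain generalization; prompt learning |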
|
CLC number: TP391.4
Document code: A
|
|
| Face Anti-Spoofing Method Based on Multi-modal Semantic Alignment |
|
LIN Daoyang, CHEN Danwei
|
(School of Computer Science, Software, and Cybersecurity, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
1392170875@qq.com; chendw@njupt.edu.cn
|
| Abstract: To address the insufficient generalization of existing face anti-spoofing (FAS) methods against unknown attack types, this paper proposes VISTA (Vision-Integrated Semantic Text Alignment), a model that improves cross-domain generalization through multimodal semantic alignment. By integrating the visual and linguistic modalities with contrastive learning and text-supervised modulation, VISTA distinguishes genuine from spoofed faces in a shared embedding space. Experimental results on cross-domain datasets show that VISTA achieves a half total error rate (HTER) as low as 3.36% and an area under the curve (AUC) as high as 99.44%, outperforming both traditional methods and other deep learning approaches. Notably, it remains robust against unseen spoof types and diverse acquisition environments. The approach offers a new solution for cross-domain FAS by removing the reliance on domain labels inherent in conventional methods. |
| Keywords: face anti-spoofing; contrastive learning; cross-domain generalization; prompt learning |