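| Abstract: To address the loss of effective words in existing log parsing methods, the neglect of key information in anomaly detection, and the failure to fully exploit the dependencies hidden in logs, this paper proposes BCNLog, an anomaly log detection method based on Bi-LSTM-CNN. The method retains effective words by limiting token length, uses Bidirectional Encoder Representations from Transformers (BERT) to extract template semantics, and applies term frequency-inverse document frequency (TF-IDF) weighting to generate feature vectors and a weight matrix. These are concatenated and fed into a hybrid model of bidirectional long short-term memory (Bi-LSTM) and a convolutional neural network (CNN), which combines the strengths of both to improve detection performance. Experimental results show that BCNLog achieves an average parsing accuracy of 97.57% on 16 datasets and anomaly detection F1 scores of 98.75%, 99.23%, and 99.83% on 3 datasets. |
| Keywords: log parsing; parameter anomaly; TF-IDF; convolutional neural network; anomaly detection |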
|
CLC number: TP391.1
Document code: A
|
|
| Bi-LSTM-CNN-Based Anomaly Log Detection Method |
|
GUO Qingwei¹, CHEN Wei¹, FAN Yuan², MIAO Chunyu²
|
(1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; 2. DBAPPSecurity Co., Ltd., Hangzhou 362261, China)
1440631534@qq.com; chenwei@njupt.edu.cn; frank.fan@dbappsecurity.com.cn; crain.miao@dbappsecurity.com.cn
|
| Abstract: To address the loss of effective words in existing log parsing methods, the neglect of key information in anomaly detection, and the inability to fully exploit hidden dependencies in logs, an anomaly log detection method named BCNLog, based on Bi-LSTM-CNN, is proposed. The method retains effective words by restricting word length, uses BERT (Bidirectional Encoder Representations from Transformers) to extract template semantics, and generates feature vectors and a weight matrix through TF-IDF (term frequency-inverse document frequency) weighting. The concatenated results are then fed into a hybrid Bi-LSTM-CNN model, which combines the advantages of bidirectional LSTM (long short-term memory) and CNN (convolutional neural network) to improve detection performance. Experimental results show that BCNLog achieves an average parsing accuracy of 97.57% on 16 datasets, and its F1 scores for anomaly detection on three datasets are 98.75%, 99.23%, and 99.83%, respectively. |
| Keywords: log parsing; parameter anomaly; TF-IDF; convolutional neural network; anomaly detection |