Abstract: To address the problem of ambiguous sentiment orientation in financial texts, a financial text sentiment analysis model based on BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory network) is designed. The BERT model is used to construct word vectors, and the whole word masking method is employed to better express semantic information. To build a financial text dataset, a topic crawler based on a deep learning model is proposed: it uses BERT + Bi-GRU (Bidirectional Gated Recurrent Unit) to judge the topic relevance of the texts within a webpage and computes the webpage's topic relevance from the text classification results. Experimental results show that the proposed sentiment analysis model achieves an accuracy of 87.1% on the sentiment analysis task and can effectively analyze the sentiment orientation of texts.
Keywords: sentiment analysis; topic crawler; long short-term memory network; pre-trained language model
CLC number: TP391    Document code: A

Financial Text Sentiment Analysis and Application Based on BERT
           
			
JI Yuwen1, CHEN Zhe2
           
		   
(1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; 2. School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China)
yuwen.ji.yan@foxmail.com; 18758099691@163.com
             
Abstract: Aiming at the problem of ambiguous sentiment orientation in financial texts, a financial text sentiment analysis model based on BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory network) is designed. The BERT model is used to construct word vectors, and the whole word masking method is employed to better express semantic information. To construct a financial text dataset, a topic crawler based on a deep learning model is proposed, which uses BERT + Bi-GRU (Bidirectional Gated Recurrent Unit) to determine the topic relevance of the texts within a webpage and calculates the webpage's topic relevance from the text classification results. The experimental results show that the proposed sentiment analysis model achieves an accuracy of 87.1% on the sentiment analysis task and can effectively analyze the sentiment orientation of texts.
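To make the architecture concrete, below is a minimal PyTorch sketch of the BERT + Bi-LSTM classifier the abstract describes. It is an illustration under stated assumptions, not the paper's exact implementation: the checkpoint name (hfl/chinese-bert-wwm-ext, a publicly available Chinese BERT pre-trained with whole word masking), the hidden size, and the two-class output are all placeholders.

    import torch.nn as nn
    from transformers import BertModel

    class BertBiLSTMSentiment(nn.Module):
        """BERT token vectors -> Bi-LSTM -> linear sentiment classifier."""

        def __init__(self, bert_name: str = "hfl/chinese-bert-wwm-ext",
                     lstm_hidden: int = 256, num_classes: int = 2):
            super().__init__()
            # BERT encoder pre-trained with whole word masking (assumed checkpoint)
            self.bert = BertModel.from_pretrained(bert_name)
            # The Bi-LSTM reads the contextual token vectors in both directions
            self.bilstm = nn.LSTM(
                input_size=self.bert.config.hidden_size,
                hidden_size=lstm_hidden,
                batch_first=True,
                bidirectional=True,
            )
            # Forward and backward states are concatenated, hence 2 * lstm_hidden
            self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

        def forward(self, input_ids, attention_mask):
            # Contextual word vectors from BERT: (batch, seq_len, hidden_size)
            token_vecs = self.bert(
                input_ids=input_ids, attention_mask=attention_mask
            ).last_hidden_state
            lstm_out, _ = self.bilstm(token_vecs)
            # Classify from the last time step's bidirectional summary
            return self.classifier(lstm_out[:, -1, :])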
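The abstract leaves the page-level aggregation unspecified; one plausible reading, sketched below, is to average the on-topic probabilities that the BERT + Bi-GRU classifier assigns to a page's individual text blocks and to follow the page's outlinks only when that average clears a threshold. The name classify_texts is a hypothetical stand-in for that classifier, and the mean with a 0.5 threshold is an assumption, not the paper's stated rule.

    from typing import Callable, Iterable, List

    def page_topic_relevance(
        texts: Iterable[str],
        classify_texts: Callable[[List[str]], List[float]],
    ) -> float:
        """Average per-text on-topic probabilities into one page-level score."""
        blocks = [t for t in texts if t.strip()]  # drop empty extraction noise
        if not blocks:
            return 0.0
        probs = classify_texts(blocks)  # one on-topic probability per block
        return sum(probs) / len(probs)

    def should_expand(texts: Iterable[str],
                      classify_texts: Callable[[List[str]], List[float]],
                      threshold: float = 0.5) -> bool:
        # A topic crawler enqueues a page's outlinks only if the page is on-topic
        return page_topic_relevance(texts, classify_texts) >= threshold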
	       
Keywords: sentiment analysis; topic crawler; long short-term memory network; pre-trained language model