Software Engineering


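Aspect-Level Sentiment Analysis of Automobile Reviews Based on Large Language Models and Prompt Engineering

Zhang Huijie, Cao Yufeng

Henan Polytechnic University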

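Abstract: [Objective] To study performance differences among large language models in sentiment analysis of automobile reviews, and to analyze how the number of in-context examples affects the models' sentiment analysis. [Methods] Using prompt engineering, 15 mainstream large language models were first compared in a zero-shot setting, and the three best-performing models were selected. Examples were then added to the prompt template step by step, following the order of automobile attributes, and experiments on the BDCI_Car_2018 dataset were used to compare the results. [Conclusion] Different large models are strong on different aspects of automobile review sentiment analysis. A suitable number of examples improves model performance; because models differ in learning ability, the gains level off or fluctuate beyond a certain number of examples, yet performance still exceeds the zero-shot level. Large models handle positive and negative sentiment with clear boundaries well, but they struggle with neutral sentiment, whose boundaries are blurred.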

Keywords: sentiment analysis; large language models; prompt engineering; automobile reviews


Aspect-Level Sentiment Analysis of Automobile Reviews Using Large Language Models and Prompt Engineering

Zhang Huijie, Cao Yufeng

Henan Polytechnic University

Abstract: [Objective] This study investigates performance differences among existing large language models (LLMs) in aspect-level sentiment analysis of automobile reviews and examines how the number of in-context examples affects model performance. [Methods] Using prompt engineering, we first benchmarked 15 mainstream LLMs in a zero-shot setting (no in-context examples) and selected the top three performers. We then added in-context examples to the prompt template step by step, following the order of the automobile attributes. Experiments were conducted on the BDCI_Car_2018 dataset and the results were compared. [Conclusion] Different LLMs excel at different aspects of automobile review sentiment analysis. An appropriate number of in-context examples improves model performance; however, because models differ in learning ability, the improvement slows or fluctuates once the number of examples exceeds a certain threshold, although performance remains above the zero-shot level. LLMs are not adept at analyzing neutral sentiment, whose boundaries are ambiguous, but they excel at analyzing positive and negative sentiment, whose boundaries are clear.

Keywords: sentiment analysis; large language models; prompt engineering; automotive reviews
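
The procedure summarized in the abstract (a zero-shot prompt first, then a prompt template that gains in-context examples in attribute order) can be illustrated with a minimal Python sketch. This is not the authors' template: the wording of the prompt, the names `Example`, `select_examples`, and `build_prompt`, the four-item `ASPECT_ORDER`, and the sample reviews are all hypothetical, chosen only to show how the number of examples might be varied.

```python
from dataclasses import dataclass

# Hypothetical attribute order. The abstract only states that examples are
# added "following the order of the automobile attributes".
ASPECT_ORDER = ["power", "price", "interior", "comfort"]
LABELS = ("positive", "neutral", "negative")


@dataclass(frozen=True)
class Example:
    review: str
    aspect: str
    sentiment: str


def select_examples(pool, k):
    """Return the first k examples after sorting the pool by attribute order."""
    ordered = sorted(pool, key=lambda ex: ASPECT_ORDER.index(ex.aspect))
    return ordered[:k]


def build_prompt(review, aspect, examples=()):
    """Build a prompt; with no examples this is the zero-shot setting."""
    lines = [
        "You are a sentiment analyst for automobile reviews.",
        "Classify the sentiment toward the given aspect as one of: "
        + ", ".join(LABELS) + ".",
        "Answer with the label only.",
        "",
    ]
    # Each in-context example is rendered in the same layout as the query.
    for ex in examples:
        lines += [
            f"Review: {ex.review}",
            f"Aspect: {ex.aspect}",
            f"Sentiment: {ex.sentiment}",
            "",
        ]
    # The query itself, left open for the model to complete.
    lines += [f"Review: {review}", f"Aspect: {aspect}", "Sentiment:"]
    return "\n".join(lines)


# Made-up example pool, for illustration only (not taken from BDCI_Car_2018).
POOL = [
    Example("The seats are soft and supportive.", "comfort", "positive"),
    Example("Acceleration feels weak on the highway.", "power", "negative"),
    Example("The price is about average for this class.", "price", "neutral"),
]

query = ("The cabin plastics feel cheap.", "interior")
zero_shot = build_prompt(*query)
two_shot = build_prompt(*query, examples=select_examples(POOL, 2))
```

Sending `zero_shot` and `two_shot` to the same model and comparing the predicted labels against the gold labels would give one point each on a curve of performance versus number of examples, which is the kind of comparison the abstract describes.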
