Cite this article: GU Liang, TAN Zinan, RONG Jing. Data Prediction Method Based on 5CV-Optuna-LightGBM Regression Model[J]. Software Engineering, 2024, 27(1): 48-54.
Data Prediction Method Based on 5CV-Optuna-LightGBM Regression Model
GU Liang, TAN Zinan, RONG Jing
(Guangling College, Yangzhou University, Yangzhou 225000, China)
1649790465@qq.com; 2318147480@qq.com; 060096@yzu.edu.cn
CLC number: TP391    Document code: A
Funding: Ministry of Education Industry-University Cooperative Education Project (202002320005); Jiangsu Provincial Universities Philosophy and Social Sciences Special Project (2022SJSZ1130)
Abstract: To address complex data prediction problems, this paper proposes 5CV-Optuna-LightGBM, a hybrid regression prediction model that combines five-fold cross-validation (5CV), Optuna hyperparameter optimization, and the LightGBM regression model. Using a dataset of factors that affect used car prices, the data are first preprocessed and analyzed with Pearson correlation to determine 37 feature indicators. The model is then denoised through L1 regularization and iteratively optimized with cross-validation and the Optuna algorithm, yielding the prediction results of the 5CV-Optuna-LightGBM regression model. Experiments evaluate predictive performance on multiple indicators, including accuracy, time consumption, and mean absolute error. The reported results are an accuracy of 99.433%, a running time of 15 s, and a mean absolute error of 0.306%. Compared with other models, the proposed method gives more accurate predictions, higher modeling efficiency, and a better fit.
Keywords: Pearson; five-fold cross-validation; Optuna; LightGBM; regularization
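The first and last stages of the pipeline described in the abstract (Pearson-based feature screening, then scoring a model by mean absolute error under five-fold cross-validation) can be illustrated with a minimal, standard-library-only Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the feature names `mileage` and `noise`, the 0.3 correlation threshold, and the one-feature least-squares line that stands in for the LightGBM regressor. Optuna tuning and L1 regularization are not shown.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for shuffled k-fold CV."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))


def fit_line(x, y):
    """Least-squares line; a stand-in for the real regressor (assumption)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def cv_mae(x, y, k=5):
    """Average validation MAE of the stand-in model over k folds."""
    scores = []
    for train, val in k_fold_indices(len(x), k):
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [slope * x[i] + intercept for i in val]
        scores.append(mae([y[i] for i in val], preds))
    return mean(scores)


# Synthetic, hypothetical data: price falls with mileage; "noise" is unrelated.
rng = random.Random(42)
features = {
    "mileage": [rng.uniform(0, 20) for _ in range(100)],
    "noise": [rng.uniform(0, 1) for _ in range(100)],
}
price = [30 - 1.2 * m + rng.gauss(0, 0.5) for m in features["mileage"]]

# Keep features whose |r| with the target reaches an assumed threshold of 0.3.
selected = [name for name, col in features.items()
            if abs(pearson(col, price)) >= 0.3]

print("selected features:", selected)
print("5-fold CV MAE on mileage:", round(cv_mae(features["mileage"], price), 3))
```

In a setting closer to the paper's, the call to `fit_line` would be replaced by training a LightGBM regressor, and `cv_mae` would serve as the objective that a hyperparameter search such as Optuna minimizes.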

