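| Abstract: To improve the accuracy of logistics delivery addresses, a logistics address similarity detection method based on edit distance is proposed. A predefined keyword set is used to segment address text into hierarchical levels, the minimum edit distance algorithm is used to compute address similarity, and each address level is assigned a weight, so that near-identical addresses are matched precisely and dialect forms, non-standard address formats, and new place names are handled effectively. Experimental results show that, compared with traditional algorithms, the method improves markedly in similarity accuracy, precision, recall, and running time: similarity accuracy reaches 92.39%, precision rises by about 15.00%, and recall rises by nearly 20.00%. The method optimizes the allocation of logistics resources, advances the intelligent development of the logistics industry, and helps ensure that goods reach their destination safely, on time, and accurately. |
| Keywords: logistics data; similarity; minimum edit distance method; hierarchical segmentation of address text; data quality |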
|
CLC number: TP391
Document code: A
|
|
| Research on Logistics Address Similarity Detection Method Based on Edit Distance |
|
ZHANG Yu, ZHANG Yingjiao, WANG Huqing
|
(School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
1223097110@njupt.edu.cn; 1223097109@njupt.edu.cn; wanghuqing@njupt.edu.cn
|
| Abstract: To improve the accuracy of logistics delivery addresses, a logistics address similarity detection method based on edit distance is proposed. The method segments address text into hierarchical levels using a predefined keyword set, computes address similarity with the minimum edit distance algorithm, and assigns a weight to each address level, which allows similar addresses to be matched precisely. It effectively handles dialect forms, non-standard address formats, and new place names. Experimental results show that, compared with traditional algorithms, the method significantly improves similarity accuracy, precision, recall, and runtime efficiency, achieving a similarity accuracy of 92.39%, with precision increased by approximately 15.00% and recall by nearly 20.00%. The method optimizes the allocation of logistics resources, promotes the intelligent development of the logistics industry, and helps ensure the safe, timely, and accurate delivery of goods. |
| Keywords: logistics data; similarity; minimum edit distance method; hierarchical segmentation of address texts; data quality |
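
The pipeline summarized in the abstract (keyword-based level segmentation, per-level edit distance, weighted combination) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the level keywords in `LEVEL_KEYWORDS`, the values in `WEIGHTS`, the normalization of each level's distance by the longer segment, and all function names are assumptions chosen only for illustration.

```python
# Illustrative sketch only: keywords, weights, and normalization are assumed,
# not taken from the paper.

# Assumed suffix keywords that close each address level, in hierarchical order.
LEVEL_KEYWORDS = [
    ("省",),            # province
    ("市",),            # city
    ("区", "县"),       # district / county
    ("路", "街", "道"), # road / street
]
# Assumed weights for the four levels above plus the trailing detail segment.
WEIGHTS = [0.10, 0.15, 0.20, 0.25, 0.30]


def levenshtein(a: str, b: str) -> int:
    """Minimum edit distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def segment(address: str) -> list[str]:
    """Split an address into levels at the first keyword found for each level."""
    parts, rest = [], address
    for keywords in LEVEL_KEYWORDS:
        positions = [rest.find(k) for k in keywords if k in rest]
        if positions:
            cut = min(positions) + 1
            parts.append(rest[:cut])
            rest = rest[cut:]
        else:
            parts.append("")  # level missing from this address
    parts.append(rest)        # remaining detail (house number, etc.)
    return parts


def level_similarity(a: str, b: str) -> float:
    """Edit-distance similarity in [0, 1]; two empty segments count as equal."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def address_similarity(addr1: str, addr2: str) -> float:
    """Weighted sum of per-level similarities."""
    pairs = zip(segment(addr1), segment(addr2))
    return sum(w * level_similarity(a, b) for w, (a, b) in zip(WEIGHTS, pairs))
```

For example, `address_similarity("江苏省南京市鼓楼区新模范马路66号", "江苏省南京市鼓楼区新模范马路68号")` differs only in the detail segment, so only that level's weighted term is reduced. Splitting on the first keyword occurrence is a deliberate simplification; place names that themselves contain a keyword character would need a richer rule set.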