2024, Issue 06, Vol. 30, pp. 29-38
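Composite classification of tobacco leaf formula modules based on combining the strengths of multiple methods
Foundation: China Tobacco Zhejiang open-competition ("jiebang guashuai") project: Research and application of key formula technologies for optimizing tobacco leaf resource utilization efficiency and improving the competitiveness of ordinary first-class products (ZJZY2022A001); Chemical composition characteristics and usability evaluation of flue-cured tobacco from the main tobacco leaf producing areas of the "Liqun" brand (ZJZY2019C003)
Email: zhaozhenjie@zjtobacco.com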
DOI: 10.16472/j.chinatobacco.2023.T0105
Abstract:

To study grouping methods and principles for the tobacco leaf formulas of a cigarette brand, this work used 12 sensory evaluation indexes measured on 457 tobacco leaf samples to compare four discriminant analysis methods and four machine learning methods on the classification of four tobacco leaf categories, in terms of modeling set accuracy (R), validation set accuracy (r), and average accuracy (m). A high-precision composite classification method was then constructed through classification method selection and weight allocation. The results showed that: (1) compared with discriminant analysis, machine learning significantly increased R but significantly decreased r; LS-SVM had the highest R (92.8%), FDA and F-BDA had the highest r (80.2%), and there was no significant difference in m; (2) the composite classification method, built by selecting M-BDA, FDA, ANN, and KNN and weighting them by accuracy, improved both R (95.3%) and r (89.0%) and raised m from below 84% to 92.2%, and theoretical calculations and practical results verified its general effectiveness; (3) the Kappa coefficients of the composite classification method were all greater than 0.8, indicating a reliable method with a high degree of consistency, and the validation set m-F1 measure increased significantly by 21.2%, so the model's generalization ability was greatly enhanced; (4) five indexes, namely elegance, off-flavor, aftertaste, moistness, and clarity, played the major role in classification, consistent with the style characteristics of the Liqun brand; (5) for the misjudged samples (6.5%), the mismatch between index scores and the true module category was attributed to the balancing of inventory, cost, and quality, which generally falls within the adjustment space of tobacco leaf formulas.
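The accuracy-weighted combination in result (2) and the Kappa check in result (3) can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the abstract does not give the exact weighting formula, so the sketch assumes each method's vote is weighted by its own accuracy. The labels and per-method predictions are invented toy data, and the function names `accuracy`, `weighted_vote`, and `cohen_kappa` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_vote(predictions, weights):
    """Combine per-method label lists: each method adds its weight to the
    label it predicts, and the label with the largest total wins."""
    n_samples = len(next(iter(predictions.values())))
    combined = []
    for i in range(n_samples):
        scores = {}
        for method, labels in predictions.items():
            scores[labels[i]] = scores.get(labels[i], 0.0) + weights[method]
        combined.append(max(scores, key=scores.get))
    return combined


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n)
              for c in set(y_true) | set(y_pred))
    return (p_o - p_e) / (1 - p_e)


# Toy data (invented): four module categories, eight samples.
y_true = ["A", "B", "C", "D", "A", "B", "C", "D"]
predictions = {
    "M-BDA": ["A", "B", "C", "D", "A", "B", "D", "C"],
    "FDA":   ["A", "B", "C", "A", "B", "B", "D", "D"],
    "ANN":   ["A", "C", "C", "D", "A", "B", "C", "D"],
    "KNN":   ["B", "B", "D", "D", "A", "A", "C", "D"],
}

# Assumption: the weight of a method is its own accuracy. Here it is computed
# on the toy data itself; in practice it would come from the modeling set.
weights = {m: accuracy(y_true, p) for m, p in predictions.items()}

combined = weighted_vote(predictions, weights)
print("weights:", weights)
print("combined:", combined)
print("accuracy:", accuracy(y_true, combined))
print("kappa:", cohen_kappa(y_true, combined))
```

In this toy set the seventh sample is a 2-2 tie under an unweighted vote (M-BDA and FDA say D, ANN and KNN say C); the accuracy weights break the tie in favor of C, which is the kind of gain a weighted combination is meant to capture. Separately, the reported figures are consistent with m being the simple mean of R and r, since (95.3 + 89.0) / 2 is approximately 92.2, although the abstract does not state the definition explicitly.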

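Basic information:

DOI: 10.16472/j.chinatobacco.2023.T0105

CLC number: TP181; TS452

Citation:

[1] JIANG Jialei, LIAO Fu, HAO Xianwei, et al. Composite classification of tobacco leaf formula modules based on combining the strengths of multiple methods[J]. Acta Tabacaria Sinica, 2024, 30(06): 29-38. DOI: 10.16472/j.chinatobacco.2023.T0105.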
