Climate classification model of co-occurrence characteristics of eight mycotoxins in wheat based on Smote-KNN
Affiliation:

1.China National Center for Food Safety Risk Assessment, Beijing 100022, China; 2.College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China

About the author:

TANG Hao, male, master's student; research interest: pattern recognition. E-mail: wdhxs00@163.com

Corresponding author:

WANG Xiaodan, female, associate researcher; research interest: food safety risk assessment. E-mail: wangxiaodan@cfsa.net.cn

CLC number:

R155

Fund Project:

National Key Research and Development Program of China (2019YFC1606500); Food Safety Project of the Chinese Academy of Medical Sciences Innovation Program (2019-12M-5-024)

    Abstract:

    Objective: To analyze the co-occurrence characteristics of mycotoxins in wheat from different climate regions of China and to build a climate classification model.

    Methods: Concentration data for eight mycotoxins (deoxynivalenol, nivalenol, aflatoxins, ochratoxin A, fumonisins, zearalenone, T-2 and HT-2) in 887 wheat samples collected from 12 provinces and autonomous regions were grouped into three classes according to the climate type of the sampling site: temperate continental climate, temperate monsoon climate and subtropical monsoon climate. The data were preprocessed and then augmented with the Borderline-SMOTE method to balance the data set. Principal component analysis (PCA) was applied to the eight-mycotoxin data for dimensionality reduction, and the first two principal components, which together reached a cumulative contribution rate of 97%, were retained as the features of the wheat mycotoxin data. These features were classified with the k-nearest neighbor (KNN) nonlinear classifier, and the parameters of the KNN model were tuned by grid search (GridSearchCV). The model was evaluated with four indicators (confusion matrix, accuracy, recall and F1 score), and its performance on the same data was compared with that of three other common classifiers: support vector machine, random forest and artificial neural network.

    Results: The proposed model, which combines Borderline-SMOTE, PCA and KNN, classified the co-occurrence characteristics of the eight mycotoxins in wheat by climate type with an accuracy of up to 98.31%, and it outperformed the other classification methods.

    Conclusion: The classification model established in this paper can effectively discriminate the co-occurrence characteristics of eight mycotoxins in wheat under the three climate types in China. It provides a basis for region-specific risk assessment of combined mycotoxin exposure and offers a method for classifying regions from multiple food testing indicators.
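The Methods chain (oversample the minority class, classify with KNN, tune by grid search, score with a confusion matrix, accuracy, recall and F1) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the data are random toy points rather than the paper's measurements, the two toy features merely stand in for the two principal components (PCA itself is omitted), Borderline-SMOTE is taken in its first variant, the grid search uses leave-one-out accuracy, and recall and F1 are macro-averaged. The abstract does not state these details, and the authors' use of GridSearchCV suggests a library implementation rather than hand-written code.

```python
import math
import random
from collections import Counter


def nearest(point, pool, k):
    """Indices of the k points in pool closest to point (Euclidean distance)."""
    return sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))[:k]


def borderline_smote(X, y, minority, n_new, m=5, k=3, seed=0):
    """Borderline-SMOTE (variant 1) sketch: interpolate only from 'danger' points."""
    rng = random.Random(seed)
    idx_min = [i for i, label in enumerate(y) if label == minority]
    danger = []
    for i in idx_min:
        rest = [j for j in range(len(X)) if j != i]
        neigh = [rest[n] for n in nearest(X[i], [X[j] for j in rest], m)]
        n_other = sum(y[j] != minority for j in neigh)
        if m / 2 <= n_other < m:  # mostly, but not only, other-class neighbours
            danger.append(i)
    X_new = []
    for _ in range(n_new if danger else 0):
        i = rng.choice(danger)
        mates = [j for j in idx_min if j != i]
        j = mates[rng.choice(nearest(X[i], [X[t] for t in mates], k))]
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return X + X_new, y + [minority] * len(X_new)


def knn_predict(X_train, y_train, point, k):
    """Majority vote among the k nearest training points."""
    votes = Counter(y_train[i] for i in nearest(point, X_train, k))
    return votes.most_common(1)[0][0]


def loo_predictions(X, y, k):
    """Leave-one-out predictions: each point is classified by all the others."""
    return [knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k)
            for i in range(len(X))]


def grid_search_k(X, y, candidates):
    """Pick the k with the highest leave-one-out accuracy (first one wins ties)."""
    def score(k):
        return sum(p == t for p, t in zip(loo_predictions(X, y, k), y)) / len(y)
    return max(candidates, key=score)


def evaluate(y_true, y_pred):
    """Confusion matrix, accuracy, macro recall and macro F1."""
    labels = sorted(set(y_true))
    cm = {t: Counter() for t in labels}  # cm[true_label][predicted_label]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    accuracy = sum(cm[c][c] for c in labels) / len(y_true)
    recalls, f1s = [], []
    for c in labels:
        tp = cm[c][c]
        recall = tp / sum(cm[c].values())
        predicted = sum(cm[t][c] for t in labels)
        precision = tp / predicted if predicted else 0.0
        recalls.append(recall)
        f1s.append(2 * precision * recall / (precision + recall) if tp else 0.0)
    return cm, accuracy, sum(recalls) / len(labels), sum(f1s) / len(labels)


def demo():
    """Run the chain on random toy data (NOT the paper's data)."""
    rng = random.Random(42)
    centers = {"continental": (0.0, 0.0), "temperate_monsoon": (3.0, 0.0),
               "subtropical_monsoon": (1.5, 2.5)}
    sizes = {"continental": 12, "temperate_monsoon": 12, "subtropical_monsoon": 5}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(sizes[label]):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    X_bal, y_bal = borderline_smote(X, y, "subtropical_monsoon", n_new=7)
    best_k = grid_search_k(X_bal, y_bal, [1, 3, 5, 7])
    cm, acc, rec, f1 = evaluate(y_bal, loo_predictions(X_bal, y_bal, best_k))
    print("class counts:", dict(Counter(y_bal)))
    print("best k:", best_k)
    print("accuracy=%.3f macro recall=%.3f macro F1=%.3f" % (acc, rec, f1))


if __name__ == "__main__":
    demo()
```

For brevity the sketch oversamples before cross-validating. In practice synthetic points are usually generated from the training portion only, so that copies of held-out samples cannot inflate the score; the abstract does not say which order the authors used.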

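Cite this article

TANG Hao, LIANG Jiang, WU Nan, LI Minglu, YANG Dajin, ZHANG Lei, XUE Wenbo, ZHU Haijiang, WANG Xiaodan. Climate classification model of co-occurrence characteristics of eight mycotoxins in wheat based on Smote-KNN[J]. Chinese Journal of Food Hygiene, 2023, 35(6): 807-812.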

History
  • Received: 2022-04-12
  • Published online: 2023-09-25