Abstract: Objective Near-infrared (NIR) spectroscopy and partial least squares-discriminant analysis (PLS-DA) were applied to discriminate transgenic from non-transgenic soybean samples. Rapid discrimination models for transgenic soybean were established, and the optimal model was selected. Methods Principal component analysis (PCA) was used to extract relevant features from the spectral data and to remove anomalous samples. A calibration set of 94 samples was used to build the models, and a validation set of 41 samples was used to evaluate their performance. The effects of sample morphology (intact or ground), wavenumber range and spectral pretreatment method on model accuracy were examined. Results Models for intact soybean samples gave better discrimination performance than models for ground samples. The best model for intact samples achieved a correct discrimination rate of 100.00% in both the calibration and validation sets over 9 403-5 438 cm⁻¹ with second-derivative (2nd) pretreatment. The best model for ground samples also achieved a correct discrimination rate of 100.00% in both sets, over 7 505-4 597 cm⁻¹ with standard normal variate plus first-derivative (SNV+1st) pretreatment. Conclusion By selecting the sample morphology, wavenumber range and spectral pretreatment method, the discrimination model can be optimized and the correct discrimination rate can be substantially improved.
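The SNV+1st pretreatment and PLS-DA workflow named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the spectra are synthetic, the first derivative is a plain finite difference (the abstract does not state which derivative algorithm was used), the PLS-DA step is a small NIPALS PLS1 regression on 0/1 class labels thresholded at 0.5, the number of latent variables is arbitrary, and the PCA-based outlier removal is omitted. Only the set sizes (94 calibration, 41 validation) are taken from the abstract. The function names are hypothetical and do not come from the study.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def first_derivative(spectra):
    """Finite-difference first derivative along the spectral axis (illustrative stand-in)."""
    return np.gradient(spectra, axis=1)

def pretreat(spectra):
    """SNV followed by first derivative, i.e. 'SNV+1st'."""
    return first_derivative(snv(spectra))

def pls1_fit(X, y, n_components):
    """NIPALS PLS1 regression; returns (coefficients, X mean, y mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def plsda_predict(model, X):
    """Assign class 1 when the regression output reaches 0.5, else class 0."""
    coef, x_mean, y_mean = model
    return ((X - x_mean) @ coef + y_mean >= 0.5).astype(int)

def correct_rate(predicted, labels):
    """Percentage of samples assigned to the correct class."""
    return 100.0 * np.mean(predicted == labels)

def simulate(n, rng, n_points=200):
    """Synthetic spectra: class 1 carries an extra band; scatter varies per sample."""
    x = np.linspace(0.0, 1.0, n_points)
    labels = np.arange(n) % 2                        # alternate the two classes
    base = 1.0 + 0.5 * np.sin(2 * np.pi * x)
    band = 0.3 * np.exp(-0.5 * ((x - 0.5) / 0.03) ** 2)
    scale = rng.uniform(0.8, 1.2, size=(n, 1))       # multiplicative scatter
    offset = rng.uniform(-0.1, 0.1, size=(n, 1))     # baseline shift
    noise = rng.normal(0.0, 0.002, size=(n, n_points))
    spectra = scale * (base + labels[:, None] * band) + offset + noise
    return spectra, labels

rng = np.random.default_rng(0)
X_cal_raw, y_cal = simulate(94, rng)   # calibration set size from the abstract
X_val_raw, y_val = simulate(41, rng)   # validation set size from the abstract

model = pls1_fit(pretreat(X_cal_raw), y_cal.astype(float), n_components=2)
cal_rate = correct_rate(plsda_predict(model, pretreat(X_cal_raw)), y_cal)
val_rate = correct_rate(plsda_predict(model, pretreat(X_val_raw)), y_val)
print(f"calibration: {cal_rate:.2f}%  validation: {val_rate:.2f}%")
```

Restricting the model to a wavenumber window, as the abstract describes, would amount to slicing the columns of the spectra before calling `pretreat`; comparing the resulting validation rates across windows and pretreatments is one way the model selection described above could be reproduced on real data.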