Spectral Library Search Algorithm Based on Improved Composite Algorithm

ZHU Qiang, YU Jian-cheng, ZHANG Rong

Citation: ZHU Qiang, YU Jian-cheng, ZHANG Rong. Spectral Library Search Algorithm Based on Improved Composite Algorithm[J]. Journal of Chinese Mass Spectrometry Society, 2018, 39(3): 337-341. DOI: 10.7538/zpxb.2017.0058. CSTR: 32365.14.zpxb.2017.0058


  • Abstract: The composite algorithm proposed by Stein and Scott (the SS composite algorithm) was improved in two ways: its weight factors were replaced by the optimized weight factors reported by Kim et al., and the coefficients of its weighted dot-product similarity term and its peak-ratio term were reassigned. Using the improved algorithm, 30 932 mass spectra in a query library were searched against the NIST 11 main library (212 961 mass spectra), which served as the reference spectral library. In addition, samples of 8 different compounds were analysed by gas chromatography-mass spectrometry (GC/MS), and the resulting mass spectra were also searched against NIST 11. To evaluate performance, the original and the improved composite algorithms were each applied to the query spectra and to the experimental samples, and the resulting accuracy and similarity were compared. Compared with the original SS composite algorithm, the matching accuracy of the query spectra in the reference library increased by 1.15% on average, and the similarity improved for 94.45% of the spectra in the query library. The sample spectra obtained by GC/MS achieved a higher hit rate in the reference library, and their similarity increased by 3.56% on average. The improved composite algorithm therefore raises both the accuracy and the similarity of query spectra against the reference library, and the same approach can be used to improve other algorithms that are based on the SS composite algorithm.
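To make the structure described in the abstract concrete, the following is a minimal sketch of an SS-style composite score: a weighted dot-product term and a peak-ratio term combined with two coefficients. It is not the authors' implementation. The function names, the squared-cosine form of the dot-product term, the use of raw intensities in the peak-ratio term, and the default coefficients (number of query peaks and number of shared peaks) are assumptions chosen for illustration. The exponents `a` and `b` and the optional `c_dot` and `c_ratio` arguments stand in for the weight factors and reassigned coefficients that the abstract mentions but does not state.

```python
def weighted(spectrum, a, b):
    """Weight each peak as intensity**a * mz**b; spectrum is {m/z: intensity}."""
    return {mz: (inten ** a) * (mz ** b)
            for mz, inten in spectrum.items() if inten > 0}


def weighted_cosine(u, l):
    """Squared dot product divided by the product of squared norms (assumed form)."""
    dot = sum(u[mz] * l[mz] for mz in u.keys() & l.keys())
    norm_u = sum(v * v for v in u.values())
    norm_l = sum(v * v for v in l.values())
    return dot * dot / (norm_u * norm_l) if norm_u and norm_l else 0.0


def peak_ratio(query, ref):
    """Average agreement of adjacent-peak intensity ratios over shared peaks."""
    common = sorted(mz for mz in query.keys() & ref.keys()
                    if query[mz] > 0 and ref[mz] > 0)
    if len(common) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(common, common[1:]):
        r = (ref[cur] / ref[prev]) * (query[prev] / query[cur])
        total += r if r <= 1.0 else 1.0 / r  # keep every term in (0, 1]
    return total / len(common)


def composite(query, ref, a, b, c_dot=None, c_ratio=None):
    """Combine both terms; a, b, c_dot, c_ratio are illustrative parameters."""
    u, l = weighted(query, a, b), weighted(ref, a, b)
    # Assumed defaults: peak counts act as the coefficients of the two terms.
    c_dot = len(u) if c_dot is None else c_dot
    c_ratio = len(u.keys() & l.keys()) if c_ratio is None else c_ratio
    if c_dot + c_ratio == 0:
        return 0.0
    score = c_dot * weighted_cosine(u, l) + c_ratio * peak_ratio(query, ref)
    return score / (c_dot + c_ratio)


if __name__ == "__main__":
    # Hypothetical three-peak spectra and placeholder exponents, for illustration only.
    query = {41: 30.0, 43: 100.0, 57: 60.0}
    reference = {41: 60.0, 43: 100.0, 57: 20.0}
    print(composite(query, reference, a=0.5, b=1.0))
```

In this sketch, changing the weight factors means passing different `a` and `b`, and reassigning the coefficients means passing explicit `c_dot` and `c_ratio` instead of the peak-count defaults. A library search would then rank every reference spectrum by this score for a given query.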

  • [1] FERNANDES D R, PEREIRA V B, STELZER K T, et al. Quantification of trace O-containing compounds in GTL process samples via Fischer-Tropsch reaction by comprehensive two-dimensional gas chromatography/mass spectrometry[J]. Talanta, 2015, 144: 627-635.
    [2] SMITH P A, KLUCHINSKY T A, SAVAGE P B, et al. Traditional sampling with laboratory analysis and solid phase microextraction sampling with field gas chromatography/mass spectrometry by military industrial hygienists[J]. American Industrial Hygiene Association Journal, 2002, 63(3): 284-292.
    [3] GUILLONG M, HAMETNER K, REUSSER E, et al. Preliminary characterisation of new glass reference materials (GSA-1G, GSC-1G, GSD-1G and GSE-1G) by laser ablation-inductively coupled plasma-mass spectrometry using 193 nm, 213 nm and 266 nm wavelengths[J]. Geostandards and Geoanalytical Research, 2005, 29(3): 315-331.
    [4] HUANG Zhanyan, WANG Zhiwei. Determination of six small-molecule compounds in polyethylene terephthalate (PET) used for food packaging by GC-MS[J]. Modern Food Science and Technology, 2016, 32(1): 297-303 (in Chinese).
    [5] CHRISTOU C, GIKA H G, RAIKOS N, et al. GC-MS analysis of organic acids in human urine in clinical settings: a study of derivatization and other analytical parameters[J]. Journal of Chromatography B Analytical Technologies in the Biomedical & Life Sciences, 2014, 964: 195-201.
    [6] DUERING R A, KOHL C D, GASCH T, et al. Detection of infochemicals in agriculture and environmental chemistry by in situ GC-MS/EAD and semiconductor gas sensors[C]. Sensors and Measuring Systems 2014; 17. ITG/GMA Symposium; Proceedings of. VDE, 2014: 7-12.
    [7] BEDNAR A J, RUSSELL A L, HAYES C A, et al. Analysis of munitions constituents in groundwater using a field-portable GC-MS[J]. Chemosphere, 2012, 87(8): 894-901.
    [8] LI Baoqiang, LI Cuiping, GUO Chuntao, et al. A composed matching algorithm of spectrum pre-search and precision search based on wavelet transform[J]. Journal of Chinese Mass Spectrometry Society, 2014, 35(2): 118-124 (in Chinese).
    [9] HERTZ H S, HITES R A, BIEMANN K. Identification of mass spectra by computer-searching a file of known spectra[J]. Analytical Chemistry, 1971, 43(6): 681-691.
    [10] ATWATER B L, STAUFFER D B, MCLAFFERTY F W, et al. Reliability ranking and scaling improvements to the probability based matching system for unknown mass spectra[J]. Analytical Chemistry, 1985, 57(4): 899-903.
    [11] STEIN S E, SCOTT D R. Optimization and testing of mass spectral library search algorithms for compound identification[J]. Journal of the American Society for Mass Spectrometry, 1994, 5(9): 859-866.
    [12] RASMUSSEN G T, ISENHOUR T L. The evaluation of mass spectral search algorithms[J]. Journal of Chemical Information & Modeling, 1979, 19(3): 179-186.
    [13] TABB D L, MACCOSS M J, WU C C, et al. Similarity among tandem mass spectra from proteomic experiments: detection, significance, and utility[J]. Analytical Chemistry, 2003, 75(10): 2470-2477.
    [14] KOO I, ZHANG X, KIM S. Wavelet- and Fourier-transform-based spectrum similarity approaches to compound identification in gas chromatography/mass spectrometry[J]. Analytical Chemistry, 2011, 83(14): 5631-5638.
    [15] KIM S, KOO I, WEI X, et al. A method of finding optimal weight factors for compound identification in gas chromatography-mass spectrometry[J]. Bioinformatics, 2012, 28(8): 1158-1163.
    [16] KIM S, KOO I, JEONG J, et al. Compound identification using partial and semipartial correlations for gas chromatography-mass spectrometry data[J]. Analytical Chemistry, 2012, 84(15): 6477-6487.
    [17] ZHOU Yi, YU Jiancheng, ZHANG Junliang, et al. Novel vector space model and algorithm for search of mass spectral library[J]. Chinese Journal of Vacuum Science and Technology, 2016, 36(12): 1450-1454 (in Chinese).
Publication history
  • Published: 2018-05-19
