Parametric Approach for Regression Quantification and Quality Control of Proteins from Multiple Mass Spectrometry Platforms
Abstract
Quantitative analysis of high-resolution mass spectrometry (MS) data has become an increasingly popular high-throughput method over the past decade and is fundamental to proteomic research. Different MS manufacturers typically provide different software suites for protein identification and quantification, which makes it inconvenient to compare results across MS platforms. Furthermore, the accuracy of protein quantification can still be improved. A standardized and automated protein quantification process is therefore urgently needed in proteomics research. We established a parametric approach for quantification based on mass spectra. First, basic information on candidate peptides was extracted from the output of mainstream spectrum search programs, mainly m/z, retention time (tR), and the corresponding protein IDs, peptides, modifications, charge states, and scan numbers. The intensities of the peptides at the first level of tandem mass spectrometry (MS1) were then retrieved from the decoded raw file according to m/z and tR within an appropriate threshold. The intensity values extracted around the retention time of a given peptide form an extracted ion chromatogram (XIC) peak. After smoothing with a Savitzky-Golay filter, the XIC peak area was calculated by the trapezoidal rule. For peptides that produced MS1 signals but no MS2 signals because of the random sampling in MS, regression was used to improve the consistency of pairwise experiments: a polynomial of degree 2 was fitted to the retention times of the peptides identified in both runs to obtain a retention time regression function. With this retention time alignment and regression approach, the quantification of low-abundance proteins can be improved greatly. A parametric model based on peak width, retention time, and other parameters was constructed as a Bayesian model in which the parameters are combined by a joint likelihood ratio. This model can effectively filter out peptides that are poorly suited for quantification.
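As an illustration of the XIC and retention time steps described above, the following is a minimal Python sketch and not the implementation used in this work. The function names, the Savitzky-Golay window length (7 points) and polynomial order (2), and the clipping of negative smoothed values are assumptions chosen for the example; the abstract specifies only Savitzky-Golay smoothing, trapezoidal integration, and a degree-2 polynomial fit on peptides common to two runs.

```python
import numpy as np
from scipy.signal import savgol_filter


def xic_peak_area(rt, intensity, window_length=7, polyorder=2):
    """Smooth an XIC trace and integrate it with the trapezoidal rule.

    window_length and polyorder are illustrative defaults (assumptions).
    """
    rt = np.asarray(rt, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Savitzky-Golay smoothing of the MS1 intensities along retention time.
    smoothed = savgol_filter(intensity, window_length=window_length,
                             polyorder=polyorder)
    # Assumption: negative values produced by smoothing are treated as zero.
    smoothed = np.clip(smoothed, 0.0, None)
    # Trapezoidal rule, written out explicitly for non-uniform rt spacing.
    widths = np.diff(rt)
    heights = 0.5 * (smoothed[1:] + smoothed[:-1])
    return float(np.sum(widths * heights))


def fit_rt_mapping(rt_run_a, rt_run_b):
    """Fit a degree-2 polynomial mapping retention times of run A to run B.

    Both inputs hold retention times of the same peptides, identified in
    both runs, in matching order.
    """
    coeffs = np.polyfit(np.asarray(rt_run_a, dtype=float),
                        np.asarray(rt_run_b, dtype=float), deg=2)
    return np.poly1d(coeffs)
```

In such a sketch, the fitted mapping would be used to predict where a peptide identified only in run A should elute in run B, so that its MS1 signal can be extracted there around the predicted retention time and integrated with `xic_peak_area`.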
By evaluating the relationship between m/z and the ratio of the nth to the 1st isotope peak intensity under theoretical isotope pattern models, we determined functions relating m/z to this ratio and built these functions for every charge state, so that the theoretical isotope pattern intensities can be calculated simply. Discarding peptides whose isotope pattern falls outside the theoretical range is an effective way to improve the accuracy of quantification. A comprehensive filter combining all of these factors adjusts the XIC-based quantification. The standardized and automated process contains modules for data format conversion, retention time alignment and regression, and a multi-parametric peptide filter. It works with the majority of mainstream high-resolution tandem mass spectrometry data and yields more precise quantification results, especially for low-abundance proteins. A comparison of this process with MaxQuant and Proteome Discoverer demonstrated the accuracy of the parametric method for protein quantification. The method offers a more accurate and intuitive way to use MS data and a more convenient way to quantify proteins.
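The idea of an isotope pattern filter can be sketched as follows. This is a hedged illustration only: the abstract does not give the form of the fitted functions, so the example substitutes a simple carbon-only binomial approximation with the averagine composition (about 4.9384 carbon atoms per 111.1254 Da) and a 13C abundance of 1.07% to estimate the ratio of the 2nd to the 1st isotope peak. The function names and the 30% relative tolerance are hypothetical.

```python
PROTON_MASS = 1.007276          # Da
CARBON_PER_DA = 4.9384 / 111.1254  # averagine carbons per Da (assumption)
P_C13 = 0.0107                  # natural abundance of 13C


def theoretical_ratio_2nd_to_1st(mz, charge):
    """Approximate the 2nd/1st isotope peak intensity ratio of a peptide.

    Carbon-only binomial model: with n carbons, the ratio of the
    probability of one 13C to that of zero 13C is n * p / (1 - p).
    For a fixed charge state this is a linear function of m/z.
    """
    neutral_mass = (mz - PROTON_MASS) * charge
    n_carbon = neutral_mass * CARBON_PER_DA
    return n_carbon * P_C13 / (1.0 - P_C13)


def passes_isotope_filter(mz, charge, observed_ratio, rel_tol=0.3):
    """Keep a peptide only if its observed ratio is near the theoretical one.

    rel_tol is an illustrative tolerance, not a value from this work.
    """
    expected = theoretical_ratio_2nd_to_1st(mz, charge)
    return abs(observed_ratio - expected) <= rel_tol * expected
```

A filter of this kind would be applied per peptide and per charge state before the XIC areas are combined into protein-level quantities. A full model would also account for the other elements and for higher isotope peaks.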