Abstract:
Automatic de novo determination of glycan structure from MS/MS data (including monosaccharide composition, sequence topology, and the linkages between adjacent monosaccharides) has been studied for many years, but interpreting glycan structure from MS data quickly and accurately remains a great challenge. Existing methods fall broadly into two classes: greedy or heuristic methods, which reduce time complexity but are inexact by nature; and exact methods, such as dynamic programming or exhaustive search, which are slower and share common problems when reconstructing candidate structures, such as repeated peak counting and crude scoring functions. These neglected details lead to inaccurate results. In this paper, we present a de novo algorithm that accurately reconstructs the glycan tree structure bottom-up from MS/MS data using only a few logical constraints, and that applies equally to N-glycans and O-glycans. Unlike previous iterative methods, the growing unit in this algorithm is not a single monosaccharide but a substructure produced during the iterative procedure, which significantly improves processing speed. The algorithm also takes the neglected details above into account. In experiments on 20 complex glycan structures extracted from human sperm, it achieved high accuracy, ranking the true structure first in 15 cases.