Recent Advances in Mucin-Type <i>O</i>-Glycosylation Proteomic Analysis

刘兆亮; 于永亮; 叶明亮

doi:10.7538/zpxb.2025.0156

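Summary: Mucin-type glycosylation, an important post-translational modification of proteins, is characterized by high heterogeneity and multisite modification. It participates in key biological processes such as cell recognition, immune response, and signal transduction, and is widespread across many tumor types. However, because mucin-type glycosylation lacks a conserved sequence motif, involves complex glycan structures, and occurs at low abundance in biological samples, its precise analysis remains highly challenging. Efficient enrichment techniques are important for capturing low-abundance O-glycopeptides from complex biological samples, and advanced mass spectrometry database search strategies are key to accurately interpreting glycopeptide structures from complex fragmentation information. This article reviews progress over the past five years in O-glycopeptide enrichment methods and the associated mass spectrometry database search strategies, and offers an outlook on future developments.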

Abstract: Mucin-type O-GalNAc glycosylation is a pivotal post-translational modification of proteins, characterized by high glycan structural heterogeneity and multisite modification. It regulates core biological processes including cellular recognition, immune response, and signal transduction; aberrations in this modification are linked to autoimmune diseases, diabetes, cardiovascular disorders, inflammation, viral infections, neurodegeneration, and cancers. Elucidating disease-specific O-glycosylation patterns will advance disease detection, help decode molecular pathogenesis, and inform the discovery of novel therapeutic targets. The bottom-up strategy is the gold standard for large-scale glycoproteomic analysis, yet comprehensive O-glycosylation profiling remains challenging because of the non-conserved flanking sequences around glycosylation sites, the complexity of glycan structures, and the low abundance of O-glycopeptides. High-efficiency enrichment techniques are critical for capturing low-abundance O-glycopeptides from complex matrices, while advanced database search strategies enable accurate interpretation of tandem mass spectrometry (MS/MS) data. This review summarizes key progress in O-glycopeptide enrichment and MS-based database search methods over the past five years. Enrichment methods have seen significant innovations: hybrid materials that integrate hydrophilic interaction liquid chromatography (HILIC) with complementary affinity techniques, such as immobilized metal ion affinity chromatography (IMAC) and boronic acid chemistry, greatly enhance enrichment efficiency. For example, Ti-IMAC materials capture sialylated O-glycopeptides from 0.1 μL of human serum and enable the identification of approximately 200 O-glycopeptides, while boronic acid-functionalized mesoporous composites applied to 1 μL of serum yield 724 N-glycopeptides and 152 O-glycopeptides.
Automated high-performance liquid chromatography (HPLC) workflows enable simultaneous N/O-glycopeptide separation, identifying 181 N-glycopeptides and 17 O-glycopeptides with significant changes in gastric cancer serum. O-Glycoprotease-based methods (OgpA/IMPa) combined with solid-phase chemoenzymatic approaches allow specific enrichment: MOTAI distinguishes Tn/sTn from other O-glycopeptides in colon cancer tissues, identifying 32 upregulated Tn/sTn glycoproteins. Bioorthogonal strategies that pair GalNAz metabolic labeling with click chemistry, such as Click-iG, identify 262 O-glycosylation sites in mouse tissues. Database search tools have overcome traditional limitations: O-Search-Pattern uses Y-ion pattern matching to boost O-glycopeptide identifications by 15.4% to 199.0% compared with other tools; MSFragger-Glyco leverages open search and ion indexing to increase identifications 4- to 6-fold and reduce analysis time to minutes; pGlyco3's glycan-first strategy enables fast, precise intact glycopeptide analysis. Machine learning approaches show promise: CandyCrunch predicts glycan structures from LC-MS/MS data with 90.3% accuracy in seconds; DeepGlyco uses tree-LSTM and graph neural networks to distinguish glycan isomers; GlyPep-Quant integrates random forests and DBSCAN to improve quantitative performance; MarkerPredict identifies cancer biomarkers through disordered protein and signal network features. These tools address bottlenecks such as low-abundance glycopeptide detection and isomer differentiation. Overall, this review provides a systematic overview of O-glycopeptide analysis methods, with the aim of guiding technical innovation, deepening understanding of disease mechanisms, and ultimately accelerating the translation of glycoproteomic insights into clinical applications.

Recent Advances in Mucin-Type O-Glycosylation Proteomic Analysis