Design and Implementation of a Mass Spectrometry Database for Halogenated Disinfection By-products in Water

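  • Summary: During drinking water disinfection, chlorine-containing disinfectants react with organic matter in the water to form toxic halogenated disinfection by-products (HDBPs), which threaten human health. To enable non-targeted screening of HDBPs in complex water samples, this study developed a comprehensive mass spectrometry data management system based on the high-resolution quadrupole time-of-flight mass spectrometer (Q-TOF MS) at the Public Technology Center of the Institute of Modern Physics, Chinese Academy of Sciences. The system is written in Python, uses MySQL for the database, and provides a graphical interface through PyQt. It supports the storage, management, querying, and analysis of mass spectrometry data, includes an efficient spectral matching algorithm for rapid identification of target compounds, and supports the entry and management of mass spectral data for multiple haloacetic acids. Non-targeted screening of haloacetic acids in water samples showed that the system achieves efficient matching in complex sample scenarios (matching similarity above 93%), confirming its accuracy and reliability.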

     

    Abstract: During drinking water disinfection, chlorine-containing disinfectants react with organic matter in the water to form a class of toxic by-products, halogenated disinfection by-products (HDBPs). These by-products are potentially carcinogenic and mutagenic and pose a serious threat to human health. As concern for drinking water safety has grown, the detection and analysis of HDBPs have become a research focus in environmental chemistry and public health. However, because water samples are complex and HDBP species are diverse, traditional detection methods face many limitations, and an efficient, accurate non-targeted screening tool is needed.

Mass spectrometry has become an important tool for detecting HDBPs because of its high resolution and high sensitivity, and the choice of database and the design of the data management system are especially critical to mass spectrometry data analysis. Existing databases fall into two main categories. Commercial databases, such as Bruker's data analysis software, are integrated into the instrument software, offer a friendly interface, and connect seamlessly to the instrument, but they are costly, hard to customize, and applicable only to specific instruments. Public databases, such as NIST and PubChem, are open and widely applicable, but their data quality is uneven and their coverage of HDBPs is insufficient. Moreover, manually comparing spectra against these databases is inefficient and error-prone, and it cannot meet the demands of high-throughput detection. Designing a dedicated mass spectrometry data management system that improves analysis efficiency and enables non-targeted screening of HDBPs has therefore become a key research goal.
To address these problems, this study developed a comprehensive mass spectrometry data management system based on the high-resolution quadrupole time-of-flight mass spectrometer (Q-TOF MS) at the Public Technology Center of the Institute of Modern Physics, Chinese Academy of Sciences. The system was developed in Python, with MySQL as the back-end database and PyQt for the graphical user interface (GUI). On this architecture, the system provides storage, management, query, and analysis functions for mass spectrometry data, together with an efficient spectral matching algorithm that identifies target compounds quickly. To validate its performance, haloacetic acids in water samples were screened in a non-targeted manner. The results showed that the system achieves efficient compound matching in complex sample scenarios, with a matching similarity of more than 93%, demonstrating its accuracy and reliability.
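The abstract does not say how matching similarity is computed. As an illustration only, the sketch below assumes a cosine similarity over peaks paired within an m/z tolerance, which is one common way to score a query spectrum against library spectra. The function names, the 0.01 tolerance, and the peak lists are hypothetical placeholders, not the system's actual algorithm or measured data.

```python
import math
from typing import Dict, List, Optional, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def cosine_similarity(query: List[Peak], reference: List[Peak],
                      tol: float = 0.01) -> float:
    """Cosine score between two peak lists.

    Each query peak is paired with the closest unused reference peak
    whose m/z lies within `tol` (an assumed tolerance); unpaired peaks
    contribute only to the norms, which lowers the score.
    """
    if not query or not reference:
        return 0.0
    used = set()
    dot = 0.0
    for mz_q, int_q in query:
        best_j, best_diff = None, tol
        for j, (mz_r, _) in enumerate(reference):
            diff = abs(mz_q - mz_r)
            if j not in used and diff <= best_diff:
                best_j, best_diff = j, diff
        if best_j is not None:
            used.add(best_j)
            dot += int_q * reference[best_j][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


def best_match(query: List[Peak], library: Dict[str, List[Peak]],
               tol: float = 0.01) -> Optional[Tuple[str, float]]:
    """Return (name, score) of the highest-scoring library entry, or None."""
    if not library:
        return None
    scored = [(name, cosine_similarity(query, peaks, tol))
              for name, peaks in library.items()]
    return max(scored, key=lambda item: item[1])


# Made-up peak lists for illustration; not real haloacetic acid spectra.
library = {
    "compound_A": [(100.002, 48.0), (150.001, 100.0)],
    "compound_B": [(120.0, 100.0), (180.0, 30.0)],
}
query = [(100.0, 50.0), (150.0, 100.0)]
name, score = best_match(query, library)
print(name, round(score, 4))
```

In a setup like the one described, the library dictionary would presumably be filled from the MySQL tables rather than hard-coded, and a score threshold could be applied before reporting a hit.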

     
