Abstract:
During drinking-water disinfection, chlorine-containing disinfectants react with organic matter in the water to form a class of toxic by-products known as halogenated disinfection by-products (HDBPs). These by-products are potentially carcinogenic and mutagenic and pose a serious threat to human health. As concern for drinking-water safety has grown in recent years, the detection and analysis of HDBPs have become a research hotspot in environmental chemistry and public health. However, because water samples are complex and HDBP species are diverse, traditional detection methods face many limitations, and developing an efficient and accurate non-targeted screening tool is key to overcoming them. Mass spectrometry has become an important tool for detecting HDBPs owing to its high resolution and high sensitivity, and the choice of database and the design of the data management system are especially critical in mass spectrometry data analysis. Existing databases fall into two main categories. The first is commercial databases, such as Bruker’s DataAnalysis, which are integrated into the instrument software, offer a friendly interface, and connect seamlessly with the instrument, but are costly, difficult to customize, and applicable only to specific instruments. The second is public databases, such as NIST and PubChem, which are open and widely applicable but have uneven data quality and insufficient coverage of HDBPs. Moreover, manually comparing spectra against these databases is inefficient and error-prone, and cannot meet the demands of high-throughput detection. A dedicated mass spectrometry data management system that improves analysis efficiency and enables non-targeted screening of HDBPs is therefore needed.
To address these problems, this study developed a comprehensive mass spectrometry data management system based on a high-resolution quadrupole time-of-flight mass spectrometer (Q-TOF MS) at the Public Technology Center of the Institute of Modern Physics, Chinese Academy of Sciences. The system is written in Python, with MySQL as the back-end database and PyQt for the graphical user interface (GUI). On this architecture, the system provides storage, management, query, and analysis functions for mass spectrometry data, and it includes an efficient mass-spectrum matching algorithm that identifies target compounds quickly. To validate the system's performance, haloacetic acid in water samples was screened in a non-targeted manner. The results show that the system achieves efficient compound matching in complex samples, with a matching similarity above 93%, demonstrating its accuracy and reliability.
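The abstract does not state how the matching similarity is computed. As an illustration only, the sketch below assumes a cosine similarity between binned spectra, a common choice for spectral library matching; the function names, the 0.01 m/z bin width, and the peak lists are hypothetical and are not taken from the system described here.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (mz, intensity) pairs; bin_width is an assumed tolerance.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(query, reference, bin_width=0.01):
    """Cosine similarity between two spectra, in the range 0.0 to 1.0."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


def best_match(query, library, bin_width=0.01):
    """Return (name, score) of the highest-scoring library spectrum.

    library maps a compound name to its reference peak list.
    Returns (None, 0.0) when the library is empty.
    """
    scored = ((name, cosine_similarity(query, ref, bin_width))
              for name, ref in library.items())
    return max(scored, key=lambda item: item[1], default=(None, 0.0))


if __name__ == "__main__":
    # Illustrative peak lists, not measured data.
    library = {
        "compound_A": [(100.0, 10.0), (150.0, 5.0)],
        "compound_B": [(200.0, 1.0)],
    }
    query = [(100.0, 9.0), (150.0, 6.0)]
    name, score = best_match(query, library)
    print(name, round(score, 3))
```

Under this assumption, a reported similarity above 93% would correspond to a score above 0.93, and in practice the reference peak lists would be read from the MySQL back end rather than from an in-memory dictionary.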