A system for identification of structural markers of Ames mutagenicity based on similarity of xenobiotic structure fingerprints
Abstract
The article addresses the assessment of the genotoxic potential of chemical compounds that may be released into the environment. It argues that advances in computer science and information technology call for a change in the main direction of development of modern toxicology. The study focuses on the in silico approach, which allows conclusions about the genotoxicity of a chemical compound to be drawn from identified functional groups that may underlie mutagenicity. The system for determining structural markers of Ames mutagenicity was built on publicly available databases of chemical compounds (EFSA, Kazius/Bursi and Hansen). The merged dataset was supplemented with mycotoxins, and duplicates were removed. For each xenobiotic in the dataset, the mutagenic potential was determined using the in vitro Ames test. To identify functional groups that may signal mutagenicity more effectively, the xenobiotics of the combined dataset were divided into five structural classes. Forming homogeneous groups of potentially genotoxic xenobiotics in this way makes it possible to identify structural markers of Ames mutagenicity within each class of mutagens. To obtain reliable information on whether a given functional group acts as a mutagenicity signal within the structural class under study, the study proposes using distance matrices calculated for each mutagen/non-mutagen pair of the combined dataset. The similarity between compounds was evaluated with classical similarity metrics (Tanimoto and Hamming) using three types of molecular fingerprints calculated for each xenobiotic.
The last stage of implementing the system for detecting structural markers of Ames mutagenicity involved finding and applying an effective algorithm for visualizing multidimensional data. Based on an analysis of the literature, the t-SNE algorithm was chosen for this task. It represents multidimensional data (distance matrices for all mutagens and non-mutagens) in two-dimensional space. This visualization makes it possible to find all mutagen/non-mutagen pairs with a sufficiently high similarity index and to draw conclusions about the functional groups that may underlie mutagenicity in each of the five structural classes of potential mutagens. The study also analyzes how effective different types of structure fingerprints are for identifying structural alerts for Ames mutagenicity. The result of the work is software that determines structural markers of Ames mutagenicity based on the similarity of the structure fingerprints of the chemical compounds in the combined dataset. The study demonstrates that the proposed approach can be used to search for cause-and-effect relationships between mutagenicity and the presence of certain functional groups in the structure of the studied xenobiotics.
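The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' software: the function names, the toy fingerprints (sets of "on" bit indices) and the 0.7 similarity threshold are assumptions made here for illustration only. The sketch computes Tanimoto similarity and Hamming distance between fingerprints, builds a mutagen/non-mutagen distance matrix, and lists the highly similar pairs together with the bits present only in the mutagen, which would be the candidates to inspect as possible structural markers.

```python
from typing import FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits set to 1 (hypothetical encoding)


def tanimoto_similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def hamming_distance(a: Fingerprint, b: Fingerprint) -> int:
    """Number of bit positions in which the two fingerprints differ."""
    return len(a ^ b)


def cross_distance_matrix(mutagens: List[Fingerprint],
                          non_mutagens: List[Fingerprint]) -> List[List[float]]:
    """Tanimoto distance (1 - similarity) for every mutagen/non-mutagen pair."""
    return [[1.0 - tanimoto_similarity(m, n) for n in non_mutagens]
            for m in mutagens]


def similar_pairs(mutagens: List[Fingerprint],
                  non_mutagens: List[Fingerprint],
                  threshold: float = 0.7) -> List[Tuple[int, int, float, List[int]]]:
    """Pairs whose similarity reaches the (assumed) threshold.

    Each result holds the mutagen index, the non-mutagen index, the similarity,
    and the bits found only in the mutagen.
    """
    pairs = []
    for i, m in enumerate(mutagens):
        for j, n in enumerate(non_mutagens):
            s = tanimoto_similarity(m, n)
            if s >= threshold:
                pairs.append((i, j, s, sorted(m - n)))
    return pairs


# Toy data, not taken from the study.
mutagens = [frozenset({1, 2, 3, 4, 9}), frozenset({5, 6, 7})]
non_mutagens = [frozenset({1, 2, 3, 4}), frozenset({10, 11})]

distances = cross_distance_matrix(mutagens, non_mutagens)
pairs = similar_pairs(mutagens, non_mutagens)
print(pairs)
```

In this toy example the first mutagen and the first non-mutagen differ in a single bit, so that bit is the only candidate flagged. In a real workflow the fingerprints would come from a cheminformatics toolkit, and a square pairwise distance matrix over all compounds could then be passed to a t-SNE implementation for the two-dimensional view.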
References
Challa, A.P., Beam, A.L., Shen, M., Peryea, T., Lavieri, R.R., Lippmann, E.S., Aronoff, D.M. (2020). Machine learning on drug-specific data to predict small molecule teratogenicity. Reproductive toxicology (Elmsford, N.Y.), 95, 148–158. https://doi.org/10.1016/j.reprotox.2020.05.004
Chu, C.S.M., Simpson, J.D., O'Neill, P.M., Berry, N.G. (2021). Machine learning - Predicting Ames mutagenicity of small molecules. Journal of molecular graphics & modelling, 109, 108011. https://doi.org/10.1016/j.jmgm.2021.108011
Cortes-Ciriano, I. (2016). Bioalerts: a python library for the derivation of structural alerts from bioactivity and toxicity data sets. Journal of cheminformatics, 8, 13. https://doi.org/10.1186/s13321-016-0125-7
Djoumbou Feunang, Y., Eisner, R., Knox, C., Chepelev, L., Hastings, J., Owen, G., Fahy, E., Steinbeck, C., Subramanian, S., Bolton, E., Greiner, R., Wishart, D.S. (2016). ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. Journal of cheminformatics, 8, 61. https://doi.org/10.1186/s13321-016-0174-y
EFSA (European Food Safety Authority) (2016). Dietary exposure assessment to pyrrolizidine alkaloids in the European population. EFSA Journal, 14(8), 4572. https://doi.org/10.2903/j.efsa.2016.4572
Floris, M., Raitano, G., Medda, R., Benfenati, E. (2017). Fragment prioritization on a large mutagenicity dataset. Molecular Informatics, 36, 1600133. https://doi.org/10.1002/minf.201600133
Hansen, K., Mika, S., Schroeter, T., Sutter, A., ter Laak, A., Steger-Hartmann, T., Heinrich, N., Müller, K.R. (2009). Benchmark data set for in silico prediction of Ames mutagenicity. Journal of chemical information and modeling, 49(9), 2077–2081. https://doi.org/10.1021/ci900161g
Helma, C., Schöning, V., Drewe, J., Boss, P. (2021). A Comparison of Nine Machine Learning Mutagenicity Models and Their Application for Predicting Pyrrolizidine Alkaloids. Frontiers in pharmacology, 12, 708050. https://doi.org/10.3389/fphar.2021.708050
Honma, M. (2020). An assessment of mutagenicity of chemical substances by (quantitative) structure-activity relationship. Genes and environment: the official journal of the Japanese Environmental Mutagen Society, 42, 23. https://doi.org/10.1186/s41021-020-00163-1
Honma, M., Kitazawa, A., Cayley, A., Williams, R.V., Barber, C., Hanser, T., Saiakhov, R., Chakravarti, S., Myatt, G.J., Cross, K.P., Benfenati, E., Raitano, G., Mekenyan, O., Petkov, P., Bossa, C., Benigni, R., Battistelli, C.L., Giuliani, A., Tcheremenskaia, O., DeMeo, C., Norinder, U., Koga, H., Jose, C., Jeliazkova, N., Kochev, N., Paskaleva, V., Yang, Ch., Daga, P.R., Clark, R.D., Rathman, J. (2019). Improvement of quantitative structure-activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project. Mutagenesis, 34(1), 3–16. https://doi.org/10.1093/mutage/gey031
Kazius, J., McGuire, R., Bursi, R. (2005). Derivation and validation of toxicophores for mutagenicity prediction. Journal of medicinal chemistry, 48(1), 312–320. https://doi.org/10.1021/jm040835a
Kazius, J., Nijssen, S., Kok, J., Bäck, T., Ijzerman, A.P. (2006). Substructure mining using elaborate chemical representation. Journal of chemical information and modeling, 46, 597–605. https://doi.org/10.1021/ci0503715
Kislyak, S., Dugan, O., Yalovenko, O. (2024). Systems for Genetic Assessment of the Impact of Environmental Factors. Innovative Biosystems and Bioengineering, 8(2), 3–27. https://doi.org/10.20535/ibb.2024.8.2.288127
Kislyak, S., Dugan, O., Yesypenko, R., Starosyla, D., Yalovenko, O. (2025). In silico the Ames Mutagenicity Predictive Model of Environment. Innovative Biosystems and Bioengineering, 9(2), 42–52. https://doi.org/10.20535/ibb.2025.9.2.316239
Lepailleur, A., Poezevara, G., Bureau, R. (2013). Automated detection of structural alerts (chemical fragments) in (eco)toxicology. Computational and structural biotechnology journal, 5, e201302013. https://doi.org/10.5936/csbj.201302013
Maggiora, G., Vogt, M., Stumpfe, D., Bajorath, J. (2014). Molecular similarity in medicinal chemistry. Journal of medicinal chemistry, 57(8), 3186–3204. https://doi.org/10.1021/jm401411z
Mao, J., Akhtar, J., Zhang, X., Sun, L., Guan, S., Li, X., Chen, G., Liu, J., Jeon, H.N., Kim, M.S., No, K.T., Wang, G. (2021). Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models. iScience, 24(9), 103052. https://doi.org/10.1016/j.isci.2021.103052
Mellor, C.L., Marchese Robinson, R.L., Benigni, R., Ebbrell, D., Enoch, S.J., Firman, J.W., Madden, J.C., Pawar, G., Yang, C., Cronin, M.T.D. (2019). Molecular fingerprint-derived similarity measures for toxicological read-across: Recommendations for optimal use. Regulatory toxicology and pharmacology: RTP, 101, 121–134. https://doi.org/10.1016/j.yrtph.2018.11.002
Mišík, M., Nersesyan, A., Ferk, F., Holzmann, K., Krupitza, G., Herrera Morales, D., Staudinger, M., Wultsch, G., Knasmueller, S. (2022). Search for the optimal genotoxicity assay for routine testing of chemicals: Sensitivity and specificity of conventional and new test systems. Mutation research. Genetic toxicology and environmental mutagenesis, 881, 503524. https://doi.org/10.1016/j.mrgentox.2022.503524
Müller, L., Mauthe, R.J., Riley, C.M., Andino, M.M., Antonis, D.D., Beels, C., DeGeorge, J., De Knaep, A.G., Ellison, D., Fagerland, J.A., Frank, R., Fritschel, B., Galloway, S., Harpur, E., Humfrey, C.D., Jacks, A.S., Jagota, N., Mackinnon, J., Mohan, G., Ness, D.K., O’Donovan, M.R., Smith, M.D., Vudathala, G., Yotti, L. (2006). A rationale for determining, testing, and controlling specific impurities in pharmaceuticals that possess potential for genotoxicity. Regulatory toxicology and pharmacology: RTP, 44(3), 198–211. https://doi.org/10.1016/j.yrtph.2005.12.001
Orlov, A.A., Akhmetshin, T.N., Horvath, D., Marcou, G., Varnek, A. (2025). From High Dimensions to Human Insight: Exploring Dimensionality Reduction for Chemical Space Visualization. Molecular informatics, 44(1), e202400265. https://doi.org/10.1002/minf.202400265
Ren, N., Atyah, M., Chen, W.Y., Zhou, C.H. (2017). The various aspects of genetic and epigenetic toxicology: testing methods and clinical applications. Journal of translational medicine, 15(1), 110. https://doi.org/10.1186/s12967-017-1218-4
Samanipour, S., O'Brien, J.W., Reid, M.J., Thomas, K.V., Praetorius, A. (2023). From Molecular Descriptors to Intrinsic Fish Toxicity of Chemicals: An Alternative Approach to Chemical Prioritization. Environmental science & technology, 57(46), 17950–17958. https://doi.org/10.1021/acs.est.2c07353
Shen, J., Cheng, F., Xu, Y., Li, W., Tang, Y. (2010). Estimation of ADME properties with substructure pattern recognition. Journal of Chemical Information and Modeling, 50, 1034–1041. https://doi.org/10.1021/ci100104j
Swamidass, S.J., Chen, J., Bruand, J., Phung, P., Ralaivola, L., Baldi, P. (2005). Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity. Bioinformatics (Oxford, England), 21 Suppl 1, i359–i368. https://doi.org/10.1093/bioinformatics/bti1055
Tolosa, J., Serrano Candelas, E., Vallés Pardo, J.L., Goya, A., Moncho, S., Gozalbes, R., Palomino Schätzlein, M. (2023). MicotoXilico: An Interactive Database to Predict Mutagenicity, Genotoxicity, and Carcinogenicity of Mycotoxins. Toxins, 15(6), 355. https://doi.org/10.3390/toxins15060355
Tubbs, A., Nussenzweig, A. (2017). Endogenous DNA Damage as a Source of Genomic Instability in Cancer. Cell, 168(4), 644–656. https://doi.org/10.1016/j.cell.2017.01.002
Turkez, H., Arslan, M.E., Ozdemir, O. (2017). Genotoxicity testing: progress and prospects for the next decade. Expert opinion on drug metabolism & toxicology, 13(10), 1089–1098. https://doi.org/10.1080/17425255.2017.1375097
Valles, G.J., Bezsonova, I., Woodgate, R., Ashton, N.W. (2020). USP7 Is a Master Regulator of Genome Stability. Frontiers in cell and developmental biology, 8, 717. https://doi.org/10.3389/fcell.2020.00717
Willett, P. (2014). The Calculation of Molecular Structural Similarity: Principles and Practice. Molecular informatics, 33(6-7), 403–413. https://doi.org/10.1002/minf.201400024
Yang, H., Sun, L., Li, W., Liu, G., Tang, Y. (2018). In Silico Prediction of Chemical Toxicity for Drug Design Using Machine Learning Methods and Structural Alerts. Frontiers in chemistry, 6, 30. https://doi.org/10.3389/fchem.2018.00030
Yang, X., Zhang, Z., Li, Q., Cai, Y. (2021). Quantitative structure-activity relationship models for genotoxicity prediction based on combination evaluation strategies for toxicological alternative experiments. Scientific reports, 11(1), 8030. https://doi.org/10.1038/s41598-021-87035-y
Baryliak, I.R., Duhan, O.M. (2002). Ecological and genetic research in Ukraine. Cytology and Genetics, 5, 3–10 (in Ukrainian)
Duhan, O.M., Yalovenko, O.I. (2006). Potential mutagenic effect of hair dye ingredients in alternative test systems. Problems of Ecological and Medical Genetics and Clinical Immunology: Collection of Scientific Papers. Kyiv–Luhansk–Kharkiv, 1(64) (in Ukrainian)
Kysliak, S.V., Holub, N.B., Duhan, O.M., Averyanova, O.A. (2023). Molecular Interaction Modeling [Electronic resource]: Textbook for Master’s Degree Students, Specialty 162 "Biotechnology and Bioengineering" (Electronic text data, 1 file, 26 MB). Kyiv: Igor Sikorsky Kyiv Polytechnic Institute (in Ukrainian)
Authors retain copyright of their work and grant the journal the right of its first publication under the terms of the Creative Commons Attribution License 4.0 International (CC BY 4.0), that allows others to share the work with an acknowledgement of the work's authorship.