PROBLEMS OF THE NEURAL NETWORKS OUTPUT DATA QUALITY ASSESSMENT
Abstract
Today, artificial intelligence, and neural networks in particular, is increasingly used in software across a variety of industries, from mission-critical fields such as healthcare and the military to commerce and entertainment. Quality control is one of the main stages in developing and deploying such software. To prevent fatal errors and to survive in a highly competitive environment, the software needs proper testing that takes into account the peculiarities of the data a neural network produces. This article discusses the relevance of artificial intelligence systems in general and neural networks in particular, and analyzes the main challenges that arise when assessing the quality of such networks. The author compares the properties of the output data of previous-generation artificial intelligence systems with those of the latest neural networks and highlights the key differences of the latter: the potentially infinite and relatively unpredictable sets of input data, the dependence of the results on the network training stage, and the subjective nature of evaluating such results. Based on this analysis, the author formulates a set of problems that can be solved using mathematical algorithms and methods. The main part of the article gives a general overview of existing solutions, with an emphasis on calculating accuracy and loss, finding the F-score, interpretation methods, and simulation modeling. The author concludes that, although enough existing solutions are available to address the highlighted problems, they still have to be improved to increase the accuracy of neural network evaluation, since one hundred percent accuracy in evaluating the data produced by neural networks has not yet been achieved.
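As a minimal illustration of three of the metrics named in the abstract (accuracy, loss, and the F-score), the following Python sketch computes them for a binary classifier. The labels, predicted probabilities, the 0.5 decision threshold, and all function names are hypothetical and are not taken from the article. Binary cross-entropy is assumed as the loss and F1 as the F-score variant; the article itself may use other definitions.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def binary_cross_entropy(y_true, probs, eps=1e-12):
    """Mean binary cross-entropy (assumed loss; lower is better)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical reference labels and network outputs (probability of class 1).
y_true = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
loss = binary_cross_entropy(y_true, probs)
print(f"accuracy={acc:.3f}  F1={f1:.3f}  loss={loss:.3f}")
```

Accuracy and F1 are computed from the thresholded labels, while the loss is computed from the raw probabilities, so the loss also reflects how confident the network was in each answer.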

This work is licensed under a Creative Commons Attribution 4.0 International License.
