EMPIRICAL EQUATION USING GMDH METHODOLOGY FOR THE CHARGED PARTICLES MULTIPLICITY DISTRIBUTION IN HADRONIC POSITRON-ELECTRON ANNIHILATION


Our results are compared with the available experimental and theoretical values.

KEYWORDS: hadronic positron-electron annihilation, charged-particles multiplicity distribution, empirical modeling, neural networks, group method of data handling (GMDH).
The charged-particle multiplicity (the total number of charged particles produced in an event) in positron-electron annihilation into multi-hadron final states is one of the most fundamental observables of the fragmentation process, during which quark-antiquark pairs are produced [1][2][3][4][5]. The interactions among the elementary particles are represented by Feynman diagrams such as those in Fig. 1. The $e^+e^-$ annihilation process is well understood as the creation of a quark-antiquark pair, the branching of this pair in accordance with perturbative quantum chromodynamics (QCD), and finally hadronization [5]. The analysis of multiplicity production is the first step towards understanding the particle production mechanism; in $e^+e^-$ annihilation in particular, it can provide additional information on hadronic final states [6][7][8][9][10][11].

Multiplicity distributions can be characterized either in terms of the probability, $P_n$, of producing $n_{ch}$ charged particles at energy $E$, i.e. $P_n(E)$, or by the moments of these distributions. The usual way of studying the charged-particle multiplicity distribution and its shape is to calculate its moments. The general behavior of the distribution is captured by the low-order moments: the mean, $\langle n \rangle$; the dispersion, $D$, which estimates the width of the distribution; the skewness, $S$, which measures how symmetric the distribution is; and the kurtosis, $K$, which measures how sharply peaked the distribution is [8]. Multiplicity distributions have been studied experimentally (see e.g. [12][13][14][15][16][17][18][19][20]) and theoretically [21][22][23][24]. However, owing to the lack of a fundamental theory, a number of phenomenological models have been proposed to characterize the charged-particle multiplicity in high-energy hadron processes [1,9], starting with the early investigations by W. Heisenberg and E. Fermi [19,20].
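The low-order moments described above can be sketched numerically as follows. This is only an illustrative sketch: the function name `multiplicity_moments` and the toy distribution are assumptions made here, not data or code from this work, and the kurtosis is taken in the common "excess" convention $K = \mu_4 / D^4 - 3$, which may differ from the convention of [8].

```python
import numpy as np

def multiplicity_moments(n_values, probabilities):
    """Return (mean, dispersion, skewness, kurtosis) of a multiplicity distribution.

    n_values      : charged multiplicities n_ch
    probabilities : P_n for each n_ch (renormalized here so that sum P_n = 1)
    """
    n = np.asarray(n_values, dtype=float)
    p = np.asarray(probabilities, dtype=float)
    p = p / p.sum()                      # enforce normalization

    mean = np.sum(n * p)                 # <n> = sum_n n P_n

    def central(k):
        # k-th central moment: sum_n (n - <n>)^k P_n
        return np.sum((n - mean) ** k * p)

    dispersion = np.sqrt(central(2))     # D: width of the distribution
    skewness = central(3) / dispersion ** 3
    kurtosis = central(4) / dispersion ** 4 - 3.0   # assumed excess-kurtosis convention
    return mean, dispersion, skewness, kurtosis

# Toy, made-up symmetric distribution (illustration only, not measured data)
n_ch = [2, 4, 6, 8, 10]
p_n = [0.1, 0.2, 0.4, 0.2, 0.1]
mean, disp, skew, kurt = multiplicity_moments(n_ch, p_n)
print(mean, disp, skew, kurt)
```

For a symmetric toy distribution such as this one the skewness vanishes, which is a quick sanity check on the implementation.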
Furthermore, because no fundamental theory describes the experimental data on the multiplicity distribution, the investigation of charged-particle production in hadronic collisions is still phenomenological and based on a wide class of models and some theoretical principles. We shall study the distribution of the number of charged particles produced in positron-electron hadronic annihilation, i.e. the charged multiplicity distribution, and the average multiplicity (the first-order moment). The probability of producing a given number of charged prongs in an inelastic collision can be written as [1,8]:

$$P_n = \frac{\sigma_n}{\sigma_{inel}}, \tag{1}$$

where $\sigma_n$ is the topological cross-section for $n$-prong events and $\sigma_{inel}$ is the total inelastic cross-section. The average charged multiplicity, $\langle n \rangle$, is defined by [9][10][11]:

$$\langle n \rangle = \sum_n n P_n = \frac{\sum_n n\,\sigma_n}{\sum_n \sigma_n}. \tag{2}$$

Recently, computational intelligence (CI) methodologies have become one of the most efficient techniques for the analysis of charged particles in positron-electron annihilation (high energy physics, "HEP", particle interactions) [25][26][27][28][29][30][31]. Generally, CI is a broad term covering a wide range of computational methodologies and approaches, such as artificial neural networks (ANNs) and genetic programming (GP), most of which are nature-inspired algorithms [30,31].
In the light of this, ANNs are nonlinear and highly flexible models that can model and analyze a wide range of nonlinear problems, including those of particle physics. Given enough data, they can approximate the underlying function of a given problem with high precision [32]. However, the main drawback of the ANN approach lies in the choice of the network architecture. In addition, the required learning time makes it difficult to use ANNs in real-time systems for modeling and regression. Group method of data handling (GMDH)-type neural networks make it possible to recover the unknown nonlinear regression in parametric form (as an empirical equation). The basic idea of the GMDH is to use feed-forward networks built from short-term polynomial transfer functions whose coefficients are obtained by regression, combined with the self-organizing behavior of neural networks [33][34][35][36].
The ANN and the GMDH are inductive algorithms able to build nonlinear connections between a set of input data and the output without the need for a complex theory [36]. The two algorithms (GMDH and ANN) are combined to form the GMDH-neural network paradigm. The GMDH can be used to represent and approximate nonlinear complex systems, and it has been found to yield accurate, simplified models for inaccurate or noisy data sets.
In the present work, we have used the GMDH-neural network (to obtain an empirical physical equation) to model and analyze the charged multiplicity distribution in hadronic positron-electron annihilation as a function of the center-of-mass energy, $\sqrt{s}$, i.e. $P(n_{ch}, \sqrt{s})$, for $\sqrt{s}$ ranging from 14 to 206 GeV [12][13][14][15][16][17][18][19][20], as well as the energy dependence of the average multiplicity, $\langle n \rangle$. The obtained results are compared with those from other models [21][22][23][24]. The success of the approach used here is promising for modeling systems in which the relationships between the interaction parameters are not well understood and for which precise data are not available.
The paper is organized as follows: the section GROUP METHOD DATA HANDLING (GMDH) gives a brief introduction to the GMDH approach. Details of the polynomial model for the charged-particle multiplicity distribution in hadronic

GROUP METHOD DATA HANDLING (GMDH)
One of the potential issues in using ANN-based approaches in any domain is the wide choice of architectures, network types, learning paradigms, layer topologies and sizes. Trial and error is often used to choose the type and topology of a network for a given problem, and this can give poor performance. GMDH neural networks can guide users in these choices and reduce the need for a priori knowledge about the model of the problem to be solved [33][34][35][36]. The GMDH is able to extract knowledge about the system under observation directly from sampled data. GMDH was developed by A.G. Ivakhnenko at the end of the 1960s for identifying nonlinear relations between input and output variables. Ivakhnenko was inspired by the form of the Kolmogorov-Gabor polynomial, which is the discrete analogue of the Volterra functional series [33][34][35][36].

The central idea behind the GMDH technique is to build a mathematical function (called a polynomial model) whose predicted output values are as close as possible to the actual ones. The GMDH problem consists of constructing a polynomial function $\hat{f}$ that approximates the actual one, $f$, so that the predicted output $\hat{y}$ for a given input vector $X = (x_1, x_2, \ldots, x_n)$ is as close as possible to the actual output $y$. Therefore, given $M$ observations of multi-input single-output pairs such that

$$y_i = f(x_{i1}, x_{i2}, \ldots, x_{in}), \qquad i = 1, 2, \ldots, M, \tag{3}$$

a GMDH-type neural network is trained in the next step to approximate the output values for a given input vector:

$$\hat{y}_i = \hat{f}(x_{i1}, x_{i2}, \ldots, x_{in}), \qquad i = 1, 2, \ldots, M. \tag{4}$$

The function $\hat{f}$ is determined so that the sum of squared differences between the predicted and actual values is minimized:

$$\sum_{i=1}^{M} \left[ \hat{f}(x_{i1}, x_{i2}, \ldots, x_{in}) - y_i \right]^2 \rightarrow \min. \tag{5}$$

Ivakhnenko employed the Kolmogorov-Gabor theorem [33], which states that every function $y = f(x)$ can be represented by an infinite Volterra-Kolmogorov-Gabor (VKG) polynomial [33][34][35][36]:

$$y = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \cdots, \tag{6}$$

where $X = (x_1, x_2, \ldots, x_n)$ is the vector of input variables and $A = (a_0, a_i, a_{ij}, a_{ijk}, \ldots)$ is the vector of coefficients (weights).
This mathematical form can be represented by a system of partial quadratic polynomials (referred to as partial descriptors, "PD"), each consisting of only two variables (neurons; each neuron is considered a partial model), as follows:

$$\hat{y} = G(x_i, x_j) = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2. \tag{7}$$

Such a PD is used recursively in the network of connected neurons to build up the general mathematical relation between the input and output variables given in equation (6). The coefficients $a_i$ in equation (7) are calculated using regression techniques during the learning process, in which the GMDH network model itself is also constructed from the experimental data. The experimental data, comprising the inputs (independent variables $x_1, x_2, \ldots, x_n$) and the output (one dependent variable $y$), are split into a training set and a testing set. During the learning process a forward multi-layer neural network is developed (see Fig. 2). The GMDH network learns in an inductive manner and builds a function (called a polynomial model) that gives the minimum error between the predicted value and the expected output. The resulting network can be represented as a polynomial of polynomials in the form of an explicit mathematical equation. This inductive approach to determining the model structure notably reduces the amount of a priori knowledge required from the user and allows the selection of the structure that best follows the given dataset. For further details, the reader is referred to [33][34][35][36].

The term CI was first proposed by Bezdek [37] and has since gained much attention. CI is a set of "nature-inspired" computational algorithms and paradigms for modeling complex real-world problems for which conventional mathematical modeling can be ineffective. It also offers automatic modeling techniques based on measurements of system behavior.
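The regression step for a single partial descriptor, as described in the GMDH discussion above, can be sketched with ordinary least squares. This is a minimal sketch under stated assumptions: the quadratic two-variable form $a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2$ is the standard GMDH partial descriptor assumed here, the helper names (`pd_design_matrix`, `fit_partial_descriptor`, `predict_partial_descriptor`) are hypothetical, and the data are synthetic. It is not the DTREG implementation used in this work.

```python
import numpy as np

def pd_design_matrix(xi, xj):
    """Columns [1, xi, xj, xi*xj, xi^2, xj^2] of the quadratic partial descriptor."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])

def fit_partial_descriptor(xi, xj, y):
    """Least-squares estimate of the six coefficients a_0..a_5."""
    design = pd_design_matrix(xi, xj)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coeffs

def predict_partial_descriptor(coeffs, xi, xj):
    """Evaluate the fitted partial descriptor at new inputs."""
    return pd_design_matrix(xi, xj) @ coeffs

# Synthetic check: data generated from known (made-up) coefficients
rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, 50)
xj = rng.uniform(-1.0, 1.0, 50)
true_a = np.array([1.0, 2.0, -1.0, 0.5, 0.3, -0.2])
y = pd_design_matrix(xi, xj) @ true_a

a_hat = fit_partial_descriptor(xi, xj, y)
print(np.round(a_hat, 6))
```

In a full GMDH network, one such fit would be made for each pair of inputs in a layer, and the outputs of the best-performing descriptors would feed the next layer.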
In the last few years, the GMDH neural network has become one of the most efficient inductive learning algorithms in the family of computational intelligence techniques for computer-based mathematical modeling of nonlinear complex phenomena. When applied to highly nonlinear complex phenomena, a GMDH neural network automatically generates a model expressed as polynomial equations. This approach has several advantages: it gives the user knowledge about the system and a means of verifying models constructed by human experts, and it saves much of the time otherwise spent on the manual derivation of model equations. The GMDH models developed in the present work mainly aim to generate mathematical functions for the prediction and analysis of the multiplicity distribution, $P(n_{ch}, \sqrt{s})$, in positron-electron hadronic annihilation. With this model, we have obtained an empirical physical equation to calculate and predict the charged-particle multiplicity distribution, which takes the form $P(n_{ch}, \sqrt{s})$. From this equation it is easy to calculate $\langle n \rangle$, since $\langle n \rangle$ is a function of $\sqrt{s}$.
To demonstrate the prediction ability and assess the generality of GMDH-type neural networks, experimental data from several collaborations [12][13][14][15][16][17][18][19][20] have been used to construct the GMDH model (empirical physical equation) for $P(n_{ch}, \sqrt{s})$. The experimental data are divided into a training (calibration) set and a testing (validation) set. A 10-fold cross-validation method [32] was used to evaluate the estimation error: all the experimental data are randomly divided into 10 folds, 9 of which are used to train the model while the remaining one is used to test it. When the learning stage is finished, the built model is used to calculate and predict the output values for data never seen during the training stage.
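The 10-fold splitting described above can be sketched as follows. The function name `k_fold_indices`, the random seed and the sample count are assumptions for illustration only; the actual partitioning performed by the software used in this work may differ in detail.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    The sample indices are shuffled once and cut into k nearly equal folds;
    each fold serves as the test set exactly once.
    """
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_samples)
    folds = np.array_split(permutation, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Illustration with a hypothetical number of data points
splits = list(k_fold_indices(25, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Averaging the test error over the 10 folds gives an estimate of how the model performs on data not used for fitting.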
Three different classes of polynomials, namely linear, quadratic and cubic, are used in the proposed model. In the functions used in the network, $q_1, q_2, \ldots$ are the coefficients of the polynomial functions (estimated by the GMDH model during the learning process) and $x_1, x_2$ are the variables. Some network parameters (free parameters) must be specified before the learning process (e.g. the maximum number of layers and the polynomial type and order). In this study, the maximum number of layers that the model may contain is set to 20, and the maximum polynomial order of a variable is set to 16. A convergence tolerance stops the training algorithm from adding a new layer once the specified value is reached and the algorithm detects that adding a layer will not improve the model. The number of neurons in each layer of the network is set equal to the number of neurons in the input layer. The network layer connection method controls how the neurons in the network are connected together. In this paper, only connections to the previous layer are allowed (this tells DTREG [38] that the inputs to one layer may come only from the outputs generated by the next lower layer).
Based on the minimum error performance on the validation sets, the corresponding polynomial representation of the present GMDH model for $P(n_{ch}, \sqrt{s})$ is obtained empirically as equation (8), and the center-of-mass energy dependence of the average charged-particle multiplicity, $\langle n \rangle$, according to our GMDH model is obtained as equation (9). The root mean square error (RMSE) and the coefficient of determination ($R^2$) are used as criteria for evaluating the performance of the GMDH models.
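The two performance criteria named above can be computed as in the following sketch. The definitions used are the standard ones, $\mathrm{RMSE} = \sqrt{\frac{1}{M}\sum_i (y_i - \hat{y}_i)^2}$ and $R^2 = 1 - \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$, which are assumed here to match those used by the modeling software; the function names and the numbers are illustrative only.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up observed and fitted values (illustration only)
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.1, 1.9, 3.2, 3.8]
error = rmse(y_obs, y_fit)
r2 = r_squared(y_obs, y_fit)
print(error, r2)
```

A value of $R^2$ close to 1 together with a small RMSE on the validation folds indicates that the polynomial model reproduces the data well.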

RESULTS AND DISCUSSION
Hadron production in positron-electron annihilation proceeds through the creation of quark-antiquark pairs, which can radiate gluons, the exchange particles of QCD, the field theory of the strong interaction. Gluon production depends on the center-of-mass energy. Based on the GMDH, the general form of the empirical equation can be represented by a system of partial descriptions using polynomials of some order, such as linear, quadratic and cubic equations (see equations (8) and (9)). In this paper, using the group method of data handling (GMDH), we have constructed a mathematical functional form for the charged-particle multiplicity distribution of $e^+e^-$ hadronic annihilation. The GMDH model is developed by running the DTREG software [38]. The simulations are run on a Pentium 4 machine (1.4 GHz, 2 GB RAM) under Windows XP. With this configuration, the predictive results are obtained in a few seconds. A 10-fold cross-validation is used to create and test the model [32].
The proposed GMDH model is tested after training. The agreement shown in Fig. 3 supports the wider use of the GMDH for modeling in high energy physics. Figure 3 shows the calculated charged multiplicity distribution at $\sqrt{s}$ = 14, 29, 34.8, 43.6, ..., 206 GeV, compared with the corresponding experimental and theoretical ones [21][22][23][24]. The comparison with the charged-particle multiplicity distribution calculated by the generalized multiplicity distribution (GMD) model for $e^+e^-$ collisions at $\sqrt{s}$ = 14 and 206 GeV is also shown in Fig. 3. This figure shows that the proposed model is in good agreement with the experimental [12][13][14][15][16][17][18][19][20] and theoretical [21][22][23][24] data, especially at 183 and 206 GeV. Fig. 4 compares the multiplicity distributions for different values of $\sqrt{s}$. From this figure, we notice that the maximum probability of producing $n_{ch}$ charged particles decreases as $\sqrt{s}$ increases and shifts towards larger $n_{ch}$. Further, we notice that the distribution broadens as $\sqrt{s}$ increases.

Based on the obtained equations (eqs. (8) and (9)), the average multiplicity of charged particles, $\langle n \rangle$, has been calculated for $e^+e^-$ interactions up to the highest centre-of-mass energy of 206 GeV. When compared with other models, the calculated values are consistent with the evolution predicted by the GMD method [21][22][23][24]. At $\sqrt{s}$ = 206 GeV, $\langle n \rangle$ = 27.5, and similarly for the lower energies. The results indicate that the proposed GMDH model has learned the nonlinear behavior of the charged multiplicity distribution and of the average charged multiplicity in $e^+e^-$ collisions well.

CONCLUSIONS
We have obtained an empirical physical equation describing the charged-particle multiplicity distribution in hadronic positron-electron annihilation, based on the GMDH approach. We have used the obtained equation to calculate and predict $P(n_{ch}, \sqrt{s})$ over the energy range from 14 GeV to 206 GeV. The average multiplicity, $\langle n \rangle$, is calculated using our empirical physical equation. Our results are in good agreement with the experimental and theoretical ones, which supports the reliability of our models. Although our GMDH model presents an empirical equation that expresses the physical phenomenon in mathematical form, we are still faced with the challenge of justifying the terms of this equation and interpreting their meaning. Such models define input-output relations based on experimental data and use mathematical and statistical concepts to link the inputs to the modeled output. Scientists may use such approaches to focus on interesting phenomena more rapidly and to interpret their meaning. This represents one of the key challenges for computational intelligence techniques in high energy physics modeling.