PID Generation

Online Particle Identifier Number Generator Algorithms created with Artificial Neural Networks for Particle Identification




Background

The ALICE Transition Radiation Detector (TRD) at CERN is planned for a major upgrade in 2017/18. One of the main tasks of the ALICE TRD is the identification of particles, mainly electrons and pions. The upgrade will increase the interaction rate of particles, which means new particle identification algorithms need to be developed. These algorithms are divided into online and offline algorithms. The online algorithms determine the most effective data to be stored by the TRD, while the offline algorithms use the stored data to classify the particles.
The focus of this section is the development of the online algorithm to create an 8-bit value known as a Particle Identifier (PID) number. The PID number is used to represent a particle’s tracklet obtained from the TRD.

Aim

The new hardware limitations of ALICE require the development of a new online storage algorithm. A new, efficient and effective algorithm is therefore needed to store PID numbers in place of the current tracklet information.

This project aims to determine whether artificial neural networks are better suited to PID generation than a summation approach. Additionally, different neural network structures are tested to determine which is most effective for the online PID generation algorithm.

Approach

The summation approach and the artificial neural network approach are compared on their pion efficiencies and PID frequency distributions, both obtained from simulated data created with AliRoot. The pion efficiency is the fraction of pions misclassified as electrons at a fixed electron efficiency. The electron efficiency generally used is 90%.
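
The pion efficiency described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes that larger PID numbers are more electron-like and that a particle is accepted as an electron when its PID is at or above a threshold. The function name pion_efficiency and its parameters are hypothetical.

```python
import numpy as np


def pion_efficiency(electron_pid, pion_pid, electron_eff=0.90):
    """Return (threshold, pion efficiency) at a fixed electron efficiency.

    Assumption: larger 8-bit PID numbers are more electron-like, so a
    particle counts as an electron when its PID is >= threshold.
    """
    electron_pid = np.asarray(electron_pid)
    pion_pid = np.asarray(pion_pid)
    # Scan the 8-bit thresholds from high to low and keep the highest one
    # at which at least `electron_eff` of the electrons are accepted.
    for threshold in range(255, -1, -1):
        if np.mean(electron_pid >= threshold) >= electron_eff:
            # Fraction of pions that pass the same cut (misclassified).
            return threshold, float(np.mean(pion_pid >= threshold))
    raise ValueError("electron_eff must be at most 1.0")


# Toy PID values for illustration only, not simulation results.
toy_electrons = np.arange(100, 200)
toy_pions = np.arange(0, 200, 2)
toy_threshold, toy_pion_eff = pion_efficiency(toy_electrons, toy_pions)
```

Because PID numbers are discrete, the achieved electron efficiency may sit slightly above the requested 90%; the sketch picks the tightest cut that still reaches it.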

The artificial neural network approach to creating 8-bit PID numbers is divided into two variants. The first creates a single 8-bit PID number and the second creates two 4-bit PID numbers. Each artificial neural network is tested with different input types: the raw tracklet information, preprocessed variables, or both. The preprocessed variables are based on observations from the average tracklet results of thousands of particles.
The artificial neural networks are trained by supervised learning with the backpropagation algorithm. The training data come from particles simulated with AliRoot.
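
The two output formats can be illustrated with a short sketch. It rests on assumptions that the text does not state: that each network output lies in the range [0, 1] (for example, from a sigmoid output node), that it is scaled linearly to an integer, and that the two 4-bit numbers share one byte with the first in the upper four bits. All function names are hypothetical.

```python
def quantize_to_pid(output, bits=8):
    """Map a network output in [0, 1] to an unsigned integer of `bits` bits.

    Assumption: linear scaling with clipping; the project's actual mapping
    is not specified in the text.
    """
    max_value = (1 << bits) - 1
    clipped = min(max(float(output), 0.0), 1.0)
    return int(round(clipped * max_value))


def pack_two_4bit(high, low):
    """Pack two 4-bit PID numbers into one 8-bit value (assumed layout)."""
    if not (0 <= high <= 15 and 0 <= low <= 15):
        raise ValueError("each PID number must fit in 4 bits")
    return (high << 4) | low


def unpack_two_4bit(pid):
    """Recover the two 4-bit PID numbers from an 8-bit value."""
    return (pid >> 4) & 0xF, pid & 0xF


# Single 8-bit PID from one output, or two 4-bit PIDs from two outputs.
single_pid = quantize_to_pid(1.0)
packed_pid = pack_two_4bit(quantize_to_pid(1.0, bits=4),
                           quantize_to_pid(0.0, bits=4))
```

Either way the stored value occupies eight bits; the difference is whether those bits encode one quantity at fine resolution or two quantities at coarse resolution.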

Figure 1. Histogram representing the average ADC signal for each time bin. Electrons are represented in red and pions in blue. The average result is obtained from 2000 particles.

Results

In this project, the summation approach and artificial neural network approach have been tested on simulated data to determine which will be better suited to the upgraded system at ALICE.

The summation approach was found to produce worse results than the neural network approach. With the summation approach, the PID numbers of both electrons and pions tended to cluster at low values. The pion efficiency produced with the summation approach was 57.7% at an electron efficiency of 90%, which is high. The artificial neural network approach produced pion efficiencies between 40% and 55%, depending on the network used. The neural network approach also improved on the summation approach by creating PID numbers that spanned the whole PID range and gave a clear division between pions and electrons.

Figure 2. Pion efficiencies of the best networks for each input type. The blue indicates the one 8-bit PID approach and the red indicates the two 4-bit PID approach.

The artificial neural network approach that created two 4-bit PID numbers showed improved results compared to the summation approach, as the pion efficiencies were generally lower.
Additionally, using both preprocessed variables and raw tracklet data as inputs gave better results than using only the raw tracklet data or only the preprocessed variables.

Figure 3. PID frequency distribution of the summation approach (blue) and an example ANN approach (red). The light blue represents the pions and dark blue represents electrons. The dark red represents the electrons and light red represents the pions.

Conclusions

The results from simulated particle data showed that the approach using artificial neural networks produced better pion efficiencies and PID frequency distributions than the summation approach.
The artificial neural network that took both preprocessed variables and raw tracklet data as inputs and created two 4-bit PID numbers showed improved results compared to the other networks tested, and has potential for implementation in online PID generation algorithms.