Question answering system for deterministic fault diagnosis of intelligent railway signal equipment

Purpose – Failure diagnosis of railway signal equipment is vital to keeping the railway system operating safely. One of the main difficulties in signal equipment failure diagnosis is the uncertainty of the causal relationship between the consequence and the cause of an accident. The traditional approach to this problem is based on the Bayesian Network, which requires rigid independence assumptions and prior probability knowledge but ignores the semantic relationships in causality analysis. This paper aims to resolve the uncertainty of causality in signal equipment failure diagnosis in a new way that emphasises mining semantic relationships.
Design/methodology/approach – This study proposes a deterministic failure diagnosis (DFD) model based on a question answering system to diagnose railway signal equipment failures. It comprises a failure diagnosis module and a deterministic diagnosis module. In the failure diagnosis module, the question answering system is used to recognise the causes of failure consequences. The question answering system is composed of multi-layer neural networks: the lower layers extract the position and part-of-speech features of the text, the higher layers acquire contextual and interactive features through Bi-LSTM and Match-LSTM, respectively, and the proposed enhanced boundary unit then generates the candidate failure cause set. In the second module, the candidate failure cause set is ranked by the semantic matching mechanism (SMM), and the candidate with the highest semantic matching degree is chosen as the deterministic failure cause.
Findings – Experiments on a real data set of railway maintenance signal equipment show that the proposed DFD model can implement deterministic diagnosis of railway signal equipment failure. Compared with a range of existing methods, the model achieves state-of-the-art results in natural semantic understanding for the railway signal equipment diagnosis domain.
Originality/value – This is the first time a question answering system has been used for signal equipment failure diagnosis, which makes failure diagnosis more intelligent than before. The EBU enables the DFD


Introduction
Railway signal equipment is a general term that covers railway signals, station interlocking and section block equipment (Yang et al., 2018). It is critical to the safety of train and shunting operations. Railway bureaus have accumulated a large amount of railway signal equipment fault data, but mining the causal relationship between fault consequences and causes from unstructured text is difficult. Fault cause analysis is complex because it involves human, machine, environmental and management factors, any one of which could lead to a railway signal equipment fault. This uncertainty about fault causes makes signal equipment fault diagnosis harder. Existing methods for diagnosing railway signal equipment faults rely mainly on expert experience, which cannot keep pace with the rapidly growing data volume. An intelligent method for identifying the deterministic fault cause in the big data era is urgently required. Many existing machine learning methods based on text mining have been applied to railway signal equipment failure diagnosis. To detect failures of the on-board equipment of high-speed rail signalling systems, Zhao et al. (2014) proposed a failure diagnosis method based on the Bayesian Network to deal with the uncertainty and complexity of failures. To address the poor automation and low efficiency of failure diagnosis, Liang et al. (2017) combined rough set theory and the Bayesian Network for failure detection in high-speed railway train control systems. The drawbacks of the Bayesian Network are that it requires rigid independence assumptions, prior probability knowledge and a reasonable network structure, conditions that are too strict to satisfy in real circumstances. Yang et al. (2018) proposed an intelligent classification model to solve the unbalanced failure text classification problem in the railway signal equipment domain, but did not address the uncertainty problem.
Zhang et al. (2009) implemented an expert system for failure diagnosis of station signal control equipment. However, its poor learning ability and fault tolerance hamper failure detection. In this paper, we propose a deterministic failure diagnosis (DFD) model based on deep learning to resolve the uncertainty of causality in failure diagnosis of railway signal equipment.
The uncertainty in failure diagnosis arises from the causal relationship between the failure consequence and the failure causes: one failure consequence may correspond to multiple causes hidden in different long texts. To solve this uncertainty problem in the failure diagnosis of railway signal equipment, we adopt a question answering system to generate candidate failure causes and a semantic matching mechanism (SMM) to deal with the uncertainty. In the question answering system module, we aim to find candidate failure causes that are hidden in the long continuous text sequences of the fault overview paragraph. Extracting specific long continuous text sequences is hard for a conventional question answering system because the boundary model works inefficiently (Wang and Jiang, 2017). Therefore, we propose an enhanced boundary unit (EBU) to accurately locate the failure cause as a continuous long text sequence in the fault overview paragraph. The EBU adds the position label of the middle participle of the fault cause, which plays an important role in predicting the locations of the beginning and ending participles and also serves as a boundary check. In the DFD module, we exploit the SMM to implement DFD of railway signal equipment. Traditional question answering systems generate candidate answers with a statistical ranking method that sorts them by the TF-IDF similarity of the centroid vector (Xu et al., 2005). The SMM provides candidate answers by calculating the degree of semantic relevance between questions and article sentences (Zhang et al., 2020). In web search, candidate answer documents are ranked by the semantic matching degree between the query and the documents (Huang et al., 2018). We therefore choose the SMM to calculate the semantic similarity between the failure consequence and the candidate causes, and take the candidate with the highest semantic matching degree as the deterministic failure cause.
The main contributions of this paper are as follows. First, this is the first time a question answering system has been used for signal equipment failure diagnosis, which provides a new direction for diagnosing railway signal equipment failures from large-scale unstructured data. Second, we propose an EBU that improves on the traditional boundary model and locates the candidate failure causes in a given failure overview paragraph more effectively. Third, the SMM is applied to resolve the uncertainty problem in failure diagnosis of railway signal equipment by identifying the real failure cause from a natural semantic perspective.

Task definition
To make the treatment of railway signal equipment failure diagnosis more general, we define the problem formally. It consists of uncertain answer generation, semantic matching degree calculation and deterministic failure cause detection.
Definition 1. Uncertain answer generation acquires, through the question answering system, the candidate causes of a given railway signal equipment failure consequence from the failure description text. If the failure consequence is represented as $Q$ and the failure description text as $S$, then the candidate causes of the failure form a set $A$:
$$A = \{a_1, a_2, \ldots, a_n\} \quad (1)$$
where $a_i$ is a text sequence of a candidate failure cause hidden in the failure overview paragraph set.
Definition 2. Semantic matching degree calculation captures the natural semantic relationship between the failure consequence and the causes:
$$d_i = \mathrm{match}(Q, a_i) \quad (2)$$
where the match() function calculates the semantic matching degree of $Q$ and $a_i$.
Definition 3. Deterministic failure cause detection takes the candidate cause whose semantic matching degree ranks first:
$$a = \arg\max_{a_i \in A} \mathrm{match}(Q, a_i) \quad (3)$$
where $a$ is the deterministic failure cause that corresponds to the failure consequence $Q$.
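The three definitions can be read as a small pipeline: generate candidates, score each against the failure consequence, keep the top one. The sketch below illustrates only that control flow. The names `diagnose`, `toy_generate` and `toy_match` are hypothetical, and the toy stand-ins (splitting on semicolons, token-overlap scoring) are assumptions for illustration, not the question answering system or the SMM described in this paper.

```python
from typing import Callable, List, Tuple


def diagnose(
    q: str,
    s: str,
    generate: Callable[[str, str], List[str]],
    match: Callable[[str, str], float],
) -> Tuple[str, List[Tuple[str, float]]]:
    """Return the top-ranked cause and the full ranking for consequence q."""
    candidates = generate(q, s)  # Definition 1: candidate set A
    if not candidates:
        raise ValueError("no candidate failure causes were generated")
    # Definition 2: semantic matching degree of Q and each a_i
    ranked = [(a, match(q, a)) for a in candidates]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    # Definition 3: the candidate whose matching degree ranks first
    return ranked[0][0], ranked


def toy_generate(q: str, s: str) -> List[str]:
    """Hypothetical stand-in: treat each ';'-separated clause as a candidate."""
    return [part.strip() for part in s.split(";") if part.strip()]


def toy_match(q: str, a: str) -> float:
    """Hypothetical stand-in: Jaccard overlap of lower-cased tokens."""
    tq, ta = set(q.lower().split()), set(a.lower().split())
    union = tq | ta
    return len(tq & ta) / len(union) if union else 0.0


best, ranking = diagnose(
    "signal lamp off",
    "track circuit short; signal lamp filament broken; power loss",
    toy_generate,
    toy_match,
)
```

Passing `generate` and `match` as arguments keeps the sketch independent of any particular model, so a trained question answering module and a learned matching function could be substituted without changing the selection logic.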

Methodology
The DFD model includes the failure diagnosis module and the DFD module. The failure diagnosis module generates candidate failure causes and the DFD module selects the failure cause with the highest semantic matching degree as the deterministic failure cause, which solves the uncertainty problem in failure diagnosis of railway signal equipment semantically. The flow chart of the DFD model is shown in Figure 1.

Failure diagnosis module
The failure diagnosis module includes a word embedding layer, a context feature extraction layer, an interactive feature extraction layer and a failure modelling layer. The failure diagnosis question answering system module is shown in Figure 2.
3.1.1 Word embedding layer. Part-of-speech, shallow semantic and position features of participles can improve the performance of the question answering system. In the word embedding layer, these features are encoded for each participle. Therefore, the failure overview paragraph text and the failure consequence text can be vectorised: $P \in \mathbb{R}^{l \times |P|_{\max}}$ is the failure overview paragraph embedded matrix and $Q \in \mathbb{R}^{l \times |Q|_{\max}}$ is the failure consequence embedded matrix. Here $l$ is the embedding dimension, $|P|_{\max}$ is the maximum number of participles in the failure overview paragraph and $|Q|_{\max}$ is the maximum number of participles in the failure consequence.
3.1.2 Context feature extraction layer. The contextual characteristics of the failure text can improve the performance of the question answering system. Bi-LSTM can extract the contextual characteristics of the failure text automatically (Matthew et al., 2018). Thus, the context feature extraction layer adopts Bi-LSTM to extract the context features of each participle from the forward and backward directions of the failure text automatically:
$$H^q = \mathrm{BiLSTM}(Q), \qquad H^p = \mathrm{BiLSTM}(P)$$
where $H^q \in \mathbb{R}^{2l \times |Q|_{\max}}$ is the context semantic embedded matrix of the failure consequence and $H^p \in \mathbb{R}^{2l \times |P|_{\max}}$ is the context semantic embedded matrix of the failure overview paragraph.
3.1.3 Interactive feature extraction layer. The interactive feature refers to the semantic feature between the failure consequence and the failure overview paragraph. The interactive feature extraction layer adopts a bi-directional Match-LSTM to extract the interactive features between the failure consequence and the failure overview paragraph. The forward Match-LSTM goes through the failure overview paragraph sequentially. At position $t$ of the failure overview paragraph, the standard word-by-word attention mechanism (Vaswani et al., 2017) is used to obtain the attention weight vector $\overrightarrow{\alpha}_t$. In this computation, $W^q, W^p, W^r \in \mathbb{R}^{l \times 2l}$ and $b^p \in \mathbb{R}$ are parameters to be learned; $\overrightarrow{h}^r_{t-1}$ is the hidden vector of the forward Match-LSTM at position $t-1$ of the failure overview paragraph; and $(\cdot \otimes e_Q)$ produces a matrix or row vector by repeating the vector or scalar on its left $Q$ times. The attention weight between the $t$th participle in the failure overview paragraph and the $j$th participle in the failure consequence is the $j$th entry $\overrightarrow{\alpha}_{t,j}$ of this vector ($1 \le j \le |Q|_{\max}$). Subsequently, the attention-weighted average of the failure consequence, $H^q \overrightarrow{\alpha}_t^{\top}$, is combined with $h^p_t$ to form the vector $\overrightarrow{z}_t \in \mathbb{R}^{4l \times 1}$:
$$\overrightarrow{z}_t = \begin{bmatrix} h^p_t \\ H^q \overrightarrow{\alpha}_t^{\top} \end{bmatrix}$$
where $\overrightarrow{\alpha}_t \in \mathbb{R}^{1 \times |Q|_{\max}}$. The vector $\overrightarrow{z}_t$ is fed into a forward LSTM to form the so-called Match-LSTM:
$$\overrightarrow{h}^r_t = \overrightarrow{\mathrm{LSTM}}\left(\overrightarrow{z}_t, \overrightarrow{h}^r_{t-1}\right)$$
where $\overrightarrow{h}^r_t \in \mathbb{R}^{l \times 1}$. To obtain the bidirectional representation of each participle in the failure overview paragraph after matching, a backward Match-LSTM, which has the same form as the forward one, is also used.
3.1.4 Failure modelling layer. The failure modelling layer uses the proposed EBU to model the causal relationship of railway signal equipment failure. Building on the boundary model, the EBU adds the position label of the middle participle of the failure cause in the given failure overview paragraph. This addresses the difficulty the boundary model has in predicting a long participle sequence of the failure cause.
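For the interactive feature extraction layer of Section 3.1.3, the step that turns attention scores into the Match-LSTM input can be sketched in a few lines. This is a minimal illustration under stated assumptions: the raw attention scores are taken as given (the scoring network is not modelled here), vectors are plain Python lists, and the function names `softmax` and `match_input` are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_input(
    h_p_t: List[float],
    H_q: List[List[float]],
    scores: List[float],
) -> Tuple[List[float], List[float]]:
    """Build z_t = [h_p_t ; H_q alpha_t^T] for one paragraph position t.

    h_p_t  : context vector of the t-th paragraph participle
    H_q    : one context vector per failure-consequence participle
    scores : one raw attention score per failure-consequence participle
             (assumed to come from the attention network)
    """
    alpha = softmax(scores)  # attention weights over the failure consequence
    dim = len(H_q[0])
    # Attention-weighted average of the failure-consequence vectors
    weighted = [
        sum(a * column[d] for a, column in zip(alpha, H_q)) for d in range(dim)
    ]
    # Concatenation doubles the dimension (2l + 2l = 4l in the paper's notation)
    return alpha, h_p_t + weighted


alpha, z_t = match_input(
    h_p_t=[0.5, 0.5],
    H_q=[[1.0, 0.0], [0.0, 1.0]],
    scores=[0.0, 0.0],
)
```

With equal scores the weights are uniform, so the weighted average is simply the mean of the failure-consequence vectors, and `z_t` has twice the length of `h_p_t`.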
The middle participle of the failure cause not only conditions the prediction of the beginning and ending participles but also serves as a boundary check in the prediction of the failure cause. The EBU is shown in Figure 3. Here $a$ is the text sequence of the failure cause; $a_s$, $a_m$ and $a_e$ are the positions of the beginning, middle and ending participles in the failure overview paragraph, respectively; and $\beta_k \in \mathbb{R}^{1 \times |P|_{\max}}$ is the attention weight vector obtained through the attention mechanism. Let $a = (c_1, c_2, c_3)$ with $c_1 = a_m$, $c_2 = a_s$ and $c_3 = a_e$. Then $\beta_{k,j}$ ($1 \le k \le 3$, $1 \le j \le |P|_{\max}$; $k \in \mathbb{N}^+$, $j \in \mathbb{N}^+$) is the probability that the $j$th participle in the failure overview paragraph is $c_k$. $\beta_k$ is calculated as:
$$F_k = \tanh\left(V H^r + \left(W^a \overrightarrow{h}^a_{k-1} + b^a\right) \otimes e_P\right)$$
$$\beta_k = \mathrm{softmax}\left(v^{\top} F_k + c \otimes e_P\right)$$
where $V \in \mathbb{R}^{l \times 2l}$, $W^a \in \mathbb{R}^{l \times l}$, $b^a, v \in \mathbb{R}^{l \times 1}$ and $c \in \mathbb{R}$ are the parameters to be learned; $F_k \in \mathbb{R}^{l \times |P|_{\max}}$ is the intermediate result; $(\cdot \otimes e_P)$ produces a matrix or row vector by repeating the vector or scalar on its left $P$ times; $H^r$ is the paragraph representation produced by the interactive feature extraction layer; and $\overrightarrow{h}^a_{k-1} \in \mathbb{R}^{l \times 1}$ is the LSTM hidden layer state that corresponds to position $k-1$ of the failure cause. $\overrightarrow{h}^a_0$ is randomly initialised and $\overrightarrow{h}^a_k$ is defined as follows:
$$\overrightarrow{h}^a_k = \overrightarrow{\mathrm{LSTM}}\left(H^r \beta_k^{\top}, \overrightarrow{h}^a_{k-1}\right)$$
The EBU first predicts the position of the middle participle of the failure cause in the failure overview paragraph. Next, the position of the middle participle and $H^r$ are used as preconditions to predict the position of the beginning participle of the failure cause. Finally, the positions of the middle and beginning participles and $H^r$ are used as preconditions to predict the position of the ending participle in the failure overview paragraph.
When predicting the failure cause, $a_s$, $a_m$ and $a_e$ should satisfy the following boundary check condition, where $\delta \in \mathbb{N}^+$ is the boundary check factor:
$$\frac{a_s + a_e}{2} - a_m \le \delta$$
To train the model, we minimise a loss function $L(\theta)$ over the training examples, where $\theta$ denotes all the parameters to be learned in the failure diagnosis question answering system module, $P_i^k$ is the text sequence of the $k$th failure overview paragraph that corresponds to the $i$th failure consequence, $Q_i$ is the text sequence of the $i$th failure consequence and $A_i^k$ is the text sequence of the $k$th failure cause that corresponds to the $i$th failure consequence. The goal is to find the parameter $\theta$ that minimises $L(\theta)$.
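The boundary check can be illustrated with a small decoding sketch. Two assumptions go beyond the text and are marked in the code: the check is applied to the absolute deviation of the middle position from the span midpoint, and the ordering $a_s \le a_m \le a_e$ is enforced. The three probability vectors are also treated as fixed inputs, whereas the EBU predicts the positions sequentially, each conditioned on the previous ones. The function names are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple


def passes_boundary_check(a_s: int, a_m: int, a_e: int, delta: int) -> bool:
    """Check that the middle position is consistent with the span boundaries."""
    # Assumption: symmetric check (absolute deviation from the midpoint)
    # plus the ordering a_s <= a_m <= a_e; the text states only the bound.
    return a_s <= a_m <= a_e and abs((a_s + a_e) / 2 - a_m) <= delta


def decode_span(
    beta_m: List[float],
    beta_s: List[float],
    beta_e: List[float],
    delta: int,
) -> Tuple[Optional[Tuple[int, int, int]], float]:
    """Pick (a_s, a_m, a_e) with the highest joint probability that passes the check.

    Simplification: the three distributions are fixed vectors over paragraph
    positions, so the search is a plain exhaustive scan.
    """
    best: Optional[Tuple[int, int, int]] = None
    best_score = -1.0
    positions = range(len(beta_m))
    for a_s, a_m, a_e in product(positions, repeat=3):
        if not passes_boundary_check(a_s, a_m, a_e, delta):
            continue
        score = beta_s[a_s] * beta_m[a_m] * beta_e[a_e]
        if score > best_score:
            best, best_score = (a_s, a_m, a_e), score
    return best, best_score


span, score = decode_span(
    beta_m=[0.1, 0.1, 0.6, 0.1, 0.1],
    beta_s=[0.7, 0.1, 0.1, 0.05, 0.05],
    beta_e=[0.05, 0.05, 0.1, 0.1, 0.7],
    delta=1,
)
```

The exhaustive scan is cubic in the paragraph length, which is acceptable for a sketch but would normally be replaced by a search restricted to plausible span lengths.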

Deterministic failure diagnosis module
The failure diagnosis module produces the candidate failure cause set $A$ for the failure consequence $Q$. The DFD module selects the deterministic failure cause of $Q$ from $A$ semantically: it uses the SMM to sort $A$, and the failure cause with the highest semantic matching degree in $A$ is selected as the deterministic failure cause. Because the failure diagnosis module can model causality efficiently, the DFD module adopts the word embedding layer and context feature extraction layer of the failure diagnosis module to encode $(Q, A)$ semantically. Here $A = \{a_1, a_2, \ldots, a_n\}$ is the candidate failure cause set for $Q$, $Q$ is the text sequence of the failure consequence and $n \in \mathbb{N}^+$ is the number of candidate failure causes of $Q$.
The encoding gives $E_Q \in \mathbb{R}^{|Q|_{\max} \times 2l}$, the semantic encoding of the failure consequence $Q$, and $E_{a_i} \in \mathbb{R}^{|a|_{\max} \times 2l}$, the semantic encoding of the $i$th candidate failure cause. Here $|Q|_{\max}$ is the maximum number of participles in the failure consequence, $|a|_{\max}$ is the maximum number of participles in the failure cause and $l$ is the number of dimensions. The semantic matching degree between the failure consequence and the $i$th candidate failure cause is calculated as $d_i \in \mathbb{R}$, the semantic matching degree between $E_Q$ and $E_{a_i}$.
The rank function sorts $A$ in descending order of semantic matching degree to obtain the set $Sort_A$. The Top function then selects the candidate failure cause with the highest semantic matching degree in $Sort_A$, and this candidate $a$ is the deterministic failure cause of $Q$.
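To make the ranking step concrete, the sketch below scores each candidate encoding against the failure consequence encoding and sorts the candidates. The scoring rule used here, mean pooling followed by cosine similarity, is an assumption chosen only because it is a common way to compare two variable-length encodings. It is a stand-in and should not be read as the SMM's own matching function. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # one row per participle, one column per feature


def mean_pool(encoding: Matrix) -> List[float]:
    """Average the rows of an encoding matrix into a single vector."""
    n_rows = len(encoding)
    n_cols = len(encoding[0])
    return [sum(row[d] for row in encoding) / n_rows for d in range(n_cols)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(E_Q: Matrix, E_candidates: List[Matrix]) -> Tuple[List[int], List[float]]:
    """Return candidate indices in descending order of matching degree, plus the degrees."""
    q_vec = mean_pool(E_Q)
    # Assumed matching degree d_i: cosine of the pooled encodings
    degrees = [cosine(q_vec, mean_pool(E_a)) for E_a in E_candidates]
    order = sorted(range(len(degrees)), key=lambda i: degrees[i], reverse=True)
    return order, degrees


order, degrees = rank_candidates(
    E_Q=[[1.0, 0.0], [1.0, 0.0]],
    E_candidates=[
        [[0.0, 1.0]],              # orthogonal to the consequence
        [[1.0, 0.0], [1.0, 1.0]],  # mostly aligned with the consequence
        [[-1.0, 0.0]],             # opposite direction
    ],
)
top_index = order[0]  # plays the role of the Top function
```

Returning indices rather than the encodings themselves makes it straightforward to map the top-ranked entry back to its candidate text.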

Experiments
To verify the validity of the proposed DFD model for failure diagnosis of railway signal equipment, we test it on the railway maintenance signal equipment (RMSE) data sets. The experimental environment configuration is shown in Table 1.

Data set
The RMSE data sets are derived from the failure text data of railway signalling equipment in a railway bureau. In these data sets, a failure consequence may correspond to multiple failure causes in several failure overview paragraphs. They contain 6,000 question-and-answer pairs. The failure overview paragraph P is the original failure description text, the failure consequence Q is a question posed by the annotator according to P, and the failure cause A is a continuous long participle sequence in P. Of the 6,000 pairs, 5,000 are used for training and 1,000 for testing. BLEU-4 and Rouge-L are used to evaluate the similarity between the fault cause diagnosed by the DFD model and the true fault cause.
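As an illustration of how one of these metrics compares a diagnosed cause with the true cause, the sketch below computes Rouge-L from the longest common subsequence (LCS) of two participle sequences. The weighting parameter `beta` is not given in the text, so its default value of 1.2 is an assumption, and the example sentences are invented for illustration.

```python
from typing import List, Sequence


def lcs_length(x: Sequence[str], y: Sequence[str]) -> int:
    """Length of the longest common subsequence, using one rolling row."""
    previous = [0] * (len(y) + 1)
    for x_item in x:
        current = [0]
        for j, y_item in enumerate(y, start=1):
            if x_item == y_item:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(candidate: List[str], reference: List[str], beta: float = 1.2) -> float:
    """Rouge-L F-score of a candidate against a single reference.

    beta weights recall against precision; 1.2 is an assumed default,
    not a value reported in this paper.
    """
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


predicted = "the relay contact is burnt".split()
truth = "relay contact burnt".split()
f_balanced = rouge_l(predicted, truth, beta=1.0)
```

In the example the LCS has length 3, so precision is 3/5 and recall is 3/3, and with `beta=1.0` the score reduces to the usual harmonic mean of the two.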

Experimental set
The part-of-speech, semantic and location encodings of the participles have 25, 300 and 25 dimensions, respectively. The final participle encoding has 350 dimensions, the hidden layer has 150 neurons and the learning rate is 0.001. The Adamax optimiser (Bengio and LeCun, 2015) is used with coefficients $\beta_1 = 0.9$ and $\beta_2 = 0.999$. The boundary check factor is $\delta = 2$, Max_P_num = 5 and the batch size is 32. The experimental parameters on the RMSE data set are shown in Table 2.
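The settings above can be collected in a single configuration object, which also makes the dimension arithmetic explicit: the three feature encodings (25, 300 and 25 dimensions) add up to the 350-dimensional participle encoding. The class and field names below are hypothetical; only the values come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DFDConfig:
    """Hyper-parameters reported for the RMSE experiments (field names are illustrative)."""

    pos_dim: int = 25        # part-of-speech encoding
    semantic_dim: int = 300  # semantic encoding
    location_dim: int = 25   # location encoding
    hidden_size: int = 150   # neurons in the hidden layer
    learning_rate: float = 0.001
    adamax_betas: Tuple[float, float] = (0.9, 0.999)
    boundary_check_factor: int = 2  # delta in the boundary check
    max_p_num: int = 5
    batch_size: int = 32

    @property
    def participle_dim(self) -> int:
        """Final participle encoding: the three feature encodings combined."""
        return self.pos_dim + self.semantic_dim + self.location_dim


config = DFDConfig()
```

Deriving `participle_dim` from its parts, rather than storing 350 directly, keeps the total consistent if any single encoding size is changed.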

Baseline methods.
Five mainstream question answering system models were used as baselines for comparison with the DFD model. They are presented as follows. The BiDAF model (Seo et al., 2017) is a high-performance question answering system model that uses context-to-question attention and question-to-context attention to emphasise the important parts of the question and the context. It uses the attention flow layer to fuse all useful information to obtain a vector representation of each location.
The Match-LSTM model is a widely used question answering system model. Match-LSTM traverses the entire context. It aggregates the attention-weighted question representation dynamically with the matching of each participle in the context and finds the answer span in the context through the answer pointer network.
The RNet model (Microsoft Asia Natural Language Computing Group, 2017) is an end-to-end question answering system model developed by Microsoft Research Asia. It was the first to achieve near-human performance on the SQuAD data set.
The QANet model (Yu et al., 2018) is a question answering system model proposed by Carnegie Mellon University and Google Brain. QANet does not need a recurrent neural network; its encoder is composed only of convolution and self-attention. Training efficiency is improved significantly while good accuracy is maintained.
Table 3 shows that the DFD model outperforms all the baseline models on the RMSE data set. The experimental results show that the DFD model can realise the failure diagnosis of railway signal equipment and has good diagnostic performance. Compared with Match-LSTM, BLEU-4 increases by 0.0739 and Rouge-L increases by 0.0425.

Ablation experiments.
To verify the contribution of the EBU and the SMM to the model, ablation experiments were conducted on the RMSE data sets. Rouge-L was used as the evaluation index. The results are shown in Table 4, which reports the impact of the EBU and the SMM on the model on the RMSE test set. As seen, both the EBU and the SMM improve the performance of the model.
4.3.4 Hyper-parameter sensitivity test experiments. The number of hidden layer neurons in the DFD model has an important impact on the performance of the question answering model. Therefore, in this work, the hidden size was tested for hyper-parameter sensitivity on the RMSE data sets (Figure 4). DFD obtains the best results in BLEU-4 and Rouge-L when the hidden size is 200.
4.3.5 Case study. The case study of railway signal equipment failure diagnosis includes proving that the DFD model can realise failure diagnosis and testing that the DFD model can locate failure causes. Figure 4 shows the diagnostic results of the DFD model on the test set. In the case study shown in Figure 4, "what causes automatic bow stop?" is the failure consequence and three related causes for the failure consequence are provided in three failure overview paragraphs. The ground truth is displayed in boldface. The failure diagnosis question answering system module diagnosed all possible candidate failure causes, and the DFD module then adopted the SMM to sort the candidate failure causes in descending order of semantic matching degree. The failure cause with the highest semantic matching degree is taken as the deterministic diagnosis result. The ordered candidate failure causes in Figure 4 show that the failure causes diagnosed by the DFD model are consistent with the ground truth, indicating that the model can realise failure diagnosis and locate the failure causes in the failure overview paragraphs accurately. The failure cause shown in Figure 5 is the one most related to the failure consequence semantically. Experimental results show that the proposed DFD model can realise the failure diagnosis of railway signal equipment in the form of natural language and resolve the uncertainty in the failure diagnosis of railway signal equipment semantically.

Conclusion
In this paper, we propose a DFD model that diagnoses railway signal equipment faults from a new direction: it converts the failure diagnosis problem into a deterministic question answering problem. The EBU enables the DFD model to generate candidate fault causes by understanding the natural semantics between failure consequences and failure causes. The SMM of the DFD model obtains the deterministic failure cause, which resolves the uncertainty problem that has hindered railway signal equipment failure diagnosis. Experiments on the real RMSE data set show that the DFD model can resolve the uncertainty in failure diagnosis of railway signal equipment efficiently. This is a step towards more intelligent railway signal equipment failure diagnosis in the railway operation system. In future work, we will consider applying the question answering system to other applications in the railway domain.