A local enhanced spatiotemporal tensor decomposition for missing travel time completion

Yilong Ren (School of Transportation Science and Engineering, Beihang University, Beijing, China; Zhongguancun Laboratory, Beijing, China and Beihang Hangzhou Innovation Institute Yuhang, Hangzhou, China)

Jianbin Wang (School of Transportation Science and Engineering, Beihang University, Beijing, China and Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China)

Smart and Resilient Transportation

ISSN: 2632-0487

Article publication date: 8 November 2022

Issue publication date: 8 December 2022

Abstract

Purpose

Missing travel time data for roads is a common problem encountered by traffic management departments. Tensor decomposition, one of the most widely used methods for completing missing traffic data, plays a significant role in intelligent transportation systems (ITS). However, existing tensor decomposition methods focus on the global data structure, which results in relatively low accuracy in fibrosis missing scenarios. This paper therefore proposes a novel tensor decomposition model that additionally considers local spatiotemporal similarity under fibrosis missing to improve travel time completion accuracy.

Design/methodology/approach

The proposed model aggregates road sections with similar physical attributes by spatial clustering and then calculates the temporal association of road sections using a dynamic longest common subsequence. A similarity relationship matrix in the temporal dimension is constructed and incorporated into the tensor completion model, which enhances the local spatiotemporal relationships around the entries lost to fibrosis missing.

Findings

The experiments show that, compared with the baseline models, this method has the smallest error and maintains good completion results at high missing rates.

Originality/value

This model achieves higher accuracy under fibrosis missing and converges well at high missing rates.

Citation

Ren, Y. and Wang, J. (2022), "A local enhanced spatiotemporal tensor decomposition for missing travel time completion", Smart and Resilient Transportation, Vol. 4 No. 3, pp. 194-208. https://doi.org/10.1108/SRT-03-2022-0003

Publisher

Emerald Publishing Limited

License

Published in Smart and Resilient Transportation. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

1. Introduction

With the development of intelligent transportation systems, the performance of monitoring devices and sensors is constantly improving, and massive amounts of travel time data can be obtained to support traffic management and control. However, during actual travel time data collection, data distortion or even data loss caused by equipment failures and transmission problems cannot be avoided. These problems can in turn undermine decisions on traffic issues. The problem of missing travel time data has therefore long been a focus of scholars' attention (Tang et al., 2018).

Because travel time data have multidimensional attributes, they contain implicit spatiotemporal associations. Effectively exploiting these latent spatiotemporal patterns, and thereby strengthening the mining of hidden knowledge, is the key to improving the accuracy of data completion (Acar and Yener, 2009). The tensor is a multidimensional data storage form, and constructing travel time data as a tensor preserves the spatial structure and the multidimensional hidden associations of the data. A tensor decomposition algorithm can recover these hidden structured patterns, so that missing data can be completed from limited observations without a large number of samples. Many studies have shown that tensor decomposition algorithms can accurately complete travel time data (Signoretto et al., 2011).

Three basic missing patterns often appear in travel time data: random missing, systematic missing and fibrosis missing (Tomasi and Bro, 2005; Li et al., 2020). Random missing refers to missing elements scattered randomly through the data set, and it has been studied extensively. Systematic missing refers to large-scale distortion and failure of data caused by systematic problems such as server failure; scholars generally do not include this pattern in the scope of research. Fibrosis missing refers to missing entries arranged in channel-like fibers along a certain dimension of a tensor, as shown in Figure 1. When fibrosis missing occurs, different missing dimensions correspond to different missing scenarios; the correspondence between missing dimension and scenario is shown in Table 1. In Scenario III, data are lost along the road dimension, so data on all road sections in the road network are missing at a specific date and time. In real life the probability of this is extremely low, so it is also not considered in this article. Scenarios I and II, in which data are missing for a long period on a specific date or for a specific road section at a fixed time every day, happen often in real life (for example, a long-term failure of a bayonet monitoring device, or maintenance at a fixed time every day).

Practical results show that existing methods fit poorly under fibrosis missing patterns. This paper therefore proposes a local enhanced tensor decomposition model that considers the spatiotemporal characteristics of road networks. Based on large-scale urban road network bayonet data, we construct a data completion framework that finds road segments whose travel times have temporal and spatial characteristics similar to those of the missing road segments.

The rest of the article is organized as follows. Section 2 is a literature review of travel time completion. Section 3 briefly introduces the theory of tensor Tucker decomposition. Section 4 introduces the tensor decomposition model designed for travel time completion. The experiment based on the bayonet detector will be shown in Section 5. Section 6 concludes this paper with some remarks and potential future research.

2. Related works

Much research has focused on completing missing traffic flow data with different models, which can be grouped into the following categories.

Traditional models mainly use regression interpolation methods based on historical data of missing data or surrounding sample data to estimate missing values, including multiple linear regression, median quadratic regression and nonnormal Bayesian linear regression models (Bae et al., 2018; Williams and Hoel, 2003; Usman and Ramdhani, 2019). Although these traditional methods are relatively simple in structure and easy to calculate, the performance is greatly affected by short-term traffic fluctuations.

Statistical models, such as Bayesian principal component analysis (Qu et al., 2008) and probabilistic principal component analysis (PPCA) (Qu et al., 2008), are widely used in data completion because of their strong interpretability. Building on the PPCA algorithm, Li et al. proposed a kernel PPCA algorithm that considers the spatiotemporal correlation of traffic flow to improve interpolation performance (Li et al., 2013).

Another typical class of methods for traffic data completion is machine learning-related methods. The combination of machine learning framework and traffic flow theory can effectively improve completion accuracy. The neural network has become a commonly used method in traffic data completion (Zhang and Xie, 2006; Chen et al., 2019; Hofleitner et al., 2012). Compared with the neural network, support vector machine shows better fitting performance and generalization ability (Wu et al., 2003; Zhang and Xie, 2007). The Gaussian process model, which is widely used for data classification and completion, is simpler in model structure and has higher efficiency and accuracy (Haran, 2011; Zhao et al., 2015).

Most of the above completion methods use time series models that take travel time as input in a low-dimensional form. However, storing the data in a low-dimensional form not only destroys its structure but also wastes potential correlations between the dimensions of traffic data (Li et al., 2020). As a high-dimensional data storage model, the tensor can preserve the original spatial structure and internal latent information of traffic data. By mining the multicollinearity among the multiple dimensions of the data, the discovery of hidden knowledge is enhanced, and the accuracy is higher than that of traditional data completion methods at high missing rates (Signoretto et al., 2011; Chen et al., 2020; Gong and Zhang, 2020; Liu et al., 2013). Tan et al. first introduced the tensor for four-dimensional modeling of traffic data and proposed a method based on Tucker decomposition (TDI) to estimate missing traffic values (Tan et al., 2013). Zhao et al. proposed a naive Bayesian canonical polyadic (CP) decomposition method that handles incomplete and noisy tensor data naturally (Zhao et al., 2015). Ran et al. proposed an algorithm that uses spatial information to improve the completion of missing travel time data (Ran et al., 2016). Wang et al. used a third-order tensor to model drivers' travel times on different road segments and in different periods and used a context-aware tensor decomposition method to estimate the missing values of the tensor (Wang et al., 2014).

Although there are many studies of tensor decomposition-based completion methods for travel time and other traffic data, most of them fail to fully exploit the structural characteristics of fibrosis missing patterns. Moreover, in the process of tensor decomposition, only the global structure is preserved under the low-rank assumption, while the strong local consistency of the data is ignored, which leads to poor model performance under fibrosis missing patterns. It is therefore necessary to improve the ability of tensor decomposition methods to capture the local consistency of missing data.

3. Tucker decomposition

Tucker decomposition (TDI) is a kind of tensor decomposition that can extract the interactions between a core tensor and three factor matrices. The factor matrices explain the characteristics along each dimension, and the core tensor represents the degree of connection between the dimensions (Kolda and Bader, 2009; Sun and Axhausen, 2016). The TDI method is shown in Figure 2.

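TDI decomposes a given tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ into a core tensor $\mathcal{G} \in \mathbb{R}^{r_1 \times r_2 \times r_3}$ and factor matrices $U \in \mathbb{R}^{n_1 \times r_1}$, $V \in \mathbb{R}^{n_2 \times r_2}$, $W \in \mathbb{R}^{n_3 \times r_3}$ (Tucker, 1966), such that: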

(3.1) $\mathcal{X} \approx \mathcal{G} \times_1 U \times_2 V \times_3 W$
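The mode products in (3.1) can be written as a single index contraction. The following is a minimal sketch, assuming NumPy and small, arbitrarily chosen sizes and ranks; the function name `tucker_reconstruct` is illustrative and does not come from the paper.

```python
import numpy as np

def tucker_reconstruct(G, U, V, W):
    """Return G x_1 U x_2 V x_3 W for a third-order core tensor G."""
    # X[i, j, k] = sum over a, b, c of G[a, b, c] * U[i, a] * V[j, b] * W[k, c]
    return np.einsum("abc,ia,jb,kc->ijk", G, U, V, W)

# Illustrative sizes (n1, n2, n3) and ranks (r1, r2, r3); not the paper's values.
rng = np.random.default_rng(0)
n1, n2, n3 = 5, 4, 3
r1, r2, r3 = 2, 2, 2
G = rng.normal(size=(r1, r2, r3))
U = rng.normal(size=(n1, r1))
V = rng.normal(size=(n2, r2))
W = rng.normal(size=(n3, r3))

X = tucker_reconstruct(G, U, V, W)
print(X.shape)  # (5, 4, 3)
```

Writing the product with `einsum` avoids committing to a particular unfolding convention, which is convenient when checking the update rules later in the paper.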

TDI can be regarded as a generalization of the CP decomposition (Kolda and Bader, 2009; Acar et al., 2011). Compared with CP decomposition, TDI has a more flexible structure and can achieve lower modeling errors. For further discussion of tensor decomposition and its applications, see Kolda and Bader (2009).

4. Methodology

This section introduces a novel model for tensor decomposition. The framework of the model is shown in Figure 3. First, an incomplete data tensor is constructed: the road travel time data are arranged as a tensor in which the missing data follow the fibrosis missing pattern. In the second and third steps, similarity measures are used to find road segments with temporal and spatial characteristics similar to those of the roads with missing data, and a feature matrix of similar segments is constructed to strengthen the constraints on the tensor decomposition process. Finally, to achieve more accurate data completion, we propose a new algorithm called local-enhanced spatiotemporal tensor decomposition.

4.1 Tensor construction

Figure 4 shows a traffic management department deploying a bayonet surveillance system on a key road in the city. The system records the images of passing vehicles by taking pictures. It can automatically identify the vehicle license plate and record the passing time and the direction of the car using high-precision image recognition technology and semantic classification technology. This part introduces the method of building travel time tensor based on the bayonet data.

4.1.1 Bayonet data preprocessing.

4.1.1.1 Eliminate outliers.

Errors in computer vision technology can lead to incorrect or unrecognized license plate numbers. In addition, a vehicle changing lanes can generate multiple trajectory records for the same vehicle at the same time node. The abnormal data generated in these two situations need to be eliminated.

4.1.1.2 Calculation of average travel time.

Assume that, for a given road, a total of $n$ vehicles pass during a given period of time. We store the time stamps of all vehicles in a sequence $\{T_{11}, T_{12}, T_{21}, T_{22}, \ldots, T_{i1}, T_{i2}, \ldots, T_{n1}, T_{n2}\}$, where $T_{i1}$ is the time at which the $i$-th car passes the starting point of the road and $T_{i2}$ is the time at which it passes the endpoint. The average travel time $\hat{T}$ is obtained from formula (4.1):

(4.1) $\hat{T} = \dfrac{\sum_{i=1}^{n} (T_{i2} - T_{i1})}{n}$
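As a small illustration of formula (4.1), the sketch below averages the travel times of the vehicles observed on one road in one time window. The function name and the rule of discarding records whose end time is not later than the start time are assumptions made for this example, loosely following the outlier elimination step; they are not specified in the paper.

```python
def average_travel_time(passages):
    """Average travel time for one road and one time window.

    passages: list of (t_start, t_end) pairs in seconds, one per vehicle.
    Records with t_end <= t_start are dropped (an assumed sanity filter).
    """
    durations = [t_end - t_start for t_start, t_end in passages if t_end > t_start]
    if not durations:
        return None  # no valid vehicle: the cell would be treated as missing
    return sum(durations) / len(durations)

# Two valid vehicles (349 s and 351 s) and one invalid record.
passages = [(0.0, 349.0), (10.0, 361.0), (20.0, 20.0)]
print(average_travel_time(passages))  # 350.0
```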

4.1.2 Travel time tensor construction.

In this paper, travel time is modeled as a three-dimensional tensor whose dimensions are road segment, date and time window. The travel time tensor is defined as $\mathcal{X} \in \mathbb{R}^{N \times M \times L}$, where $N$ is the number of road segments in the road network, $M$ is the number of days and $L$ is the number of time windows selected in one day, determined by the time window length. The element $\mathcal{X}_{ijk}$ of the tensor represents the average travel time of vehicles through road segment $i$ on day $j$ in time window $k$. The structure of the travel time tensor $\mathcal{X}$ is shown in Figure 5.

4.2 Similar road excavation

The purpose of this part is to identify roads whose traffic flow has temporal and spatial characteristics similar to those of the roads with missing data, and to use their traffic flow time series as parameters that capture road network characteristics, improving the accuracy of data completion.

4.2.1 Road spatial similarity.

Road attributes (length of the road section, road grade, number of lanes, etc.) directly affect the time it takes for vehicles to pass through a road section. Therefore, when estimating missing road travel times, taking the physical attributes of the road itself into consideration can significantly improve the estimation accuracy. When the attributes of a road x with missing data are known, we search for a road y with similar attributes; the travel time of y is then a valuable reference for estimating the travel time of x. Therefore, in this section, the roads in the road network are first clustered according to their physical attributes.

In this paper, the K-means method is used to cluster the roads by road length, road grade and number of lanes. In the K-means algorithm, the number of clusters K must be selected manually, and the choice of K usually affects the quality of the clustering. Here the silhouette coefficient method is used to determine the final K value. The silhouette coefficient, first proposed by Peter J. Rousseeuw in 1986, evaluates clustering quality by combining two factors, cohesion and separation. It is calculated as in (4.2). The larger the silhouette coefficient, the more compact the instances within each cluster and the greater the distance between clusters, and hence the better the clustering. The clustering quality is judged by comparing the silhouette coefficients obtained under different K values:

(4.2) $S_i = \dfrac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$

In the formula, $a(i)$ is the average dissimilarity between sample $i$ and the other points in the same cluster, and $b(i)$ is the minimum, over the other clusters, of the average dissimilarity between sample $i$ and the points of that cluster.
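To make formula (4.2) concrete, the following sketch computes the silhouette coefficient of every sample for a given cluster labeling, using Euclidean distance as the dissimilarity. It assumes NumPy; the function name, the toy points and the choice of Euclidean distance are illustrative assumptions, and the K-means step itself is not reproduced here. In practice the road attributes would presumably be scaled before clustering so that road length does not dominate.

```python
import numpy as np

def silhouette_values(points, labels):
    """Silhouette coefficient S_i of each sample, following (4.2)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: score left at 0 by convention
        # a(i): mean distance to the other members of the same cluster.
        a = dists[i, same].sum() / (same.sum() - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(dists[i, labels == c].mean()
                for c in set(labels.tolist()) if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

# Two well-separated groups of hypothetical (scaled) road attributes.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_values(points, [0, 0, 1, 1]).mean()
bad = silhouette_values(points, [0, 1, 0, 1]).mean()
print(good > bad)  # True
```

Choosing K then amounts to running the clustering for several candidate K values and keeping the one with the largest mean silhouette coefficient.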

4.2.2 Road temporal similarity.

The similarity between two objects is a numerical measure of how alike they are. Three algorithms are commonly used to measure the similarity of sequence data: Euclidean distance, dynamic time warping (DTW) and longest common subsequence (LCSS). The Euclidean distance method calculates the Euclidean distance between the two trajectories at each time point and then aggregates the distances over all points, for example by taking the average, the sum or the median. This method is the simplest and most feasible but is sensitive to noise points. The DTW algorithm aligns the sequences by repeating earlier recorded points to fill the corresponding gaps and takes the minimum total distance as the similarity measure, which relaxes the strict sampling requirements of the Euclidean distance but is still strongly affected by noise points. If two series have similar shapes in most time periods and differ only in a short period, Euclidean distance and DTW cannot accurately measure their similarity. LCSS can deal with such cases effectively.

The traditional LCSS algorithm compares pairs of sequence points against a similarity threshold that is usually fixed. However, traffic flow varies greatly between periods of the day: when the flow is small, even a small absolute difference may represent a large difference in traffic flow characteristics. This article therefore proposes an adaptive threshold defined by a similarity ratio: the product of the ratio $\theta$ and the larger of the two traffic flow values replaces the fixed threshold of the traditional LCSS method. The improved LCSS algorithm can be expressed as:

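(4.3) $\mathrm{LCSS}(A,B) = \begin{cases} 0, & \text{if } A = \phi \text{ or } B = \phi \\ 1 + \mathrm{LCSS}(a_{i-1}, b_{j-1}), & \text{if } \mathrm{dist}(a_i, b_j) < \theta \cdot \max(a_i, b_j) \\ \max\big(\mathrm{LCSS}(a_{i-1}, b_j),\ \mathrm{LCSS}(a_i, b_{j-1})\big), & \text{otherwise} \end{cases}$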

where $A$ and $B$ are the traffic flow sequences of the two road sections, $n$ and $m$ are the lengths of the two sequences, and $a_i$ and $b_j$ are the sequence points of $A$ and $B$, with $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$. $\theta$ is the similarity threshold; in this paper $\theta = 0.2$ is selected, which ensures that road sections similar to those with missing data can be found while also giving better model accuracy. $\mathrm{dist}(a_i, b_j)$ is the distance between $a_i$ and $b_j$, calculated as:

(4.4) $\mathrm{dist}(a_i, b_j) = \lVert a_i - b_j \rVert_2$

Based on the above formula, the similarity formula of the LCSS is:

(4.5) $D_{LCSS} = \dfrac{\mathrm{LCSS}(A,B)}{\min(len_A,\, len_B)}$

where $len_A$ and $len_B$ are the lengths of sequences $A$ and $B$. This formula gives the trajectory similarity $D_{LCSS}$ between the traffic flow time series of each other road segment and that of the missing road segment; the road segment with the largest similarity is selected as the final target road segment.
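The recurrence (4.3) and the normalization (4.5) can be sketched with a standard dynamic programming table. The code below is a minimal illustration for scalar, positive traffic flow values, where the distance (4.4) reduces to an absolute difference; the function names and the example sequences are hypothetical.

```python
def adaptive_lcss(a, b, theta=0.2):
    """Length of the LCSS of sequences a and b with the adaptive threshold of (4.3)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS of the first i points of a and the first j points of b.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x, y = a[i - 1], b[j - 1]
            if abs(x - y) < theta * max(x, y):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]

def lcss_similarity(a, b, theta=0.2):
    """Similarity D_LCSS of (4.5); 0.0 if either sequence is empty."""
    if not a or not b:
        return 0.0
    return adaptive_lcss(a, b, theta) / min(len(a), len(b))

# Hypothetical flow counts for two road sections over four time windows.
flow_a = [100, 120, 300, 80]
flow_b = [110, 118, 150, 85]
print(adaptive_lcss(flow_a, flow_b))    # 3
print(lcss_similarity(flow_a, flow_b))  # 0.75
```

With the adaptive threshold, the pair (80, 85) matches because the difference of 5 is below 0.2 × 85, while (300, 150) does not, even though a fixed threshold tuned for peak flows might have accepted small off-peak differences indiscriminately.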

4.3 Constructing the feature matrix

To complete the missing road data, a feature matrix of similar roads is extracted. The matrix F is constructed based on bayonet data, and each column represents the traffic flow sequence of the road segment which is similar to the missing road segment in spatiotemporal characteristics. Integrating matrix F into the tensor decomposition framework can improve the accuracy of data completion by adding decomposition constraints.

4.4 Local-enhanced spatiotemporal tensor decomposition

Based on TDI, we designed an improved decomposition framework, which combines the spatial and temporal correlation. The completion problem of travel time can be modeled as an optimization problem, and the objective function can be defined as follows:

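(4.6) $\min\limits_{\mathcal{G},U,V,W,T}\ \dfrac{1}{2}\lVert \mathcal{X} - \mathcal{G} \times_1 U \times_2 V \times_3 W \rVert^2 + \dfrac{\lambda_1}{2}\lVert F - WT \rVert^2 + \dfrac{\lambda_2}{2}\left(\lVert \mathcal{G} \rVert^2 + \lVert U \rVert^2 + \lVert V \rVert^2 + \lVert W \rVert^2 + \lVert T \rVert^2\right)$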

The objective function is driven by minimizing the difference between the decomposed result and the original tensor. $\lVert \mathcal{X} - \mathcal{G} \times_1 U \times_2 V \times_3 W \rVert^2$ is the norm of the difference between the travel time tensor $\mathcal{X}$ and the estimated travel time tensor, $F$ is the traffic flow sequence of similar sections, $T$ is the coefficient matrix, and $\lVert F - WT \rVert^2$ ensures that the decomposed time factor matrix is similar to the traffic flow series of similar segments. $\lVert \mathcal{G} \rVert^2 + \lVert U \rVert^2 + \lVert V \rVert^2 + \lVert W \rVert^2 + \lVert T \rVert^2$ is a regularization term used to prevent overfitting in the decomposition process, and $\lambda_1$, $\lambda_2$ are the weight coefficients of the terms in the formula.

Since the objective function is nonconvex, we use the gradient descent method to solve the optimization problem:

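(4.7) $\mathcal{G} \Leftarrow (1 - \alpha\lambda_2)\,\mathcal{G} + \alpha \cdot \delta \times_1 U^{T} \times_2 V^{T} \times_3 W^{T}$

(4.8) $U \Leftarrow (1 - \alpha\lambda_2)\,U + \alpha \cdot \delta_{(1)} (W \otimes V)\, G_{(1)}^{T}$

(4.9) $V \Leftarrow (1 - \alpha\lambda_2)\,V + \alpha \cdot \delta_{(2)} (W \otimes U)\, G_{(2)}^{T}$

(4.10) $W \Leftarrow (1 - \alpha\lambda_2)\,W + \alpha \cdot \delta_{(3)} (V \otimes U)\, G_{(3)}^{T} + \alpha\lambda_1 (F - WT) \times T$
$T \Leftarrow (1 - \alpha\lambda_2)\,T + \alpha\lambda_1 (F - WT) \times W$,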

where $\delta = \mathcal{X} - \mathcal{G} \times_1 U \times_2 V \times_3 W$, $\alpha$ is the learning rate, and the above formulas give the update rules for $\mathcal{G}$, $U$, $V$, $W$ and $T$.

After the algorithm terminates, the final value of each factor matrix is obtained. The final inferred travel time tensor $\hat{\mathcal{X}}$ can then be calculated by tensor reconstruction, as shown in formula (4.11):

(4.11) $\hat{\mathcal{X}} = \mathcal{G} \times_1 U \times_2 V \times_3 W$

The algorithm procedure is as follows:

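Algorithm
Input: initial tensor $\mathcal{X}$; similarity matrix $F$
1. Set learning rate $\alpha$; regularization parameters $\lambda_1$, $\lambda_2$; error threshold $\varepsilon$ and maximum iteration count $k_{max}$
2. Initialize $\mathcal{G}$, $U$, $V$, $W$ with small random values
3. For $k = 1, 2, \cdots, k_{max}$ do:
   $\hat{\mathcal{X}} = \mathcal{G} \times_1 U \times_2 V \times_3 W$
   $\delta = \mathcal{X} - \mathcal{G} \times_1 U \times_2 V \times_3 W$
   Update
   $\mathcal{G} \Leftarrow (1 - \alpha\lambda_2)\,\mathcal{G} + \alpha \cdot \delta \times_1 U^{T} \times_2 V^{T} \times_3 W^{T}$
   $U \Leftarrow (1 - \alpha\lambda_2)\,U + \alpha \cdot \delta_{(1)} (W \otimes V)\, G_{(1)}^{T}$
   $V \Leftarrow (1 - \alpha\lambda_2)\,V + \alpha \cdot \delta_{(2)} (W \otimes U)\, G_{(2)}^{T}$
   $W \Leftarrow (1 - \alpha\lambda_2)\,W + \alpha \cdot \delta_{(3)} (V \otimes U)\, G_{(3)}^{T} + \alpha\lambda_1 (F - WT) \times T$
   $T \Leftarrow (1 - \alpha\lambda_2)\,T + \alpha\lambda_1 (F - WT) \times W$
   If $\lVert \mathcal{G} \times_1 U \times_2 V \times_3 W - \hat{\mathcal{X}} \rVert_F^2 < \varepsilon$: break
Output: $\hat{\mathcal{X}} = \mathcal{G} \times_1 U \times_2 V \times_3 W$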
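The update loop can be sketched in NumPy as follows. This is an illustrative reading of the procedure under several assumptions that the text does not state: the residual $\delta$ is restricted to observed entries through a 0/1 mask; $F$ has one row per time window and one column per similar road, so that $T$ has shape $r_3 \times p$; the $\lambda_1$ terms are written as $(F - WT)T^{T}$ and $W^{T}(F - WT)$, which is what differentiating $\lVert F - WT \rVert^2$ gives under that shape convention; $T$ is initialized randomly like the other factors; and all five factors are updated simultaneously from the previous iterate. Sizes, ranks and hyperparameters are toy values, not the paper's.

```python
import numpy as np

def reconstruct(G, U, V, W):
    """Tucker reconstruction G x_1 U x_2 V x_3 W."""
    return np.einsum("abc,ia,jb,kc->ijk", G, U, V, W)

def objective(X, mask, F, G, U, V, W, T, lam1, lam2):
    """Value of the objective, with the fit term restricted to observed entries (assumption)."""
    resid = mask * (X - reconstruct(G, U, V, W))
    side = F - W @ T
    reg = sum(np.sum(P ** 2) for P in (G, U, V, W, T))
    return 0.5 * np.sum(resid ** 2) + 0.5 * lam1 * np.sum(side ** 2) + 0.5 * lam2 * reg

def gradient_step(X, mask, F, G, U, V, W, T, alpha, lam1, lam2):
    """One simultaneous gradient update of G, U, V, W and T."""
    delta = mask * (X - reconstruct(G, U, V, W))  # residual on observed entries
    side = F - W @ T                              # mismatch with similar-road series
    shrink = 1.0 - alpha * lam2                   # effect of the regularization term
    G_new = shrink * G + alpha * np.einsum("ijk,ia,jb,kc->abc", delta, U, V, W)
    U_new = shrink * U + alpha * np.einsum("ijk,abc,jb,kc->ia", delta, G, V, W)
    V_new = shrink * V + alpha * np.einsum("ijk,abc,ia,kc->jb", delta, G, U, W)
    W_new = (shrink * W + alpha * np.einsum("ijk,abc,ia,jb->kc", delta, G, U, V)
             + alpha * lam1 * (side @ T.T))
    T_new = shrink * T + alpha * lam1 * (W.T @ side)
    return G_new, U_new, V_new, W_new, T_new

# Toy problem: 6 segments, 5 days, 4 time windows, 2 similar-road series.
rng = np.random.default_rng(0)
X_full = rng.uniform(0.5, 1.5, size=(6, 5, 4))
mask = np.ones_like(X_full)
for seg, day in [(1, 2), (4, 0), (2, 3)]:
    mask[seg, day, :] = 0.0                       # whole time-window fibers missing
X = mask * X_full
F = rng.uniform(0.5, 1.5, size=(4, 2))
G = rng.uniform(0.1, 0.5, size=(2, 2, 2))
U, V, W = (rng.uniform(0.1, 0.5, size=(n, 2)) for n in (6, 5, 4))
T = rng.uniform(0.1, 0.5, size=(2, 2))

alpha, lam1, lam2 = 0.005, 0.1, 0.01
loss_before = objective(X, mask, F, G, U, V, W, T, lam1, lam2)
for _ in range(300):
    G, U, V, W, T = gradient_step(X, mask, F, G, U, V, W, T, alpha, lam1, lam2)
loss_after = objective(X, mask, F, G, U, V, W, T, lam1, lam2)
X_hat = reconstruct(G, U, V, W)
print(loss_after < loss_before)
```

The `einsum` contractions are equivalent to the unfolded products with Kronecker factors in the update rules, but they keep the tensor indices explicit, which makes it easier to check each term against the objective.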

In the process of reconstructing the inferred travel time tensor, the values of the nonmissing entries in the original tensor also change. It is therefore necessary to reset these entries of the inferred travel time tensor to the original data, using the following formula:

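(4.12) $\hat{\mathcal{X}}_{ijk} = \begin{cases} \mathcal{X}_{ijk}, & (i,j,k) \in \varphi \\ \hat{\mathcal{X}}_{ijk}, & (i,j,k) \in \bar{\varphi} \end{cases}$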

The index set of nonmissing entries in the original tensor is $\varphi = \{(i,j,k) \mid \mathcal{X}_{ijk} \neq 0\}$, and the index set of missing entries is $\bar{\varphi} = \{(i,j,k) \mid \mathcal{X}_{ijk} = 0\}$.
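Formula (4.12) amounts to an element-wise selection between the observed tensor and the reconstruction. A minimal NumPy sketch, with a hypothetical function name and toy values, could look like this:

```python
import numpy as np

def restore_observed(X, X_hat):
    """Keep observed entries of X (nonzero) and use X_hat only where X is missing (zero)."""
    observed = X != 0
    return np.where(observed, X, X_hat)

# Toy 1 x 2 x 2 tensor: zeros mark missing travel times.
X = np.array([[[0.0, 5.0], [7.0, 0.0]]])
X_hat = np.array([[[4.0, 5.5], [6.0, 8.0]]])
completed = restore_observed(X, X_hat)
print(completed.tolist())  # [[[4.0, 5.0], [7.0, 8.0]]]
```

Encoding missing entries as 0 works here because a real travel time is never zero.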

5. Experimental results

5.1 Introduction to the experiment

The bayonet data set used in this article was collected in Ruian City, Zhejiang Province, China. We selected a regional road network covering about 7.5 km² in the center of the urban area as the research object. The road network contains 35 bayonet detectors and 108 road sections, and the detectors cover most of the main roads, as shown in Figure 6 (the dots in the figure represent the mounting positions of the bayonet detectors).

The bayonet detector records the license plate number, passing direction and passing time of each passing vehicle, and each detector has a fixed number and a known location in the road network. After preprocessing steps such as data cleaning, the records are grouped by license plate number and sorted by passing time to obtain each vehicle's travel trajectory. The data format is shown in Table 2.

This data set covers a total of 21 days of travel time (from 00:00:00 on March 1, 2016, to 23:59:59 on March 21, 2016). Thirty minutes is selected as the length of the time window, so that one day is divided into 48 time periods. Because the data are sparse outside the daytime, the travel time tensor is constructed only from 7 a.m. to 10 p.m., where the data are dense, which gives 30 time windows per day. The travel time data can thus be constructed as a three-dimensional tensor $\mathcal{X} \in \mathbb{R}^{108 \times 21 \times 30}$ (section × day × time window).
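The mapping from individual segment traversals to cells of the 108 × 21 × 30 tensor can be sketched as follows. The code assumes NumPy, assumes that road sections are identified by integer indices 0 to 107, and assigns each traversal to a time window by the time at which the vehicle enters the section; none of these conventions is stated in the paper, and the function names are hypothetical.

```python
from datetime import datetime
import numpy as np

N_SEGMENTS, N_DAYS, N_WINDOWS = 108, 21, 30
FIRST_DAY = datetime(2016, 3, 1)
START_HOUR, WINDOW_MIN = 7, 30  # 07:00 to 22:00 in 30-minute windows

def tensor_index(segment, entry_time):
    """Map a traversal to (i, j, k), or None if it falls outside the tensor."""
    day = (entry_time.date() - FIRST_DAY.date()).days
    minutes = (entry_time.hour - START_HOUR) * 60 + entry_time.minute
    if not (0 <= day < N_DAYS) or minutes < 0:
        return None
    window = minutes // WINDOW_MIN
    if window >= N_WINDOWS:
        return None
    return segment, day, window

def build_tensor(traversals):
    """traversals: iterable of (segment, entry_time, travel_time_seconds)."""
    total = np.zeros((N_SEGMENTS, N_DAYS, N_WINDOWS))
    count = np.zeros_like(total)
    for segment, entry_time, seconds in traversals:
        idx = tensor_index(segment, entry_time)
        if idx is not None:
            total[idx] += seconds
            count[idx] += 1
    # Cells without any traversal stay 0, which marks them as missing.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

traversals = [
    (3, datetime(2016, 3, 1, 15, 0, 4), 349.0),
    (3, datetime(2016, 3, 1, 15, 20, 0), 351.0),
    (3, datetime(2016, 3, 1, 6, 0, 0), 100.0),  # before 07:00, ignored
]
X = build_tensor(traversals)
print(X.shape, X[3, 0, 16])  # (108, 21, 30) 350.0
```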

To simulate performance under different missing rates, a tensor $\mathcal{B} \in \mathbb{R}^{108 \times 21 \times 30}$ is constructed with the same structure as the original tensor. The elements of $\mathcal{B}$ are either 0 or 1. By setting the proportion of 0 elements in $\mathcal{B}$ and taking its element-wise product with the complete tensor $\mathcal{X}'$, we obtain the tensor $\mathcal{X}$ under different missing rates. The calculation method is as follows:

(5.1) $\mathcal{X} = \mathcal{B} \odot \mathcal{X}'$
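One way to generate a 0/1 tensor $\mathcal{B}$ with fibrosis missing is to zero out whole fibers along one mode until the target missing rate is reached. The sketch below assumes NumPy and uniform random selection of fibers; the function name, the selection rule and the placeholder complete tensor are illustrative, since the paper does not describe how the fibers were chosen.

```python
import numpy as np

def fiber_mask(shape, missing_rate, mode, rng):
    """0/1 tensor in which whole fibers along `mode` are set to 0."""
    B = np.ones(shape, dtype=int)
    other = [ax for ax in range(3) if ax != mode]
    n_fibers = shape[other[0]] * shape[other[1]]
    n_missing = int(round(missing_rate * n_fibers))
    chosen = rng.choice(n_fibers, size=n_missing, replace=False)
    rows, cols = np.unravel_index(chosen, (shape[other[0]], shape[other[1]]))
    index = [slice(None)] * 3
    index[other[0]], index[other[1]] = rows, cols
    B[tuple(index)] = 0  # zero every entry of each selected fiber
    return B

rng = np.random.default_rng(0)
shape = (108, 21, 30)
# mode=2: all time windows of a (section, day) pair are missing (Scenario I).
B = fiber_mask(shape, missing_rate=0.2, mode=2, rng=rng)
X_full = rng.uniform(60.0, 600.0, size=shape)  # placeholder for the complete tensor
X = B * X_full                                 # element-wise product as in (5.1)
print(round(float((B == 0).mean()), 3))
```

Using `mode=1` instead removes all days for a (section, time window) pair, which corresponds to Scenario II.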

In this part, we compare the method proposed in this article with several existing conventional methods. To measure the accuracy of travel time estimation, we use the root mean square error (RMSE), the mean absolute error (MAE) and the mean relative error (MRE). The formulas are as follows:

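(5.2) $\mathrm{RMSE} = \sqrt{\dfrac{\sum_{i}^{N}\sum_{j}^{M}\sum_{k}^{L} \left(\hat{\mathcal{X}}_{ijk} - \mathcal{X}'_{ijk}\right)^2}{Z}}, \quad \forall\, \mathcal{X}_{ijk} = 0$

(5.3) $\mathrm{MAE} = \dfrac{\sum_{i}^{N}\sum_{j}^{M}\sum_{k}^{L} \left|\hat{\mathcal{X}}_{ijk} - \mathcal{X}'_{ijk}\right|}{Z}, \quad \forall\, \mathcal{X}_{ijk} = 0$

(5.4) $\mathrm{MRE} = \dfrac{\sum_{i}^{N}\sum_{j}^{M}\sum_{k}^{L} \left|\hat{\mathcal{X}}_{ijk} - \mathcal{X}'_{ijk}\right|}{\sum_{i}\sum_{j}\sum_{k} \mathcal{X}'_{ijk}}, \quad \forall\, \mathcal{X}_{ijk} = 0$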

where the parameter $Z$ is the number of missing entries ($\mathcal{X}_{ijk} = 0$), $\hat{\mathcal{X}}_{ijk}$ is the estimated travel time and $\mathcal{X}'_{ijk}$ is the original travel time.
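The three error measures can be computed over the missing entries only, as in the following NumPy sketch. It assumes that the MRE denominator is also summed over the missing entries, which is how the condition attached to (5.4) is read here; the function name and the toy values are illustrative.

```python
import numpy as np

def completion_errors(X_hat, X_full, X_obs):
    """RMSE, MAE and MRE over the entries that are missing (zero) in X_obs."""
    missing = X_obs == 0
    err = X_hat[missing] - X_full[missing]
    z = err.size  # number of missing entries, Z
    rmse = np.sqrt(np.sum(err ** 2) / z)
    mae = np.sum(np.abs(err)) / z
    mre = np.sum(np.abs(err)) / np.sum(X_full[missing])
    return rmse, mae, mre

# Toy example: the second column is missing in the observed data.
X_full = np.array([[10.0, 20.0], [30.0, 40.0]])
X_obs = np.array([[10.0, 0.0], [30.0, 0.0]])
X_hat = np.array([[10.0, 22.0], [30.0, 36.0]])
rmse, mae, mre = completion_errors(X_hat, X_full, X_obs)
print(mae, mre)  # 3.0 0.1
```

In this toy case the two missing entries have errors of 2 and −4, so the RMSE is the square root of 10, about 3.16.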

Then, to test the robustness of the proposed model, we applied different missing rates to the constructed original tensor (10%, 20%, 30% and 40% data missing).

5.2 Analysis of algorithm accuracy

In this part, the proposed model is compared with the traditional CP decomposition, traditional TDI and high-precision low-rank completion models to verify the applicability and superiority of the proposed method in dealing with tensor fibrosis missing scenarios.

Experiments were performed with 20% missing data to calculate the error metrics of the four models, and the results were compared quantitatively. The results are shown in Table 3.

Table 3 shows that the model proposed in this paper outperforms the other models on every error metric. Relative to TDI, its MAE is 0.78% lower and its MRE is 4.5% lower.
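The percentages quoted relative to TDI can be reproduced from Table 3 as relative error reductions. The RMSE line is an additional calculation on the same table values and is not a figure quoted in the text:

```latex
\begin{aligned}
\text{MAE: }  & \frac{17.93 - 17.79}{17.93} = \frac{0.14}{17.93} \approx 0.78\% \\
\text{MRE: }  & \frac{0.0749 - 0.0715}{0.0749} = \frac{0.0034}{0.0749} \approx 4.5\% \\
\text{RMSE: } & \frac{27.98 - 25.62}{27.98} = \frac{2.36}{27.98} \approx 8.4\%
\end{aligned}
```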

To visually see the fitting effect of the model in each numerical interval, we established an XOY coordinate system that projects the original travel time to the X-axis, and the estimated travel time of the model to the Y-axis. It can be predicted that the better the fitting effect of the model, the more “convergent” the scatter plot as a whole should be, and the image should be closer to the line y = x. The scatter plots of the four models are as follows (Figure 7).

The results show that the local-enhanced spatiotemporal tensor decomposition framework infers missing travel times more accurately than the other methods.

5.3 Analysis of algorithm robustness

We tested the robustness of the proposed model and verified that it still performs well at different data missing rates. The completion results are shown in Figure 8.

Because the features of the missing road sections and missing time periods are taken into account, local feature extraction in the tensor decomposition process is enhanced, which explains why the method remains robust as the missing rate increases.

6. Conclusion

The model proposed in this paper maps the missing structure of tensor data to actual traffic scenarios, especially the frequently occurring fibrosis missing scenarios. It mines the road sections with similar spatiotemporal features and constructs a feature matrix from them. By incorporating the spatiotemporal feature matrix of similar road sections into the tensor decomposition model, the local features of the tensor are enhanced, which constrains the direction of iteration in the decomposition process. The experimental results show that our model has high accuracy and converges well at high missing rates.

This study still leaves room for further research and improvement. This paper uses only three attributes, road length, grade and lane number, for clustering; whether other parameters affect travel time needs further discussion. In addition, the time windows do not need to be of equal length, and subsequent studies can investigate how to divide the time windows to improve the accuracy of the model.

Figures

Figure 1.

Different missing dimensions

Figure 2.

Tucker decomposition

Figure 3.

The framework of the proposed algorithm

Figure 4.

Schematic diagram of bayonet data

Figure 5.

Travel time tensor

Figure 6.

Research area road network diagram

Figure 7.

Model accuracy scatter plot

Figure 8.

Fibrosis missing completion results under different missing rates

Table 1.

The relationships between tensor missing dimensions and scenes

Fibrosis scenario	Missing dimension	Scenes	Probability of occurrence
Scenario I	Time slots	Long-term data missing for a specific road section on a specific date	High
Scenario II	Days	Data is missing for certain road sections at a fixed time each day	High
Scenario III	Road segments	Data on all road segments in the road network is missing at a specific date and time slot	Low

Table 2.

The monitoring data

Serial no.	License plate no.	Monitoring no.	Passing direction	Passing time
1	1	67	2	2016-3-1 15:00:04
2	1	77	1	2016-3-1 15:05:53
3	1	81	1	2016-3-1 15:07:33
4	1	62	1	2016-3-1 15:09:32
–	–	–	–	–

Table 3.

Error index comparison table

Model	RMSE (s)	MAE (s)	MRE (%)
Model of this paper	25.62	17.79	0.0715
CP decomposition	28.15	18.57	0.0781
Tucker decomposition	27.98	17.93	0.0749
High precision low-rank completion	36.61	24.29	0.1021

References

Acar, E. and Yener, B. (2009), “Unsupervised multiway data analysis: a literature survey”, IEEE Transactions on Knowledge and Data Engineering, Vol. 21 No. 1, pp. 6-20.

Acar, E., Dunlavy, D.M., Kolda, T.G. and Mørup, M. (2011), “Scalable tensor factorizations for incomplete data”, Chemometrics and Intelligent Laboratory Systems, Vol. 106 No. 1, pp. 41-56.

Bae, B., Kim, H., Lim, H., Liu, Y., Han, L.D. and Freeze, P.B. (2018), “Missing data imputation for traffic flow speed using spatiotemporal cokriging”, Transportation Research Part C: Emerging Technologies, Vol. 88, pp. 124-139.

Chen, X., Yang, J. and Sun, L. (2020), “A nonconvex low-rank tensor completion model for spatiotemporal traffic data imputation”, Transportation Research Part C: Emerging Technologies, Vol. 117.

Chen, X., Wang, S., Shi, C., Wu, H., Zhao, J. and Fu, J. (2019), “Robust ship tracking via multi-view learning and sparse representation”, Journal of Navigation, Vol. 72 No. 1, pp. 176-192.

Gong, C.F. and Zhang, Y.Y. (2020), “Urban traffic data imputation with detrending and tensor decomposition”, IEEE Access, Vol. 8, pp. 11124-11137.

Haran, M. (2011), “Gaussian random field models for spatial data”, Handbook of Markov Chain Monte Carlo, CRC Press, United States, pp. 449-478.

Hofleitner, A., Herring, R. and Bayen, A. (2012), “Arterial travel time forecast with streaming data: a hybrid approach of flow modeling and machine learning”, Transportation Research Part B: Methodological, Vol. 46 No. 9, pp. 1097-1122.

Kolda, T.G. and Bader, B.W. (2009), “Tensor decompositions and applications”, SIAM Review, Vol. 51 No. 3, pp. 455-500.

Li, L., Li, Y. and Li, Z. (2013), “Efficient missing data imputing for traffic flow by considering temporal and spatial dependence”, Transportation Research Part C: Emerging Technologies, Vol. 34, pp. 108-120.

Li, H., Li, M., Lin, X., He, F. and Wang, Y. (2020), “A spatiotemporal approach for traffic data imputation with complicated missing patterns”, Transportation Research Part C: Emerging Technologies, Vol. 119.

Liu, J., Musialski, P., Wonka, P. and Ye, J. (2013), “Tensor completion for estimating missing values in visual data”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35 No. 1, pp. 208-220.

Qu, L., Li, L., Zhang, Y. and Hu, J. (2009), “PPCA-based missing data imputation for traffic flow volume: a systematical approach”, IEEE Transactions on Intelligent Transportation Systems, Vol. 10 No. 3, pp. 512-522.

Qu, L., Zhang, Y., Hu, J., Jia, L. and Li, L. (2008), “A BPCA based missing value imputing method for traffic flow volume data”, 2008 IEEE Intelligent Vehicles Symposium, IEEE, Eindhoven, Netherlands, pp. 985-990.

Ran, B., Tan, H., Wu, Y. and Jin, P.J. (2016), “Tensor based missing traffic data completion with spatial-temporal correlation”, Physica A: Statistical Mechanics and its Applications, Vol. 446, pp. 54-63.

Signoretto, M., Van de Plas, R., De Moor, B. and Suykens, J.A. (2011), “Tensor versus matrix completion: a comparison with application to spectral data”, IEEE Signal Processing Letters, Vol. 18 No. 7, pp. 403-406.

Sun, L.J. and Axhausen, K.W. (2016), “Understanding urban mobility patterns with a probabilistic tensor factorization framework”, Transportation Research Part B: Methodological, Vol. 91, pp. 511-524.

Tan, H., Feng, G., Feng, J., Wang, W., Zhang, Y.J. and Li, F. (2013), “A tensor-based method for missing traffic data completion”, Transportation Research Part C: Emerging Technologies, Vol. 28, pp. 15-27.

Tang, K., Chen, S., Liu, Z. and Khattak, A.J. (2018), “A tensor-based Bayesian probabilistic model for citywide personalized travel time estimation”, Transportation Research Part C: Emerging Technologies, Vol. 90, pp. 260-280.

Tomasi, G. and Bro, R. (2005), “PARAFAC and missing values”, Chemometrics and Intelligent Laboratory Systems, Vol. 75 No. 2, pp. 163-180.

Tucker, L.R. (1966), “Some mathematical notes on three-mode factor analysis”, Psychometrika, Vol. 31 No. 3, pp. 279-311.

Usman, K. and Ramdhani, M. (2019), “Comparison of traditional interpolation methods and compressive sensing for missing data reconstruction”, 2019 IEEE International Conference on Signals and Systems (ICSigSys), IEEE, Bandung, pp. 29-33.

Wang, Y., Zheng, Y. and Xue, Y. (2014), “Travel time estimation of a path using sparse trajectories”, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 25-34.

Williams, B.M. and Hoel, L.A. (2003), “Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: theoretical basis and empirical results”, Journal of Transportation Engineering, Vol. 129 No. 6, pp. 664-672.

Wu, C.H., Ho, J.M. and Lee, D.T. (2003), “Travel time prediction with support vector regression”, 6th IEEE International Conference on Intelligent Transportation Systems, IEEE, Shanghai, China, pp. 1438-1442.

Zhang, Y.L. and Xie, Y.C. (2006), “Application of genetic neural networks to real-time intersection accident detection using acoustic signals”, 85th Annual Meeting of the Transportation Research Board, Washington, DC, No. 1968, pp. 75-82.

Zhang, Y.L. and Xie, Y.C. (2007), “Forecasting of short-term freeway volume with v-support vector machines”, Transportation Research Record: Journal of the Transportation Research Board, Vol. 2024 No. 1, pp. 92-99.

Zhao, K., Popescu, S. and Zhang, X. (2015), “Bayesian learning with Gaussian processes for supervised classification of hyperspectral data”, Photogrammetric Engineering and Remote Sensing, Vol. 74 No. 10, pp. 1223-1234.

Zhao, Q., Zhang, L. and Cichocki, A. (2015), “Bayesian CP factorization of incomplete tensors with automatic rank determination”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 37 No. 9, pp. 1751-1763.

Acknowledgements

Funding: This research was funded by the National Natural Science Foundation of China (No. 51908018, U1964206).

Corresponding author

Yilong Ren can be contacted at: yilongren@buaa.edu.cn