Research on a map-matching algorithm based on priority rules for low-sampling-frequency vehicle navigation data

Purpose – Satellite positioning of vehicles contains errors. These errors cause positioning points to drift, so that the recorded vehicle trajectory deviates from the real road. This paper aims to solve this problem. Design/methodology/approach – The key technology for solving the problem is map matching (MM). At low sampling frequencies, adjacent points are far apart, which weakens the correlation between points and makes MM more difficult. In this paper, an MM algorithm based on priority rules is designed for the characteristics of vehicle trajectories at low sampling frequencies. Findings – The experimental results show that the MM algorithm based on priority rules can effectively match low-sampling-frequency trajectory data to the actual road. Its matching accuracy is better than that of other similar algorithms, and its processing speed reaches 73 points per second. Research limitations/implications – Although the algorithm design and experimental verification take the diversity of GPS sampling frequencies into account, the experimental data still come from a single source. Originality/value – Experiments on GPS trajectory data from the Ministry of Transport show that the priority-based algorithm is more accurate than the weight-based algorithm. Its accuracy is over 98.1 per cent, which is better than other similar algorithms.


Introduction
With the increase in car ownership, the application of vehicle networking technology has developed rapidly. Based on the driving behavior data acquired by vehicle network terminals, services such as vehicle network insurance, traffic supervision, route recommendation, travel time estimation and prediction and urban planning can be carried out. These data are mainly collected by GPS vehicle terminals and onboard diagnostics. However, positioning error makes it impossible to apply the data directly, especially in specialized vehicle networking applications such as transportation. The GPS data of the national "two passengers and one danger" vehicle supervision platform are generally sent every 30 s. It is difficult to match such GPS data accurately to the GIS road network using the existing map matching algorithms (MMA).
MMA is an algorithm that precisely matches a location to the digital map in a GIS. A large number of scholars at home and abroad have studied and improved GPS MMAs (Ollero, 2007; Boucher et al., 2013; Paefgen et al., 2014; Perrine et al., 2016). Analysis of this work shows that when existing MMAs are applied to vehicle track data with a low sampling frequency (a sampling interval of 15 s or more between adjacent points), their matching precision is low. Marchal et al. (2005) proposed a scoring model for large-scale data. Considering the point-to-segment distance and the connectivity principle, they set up a weight formula and discussed the parameter settings. The results show that the algorithm has good timeliness. Blazquez and Vonderohe (2009) considered the differences between positioning data sets, proposed a parameter adjustment scheme for a rule-based decision MM algorithm and found that the parameters affect the matching accuracy differently for positioning data of different sampling frequencies. Mokhtari et al. (2014) proposed an integrated-weighted MMA for particle filtering. The algorithm considers two factors, heading angle and speed, and matches well when the GPS signal is shielded in a complex road network. Quddus and Washington (2015) took the acquisition frequency of GPS data into account in their experiments and proposed an MM weight model based on distance and driving direction. Hashemi and Karimi (2016) proposed an MMA with dynamic weights that considers the course angle, the distance factor and the distance between adjacent points. All of the above algorithms have much lower matching precision when applied to GPS points with a low sampling frequency. Ming and Karimi (2009) proposed a global map matching method based on a Markov model for wheelchair navigation (low speed and low sampling frequency). Goh et al. (2012) improved the HMM algorithm with a state transition probability determined by an SVM and conducted experimental analysis on the same data. The algorithm has high precision but a high time cost. Building on this, Raymond et al. (2013) disregarded the information between observation points and used the distance between two points directly as the parameter of the state transition probability; their experimental results were better. In these three algorithms, the state transition probability is too complex to determine, and the matching precision for low-frequency sampling points cannot be guaranteed.
Therefore, this paper draws on the ideas of existing MMAs and considers the angle between the speed direction and the road traffic direction, together with the shortest distance from the point to the candidate road segment. Based on a high-precision GIS electronic map, the "two passengers and one danger" GPS data of the Ministry of Transport and the application range of the algorithm, a map matching algorithm based on priority rules (MMPR) is designed, and its effectiveness is verified by comparison.
The innovations of the algorithm are as follows. First, MMPR finds the candidate road segment by setting priorities for the factors, unlike the existing weight algorithms, so that the relative importance of the two factors, speed direction and distance, can be measured effectively. Second, MMPR accurately calculates the angle between the speed direction and the road traffic direction: it finds the point on the candidate road segment closest to the observation point, considers the tangential direction and the road traffic type at that point and finally determines the angle between the road traffic direction and the speed direction through angle conversion.

Research and design of map matching algorithm
The specific process of the algorithm is as follows. First, the observation point and the electronic map are input, and the candidate road segments are determined by the candidate radius. Next, the angle between the speed direction and the road traffic direction is calculated to find the candidate segment with the smallest angle. Then the best candidate point is found based on the shortest distance from the point to the candidate road segment. Finally, the coordinates of the observation point are corrected using the coordinates of the candidate point, and the trajectory is drawn. The process iterates forward according to the timestamps of the observation points, and the algorithm ends when all points have been matched.
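The priority rule at the core of this process can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the `Candidate` record, its field names and the two functions are hypothetical, and the priority is read as a lexicographic comparison (smallest angle first, smallest distance second).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """Hypothetical record for one candidate road segment of an observation point."""
    segment_id: str
    angle_deg: float                 # angle between speed and road direction, 0..180
    distance_m: float                # shortest distance from the point to the segment
    foot_point: Tuple[float, float]  # closest point on the segment (corrected position)


def match_point(candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick a candidate by priority: smallest angle first, then smallest distance."""
    if not candidates:
        return None  # no road segment inside the candidate radius
    return min(candidates, key=lambda c: (c.angle_deg, c.distance_m))


def match_trajectory(points_with_candidates):
    """Process (timestamp, candidates) pairs in timestamp order and correct each point."""
    matched = []
    for timestamp, candidates in sorted(points_with_candidates, key=lambda p: p[0]):
        best = match_point(candidates)
        matched.append((timestamp, best.foot_point if best else None))
    return matched
```

With this ordering, a segment that is close to the observation point but nearly perpendicular to the heading loses to a more distant segment aligned with the heading, which is the behavior the priority rule is meant to produce.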

Factor selection
Based on the research status of MMA, current map matching mainly considers three factors: speed direction, distance from the observation point to the candidate point and road accessibility. The priority-rule MMA in this paper considers the speed direction first and the distance factor second, and ignores the road accessibility factor. The factors are selected and calculated below from the perspectives of the existing algorithms and factor analysis. The limitations of the existing algorithms are as follows. The rule-based comprehensive weight MMAs each have their own advantages and disadvantages. The main reason is the complexity of GPS data sets and the diversity of road network structures, which make it difficult to adapt the weight coefficients of distance and speed direction dynamically. Therefore, an efficient method is needed to measure the relative importance of the two factors, speed direction and distance. Observation of GPS data shows that the real road segment is not always the candidate segment closest to the observation point. The fundamental reason is that the error region of GPS positioning is an elliptical domain, and a candidate segment closer to the center of the ellipse is not necessarily more likely to be the real one. Therefore, the distance factor is actually less important than the speed angle factor. When an HMM is used to solve the map matching problem, it is in essence the road accessibility that is considered. Figure 1 shows the frequency statistics of the length of the polyline segments of the electronic map. Statistical analysis of the road network structure shows that the length of the basic element of a road segment, that is, a single polyline segment, ranges from 0 to 1,000 m. Comparing the length of the polyline segments with the distance between adjacent GPS points shows that the reference value of the road accessibility factor is unstable. As can be seen from the figure, 95 per cent of single road segments have a length in (0 m, 200 m). Assuming an average speed of v = 50 km/h ≈ 13.9 m/s, the threshold sampling interval at which adjacent GPS points fall on adjacent road segments is f = 200/13.9 ≈ 15 s. It can therefore be initially assumed that when f < 15 s, the GPS sampling frequency is high, adjacent points are close and road accessibility is considered to be good; when f > 15 s, the GPS sampling frequency is low, adjacent points are far apart and the road accessibility factor has little reference significance.
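The threshold derived above can be restated as a short worked calculation. It uses only the figures quoted in the text (an assumed average speed of 50 km/h and a segment length bound of 200 m); the final step rounds up to the 15 s used as the threshold.

```latex
v = 50\ \text{km/h} = \frac{50\,000\ \text{m}}{3600\ \text{s}} \approx 13.9\ \text{m/s},
\qquad
f = \frac{200\ \text{m}}{13.9\ \text{m/s}} \approx 14.4\ \text{s} \approx 15\ \text{s}.
```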
If different matching algorithms were used for different sampling frequencies, the sampling frequency of the GPS points would first have to be identified from the timestamps before the corresponding algorithm was applied. However, actual GPS data can be noisy. Observation of the data shows that the sampling frequency of a GPS point sequence varies, with the time difference between adjacent points fluctuating between 0 and 30 s. For such a noisy sequence, matching by frequency would switch back and forth between algorithms, which is obviously inefficient. Moreover, when the reachability of candidate roads is considered, if the previous point is matched to the wrong candidate road segment, the next point will be matched to the same segment, producing a run of consecutive matching errors. Based on the above analysis, the MMPR algorithm proposed in this section considers the speed direction first and the distance factor second; the road accessibility factor is not considered.

Candidate segment selection
The selection of candidate road segments is the first step of map matching. Selecting suitable candidate road segments improves both the computational efficiency of the algorithm and the matching accuracy. The true GPS position generally falls within an elliptical area, with certain major and minor axes, centered on the observation point. To simplify the calculation, the elliptical area can be abstracted into a candidate circle. A buffer of a certain radius is constructed with the observation point as its center, and a spatial intersection query between the buffer and the road network yields the candidate road segments. The radius r of the candidate circle is calculated according to equation (1), where a represents the positioning error of the road network, generally 5 m; v represents the width of the road, taken as 20 m on average; b is the GPS error; and m represents the width of the vehicle, generally 2 m. For the C/A pseudo-random code used in the civil signal system, the ranging accuracy is between 2.93 m and 29.3 m, so an average error of b = 16 m can be taken initially. This finally gives r = 30 m. After the candidate radius has been set initially, it can be adjusted through experimental analysis.
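The buffer query can be illustrated with a small planar sketch. Two assumptions should be noted. First, the form of equation (1) used in `candidate_radius` (map error plus GPS error plus half of the road width left free by the vehicle) is only one combination that reproduces the stated r = 30 m; it is an assumption, not the published formula. Second, the paper performs the intersection query in a spatial database, whereas the sketch uses a plain point-to-segment distance test on coordinates in metres; all function and parameter names are hypothetical.

```python
import math


def candidate_radius(map_error_m=5.0, gps_error_m=16.0,
                     road_width_m=20.0, vehicle_width_m=2.0):
    """ASSUMED form of equation (1): a + b + (v - m) / 2, which gives 30 m."""
    return map_error_m + gps_error_m + (road_width_m - vehicle_width_m) / 2.0


def point_segment_distance(p, s, e):
    """Shortest planar distance from point p to the segment s-e (metres)."""
    (px, py), (sx, sy), (ex, ey) = p, s, e
    dx, dy = ex - sx, ey - sy
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - sx, py - sy)  # degenerate segment
    # Parameter of the perpendicular foot, clamped to stay on the segment.
    t = ((px - sx) * dx + (py - sy) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (sx + t * dx), py - (sy + t * dy))


def candidate_segments(p, segments, radius):
    """Return ids of (id, start, end) segments that touch the circular buffer around p."""
    return [sid for sid, s, e in segments
            if point_segment_distance(p, s, e) <= radius]
```

A smaller radius speeds up the query but risks excluding the true road when the GPS error is large, which is why the text suggests tuning r experimentally after the initial setting.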

The angle between the speed direction and the road traffic direction
In the angle design, the point on the candidate road segment closest to the observation point is found first; the tangent to the curve at that point is then drawn, and its direction is taken as the direction of the road segment. The velocity direction in the GPS data ranges over (0°, 360°); the direction of traffic in the electronic map is represented by an ordered variable. By combining the start and end points of the polyline, the angle between the tangent at a point on the curve and the true north direction can be calculated; the absolute value of the difference between this angle and the speed direction angle then gives the angle between the speed direction and the road traffic direction (denoted λ). The determination of the angle of the road traffic direction (denoted θ) differs for two-way and one-way streets. In the electronic map, one-way streets are divided into antegrade (forward) roads, where the road entrance is marked as the starting point of the line segment in the GIS electronic map, and retrograde roads, where the road entrance is marked as the end point of the line segment; therefore, the determination of the angle is discussed in three cases.
2.3.1 Two-way street. As shown in Figure 2 (left), the road network in the electronic map is represented by polyline segments, whose starting and ending points are generally marked. On a two-way road segment, the vehicle can travel from the starting point to the end point or from the end point to the starting point. When it is not known from which intersection the vehicle entered, the difference between the starting point and the end point can be ignored: the observation point is projected perpendicularly onto the line segment to obtain the foot point, and the tangent to the line segment is drawn at the foot point. The angle of the tangent is generally measured from the horizontal positive direction and is denoted θ; the speed angle, denoted α, is generally calculated by the GPS receiver with the true north direction as the baseline.
2.3.2 Antegrade road. As shown in Figure 2 (middle), a vehicle on an antegrade road travels from the starting point to the end point of the polyline segment. Therefore, the difference between the starting point and the end point cannot be ignored. After the foot point of the observation point on the candidate road segment is found, the tangent is drawn at the foot point; the tangent is directed, pointing from the start of the polyline toward the foot point.
2.3.3 Retrograde road. As shown in Figure 2 (right), a vehicle on a retrograde road travels from the end point of the polyline to the starting point. Therefore, the difference between the starting point and the end point cannot be ignored. After the foot point of the observation point on the candidate road segment is found, the tangent is drawn at the foot point, directed from the end of the line segment toward the foot point.
The GPS speed direction angle is measured from the true north direction. The tangential angle of the road direction obtained above is measured from the horizontal direction and must be converted into an angle measured from true north before the difference with the speed direction angle can be taken. The angle λ between the speed direction and the road determined at this point lies in the range (0, 2π) and needs to be converted to (0, π) to take part in the calculation. Therefore, conversion rules are designed according to geometric knowledge, as shown in Table I.
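The three cases and the angle conversion can be sketched as follows. This is an illustration under assumptions rather than a transcription of Table I: the road direction is computed directly as a bearing from true north (so no separate horizontal-to-north conversion is needed), the polyline piece containing the foot point stands in for the tangent, coordinates are planar (east, north) pairs, and the function names and the `traffic` labels are hypothetical.

```python
import math


def bearing_deg(start, end):
    """Bearing of the vector start->end, clockwise from true north, in [0, 360)."""
    d_east = end[0] - start[0]
    d_north = end[1] - start[1]
    return math.degrees(math.atan2(d_east, d_north)) % 360.0


def fold_to_180(diff_deg):
    """Fold an angle difference from the full circle into [0, 180] degrees."""
    diff = abs(diff_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def heading_road_angle(heading_deg, start, end, traffic="two-way"):
    """Angle between the GPS heading and the permitted traffic direction."""
    forward = fold_to_180(heading_deg - bearing_deg(start, end))
    if traffic == "forward":      # antegrade: traffic runs start -> end
        return forward
    if traffic == "retrograde":   # traffic runs end -> start
        return 180.0 - forward
    if traffic == "two-way":      # either direction is allowed
        return min(forward, 180.0 - forward)
    raise ValueError(f"unknown traffic type: {traffic}")
```

For a two-way street the smaller of the two possible angles is taken, which corresponds to ignoring the difference between the start and end points of the segment.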

Data preprocessing

GPS track data processing
The GPS data in this paper come from the "two passengers and one danger" data of the Ministry of Transport. The data are in the offline DMP file format and must be imported into an Oracle database before use. The original GPS data contain 16 fields, including time stamp, latitude and longitude, speed and heading angle, and exhibit both data redundancy and data anomalies. They are therefore preprocessed as follows.
3.1.1 Remove data redundancy. The GPS data are seriously redundant, in two main ways. The first type of redundancy is the presence of duplicate records in the original GPS data table, which is probably caused by the backup mechanism of the database. This paper uses SQL statements to deduplicate the data in Oracle. During deduplication, the massive GPS data are also filtered to select the data to be used. The second type of redundancy is GPS points located where there is no road network structure. This is probably because the vehicle has arrived, after work, in an area without a road network and the GPS receiver has not been turned off after parking. Data are still collected at this time, but they do not help map matching. Therefore, to improve the matching efficiency of the algorithm, a spatial query is used, and a GPS point found to have no candidate road segments is deleted.
3.1.2 Abnormal data rejection. GPS track data anomalies are mainly caused by terminal recording errors or data transmission errors and are classified into three types. The first is an abnormal extreme value of speed, which is caused by a recording error. In the actual matching process, such records can be filtered out by setting an upper limit on speed in the SQL statement. The second is an extreme abnormality of latitude and longitude. For such data, obstructions around the vehicle may cause the positioning error to exceed 100 m, and these records should be eliminated in the actual matching process. The main fields used in the map matching process are time, latitude and longitude, and speed direction. Unnecessary fields should be filtered out to improve the timeliness of the algorithm. The final GPS attributes are shown in Table II.
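The deduplication, speed filtering and field selection described above can be illustrated in Python, used here in place of the SQL statements run in Oracle. The record layout, the field names and the 120 km/h speed limit are hypothetical choices for the sketch; the spatial deletion of points without candidate segments and the latitude and longitude check are not covered.

```python
# Fields kept for matching; the names (and vehicle_id itself) are assumptions.
KEPT_FIELDS = ("vehicle_id", "time", "lon", "lat", "heading_deg")


def preprocess(records, max_speed_kmh=120.0):
    """Drop duplicate records and speed outliers, then keep only the needed fields."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["vehicle_id"], rec["time"], rec["lon"], rec["lat"],
               rec["speed_kmh"], rec["heading_deg"])
        if key in seen:
            continue  # first type of redundancy: duplicate record
        seen.add(key)
        if not 0.0 <= rec["speed_kmh"] <= max_speed_kmh:
            continue  # abnormal extreme value of speed
        cleaned.append({name: rec[name] for name in KEPT_FIELDS})
    return cleaned
```

Filtering before matching keeps the per-point cost of the later spatial queries from being spent on records that cannot contribute to the trajectory.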

Electronic map processing
The electronic map used in this paper is an industrial-grade vector map in shp format. It contains national road, provincial highway, expressway and county-level road network information and takes up about 7 GB of disk space. After import into the spatial database ArcSDE through ArcGIS software, the road network contains a total of 35 attribute fields. To improve the matching timeliness of the algorithm, the electronic map is processed as follows.
3.2.1 Classification and screening of electronic map attribute fields. The electronic map is spatial data and includes spatio-temporal information and attribute information. The spatio-temporal information contains the spatial attributes and geometric topological relations of the map data objects; it is mainly used for calculation and is generated automatically in the spatial database. The attribute information contains the feature attributes of the features in the map and is mainly used for display and description. After analysis and filtering, the key road network attribute fields are selected as shown in Table III.
3.2.2 Spatial clipping of electronic maps. To improve the speed at which the front-end program loads the map, the map needs to be clipped. First, the road network information in Zhejiang is selected, including national roads, national highways and highway networks. Then, according to the administrative division and the latitude and longitude range of Zhejiang Province (118°-123°E, 27°-32°N), the national electronic map is clipped to obtain the rectangular road network containing Zhejiang Province, as shown in Figure 3.
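The rectangular selection can be sketched as a bounding-box filter. This is a simplified stand-in for the clipping done in the GIS software: it keeps a road if any of its vertices lies inside the rectangle and does not cut geometries at the boundary. The road record layout and the function names are hypothetical; the rectangle is the one stated in the text.

```python
# lon_min, lat_min, lon_max, lat_max of the rectangle around Zhejiang Province
ZHEJIANG_BBOX = (118.0, 27.0, 123.0, 32.0)


def in_bbox(lon, lat, bbox=ZHEJIANG_BBOX):
    """True if the point lies inside the rectangle (boundary included)."""
    lon_min, lat_min, lon_max, lat_max = bbox
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


def clip_roads(roads, bbox=ZHEJIANG_BBOX):
    """Keep roads that have at least one (lon, lat) vertex inside the rectangle."""
    return [road for road in roads
            if any(in_bbox(lon, lat, bbox) for lon, lat in road["vertices"])]
```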

Spatial coordinate transformation.
The position description of any object requires a reference coordinate system. Map matching requires the spatial coordinate systems of the GPS coordinates and the electronic map to be unified. The map used in this paper is an electronic map of China. Domestic electronic maps mainly use the Beijing 1954 or Xi'an 1980 geographic coordinate system.
GPS uses WGS1984 (the 1984 geodetic coordinate system), whose results are expressed as latitude, longitude and altitude (B, L, H). Therefore, WGS1984 coordinates need to be transformed into the geographic coordinate system of the map. The conversion is carried out according to equation (2), in which (X, Y, Z) represents the coordinates in the geographic coordinate system and (λ, φ, H) represents the WGS coordinates. After this transformation, the GPS coordinates can be converted into the Beijing 1954 or Xi'an 1980 coordinate system by referring to the reference plane parameters in the earth ellipsoid data table. Once the coordinate conversion is complete and the reference coordinate system is unified, map matching can be performed. Otherwise, the GPS points are displayed on the road network with an error exceeding 100 m, and map matching cannot be performed.
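The first step of the conversion can be illustrated with the standard transformation from geodetic coordinates (λ, φ, H) to Cartesian coordinates (X, Y, Z) on the WGS84 ellipsoid. Reading equation (2) as this standard transformation is an assumption, and the function name is hypothetical. The subsequent shift to Beijing 1954 or Xi'an 1980 depends on datum parameters that are not given here and is not shown.

```python
import math

WGS84_A = 6378137.0                    # semi-major axis, metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_cartesian(lon_deg, lat_deg, h_m):
    """Convert longitude, latitude (degrees) and height (metres) to (X, Y, Z) in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    # Radius of curvature in the prime vertical at latitude phi.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(phi) ** 2)
    x = (n + h_m) * math.cos(phi) * math.cos(lam)
    y = (n + h_m) * math.cos(phi) * math.sin(lam)
    z = (n * (1.0 - WGS84_E2) + h_m) * math.sin(phi)
    return x, y, z
```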

Experiment analysis
Analysis of results
4.1.1 Analysis of overall matching results. Figure 4 shows the overall result of matching GPS traces at low sampling frequencies. Taking the trajectory of one vehicle on a single day as an example, the original data for the day contain 2,833 points; after data preprocessing (removing data redundancy), 130 points remain. The trajectory passes through the G15 Shenhai Expressway, G104, the S10 Wenzhou Ring Expressway and the G1513 Wenli Expressway.
With the MMPR algorithm, 128 points are matched correctly and 2 points are matched incorrectly. The errors arise because the heading angle of the GPS point was recorded incorrectly. The weighting algorithm considers the angle between the velocity and the road direction and the distance factor. The two factors are first normalized. Iterative experiments show that an angle weighting coefficient of 0.6 and a distance weighting coefficient of 0.4 give the highest matching precision. Under the same test environment, the weight algorithm matches 109 points correctly and 21 points incorrectly, with a matching precision of 83.3 per cent, a matching duration of 1.23 s and a matching speed of 106 points/s.
4.1.2 Analysis of local matching results. In Figure 5, an X marks an original point, a dot marks a corrected point, the arrow indicates the traveling direction, the bold green curve indicates the traveling trajectory and the other curves indicate the road network.
Figure 5 (left) shows the result of MMPR matching on a road network with a roundabout and parallel roads. The four points in the figure are all matched to the correct road. Figure 5 (right) shows the weight algorithm applied at the same geographical location; two points are matched incorrectly. The point in the lower left corner is matched to the roundabout because it is closer to the inner section of the roundabout, and the uppermost point is matched to the other side of the parallel road. Figure 6 (left) shows the result of matching a complex intersection with the algorithm of this paper, and Figure 6 (right) shows the result of the weight method at the same position. Observation shows that all points are matched correctly by the algorithm of this paper, whereas the weight method mismatches a point at the lower left corner of the island to the adjacent road segment, because that point is closer to the inner side of the roundabout.

Algorithm performance evaluation
To rule out chance results, ten days of GPS trajectory data from ten vehicles with low-frequency sampling were selected for experimental analysis. The results are shown in Table IV, where the value to the left of the symbol "‖" is the MMPR result and the value to the right is the result of the weighting method. The final matching accuracy of the algorithm is 98.10 per cent with a standard deviation of 0.012, and the matching speed is 73 points/s with a standard deviation of 1.748.
To show the effectiveness of the MMPR algorithm clearly, in addition to the comparison with the weighting method, the results reported in two other studies are cited (Ming and Karimi, 2009; Liu et al., 2007). The former uses GPS data for wheelchair navigation, whose sampling frequency is low. The latter (Liu et al., 2007) collects bus data every 30 seconds, so both are comparable.
It can be seen from the table that the average accuracy of the algorithm is 98.10 per cent, which is 15.82 per cent higher than the weight algorithm, 2.10 per cent higher than the algorithm in Ming and Karimi (2009) and 0.30 per cent higher than the algorithm in Liu et al. (2007). The average processing speed reaches 73 points per second; this is lower than that of the weight method, whose simple calculation rule gains running speed at the cost of accuracy.

Conclusions and discussion
Based on existing MMA theory, this paper does the following work. A priority-based MMA is designed. After demonstrating that the angle between the speed direction and the road traffic direction has a higher priority than the distance from the point to the candidate road segment, a method for calculating the angle between the speed direction and the road traffic direction is designed. Experiments on the "two passengers and one danger" GPS trajectory data of the Ministry of Transport verify that the priority-based algorithm is more accurate than the weight-based algorithm; the accuracy of the algorithm in this paper exceeds 98.1 per cent, which is better than other similar algorithms. On the physical machine used in the experiment, the map matching speed reached 73 points per second.
Although the algorithm design and experimental verification in this paper take the diversity of GPS sampling frequencies into account, the experimental data still come from a single source. The road network structures that the experimental vehicle trajectories can be matched to on the electronic map are mainly of three kinds: national highways, provincial highways and expressways. When the algorithm is applied to inter-city road matching or to other more complex environments, its accuracy and timeliness may be reduced. Therefore, the next research direction is to collect vehicle trajectory data from more complex road network structures for algorithm testing, identify the problems that arise in actual matching, find their causes and further improve the algorithm.

Figure 1. Frequency distribution of the length of the line segment
Figure 2. Two-way tangential angle and velocity direction angle
Figure 3. Rectangular road network (including Zhejiang Province)
Figure 4. Overall matching results
Figure 6. Complex intersection matching results