Traffic flow prediction method, system, computer equipment and storage medium
Technical Field
The invention belongs to the technical field of traffic prediction, and particularly relates to a traffic flow prediction method, a system, computer equipment and a storage medium based on an attention mechanism and a bidirectional GRU.
Background
Current traffic flow prediction methods address the following problem: traffic information for a future time period is predicted from historical traffic information in the road network, where the traffic information includes traffic flow, speed, congestion degree and the like. Traffic flow data is time series data with strong temporal correlation, so the key to traffic flow prediction is to mine more features from low-dimensional data and to capture the temporal correlation of the traffic flow data.
Prior to the rise of deep learning, methods of traffic flow prediction included regression analysis, moving average, Kalman filtering, and the like. The regression analysis method is a non-parametric modeling method based on state space theory and statistical pattern recognition technology; it has relatively high prediction precision, but its assumptions are strong: the predicted variable must be stationary and the function must follow a certain distribution rule. The moving average method takes the average value of all data obtained before a prediction time point as the predicted value of the current period; it is a static prediction method, cannot reflect the dynamics and uncertainty of traffic flow, and has low prediction precision. The Kalman filtering method predicts traffic state parameters by establishing and solving a state equation of traffic flow change; it is suitable for linearly changing scenes and cannot fully capture the nonlinear characteristics of short-term traffic flow, so its prediction effect is poor.
With the rise and development of deep learning technology, the gated recurrent neural network (gated recurrent unit, GRU) provides a new direction for short-term traffic flow prediction. The gated recurrent neural network is a kind of recurrent neural network; it retains the memory of the traditional recurrent neural network, is good at learning the nonlinear characteristics of a sequence, and alleviates the long-term dependency problem of the traditional recurrent neural network. However, a unidirectional gated recurrent neural network ignores the extraction of data in the reverse time direction, so in short-term traffic flow prediction the problem remains that local features cannot be fully extracted.
Disclosure of Invention
The invention aims to provide a traffic flow prediction method, a system, computer equipment and a storage medium, which solve two problems of existing traffic flow prediction: the precision of short-term traffic flow prediction is low, and a unidirectional gated recurrent neural network ignores data in the reverse time direction and therefore cannot fully extract local features.
The invention solves the technical problems by the following technical scheme: a traffic flow prediction method comprising the steps of:
Step 1: acquiring traffic flow data of a predicted road section and a road section adjacent to the predicted road section, wherein the traffic flow data comprises time, road section numbers and traffic flow information;
Step 2: preprocessing the traffic flow data to obtain traffic flow data vectors;
Step 3: establishing a bidirectional GRU model, wherein the bidirectional GRU model comprises a plurality of GRU units, and each GRU unit consists of a state vector, a reset gate and an update gate;
dividing the traffic flow data vector in the step 2 into an input vector and an output vector, taking the traffic flows of all road sections at the first t-1 moments as the input vector, and taking the traffic flow of the predicted road section at the t-th moment as the output vector;
Step 4: inputting the input vector in the step 3 into the bidirectional GRU model to obtain an intermediate state vector output by the bidirectional GRU model;
Step 5: inputting the intermediate state vector in the step 4 to an attention layer to obtain a predicted output vector;
Step 6: judging whether the difference between the predicted output vector in the step 5 and the output vector in the step 3 meets a set error, and if so, obtaining a trained attention-bidirectional GRU model; otherwise, turning to step 7;
Step 7: simultaneously adjusting parameters of the bidirectional GRU model and the attention layer, and repeating the steps 4-6 until the set error is met, so as to obtain the trained attention-bidirectional GRU model.
In the invention, the forward GRU of the bidirectional GRU model receives the input vector in the forward time direction, and the reverse GRU receives the input vector in the reverse time direction. The attention mechanism determines, from the intermediate state vector output by the bidirectional GRU model, the weight matrix of the predicted output vector, so that important features receive larger weights in the predicted output vector. Compared with a unidirectional GRU, the hidden layer state is doubled and the temporal features of the data can be fully extracted; the attention mechanism identifies the important features, which speeds up model convergence and more effectively improves the accuracy of short-term traffic flow prediction.
Further, in the step 2, the preprocessing includes standard normalization processing and vectorization processing; and carrying out standard normalization processing on the traffic flow data, and carrying out vectorization processing on the data subjected to the standard normalization processing to obtain a traffic flow data vector.
Further, the Z-score method is adopted to carry out standard normalization processing on the traffic flow data, and the standard normalization processing formula is as follows:
$$x'_{k,i} = \frac{x_{k,i} - u}{\sigma}$$
wherein $x'_{k,i}$ is the normalized traffic flow data of the kth road section at the ith moment, i = 1, 2, 3, …, t-1, t; $u$ is the average value of the traffic flow data of all the road sections at the t moments; $\sigma$ is the standard deviation of the traffic flow data of all the road sections at the t moments; and $x_{k,i}$ is the traffic flow data of the kth road section at the ith moment.
After standard normalization processing the data has zero mean and unit standard deviation, so the traffic flow data of all road sections at all historical moments lies on a common scale, which increases the convergence speed of subsequent model training.
Further, in the step 4, the intermediate state vector is:
$$
\begin{aligned}
h_t &= [\overrightarrow{h}_t,\ \overleftarrow{h}_t] \\
\overrightarrow{h}_t &= (1 - z_1) \odot \overrightarrow{h}_{t-1} + z_1 \odot \tilde{h}_1 \\
\overleftarrow{h}_t &= (1 - z_2) \odot \overleftarrow{h}_{t+1} + z_2 \odot \tilde{h}_2 \\
z_1 &= \delta(W_{z1} X_t + U_{z1} \overrightarrow{h}_{t-1} + b_{z1}) \\
z_2 &= \delta(W_{z2} X_t + U_{z2} \overleftarrow{h}_{t+1} + b_{z2}) \\
\tilde{h}_1 &= \tanh(W_1 X_t + U_1 (r_1 \odot \overrightarrow{h}_{t-1}) + b_1) \\
\tilde{h}_2 &= \tanh(W_2 X_t + U_2 (r_2 \odot \overleftarrow{h}_{t+1}) + b_2) \\
r_1 &= \delta(W_{r1} X_t + U_{r1} \overrightarrow{h}_{t-1} + b_{r1}) \\
r_2 &= \delta(W_{r2} X_t + U_{r2} \overleftarrow{h}_{t+1} + b_{r2})
\end{aligned}
$$
wherein $h_t$ is the intermediate state vector at time t; $\overrightarrow{h}_t$ is the hidden layer state of the forward recurrent layer at time t; $\overleftarrow{h}_t$ is the hidden layer state of the reverse recurrent layer at time t; $\overrightarrow{h}_{t-1}$ is the hidden layer state of the forward recurrent layer at time t-1; $\overleftarrow{h}_{t+1}$ is the hidden layer state of the reverse recurrent layer at time t+1; $z_1$ is the forward GRU update gate state and $z_2$ is the reverse GRU update gate state; $\tilde{h}_1$ is the candidate state of the forward GRU and $\tilde{h}_2$ is the candidate state of the reverse GRU; $W_{z1}$ and $U_{z1}$ are the update gate weight parameters of the forward GRU, and $W_{z2}$ and $U_{z2}$ are the update gate weight parameters of the reverse GRU; $b_{z1}$ is the bias term of the forward GRU update gate and $b_{z2}$ is the bias term of the reverse GRU update gate; $W_{r1}$ and $U_{r1}$ are the reset gate weight parameters of the forward GRU, and $W_{r2}$ and $U_{r2}$ are the reset gate weight parameters of the reverse GRU; $b_{r1}$ is the bias term of the forward GRU reset gate and $b_{r2}$ is the bias term of the reverse GRU reset gate; $\delta(\cdot)$ is the sigmoid activation function; $X_t$ is the traffic flow input at time t; $r_1$ is the forward GRU reset gate state and $r_2$ is the reverse GRU reset gate state; $W_1$ and $U_1$ are the trainable weight matrices of the forward GRU and $b_1$ is the bias term of the forward GRU; $W_2$ and $U_2$ are the trainable weight matrices of the reverse GRU and $b_2$ is the bias term of the reverse GRU; $\tanh$ is the tanh activation function; $\odot$ denotes element-wise multiplication; and t is the predicted time.
Further, in the step 5, the predicted output vector is:
$$
\begin{aligned}
O_t &= \sum_{j} a_j h_j \\
a_t &= \frac{\exp(q_t)}{\sum_{j} \exp(q_j)} \\
q_t &= f \cdot \tanh(w h_t + b)
\end{aligned}
$$
wherein $O_t$ is the predicted traffic flow output by the attention layer at time t; $a_t$ is the characteristic variable weight at time t and $a_j$ is the characteristic variable weight at time j; $h_j$ is the intermediate state vector at time j and $h_t$ is the intermediate state vector at time t; $q_t$ is the attention probability distribution value determined by the forward GRU and the reverse GRU at time t, and $q_j$ is the attention probability distribution value determined by the forward GRU and the reverse GRU at time j; j is the loop variable; $f$ and $w$ are trainable weight parameters; $b$ is the bias term; $\tanh$ is the tanh activation function; and t is the predicted time.
Further, in step 7, the Adam optimization algorithm is used to adjust the parameters of the bidirectional GRU model and the attention layer.
The Adam optimization algorithm is suitable for training on large-scale data and trains quickly. Because the traffic flow data volume is large and there are many parameters to train, the Adam optimization algorithm is selected to train the parameters of the bidirectional GRU model and the attention layer, so that they converge rapidly.
The invention also provides a traffic flow prediction system, comprising:
a data acquisition unit, used for acquiring traffic flow data of a predicted road section and a road section adjacent to the predicted road section, wherein the traffic flow data comprises time, road section numbers and traffic flow information;
a preprocessing unit, used for preprocessing the traffic flow data to obtain traffic flow data vectors;
a model building unit, used for building a bidirectional GRU model, wherein the bidirectional GRU model comprises a plurality of GRU units, and each GRU unit consists of a state vector, a reset gate and an update gate;
a data dividing unit, used for dividing the traffic flow data vector into an input vector and an output vector, taking the traffic flows of all road sections at the first t-1 moments as the input vector, and taking the traffic flow of the predicted road section at the t-th moment as the output vector;
a training input unit, used for inputting the input vector into the bidirectional GRU model to obtain an intermediate state vector output by the bidirectional GRU model;
a training prediction unit, used for inputting the intermediate state vector to the attention layer to obtain a predicted output vector;
and a judging and adjusting unit, used for judging whether the difference between the predicted output vector and the output vector meets the set error; if so, obtaining a trained attention-bidirectional GRU model; otherwise, simultaneously adjusting parameters of the bidirectional GRU model and the attention layer, inputting the input vector into the bidirectional GRU model again, and continuing to train the bidirectional GRU model and the attention layer until the trained attention-bidirectional GRU model is obtained.
The invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which processor implements the traffic flow prediction method as described above when executing the program.
The present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the traffic flow prediction method as described above.
Advantageous effects
Compared with the prior art, the traffic flow prediction method, the system, the computer equipment and the storage medium provided by the invention have the following advantages. The forward GRU of the bidirectional GRU model receives the input vector in the forward time direction, and the reverse GRU receives the input vector in the reverse time direction. The attention mechanism determines, from the intermediate state vector output by the bidirectional GRU model, the weight matrix of the predicted output vector, so that important features receive larger weights in the predicted output vector and local dependence in the prediction process is avoided. Compared with a unidirectional GRU, the hidden layer state is doubled and the temporal features of the data can be fully extracted; the attention mechanism identifies the important features, which speeds up model convergence and improves the prediction effect. The training data comprises the traffic flow data of the predicted road section and its adjacent road sections, so the prediction combines spatial correlation with temporal correlation, and the accuracy of short-term traffic flow prediction is more effectively improved.
Drawings
In order to illustrate the technical solutions of the present invention more clearly, the drawings needed in the description of the embodiments are briefly described below. The drawings described below show only one embodiment of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a traffic flow prediction method in an embodiment of the invention;
FIG. 2 is a diagram of an attention mechanism-bidirectional GRU model in an embodiment of the invention.
Detailed Description
The technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are intended to be within the scope of the invention.
As shown in FIG. 1, the traffic flow prediction method provided in this embodiment includes the following steps:
1. Data acquisition
Traffic flow data of the predicted road section and its adjacent road sections is acquired, wherein the traffic flow data comprises time, road section numbers and traffic flow information. In this embodiment, the traffic flow data comes from the Caltrans PeMS data source, and the time span of the traffic flow data is the 1st in 2020 to the 6th in 2020, wherein the traffic flow data from the 1st in 2020 to the 20th in 2020 is used as the training set, and the traffic flow data from the 21st in 2020 to the 30th in 2020 is used as the test set.
2. Preprocessing of data
The traffic flow data is preprocessed to obtain traffic flow data vectors. The preprocessing comprises standard normalization processing and vectorization processing: the traffic flow data is first normalized, and the normalized data is then vectorized to obtain a traffic flow data vector.
The Z-score method is adopted to carry out standard normalization processing on the traffic flow data, and the standard normalization processing formula is as follows:
$$x'_{k,i} = \frac{x_{k,i} - u}{\sigma}$$
wherein $x'_{k,i}$ is the normalized traffic flow data of the kth road section at the ith moment, i = 1, 2, 3, …, t-1, t; $u$ is the average value of the traffic flow data of all the road sections at the t moments; $\sigma$ is the standard deviation of the traffic flow data of all the road sections at the t moments; and $x_{k,i}$ is the traffic flow data of the kth road section at the ith moment. After standard normalization processing the data has zero mean and unit standard deviation, so the traffic flow data of all road sections at all historical moments lies on a common scale, which increases the convergence speed of subsequent model training.
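The normalization step can be sketched in a few lines of NumPy. This is a minimal illustration only: the function names `zscore_normalize` and `zscore_denormalize` and the array layout (road sections by moments) are assumptions made for this example, not part of the described method. A single mean and standard deviation are computed over all road sections and all moments, as the text describes.

```python
import numpy as np

def zscore_normalize(flows):
    """Z-score normalize traffic flows.

    flows: array of shape (num_sections, num_moments) (assumed layout).
    One global mean u and standard deviation sigma are computed over
    all road sections and all moments.
    """
    flows = np.asarray(flows, dtype=float)
    u = flows.mean()
    sigma = flows.std()
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot normalize")
    return (flows - u) / sigma, u, sigma

def zscore_denormalize(normalized, u, sigma):
    """Map normalized values (e.g. model predictions) back to flow units."""
    return np.asarray(normalized, dtype=float) * sigma + u
```

Keeping `u` and `sigma` from the training data allows predictions made in normalized units to be converted back to vehicle counts with `zscore_denormalize`.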
For example, the traffic flow data vector obtained by vectorization is X = {X1, X2, X3}, where X2 represents the traffic flow data of the predicted road section, and X1 and X3 represent the traffic flow data of the road sections adjacent to the predicted road section; X = {X1, X2, X3} is taken as the input vector of the bidirectional GRU model.
3. Bidirectional GRU model establishment and training samples
A bidirectional GRU model is established, wherein the bidirectional GRU model comprises a plurality of GRU units, and each GRU unit consists of a state vector, a reset gate and an update gate; the state vector is determined only by the currently input traffic flow data and the hidden layer state vector at the previous moment.
The time step of the bidirectional GRU model is set; for example, if the time step is set to 10, the bidirectional GRU model generates an output vector after receiving the input vectors at 10 moments. The input accepted by the bidirectional GRU model comprises the input vector in the forward time direction and the input vector in the reverse time direction, and the hidden layer state at each moment combines the hidden layer states of the forward and reverse time directions. The number of hidden units of the bidirectional GRU model is twice the dimension of the input vector; for example, if the dimension of the input vector is 2, the number of hidden units is 4.
The time step and the number of hidden units are set according to the actual training data.
The traffic flow data vector is divided into an input vector and an output vector: the traffic flows of all road sections at the first t-1 moments are taken as the input vector, the traffic flow of the predicted road section at the t-th moment is taken as the output vector, and the input vector and the output vector together form a training sample. When the bidirectional GRU model is trained, the input vector is used as the input of the bidirectional GRU model, and the output vector is compared with the predicted output vector output by the attention layer to judge whether the error precision requirement is met.
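The division into input and output vectors can be illustrated with a sliding window over the normalized series. The sketch below is an assumption-laden example: the function name `make_samples`, the (moments, road sections) layout and the `target_section` index are hypothetical, and `time_steps` plays the role of the t-1 input moments.

```python
import numpy as np

def make_samples(data, target_section, time_steps):
    """Build training samples with a sliding window.

    data: array of shape (num_moments, num_sections) (assumed layout).
    Each sample uses `time_steps` consecutive moments of all road
    sections as input and the next moment of the predicted road
    section as the output.
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0] - time_steps
    if n <= 0:
        raise ValueError("not enough moments for the chosen time_steps")
    inputs = np.stack([data[i:i + time_steps] for i in range(n)])
    outputs = data[time_steps:, target_section]
    return inputs, outputs
```

With three road sections and a time step of 10, `inputs` has shape (n, 10, 3) and `outputs` has shape (n,), one target value per window.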
4. Training of bi-directional GRU models
The input vector is input into the bidirectional GRU model to obtain the intermediate state vector output by the bidirectional GRU model, wherein the intermediate state vector is as follows:
$$
\begin{aligned}
h_t &= [\overrightarrow{h}_t,\ \overleftarrow{h}_t] \\
\overrightarrow{h}_t &= (1 - z_1) \odot \overrightarrow{h}_{t-1} + z_1 \odot \tilde{h}_1 \\
\overleftarrow{h}_t &= (1 - z_2) \odot \overleftarrow{h}_{t+1} + z_2 \odot \tilde{h}_2 \\
z_1 &= \delta(W_{z1} X_t + U_{z1} \overrightarrow{h}_{t-1} + b_{z1}) \\
z_2 &= \delta(W_{z2} X_t + U_{z2} \overleftarrow{h}_{t+1} + b_{z2}) \\
\tilde{h}_1 &= \tanh(W_1 X_t + U_1 (r_1 \odot \overrightarrow{h}_{t-1}) + b_1) \\
\tilde{h}_2 &= \tanh(W_2 X_t + U_2 (r_2 \odot \overleftarrow{h}_{t+1}) + b_2) \\
r_1 &= \delta(W_{r1} X_t + U_{r1} \overrightarrow{h}_{t-1} + b_{r1}) \\
r_2 &= \delta(W_{r2} X_t + U_{r2} \overleftarrow{h}_{t+1} + b_{r2})
\end{aligned}
$$
wherein $h_t$ is the intermediate state vector at time t; $\overrightarrow{h}_t$ is the hidden layer state of the forward recurrent layer at time t; $\overleftarrow{h}_t$ is the hidden layer state of the reverse recurrent layer at time t; $\overrightarrow{h}_{t-1}$ is the hidden layer state of the forward recurrent layer at time t-1; $\overleftarrow{h}_{t+1}$ is the hidden layer state of the reverse recurrent layer at time t+1; $z_1$ is the forward GRU update gate state and $z_2$ is the reverse GRU update gate state; $\tilde{h}_1$ is the candidate state of the forward GRU and $\tilde{h}_2$ is the candidate state of the reverse GRU; $W_{z1}$ and $U_{z1}$ are the update gate weight parameters of the forward GRU, and $W_{z2}$ and $U_{z2}$ are the update gate weight parameters of the reverse GRU; $b_{z1}$ is the bias term of the forward GRU update gate and $b_{z2}$ is the bias term of the reverse GRU update gate; $W_{r1}$ and $U_{r1}$ are the reset gate weight parameters of the forward GRU, and $W_{r2}$ and $U_{r2}$ are the reset gate weight parameters of the reverse GRU; $b_{r1}$ is the bias term of the forward GRU reset gate and $b_{r2}$ is the bias term of the reverse GRU reset gate; $\delta(\cdot)$ is the sigmoid activation function; $X_t$ is the traffic flow input at time t; $r_1$ is the forward GRU reset gate state and $r_2$ is the reverse GRU reset gate state; $W_1$ and $U_1$ are the trainable weight matrices of the forward GRU and $b_1$ is the bias term of the forward GRU; $W_2$ and $U_2$ are the trainable weight matrices of the reverse GRU and $b_2$ is the bias term of the reverse GRU; $\tanh$ is the tanh activation function; $\odot$ denotes element-wise multiplication; and t is the predicted time.
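To make the forward and reverse passes concrete, the following NumPy sketch runs one GRU over the window in the forward time direction and a second GRU in the reverse time direction, then joins the two hidden states at each moment. It is an illustration under assumptions: the helper names, the dictionary keys for the parameters, the zero initial states, the small random initialization and the common gate convention (new state = (1 - z) * previous state + z * candidate) are all choices made for this example, and joining by concatenation is one reading of the doubled hidden layer state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru_params(input_dim, hidden_dim, rng):
    """Randomly initialize one GRU direction (hypothetical layout)."""
    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))
    return {
        "Wz": mat(hidden_dim, input_dim), "Uz": mat(hidden_dim, hidden_dim),
        "bz": np.zeros(hidden_dim),
        "Wr": mat(hidden_dim, input_dim), "Ur": mat(hidden_dim, hidden_dim),
        "br": np.zeros(hidden_dim),
        "W": mat(hidden_dim, input_dim), "U": mat(hidden_dim, hidden_dim),
        "b": np.zeros(hidden_dim),
    }

def gru_step(p, x_t, h_prev):
    """One GRU update: update gate z, reset gate r, candidate state."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])
    h_cand = np.tanh(p["W"] @ x_t + p["U"] @ (r * h_prev) + p["b"])
    return (1.0 - z) * h_prev + z * h_cand

def bigru_forward(p_fwd, p_bwd, X):
    """X: (time_steps, input_dim). Returns (time_steps, 2 * hidden_dim)."""
    T = X.shape[0]
    hidden_dim = p_fwd["bz"].shape[0]
    h_fwd = np.zeros((T, hidden_dim))
    h_bwd = np.zeros((T, hidden_dim))
    h = np.zeros(hidden_dim)
    for t in range(T):               # forward time direction
        h = gru_step(p_fwd, X[t], h)
        h_fwd[t] = h
    h = np.zeros(hidden_dim)
    for t in reversed(range(T)):     # reverse time direction
        h = gru_step(p_bwd, X[t], h)
        h_bwd[t] = h
    # Intermediate state: forward and reverse states side by side.
    return np.concatenate([h_fwd, h_bwd], axis=1)
```

In the output, the first `hidden_dim` columns at moment t depend only on inputs up to t, while the remaining columns depend only on inputs from t onward, which is the extra reverse-direction information a unidirectional GRU does not see.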
5. Training of attention layer
The intermediate state vector in the step 4 is input to the attention layer to obtain the predicted output vector output by the attention layer, wherein the predicted output vector is as follows:
$$O_t = \sum_{j} a_j h_j$$
$$a_t = \frac{\exp(q_t)}{\sum_{j} \exp(q_j)}$$
$$q_t = f \cdot \tanh(w h_t + b) \qquad (13)$$
wherein $O_t$ is the predicted traffic flow output by the attention layer at time t; $a_t$ is the characteristic variable weight at time t and $a_j$ is the characteristic variable weight at time j; $h_j$ is the intermediate state vector at time j and $h_t$ is the intermediate state vector at time t; $q_t$ is the attention probability distribution value determined by the forward GRU and the reverse GRU at time t, and $q_j$ is the attention probability distribution value determined by the forward GRU and the reverse GRU at time j; j is the loop variable; $f$ and $w$ are trainable weight parameters; $b$ is the bias term; $\tanh$ is the tanh activation function; and t is the predicted time.
An attention mechanism (namely the attention layer) is used between the hidden layer states and the output of the bidirectional GRU model, and a weight parameter is calculated for the hidden layer state at each moment, so that local dependence on the hidden layer state is reduced and the prediction accuracy is improved.
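The weighting of the hidden layer states can be sketched as follows. The score for each moment follows formula (13), q = f·tanh(w h + b); normalizing the scores with a softmax and taking the weighted sum of the intermediate state vectors is the usual form of such a layer and is assumed here. The function name `attention_output` and the shapes of `f`, `w` and `b` are hypothetical. The text does not spell out how the weighted vector is reduced to a single flow value, so that final mapping is left out.

```python
import numpy as np

def attention_output(H, f, w, b):
    """Attention over intermediate state vectors.

    H: (time_steps, d) intermediate state vectors, one row per moment.
    w: (k, d), b: (k,), f: (k,) trainable parameters (assumed shapes).
    Returns the weighted sum of the rows of H and the weights.
    """
    q = np.tanh(H @ w.T + b) @ f    # one score per moment, formula (13)
    q = q - q.max()                 # shift for numerical stability
    a = np.exp(q) / np.exp(q).sum() # weights are positive and sum to 1
    return a @ H, a
```

Moments whose intermediate state scores higher receive a larger weight, so the output is not dominated by the last hidden state alone.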
6. Determination of whether the setting error is satisfied
Judging whether the difference between the predicted output vector in the step 5 and the output vector in the step 3 meets the set error, and if so, obtaining a trained attention-bidirectional GRU model; otherwise go to step 7.
In this embodiment, the setting error is 0.01, and can be adjusted according to the prediction accuracy requirement.
7. Parameter adjustment
The parameters of the bidirectional GRU model and the attention layer are adjusted simultaneously, and the steps 4-6 are repeated until the set error is met, so as to obtain the trained attention-bidirectional GRU model.
The parameters of the bidirectional GRU model and the attention layer are adjusted by the Adam optimization algorithm. The Adam optimization algorithm is suitable for training on large-scale data and trains quickly; because the traffic flow data volume is large and there are many parameters to train, using the Adam optimization algorithm allows the parameters of the bidirectional GRU model and the attention layer to converge rapidly.
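The loop formed by steps 4 to 7 (predict, compare against the set error, adjust, repeat) can be sketched with a small hand-written Adam update. Everything named here is an assumption for illustration: the `Adam` class, the `train_until` helper, the `loss_and_grads` callback that returns the current error and its gradients, and the hyperparameter defaults (the commonly used ones). Computing gradients for the actual bidirectional GRU and attention layer is not shown.

```python
import numpy as np

class Adam:
    """Minimal Adam optimizer over a dict of NumPy parameter arrays."""

    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m, self.v, self.t = {}, {}, 0

    def step(self, params, grads):
        self.t += 1
        for name, g in grads.items():
            # Exponential moving averages of the gradient and its square.
            m = self.beta1 * self.m.get(name, 0.0) + (1 - self.beta1) * g
            v = self.beta2 * self.v.get(name, 0.0) + (1 - self.beta2) * g * g
            self.m[name], self.v[name] = m, v
            # Bias-corrected estimates.
            m_hat = m / (1 - self.beta1 ** self.t)
            v_hat = v / (1 - self.beta2 ** self.t)
            params[name] = params[name] - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

def train_until(params, loss_and_grads, set_error=0.01, lr=0.001, max_iters=10000):
    """Adjust all parameters together until the set error is met."""
    optimizer = Adam(lr=lr)
    for iteration in range(max_iters):
        loss, grads = loss_and_grads(params)
        if loss <= set_error:            # step 6: set error satisfied
            return loss, iteration
        optimizer.step(params, grads)    # step 7: adjust parameters
    return loss_and_grads(params)[0], max_iters
```

Because `params` is a single dictionary, the weights of the recurrent model and of the attention layer would be updated in the same step, matching the simultaneous adjustment described above. The `max_iters` cap is a safeguard added for the example so the loop ends even if the set error is never reached.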
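The parameters of the bidirectional GRU model to be adjusted are: the weight parameters and bias terms of the update gates and reset gates of the forward GRU and the reverse GRU, and $W_1$, $U_1$, $b_1$, $W_2$, $U_2$, $b_2$.
The parameters of the attention layer to be adjusted are: $f$, $w$, $b$.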
The evaluation indexes of the unidirectional GRU, bidirectional GRU, and attention-bidirectional GRU models are compared in Table 1 below; the evaluation indexes are MSE (mean square error) and RMSE (root mean square error).
Table 1 evaluation index comparison table
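For reference, the two evaluation indexes can be computed as in the short sketch below. The function names `mse` and `rmse` are chosen for this example, and the sample values in the usage lines are made up purely to show the call; they are not results of the described models.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error between observed and predicted flows."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the flows."""
    return float(np.sqrt(mse(y_true, y_pred)))

# Illustrative values only (not measured data).
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(mse(observed, predicted), rmse(observed, predicted))
```

If the models are trained on normalized data, the predictions would normally be converted back to the original flow units before these indexes are computed, so that the values are comparable across models.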
The foregoing disclosure is merely illustrative of specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art will readily recognize that changes and modifications are possible within the scope of the present invention.