CN118587893B

Info

Publication number: CN118587893B
Application number: CN202411044400.2A
Authority: CN
Inventors: 章超; 刘卓; 嵇昂; 孔明明; 孙湛博
Original assignee: Sichuan Police College
Current assignee: Sichuan Police College
Priority date: 2024-07-31
Filing date: 2024-07-31
Publication date: 2024-10-18
Anticipated expiration: 2044-07-31
Also published as: CN118587893A

Abstract

Translated from Chinese

The present invention discloses an urban traffic flow prediction method based on a dynamic adaptive convolutional neural network, relating to the fields of traffic engineering and intelligent transportation technology. The method comprises: data collection and preprocessing; establishing a neural network model based on spatiotemporal residual convolution; constructing a dynamic adaptive convolution; establishing and optimizing a spatiotemporal dynamics model; and using a stochastic gradient descent algorithm to continuously optimize the hyperparameters of the dynamic spatiotemporal adaptive convolutional network through training, thereby obtaining a traffic state prediction scheme. By using the dynamic spatiotemporal adaptive convolutional network (DSTACN), the invention accounts for spatiotemporal dependencies when predicting traffic flow and captures the dynamic spatiotemporal relationships in the road network, making the prediction results more accurate.

Description

Urban traffic flow prediction method based on dynamic adaptive convolutional neural network

Technical Field

The present invention relates to the fields of traffic engineering and intelligent transportation technology, and in particular to an urban traffic flow prediction method based on a dynamic adaptive convolutional neural network.

Background Art

As cities expand rapidly, accurate urban traffic flow prediction is crucial for allocating road resources, urban planning, public safety, and risk assessment. However, traffic flow is affected by many dynamic factors, including infrastructure construction, peak travel periods, and other external conditions, which makes the prediction task extremely complex. Traffic flow prediction is challenging mainly because these factors produce complex spatiotemporal dependencies.

Traditional prediction methods mostly rely on statistical models, such as nonlinear regression models, Kalman filters, support vector regression (SVR), and the autoregressive integrated moving average (ARIMA) model. These methods focus mainly on the periodicity of traffic flow and ignore spatial correlation. In recent years, deep learning has provided new methods for traffic flow prediction, especially recurrent neural networks (RNN) and long short-term memory (LSTM) networks, which can effectively handle long-term and short-term dependencies. Deep learning models represent urban traffic flow as matrices and learn spatiotemporal information from them. For example, deep spatiotemporal residual convolutional networks use convolutional neural networks (CNN) to extract spatial information and combine them with LSTM to model short-term and long-term dependencies. Although these methods perform well, they may not fully account for the dynamic spatial complexity of road networks.

Existing models still struggle to understand the complex spatiotemporal dependencies inherent in traffic dynamics and adapt poorly to the dynamics and complexity of urban transportation systems. Although deep learning and graph neural network techniques have brought improvements, they still face challenges such as model complexity, large training data requirements, and high demand for computing resources.

Summary of the Invention

To solve the problems in the prior art, the purpose of the present invention is to provide an urban traffic flow prediction method based on a dynamic adaptive convolutional neural network. The invention uses DSTACN to account for spatiotemporal dependencies when predicting traffic flow and to capture the dynamic spatiotemporal relationships in the road network, making the prediction results more accurate.

To achieve the above purpose, the technical solution adopted by the present invention is an urban traffic flow prediction method based on a dynamic adaptive convolutional neural network, comprising the following steps:

Step 1. Data collection and preprocessing: according to the experimental design, the data are processed in different time domains, divided into data sets, and normalized to obtain the historical traffic data features on the road network and form an input feature matrix;

Step 2. Establishing a neural network model based on spatiotemporal residual convolution: using the processed input feature matrix, a basic neural network model based on spatiotemporal residual convolution is constructed;

Step 3. Constructing dynamic adaptive convolution: the spatial correlation exhibited by traffic data is captured by attention-guided dynamic convolution and deformable convolution, and the dynamic convolution and deformable convolution are combined into an adaptive convolution;

Step 4. Establishing and optimizing a spatiotemporal dynamics model: the temporal correlation of traffic data is captured by temporal attention; dynamic temporal attention and dynamic adaptive convolution are integrated into the neural network model based on spatiotemporal residual convolution to jointly construct the spatiotemporal dynamics model, yielding a dynamic spatiotemporal adaptive convolutional network;

Step 5. Using a stochastic gradient descent algorithm, the hyperparameters of the dynamic spatiotemporal adaptive convolutional network are continuously optimized through training to obtain a traffic state prediction scheme.

As a further improvement of the present invention, step 1 further includes integrating external factors: holiday information, weather conditions, and temperature.

As a further improvement of the present invention, in step 1, the data are processed in different time domains as follows:

Urban space is divided into blocks and traffic flow is quantified. The urban area is discretized into a grid of squares divided by longitude and latitude, and each square is uniquely identified by its position coordinates in the grid, that is, its row and its column. Within a specified time period, the inflow and outflow through any square are quantified by the following formula:

(Formula not reproduced in the source text.)

In the formula, each trajectory is an element of the set of trajectories collected in the time period and consists of a sequence of points given by their geospatial coordinates; a membership relation states whether a point lies inside or outside a given grid square; and the cardinality operator denotes the number of distinct elements in a set.

A flow tensor denotes the traffic observed in the grid region over the preceding time slots. For any given time interval, the cumulative inflow and outflow of all grid squares in the region are represented by a tensor whose first layer holds the inflow and whose second layer holds the outflow. The goal is to predict the traffic flow tensor of the next time slot from the historical data up to the current time.

As a further improvement of the present invention, in step 1, the data set division and normalization are performed as follows:

The data set is divided into a training set, a validation set, and a test set in chronological order, and the MinMax normalization method is applied to normalize the data set.

As a further improvement of the present invention, in step 2, the neural network model based on spatiotemporal residual convolution uses a squeeze-and-excitation attention mechanism for the attention calculation: first, global spatial information is compressed by global average pooling; then a fully connected layer, ReLU, and Softmax operations are used to generate normalized attention weights for convolution kernel and offset learning:

(Formula not reproduced in the source text.)

In the formula, one term is the attention scalar obtained by applying attention to the feature, and the other is the channel-wise product of the attention scalar and the feature.

As a further improvement of the present invention, in step 3, the deformable convolution adjusts the positions of the convolution kernel so that they stay consistent with the road network structure, and an attention-guided deformable convolutional network V2 is used to learn the offsets of the deformable convolution:

The input feature map is sampled on a regular grid, the sampled values are summed with the kernel weights, and an offset is added to each position of the regular grid. The formula is as follows:

(Formula not reproduced in the source text.)

In the formula, the terms are: a pixel position on the feature map; the convolution result at that position; the n-th position of the convolution kernel; the offset learned for that kernel position, which is a floating-point number; and the modulation scalar of the n-th position. The offset is obtained by applying a separate convolution layer to the input feature map. Because the offset is a floating-point number, the sampled value is obtained by bilinear interpolation.
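The formula is not legible in this text. As a hedged reconstruction, assuming it follows the usual modulated deformable convolution (the "V2" form) that the term descriptions match, it can be written as below. The symbol names ($x$, $y$, $p$, $p_n$, $w_n$, $\Delta p_n$, $\Delta m_n$, $N$) are assumptions chosen for illustration, with $N$ the number of kernel positions.

```latex
y(p) = \sum_{n=1}^{N} w_n \cdot x\left(p + p_n + \Delta p_n\right) \cdot \Delta m_n
```

Under this reading, setting every $\Delta p_n = 0$ and every $\Delta m_n = 1$ recovers an ordinary convolution on the regular grid.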

As a further improvement of the present invention, in step 3, dynamic convolution is introduced by aggregating multiple linear functions, as follows:

(Formula not reproduced in the source text.)

In the formula, each of the linear functions has its own attention weight, the k-th weight belonging to the k-th linear function. After attention is applied, these weights are not fixed but vary with each input.
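The formula is not legible in this text. A hedged reconstruction, assuming the common formulation of dynamic convolution as an attention-weighted aggregation of $K$ linear functions, is given below. The symbols ($g$, $\tilde{W}_k$, $\tilde{b}_k$, $\pi_k$, $K$) are assumptions chosen for illustration.

```latex
\begin{aligned}
y &= g\left(\tilde{W}(x)^{\top} x + \tilde{b}(x)\right), \\
\tilde{W}(x) &= \sum_{k=1}^{K} \pi_k(x)\, \tilde{W}_k, \qquad
\tilde{b}(x) = \sum_{k=1}^{K} \pi_k(x)\, \tilde{b}_k, \\
&\text{subject to } 0 \le \pi_k(x) \le 1, \quad \sum_{k=1}^{K} \pi_k(x) = 1 .
\end{aligned}
```

Here $\pi_k(x)$ plays the role of the attention weight of the k-th linear function, and its dependence on $x$ expresses that the weights vary with each input.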

As a further improvement of the present invention, in step 4, the temporal attention is calculated as follows:

(Formula not reproduced in the source text.)

In the formula, two weight terms denote the weights of the time channels, two further terms are bias terms, and the remaining function is the activation function.

As a further improvement of the present invention, in step 4, the entire network is trained by backpropagation through time to maximize the likelihood of generating accurate future predictions. Additional information such as continuous variables (temperature, wind speed) and one-hot-encoded categorical variables (holidays) is fed into the model through a fully connected layer as part of feature extraction.

As a further improvement of the present invention, the method further includes evaluating the performance of the dynamic spatiotemporal adaptive convolutional network according to existing evaluation metrics.

The present invention designs a deformable convolution kernel combined with an attention mechanism to extract road information and adapt to the road network structure. In addition, the invention uses the attention mechanism to dynamically aggregate multiple parallel convolution kernels so as to focus precisely on key areas of the network. From the temporal perspective, the invention applies a dynamic temporal attention mechanism to weekly, daily, and recent time data to capture the key moments in periodic traffic flow. By integrating these temporal and spatial elements, a dynamic spatiotemporal adaptive convolutional network (DSTACN) is constructed.

The beneficial effects of the present invention are:

1. The present invention improves the extraction of spatial features by optimizing the positions of the sampling points within the convolution kernel and by dynamically adjusting the parameters of the convolutional neural network across different time frames. This includes using the attention mechanism to guide the deformable convolution kernel toward regions of interest, and dynamically aggregating multiple parallel convolution kernels through attention to improve adaptability to different inputs.

2. The present invention combines attention-based adaptive convolution with a dynamic temporal attention mechanism suited to extracting weekly, daily, and recent temporal information, and incorporates both into the DSTACN model.

3. Through the improved traffic prediction method, the present invention helps traffic management departments anticipate traffic flow changes more accurately, take measures in advance to relieve congestion, and improve road utilization. This can reduce vehicle waiting time and fuel consumption, lower carbon emissions, and promote sustainable urban transportation.

Brief Description of the Drawings

FIG. 1 is a flow chart of an embodiment of the present invention;

FIG. 2 is a schematic diagram of the adaptive convolution process in an embodiment of the present invention;

FIG. 3 is a model architecture diagram of the dynamic spatiotemporal adaptive convolutional neural network in an embodiment of the present invention;

FIG. 4 is a comparison of the model prediction results and the actual traffic flow data on the TaxiBJ data set in an embodiment of the present invention;

FIG. 5 is a line chart comparing the RMSE evaluation metric of the prediction results in an embodiment of the present invention;

FIG. 6 is a line chart comparing the MAE evaluation metric of the prediction results in an embodiment of the present invention;

FIG. 7 shows the experimental results for the embedding-mode parameter of the adaptive convolution layer in an embodiment of the present invention;

FIG. 8 shows the experimental results for the dynamic convolution kernel parameters of the model in an embodiment of the present invention.

Detailed Description

The embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Example

As shown in FIG. 1, an urban traffic flow prediction method based on a dynamic adaptive convolutional neural network uses DSTACN to account for spatiotemporal dependencies when predicting traffic flow and to capture the dynamic spatiotemporal relationships in the road network, making the prediction results more accurate. Step 1 collects and preprocesses the traffic flow data, steps 2 to 4 construct the dynamic spatiotemporal adaptive convolutional neural network model, and steps 5 and 6 process and analyze the results:

Step 1: Data collection and preprocessing. According to the experimental design, the data undergo time-domain processing, data set division, normalization, and other preprocessing to obtain the historical traffic data features on the road network;

Step 2: Establishing a neural network model based on spatiotemporal residual convolution. Using the processed input feature matrix, a basic spatiotemporal residual convolutional neural network framework is constructed;

Step 3: Constructing dynamic adaptive convolution. The spatial correlation exhibited by traffic data is captured by dynamic convolution and deformable convolution, which are combined into an adaptive convolution;

Step 4: Spatiotemporal dynamics modeling and optimization. The temporal correlation of traffic data is captured by temporal attention. Dynamic temporal attention and dynamic adaptive convolution are integrated into the neural network model obtained in step 2 to jointly construct the spatiotemporal dynamics model, yielding the dynamic spatiotemporal adaptive convolutional network DSTACN of this embodiment;

Step 5: Model solving. The model is built according to the above steps, and a stochastic gradient descent algorithm is used to continuously optimize the model hyperparameters through training to obtain a traffic state prediction scheme;

Step 6: Model evaluation. The performance of the proposed model is evaluated according to existing evaluation metrics.

The present invention is further described below:

(1) Time-domain processing of the data:

First, the urban space is divided into blocks and the traffic flow is quantified. The urban area is discretized into a grid of squares divided by longitude and latitude. Each square is uniquely identified by its position coordinates in the grid, that is, its row and its column. Within a specified time period, the inflow and outflow through any square are quantified by the following formula:

(Formula not reproduced in the source text.)

In the formula, each trajectory is an element of the set of trajectories collected in the time period and consists of a sequence of points given by their geospatial coordinates. A membership relation states whether a point lies inside or outside a given grid square, and the cardinality operator denotes the number of distinct elements in a set.
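The inflow and outflow formula is not legible in this text. As a hedged reconstruction, assuming it follows the usual grid-based crowd-flow definition that the term descriptions match, it can be written as below. All symbol names ($\mathbb{P}$ for the trajectory set, $Tr$ for a trajectory, $g_k$ for its k-th point, $(i,j)$ for a grid square, $t$ for the time period) are assumptions chosen for illustration.

```latex
\begin{aligned}
x_t^{\mathrm{in},i,j}  &= \sum_{Tr \in \mathbb{P}} \left|\left\{ k > 1 \;\middle|\; g_{k-1} \notin (i,j) \,\wedge\, g_k \in (i,j) \right\}\right|, \\
x_t^{\mathrm{out},i,j} &= \sum_{Tr \in \mathbb{P}} \left|\left\{ k \ge 1 \;\middle|\; g_k \in (i,j) \,\wedge\, g_{k+1} \notin (i,j) \right\}\right| .
\end{aligned}
```

Under this reading, inflow counts the moments at which a trajectory enters the square from outside, and outflow counts the moments at which it leaves.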

A flow tensor denotes the traffic observed in the grid region over the preceding time slots. For any given time interval, the cumulative inflow and outflow of all grid squares in the region are represented by a tensor whose first layer holds the inflow and whose second layer holds the outflow. The main goal is to predict the traffic flow tensor of the next time slot from the historical data up to the current time.
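The counting of inflow and outflow per grid square can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: the functions `count_flows` and `make_cell_of`, the uniform square cells, and the toy coordinates are all assumptions introduced here.

```python
from collections import defaultdict


def count_flows(trajectories, cell_of):
    """Count per-cell inflow and outflow for one time slot.

    trajectories: list of trajectories, each a list of (lon, lat) points
    cell_of: function mapping a point to its grid cell (row, col)
    """
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for traj in trajectories:
        cells = [cell_of(p) for p in traj]
        # Compare each point's cell with the cell of the next point.
        for prev, curr in zip(cells, cells[1:]):
            if prev != curr:
                outflow[prev] += 1  # the trajectory left `prev`
                inflow[curr] += 1   # and entered `curr`
    return inflow, outflow


def make_cell_of(lon_min, lat_min, cell_size):
    """Hypothetical helper: uniform square cells indexed from the south-west corner."""
    def cell_of(point):
        lon, lat = point
        row = int((lat - lat_min) // cell_size)
        col = int((lon - lon_min) // cell_size)
        return (row, col)
    return cell_of


cell_of = make_cell_of(lon_min=0.0, lat_min=0.0, cell_size=1.0)
trajs = [
    [(0.5, 0.5), (1.5, 0.5), (1.6, 0.6)],  # cell (0, 0) -> (0, 1), then stays
    [(1.5, 1.5), (1.5, 0.5)],              # cell (1, 1) -> (0, 1)
]
inflow, outflow = count_flows(trajs, cell_of)
```

Stacking the inflow grid and the outflow grid for one time slot gives the two-layer tensor described above.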

In addition to traffic flow information, the method integrates external factors such as holiday information, weather conditions, and temperature, which supply additional information for prediction.

(2) Data set division and normalization:

In this embodiment, the data set is divided into a training set, a validation set, and a test set in chronological order. The traffic flow of the last 10 days is used as the test set, and the remaining data are split into training and validation sets at a ratio of 8:2. The MinMax normalization method is applied to normalize the data set.
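The split and normalization can be sketched as follows. This is an illustration under stated assumptions: the helper names `chronological_split` and `MinMaxScaler` are introduced here, the target range [-1, 1] is taken from the experimental setup described later in this document, and fitting the scaler on the training set only is an assumption (the text does not say which subset supplies the minimum and maximum).

```python
def chronological_split(frames, slots_per_day, test_days=10, train_ratio=0.8):
    """Split time-ordered frames into train/validation/test without shuffling."""
    n_test = test_days * slots_per_day
    history, test = frames[:-n_test], frames[-n_test:]
    n_train = int(len(history) * train_ratio)
    return history[:n_train], history[n_train:], test


class MinMaxScaler:
    """Scale values to [-1, 1] using the minimum and maximum seen in fit()."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo  # assumes the fitted data are not constant
        return [2.0 * (v - self.lo) / span - 1.0 for v in values]

    def inverse_transform(self, scaled):
        span = self.hi - self.lo
        return [(s + 1.0) / 2.0 * span + self.lo for s in scaled]


# Toy series: 30 days with 2 slots per day, one scalar flow value per slot.
frames = [float(v) for v in range(60)]
train, val, test = chronological_split(frames, slots_per_day=2)
scaler = MinMaxScaler().fit(train)  # assumption: statistics from training data only
train_scaled = scaler.transform(train)
```

Predictions made in the scaled range would be passed through `inverse_transform` before error metrics are computed in the original units.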

(3) Residual convolutional neural networks (ResNet) are used in many deep learning tasks because they effectively avoid the vanishing gradient problem encountered in deep neural networks. This property is especially valuable in traffic flow prediction, because both current traffic conditions (low-level features) and historical trends (high-level features) matter for accurate prediction. As shown in FIG. 3, the input of the first residual connection layer is the output of the adaptive convolution layer. In the experiments, the model uses at most L = 12 residual connection layers.

To improve computational efficiency, the squeeze-and-excitation (SE) attention mechanism is used for the attention calculation. First, global spatial information is compressed by global average pooling. Then fully connected layers, ReLU, and Softmax operations generate normalized attention weights for convolution kernel and offset learning.

(Formula not reproduced in the source text.)
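The squeeze-and-excitation computation described above (global average pooling, then fully connected layers with ReLU and Softmax) can be sketched in plain Python. The layer sizes, the weight values, and the choice of three output weights are assumptions for illustration; the source does not give them.

```python
import math


def global_average_pool(feature_maps):
    """Squeeze: reduce each channel (a 2-D list) to its mean value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]


def linear(vec, weights, bias):
    """Fully connected layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def softmax(vec):
    m = max(vec)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def se_attention(feature_maps, w1, b1, w2, b2):
    """Excite: pooled vector -> FC -> ReLU -> FC -> Softmax."""
    squeezed = global_average_pool(feature_maps)
    hidden = relu(linear(squeezed, w1, b1))
    return softmax(linear(hidden, w2, b2))


# Two channels of a 2x2 feature map; weights are arbitrary illustrative values.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1, b1 = [[0.5, -0.5]], [0.0]               # 2 channels -> 1 hidden unit
w2, b2 = [[1.0], [-1.0], [0.0]], [0.0] * 3  # 1 hidden unit -> 3 attention weights
weights = se_attention(x, w1, b1, w2, b2)
```

Because of the Softmax, the resulting weights are non-negative and sum to one, which is what allows them to be used as mixing coefficients for parallel kernels.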

(4) Step 3, adaptive convolution modeling. Traditional convolution kernels can only aggregate information over a regular neighborhood and therefore have difficulty capturing spatial dependencies effectively. Inspired by deformable convolution in object detection, this embodiment adopts deformable convolution, which adjusts the positions of the convolution kernel to align better with the road network structure. To improve performance, the attention-guided deformable convolutional network V2 is used to learn the offsets of the deformable convolution, taking full advantage of the attention mechanism.

Traditional convolution samples the input feature map on a regular grid and then sums the sampled values weighted by the kernel. In the deformable convolution of this model, an offset is added to each position of the regular grid. The formula is as follows:

(Formula not reproduced in the source text.)
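A single output value of a deformable convolution with fractional offsets can be sketched as below. This is a simplified illustration, not the patented network: the offsets and modulation scalars are passed in as fixed numbers, whereas the description has them learned by a separate convolution layer, and the function names and the zero-padding rule are assumptions.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2-D feature map at fractional (y, x); outside the map counts as 0."""
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight of each neighbor falls off linearly with distance.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * fmap[yy][xx]
    return value


def deformable_conv_at(fmap, p, kernel, offsets, modulation):
    """Modulated deformable 3x3 convolution evaluated at one position p = (row, col)."""
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # regular grid
    out = 0.0
    for n, (dy, dx) in enumerate(grid):
        oy, ox = offsets[n]  # fractional offset for kernel position n
        sample = bilinear_sample(fmap, p[0] + dy + oy, p[1] + dx + ox)
        out += kernel[n] * sample * modulation[n]
    return out


fmap = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [1.0 / 9] * 9            # averaging kernel
zero_offsets = [(0.0, 0.0)] * 9
ones = [1.0] * 9
plain = deformable_conv_at(fmap, (1, 1), kernel, zero_offsets, ones)
shifted = deformable_conv_at(fmap, (1, 1), kernel, [(0.0, 0.5)] * 9, ones)
```

With zero offsets and unit modulation the function reduces to an ordinary 3x3 convolution; shifting every sampling point half a column to the right moves the average accordingly, which shows how offsets let the kernel follow structure that does not sit on the regular grid.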

To handle the spatiotemporal variation of a given location across different time frames, this embodiment draws on dynamic convolution to overcome the inherent limitations of traditional convolution. The approach uses an attention mechanism to weight multiple parallel convolution kernels according to the input, achieving effective information aggregation in intricate road networks.

Traditional convolution applies an activation function (such as Tanh) to a linear function of the input defined by a weight matrix and a bias vector. Dynamic convolution is introduced by aggregating multiple linear functions, as follows:

(Formula not reproduced in the source text.)

In the formula, each of the linear functions has its own attention weight, the k-th weight belonging to the k-th linear function. After attention is applied, these weights are not fixed but vary with each input. Compared with static convolution, dynamic convolution is therefore better at capturing diverse features.
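The aggregation of parallel kernels by attention can be sketched with a single linear unit. This is an illustration under assumptions: in the described model the attention weights come from the attention branch applied to the input, whereas here the attention logits are passed in directly, and all names and values are hypothetical.

```python
import math


def softmax(vec):
    m = max(vec)
    exps = [math.exp(v - m) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(kernels, biases, attention):
    """Blend K parallel kernels (flat weight lists) and K biases with attention weights."""
    size = len(kernels[0])
    weight = [sum(a * k[i] for a, k in zip(attention, kernels)) for i in range(size)]
    bias = sum(a * b for a, b in zip(attention, biases))
    return weight, bias


def dynamic_linear(x, kernels, biases, attention_logits):
    """Apply tanh to the linear function whose weight and bias are aggregated per input."""
    attention = softmax(attention_logits)
    weight, bias = aggregate(kernels, biases, attention)
    return math.tanh(sum(w * v for w, v in zip(weight, x)) + bias)


kernels = [[1.0, 0.0], [0.0, 1.0]]  # K = 2 illustrative kernels
biases = [0.0, 0.0]
x = [0.2, 0.8]
y_first = dynamic_linear(x, kernels, biases, [50.0, -50.0])   # attention close to (1, 0)
y_second = dynamic_linear(x, kernels, biases, [-50.0, 50.0])  # attention close to (0, 1)
```

The same input produces different outputs depending on which kernel the attention favors, which is the sense in which the effective kernel is no longer fixed.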

(5) Step 4 combines the spatiotemporal residual convolutional neural network with adaptive convolution for spatiotemporal dynamics modeling. This embodiment handles spatial dependencies with adaptive convolution (comprising attention-based deformable convolution and dynamic convolution) and handles temporal dependencies with a temporal attention mechanism. The adaptive convolution process is illustrated in FIG. 2. The input feature map is learned through a dynamic convolution kernel while position offsets are taken into account, which fuses dynamic convolution with attention-based deformable convolution. The formulation integrates the bias and weights obtained from dynamic convolution, establishing the adaptive convolution framework.

The temporal attention mechanism effectively filters recent traffic flow patterns across different time intervals, including daily and weekly cycles, so that the model can prioritize important periods such as peak hours and notable weekdays. The temporal attention is calculated as follows:

(Formula not reproduced in the source text.)

In the formula, two weight terms denote the weights of the time channels, two further terms are bias terms, and the remaining function is the activation function. Integrating the spatiotemporal dynamics modeling components above yields the DSTACN model. The entire network is trained by backpropagation through time to maximize the likelihood of generating accurate future predictions. Additional information such as continuous variables (temperature, wind speed) and one-hot-encoded categorical variables (holidays) is fed into the model through a fully connected layer as part of feature extraction.
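Because the temporal attention formula is not legible in this text, the following is only a sketch of one plausible form that is consistent with the description (two weight terms, two bias terms, one activation function). The two-layer structure, the use of tanh and Softmax, the pooled descriptors, and every name and value below are assumptions, not the patented formula.

```python
import math


def softmax(vec):
    m = max(vec)
    exps = [math.exp(v - m) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(descriptors, w1, b1, w2, b2):
    """Assumed two-layer attention over time channels: FC -> tanh -> FC -> Softmax."""
    hidden = [math.tanh(sum(w * z for w, z in zip(row, descriptors)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]
    return softmax(logits)


def fuse(branches, attention):
    """Weighted sum of the per-channel feature vectors."""
    size = len(branches[0])
    return [sum(a * br[i] for a, br in zip(attention, branches)) for i in range(size)]


# Three time channels: weekly, daily, recent (illustrative feature vectors).
branches = [[0.1, 0.2], [0.4, 0.4], [0.9, 0.7]]
descriptors = [sum(b) / len(b) for b in branches]  # one pooled value per channel
w1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
b2 = [0.0, 0.0, 0.0]
attention = temporal_attention(descriptors, w1, b1, w2, b2)
fused = fuse(branches, attention)
```

In this toy setting the channel with the strongest pooled response receives the largest weight, which mirrors the stated aim of prioritizing the most informative time periods.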

To verify the performance of the proposed model, this embodiment conducts experiments on three real-world traffic data sets. The data preprocessing is described first, followed by the selected baseline models.

This embodiment tests the model on real-world traffic flow data covering different travel modes, region sizes, and data volumes. Specifically, the Beijing taxi, New York City bicycle, and New York City taxi data sets are used to represent different travel modes. The Beijing taxi data set contains the trajectories of Beijing taxis over multiple time intervals together with meteorological data. Based on longitude and latitude, Beijing is divided into a grid of 32 by 32 squares. The New York City bicycle data set contains the 2014 trip records of the city's bicycle system. Each trip record includes the trip duration, station IDs, and start and end times. In addition, the 2014 trip records of New York City taxis are used. These records include the pick-up and drop-off dates, times, and locations.

The data set is divided into training, validation, and test sets. The last ten days are designated as the test set, and the remaining data are divided into training and validation sets at a ratio of 8:2. Min-max normalization is used to scale the output to [-1, 1]. The proposed model and the baselines are implemented in PyTorch and trained on an RTX 4090 GPU with 24 GB of GPU memory on a machine with 128 GB of RAM. The hyperparameters of the model are set as follows: a learning rate of 0.0005, 200 training epochs (with an early-stopping patience of 30 epochs), the L1 loss function, and automatic learning rate decay.
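The stopping rule implied by these settings can be sketched without any deep learning library. This is an illustration only: the `EarlyStopping` class, the `run` helper, and the replayed loss sequence are hypothetical, and the actual training loop, optimizer, and learning rate schedule are not specified beyond the values quoted above.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=30):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def l1_loss(pred, target):
    """Mean absolute error, the L1 loss named in the text."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Values quoted in the text; the dictionary layout itself is an assumption.
CONFIG = {"learning_rate": 0.0005, "max_epochs": 200, "patience": 30}


def run(val_losses, config=CONFIG):
    """Replay a sequence of validation losses; return the epoch at which training ends."""
    stopper = EarlyStopping(config["patience"])
    for epoch, loss in enumerate(val_losses[: config["max_epochs"]], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), config["max_epochs"])


# Loss improves for 5 epochs, then plateaus for the rest of the run.
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 100
stopped_at = run(losses)
```

In this replay the best loss occurs at epoch 5, so training ends at epoch 35, after 30 epochs without improvement and well before the 200-epoch limit.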

Next, to evaluate the effect of this embodiment, the following widely used prediction methods are adopted as baseline models:

1) HA: the historical average model, which uses the historical average as the predicted value for the corresponding period.

2) ARIMA: the autoregressive integrated moving average model, a widely adopted time series forecasting model that uses historical information to predict the future.

3) ConvLSTM: the convolutional LSTM network replaces the matrix multiplication in each gate of the LSTM cell with a convolution operation, which allows it to capture the underlying spatial features in multi-dimensional data.

4) ST-ResNet: this framework uses convolutional units and residual networks to model the dynamics of time-varying data effectively and to extract spatial dependencies at multiple scales.

5) STDN: it adopts a flow gating mechanism and a periodically shifted attention mechanism to address dynamic spatial dependencies and long-term periodic temporal shifting.

6) SA-ConvLSTM: it introduces a self-attention memory module to extract large-scale spatial dependency features.

7) STGSP: it uses global spatial semantic information and systematically extracts urban flow features through representation learning.

(1) Experimental results:

Table 1 compares the methods on the three data sets (TaxiBJ, BikeNYC, and TaxiNYC) using three metrics: RMSE, MAE, and MAPE. Lower values indicate higher prediction accuracy. Overall, the proposed DSTACN method outperforms all other methods on all three data sets, indicating the highest prediction accuracy and suggesting that DSTACN captures complex spatiotemporal dependencies more effectively than the other models.
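For reference, the three metrics can be computed as in the sketch below. The handling of zero-valued ground truth in MAPE (such cells are skipped) is an assumption; the source does not state how the metric treats them.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error in percent; zero-flow cells are skipped (assumption)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


y_true = [10.0, 20.0, 0.0, 40.0]
y_pred = [12.0, 18.0, 1.0, 40.0]
scores = {
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
}
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the true flow, so the three metrics can rank models differently.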

HA and ARIMA perform poorly on all data sets and metrics because these models are relatively simple and cannot capture complex spatiotemporal patterns well. More advanced methods such as ST-ResNet, STDN, SA-ConvLSTM, and STGSP perform better, which shows the importance of considering spatiotemporal dependencies in prediction. The analysis shows that the DSTACN model has an advantage in predicting traffic flow on the urban traffic data set: compared with STGSP, DSTACN improves RMSE by 11.98% and reduces MAE by 8.41%, demonstrating higher accuracy on all evaluation metrics. SA-ConvLSTM substantially reduces the MAE and MAPE on the data set. However, DSTACN leads with the lowest RMSE and MAPE, reducing RMSE by 8.6% relative to STGSP. On the New York City taxi data set, both STDN and STGSP perform well. However, DSTACN surpasses these models in overall effectiveness across all data sets and metrics. For example, DSTACN achieves an RMSE of 17.63 and an MAE of 10.26, compared with an RMSE of 18.76 and an MAE of 10.51 for STGSP, which are reductions of 6.02% and 2.38%, respectively.
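The last pair of percentages can be checked directly from the quoted error values. The helper name `relative_reduction` is introduced here for illustration.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0


# Error values quoted in the text: STGSP (baseline) versus DSTACN.
rmse_gain = relative_reduction(18.76, 17.63)
mae_gain = relative_reduction(10.51, 10.26)
print(f"RMSE reduction: {rmse_gain:.2f}%")
print(f"MAE reduction: {mae_gain:.2f}%")
```

Both results agree with the stated 6.02% and 2.38% when rounded to two decimal places.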

These results highlight the clear advantage of DSTACN over the baseline methods, which is further demonstrated by the visualizations in Figures 5 and 6. In addition, Figure 4 compares the predictions of the proposed method with the ground truth, showing the predicted and actual flow for node (8, 26) of the grid. The zoomed-in subplots show that DSTACN is more accurate in predicting both peak traffic flow and the overall trend. This visual comparison underlines the effectiveness of DSTACN in capturing the complex dynamics of traffic flow patterns.

(2) Ablation experiment:

The results of the DSTACN ablation study are summarized in Table 2. Different variants were tested by isolating specific submodules of the DSTACN architecture to determine their individual contributions. The RMSE of each variant was then evaluated on the TaxiBJ dataset.

The results show that every submodule of DSTACN plays an indispensable role, although the size of each contribution varies. The variant without dynamic convolution and deformable convolution degrades the most, with an RMSE increase of 15.84%.

Dynamic temporal attention compensates for the inability of convolutional neural networks to model temporal dependencies by pinpointing key time points across different time dimensions. Dynamic attention convolution and dynamic temporal attention together allow the model to focus adaptively on the relevant spatial and temporal features, strengthening its ability to capture complex spatiotemporal patterns in the data. Removing either component therefore weakens the model's ability to learn from the input data.

(3) Parameter and model structure experiments:

The analysis in Figure 7 shows how applying adaptive convolution at different model layers affects RMSE. Accuracy improves progressively as more adaptive convolution layers are incorporated into the architecture, and the best accuracy is obtained when adaptive convolution is deployed in all three layers. When only one layer can use adaptive convolution, placing it in the last convolutional layer gives the best performance.

As shown in Figure 8, the RMSE of the DSTACN model decreases as the number of dynamic convolution kernels increases, but it changes only slowly once the number of kernels exceeds 4. The reason is that, although a larger number of kernels gives the dynamic convolution stronger representational capacity, multiple convolution kernels and the attention mechanism must be trained simultaneously. This makes optimization considerably harder and increases the risk of overfitting.
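To make the role of the kernel count concrete, the sketch below shows one common formulation of dynamic convolution: K candidate kernels are mixed by input-dependent attention weights, and the mixed kernel is then applied to the input. This is an illustrative assumption, not the patent's exact layer. It is one-dimensional for brevity, and the attention branch (global average pooling followed by one hypothetical scalar weight per kernel, `score_weights`) is invented here for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_kernel(kernels, attention):
    """Attention-weighted sum of K kernels of equal length."""
    size = len(kernels[0])
    return [sum(a * k[i] for a, k in zip(attention, kernels)) for i in range(size)]


def conv1d_valid(signal, kernel):
    """Cross-correlation without padding, as used in deep learning layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def dynamic_conv1d(signal, kernels, score_weights):
    """Mix K kernels with input-dependent attention, then convolve.

    Assumption: the attention score of kernel k is score_weights[k] times
    the global average of the input; real layers use a small learned network.
    """
    pooled = sum(signal) / len(signal)
    attention = softmax([w * pooled for w in score_weights])
    mixed = aggregate_kernel(kernels, attention)
    return conv1d_valid(signal, mixed), attention


# Hypothetical example with K = 4 kernels of width 3.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernels = [
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
    [0.0, 1.0, 0.0],
    [-1.0, 2.0, -1.0],
]
score_weights = [0.1, -0.2, 0.3, 0.0]

output, attention = dynamic_conv1d(signal, kernels, score_weights)
print(output)
print(attention)
```

Under this formulation the number of kernel parameters grows linearly with K while each forward pass still applies a single mixed kernel, so adding kernels increases capacity but also the number of jointly trained weights, in line with the diminishing returns described above.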

The embodiments described above represent only specific implementations of the present invention. Although they are described in considerable detail, they shall not be construed as limiting the patent scope of the present invention. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention.