Disclosure of Invention
The invention provides a driving control method and device based on personalized federated contrastive learning, which enable personalized driving prediction for driving control.
To achieve the above object, the present invention provides a driving control method based on personalized federated contrastive learning, including:
A client obtains driving data and performs feature extraction on the driving data using a preset multi-modal processor to obtain multi-modal driving feature data, wherein the driving data comprise multi-view images, LiDAR data, and system information;
The client processes the multi-modal driving feature data using a preset contrastive learning algorithm to obtain multi-modal driving contrastive feature data;
The client trains a pre-constructed contrastive learning driving prediction model on the multi-modal driving contrastive feature data, and a target driving prediction model is obtained after training is completed;
A central server selects a plurality of clients to participate in training of the target driving prediction model, obtains the global neuron parameters of the target driving prediction models in these clients, and aggregates the global neuron parameters to obtain fusion parameters;
The central server distributes the fusion parameters to the plurality of clients; each client receives the fusion parameters and performs personalized parameter optimization on its own target driving prediction model using the fusion parameters to obtain a personalized target driving prediction model;
The client receives real-time driving data of the user and inputs the real-time driving data into the personalized target driving prediction model for driving prediction, obtaining a driving prediction result for the user.
Optionally, the feature extraction of the driving data using a preset multi-modal processor to obtain multi-modal driving feature data includes:
performing feature extraction on the multi-view images in the driving data using a temporal Q-Former processing algorithm to obtain multi-view image features;
extracting features from the LiDAR data in the driving data using an encoder, and performing feature conversion on the extracted features using a ternary processing algorithm to obtain LiDAR features;
performing feature extraction on the system information in the driving data using a Transformer encoder to obtain system information features;
and performing a feature fusion operation on the multi-view image features, the LiDAR features, and the system information features to obtain the multi-modal driving feature data.
Optionally, the feature extraction of the multi-view images in the driving data using a temporal Q-Former processing algorithm to obtain multi-view image features includes:
extracting all multi-view images between a current timestamp and a preset historical timestamp;
extracting features of the multi-view images at the preset historical timestamp using the temporal Q-Former processing algorithm to obtain multi-view image token embeddings at the preset historical timestamp;
and sequentially extracting multi-view image token embeddings from the preset historical timestamp to the current timestamp to obtain the multi-view image features.
Optionally, the extracting features from the LiDAR data in the driving data using an encoder and converting the extracted features using a ternary processing algorithm to obtain LiDAR features includes:
extracting features from the LiDAR data using an encoder to obtain initial LiDAR features;
and converting the initial LiDAR features, using the ternary processing algorithm, into features representing an obstacle state, an obstacle-free state, and an uncertain-obstacle state to obtain the LiDAR features.
Optionally, the performing a feature fusion operation on the multi-view image features, the LiDAR features, and the system information features to obtain the multi-modal driving feature data includes:
performing feature fusion on the LiDAR features, which after ternary feature conversion satisfy the requirement for fusion with the multi-view image features, and the multi-view image features to obtain fusion features;
performing feature alignment on the fusion features and the system information features, and constructing a feature contrast matrix;
and extracting features from the feature contrast matrix to obtain the multi-modal driving feature data.
Optionally, the inputting of the real-time driving data of the user into the personalized target driving prediction model for driving prediction to obtain a driving prediction result of the user includes:
acquiring personalized decision information from the real-time driving data of the user and performing feature coding on the personalized decision information to obtain personalized decision information feature matrices, wherein the personalized decision information includes system information, driving decision strategy information, and user driving prompt information;
coding the picture information and the LiDAR information in the real-time driving data of the user and fusing them to obtain real-time fused information features;
computing the matching degree between the real-time fused information features and each personalized decision information feature matrix, and extracting the personalized decision information corresponding to the feature matrix with the highest matching degree;
and extracting the corresponding driving decision strategy information from the personalized decision information with the highest matching degree to obtain the user driving prediction result.
Optionally, the performing personalized parameter optimization on the target driving prediction model in the client using the fusion parameters includes:
performing personalized parameter optimization on the client's own target driving prediction model using the following formula:
\( \hat{\theta} = \theta - \eta \nabla F(\theta) \)
where \( \hat{\theta} \) denotes the personalized parameters of the target driving prediction model, \( \theta \) denotes the original parameters of the target driving prediction model, \( \eta \) is the learning rate, and \( \nabla F(\theta) \) is the gradient of the target driving prediction model.
To solve the above problems, the invention also provides a driving control system based on personalized federated contrastive learning, which comprises a central server and a plurality of clients each connected to the central server.
Optionally, the client is disposed within the vehicle.
Optionally, the central server acquires the global neuron parameters of the target driving prediction models in the plurality of clients, aggregates the global neuron parameters to obtain fusion parameters, and distributes the fusion parameters to the plurality of clients; each client receives the fusion parameters and performs personalized parameter optimization on its target driving prediction model using the fusion parameters.
According to the invention, feature extraction is performed on the driving data using the preset multi-modal processor, so that the category features of the different kinds of data in the driving data can be extracted more comprehensively, ensuring the completeness of the extracted features. In addition, the multi-modal driving feature data are processed using the preset contrastive learning algorithm, which ensures that the subsequent contrastive learning driving prediction model makes the correct response under different conditions, thereby improving the robustness and adaptability of the system. Furthermore, the central server distributes the fusion parameters to the plurality of clients, and each client receives the fusion parameters and performs personalized parameter optimization on its target driving prediction model using them, so that the model better fits the driving habits and personalized requirements of the driver while preserving the privacy and security of the user.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a driving control method based on personalized federated contrastive learning. The execution subject of the method includes, but is not limited to, at least one of a server, a terminal, and the like that can be configured to execute the method provided by the embodiment of the application. In other words, the driving control method based on personalized federated contrastive learning may be executed by software or hardware installed in a terminal device or a server device, where the software may be a blockchain platform. The server side includes, but is not limited to, a single server, a server cluster, a cloud server, a cloud server cluster, and the like. The server may be an independent server, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown. In this embodiment, the driving control method based on personalized federated contrastive learning includes:
S1, a client acquires driving data and performs feature extraction on the driving data using a preset multi-modal processor to obtain multi-modal driving feature data, wherein the driving data comprise multi-view images, LiDAR data, and system information; the client processes the multi-modal driving feature data using a preset contrastive learning algorithm to obtain multi-modal driving contrastive feature data; and the client trains a pre-constructed contrastive learning driving prediction model on the multi-modal driving contrastive feature data to obtain a target driving prediction model after training is completed.
In the embodiment of the invention, the client refers to the device that stores driving data in federated learning.
In the embodiment of the invention, the driving data refer to data recorded while a user drives a vehicle on the road, including multi-view images, LiDAR (Light Detection and Ranging) data, and system information.
As an embodiment of the present invention, the feature extraction of the driving data using a preset multi-modal processor to obtain multi-modal driving feature data includes:
performing feature extraction on the multi-view images in the driving data using a temporal Q-Former processing algorithm to obtain multi-view image features;
extracting features from the LiDAR data in the driving data using an encoder, and performing feature conversion on the extracted features using a ternary processing algorithm to obtain LiDAR features;
performing feature extraction on the system information in the driving data using a Transformer encoder to obtain system information features;
and performing a feature fusion operation on the multi-view image features, the LiDAR features, and the system information features to obtain the multi-modal driving feature data.
In the embodiment of the invention, the temporal Q-Former processing algorithm is a lightweight Transformer that uses a set of learnable query vectors to extract visual features from a frozen vision model, achieving vision-language alignment between the vision model and a large language model.
In the embodiment of the present invention, the ternary processing algorithm is a technique that divides data lying in a continuous interval into three distinct sections so as to maximize the differentiation of the data among the three sections.
In an embodiment of the present invention, the multi-view images may be represented as \( I \in \mathbb{R}^{T \times N_I \times H \times W \times 3} \), where T denotes the temporal length, \( N_I \) the number of views, and H and W the height and width of the image; the LiDAR data are ternary, \( L \in \{Y, N, O\} \), where Y indicates an obstacle ahead, N indicates no obstacle ahead, and O indicates an obstacle ahead in an uncertain state; the system information M comprises \( N_M \) system information tags and is a summary of vehicle speed, driving decisions, and task definitions.
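The three data modalities above can be sketched as follows in NumPy. All concrete sizes (T, N_I, H, W, N_M) are illustrative assumptions, not values fixed by the text.

```python
import numpy as np

# Illustrative sizes (assumptions; the text does not fix them).
T, N_I, H, W = 4, 6, 224, 224   # time steps, camera views, image height/width
N_M = 8                          # number of system-information tags

# Multi-view image tensor I of shape (T, N_I, H, W, 3)
images = np.zeros((T, N_I, H, W, 3), dtype=np.float32)

# Ternary LiDAR state L in {Y, N, O}
lidar_state = "Y"                # obstacle ahead
assert lidar_state in {"Y", "N", "O"}

# System information: N_M tags summarizing speed, decisions, and tasks
system_info = ["speed:42", "decision:keep_lane"] + ["pad"] * (N_M - 2)

print(images.shape)              # (4, 6, 224, 224, 3)
print(len(system_info))          # 8
```

Keeping the LiDAR channel ternary rather than a dense point cloud is what later makes it cheap to fuse with the image features.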
Further, the feature extraction of the multi-view images in the driving data using a temporal Q-Former processing algorithm to obtain multi-view image features includes:
extracting all multi-view images between a current timestamp and a preset historical timestamp;
extracting features of the multi-view images at the preset historical timestamp using the temporal Q-Former processing algorithm to obtain multi-view image token embeddings at the preset historical timestamp;
and sequentially extracting multi-view image token embeddings from the preset historical timestamp to the current timestamp to obtain the multi-view image features.
Illustratively, the feature extraction of the multi-view images in the driving data using the temporal Q-Former processing algorithm to obtain multi-view image features includes the following steps:
Step 1: each view at timestamp −T is fed into a ViT and a Q-Former with \( N_Q \) randomly initialized queries to obtain the image token embedding \( E_{-T} \).
Step 2: using the image token embedding \( E_{-T} \) as the query of the Q-Former, step 1 is performed again to obtain the image token embedding \( E_{-T+1} \) of the next timestamp; the two steps are repeated until the image token embedding \( E_0 \) of the current timestamp is obtained, which collects all temporal information from −T to 0.
By processing the multi-view images from the preset historical timestamp −T to 0 (the current timestamp) with the temporal Q-Former processing algorithm, the embodiment of the invention avoids the linear growth, as the time length increases, of the resources required to process time-series data.
In the embodiment of the present invention, ViT is a Transformer-based model designed specifically for processing image data.
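The recurrence above can be sketched as follows: the output embedding of one Q-Former step becomes the query of the next, so the per-step cost stays constant regardless of history length. The cross-attention here is a deliberately toy stand-in (a softmax-weighted sum), and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

T, N_Q, D = 4, 32, 64            # history length, query tokens, embed dim (assumed)

def q_former_step(queries, frame_feats):
    """Toy stand-in for one Q-Former pass: cross-attention of the
    queries over one frame's patch features, as a softmax-weighted sum."""
    attn = queries @ frame_feats.T                   # (N_Q, N_patch) scores
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)          # row-wise softmax
    return attn @ frame_feats                        # (N_Q, D) new embedding

frames = rng.normal(size=(T, 196, D))                # per-step ViT patch features
queries = rng.normal(size=(N_Q, D))                  # randomly initialized queries

# Recurrence from timestamp -T up to 0: each output becomes the next query.
emb = queries
for t in range(T):
    emb = q_former_step(emb, frames[t])

print(emb.shape)    # (32, 64)
```

Because only a fixed (N_Q, D) embedding is carried forward, memory does not grow with T.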
Further, the extracting features from the LiDAR data in the driving data using an encoder and converting the extracted features using a ternary processing algorithm to obtain LiDAR features includes:
extracting features from the LiDAR data using an encoder to obtain initial LiDAR features;
and converting the initial LiDAR features, using the ternary processing algorithm, into features representing an obstacle state, an obstacle-free state, and an uncertain-obstacle state to obtain the LiDAR features.
For example, for LiDAR data, in order to reduce the processing load of the vehicle-mounted computer, a ternary LiDAR data processing method is adopted: features are extracted from the data collected by the LiDAR sensor using an encoder and then converted by a decoder into three-state data, denoted { Y, N, O }, where Y indicates an obstacle ahead, N indicates no obstacle ahead, and O indicates an obstacle ahead in an uncertain state; the LiDAR features obtained by the ternary processing are denoted \( L \in \{Y, N, O\} \).
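One minimal way to realize the ternary conversion is thresholding an encoder's obstacle score, as sketched below. The thresholds and the score-based formulation are illustrative assumptions; the patent only specifies the three output states.

```python
import numpy as np

def ternarize(scores, hi=0.7, lo=0.3):
    """Map encoder obstacle scores in [0, 1] to the three states:
    Y = confident obstacle, N = confident clear, O = uncertain.
    Thresholds hi/lo are illustrative, not from the text."""
    out = np.full(scores.shape, "O", dtype=object)   # default: uncertain
    out[scores >= hi] = "Y"
    out[scores <= lo] = "N"
    return out

scores = np.array([0.95, 0.1, 0.5, 0.8])
print(ternarize(scores))   # ['Y' 'N' 'O' 'Y']
```

Collapsing a dense LiDAR feature into one of three symbols is what reduces the on-board processing load the text mentions.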
In the embodiment of the present invention, the performing a feature fusion operation on the multi-view image features, the LiDAR features, and the system information features to obtain the multi-modal driving feature data includes:
performing feature fusion on the LiDAR features, which after ternary feature conversion satisfy the requirement for fusion with the multi-view image features, and the multi-view image features to obtain fusion features;
performing feature alignment on the fusion features and the system information features, and constructing a feature contrast matrix;
and extracting features from the feature contrast matrix to obtain the multi-modal driving feature data.
Illustratively, performing a feature fusion operation on the multi-view image features, the LiDAR features, and the system information features to obtain the multi-modal driving feature data includes:
Step 1: because of the heterogeneity of the LiDAR data and the multi-view images, the two must be fused into a jointly processable data form; thanks to the preceding ternary processing, the LiDAR features and the multi-view image features can be fused to obtain the fused data \( f \).
Step 2: the fused data \( f \) are aligned with the system information features \( m \), and a contrast matrix C is constructed in which the diagonal elements are matched positive samples and the off-diagonal elements are unmatched negative samples, so that with N positive samples there are \( N^2 - N \) negative samples; the positive samples can thus be pulled as close together as possible and the negative samples pushed as far apart as possible, and the InfoNCE function is used to compute the loss:
\( \mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{\exp(\mathrm{sim}(m_i, f_i^{+})/\tau)}{\sum_{j=1}^{N} \exp(\mathrm{sim}(m_i, f_j)/\tau)} \)
where \( m_i \) is the feature representation of the i-th system information, \( f_i^{+} \) is the fused data (positive sample) matched to it, the remaining \( f_j \) are negative samples, \( \mathrm{sim}(m_i, f_j) \) is the similarity measure between samples \( m_i \) and \( f_j \), and \( \tau \) is a temperature parameter used to control the shape of the loss function.
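The InfoNCE computation over the N×N contrast matrix can be sketched as below, assuming cosine similarity as the similarity measure (the text leaves sim(·,·) unspecified):

```python
import numpy as np

def info_nce(sys_feats, fused_feats, tau=0.07):
    """InfoNCE over an N x N similarity (contrast) matrix: diagonal
    entries are matched positive pairs, off-diagonals are negatives."""
    a = sys_feats / np.linalg.norm(sys_feats, axis=1, keepdims=True)
    b = fused_feats / np.linalg.norm(fused_feats, axis=1, keepdims=True)
    logits = (a @ b.T) / tau                     # contrast matrix C, scaled
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # -mean log p(positive)

rng = np.random.default_rng(1)
feats = rng.normal(size=(8, 16))
# Perfectly matched pairs give a near-minimal loss...
low = info_nce(feats, feats)
# ...while shuffled (mismatched) pairs give a clearly higher loss.
high = info_nce(feats, np.roll(feats, 1, axis=0))
print(low < high)   # True
```

The diagonal of the scaled similarity matrix is pulled up and the off-diagonal pushed down, which is exactly the "positives close, negatives apart" behavior the step describes.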
S2, a central server selects a plurality of clients to participate in training of the target driving prediction model, acquires the global neuron parameters of the target driving prediction models in these clients, and aggregates the global neuron parameters to obtain fusion parameters; the central server distributes the fusion parameters to the clients, and each client receives the fusion parameters and performs personalized parameter optimization on its own target driving prediction model using the fusion parameters to obtain a personalized target driving prediction model.
In the embodiment of the invention, the central server is a server responsible for coordinating each client to perform model training and updating.
As an embodiment of the present invention, the performing personalized parameter optimization on the target driving prediction model in the client using the fusion parameters includes:
performing personalized parameter optimization on the client's own target driving prediction model using the following formula:
\( \hat{\theta} = \theta - \eta \nabla F(\theta) \)
where \( \hat{\theta} \) denotes the personalized parameters of the target driving prediction model, \( \theta \) denotes the original parameters of the target driving prediction model, \( \eta \) is the learning rate, and \( \nabla F(\theta) \) is the gradient of the target driving prediction model.
Illustratively, the performing personalized parameter optimization on the target driving prediction model in the client using the fusion parameters includes the following steps:
In the t-th training round, the central server randomly selects a set of participating clients \( C_t \). For each client k in \( C_t \), the central server selects a fraction \( p_k \) of active neurons from the global model, where \( p_k \in (0, 1) \) is the proportion of active neurons; the value of \( p_k \) depends on the system capability of client k, a more powerful client k having a larger \( p_k \). The parameters \( A_k(w_t) \) of these neurons are sent to k, and client k updates its local model with the received \( A_k(w_t) \) to obtain an intermediate model \( \hat{v}_k^t \). Client k then applies stochastic gradient descent (SGD) to the model \( \hat{v}_k^t \) to obtain the new model:
\( v_k^{t+1} = \hat{v}_k^t - \eta \nabla_{A_k} F_k(\hat{v}_k^t) \)
where \( v_k^{t+1} \) denotes the personalized parameters of the target driving prediction model, \( \hat{v}_k^t \) denotes the original parameters of the target driving prediction model, \( \eta \) is the learning rate, and \( \nabla_{A_k} F_k \) is the gradient of \( F_k \) with respect to the active parameters, i.e., the gradient restricted to the neurons whose parameters are \( A_k(w_t) \).
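The partial-overwrite-then-SGD update can be sketched on a flat parameter vector as follows. The mask construction, learning rate, and p_k value are illustrative assumptions; only the two-phase update (receive active parameters, then take a local gradient step) comes from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

def personalize(local_w, global_w, active_mask, grad, eta=0.1):
    """Overwrite only the server-selected 'active' neurons with the
    global parameters A_k(w_t), then take one SGD step on local data."""
    w = local_w.copy()
    w[active_mask] = global_w[active_mask]   # receive A_k(w_t)
    return w - eta * grad                    # local SGD step

D = 10
local_w  = rng.normal(size=D)                # client's current model
global_w = rng.normal(size=D)                # server's global model
p_k = 0.5                                    # assumed active-neuron fraction
active = rng.random(D) < p_k                 # server-chosen active set
grad = rng.normal(size=D)                    # stand-in local gradient

new_w = personalize(local_w, global_w, active, grad)
print(new_w.shape)   # (10,)
```

Neurons outside the active set keep their local values, which is what preserves per-client personalization while still sharing knowledge through the active subset.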
S3, the client receives real-time driving data of the user, and inputs the real-time driving data of the user into the personalized target driving prediction model to conduct driving prediction, so that a user driving prediction result is obtained.
In the embodiment of the invention, the user real-time driving data refers to data which is acquired in real time and records the driving behavior of the user.
As an embodiment of the present invention, the inputting of the real-time driving data of the user into the personalized target driving prediction model for driving prediction to obtain a driving prediction result of the user includes:
acquiring personalized decision information from the real-time driving data of the user and performing feature coding on the personalized decision information to obtain personalized decision information feature matrices, wherein the personalized decision information includes system information, driving decision strategy information, and user driving prompt information;
coding the picture information and the LiDAR information in the real-time driving data of the user and fusing them to obtain real-time fused information features;
computing the matching degree between the real-time fused information features and each personalized decision information feature matrix, and extracting the personalized decision information corresponding to the feature matrix with the highest matching degree;
and extracting the corresponding driving decision strategy information from the personalized decision information with the highest matching degree to obtain the user driving prediction result.
In the embodiment of the invention, the user driving prompt information can include user requirements, driving road type information and the like.
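The matching-degree retrieval can be sketched as a nearest-neighbor lookup. Cosine similarity as the matching degree, and the tiny 2-D feature vectors, are illustrative assumptions.

```python
import numpy as np

def best_match(realtime_feat, decision_matrix):
    """Return the index of the personalized-decision row most similar
    to the fused real-time feature (cosine similarity as matching degree)."""
    a = realtime_feat / np.linalg.norm(realtime_feat)
    b = decision_matrix / np.linalg.norm(decision_matrix, axis=1, keepdims=True)
    sims = b @ a                        # one matching degree per decision row
    return int(np.argmax(sims)), sims

# Three encoded personalized-decision entries (illustrative features).
decisions = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
realtime  = np.array([0.1, 0.9])        # fused real-time feature
idx, sims = best_match(realtime, decisions)
print(idx)   # 1
```

The driving decision strategy stored with the winning row (index 1 here) would then be returned as the user driving prediction result.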
Illustratively, the inputting of the real-time driving data of the user into the personalized target driving prediction model for driving prediction includes the following steps:
To make the model better fit the driver's habits when making specific decisions, a personalized momentum encoder is adopted: after the pre-constructed contrastive learning driving prediction model has been trained on the multi-modal driving contrastive feature data, its parameters are copied; the text encoder in the contrastive learning driving prediction model is then updated with the local data to obtain another set of parameters; and the momentum encoder combines the two sets of parameters to update the text encoder in the prediction model, yielding the personalized encoder:
\( \hat{g}_t = m\,\hat{g}_{t-1} + (1 - m)\,g_t \)
where \( g_t \) is the gradient computed by the momentum encoder at time t using the local data, \( \hat{g}_t \) is the final update gradient of the momentum encoder at time t, and m is the momentum hyperparameter.
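From the symbols described (local gradient g_t, final update gradient, momentum hyperparameter m), the update is presumably the standard momentum rule \( \hat{g}_t = m\,\hat{g}_{t-1} + (1-m)\,g_t \); the sketch below assumes that form.

```python
import numpy as np

def momentum_update(g_prev, g_local, m=0.99):
    """Assumed momentum rule: blend the previous smoothed gradient with
    the fresh local gradient, so the encoder drifts slowly toward the
    driver's local data. m = 0.99 is an illustrative value."""
    return m * g_prev + (1 - m) * g_local

g_hat = np.zeros(4)                       # smoothed gradient state
for _ in range(3):                        # three local steps, constant g_t = 1
    g_hat = momentum_update(g_hat, np.ones(4))
print(g_hat[0])   # 1 - 0.99**3, about 0.0297
```

A large m keeps the personalized encoder close to its federated initialization while still adapting, which matches the stated goal of fitting driver habits without discarding the shared model.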
As an embodiment of the invention, a driving control system based on the driving control method of personalized federated contrastive learning comprises a central server and a plurality of clients each connected to the central server.
Further, the client is disposed in the vehicle.
Further, the central server acquires the global neuron parameters of the target driving prediction models in the clients, aggregates the global neuron parameters to obtain fusion parameters, and distributes the fusion parameters to the clients; each client receives the fusion parameters and performs personalized parameter optimization on its target driving prediction model using the fusion parameters.
According to the invention, feature extraction is performed on the driving data using the preset multi-modal processor, so that the category features of the different kinds of data in the driving data can be extracted more comprehensively, ensuring the completeness of the extracted features. In addition, the multi-modal driving feature data are processed using the preset contrastive learning algorithm, which ensures that the subsequent contrastive learning driving prediction model makes the correct response under different conditions, thereby improving the robustness and adaptability of the system. Furthermore, the central server distributes the fusion parameters to the plurality of clients, and each client receives the fusion parameters and performs personalized parameter optimization on its target driving prediction model using them, so that the model better fits the driving habits and personalized requirements of the driver while preserving the privacy and security of the user.
Referring to fig. 1, a model training schematic diagram of the driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown.
Referring to fig. 2, a schematic diagram of the model decision flow of the driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown.
Referring to fig. 4, a technical detail diagram of the multi-information fusion module of the decision model of the driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown.
Referring to fig. 5, a framework flow chart of the personalized federated learning of the driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown.
Referring to fig. 6, a training flowchart of the personalized encoder of the driving control method based on personalized federated contrastive learning according to an embodiment of the present invention is shown.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by a single unit or means through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.