Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon" or "when" or "in response to determining", depending on the context.
In order to make the vehicle lane-change turn signal identification method provided by the present disclosure clearer, the implementation process of the scheme provided by the present disclosure is described in detail below with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for identifying a lane-changing turn signal of a vehicle according to an embodiment of the present disclosure. As shown in fig. 1, the process includes:
step 101, acquiring a video stream to be detected; the video stream to be detected comprises at least two lanes.
In the embodiments of the present disclosure, a video stream on which vehicle lane-change turn signal identification needs to be performed may be referred to as a video stream to be detected. The picture of the video stream to be detected includes at least two lanes, and the vehicles in the video stream picture can change lanes between different lanes.
The specific manner of obtaining the video stream to be detected may include multiple manners, and the embodiments of the present disclosure are not limited thereto. For example, taking the identification of continuous lane changes of vehicles on an expressway as an example, in this step the video acquisition devices already installed on the expressway may be reused to obtain the video stream to be detected. In this way, the original video acquisition equipment can be reused and the hardware cost is reduced.
And 102, determining the vehicle in the video stream to be detected and the lane where the vehicle is located based on vehicle detection and tracking detection of the video stream to be detected.
The step can detect the video stream to be detected, and determine the vehicles and the lanes where the vehicles are located in the video stream to be detected. The detection of the video stream to be detected may include vehicle detection and tracking detection, and a plurality of vehicle detection frames may be obtained correspondingly. Through vehicle detection on the video stream to be detected, vehicles included in the video stream to be detected can be identified. It is to be understood that the specific manner of performing vehicle detection on the video stream to be detected is not limited by the embodiments of the present disclosure. For example, the vehicle detection of the vehicle in the video stream to be detected can be realized by using a network model which can perform vehicle detection in the related art. Or, according to the collected sample data, a vehicle detection network meeting the detection requirements is obtained based on deep neural network model training, and vehicle detection of the vehicle in the video stream to be detected is realized by utilizing the vehicle detection network obtained through training.
After the video stream to be detected is subjected to vehicle detection, different image frames included in the video stream to be detected may include a plurality of different vehicles. However, the same vehicle is not yet associated across different image frames, i.e., the same vehicle appearing in different image frames cannot be identified as one vehicle. Therefore, on the basis of carrying out vehicle detection on the video stream to be detected, this step further includes carrying out tracking detection on the video stream to be detected. The same vehicles appearing in the video stream to be detected can be associated through tracking detection, and vehicle tracking sequence images of each vehicle can be obtained by combining the vehicle detection frames obtained before. For example, the same vehicle included in different image frames may be identified, i.e., the same vehicle in the video stream to be detected may be identified. It can be understood that the specific manner of performing tracking detection on the video stream to be detected may include various ways, and the embodiments of the present disclosure are not limited thereto. Similar to the vehicle detection on the video stream to be detected, the tracking detection may be realized based on a tracking algorithm in the related art; alternatively, a tracking detection model meeting the requirements may be obtained based on deep neural network training, so that the tracking detection of the video stream to be detected can be realized by utilizing the tracking detection model.
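As an illustration only, the following Python sketch shows one simple way (not prescribed by the present disclosure) of associating per-frame vehicle detections into per-vehicle tracking sequences with a greedy IoU tracker; `detect_vehicles` is a hypothetical stand-in for any vehicle detection network.

```python
# A minimal sketch: associate per-frame vehicle detection boxes into
# per-vehicle tracking sequences using a greedy IoU match against the
# most recent box of each track. Any tracking algorithm could be used.

def iou(a, b):
    # a, b: (x1, y1, x2, y2) detection boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def track_vehicles(frames, detect_vehicles, iou_thresh=0.5):
    tracks = {}          # track_id -> list of (frame_idx, box)
    last_box = {}        # track_id -> most recent box of that track
    next_id = 0
    for idx, frame in enumerate(frames):
        for box in detect_vehicles(frame):
            # greedily match the new box to the best overlapping track
            best_id, best_iou = None, iou_thresh
            for tid, prev in last_box.items():
                score = iou(box, prev)
                if score > best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:          # no match: start a new track
                best_id, next_id = next_id, next_id + 1
                tracks[best_id] = []
            tracks[best_id].append((idx, box))
            last_box[best_id] = box
    return tracks
```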
On the basis of vehicle detection and tracking detection of the video stream to be detected, the lane where the vehicle is located can be further determined after the vehicle in the video stream to be detected is determined. The detailed process of determining the lane where the vehicle is located according to the video stream to be detected will be described in detail later, and will not be described herein again.
And 103, determining the time when the lane where the vehicle is located is changed from the original driving lane to the adjacent lane as the lane changing time of the vehicle.
After determining the vehicle in the video stream to be detected and the lane where the vehicle is located, the step may further determine a time when the vehicle changes from the original driving lane where the vehicle is originally located to the adjacent lane, as a lane change time of the vehicle. The original driving lane is the lane where the vehicle is located before lane changing. The adjacent lane is the lane where the vehicle changes lane from the original driving lane. The original driving lane and the adjacent lane are two adjacent and different lanes.
Fig. 2 shows a schematic diagram of a lane change during the driving of a vehicle. Fig. 2 includes 3 lanes: L1, L2 and L3. The vehicle is changing from lane L1 to lane L2 at the Car1 position, and from lane L2 to lane L3 at the Car2 position, so as to effect a lane change from lane L1 to lane L3. Taking fig. 2 as an example, in this step, lane L1 may be used as the original driving lane, lane L2 may be used as the adjacent lane, and the time when the vehicle enters lane L2 from lane L1 may be used as the lane change time. Alternatively, in this step, the lane L2 may be the original driving lane, the lane L3 may be the adjacent lane, and the time when the vehicle enters the lane L3 from the lane L2 may be the lane change time.
In some optional embodiments, the lane change time of the vehicle may be determined according to the time corresponding to the image frame in the video stream to be detected, in combination with the change of the lane where the vehicle is located in the image frame. Specifically, the image frame in the video stream to be tested, in which the lane where the vehicle is located is changed from the original driving lane to the adjacent lane, may be determined as the image frame corresponding to the lane change of the vehicle from the original driving lane to the adjacent lane, and the time corresponding to the image frame may be used as the lane change time of the vehicle.
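As a minimal illustration, assuming the lane of a tracked vehicle has already been determined for each frame, the lane change time can be found by scanning for the first frame whose lane label differs from that of the previous frame:

```python
# Sketch of step 103: find the frame where the lane label of a tracked
# vehicle changes, and take that frame's timestamp as the lane change time.

def find_lane_change(frame_times, lane_per_frame):
    """frame_times[i]: timestamp of frame i; lane_per_frame[i]: its lane id (or None)."""
    for i in range(1, len(lane_per_frame)):
        prev_lane, cur_lane = lane_per_frame[i - 1], lane_per_frame[i]
        if prev_lane is not None and cur_lane is not None and cur_lane != prev_lane:
            # original driving lane, adjacent lane, lane change time
            return prev_lane, cur_lane, frame_times[i]
    return None
```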
And 104, detecting the steering lamp of the vehicle in the video stream to be detected with preset time length before and after the lane changing time.
In the embodiments of the disclosure, a preset time length may be set based on the lane change time, so as to perform turn signal detection on the vehicle in the video stream to be detected within the preset time length. The specific setting mode of the preset time length is not limited in the embodiments of the present disclosure. For example, a time length of 2 seconds may be taken both before and after the lane change time, so that a total time length of 4 seconds is used as the preset time length. In this step, turn signal detection may then be performed on the vehicle in the portion of the video stream to be detected corresponding to this 4-second period containing the lane change time.
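A short sketch of selecting the image frames that fall within the preset time window around the lane change time (the 2-second values follow the example above and are not mandatory):

```python
# Select the frames used for turn signal detection: a window of preset
# duration around the lane change time (2 s before + 2 s after = 4 s total).

def frames_in_window(frame_times, frames, change_time, before=2.0, after=2.0):
    return [f for t, f in zip(frame_times, frames)
            if change_time - before <= t <= change_time + after]
```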
The specific manner of identifying the turn signal of the vehicle in the video stream to be detected is not limited in the embodiments of the present disclosure. For example, the existing detection model in the related art that can perform turn signal recognition on the vehicle in the video stream can be used for detection. Or training by utilizing a learnable deep neural network based on the training sample to obtain a satisfactory steering lamp detection network, so as to detect the steering lamp of the vehicle in the video stream to be detected by utilizing the steering lamp detection network.
And 105, under the condition that the turn lights of the vehicle are not turned on within a preset time length before and after the lane changing time, determining that the turn lights of the vehicle are not turned on during the lane changing.
After the turn signal detection is performed on the vehicle in the video stream to be detected within the preset time length, if the turn signals of the vehicle are not turned on within the preset time length, it can be determined that the vehicle did not turn on the turn signal during the lane change. That is, the vehicle did not turn on the turn signal during the lane change as required by the driving norms.
At this time, a warning message may be issued in an appropriate form to indicate that the vehicle has changed lanes without turning on the turn signal. In one possible implementation, an alert may be initiated and/or image frames of the vehicle changing lanes without a turn signal may be saved. The specific manner of initiating the alarm may include multiple implementation manners, and the embodiments of the present disclosure are not limited thereto. For example, an alarm prompt message may be sent to a detection system of a traffic detection department to remind relevant staff that a vehicle changing lanes without turning on its turn signal has been detected. As another example, an alert may be initiated directly to equipment used by the driver to remind the driver to drive cautiously; for example, the warning message may be sent directly to a mobile phone used by the driver by short message or other means. In addition, after it is detected that the vehicle changed lanes without turning on the turn signal, the related image frames or video stream of the vehicle changing lanes without the turn signal may be further stored as evidence for subsequent processing.
In the embodiment of the disclosure, after the lane changing time of the vehicle is determined, the turn signal lamp detection may be performed on the vehicle in the video stream for a preset time duration in the lane changing process of the vehicle, and it is determined that the turn signal lamp is not turned on when the turn signal lamp of the vehicle is not turned on within the preset time duration. By the method, whether the vehicle turns the steering lamp in the lane changing process can be further detected after the vehicle is determined to change the lane, and warning information can be sent out in time to prompt under the condition that the steering lamp is not turned in the lane changing process of the vehicle.
On the other hand, the process of identifying the vehicle lane change without turning on the turn signal lamp and the process of identifying the vehicle lane change are two completely decoupled processes. Namely, the process of identifying the lane change of the vehicle can be optimized or improved independently, and the process of identifying the lane change steering lamp of the vehicle can be optimized or improved independently, so that the two processes can be optimized respectively and conveniently.
In some alternative embodiments, the specific implementation of step 102 may include: obtaining a vehicle tracking sequence image based on vehicle detection and tracking detection of the video stream to be detected; and determining the lane where the vehicle in the vehicle tracking sequence image is located.
The video stream to be detected is composed of a plurality of continuous images, so the process of detecting the video stream to be detected actually is the process of respectively detecting the plurality of images included in the video stream to be detected and determining the lane where the vehicle is located in the plurality of images.
In the above embodiment, the vehicle tracking sequence image may be obtained based on vehicle detection and tracking detection of the video stream to be detected. The vehicle tracking sequence image comprises a plurality of vehicle foreground maps of the same vehicle, and the plurality of vehicle foreground maps are sorted in time order. Only one vehicle is included in each vehicle foreground map. In one possible implementation, the image may be cropped to a vehicle foreground map of a preset size based on the vehicle detection frame of the vehicle detected from the image.
After the vehicle tracking sequence image is obtained, the vehicle tracking sequence image can be directly detected to determine the lane where the vehicle is located. When the turn signal detection is carried out on the vehicle in the video stream to be detected with the preset time length before and after the lane change time, the turn signal detection can be directly carried out on the vehicle tracking sequence images with the preset time length before and after the lane change time respectively.
In some optional embodiments, the turn signal detection of the vehicle tracking sequence image comprises: inputting the vehicle tracking sequence image into a pre-trained detection model comprising at least one detection branch; and determining the turn-on state of the steering lamp based on the probability of turning on and/or off of the steering lamp output by the detection model.
In the above embodiment, the training sample may be used to train in advance to obtain a detection model meeting the requirement, so as to detect the vehicle tracking sequence image by using the detection model, determine the probability that the turn light of the vehicle in the vehicle tracking sequence image is on, or determine the probability that the turn light of the vehicle in the vehicle tracking sequence image is off, thereby determining the turn-on state of the turn light of the vehicle according to the determined probability value.
Also, one or more detection branches may be included in the detection model. Any detection branch can detect and determine the probability that the turn lights of the vehicles in the vehicle tracking sequence images are on or determine the probability that the turn lights of the vehicles in the vehicle tracking sequence images are off according to the input vehicle tracking sequence images.
In some optional embodiments, in the case that the detection model includes a plurality of detection branches, the step of performing turn signal detection on the vehicle tracking sequence image includes: inputting the vehicle tracking sequence images into a plurality of detection branches of the detection model respectively; and fusing the turn light on and/or off probabilities output by the detection branches to obtain the turn light on state of the vehicle tracking sequence image.
Under the condition that the detection model comprises a plurality of detection branches, the vehicle tracking sequence images needing to be detected can be respectively input into the plurality of detection branches, and the probability values of turning on or turning off of the steering lamps of the corresponding vehicles are respectively obtained by different detection branches according to the input vehicle tracking sequence images. Because the detection model comprises a plurality of detection branches, the same vehicle in the same vehicle tracking sequence image can obtain a plurality of corresponding detection results, namely the probability value of whether the corresponding steering lamps are on or off.
Further, in the embodiment of the present disclosure, multiple probability values obtained by different detection branches may be fused to obtain a final probability value of whether the turn signal of the vehicle is on or off, so as to determine the turn-on state of the turn signal of the vehicle according to the final probability value. The specific manner of fusing the probability values is not limited in the embodiments of the present disclosure. For example, a plurality of probability values of turning on of the turn signal obtained by the plurality of detection branches may be averaged to obtain a final probability value of turning on of the turn signal corresponding to the vehicle, and if the final probability value is greater than a preset probability threshold, it may be determined that the turn signal of the vehicle is in an on state.
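For illustration, the following sketch averages the turn-on probabilities output by several detection branches and compares the result with a preset threshold; the threshold value 0.5 is an assumption for this example only.

```python
# Fuse the turn-signal-on probabilities of several detection branches by
# averaging, then compare with a preset probability threshold.

import numpy as np

def fuse_branch_outputs(on_probs, threshold=0.5):
    """on_probs: list of turn-signal-on probabilities, one per detection branch."""
    final_prob = float(np.mean(on_probs))
    return final_prob > threshold, final_prob

# e.g. fuse_branch_outputs([0.72, 0.61]) -> (True, 0.665)
```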
In some alternative embodiments, the detection of the turn signal for the vehicle tracking sequence image, as shown in fig. 3, may include the following steps:
step 301, inputting the vehicle tracking sequence images into the detection model comprising at least one detection branch, wherein each branch comprises a self-attention module.
In the embodiments of the disclosure, a self-attention mechanism can be fused into the detection branches of the detection model to enhance the image features of the turn signal region in the vehicle tracking sequence image. In one possible implementation, a self-attention module may be added to only some of the detection branches included in the detection model; in other implementations, a self-attention module may be added to each detection branch of the detection model, which is not limited herein.
And step 302, updating image characteristics in the vehicle tracking sequence image based on the weight generated by the self-attention module, wherein in the vehicle tracking sequence image, the weight of the turn light region is different from the weights of other regions.
This step may generate weights corresponding to the feature map based on the self-attention module included in the detection branch to update the image features of the vehicle tracking sequence images. The weight of the turn light region in the vehicle tracking sequence image can be distinguished from the weights of other regions based on the self-attention module, so as to enhance the image characteristics of the turn light region in the vehicle tracking sequence image. For example, the turn signal region in the vehicle tracking sequence image may be weighted more heavily than other regions based on a self-attention mechanism. Alternatively, the turn signal region in the vehicle tracking sequence image may be weighted less than other regions based on a self-attention mechanism.
Step 303, determining the turn-on state of the turn signal based on the image characteristics of the vehicle tracking sequence image.
The image features obtained after the vehicle tracking sequence image is updated based on the self-attention mechanism can more obviously highlight the turn signal region of the vehicle, so that the detection model can pay more attention to the turn signal region in the detection process of the vehicle tracking sequence image, and the turn-on state of the turn signal of the vehicle in the vehicle tracking sequence image can be detected and determined more accurately.
The following takes a detection model including two detection branches, namely a first detection model and a second detection model, as an example to illustrate the process of performing turn signal detection on the vehicle tracking sequence image. It is understood that the detection model in the embodiments of the present disclosure is not limited to including two detection branches, and the following description is only exemplary.
The process of detecting the turn signal for the vehicle tracking sequence image may include: inputting the vehicle tracking sequence image into a first detection model trained in advance, and outputting a first steering lamp detection result of the vehicle tracking sequence image by the first detection model; inputting the vehicle tracking sequence image into a pre-trained second detection model, and outputting a second steering lamp detection result of the vehicle tracking sequence image by the second detection model; and fusing the first steering lamp detection result and the second steering lamp detection result to obtain a final steering lamp detection result of the vehicle tracking sequence image.
In the embodiment of the present disclosure, the training data may be used to perform separate training in advance to obtain two detection models: a first detection model and a second detection model. The first detection model or the second detection model can detect whether a turn signal of the vehicle in the vehicle tracking sequence image is turned on or not according to the input vehicle tracking sequence image to obtain a first turn signal detection result or a second turn signal detection result.
The specific presentation form of the first turn signal detection result or the second turn signal detection result is not limited by the embodiments of the present disclosure. For example, the first detection model may output a probability value that a turn signal of the vehicle is turned on in the vehicle tracking sequence image as the first turn signal detection result. For example, the second detection model may output, as the second turn signal detection result, a probability value that a turn signal of the vehicle is not turned on in the vehicle tracking sequence image.
In one possible implementation, the first detection model may obtain the first detection result of the turn signal lamp, which includes: a probability value of turn-on of the turn lights and/or a probability value of turn-off of the turn lights. Similarly, the second detection result obtained by the second detection model may also include: a probability value of turn-on of the turn lights and/or a probability value of turn-off of the turn lights.
In some optional embodiments, before the turn signal detection is performed on the vehicle tracking sequence image, the method further comprises: taking a first proportion of the model parameters of a basic detection model to obtain the first detection model; and taking a second proportion of the model parameters of the basic detection model to obtain the second detection model.
The basic detection model may be any network model that can be used for turn signal detection on vehicle tracking sequence images. For example, the ResNet18 model may be used as the basic detection model. In the above embodiment, a first proportion may be taken of the model parameters of the ResNet18 model to obtain the first detection model, and a second proportion may be taken of the model parameters of the ResNet18 model to obtain the second detection model. Optionally, the sum of the first proportion and the second proportion is smaller than 1, so that the number of parameters of the two detection models together is smaller than that of the complete ResNet18 model, which can reduce the amount of calculation and improve the detection efficiency.
For example, the numbers of channels in the ResNet18 model may be scaled by 1/2 to obtain the first detection model, and scaled by 1/4 to obtain the second detection model. After the first detection model and the second detection model are obtained from the ResNet18 model, the two detection models still need to be trained separately with training data, so as to obtain two detection models that can be used for turn signal detection on vehicle tracking sequence images.
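As a purely illustrative sketch of the idea of taking proportions of a basic model, the helper below builds two lightweight classification backbones whose channel widths are 1/2 and 1/4 of a base width; a toy convolutional stack stands in for ResNet18 here, and with a real ResNet18 the same channel scaling would be applied to its residual stages.

```python
# Toy sketch (not ResNet18 itself): derive two detection branches by scaling
# the channel numbers of a base backbone by a first and a second proportion.

import torch.nn as nn

def make_branch(width_ratio, base_channels=(64, 128, 256, 512), num_classes=2):
    chans = [max(1, int(c * width_ratio)) for c in base_channels]
    layers, in_c = [], 3
    for out_c in chans:
        layers += [nn.Conv2d(in_c, out_c, 3, stride=2, padding=1),
                   nn.BatchNorm2d(out_c),
                   nn.ReLU(inplace=True)]
        in_c = out_c
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_c, num_classes)]
    return nn.Sequential(*layers)

first_detection_model = make_branch(1 / 2)   # first proportion
second_detection_model = make_branch(1 / 4)  # second proportion
```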
In some optional embodiments, a self-attention mechanism may be fused in the first detection model and the second detection model, respectively, to emphasize a turn light region of a vehicle in the vehicle tracking sequence images. In one possible implementation manner, after the inputting the vehicle tracking sequence image into the first detection model trained in advance, the method further includes: updating the weight value of the corresponding turn signal region in the feature map of the first detection model based on a self-attention mechanism; and/or after the inputting of the vehicle tracking sequence images into a pre-trained second detection model, further comprising: and updating the weight value of the corresponding turn light region in the feature map of the second detection model based on the self-attention mechanism.
For example, as shown in fig. 4, a self-attention mechanism may be merged in the first detection model and the second detection model, respectively, and a weight value of a turn signal region of a vehicle in the feature map is obtained through maximum pooling and convolution, so as to enhance the turn signal region of the vehicle in the vehicle tracking sequence image. Therefore, the first detection model or the second detection model can more accurately detect the information of the steering lamp of the vehicle in the vehicle tracking sequence image and more accurately detect whether the steering lamp of the vehicle is turned on or not. In the method, the robustness to external conditions such as illumination is improved, and the algorithm effect under severe conditions is optimized.
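The following sketch shows one plausible form of such a module, assuming channel-wise max pooling followed by a convolution and a sigmoid to produce a spatial weight map that re-weights the feature map; the exact structure used in the disclosure may differ.

```python
# Illustrative spatial self-attention: max pooling over channels + convolution
# yields a per-pixel weight map, so the turn signal region can receive a
# weight different from other regions of the feature map.

import torch
import torch.nn as nn

class SpatialSelfAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, feat):                                # feat: (N, C, H, W)
        pooled, _ = torch.max(feat, dim=1, keepdim=True)    # max pooling over channels
        weights = torch.sigmoid(self.conv(pooled))          # (N, 1, H, W) weight map
        return feat * weights                               # re-weighted features
```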
After the first turn light detection result and the second turn light detection result are obtained, the embodiment of the disclosure can fuse the two turn light detection results to obtain a final turn light detection result of the vehicle. For example, the average probability value may be obtained by averaging two probability values of turning on the turn signal, so as to determine the turn signal detection result of the vehicle according to the average probability value, and determine whether the turn signal of the vehicle is turned on.
Since the first turn signal detection result may include a probability value that the turn signal is on and a probability value that the turn signal is off, and the second turn signal detection result may also include both, the probability values of the turn signal being on and the probability values of the turn signal being off can be fused separately in the fusion process.
For example, the probability value of turning on the turn signal included in the first turn signal detection result and the probability value of turning on the turn signal included in the second turn signal detection result may be averaged to obtain a final probability value of turning on the turn signal after fusion; and averaging the probability value of turn-off of the turn lights included in the first turn light detection result and the probability value of turn-off of the turn lights included in the second turn light detection result to obtain the final probability value of turn-off of the turn lights after fusion. Therefore, the detection result of the steering lamp of the vehicle can be determined according to the final probability values of the turn-on and turn-off of the steering lamp obtained after fusion, and whether the steering lamp of the vehicle is turned on or not can be determined.
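A minimal sketch of this fusion, assuming each detection result is a pair of on/off probabilities and the state with the larger fused probability is taken as the final result (the decision rule is an assumption for illustration):

```python
# Fuse the first and second turn signal detection results by averaging the
# "on" probabilities and the "off" probabilities separately.

def fuse_two_results(first, second):
    """first/second: dicts like {"on": 0.8, "off": 0.2} from the two detection models."""
    fused_on = (first["on"] + second["on"]) / 2
    fused_off = (first["off"] + second["off"]) / 2
    return {"on": fused_on, "off": fused_off, "turn_signal_on": fused_on > fused_off}
```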
In the embodiments of the disclosure, the vehicle tracking sequence image can be detected by the first detection model and the second detection model respectively to obtain a first turn signal detection result and a second turn signal detection result; the first turn signal detection result and the second turn signal detection result are then fused to obtain the final turn signal detection result of the vehicle tracking sequence image. Compared with detection by a single detection model, this detection mode detects the turn signal of the vehicle in the image more accurately, so that whether the turn signal is turned on during the lane change of the vehicle can be identified more accurately.
In some optional embodiments, the specific implementation of determining the lane in which the vehicle is located based on the video stream to be detected in step 102, as shown in fig. 5, may include the following steps:
step 501, respectively performing vehicle detection and tracking detection on multiple frames of images included in the video stream to be detected, and obtaining multiple frames of vehicle foreground images of the vehicle.
The video stream to be detected is composed of a plurality of continuous images, so the process of detecting the video stream to be detected actually is the process of respectively detecting the plurality of images included in the video stream to be detected and determining the lane where the vehicle is located in the plurality of images.
In the process of respectively detecting multiple frames of images included in the video stream to be detected, the video stream to be detected can be preprocessed in advance to reduce the number of image frames included in the video stream, so that the detection calculated amount can be reduced, and the detection efficiency can be improved. In some optional embodiments, multiple frames of images may be extracted from the video stream to be tested according to a preset rule; sequencing the extracted multi-frame images according to a time sequence to obtain an image set of the vehicle; and respectively detecting the multi-frame images in the image set, and determining the lane where the vehicle is located in the multi-frame images.
In the above embodiment, an image extraction rule may be preset, image frames are extracted from the video stream to be detected according to the preset rule, and the extracted image set of the vehicle is used as a new video stream to be detected to detect images included therein. For example, 5 frames of images may be uniformly extracted from the video stream every 1 second, and then the images extracted in a certain time period are sorted according to the time sequence of the images in the video stream to be detected, so as to obtain an image set of the vehicle, which is used as a new video stream to be detected and detect the images therein.
In this mode, images can be extracted from the video stream to be detected based on the preset rules, which reduces the number of image frames in the video stream to be detected to a certain extent, so that the amount of calculation for detecting the images in the video stream to be detected can be reduced and the efficiency of detecting the lane where the vehicle is located can be improved.
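As an illustrative sketch of this preprocessing, the following uses OpenCV to uniformly sample a fixed number of frames per second (5 here, matching the example above) from the video stream:

```python
# Uniformly sample frames from the video stream to be detected, keeping the
# sampled frames in time order together with their timestamps.

import cv2

def sample_frames(video_path, frames_per_second=5):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(1, int(round(fps / frames_per_second)))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append((idx / fps, frame))   # (timestamp in seconds, image)
        idx += 1
    cap.release()
    return frames                               # already sorted by time
```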
In the embodiments of the present disclosure, the images included in the video stream to be detected are original-size images acquired by the image acquisition device. Due to differences between image acquisition devices, the sizes of the acquired images may differ. Therefore, in this step, multiple frames of vehicle foreground maps of a preset size can be obtained for the vehicle by respectively performing vehicle detection and tracking detection on the multiple frames of images included in the video stream to be detected. Only one vehicle is included in each vehicle foreground map.
For example, the image may be cropped to a vehicle foreground map of a preset size based on the vehicle detection frame of the vehicle detected from the image. Cropping the images into vehicle foreground maps of a uniform size makes further detection of the vehicle in the vehicle foreground map more convenient. For example, the image shown in fig. 6 may be cropped in this step to obtain the vehicle foreground map shown in fig. 7. The vehicle foreground map shown in fig. 7 may be an image of a preset size, and the vehicle foreground map includes only one vehicle.
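A short sketch of such cropping, assuming a preset size of 224 x 224 pixels (an illustrative value) and an (x1, y1, x2, y2) detection box:

```python
# Crop a single-vehicle foreground map of a preset size from a full frame
# using the vehicle detection box.

import cv2

def crop_vehicle_foreground(image, box, size=(224, 224)):
    """box: (x1, y1, x2, y2) vehicle detection box in pixel coordinates."""
    x1, y1, x2, y2 = [int(v) for v in box]
    h, w = image.shape[:2]
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)
    crop = image[y1:y2, x1:x2]
    return cv2.resize(crop, size)   # one vehicle per foreground map
```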
Step 502, determining the lane where the vehicle is located in each frame of image based on the relative position of the vehicle and the lane in each frame of the vehicle foreground image.
After obtaining the multiple frames of foreground images of the vehicle according to the multiple frames of images included in the video stream to be detected, the step can further detect the vehicle in the vehicle foreground image, and determine the lane where the vehicle is located in the image.
In some alternative embodiments, the specific implementation of step 502, as shown in fig. 8, may include the following steps:
step 801, detecting the vehicle foreground map, and determining a vehicle area of the vehicle in the vehicle foreground map; wherein the vehicle area is used to represent a road surface area occupied by the vehicle.
During the driving of the vehicle on the road, the area of the road occupied by the vehicle is most representative of the actual position of the vehicle. In this step, the vehicle foreground image may be detected and a road area occupied by the vehicle in the vehicle foreground image may be used as a vehicle area of the vehicle in the vehicle foreground image.
Fig. 6 shows one of the multiple frames of images included in the video stream to be detected. Box1 is the vehicle detection box determined by a related image detection technique, and the vehicle region determined by detecting the vehicle foreground map in this step is Region1. As shown in fig. 6, the region represented by Region1 is closer to the road surface area occupied by the actual vehicle, and thus the vehicle region can more accurately represent the actual position of the vehicle.
Step 802, determining the lane where the vehicle is located according to the relative position of the vehicle area and the lane.
After the vehicle region of the vehicle in the vehicle foreground map is determined, the step may further determine the lane where the vehicle is located according to the relative position of the vehicle region and the lane. Referring to the image shown in fig. 6, a lane L1 and a lane L2 are included. Region1 is used as the vehicle Region of the vehicle. This step may determine whether the vehicle is located on L2 based on the relative position of Region1 and lane L2.
The specific manner of determining the lane where the vehicle is located according to the relative position of the vehicle region and the lane may be flexibly implemented according to specific applications, and the embodiment is not limited. For example, the lane in which the vehicle is located may be determined in the case where the overlapping area of the vehicle area and the lane is sufficiently large.
In the embodiment of the disclosure, the vehicle foreground map can be detected, and the road surface area occupied by the vehicle in the vehicle foreground map is determined as the vehicle area, so that the lane where the vehicle is located can be further determined according to the relative position of the vehicle area and the lane. In this mode, since the road surface area occupied by the vehicle is detected from the vehicle foreground map as the vehicle area, the actual position of the vehicle can be more accurately represented, and thus the lane where the vehicle is located can be more accurately determined. On the basis of accurately determining the lane where the vehicle is located, the vehicle continuous lane changing can be more accurately identified.
In some alternative embodiments, the specific implementation of step 801, as shown in fig. 9, may include the following steps:
step 901, inputting the vehicle foreground image into a key point detection network obtained by pre-training, and detecting the wheel key points of the vehicle in the image by the key point detection network.
In the embodiment of the disclosure, a key point detection network capable of detecting key points of wheels of a vehicle in a vehicle foreground map can be obtained through pre-training. The key point detection network can be obtained by training based on any machine learning model or neural network model which can be learned. In the embodiment of the present disclosure, the specific form of the key point detection network is not limited.
As an example, fig. 10 shows a schematic network structure diagram of a key point detection network. ResNet is used as a backbone network for extracting picture features. The input of the backbone network can be a vehicle foreground map; after the convolution operations of the backbone network, the spatial resolution of the feature map is gradually reduced and the semantic features become more obvious. It will be appreciated that the backbone network is not limited to ResNet and may be another type of general convolutional neural network structure, such as GoogLeNet, VGGNet, or ShuffleNet.
Further, a Feature Pyramid Network (FPN) can be used to extract multi-scale features. Specifically, the low-resolution feature map may be subjected to resolution restoration by operations of deconvolution and element-level addition, and the output of the FPN is a feature map with a resolution corresponding to 32 × 32, which is one quarter of the original size.
Still further, the output of the FPN may be further convolved to predict 5 positioning heatmaps. The 5 positioning heatmaps respectively correspond to the left front wheel, the left rear wheel, the right rear wheel, the right front wheel, and the background of the vehicle. The wheel key points in the vehicle foreground map are then determined according to the positioning heatmaps.
In one possible implementation, the wheel key point includes a location point where the wheel is in direct contact with the road surface, or includes a wheel center point. Wherein the wheel keypoints are used to represent the position of the wheel. It will be appreciated that different vehicles have different numbers of wheels, so the number of wheel keypoints may also vary from vehicle to vehicle.
For example, the location coordinates of 4 wheel keypoints may be obtained. Illustratively, the wheel keypoints include: a left front wheel keypoint, a left rear wheel keypoint, a right rear wheel keypoint, and a right front wheel keypoint. As shown in fig. 7, the wheel keypoints of the vehicle may include a left front wheel keypoint S1, a left rear wheel keypoint S2, a right rear wheel keypoint S3, and a right front wheel keypoint S4.
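For illustration, the following sketch decodes the 5 positioning heatmaps (4 wheels plus background) into wheel keypoint coordinates by taking the peak of each wheel heatmap and rescaling it by the 1/4 output stride mentioned above; the channel order is an assumption.

```python
# Decode wheel keypoints from the positioning heatmaps: take the argmax of
# each wheel heatmap and map it back to input-image coordinates.

import numpy as np

WHEEL_NAMES = ["front_left", "rear_left", "rear_right", "front_right"]

def decode_wheel_keypoints(heatmaps, stride=4):
    """heatmaps: array of shape (5, H, W); the last channel is the background."""
    keypoints = {}
    for i, name in enumerate(WHEEL_NAMES):
        hm = heatmaps[i]
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        keypoints[name] = (x * stride, y * stride)   # back to input-image scale
    return keypoints
```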
Step 902, determining a vehicle area of the vehicle in the image based on a polygon surrounded by the wheel key points.
Since the wheels are specific positions in direct contact with the road surface during the running process of the vehicle on the road surface, the vehicle area formed by the key points of the wheels can more accurately represent the road surface area occupied by the vehicle. The vehicle area of the vehicle can be determined according to the wheel key points detected in the vehicle foreground map. The specific manner of determining the vehicle region according to the wheel key point may include various implementations, and the embodiment is not limited.
In one possible implementation, a polygonal area formed by a plurality of wheel key points can be used as the vehicle area. In the case where the vehicle includes 4 wheel key points, as shown in fig. 7, a quadrilateral area formed by the 4 wheel key points may be used as the vehicle area of the vehicle, i.e., the quadrilateral S1S2S3S4 may be used as the vehicle area of the vehicle.
In the embodiment of the disclosure, by detecting the wheel key points of the vehicle in the vehicle foreground map, the vehicle area of the vehicle can be determined according to the wheel key points, the road area occupied by the vehicle can be more accurately represented, and the lane where the vehicle is located can be accurately determined according to the relative position of the vehicle area and the lane.
In some alternative embodiments, the specific implementation of step 802, as shown in fig. 11, may include the following steps:
step 1101, determining an overlapping area of the vehicle area and the lane as a first area;
step 1102, determining the proportion of the first area in the vehicle area as the vehicle overlapping degree.
By way of example, fig. 6 includes a vehicle region Region1 and a lane L2, where the overlapping region of the vehicle region Region1 and the lane L2 is R1. This overlapping region R1 may be referred to as the first region. This step may determine the proportion of the first region R1 in the vehicle region Region1 as the vehicle overlap degree, that is, the vehicle overlap degree is calculated as the area of R1 divided by the area of Region1.
In a possible implementation manner, after the overlapping area of the vehicle area and the lane is obtained, that is, the first area is determined, the area of the first area may be further calculated, and in a case that the area of the first area is greater than a preset area threshold, it is determined that the vehicle is located in the corresponding lane. The preset area threshold value may be determined based on the total area of the vehicle region. For example, the preset area threshold may be predefined to be half of the total area of the vehicle region, so the area threshold may be different according to the total area of the vehicle region.
Step 1103, determining the lane corresponding to the vehicle overlap degree as the lane where the vehicle is located, when the vehicle overlap degree is greater than a preset overlap degree threshold value.
The embodiment of the disclosure may preset an overlap threshold as a contrast value of the vehicle overlap to determine whether the vehicle is in the corresponding lane. For example, the overlap threshold may be set to 0.5, and in the case where the vehicle overlap R1/Region1 is greater than 0.5, then it may be determined that the vehicle is in lane L2.
In the embodiment of the disclosure, the proportion of the overlapping area of the vehicle area and the lane to the vehicle area may be defined as a vehicle overlapping degree, the vehicle overlapping degree is compared with a preset overlapping degree threshold, and the vehicle is determined to be located in the corresponding lane when the vehicle overlapping degree is greater than the overlapping degree threshold. In the method for determining the lane where the vehicle is located, the vehicle area is the road surface area occupied by the vehicle, and the lane where the vehicle is located can be more accurately determined by taking the proportion of the overlapped area occupying the vehicle area as a determination basis. After the lane where the vehicle is located is accurately determined, the lane changing behavior of the vehicle can be detected more accurately, and the continuous lane changing behavior of the vehicle can be identified more accurately.
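As an illustrative sketch of steps 1101 to 1103, the vehicle region can be treated as the quadrilateral enclosed by the four wheel keypoints, and the vehicle overlap degree computed as the fraction of that region falling inside each lane polygon; Shapely is used here only for convenience, and any polygon-intersection routine would do.

```python
# Determine the lane of the vehicle: intersect the wheel-keypoint quadrilateral
# with each lane polygon and compare the overlap ratio with a preset threshold.

from shapely.geometry import Polygon

def lane_of_vehicle(wheel_points, lane_polygons, overlap_threshold=0.5):
    """wheel_points: 4 (x, y) wheel keypoints in order; lane_polygons: {lane_id: [(x, y), ...]}."""
    vehicle_region = Polygon(wheel_points)
    for lane_id, lane_pts in lane_polygons.items():
        lane = Polygon(lane_pts)
        first_region = vehicle_region.intersection(lane)            # overlapping region
        overlap = first_region.area / max(vehicle_region.area, 1e-9)  # vehicle overlap degree
        if overlap > overlap_threshold:
            return lane_id
    return None
```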
As shown in fig. 12, the present disclosure provides a lane-change turn signal recognition apparatus that can perform the lane-change turn signal recognition method of any of the embodiments of the present disclosure. The apparatus may include a video stream acquisition module 1201, a vehicle lane determination module 1202, a time determination module 1203, a turn signal detection module 1204, and a non-turn-signal determination module 1205. Wherein:
a video stream obtaining module 1201, configured to obtain a video stream to be detected; the video stream to be detected comprises at least two lanes;
a vehicle lane determining module 1202, configured to determine, based on vehicle detection and tracking detection on the video stream to be detected, a vehicle in the video stream to be detected and a lane where the vehicle is located;
a time determination module 1203, configured to determine a time when the lane where the vehicle is located is changed from an original driving lane to an adjacent lane, as the lane change time of the vehicle;
a turn signal detection module 1204, configured to perform turn signal detection on the vehicle in the video stream to be detected within a preset time length before and after the lane change time;
and a non-turn-signal determination module 1205, configured to determine that the turn signal is not turned on during the lane change when the turn signal of the vehicle is not turned on within the preset time length before and after the lane change time.
Optionally, the vehicle lane determining module 1202, when configured to determine the vehicle in the video stream to be detected and the lane where the vehicle is located based on vehicle detection and tracking detection on the video stream to be detected, includes: obtaining a vehicle tracking sequence image based on vehicle detection and tracking detection of the video stream to be detected; determining a lane in which a vehicle in the vehicle tracking sequence images is located;
the turn signal detection module 1204, when being used for detecting the turn signal of the vehicle in the video stream to be detected with a preset duration before and after the lane change time, includes: respectively detecting the turn signals of the vehicle tracking sequence images with preset time lengths before and after the lane change time.
Optionally, the turn signal detecting module 1204, when configured to detect a turn signal in the vehicle tracking sequence image, includes: inputting the vehicle tracking sequence image into a pre-trained detection model comprising at least one detection branch; and determining the turn-on state of the turn signal based on the probability of the turn signal being turned on and/or off output by the detection model.
Optionally, in a case that the detection model includes a plurality of detection branches, the turn signal detection module 1204, when being used for performing turn signal detection on the vehicle tracking sequence image, includes: inputting the vehicle tracking sequence images into a plurality of detection branches of the detection model respectively; and fusing the turn signal on and/or off probabilities output by the detection branches to obtain the turn signal on state of the vehicle tracking sequence image.
Optionally, in a case that the detection model includes a plurality of detection branches, the turn signal detection module 1204, when being used for performing turn signal detection on the vehicle tracking sequence image, includes: inputting the vehicle tracking sequence images into the detection model comprising at least one detection branch, wherein each branch includes a self-attention module; updating image features in the vehicle tracking sequence image based on the weights generated by the self-attention module, wherein in the vehicle tracking sequence image, the weights of turn signal regions are different from the weights of other regions; and determining the turn-on state of the turn signal based on the image features of the vehicle tracking sequence image.
Optionally, as shown in fig. 13, the turn signal detecting module 1204 includes:
the first detection submodule 1301, configured to input the vehicle tracking sequence image into a first detection model trained in advance, and output a first turn signal detection result of the vehicle tracking sequence image by the first detection model;
the second detection sub-module 1302, configured to input the vehicle tracking sequence image into a second detection model trained in advance, and output a second turn signal detection result of the vehicle tracking sequence image by the second detection model;
and a result fusion submodule 1303, configured to fuse the first turn signal detection result and the second turn signal detection result to obtain a final turn signal detection result of the vehicle tracking sequence image.
Optionally, as shown in fig. 14, the first detection sub-module 1301 includes: a first weight updating submodule 1401, configured to update the weight value of the corresponding turn signal region in the feature map of the first detection model based on a self-attention mechanism; and/or, the second detection sub-module 1302 includes: a second weight updating sub-module 1402, configured to update the weight value of the corresponding turn signal region in the feature map of the second detection model based on the self-attention mechanism.
Optionally, as shown in fig. 15, the turn signal detecting module 1204 further includes:
a first model generation submodule 1501, configured to take a first proportion of the model parameters of a basic detection model to obtain the first detection model;
and a second model generation sub-module 1502, configured to take a second proportion of the model parameters of the basic detection model to obtain the second detection model.
Optionally, as shown in fig. 16, the vehicle lane determination module 1202 includes:
the foreground image obtaining sub-module 1601 is configured to perform vehicle detection and tracking detection on multiple frames of images included in the video stream to be detected, respectively, to obtain multiple frames of vehicle foreground images of the vehicle;
a lane determining sub-module 1602, configured to determine a lane in which the vehicle is located in each frame of image based on the relative position of the vehicle and the lane in each frame of the vehicle foreground map.
Optionally, as shown in fig. 17, the lane determining sub-module 1602 includes:
a vehicle region determination submodule 1701, configured to detect the vehicle foreground map, and determine a vehicle region of the vehicle in the vehicle foreground map; wherein the vehicle region is used for representing the road surface area occupied by the vehicle;
and a vehicle lane determining submodule 1702, configured to determine the lane where the vehicle is located according to the relative position of the vehicle region and the lane.
Optionally, the vehicle region determining sub-module 1701, when configured to detect the vehicle foreground map and determine the vehicle region of the vehicle in the vehicle foreground map, includes: inputting the vehicle foreground image into a key point detection network obtained by pre-training, and detecting wheel key points of the vehicle in the image by the key point detection network; and determining a vehicle area of the vehicle in the image based on a polygon surrounded by the wheel key points.
Optionally, the vehicle lane determining submodule 1702, when configured to determine the lane where the vehicle is located according to the relative position of the vehicle region and the lane, includes: determining an overlapping region of the vehicle region and the lane as a first region; determining the proportion of the first region in the vehicle region as the vehicle overlap degree; and determining the lane corresponding to the vehicle overlap degree as the lane where the vehicle is located under the condition that the vehicle overlap degree is greater than a preset overlap degree threshold value.
Optionally, as shown in fig. 18, the apparatus further includes: a result processing module 1801, configured to initiate an alarm and/or store image frames of the vehicle changing lanes without turning on the turn signal.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of at least one embodiment of the present disclosure. One of ordinary skill in the art can understand and implement it without inventive effort.
The present disclosure also provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, can implement the method for identifying a lane-changing turn signal lamp of a vehicle according to any embodiment of the present disclosure.
Fig. 19 is a more specific hardware structure diagram of a computer device provided in an embodiment of the present disclosure, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively coupled to each other within the device via the bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs; when the technical solutions provided by the embodiments of the present specification are implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called and executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) or in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
The bus 1050 includes a path that transfers information between the various components of the device, such as the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, is capable of implementing the method of identifying a lane-changing turn signal for a vehicle of any of the embodiments of the present disclosure.
The non-transitory computer readable storage medium may be, among others, ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like, and the present disclosure is not limited thereto.
In some optional embodiments, the disclosed embodiments provide a computer program product comprising computer readable code which, when run on a device, is executed by a processor in the device for implementing a vehicle lane change indicator identification method as provided in any of the above embodiments. The computer program product may be embodied in hardware, software or a combination thereof.
Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The above description is only exemplary of the present disclosure and is not intended to limit the present disclosure, so that any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.