CN112099481B

Movatterモバイル変換

Info

Publication number: CN112099481B
Application number: CN201910525727.4A
Authority: CN
Inventors: M·多梅林; 李千山; 田文鑫
Original assignee: Bayerische Motoren Werke AG
Current assignee: Bayerische Motoren Werke AG
Priority date: 2019-06-18
Filing date: 2019-06-18
Publication date: 2025-06-20
Anticipated expiration: 2039-06-18
Also published as: CN112099481A

Abstract

Methods and apparatus for constructing a real-time road model are provided. The method may include and the apparatus may be used to obtain a first set of sensor data output via a plurality of different types of sensors onboard the vehicle at a first time, feed the first set of sensor data to a generic statistical model to form generic statistical model format data for the first time, extract map data from the map that is within a threshold range around a location of the vehicle at the first time, correct each set of sensor data in the generic statistical model format data for the first time based on the map data, fuse the generic statistical model format data for the first time with historical generic statistical model format data to update the generic statistical model format data for the first time, compare the updated generic statistical model format data for the first time with an implicit model to identify objects, and combine the identified objects with the map data to form a real-time road model.

Description

Method and system for constructing road model

Technical Field

The present invention relates to constructing a road model, and more particularly, to constructing a road model using real-time sensor data and an offline map.

Background

An autonomous vehicle (also known as an unmanned car, an autonomous car, a robotic car) is a vehicle that is able to sense its surroundings and navigate without human input. Autonomous vehicles (hereinafter "ADV") use various techniques to detect their surroundings, such as radar, laser, GPS, range finding, and computer vision. Advanced control systems interpret the sensed information to identify the appropriate navigation path, as well as obstacles and associated signs.

More specifically, the ADV collects sensor data from various in-vehicle sensors, such as vision-type sensors (e.g., cameras), radar-type ranging sensors (such as lidar, millimeter wave radar, ultrasonic radar), and the like. Based on the sensor data, the ADV can construct a real-time road model around it. The road model may include various information including, but not limited to, lane information (such as location, type, width, etc. of lane lines), traffic lights, traffic signs, road boundaries, etc. By comparing the constructed road model with a previously obtained road model, such as that included in a High Definition (HD) map provided by a HD map provider, the ADV can more accurately determine its location in the road. Meanwhile, the ADV may also identify objects around it, such as vehicles, pedestrians, and buildings, based on the sensor data. ADV can make appropriate driving decisions, such as lane changes, acceleration, braking, etc., based on the determined road model and identified surrounding objects.

As is known in the art, different types of sensors produce different forms or formats of data. In processing sensor data from different sensors, each type of sensor data must be processed separately. Thus, for each type of sensor data, one or more models for storing that type of sensor data must be built for object identification. Currently, there is no model that can support multiple different types of sensor data simultaneously.

Furthermore, a single set of sensor data obtained for a single instant in time is unstable and unreliable. For example, at some point an object on the road (such as a vehicle) may obscure the sensor, an object on the road (such as a lane marker) may be obscured by other vehicles on the road or the sensor may be dithered due to vehicle jitter, in either case incorrect sensor data obtained by the sensor of the ADV may result in erroneous road model construction. Thus, it is desirable to compare the sensor data for that time instant with a priori information (e.g., a map) to correct for significantly incorrect sensor data, and fuse the sensor data for that time instant with multiple sets of sensor data for previous times to fit the data to construct a real-time road model.

It is therefore desirable to provide a solution that enables simultaneous support of multiple types of models of sensor data and enables real-time road model construction in combination with historical data, so as to overcome the above-mentioned drawbacks.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

According to an embodiment of the present invention, there is provided an apparatus for constructing a real-time road model, comprising a sensor data acquisition module configured to acquire a first set of sensor data output by different types of sensors onboard a vehicle at a first time instant, a data feed module configured to feed the first set of sensor data to a generic statistical model to form a generic statistical model format data for the first time instant, the generic statistical model format comprising a plurality of sensor data sets, wherein each of the plurality of sensor data sets is output by one of the different types of sensors, a map data extraction module configured to extract map data from a map within a range of position surrounding a threshold where the vehicle is located at the first time instant, a data correction module configured to correct one of a plurality of sensor data sets for the first time instant based on the map data to a generic statistical model format, the generic statistical model format comprising a plurality of sensor data sets for the first time instant, a statistical model to be further defined for the generic model format, the statistical model to be fused with the generic model data set of the first time instant, a statistical model to identify an object, a statistical model to be used for the generic model to identify the generic model, a statistical model to be an object to be fused with the generic model, a map data extraction module to extract map data within a range of a position surrounding a threshold where the vehicle is located at the first time instant, wherein each of the plurality of sensor data sample sets comprises a pre-acquired sensor data sample set describing the predefined object by one of the different types of sensors, and a real-time road model forming module configured to combine the identified object with the map data to form the real-time road model.

According to one embodiment of the present invention, a vehicle is provided that includes a plurality of different types of sensors, and the apparatus for constructing a real-time road model described above. The plurality of different types of sensors include a vision-type sensor including a camera and a radar-type ranging sensor, the radar including one or more of a lidar, an ultrasonic radar, a millimeter wave radar.

By adopting the method, the device and the vehicle disclosed by the invention, a plurality of different types of sensor data can be supported in one model, so that the accuracy of positioning the vehicle at a specific position of a road model is improved. Further, by taking into account the history data, the accuracy of the resulting sensor data can be greatly improved. And by taking the map data as priori information to eliminate obvious errors of the real-time sensor data, the constructed real-time road model is more reliable.

These and other features and advantages will become apparent upon reading the following detailed description and upon reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of aspects as claimed.

Drawings

So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only certain typical aspects of this invention and are therefore not to be considered limiting of its scope, for the description may admit to other equally effective aspects.

Fig. 1 shows a schematic diagram of an autonomous vehicle 100 with different types of sensors traveling on a road according to one embodiment of the invention.

FIG. 2 illustrates a flow chart of a method 200 for constructing a real-time road model according to one embodiment of the invention.

FIG. 3 shows schematic diagrams 301 and 302 for generic statistical model format data fusion according to one embodiment of the invention.

FIG. 4 illustrates a flow chart of a method 400 for fusing generic statistical model format data with historical generic statistical model format data for a first moment in time according to the embodiment of FIG. 3.

Fig. 5 is a block diagram of an apparatus 500 for constructing a real-time road model according to one embodiment of the invention.

FIG. 6 illustrates a block diagram of an exemplary computing device 600, according to one embodiment of the invention.

Detailed Description

The features of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings.

The following detailed description refers to the accompanying drawings that illustrate exemplary embodiments of the invention. The scope of the invention is not limited to these embodiments but is defined by the appended claims. Accordingly, embodiments other than those shown in the figures, such as modified versions of the illustrated embodiments, are still encompassed by the present invention.

Reference in the specification to "one embodiment," "an example embodiment," etc., means that the embodiment may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the relevant art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

For convenience of explanation, only an embodiment in which the technical solution of the present invention is applied to a "vehicle" or an "autopilot vehicle" (these two terms are used interchangeably hereinafter) is described in detail herein, but it is fully understood by those skilled in the art that the technical solution of the present invention can be applied to any vehicle capable of implementing unmanned autopilot, such as an airplane, a helicopter, a train, a subway, a ship, etc. The term "a or B" as used throughout this specification refers to "a and B" and "a or B" unless otherwise indicated, and does not mean that a and B are exclusive.

Universal statistical model

When an autonomous vehicle travels on a road, it is necessary to know real-time road conditions. For this purpose, various types of sensors are loaded on the vehicle to act as the "eyes" of the vehicle. Sensors currently in widespread use include vision-type sensors (e.g., cameras) and radar-type ranging sensors (such as lidar, millimeter wave radar, ultrasonic radar). Each sensor has its own advantages and weaknesses. For example, the camera has low cost, can identify different objects, has advantages in terms of object height and width measurement accuracy, lane line identification, pedestrian identification accuracy and the like, is a sensor which is indispensable for realizing functions such as lane departure early warning, traffic sign identification and the like, has a smaller action distance and ranging accuracy than a radar, and is easily influenced by factors such as illumination, weather and the like. The millimeter wave radar is suitable for all weather environments, is not influenced by severe weather such as light, haze, sand storm and the like, can realize the identification of dynamic and static obstacles, but has larger influence on the millimeter wave radar in the environment of coexistence of multiple wave bands under the driving environment, and therefore, the condition of lower data accuracy can be generated. The laser radar has wide detection range and high detection precision, but has poorer performance and higher cost under extreme weather such as rain, snow, fog and the like. Thus, it is desirable to use two or more sensor covers to verify each other during driving. At present, the automatic driving vehicle widely adopts the following comprehensive solutions to achieve the aim of safety redundancy, namely (1) the automatic driving vehicle is provided with a camera and a millimeter wave radar, (2) the automatic driving vehicle is provided with the camera and a laser radar, and (3) the automatic driving vehicle is provided with the camera, the millimeter wave radar and the laser radar.

Referring to fig. 1, a schematic diagram of an autonomous vehicle 100 traveling on a roadway with different types of sensors is shown. Although vehicle 100 in fig. 1 employs cameras 101, millimeter wave radar 102, and lidar 103 to identify objects for purposes of illustration, those skilled in the art will fully appreciate that aspects of the present invention may include one or more other onboard sensors 104. In addition, it is within the scope of the invention to employ more or fewer sensors, such as only camera 101 and millimeter wave radar 102 or only camera 101 and lidar 103.

In the case of a vehicle having multiple types of sensors, each sensor records its own sensor data and provides it to the central processing unit of the vehicle. The format of sensor data provided by various types or various sensor manufacturers is typically different. Generally, the sensor outputs raw sensor data or outputs data after preprocessing (e.g., feature extraction, object extraction, etc.) the raw sensor data according to the vehicle manufacturer's or sensor manufacturer's settings. For example, for the same object such as the lane line 105 in fig. 1, the camera 101 outputs camera data representing the lane line 105, such as original image data or image features extracted from the original image data. Millimeter wave radar 102 outputs millimeter wave radar data representing the lane line 104, such as raw millimeter wave radar data or polygon sequence data constructed from raw millimeter wave radar data. And the lidar 103 outputs lidar data representing the lane line 105, such as raw lidar data or three-dimensional point cloud data constructed from raw lidar data. Of course, the above-listed data formats of the sensor output are merely illustrative, and those skilled in the art will fully appreciate that any data format of the sensor output is within the scope of the present invention.

Typically, for each type of sensor data, one or more models are employed to record that type of sensor data. But this approach would produce multiple sensor data records. For example, at a certain time t, a plurality of data records recording the camera data output by the camera 101, the millimeter wave radar data output by the millimeter wave radar 102, and the laser radar data output by the laser radar 103, respectively, are generated for the same object. This approach not only consumes excessive memory space, but can potentially cause a delay in data processing speed.

The present invention defines a generic statistical model that is capable of supporting multiple types of sensor data simultaneously. The generic statistical model format is given by { t, d_s1,d_s2…d_sn }, which represents a set of sensor datasets output by sensor 1 through sensor n at time t. Where s1 denotes a sensor 1 loaded on the vehicle, s2 denotes a sensor 2 loaded on the vehicle, sn denotes a sensor n (n is any integer greater than 1) loaded on the vehicle, the n sensors being different types of sensors for identifying objects, such as cameras, millimeter wave radars, lidars, etc. Those skilled in the art will fully appreciate that other types of sensors, such as ultrasonic sensors, are also included within the scope of the present invention, depending on the particular configuration and needs of the vendor of the autonomous vehicle.

D_s1 represents the data set output at time t s1, d_s2 represents the data set output at time t s2, and d_sn represents the data set output at time t sn (n is any integer greater than 1). In general, each dataset in d_s1……d_sn has a different data format, such as camera data, millimeter wave radar data, lidar data, and the like. Those skilled in the art will fully appreciate that the above data set may well include sensor data output by other types of sensors, depending on the sensor employed.

By employing the generic statistical model, data sets output by a plurality of different types of sensors at a single time can be integrated into one data model, thereby reducing storage pressure and enabling more efficient data processing.

Implicit model

In the present invention, a plurality of hidden models are predefined, each hidden model including a plurality of different types of sensor data describing an object. The objects may be road signs, lane lines, buildings, pedestrians, road edges, bridges, utility poles, overhead structures, traffic lights or signs, to name a few.

The implicit model format is { pd_s1,pd_s2…pd_sn, object }, which represents a set of different types of sensor data sample sets that are used to describe an object. Wherein s1...sn corresponds to sensor s1...sn, respectively, in the general statistical model described above. pd_s1 represents a sample set describing the object output by s1 obtained in advance, pd_s2 represents a sample set describing the object output by s2 obtained in advance, and pd_sn represents a sample set describing the object output by sn obtained in advance. For example, if the object is a building, and s1 is a camera, s2 is millimeter wave radar, and s3 is lidar, then the implicit model is instantiated as { pd_{Camera head},pd_{Millimeter wave radar},pd_{Laser radar}, building }, whereby pd_{Camera head} may comprise a camera data sample set describing the building, pd_{Millimeter wave radar} may comprise a millimeter wave radar data sample set describing the building, and pd_{Laser radar} may comprise a lidar data sample set describing the building.

The sensor data sample set may be obtained in advance by the autopilot vehicle manufacturer, the sensor manufacturer, or the user. For example, as known to those skilled in the art, a vehicle manufacturer, sensor manufacturer may collect a large number of sensor data samples while training a target model or a road model, and mark the collected sensor data samples by various algorithms or feature extraction means. Thus, sample data output by a certain sensor, labeled as describing the same object, may be grouped together to form a sample set of sensor data for that object for that sensor.

For further explanation, by way of example only, { pd_{Camera head},pd_{Millimeter wave radar},pd_{Laser radar}, building }, may be constructed in the following manner. For example, during training, a vehicle manufacturer may drive a vehicle from point a to point B, during which time sensors such as cameras, millimeter wave radar, lidar, etc. acquire a large number of sensor data samples. After each of these sensor data samples is processed, each sensor data sample may be marked to identify an object (e.g., a roadway sign, lane line, building, pedestrian, roadway edge, bridge, pole, overhead structure, or traffic sign, etc.). Next, multiple camera data samples identifying the same object (e.g., building) are clustered into pd_{Camera head} for that object, multiple millimeter wave radar data samples identifying the same object (e.g., building) are clustered into pd_{Millimeter wave radar} for that object, and multiple lidar data samples identifying the same object are clustered into pd_{Laser radar} for that object. Finally, the three sets of sensor data samples described above are fed into an implicit model to instantiate an implicit model for the building, i.e., { pd_{Camera head},pd_{Millimeter wave radar},pd_{Laser radar}, building }.

Furthermore, the number of samples included in the sensor data sample set may vary depending on specific hardware constraints, usage scenarios, and user requirements. According to one embodiment of the invention, the sensor data sample set may be pre-stored in a storage device of the autonomous vehicle or may be obtained in real-time from a server of the autonomous vehicle manufacturer, a server of the sensor manufacturer, or various cloud services over a network.

According to one embodiment of the invention, after a real-time sensor is obtained, an object may be identified by comparing the sensor data to a sample set of sensor data. The specific modes are detailed below.

Implementation mode

FIG. 2 depicts a flowchart of a method 200 for constructing a real-time road model according to one embodiment of the invention. For example, the method 200 may be implemented within at least one processor (e.g., the processor 604 of fig. 6), which may be located in an onboard computer system, a remote server, or a combination thereof. Of course, in various aspects of the invention, the method 200 may also be implemented by any suitable means capable of performing the relevant operations.

The method 200 begins at step 210. At step 210, at a first time instant (hereinafter "first time instant" is understood to be "real-time"), a first set of sensor data output at the first time instant via a plurality of different types of sensors onboard the vehicle is acquired. The vehicle may take on two or more different types of sensors. According to one embodiment of the invention, the plurality of different types of sensors may include a camera and one or more of a millimeter wave sensor and a lidar sensor. For example, the vehicle may employ a camera and millimeter wave radar, a camera and lidar, or a camera, millimeter wave radar and lidar. The first set of sensor data includes a plurality of sensor data sets corresponding to real-time sensor data output by the camera, millimeter wave sensor, and/or lidar sensor, respectively. Of course, other numbers and other types of sensors are within the scope of the invention, as will be appreciated by those skilled in the art.

At step 220, the acquired first set of sensor data is fed to a generic statistical model to form generic statistical model format data for the first time instant. That is, the model is instantiated by feeding time information (such as a timestamp) and a plurality of sensor data sets included in the first set of sensor data into the generic statistical model { t, d_s1,d_s2…d_sn }. For example, in the case where the vehicle employs a camera, millimeter wave radar, and lidar, the common statistical model format data for this first time is { t_{At a first time},d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} }. As described above, the data format output by d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} may be set by the vehicle manufacturer or sensor manufacturer. In general, d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} has different data formats.

At step 230, map data is extracted from the map that is within a threshold range around the location where the vehicle is located at the first time. At step 240, one or more of the respective sensor data sets in the generic statistical model format data for the first time instant are corrected based on the extracted map data. As described above, sensors such as cameras or lidars are susceptible to weather factors, and the accuracy of the collected sensor data is low in situations where the detection environment is not ideal (e.g., rain, snow, haze, fog). In practice, moreover, it is often found that ground markings or lane lines wear considerably, and that sensor data for such objects do not identify the original features of the object well. In addition, when the in-vehicle sensor is obscured by other obstacles, the collected sensor data is also inaccurate, or even erroneous. For this reason, map data may be used as a priori information to provide support in the event of poor or failure of the sensor.

According to one embodiment of the invention, the map may be preloaded in the memory of the autonomous vehicle or may be obtained from a map provider over a network. According to one embodiment of the invention, the map as a priori information may include an offline map such as an OSM (Open STREET MAP) offline map. Or to obtain a further more accurate centimeter level of positioning, a high-precision map (such as that provided by map suppliers such as Google, hele, etc.) may be used. Those skilled in the art will appreciate that other types of maps are within the scope of the present invention.

The GPS information of the current vehicle can be obtained through the GPS module loaded on the vehicle. Map data within a threshold range around the vehicle is extracted from the map based on the GPS information. The selection of the threshold range may be set by the vehicle manufacturer or user, such as an area encompassed by a circle centered on the vehicle with a radius of 20 meters, 50 meters, 100 meters, etc. Of course, those skilled in the art will appreciate that other values or threshold ranges of shapes are within the scope of the present invention.

By comparing the real-time sensor data with the extracted map data, the significantly erroneous or abnormal real-time sensor data can be corrected, thereby preventing the respective sensor data sets in the general statistical model format data from being greatly deviated due to the performance problem of the sensor or the influence of environmental factors. As can be appreciated by those skilled in the art, to obtain accurate comparison results, the real-time sensor data and the map data are typically first transformed into the same coordinate system (e.g., various coordinates are uniformly transformed into coordinates in the world coordinate system), and then the real-time sensor data and the corresponding map data are compared.

In general, map data, which is a priori information, indicates a static object such as a road sign, a lane line, a building, a road edge, a bridge, a utility pole, an overhead structure, a traffic light or a traffic sign, or the like. Thus, if the real-time sensor data indicates, for example, pedestrians or other vehicles (i.e., dynamic objects) around the vehicle, no correction is required to the real-time sensor data to effectively avoid accidents and accidents. In other words, if the coordinates of the object indicated by the real-time sensor data are not recorded by the corresponding static object at the same position in the map data, no correction is required for the real-time sensor data. However, if the coordinates of the object indicated by the real-time sensor data are recorded by the corresponding static object at the same position in the map data and the data of the real-time sensor data is not complete or accurate, the data representing the corresponding static object is extracted from the map data to correct the real-time sensor data.

There may be a variety of ways to correct real-time sensor data in accordance with one or more embodiments of the present invention. Assuming that the generic statistical model format data for this first moment (real-time) is { t_{At a first time},d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} }, depending on the extracted map data, the three types of sensor data sets can be corrected separately so that d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} contains a sensor data set describing the correct static object, respectively. Or only one or both of d_{Camera head},d_{Millimeter wave radar},d_{Laser radar} may be corrected due to the size of the data amount, leaving the data of the other two or one empty, to save storage space.

For example, at a first time (i.e., in real time), if there should be a traffic sign turning to the right on a road 10 meters in front of the vehicle based on the extracted map data within the threshold range around the vehicle, but since the traffic sign is mostly covered by sand scattered by a certain sand truck ahead, the traffic sign cannot be completely sensed by the sensor, the respective sensor data sets in the generic statistical model format data for the first time only record data indicating the incomplete traffic sign. Thus, by comparison with the extracted map data, one or more of the respective sets of sensor data in the generic statistical model format data for the first time instant may be updated such that at least one type of sensor data describing a complete traffic sign is included in the generic statistical model format data for the first time instant.

Returning to FIG. 2, at step 250, the corrected generic statistical model format data for the first time instant is fused with the historical generic statistical model format data to update the generic statistical model format data for the first time instant. In practice, real-time sensor data acquired at a certain moment alone does not depict an object very accurately. In particular, for an object having continuity, such as a lane line or the like, fusion of sensor data at a plurality of successive times is required to describe the lane line.

According to one embodiment of the present invention, it is assumed that multiple types of sensors loaded in a vehicle are synchronized in time and sensor data is output at the same time interval. The time interval may be different, for example 0.1 seconds, 0.2 seconds, 0.5 seconds, 1 second, etc., according to different practical requirements. Of course, other time intervals are also within the scope of the invention. According to one or more embodiments of the invention, historical generic statistical model format data may be formed at several successive moments prior to a first moment and stored in a memory of the vehicle or cached for quick reading. As should be appreciated, the historical common statistical model format data has the same data format as the common statistical model format data for the first time instant and is formed at one or more time instants before the first time instant in the same manner as the common statistical model format data for the first time instant is formed.

FIG. 3 illustrates schematic diagrams 301 and 302 for generic statistical model format data fusion in accordance with one or more embodiments of the invention. In short, plot 301 shows a single fusion, while plot 302 shows an iterative multiple fusion.

Diagram 301 is a schematic diagram showing fusing generic statistical model format data for a first time instant with historical generic statistical model format data comprising a plurality of generic statistical model format data for a plurality of previous times instants within a threshold period of time before the first time instant. That is, { t_{At a first time},d_s1,d_s2…d_sn } is fused with {t_{At a first time -1},d_s1,d_s2…d_sn},{t_{At a first time -2},d_s1,d_s2…d_sn}……{t_{At a first time -tn},d_s1,d_s2…d_sn} for a previous plurality of consecutive times to update { t_{At a first time},d_s1,d_s2…d_sn }. The intervals between two adjacent moments, for example, between t_{At a first time} and t_{At a first time -1}, between t_{At a first time -1} and t_{At a first time -2}, are the predetermined time intervals as described above, and the threshold time period passing between t_{At a first time -tn} and t_{At a first time} may also be selected according to actual requirements. For example, in the case where the predetermined time interval is 0.1 seconds, the threshold time period may be selected to be 1 second, whereby 10 (i.e., tn is 10 in this case) historical generic statistical model format data within 1 second from the first time point are selected for fusion. For example, { t_{At a first time},d_s1,d_s2…d_sn } may be fused with 10 historical generic statistical model format data for the previous 1 second to update { t_{At a first time},d_s1,d_s2…d_sn } with the fused sensor data to get { t_{At a first time},d_s1',d_s2'…d_sn' }, where each of the 10 historical generic statistical model format data corresponds to the generic statistical model format data obtained at 0.1 meter time intervals for the previous 1 second of the first time. It can be seen that the manner of illustration 301 is to fuse { t_{At a first time},d_s1,d_s2…d_sn } with historical generic statistical model format data once to update { t_{At a first time},d_s1,d_s2…d_sn }.

Diagram 302 is a schematic diagram showing iteratively fusing a plurality of generic statistical model format data for a plurality of previous moments in time within a threshold period of time from a first moment in time. Continuing with the above example, assume that the threshold time periods are 1 second, respectively, and the predetermined time interval between adjacent two times is 0.1 second. The common statistical model format data for the previous moment is iteratively fused with the common statistical model format data for the next moment to update the common statistical model format data for the next moment until the common statistical model format data for the first moment is updated, resulting in { t_{At a first time},d_s1',d_s2'…d_sn' }.

In order to make the fused data more accurate, the following mathematical method can be adopted in the fusion process.

FIG. 4 illustrates a flow chart of a method 400 for fusing generic statistical model format data with historical generic statistical model format data for a first time according to the embodiment of FIG. 3. At step 410, historical generic statistical model format data is obtained, the historical generic statistical model format data comprising a plurality of generic statistical model format data for a plurality of previous moments in a threshold period of time prior to a first moment in time.

At step 420, the generic statistical model format data for the first time instant is converted into the same coordinate system as the historical generic statistical model format data. For example, assume that the vehicle is taken as the origin of the local coordinate system, the traveling direction of the vehicle is taken as the x-axis of the local coordinate system, and the direction perpendicular to the traveling direction of the vehicle is taken as the y-axis of the local coordinate system. Then as the vehicle travels a distance L toward the direction of travel from time t_{At a first time -1} to t_{At a first time}, it is understood that the origin of the local coordinate system at t_{At a first time} is moved compared to the origin of the local coordinate system at t_{At a first time -1} (L_x,L_y). Through coordinate transformation, the sensor datasets in the collected historical generic statistical model format data are transformed into the local coordinate system of t_{At a first time}, thereby leaving all the sensor datasets for fusion in the same coordinate system. According to another embodiment of the present invention, the collected historical general statistical model format data and various coordinates adopted for the general statistical model format data at the first moment can be uniformly converted into coordinates in a world coordinate system, so that all sensor data sets for fusion are in the same coordinate system. Various coordinate transformation methods include, but are not limited to, translation and rotation of coordinates in two-dimensional space, translation and rotation of coordinates in three-dimensional space, and the like.

At step 430, the generic statistical model format data for the first moment and the historical generic statistical model format data for the first moment having the sensor dataset under the same coordinate system are fused according to either of the two fusion approaches 301 and 302 shown in fig. 3 such that the generic statistical model format data for the first moment is updated to include the fused sensor data. As known to those skilled in the art, the fusion process involves aggregation and denoising of the data sets in order to obtain smooth and consistent data. For example, in the fusion method using 301, it is assumed that the threshold time period is 1 second, and the predetermined time interval between adjacent two times is 0.1 second, respectively. The sensor dataset included in d_s1,d_s2…d_sn in { t_{At a first time},d_s1,d_s2…d_sn } and the sensor dataset included in d_s1,d_s2…d_sn in the first 10 historical generic statistical model format data are aggregated, and then duplicate or anomalous data in the aggregated sensor dataset is removed or filtered to obtain a fused updated { t_{At a first time},d_s1',d_s2'…d_sn' }. For another example, in the fusion approach employing 302, the sensor data sets included in the generic statistical model format data for two consecutive moments in time are similarly aggregated and denoised, thereby updating the generic statistical model format data for the next moment in time until the generic statistical model format data for the first moment in time is updated.

In one embodiment, a weighted average algorithm may also be employed for fusion. For example, when aggregated, historical generic statistical model format data recorded closer in time to a first time is given higher weight, while generic statistical model format data recorded farther in time from the first time is given lower weight. Of course, other ways of weighting are also contemplated.

Returning to FIG. 2, at step 260, the updated generic statistical model format data for the first moment is compared to the implicit model to identify the object. As described above, the implicit model format is { pd_s1,pd_s2…pd_sn, object }, which represents a set of different types of sensor data sample sets that are used to describe an object. By comparing { t_{At a first time},d_s1',d_s2'…d_sn' } obtained in step 250 with one or more { pd_s1,pd_s2…pd_sn, object } a specific object described by { t_{At a first time},d_s1',d_s2'…d_sn' } can be derived. According to one embodiment of the invention, assume that the vehicle employs three sensors and compare { t_{At a first time},d_s1',d_s2',d_s3' } with { pd_s1,pd_s2,pd_s3, object₁ }. In this example, d_s1' and pd_s1,d_s2' and pd_s2,d_s3' and pd_s3, respectively, are compared to determine if { t_{At a first time},d_s1',d_s2',d_s3' } describes object₁. In specific practice, it is highly likely that not all of the three comparison results are true among the three comparison results for the three sensors. In this regard, the vehicle manufacturer may predefine a decision criterion, such as whether two of the three sensor data sets are true, i.e., the overall comparison result is considered to be true, or whether three of the three sensor data sets must be true, i.e., the overall comparison result is considered to be true.

Or the vehicle may automatically select the decision criteria based on the current climate, road environment or identified object, as set forth in advance. For example, as described above, different kinds of sensors have different advantages and disadvantages. Depending on the different adaptability of the sensor to environmental conditions, the confidence of the sensor dataset output by the millimeter wave radar may be designated as higher in environments with poor visibility such as haze, fog, etc., and the confidence of the sensor dataset output by the lidar and the camera may be designated as higher in the case of better environmental conditions. Furthermore, the confidence of different sensor datasets may be set for different types of objects, depending on the manner in which the data is obtained by the different types of sensors. For example, for some objects having three-dimensional characteristics, such as buildings, the confidence of the sensor data set output by the camera may be set lower than the confidence of the sensor data sets output by the laser sensor and the millimeter wave radar sensor, whereas for some planar objects, such as lane lines, ground traffic signs, the confidence of the sensor data set output by the camera may be set lower than the confidence of the sensor data sets output by the laser sensor and the millimeter wave radar sensor. In this way, in general, the overall determination result can be calculated by the following equation:

Overall = confidence_s1 × s1+ confidence_s2 s2+. The..+ -. Is a.c. + confidence_sn Sn.

Wherein confidence_s1 +confidence_s2 +. The.a. confidence_sn =1, S1 is the result of comparing d_s1' with pd_s1, S2 is the result of comparing d_s2' with pd_s2, and Sn bit d_sn' with pd_sn. wherein, S1 is the same as the main component, S2. Is 1 or 0. For example, s1=1 indicates that d_s1' compared to pd_s1 yields d_s1' identifying the object described by pd_s1, while s1=0 indicates that d_s1' compared to pd_s1 yields d_s1' not identifying the object described by pd_s1. The same is true for s2. Thus, the vehicle manufacturer or user may set that if the entirety > a predetermined value (e.g., 50%), it may be determined that { t_{At a first time},d_s1',d_s2'…d_sn' } identifies the object described by { pd_s1,pd_s2…pd_sn, object }. Of course, a variety of other different decision criteria are also contemplated.

At step 270, the identified objects are combined with the extracted map data to form a real-time road model. As described above, real-time sensor data collected by sensors cannot reflect the surrounding road conditions comprehensively depending on environmental factors and the like. Thus, the updated objects identified by the generic statistical model format data for the first moment may be stitched with the map data extracted in step 230 to form a real-time road model. In some embodiments, at step 260, the updated generic statistical model format data for the first moment may be compared only with an implicit model describing the dynamic object (e.g., pedestrian, other vehicle, etc.) such that the real-time sensor data is used only to identify the dynamic object. Thus, real-time dynamic objects identified by the sensors are combined with static objects in the map data (e.g., road signs, lane lines, buildings, road edges, bridges, utility poles, overhead structures, traffic lights or signs, etc.) to form a real-time road model. Thus, the computational burden of the in-vehicle processor is reduced.

Also, in step 260, after comparing the updated generic statistical model format data for the first time with the implicit model, it is possible to derive a case where the updated generic statistical model format data for the first time does not successfully identify the object. In this case, the extracted map data may be used as a real-time road model.

Thus, by using the method of the present invention, sensor data sets can be processed more quickly by placing different kinds of sensor data sets into a unified data model than by separately obtaining sensor data from each of the different kinds of sensors and processing separately. Meanwhile, the map data is used as priori information, so that the real-time data of the sensor and the existing information of the map data can be combined, and a real-time road model can be built more accurately and rapidly.

Fig. 5 is a block diagram of an apparatus 500 for constructing a real-time road model according to one embodiment of the invention. All of the functional blocks of the apparatus 500 (including the respective units in the apparatus 500) may be implemented by hardware, software, a combination of hardware and software. Those skilled in the art will appreciate that the functional blocks depicted in fig. 5 may be combined into a single functional block or divided into multiple sub-functional blocks.

The apparatus 500 may include a sensor data acquisition module 510, the sensor data acquisition module 510 configured to acquire a first set of sensor data output by different types of sensors onboard the vehicle at a first time. The apparatus 500 may further include a data feed module 520, the data feed module 520 configured to feed the first set of sensor data to the generic statistical model to form generic statistical model format data for the first moment in time. The apparatus 500 may further comprise a map data extraction module 530, the map data extraction module 530 being configured to extract map data from a map of the vehicle within a threshold range around a location at which the first moment is located. The apparatus 500 may further include a data correction module 540, the data correction module 540 being configured to correct one or more of the plurality of sensor data sets in the generic statistical model format data for the first moment in time based on the map data. The apparatus 500 may further include a data fusion module 550, the data fusion module 550 being configured to fuse the generic statistical model format data for the first time instant with the historical generic statistical model format data to update the generic statistical model format data for the first time instant. The apparatus 500 may further include an object identification module 560, the object identification module 560 configured to compare the updated generic statistical model format data for the first time instance with the implicit model to identify the object. The apparatus 500 may still further comprise a real-time road model formation module 570, the real-time road model formation module 570 being configured to combine the identified objects with map data to form the real-time road model.

FIG. 6 illustrates a block diagram of an exemplary computing device, which is one example of a hardware device that may be used with aspects of the invention, according to one embodiment of the invention.

With reference to FIG. 6, a computing device 600 will now be described as one example of a hardware device that may be employed with aspects of the present invention. Computing device 600 may be any machine that may be configured to implement processes and/or calculations and may be, but is not limited to, a workstation, a server, a desktop computer, a laptop computer, a tablet computer, a personal digital assistant, a smart phone, a vehicle mount computer, or any combination thereof. The various methods/apparatus/servers/client devices described above may be implemented, in whole or in part, by computing device 600 or a similar device or system.

Computing device 600 may include components that may be connected or in communication with bus 602 via one or more interfaces. For example, computing device 600 may include a bus 602, one or more processors 604, one or more input devices 606, and one or more output devices 608. The one or more processors 604 may be any type of processor and may include, but is not limited to, one or more general purpose processors and/or one or more special purpose processors (e.g., special purpose processing chips). Input device 606 may be any type of device capable of inputting information to a computing device and may include, but is not limited to, a mouse, keyboard, touch screen, microphone, and/or remote controller. Output device 608 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. Computing device 600 may also include or be connected to a non-transitory storage device 610, which may be non-transitory and capable of data storage, and which may include, but is not limited to, a disk drive, an optical storage device, solid state memory, a floppy disk, a flexible disk, a hard disk, magnetic tape, or any other magnetic medium, an optical disk or any other optical medium, a ROM (read only memory), a RAM (random access memory), a cache memory, and/or any memory chip or cartridge, and/or any other medium from which a computer may read data, instructions, and/or code. The non-transitory storage device 610 may be separate from the interface. The non-transitory storage device 610 may have data/instructions/code for implementing the methods and steps described above. Computing device 600 may also include a communication device 612. The communication device 612 may be any type of device or system capable of enabling communication with internal equipment and/or with a network and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication device, and/or a chipset, such as a Bluetooth device, an IEEE 1302.11 device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.

When the computing device 600 is used as an in-vehicle device, it may also be connected with external devices (e.g., a GPS receiver, sensors for sensing different environmental data (such as acceleration sensors, wheel speed sensors, gyroscopes, etc.)). In this way, computing device 600 may receive, for example, positioning data and sensor data indicative of a vehicle-form condition. When the computing device 600 is used as an in-vehicle device, it may also be connected with other devices (e.g., an engine system, a wiper, an antilock brake system, etc.) for controlling the running and operation of the vehicle.

Further, the non-transitory storage device 610 may have map information and software components so that the processor 604 may implement route guidance processing. Further, the output device 606 may include a display for displaying a map, displaying positioning marks of the vehicle, and displaying an image indicating a running condition of the vehicle. The output device 606 may also include a speaker or headphone interface for audio guidance.

Bus 602 may include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus. In particular, for an in-vehicle device, bus 602 may also include a Controller Area Network (CAN) bus or other architecture designed for use in an automobile.

Computing device 600 may also include a working memory 614, which working memory 614 may be any type of working memory capable of storing instructions and/or data that facilitate the operation of processor 604 and may include, but is not limited to, random access memory and/or read-only memory devices.

Software components may reside in working memory 614 and include, but are not limited to, an operating system 616, one or more application programs 618, drivers, and/or other data and code. Instructions for implementing the above-described methods and steps may be included in the one or more applications 618, and modules/units/components of the various foregoing apparatus/servers/client devices may be implemented by the processor 604 reading and executing the instructions of the one or more applications 618.

It should also be appreciated that variations may be made according to particular needs. For example, custom hardware may also be used, and/or particular components may be implemented in hardware, software, firmware, middleware, microcode, hardware description voices, or any combination thereof. In addition, connections to other computing devices, such as network input/output devices, etc., may be employed. For example, some or all of the disclosed methods and apparatus may be implemented with programming hardware (e.g., programmable logic circuits including Field Programmable Gate Arrays (FPGAs) and/or Programmable Logic Arrays (PLAs)) having an assembly language or hardware programming language (e.g., VERILOG, VHDL, C ++).

Although aspects of the present invention have been described so far with reference to the accompanying drawings, the above-described methods, systems and apparatuses are merely examples, and the scope of the present invention is not limited to these aspects but is limited only by the appended claims and equivalents thereof. Various components may be omitted or replaced with equivalent components. In addition, the steps may also be implemented in a different order than described in the present invention. Furthermore, the various components may be combined in various ways. It is also important that as technology advances, many of the described components can be replaced by equivalent components that appear later.

Claims

1. A method for constructing a real-time road model, comprising:

Acquiring a first set of sensor data output via a plurality of different types of sensors onboard the vehicle at a first time;

feeding the first set of sensor data to a generic statistical model to form generic statistical model format data for the first moment in time, the generic statistical model format data comprising a plurality of sensor data sets, wherein each sensor data set of the plurality of sensor data sets is output by one of the different types of sensors;

extracting map data of the vehicle in a threshold range around the position of the first moment from a map;

correcting one or more of a plurality of sensor data sets in a generic statistical model format data for the first moment in time based on the map data;

fusing the universal statistical model format data for the first time with historical universal statistical model format data to update the universal statistical model format data for the first time;

Comparing the updated generic statistical model format data for the first time instance with an implicit model to identify an object, the implicit model comprising a plurality of sensor data sample sets describing a predefined object, wherein each of the plurality of sensor data sample sets comprises a pre-acquired sensor data sample set describing the predefined object by one of the different types of sensors, and

The identified objects are combined with the map data to form the real-time road model.

2. The method of claim 1, wherein correcting one or more of the plurality of sensor data sets in the generic statistical model format data for the first time instant based on the map data further comprises:

converting a plurality of sensor data sets in a generic statistical model format data for the first moment and the map data into the same coordinate system;

If the coordinates of the objects indicated by the plurality of sensor data sets are not documented by the respective objects at the same location in the map data, no correction is required for one or more of the plurality of sensor data sets in the generic statistical model format data for the first time instant;

If the coordinates of the object indicated by the plurality of sensor data sets are recorded by the corresponding object at the same position in the map data, and the data of one or more of the plurality of sensor data sets is incomplete or accurate, the data representing the corresponding object is extracted from the map data to correct the one or more of the plurality of sensor data sets.

3. The method of claim 1, wherein fusing the common statistical model format data for the first time with historical common statistical model format data comprises fusing the common statistical model format data for the first time with historical common statistical model format data comprising a plurality of common statistical model format data for a plurality of previous times within a threshold period of time prior to the first time.

4. The method of claim 1, wherein fusing the generic statistical model format data for the first time instance with historical generic statistical model format data comprises:

For a plurality of common statistical model format data for a plurality of previous moments in time within a threshold period of time from before the first moment in time, iteratively performing fusing the common statistical model format data for each moment in time with the common statistical model format data for a subsequent moment in time after a predetermined time interval to update the common statistical model format data for the subsequent moment in time until the common statistical model format data for the first moment in time is updated.

5. The method of claim 1, wherein the fusing further comprises causing the historical generic statistical model format data and the generic statistical model format data for the first time to each be represented using the same coordinate system by converting the historical generic statistical model format data into a local coordinate system using a location of the vehicle at the first time as an origin of the local coordinate system or by uniformly converting various coordinates employed in the historical generic statistical model format data and the generic statistical model format data for the first time into coordinates in a world coordinate system.

6. The method of claim 5, wherein the fusing further comprises aggregating the historical generic statistical model format data and the sets of sensor data output by the same sensor for the first time instant in the generic statistical model format data, respectively, and removing duplicate data in each aggregated set of sensor data.

7. The method of claim 1, wherein the object comprises at least one of a road sign, a lane, a building, a pedestrian, another vehicle, a road edge, a bridge, a pole, an overhead structure, or a traffic sign.

8. The method of claim 1, wherein the plurality of different types of sensors comprises a vision-type sensor comprising a camera and a radar-type ranging sensor comprising one or more of a lidar, an ultrasonic radar, a millimeter wave radar.

9. The method of claim 1, wherein the map comprises a high-precision map, wherein comparing the updated generic statistical model format data for the first time instance with an underlying model to identify objects further comprises comparing only the updated generic statistical model format data for the first time instance with an underlying model describing dynamic objects to identify only dynamic objects, the dynamic objects comprising pedestrians and/or another vehicle;

Combining the identified objects with the map data to form the real-time road model further includes combining the identified dynamic objects with the map data to form a real-time road model.

10. An apparatus for constructing a real-time road model, comprising:

A sensor data acquisition module configured to acquire a first set of sensor data output by different types of sensors onboard the vehicle at a first time;

A data feed module configured to feed the first set of sensor data to a generic statistical model to form generic statistical model format data for the first time instant, the generic statistical model format data comprising a plurality of sensor data sets, wherein each of the plurality of sensor data sets is output by one of the different types of sensors;

a map data extraction module configured to extract map data of the vehicle within a threshold range around a position where the first time is located from a map;

a data correction module configured to correct one or more of a plurality of sensor data sets in a generic statistical model format data for the first time instant based on the map data;

a data fusion module configured to fuse the generic statistical model format data for the first time with historical generic statistical model format data to update generic statistical model format data for the first time;

An object identification module configured to compare the updated generic statistical model format data for the first time instance with an implicit model to identify an object, the implicit model comprising a plurality of sensor data sample sets for describing a predefined object, wherein each of the plurality of sensor data sample sets comprises a pre-acquired sensor data sample set describing the predefined object by one of the different types of sensors, and

A real-time road model formation module configured to combine the identified objects with the map data to form the real-time road model.

11. The apparatus of claim 10, wherein the data correction module is further configured to:

12. The apparatus of claim 10, wherein the map comprises a high-precision map, wherein the object identification module is further configured to compare only updated generic statistical model format data for the first time instant with an implicit model describing a dynamic object to identify only dynamic objects, the dynamic object comprising a pedestrian and/or another vehicle;

Wherein the real-time road model formation module is further configured to combine the identified dynamic objects with the map data to form a real-time road model.

13. The apparatus of claim 10, wherein the plurality of different types of sensors comprise a vision-type sensor comprising a camera and a radar-type ranging sensor comprising one or more of a lidar, an ultrasonic radar, a millimeter wave radar.

14. A vehicle, comprising:

A plurality of different types of sensors, and

The device of any one of claims 10-13.

15. The vehicle of claim 14, wherein the plurality of different types of sensors include a vision-type sensor including a camera and a radar-type ranging sensor including one or more of a lidar, an ultrasonic radar, a millimeter wave radar.