CN113469739B - Prediction method and system for taxi taking demand of network taxi taking - Google Patents

Prediction method and system for taxi taking demand of network taxi taking

Info

Publication number
CN113469739B
Authority
CN
China
Prior art keywords
data
training
feature set
prediction
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110711952.4A
Other languages
Chinese (zh)
Other versions
CN113469739A (en)
Inventor
吴元琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Chenqi Travel Technology Co Ltd
Original Assignee
Guangzhou Chenqi Travel Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Chenqi Travel Technology Co Ltd
Priority to CN202110711952.4A
Publication of CN113469739A
Application granted
Publication of CN113469739B
Active
Anticipated expiration

Abstract

The invention relates to the technical field of ride-hailing, and in particular to a method and a system for predicting ride-hailing demand for online ride-hailing services. The method comprises the following steps: reading a plurality of associated data that influence ride-hailing demand; repairing abnormal and missing data in the associated data; calculating the features of each type of associated data and generating a feature set; performing overall training on the data in the feature set; performing secondary training on the data of high-order-volume regions in the feature set; performing multi-model fusion on the results of the two trainings to obtain a prediction model; and inputting current associated data to obtain a predicted future ride-hailing demand result. The prediction method and system offer high accuracy, high operating efficiency, and resistance to overfitting. Because machine learning models are built from a plurality of associated data, accuracy is higher than with a simple order-volume calculation. This addresses the low accuracy, low operating efficiency, and tendency to overfit of existing prediction techniques and meets the need for ride-hailing demand prediction.

Description

Prediction method and system for taxi taking demand of network taxi taking
Technical Field
The invention relates to the technical field of ride-hailing, and in particular to a method and a system for predicting ride-hailing demand for online ride-hailing services.
Background
With the development of the Internet, the ride-hailing business has gradually moved from offline to online: a user only needs to enter a start point and an end point in an application to be matched with a ride-hailing vehicle and place an order. Dispatching of ride-hailing vehicles is generally controlled by the platform, which allocates drivers according to the order volume shown on a city heat map in order to improve capacity utilization in the coverage area. Because passenger demand in each region changes continuously, real-time adjustment alone is rarely effective; historical data must be analyzed in advance and future demand predicted in order to improve the carrying efficiency of the ride-hailing business.
Existing demand prediction relies mainly on changes in the number of historical orders: the average order volume of the same time period in each month or each day is calculated and used to predict the order volume of subsequent matching periods. This approach is simple to implement, but its predictions are poor, often deviate considerably from the actual situation, and do not reach the required accuracy. Techniques that predict ride-hailing demand with machine learning also exist, but because their algorithms are not optimized for the ride-hailing domain, the models run inefficiently when processing large amounts of order data and overfit easily: they perform well when tested on historical data, but their accuracy drops when predicting demand for a future time period. A method and a system for predicting ride-hailing demand are therefore needed to solve these problems.
Disclosure of Invention
To overcome the low accuracy, low operating efficiency, and tendency to overfit of existing prediction techniques, the invention provides a method and a system for predicting ride-hailing demand for online ride-hailing services that offer high accuracy, high operating efficiency, and resistance to overfitting.
To solve the above problems, the invention is realized according to the following technical scheme:
The invention discloses a method for predicting ride-hailing demand for online ride-hailing services, comprising the following steps:
reading a plurality of associated data that influence ride-hailing demand;
repairing abnormal and missing data in the associated data;
calculating the features of each type of associated data and generating a feature set;
performing overall training on the data in the feature set;
performing secondary training on the data of high-order-volume regions in the feature set;
performing multi-model fusion on the results of the two trainings to obtain a prediction model;
and inputting current associated data to obtain a predicted future ride-hailing demand result.
The associated data includes: historical order volume data, order weather data, holiday information data, regional population density data, and regional residential-community density data.
Repairing the abnormal and missing data in the associated data is specifically as follows: each type of associated data is sorted and arranged into a normal distribution according to how often each value occurs, and data outside the 3σ range of that distribution is deleted to reduce abnormal data; missing data that can take a numerical value is filled with 0, and for missing data that cannot take a numerical value, the data segments immediately before and after the gap are acquired and the most frequent value in them is used to fill the gap.
Calculating the features of each type of associated data and generating a feature set is specifically as follows: the historical order volume data is divided at a fixed time interval to obtain a sequence of historical order volume data; the period-over-period ratio (ring ratio), the day-over-day ratio, and the week-over-week ratio of the order volume are then calculated for the sequence; the time dimension and the spatial dimension of the historical order volume data are labelled using the other associated data; and finally the data is processed with a time-frequency transformation algorithm to obtain the feature set.
The time-frequency transformation algorithm includes a Fourier transform algorithm.
Overall training on the data in the feature set is specifically as follows: the feature set is read and its data is input into a first machine learning model for training; the generated data is merged into the feature set; the data is then extracted again and input into a second machine learning model for training, and the generated data is merged into the feature set.
The first machine learning model is an XGBoost model.
The second machine learning model is a LightGBM model.
Secondary training on the data of high-order-volume regions in the feature set is specifically as follows: the feature set is read, the data labelled as belonging to a high-order-volume region is extracted and input into a deep learning model for training, and the generated data is then merged into the feature set.
The deep learning model is an LSTM model.
Multi-model fusion of the results of the two trainings to obtain a prediction model is specifically as follows: the prediction results and mean square errors of the overall training and the secondary training are obtained respectively, the fused prediction result is obtained as a weighted average of the two prediction results, and a stable prediction model is obtained through repeated iterative computation.
A system for predicting ride-hailing demand for online ride-hailing services, the system comprising:
a reading module for reading a plurality of associated data that influence ride-hailing demand;
a repair module for repairing abnormal and missing data in the associated data;
a feature module for calculating the features of each type of associated data and generating a feature set;
an overall training module for performing overall training on the data in the feature set;
a secondary training module for performing secondary training on the data of high-order-volume regions in the feature set;
a fusion module for performing multi-model fusion on the results of the two trainings to obtain a prediction model;
and a prediction module for inputting current associated data and obtaining a predicted future ride-hailing demand result.
Compared with the prior art, the invention has the following beneficial effects:
The method and system for predicting ride-hailing demand offer high accuracy, high operating efficiency, and resistance to overfitting. Machine learning models are built from a plurality of associated data, which gives higher accuracy than a simple order-volume calculation. The feature sets corresponding to the associated data are processed separately and different training models are selected according to the magnitude of the data, which is more efficient than the operation of a general system, reduces the complexity of each individual training run, and reduces overfitting. This addresses the low accuracy, low operating efficiency, and tendency to overfit of existing prediction techniques and meets the need for ride-hailing demand prediction.
Drawings
The invention is described in further detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a schematic flow diagram of the method of the present invention;
FIG. 2 is a schematic diagram of the system architecture of the present invention.
Detailed Description
The preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.
As shown in FIGS. 1 and 2, the method for predicting ride-hailing demand for online ride-hailing services according to the present invention comprises:
101. Reading a plurality of associated data that influence ride-hailing demand;
The associated data includes, but is not limited to: historical order volume data, order weather data, holiday information data, regional population density data, and regional residential-community density data. Processing and analyzing these several kinds of data together gives higher prediction accuracy than using historical order volume data alone, reduces the influence of a small amount of abnormal data on the overall result, and keeps the prediction stable.
102. Repairing abnormal and missing data in the associated data;
Repairing the abnormal and missing data in the associated data is specifically as follows: each type of associated data is sorted and arranged into a normal distribution according to how often each value occurs. Data outside the 3σ range of that distribution occurs with low frequency and is usually a maximum or minimum caused by statistical errors, repeated counting, system bugs, or extreme events, so this part of the data is deleted to reduce abnormal data.
If the missing data can take a numerical value, as with historical order volume data, regional population density data, or regional residential-community density data, the gap is filled with 0 so that the data remains continuous. If it cannot take a numerical value, as with order weather data or holiday information data, the data segments immediately before and after the gap are acquired, the frequency of each valid value within that fixed interval is calculated, and the most frequent value is used to fill the gap.
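The repair step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `window` parameter, and the use of `None` as the missing-value marker are all hypothetical, and the 3σ rule is applied here to the mean and population standard deviation of the values.

```python
from collections import Counter
from statistics import mean, pstdev


def drop_outliers_3sigma(values):
    """Keep only values within three standard deviations of the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]


def fill_numeric(values):
    """Numeric fields (e.g. order volume): replace missing entries with 0."""
    return [0 if v is None else v for v in values]


def fill_categorical(values, window=3):
    """Non-numeric fields (e.g. weather): fill each gap with the most frequent
    valid value among up to `window` neighbours on each side (assumed rule)."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            lo = max(0, i - window)
            neighbours = [
                x for x in values[lo:i] + values[i + 1:i + 1 + window]
                if x is not None
            ]
            if neighbours:
                filled[i] = Counter(neighbours).most_common(1)[0][0]
    return filled
```

One design point worth noting: with a population standard deviation, a single point can exceed 3σ only when the series has at least eleven values, so a filter of this kind is meaningful only on reasonably long series.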
103. Calculating the features of each type of associated data and generating a feature set;
Calculating the features of each type of associated data and generating a feature set is specifically as follows: a fixed time interval is set as a constant t, and the historical order volume data is divided at the interval t to obtain a sequence {d0, d1, ..., dn} of historical order volume data, where dn is the historical order volume contained in the n-th time interval t.
The period-over-period ratio (ring ratio) of the sequence is then calculated to obtain the percentage change of the historical order volume relative to the previous time interval.
The day-over-day ratio of the sequence is calculated to obtain the percentage change of the historical order volume relative to the same interval of the previous day.
The week-over-week ratio of the sequence is calculated to obtain the percentage change of the historical order volume relative to the same interval of the previous week.
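The three ratios can be illustrated with a short Python sketch. The formulas are an assumption based on the usual definition of a percentage change, (current − reference) / reference × 100; the function names and the `intervals_per_day` parameter are hypothetical and do not come from the patent.

```python
def pct_change(seq, lag):
    """Percentage change of each element relative to the element `lag` steps
    earlier. Entries with no usable reference (too early in the sequence, or a
    reference of 0) are returned as None."""
    out = []
    for i, cur in enumerate(seq):
        if i < lag or seq[i - lag] == 0:
            out.append(None)
        else:
            ref = seq[i - lag]
            out.append((cur - ref) / ref * 100.0)
    return out


def ratio_features(seq, intervals_per_day):
    """Period-over-period, day-over-day and week-over-week ratios of a
    sequence sampled at a fixed interval t."""
    return {
        "period_over_period": pct_change(seq, 1),
        "day_over_day": pct_change(seq, intervals_per_day),
        "week_over_week": pct_change(seq, 7 * intervals_per_day),
    }
```

For example, with hourly intervals `intervals_per_day` would be 24, so the day-over-day ratio compares each hour with the same hour one day earlier and the week-over-week ratio uses a lag of 168 intervals.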
The time dimension of the historical order volume data is labelled using the order weather data and the holiday information data, so that each record carries information such as peak period, off-peak period, working day, holiday, or rainy day. The spatial dimension is labelled using the regional population density data and the regional residential-community density data: each record is labelled with the administrative district, the hexagonal region, and the longitude and latitude of the centre of the region to which it belongs, and regions are also labelled as high-order-volume or low-order-volume regions. Finally, the data is processed with a time-frequency transformation algorithm, which includes a Fourier transform algorithm.
In the formula of the time-frequency transformation algorithm, Dk is the amplitude after the Fourier transform and dk is the k-th value of the historical order volume data. The feature set is obtained by reading the data processed by the time-frequency transformation algorithm.
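As an illustration of the time-frequency step, the sketch below computes the amplitude spectrum of an order-volume sequence with the standard discrete Fourier transform. It assumes the textbook definition, amplitude k = |Σ_j d_j · e^(−2πi·jk/N)|, which may differ in scaling or indexing from the patent's own formula; the function name is hypothetical.

```python
import cmath


def dft_amplitudes(seq):
    """Amplitude of each discrete Fourier coefficient of `seq`.

    Direct O(N^2) evaluation, adequate for short illustrative sequences;
    a production pipeline would normally use an FFT routine instead."""
    n = len(seq)
    amplitudes = []
    for k in range(n):
        coeff = sum(
            d * cmath.exp(-2j * cmath.pi * j * k / n)
            for j, d in enumerate(seq)
        )
        amplitudes.append(abs(coeff))
    return amplitudes
```

A sequence that repeats every four intervals, such as `[1, 0, -1, 0]`, concentrates its amplitude in the coefficients for that period, which is how daily or weekly cycles in order volume would surface as features.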
104. Performing overall training on the data in the feature set;
Overall training on the data in the feature set is specifically as follows: the feature set is read and its data is input into a first machine learning model for training, the first machine learning model being an XGBoost model; the generated data is merged into the feature set; the data is then extracted again and input into a second machine learning model for training, the second machine learning model being a LightGBM model. Training with the XGBoost model and the LightGBM model yields the prediction result for the low-order-volume regions in the feature set, recorded as predict1, and the corresponding mean square error, recorded as error1; the generated data is then merged into the feature set.
The mean square error is calculated as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(D_i - \hat{D}_i\right)^2$$
where $D_i$ is the true value of the data, $\hat{D}_i$ is the predicted value of the data, and $n$ is the number of samples.
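The two-stage flow of step 104, in which the output of the first model is merged back into the feature set before the second model is trained, can be sketched as follows. To keep the example dependency-free, two toy models stand in for XGBoost and LightGBM: a per-key mean predictor and a one-variable least-squares fit. The column names (`hour`, `orders`, `model1_pred`) and all function names are hypothetical.

```python
from statistics import mean


def fit_mean_by_key(rows, key, target):
    """Stand-in for the first model: predicts the mean target per key value."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r[target])
    table = {k: mean(v) for k, v in groups.items()}
    overall = mean(r[target] for r in rows)
    return lambda r: table.get(r[key], overall)


def fit_linear(rows, feature, target):
    """Stand-in for the second model: least squares on a single feature."""
    xs = [r[feature] for r in rows]
    ys = [r[target] for r in rows]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / var if var else 0.0
    return lambda r: my + slope * (r[feature] - mx)


def overall_training(rows):
    """Train model 1, merge its output into the feature set, train model 2."""
    rows = [dict(r) for r in rows]           # work on a copy of the feature set
    model1 = fit_mean_by_key(rows, "hour", "orders")
    for r in rows:
        r["model1_pred"] = model1(r)         # generated data merged as a feature
    model2 = fit_linear(rows, "model1_pred", "orders")
    for r in rows:
        r["predict1"] = model2(r)            # final output of overall training
    return rows
```

The point of the sketch is only the data flow: the second model sees the first model's prediction as an extra feature. Real gradient-boosted models would replace both stand-ins.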
105. Performing secondary training on the data of high-order-volume regions in the feature set;
Secondary training on the data of high-order-volume regions in the feature set is specifically as follows: the feature set is read, and the data labelled as belonging to a high-order-volume region is extracted and input into a deep learning model for training. In a preferred embodiment of the invention, the deep learning model is an LSTM model. Training with the LSTM model yields the prediction result for the high-order-volume regions in the feature set, recorded as predict2, and the corresponding mean square error, recorded as error2; the generated data is then merged into the feature set.
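Before a sequence model such as an LSTM can be trained, the high-order-volume records have to be extracted and arranged into fixed-length input windows. The sketch below shows one common way to do that; it is an assumption about data preparation, not something the patent specifies, and the field names (`region`, `is_high_volume`, `orders`) and the `lookback` parameter are hypothetical. The LSTM itself is omitted.

```python
def high_volume_windows(rows, lookback):
    """Build (region, input_window, next_value) training samples from the rows
    labelled as high-order-volume. `rows` must already be sorted by time."""
    by_region = {}
    for r in rows:
        if r["is_high_volume"]:
            by_region.setdefault(r["region"], []).append(r["orders"])

    samples = []
    for region, series in by_region.items():
        # Slide a window of `lookback` values; the value that follows is the target.
        for i in range(len(series) - lookback):
            samples.append((region, series[i:i + lookback], series[i + lookback]))
    return samples
```

Each sample pairs the last `lookback` order volumes of a region with the value that follows, which is the supervised form a recurrent model is typically trained on.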
106. Performing multi-model fusion on the results of the two trainings to obtain a prediction model;
Multi-model fusion of the results of the two trainings to obtain a prediction model is specifically as follows: the prediction results and mean square errors of the overall training and the secondary training are obtained respectively, and the fused prediction result is obtained as a weighted average of the two prediction results.
A stable prediction model is then obtained through repeated iterative computation.
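The fusion step can be sketched as a weighted average of predict1 and predict2. The weighting rule used here, weights inversely proportional to each model's mean square error, is an assumption chosen for illustration; the text only states that a weighted average is taken. The sketch also assumes both prediction lists refer to the same targets. Function names are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between true and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fuse(predict1, predict2, error1, error2):
    """Weighted average of two prediction lists.

    Assumed rule: each weight is inversely proportional to that model's MSE,
    so w1 = error2 / (error1 + error2) and w2 = 1 - w1."""
    if error1 == 0 and error2 == 0:
        w1 = 0.5                      # both models are exact: weight equally
    else:
        w1 = error2 / (error1 + error2)
    w2 = 1.0 - w1
    return [w1 * a + w2 * b for a, b in zip(predict1, predict2)]
```

With error1 = 1 and error2 = 3, the first model receives weight 0.75, so the lower-error model dominates the fused prediction.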
107. Inputting current associated data to obtain a predicted future ride-hailing demand result.
The current associated data, which includes but is not limited to historical order volume data, order weather data, holiday information data, regional population density data, and regional residential-community density data, is input into the prediction model, and the predicted ride-hailing demand of each region in a specified future time period is calculated.
A system for predicting ride-hailing demand for online ride-hailing services, the system comprising:
a reading module 1 for reading a plurality of associated data that influence ride-hailing demand;
a repair module 2 for repairing abnormal and missing data in the associated data;
a feature module 3 for calculating the features of each type of associated data and generating a feature set;
an overall training module 4 for performing overall training on the data in the feature set;
a secondary training module 5 for performing secondary training on the data of high-order-volume regions in the feature set;
a fusion module 6 for performing multi-model fusion on the results of the two trainings to obtain a prediction model;
and a prediction module 7 for inputting current associated data and obtaining a predicted future ride-hailing demand result.
The prediction method and system offer high accuracy, high operating efficiency, and resistance to overfitting. Machine learning models are built from a plurality of associated data, which gives higher accuracy than a simple order-volume calculation. The feature sets corresponding to the associated data are processed separately and different training models are selected according to the magnitude of the data, which is more efficient than general operation, reduces the complexity of each individual training run, and reduces overfitting. This addresses the low accuracy, low operating efficiency, and tendency to overfit of existing prediction techniques and meets the need for ride-hailing demand prediction.
The present invention is not limited to the preferred embodiments described above. Any modification, equivalent variation, or adaptation made to the above embodiments according to the technical principles of the present invention falls within the scope of the technical scheme of the present invention.

Claims (5)

CN202110711952.4A | 2021-06-25 | 2021-06-25 | Prediction method and system for taxi taking demand of network taxi taking | Active | CN113469739B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110711952.4A / CN113469739B (en) | 2021-06-25 | 2021-06-25 | Prediction method and system for taxi taking demand of network taxi taking

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110711952.4A / CN113469739B (en) | 2021-06-25 | 2021-06-25 | Prediction method and system for taxi taking demand of network taxi taking

Publications (2)

Publication Number | Publication Date
CN113469739A (en) | 2021-10-01
CN113469739B (en) | 2024-05-28

Family

ID=77873006

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202110711952.4A | Active | CN113469739B (en) | 2021-06-25 | 2021-06-25 | Prediction method and system for taxi taking demand of network taxi taking

Country Status (1)

Country | Link
CN (1) | CN113469739B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10733515B1 (en) * | 2017-02-21 | 2020-08-04 | Amazon Technologies, Inc. | Imputing missing values in machine learning models
CN109117973A (en) * | 2017-06-26 | 2019-01-01 | Beijing Didi Infinity Technology and Development Co., Ltd. (北京嘀嘀无限科技发展有限公司) | Method and device for predicting online ride-hailing order volume
CN110968767A (en) * | 2018-09-28 | 2020-04-07 | Beijing Didi Infinity Technology and Development Co., Ltd. (北京嘀嘀无限科技发展有限公司) | Ranking engine training method and device, and business card ranking method and device
CN110490365A (en) * | 2019-07-12 | 2019-11-22 | Sichuan University (四川大学) | Method for predicting online ride-hailing order volume based on multi-source data fusion
CN110599767A (en) * | 2019-09-04 | 2019-12-20 | Guangdong University of Technology (广东工业大学) | Long-term and short-term prediction method based on network taxi appointment travel demands
CN111199343A (en) * | 2019-12-24 | 2020-05-26 | Shanghai University (上海大学) | Multi-model fusion tobacco market supervision abnormal data mining method
CN112330215A (en) * | 2020-11-26 | 2021-02-05 | Changsha University of Science and Technology (长沙理工大学) | Urban vehicle demand prediction method, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
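Title
Taxi demand prediction based on a deep CNN-LSTM-ResNet combined model (基于深度CNN-LSTM-ResNet组合模型的出租车需求预测); Duan Zongtao; Zhang Kai; Yang Yun; Ni Yuanyuan; Saurab Bajgain; Journal of Transportation Systems Engineering and Information Technology (交通运输系统工程与信息); 2018-08-15 (No. 04); full text *
Short-term prediction of the online ride-hailing supply-demand gap based on deep learning (基于深度学习的网约车供需缺口短时预测研究); Gu Yuanli; Li Meng; Rui Xiaoping; Lu Wenqi; Wang Shuo; Journal of Transportation Systems Engineering and Information Technology (交通运输系统工程与信息); 2019-04-15 (No. 02); abstract, sections 1-2 *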

Also Published As

Publication number | Publication date
CN113469739A (en) | 2021-10-01

Similar Documents

Publication | Title
CN111653088B (en) | Vehicle driving quantity prediction model construction method, prediction method and system
CN110555561B (en) | Medium-and-long-term runoff ensemble forecasting method
CN110599767A (en) | Long-term and short-term prediction method based on network taxi appointment travel demands
CN113538067B (en) | Inter-city network vehicle-closing demand prediction method and system based on machine learning
CN111192090A (en) | Seat allocation method and device for flight, storage medium and electronic equipment
CN108416619B (en) | Consumption interval time prediction method and device and readable storage medium
CN108415885A (en) | The real-time bus passenger flow prediction technique returned based on neighbour
CN117912235B (en) | Planning data processing method and system for smart city
CN112926809B (en) | Flight flow prediction method and system based on clustering and improved xgboost
CN113674524A (en) | Multi-scale short-term traffic flow prediction modeling, prediction method and system based on LSTM-GASVR
CN113537596A (en) | Short-time passenger flow prediction method for new line station of urban rail transit
CN111985731A (en) | Method and system for predicting number of people at urban public transport station
CN116862573A (en) | Intercity ride-hailing short-term travel demand forecasting method and system based on incremental training
CN116663742A (en) | Regional capacity prediction method based on multi-factor and model fusion
CN112785089A (en) | Agent service configuration method and device, electronic equipment and storage medium
CN110020666B (en) | A public transportation advertisement delivery method and system based on passenger behavior pattern
CN113469739B (en) | Prediction method and system for taxi taking demand of network taxi taking
CN117314504B (en) | A public transportation passenger flow prediction method and system
CN114139984A (en) | Risk prediction method of urban traffic accident based on collaborative perception of traffic flow and accident
CN117196695B (en) | Target product sales data prediction method and device
CN118645017A (en) | A track prediction method in autonomous operation mode based on time-frequency analysis
JP3268520B2 (en) | How to forecast gas demand
CN116843374A (en) | Multi-factor sales prediction system for quick-elimination commodity
Widhalm et al. | Robust road link speed estimates for sparse or missing probe vehicle data
Widiyaningtyas et al. | Use of ARIMA Method To Predict The Number of Train Passenger In Malang City

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
