CN117851809B

Info

Publication number: CN117851809B
Application number: CN202410018740.1A
Authority: CN
Inventors: 李�昊; 李志斌; 郑茂程
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2024-01-05
Filing date: 2024-01-05
Publication date: 2024-06-18
Anticipated expiration: 2044-01-05
Also published as: CN117851809A

Abstract

The invention discloses a feature extraction method based on reinforcement learning of mobile phone signaling historical data. First, a historical database is constructed from the mobile phone signaling historical data of a recent period. Second, two-stage reinforcement learning is performed on the mobile phone signaling data: the first stage is data characteristic learning, which outputs highly credible complete trip sequence data of the user; the second stage is travel behavior learning, which outputs motion state fuzzy weights for three motion states and constructs a fuzzy travel membership set. Third, a three-way Gaussian mixture clustering algorithm based on fuzzy travel membership is provided to cluster and partition the user's complete trip sequence data. Finally, a double-dynamic POI similarity mapping algorithm is provided, which dynamically adjusts algorithm parameters using land-use information to determine the user's accurate trip origin-destination points and complete travel chain. The method improves the utilization rate of users' mobile phone signaling data and effectively addresses the accuracy of user travel analysis and data statistics.

Description

Feature extraction method based on mobile phone signaling historical data reinforcement learning

Technical Field

The invention relates to the field of mobile phone signaling data processing, in particular to a feature extraction method based on mobile phone signaling historical data reinforcement learning.

Background

As mobile phones have become increasingly capable, people have grown ever more dependent on them in daily life, for example for mobile payment, navigation and information lookup. A mobile phone therefore accompanies its user almost around the clock, and its position can largely represent the position of the user. For this reason, mobile phone signaling data is widely used as a data source in travel information acquisition.

At present, travel analysis techniques based on mobile phone signaling identify trip origin-destination points in two main ways: the first judges the trajectory from handover and location update information in the signaling data and determines the origin-destination points from it; the second determines them with a hierarchical or density-based clustering algorithm.

However, these widely used techniques still do not solve the problem well. In particular, when the coordinates of a mobile phone base station are identified with error, it is difficult to obtain the user's geographic information accurately. In addition, the mismatch between base station area division and traffic cell division also causes errors in identifying the user's position, gives rise to data drift and the ping-pong phenomenon, and further aggravates errors in identifying trip origin-destination points.

On the algorithm side, existing methods for processing mobile phone signaling data also suffer from insufficient handling of noise data, unclear assignment of data to travel modes, ill-defined threshold settings, unsatisfactory clustering of point location data, and difficulty in recognizing travel mode and mode choice. These shortcomings lead to low data utilization and to travel analysis results that do not match reality, so a new method needs to be explored.

Disclosure of Invention

The problem to be solved by the invention is to provide a feature extraction method based on reinforcement learning of mobile phone signaling historical data, so as to improve the accuracy of user travel analysis and data statistics and to raise the utilization rate of mobile phone signaling data.

The invention adopts the following technical scheme: a feature extraction method based on mobile phone signaling history data reinforcement learning comprises the following steps:

S10, constructing a historical database from the mobile phone signaling historical data of a recent period, and using the historical database for two-stage reinforcement learning on the mobile phone signaling data;

S20, first-stage reinforcement learning (data characteristic learning): constructing a multichannel convolution Bayesian learning algorithm, learning the signal strength, time length and coordinate information characteristics of the mobile phone signaling historical data, eliminating drift data and ping-pong data present in the user's initial mobile phone signaling data, completing missing data points, and outputting highly credible complete trip sequence data of the user;

S30, second-stage reinforcement learning (travel behavior learning): defining a travel coincidence degree, learning from the historical data that has a high travel coincidence degree with the user's complete trip sequence data, decomposing the data into three motion states (travel, stillness and small-range activity), outputting the motion state fuzzy weights of the three motion states, and constructing a fuzzy travel membership set;

S40, in combination with the fuzzy travel membership set, providing a three-way Gaussian mixture clustering algorithm based on fuzzy travel membership, clustering and partitioning the user's complete trip sequence data, extracting a travel set and a stay point set comprising two subsets (a static set and an edge motion set), and calculating the coverage area of each stay point set;

S50, reading land-use information within the coverage area of each stay point set by means of the API (application program interface) of a map platform, providing a double-dynamic POI (point of interest) similarity mapping algorithm, performing geographic information mapping and POI matching on the stay point sets while dynamically adjusting algorithm parameters using the land-use information, and outputting the user's accurate origin-destination points and complete travel chain;

S60, reading and processing all user data, outputting the same-day travel OD matrix between traffic cells divided by time period, compiling statistics on user travel modes and patterns, and feeding the user data results back to the historical database in S10 for parameter updating.
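The OD aggregation in S60 can be pictured with a small sketch. It assumes that each trip has already been reduced to an origin traffic cell, a destination traffic cell and a departure hour; the tuple layout, the cell labels and the helper name `build_od_matrices` are hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict

def build_od_matrices(trips, slot_hours=1):
    """Count trips per (origin, destination) pair within each time slot.

    trips: iterable of (origin_cell, dest_cell, departure_hour) tuples
           (an assumed, simplified trip representation).
    Returns {slot_index: Counter({(origin_cell, dest_cell): trip_count})}.
    """
    matrices = defaultdict(Counter)
    for origin, dest, hour in trips:
        slot = int(hour // slot_hours)  # index of the time slot the trip starts in
        matrices[slot][(origin, dest)] += 1
    return dict(matrices)

# Made-up trips between hypothetical traffic cells A, B and C.
trips = [("A", "B", 7.5), ("A", "B", 7.9), ("B", "A", 18.2), ("A", "C", 8.1)]
od = build_od_matrices(trips)
```

Each slot's counter is a sparse OD matrix: pairs that never occur are simply absent, which keeps the structure small when most cell pairs have no trips.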

Specifically, according to the characteristics of mobile phone signaling historical data, each point location record stored in the historical database of S10 comprises: a user identifier; the user's mobile phone number; a timestamp containing date and time information; position information containing longitude and latitude coordinates and the home base station; signal strength information reflecting the signal quality between the base station and the mobile phone; and three-valued dynamic/static (motion state) weights. The record format of the data is as follows:

MS(U₁,U₂,…,U_s)

U_s(i)＝{U_s,U_sp,DATE,MM,(L_o,L_a,L_CI),SI,Tra(j)}^T

Wherein MS(U_s) is a complete sample of user mobile phone signaling historical data, U_s(i) is the i-th point location data of the user trip sequence data U_s, U_sp is the mobile phone number of user U_s, DATE is date data in year/month/day format, MM is time length data in hour/minute/second format, L_o is the longitude coordinate of the user, L_a is the latitude coordinate of the user, L_CI is the base station cell home coordinate, SI is the signal strength, Tra(j) is the j-th motion state fuzzy weight value, and T denotes matrix transposition.
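As an illustration of the record format above, one point location record could be held in a small data class. This is only a sketch: the field names, the field types (in particular a string for L_CI) and the sample values are assumptions made for readability, not part of the patent.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class SignalingPoint:
    """One point location record U_s(i); names and types are illustrative."""
    user_id: str                      # U_s
    phone_number: str                 # U_sp
    date: str                         # DATE, year/month/day
    time: str                         # MM, hour/minute/second
    longitude: float                  # L_o
    latitude: float                   # L_a
    cell_home: str                    # L_CI (string type is an assumption)
    signal_strength: float            # SI
    tra: Tuple[float, float, float]   # Tra(1), Tra(2), Tra(3)

    def position(self):
        """Return the position triple (L_o, L_a, L_CI)."""
        return (self.longitude, self.latitude, self.cell_home)

# Placeholder values only; none of them come from the patent.
point = SignalingPoint(
    user_id="U1", phone_number="<masked>", date="2024/01/05", time="08/30/00",
    longitude=118.79, latitude=32.06, cell_home="cell-001",
    signal_strength=-85.0, tra=(0.6, 0.3, 0.1),
)
```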

Specifically, in S20, drift data refers to noise in which a user's recorded data suddenly jumps to an abnormal value and then switches back to the original value within a short time; ping-pong data refers to noise in which a user's data is recorded as switching back and forth between the coverage areas of two base stations. The first stage of the two-stage reinforcement learning on mobile phone signaling data comprises the following steps:

S21, performing format conversion on the historical data in S10 and the initial data of the user in S20, normalizing the timestamp data, longitude and latitude coordinates and signal strength information, and ensuring that the scales are consistent;

S22, constructing a multichannel convolution Bayesian learning algorithm, learning historical data, calculating abnormal scores of user data according to a learning result, and eliminating abnormal data and carrying out Bayesian processing according to a dynamic threshold.
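The normalization in S21 is not specified further in the text. A minimal sketch, assuming plain min-max scaling of each channel onto [0, 1], is given below; the dictionary keys and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant channel carries no scale information; map it to zeros.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def normalize_channels(records, channels):
    """Return a copy of the records with each named channel min-max scaled."""
    scaled = [dict(r) for r in records]
    for ch in channels:
        normalized = min_max_normalize([r[ch] for r in records])
        for row, value in zip(scaled, normalized):
            row[ch] = value
    return scaled

# Made-up records: timestamp in seconds, coordinates in degrees, SI in dBm.
records = [
    {"t": 0.0, "lon": 118.70, "lat": 32.00, "si": -100.0},
    {"t": 30.0, "lon": 118.75, "lat": 32.05, "si": -80.0},
    {"t": 60.0, "lon": 118.80, "lat": 32.10, "si": -60.0},
]
scaled = normalize_channels(records, ["t", "lon", "lat", "si"])
```

Scaling every channel to the same range keeps the signal strength, time and coordinate channels comparable before they are combined in later steps.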

The multichannel convolution Bayesian learning algorithm comprises the following steps:

S221, taking the mobile phone signaling historical data as input to perform feature learning, performing CNN first-order learning on a signal intensity channel, and outputting a one-dimensional convolution feature learning result, wherein the one-dimensional convolution feature learning result is expressed as follows:

Wherein X₁(i) is the one-dimensional convolution feature learning result of the i-th point location data, b_SI is the signal strength bias value, ω_SI is the signal strength weight, P_SIl(i) is the signal loss probability of the i-th point location data, and N is the quantity of input signaling data;

S222, on the basis of S221, performing CNN second-order learning in combination with the time channel, and outputting a two-dimensional convolution feature learning result, expressed as follows:

Wherein X₂(i) is the two-dimensional convolution feature learning result of the i-th point location data, b_TIM is the timestamp bias value, σ_TIM is the time compensation parameter, ω_TIM is the time weight, P_TIMl(i) is the time loss probability of the i-th point location data, […] is the two-dimensional convolution input, and n and p are the quantities of input signaling data;

S223, on the basis of S222, performing CNN third-order learning in combination with the coordinate channel, and outputting a three-dimensional convolution feature learning result, expressed as follows:

Wherein X₃(i) is the three-dimensional convolution feature learning result of the i-th point location data, b_LOA is the coordinate bias value, σ_LOA is the coordinate balance parameter, ω_LOA is the coordinate weight, P_LOAl(i) is the coordinate loss probability of the i-th point location data, […] is the three-dimensional convolution input, and n, p and q are the quantities of input signaling data;

S224, introducing user initial data, and calculating anomaly score for each point location data:

Wherein:

AS(i) is the anomaly score of the i-th point location data of the user data U_s, AS_m(i) is the anomaly residual in the m-th dimension, ρ_m is the dimension index, X_m(i) is the convolution feature learning result in the m-th dimension, and J is the total number of historical records, with j = 1, 2, …, J;

U_si(SI), U_si(MM) and U_si(LOA) are the signal strength, the time length data and the coordinate value of the i-th point location data in the user data, respectively, and H_j(SI), H_j(MM) and H_j(LOA) are the signal strength, the time length information and the coordinate information of the j-th historical record, respectively;

S_d, D_d and T_d are the one-dimensional, two-dimensional and three-dimensional data forms, and ω_s, ω_d and ω_t are the one-dimensional, two-dimensional and three-dimensional anomaly correction weights.

S225, introducing a Bayesian posterior-observation smoothing prediction, correcting and eliminating data on the basis of the historical data, and outputting the corrected point location data of the user:

Wherein,

[…] is each point location datum of the user's complete trip sequence, MAXA is the maximum allowable anomaly value, A(i) is the dynamic anomaly segmentation value of the i-th point location data, α and β are dynamic parameters, Bay(U_s(i)) denotes the Bayesian posterior-observation smoothing prediction applied to the i-th data point, and […] is the posterior observation probability of data Y(j) and data X(i);
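The piecewise rule of S225 is not reproduced in this text, so the following is only one possible reading, stated as an assumption: a point whose anomaly score exceeds MAXA is eliminated, a point whose score lies between its dynamic value A(i) and MAXA is corrected, and every other point is kept. The correction used here is a plain neighbour average standing in for the Bayesian smoothing prediction, which this sketch does not implement.

```python
def clean_sequence(values, scores, a_dyn, max_a):
    """Filter a 1-D sequence by anomaly score (illustrative reading of S225).

    values: raw observations of one channel
    scores: anomaly score AS(i) per observation, taken as given
    a_dyn:  per-point dynamic segmentation value A(i)
    max_a:  maximum allowable anomaly value MAXA
    """
    cleaned = []
    for i, (value, score) in enumerate(zip(values, scores)):
        if score > max_a:
            continue  # assumed rule: eliminate the point entirely
        if score > a_dyn[i]:
            # Assumed rule: correct the point. The neighbour average below is
            # a stand-in for Bay(U_s(i)), not the patent's Bayesian step.
            previous = cleaned[-1] if cleaned else value
            following = values[i + 1] if i + 1 < len(values) else value
            cleaned.append((previous + following) / 2)
        else:
            cleaned.append(value)
    return cleaned

# Made-up numbers: index 2 is mildly anomalous, index 4 is strongly anomalous.
values = [1.0, 2.0, 9.0, 4.0, 50.0, 6.0]
scores = [0.1, 0.1, 0.6, 0.1, 0.95, 0.1]
cleaned = clean_sequence(values, scores, a_dyn=[0.5] * 6, max_a=0.9)
```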

Specifically, the second stage of the two-stage reinforcement learning on mobile phone signaling data comprises the following steps:

S31, retrieving data from the historical database, and calculating the travel coincidence degree CR(i) between each historical record and the user's complete trip sequence data from S20:

Wherein CR(i) is the coincidence degree between the i-th historical record and the user data, H(i) is the extracted i-th historical record, […] is the coincidence index scale value, ε is the spatial similarity decay parameter, […] denotes the X information in the user's trip sequence data, and XH(i) denotes the X information in the i-th historical record;

S32, setting a coincidence degree dividing threshold CR_T according to the data volume, and extracting the historical records that satisfy CR(i) ≥ CR_T to construct an available data set HC, specifically expressed as:

HC＝{HC₁,HC₂,HC₃,…,HC_k}

S33, constructing a travel covariance matrix incorporating the coincidence weight according to the available data set, specifically expressed as:

Wherein, R_cov HC is a travel covariance matrix incorporating coincidence weights, R_HC is a matrix element, CR_T is a coincidence degree dividing threshold, COV is a covariance function, K is the data volume contained in the dataset HC, and k=1, 2, …, K;

S34, adopting an algorithm of combining PCA with K-means clustering, carrying out principal component analysis on R_cov HC, dividing data into three motion states of travel, stillness and small-range activity on a new feature matrix, and giving a fuzzy weight value Tra (i), wherein the specific expression is as follows:

Wherein,

Tra(1), Tra(2) and Tra(3) are the fuzzy weight values of the three motion states of travel, stillness and small-range activity, respectively; […] are the time fuzzy balance coefficients of the three motion states of travel, stillness and small-range activity, respectively; λC = {λC₁, λC₂, …, λC_K} are the coincidence characteristic roots of R_cov HC; and N_m is the quantity of data in the m-th class of the clustering result;

S35, constructing a fuzzy travel membership parameter set μ_f(i) according to the fuzzy weight values of the three motion states in S34:

Wherein μ_f(1), μ_f(2) and μ_f(3) are the fuzzy membership parameters of the three states of travel, stillness and small-range activity, respectively; […] is the spatial distance between two consecutive data points of the user's complete trip sequence data; εT, ζT and ηT are the first-order, second-order and third-order fuzzy matching coefficients of the travel state; εS, ζS and ηS are the first-order, second-order and third-order fuzzy matching coefficients of the static state; and εA, ζA and ηA are the first-order, second-order and third-order fuzzy matching coefficients of the small-range activity state.
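The selection in S32 amounts to a threshold filter over the coincidence degrees. A minimal sketch follows, taking the CR(i) values as already computed; the helper name `build_available_set`, the record labels and the numbers are hypothetical.

```python
def build_available_set(history, cr_values, cr_threshold):
    """Keep the historical records whose coincidence degree reaches CR_T."""
    if len(history) != len(cr_values):
        raise ValueError("one CR value is needed per historical record")
    return [h for h, cr in zip(history, cr_values) if cr >= cr_threshold]

# Made-up historical records and coincidence degrees.
history = ["H1", "H2", "H3", "H4"]
cr_values = [0.91, 0.42, 0.75, 0.60]
hc = build_available_set(history, cr_values, cr_threshold=0.6)
```

The comparison is inclusive, matching the condition CR(i) ≥ CR_T, so a record exactly at the threshold is retained.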

Specifically, in step S40, the three-way Gaussian mixture clustering algorithm based on fuzzy travel membership (FTT-GMM) comprises the following steps:

S41, adopting the idea of three-way decisions and combining the three states of travel behavior described in S30 to construct a three-weight GMM posterior probability function, and iterating the function parameters with a CS algorithm to obtain the three-weight GMM posterior probability value […] of each point location datum in the user data, specifically expressed as follows:

Wherein,

[…] is the three-weight GMM posterior probability that the i-th point location datum of the user's complete trip sequence […] belongs to the k-th Gaussian component, with three Gaussian components in total; p_i is the probability that datum […] belongs to the k-th Gaussian component; p_t(x) is the three-weight GMM probability density function; Tra(k) is the fuzzy weight value of the k-th Gaussian component; […] is the vector-dimension mean of the k-th Gaussian component; Σ_k is the tri-state covariance matrix of the k-th Gaussian component; and […] is the data dimension coefficient;

S42, defining the three-way GMM cluster division thresholds based on the fuzzy travel membership parameter set;

Wherein σ_TLV is the upper threshold for dividing the three travel states and σ_LLV is the lower threshold; θ_Tf, θ_Sf and θ_Af are the loss costs of assigning […] to travel, stillness and small-range activity, respectively, under a fuzzy decision; θ_Tc, θ_Sc and θ_Ac are the loss costs of assigning […] to travel, stillness and small-range activity, respectively, under a clear decision; Ω is the fuzzy state decision index; and […] is the clear state decision index;

S43, clustering each point location datum, and dividing the data into the three classes of travel, static and edge motion according to the thresholds in S42;

Wherein,

TC_U＝{TC₁,TC₂,TC₃}

TC_U is a clustering result set, TC₁ is a travel set, TC₂ is a static set, and TC₃ is an edge motion set;

S44, merging the static set and the edge motion set to construct the stay point set, and calculating the coverage area weighted radius SR_i of each stay point set, specifically expressed as:

S_U＝{TC₂}∪{TC₃}

Wherein S_U is the stay point set, SR_i is the coverage area weighted radius of the i-th stay point set, max⟨·⟩ denotes the maximum operation, sr_ij is the two-dimensional coverage radius of the j-th point datum in the i-th stay point set, f(·) denotes the range mapping normalization, v is the conversion rate control parameter, τ_i is the point set clustering attribution coefficient, and ‖d_j − D_i‖² denotes the Euclidean distance between the j-th point datum in the i-th stay point set and the center point D_i;
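Steps S41 to S43 can be pictured with a one-dimensional sketch. It uses the standard Gaussian mixture posterior with the fuzzy weights Tra(k) in the role of mixing weights, and then applies an upper and a lower threshold to the travel posterior. Several things here are assumptions rather than the patent's formulas: the single feature (read as a speed-like quantity), the hand-set component parameters (the CS parameter iteration is not implemented), and the mapping of the three threshold regions onto travel, static and edge motion.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def weighted_posteriors(x, weights, means, variances):
    """Posterior of each component, with Tra(k) used as the mixing weight."""
    joint = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

def three_way_split(points, weights, means, variances, upper, lower):
    """Assign each point by its travel posterior (component 0).

    Assumed rule: >= upper is travel, <= lower is static, and the region
    in between is treated as edge motion.
    """
    travel, static, edge = [], [], []
    for x in points:
        p_travel = weighted_posteriors(x, weights, means, variances)[0]
        if p_travel >= upper:
            travel.append(x)
        elif p_travel <= lower:
            static.append(x)
        else:
            edge.append(x)
    return travel, static, edge

# Hand-set, made-up parameters for travel, stillness and small-range activity.
weights = (0.5, 0.3, 0.2)      # Tra(1), Tra(2), Tra(3)
means = (30.0, 0.0, 3.0)
variances = (100.0, 1.0, 4.0)
travel, static, edge = three_way_split(
    [35.0, 0.2, 8.0], weights, means, variances, upper=0.7, lower=0.3)
```

The middle region is what distinguishes a three-way decision from a hard two-class cut: points that are neither clearly travelling nor clearly stationary are deferred to a separate set instead of being forced into one of the two.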

Specifically, in step S50, the dual dynamic POI similarity mapping algorithm includes the following steps:

S51, taking the longitude and latitude coordinates of the center point of the stay point set as the center, reading the land-use information within the coverage area weighted radius of the stay point set, including population, building rectangularity, degree of urbanization, number of POIs and daily average travel volume, and defining the matching domain density Lρ, specifically expressed as:

Wherein Lρ is the matching domain density, ADT is the daily average travel volume in the domain, AR is the building rectangularity, N_POI is the number of POIs in the domain, peo is the population in the domain, SR is the coverage area weighted radius, CLL is the urbanization coefficient in the domain, and λ is the dimension normalization coefficient;

S52, dynamically adjusting the matching domain: the matching domain radius SR_ρ of the stay point set is corrected according to the matching domain density described in S51, specifically expressed as:

Wherein SR_ρ is the matching domain radius, SR(·) denotes the calculation of the coverage area weighted radius, Lρ is the matching domain density, sr_i is the two-dimensional coverage radius of the i-th point location, τ_i is the cluster attribution coefficient, Δr is the radius adjustment step, and Tρ is the density dividing value;

POIE_i＝max<SIM(k)>

PATH_Us＝{POIE₁,POIE₂,…,POIE_n}

Wherein SIM(k) is the similarity estimate of the k-th POI in the matching domain of the i-th stay point set, […] is the coordinate vector of the k-th POI in the i-th matching domain, […] is the coordinate vector of the j-th point datum in the i-th stay point set, ‖·‖ denotes the vector norm, […] is the region radius of the k-th POI, sr_ij is the two-dimensional coverage radius of the j-th point datum in the i-th stay point set, ω_Tj is the state set inclination weight, ed(·) denotes the similarity-adjusted Euclidean distance, POIE_i is the POI with the maximum similarity estimate, taken as the accurate i-th origin-destination point, and PATH_Us is the complete travel chain of user Us;
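The two expressions POIE_i = max⟨SIM(k)⟩ and PATH_Us = {POIE₁, …, POIE_n} can be sketched as an argmax per stay point set followed by concatenation. The similarity estimates are taken as given here, and the POI names and values are made up.

```python
def best_poi(candidates):
    """Return the name of the candidate POI with the largest similarity estimate.

    candidates: list of (poi_name, similarity_estimate) pairs for one stay
    point set (an assumed, simplified representation).
    """
    name, _ = max(candidates, key=lambda pair: pair[1])
    return name

def travel_chain(candidates_per_stay_set):
    """Build PATH_Us by picking the best POI of each stay point set in order."""
    return [best_poi(candidates) for candidates in candidates_per_stay_set]

# Made-up candidate sets for two consecutive stay point sets.
candidates_per_stay_set = [
    [("residential block", 0.82), ("supermarket", 0.40)],
    [("office tower", 0.77), ("metro station", 0.69), ("cafe", 0.31)],
]
path = travel_chain(candidates_per_stay_set)
```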

Specifically, in step S60, after all user data has been read and processed, the user's complete trip sequence data […] and the motion state fuzzy weights {Tra} are returned to the historical database of S10 for updating.

The technical scheme of the invention also provides: an electronic device, comprising:

One or more processors;

a storage device having one or more programs stored thereon;

when the one or more programs are executed by the one or more processors, the one or more processors implement any of the feature extraction methods based on reinforcement learning of mobile phone signaling history data described above.

The technical scheme of the invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, realizes the steps in any feature extraction method based on the reinforcement learning of the mobile phone signaling historical data.

Compared with the prior art, the technical scheme provided by the invention has the following technical effects:

On the basis of two-stage reinforcement learning, the feature extraction method based on reinforcement learning of mobile phone signaling historical data improves the accuracy and reliability of mobile phone signaling data as a data source for travel research. The provided clustering algorithm enhances the data clustering effect, and the dynamic information matching mechanism enriches the information available about the user's trip origin-destination points. In addition, the historical database built from the original and the processed user data can be reused for other travel analysis and research in the future. The method thereby raises the utilization rate of users' mobile phone signaling data and effectively addresses the accuracy of user travel analysis and data statistics.

Drawings

Fig. 1 is a flowchart of a trip origin-destination extraction algorithm based on reinforcement learning of mobile phone signaling history data according to an embodiment of the present application;

FIG. 2 is an algorithm flow chart of a trip origin-destination extraction algorithm based on reinforcement learning of mobile phone signaling history data;

FIG. 3 is a comparison of the user's raw data with the finally extracted accurate origin-destination points of the user.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the application are further elaborated below in conjunction with the accompanying drawings. The described embodiments are only a part of the embodiments of the present invention. All other embodiments obtained by those skilled in the art without inventive effort fall within the scope of the invention. The step numbers in the embodiments are set for convenience of illustration only; they do not limit the order of the steps, and the execution order of the steps can be adjusted according to the understanding of those skilled in the art.

The invention discloses a feature extraction method based on reinforcement learning of mobile phone signaling historical data, which comprises the following specific steps as shown in fig. 1:

In one embodiment of the invention, taking the mobile phone signaling data collected in a certain city as an example, a history database is constructed according to the mobile phone signaling history data in the last year, and meanwhile, the mobile phone signaling original data of a certain user in the city is taken as a processing sample to carry out reinforcement learning and feature extraction, and the specific method is as follows:

Step one: user data cleaning and preprocessing.

Historical database creation and user data reading: according to the characteristics of mobile phone signaling data, each point location record stored in the historical database is defined to comprise: a user identifier; the user's mobile phone number; a timestamp containing date and time information; position information containing longitude and latitude coordinates and the home base station; signal strength information reflecting the signal quality between the base station and the mobile phone; and three-valued dynamic/static (motion state) weights. The record format of the data is as follows:

MS(U₁,U₂,…,U_s)

U_s(i)＝{U_s,U_sp,DATE,MM,(L_o,L_a,L_CI),SI,Tra(j)}^T

Wherein MS(U_s) is a complete user data sample, U_s(i) is the i-th point location data of the user trip sequence data U_s, U_sp is the mobile phone number of user U_s, DATE is date data in year/month/day format, MM is time length data in hour/minute/second format, L_o is the longitude coordinate of the user, L_a is the latitude coordinate of the user, L_CI is the base station cell home coordinate, SI is the signal strength, and Tra(j) is the j-th motion state fuzzy weight value.

Step two: two-stage reinforcement learning on mobile phone signaling data.

In this embodiment, two-stage reinforcement learning is performed on the basis of the mobile phone signaling historical data to determine the user's accurate trip origin-destination points and complete travel chain, as shown in fig. 2.

First, the first-stage reinforcement learning (data characteristic learning) is performed.

Firstly, format conversion is carried out on the mobile phone signaling historical data in S10 and the user's initial mobile phone signaling data in S20, and the timestamp data, longitude and latitude coordinates and signal strength information are normalized to ensure consistent scales;

Then, a multichannel convolution Bayesian learning algorithm is constructed, which specifically comprises the following steps:

Firstly, feature learning is performed on the historical data using a CNN: CNN first-order learning is performed on the signal strength channel, outputting a one-dimensional convolution feature learning result; CNN second-order learning is performed in combination with the time channel, outputting a two-dimensional convolution feature learning result; and CNN third-order learning is performed in combination with the coordinate channel, outputting a three-dimensional convolution feature learning result.

Next, introducing user initial mobile phone signaling data, and calculating an anomaly score for each point location data:

Wherein AS(i) is the anomaly score of the i-th point location data of the user data U_s, AS_m(i) is the anomaly residual in the m-th dimension, and ρ_m is the dimension index.

Next, a Bayesian posterior-observation smoothing prediction is introduced, data are corrected and eliminated on the basis of the historical data, and the corrected point location data of the user are output:

Wherein […] is each point location datum of the user's complete trip sequence, and MAXA is the maximum allowable anomaly value;

A(i) is the dynamic anomaly segmentation value of the i-th point location data, and Bay(U_s(i)) denotes the Bayesian posterior-observation smoothing prediction applied to the i-th data point;

Then, the second-stage reinforcement learning (travel behavior learning) is performed.

Firstly, data are retrieved from the historical database, and the travel coincidence degree CR(i) between each historical record and the user's complete trip sequence data is calculated, specifically expressed as:

Wherein CR(i) is the coincidence degree between the i-th historical record and the user's complete trip sequence data, H(i) is the extracted i-th historical record, […] is the coincidence index scale value, ε is the spatial similarity decay parameter, […] denotes the X information in the user's trip sequence data, and XH(i) denotes the X information in the i-th historical record;

Next, a coincidence degree dividing threshold CR_T is set according to the data volume, and the historical records that satisfy CR(i) ≥ CR_T are extracted to construct an available data set HC, specifically expressed as:

HC＝{HC₁,HC₂,HC₃,…,HC_k}

Next, a travel covariance matrix incorporating the coincidence weight is constructed, specifically expressed as:

Wherein R_cov HC is the travel covariance matrix incorporating the coincidence weight, R_HC is a matrix element, and CR_T is the coincidence degree dividing threshold;

Next, an algorithm combining PCA with K-means clustering is adopted: principal component analysis is performed on R_cov HC, the data are divided on the new feature matrix into the three motion states of travel, stillness and small-range activity, and a fuzzy weight value Tra(i) is assigned; Tra(1), Tra(2) and Tra(3) denote the fuzzy weight values of the three motion states of travel, stillness and small-range activity, respectively;

Next, according to the fuzzy weight values of the three motion states, a fuzzy travel membership parameter set μ_f(i) is constructed, specifically expressed as:

Wherein μ_f(1), μ_f(2) and μ_f(3) are the fuzzy membership parameters of the three states of travel, stillness and small-range activity, respectively; […] is the spatial distance between two consecutive data points of the user's complete trip sequence data; εT, ζT and ηT are the first-order, second-order and third-order fuzzy matching coefficients of the travel state; εS, ζS and ηS are the first-order, second-order and third-order fuzzy matching coefficients of the static state; and εA, ζA and ηA are the first-order, second-order and third-order fuzzy matching coefficients of the small-range activity state;

Step three: user origin-destination extraction and geographic information matching.

The user's raw data is compared with the finally extracted accurate origin-destination points in fig. 3. The procedure is as follows:

First, extracting the stay point sets and delimiting their coverage areas:

In this embodiment, on the basis of the fuzzy travel membership set, a three-way Gaussian mixture clustering algorithm based on fuzzy travel membership (FTT-GMM) is provided. The specific process of the FTT-GMM algorithm is as follows:

Firstly, the idea of three-way decisions is adopted and combined with the three states of travel behavior to construct a three-weight GMM posterior probability function, and the function parameters are iterated with a CS algorithm to obtain the three-weight GMM posterior probability value […] of each point location datum in the user data, specifically expressed as follows:

Wherein,

Then, the three-way GMM cluster division thresholds are defined based on the fuzzy travel membership parameter set;

Then, each point location datum is clustered, and the data are divided into the three classes of travel, static and edge motion according to the three-way GMM cluster division thresholds;

Wherein,

TC_U＝{TC₁,TC₂,TC₃}

Finally, the stationary set and the edge motion set are merged to construct the stay point set, and the coverage area weighted radius SR_i of each stay point set is calculated, specifically expressed as follows:

S_U＝{TC₂}∪{TC₃}

Wherein S_U is the stay point set, SR_i is the coverage area weighted radius of the i-th stay point set, max⟨·⟩ denotes the maximum value operation, sr_ij is the two-dimensional coverage radius of the j-th point in the i-th stay point set, f(·) denotes the range mapping normalization operation, v is the transformation rate control parameter, τ_i is the point set clustering attribution coefficient, and ‖d_j−D_i‖² denotes the Euclidean distance operation between the j-th point d_j in the i-th stay point set and the center point D_i.
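The threshold-based division and the merge S_U = {TC₂} ∪ {TC₃} described above can be sketched as follows. This is a minimal illustration and not the patented FTT-GMM procedure: it assumes that each point already carries a scalar posterior probability for the travel component, that an upper and a lower threshold are given, and that high, low and intermediate values map to travel, stationary and edge motion respectively. The function names and this mapping are hypothetical.

```python
from typing import List, Sequence, Tuple


def three_way_partition(
    posteriors: Sequence[float], upper: float, lower: float
) -> Tuple[List[int], List[int], List[int]]:
    """Split point indices into TC1 (travel), TC2 (stationary), TC3 (edge motion).

    Assumption: `posteriors` holds, per point, the posterior probability of the
    travel component; the patent's own three-weight posterior is not reproduced.
    """
    if not lower < upper:
        raise ValueError("lower threshold must be smaller than upper threshold")
    tc1, tc2, tc3 = [], [], []
    for i, p in enumerate(posteriors):
        if p >= upper:
            tc1.append(i)      # confidently travelling
        elif p <= lower:
            tc2.append(i)      # confidently stationary
        else:
            tc3.append(i)      # boundary region: edge motion
    return tc1, tc2, tc3


def stay_point_set(tc2: Sequence[int], tc3: Sequence[int]) -> List[int]:
    """S_U = TC2 ∪ TC3, returned as sorted point indices."""
    return sorted(set(tc2) | set(tc3))


tc1, tc2, tc3 = three_way_partition([0.9, 0.1, 0.5, 0.8, 0.2], upper=0.7, lower=0.3)
s_u = stay_point_set(tc2, tc3)
```

The coverage area weighted radius is deliberately left out of the sketch, because its expression is not reproduced in this text.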

Second, travel origin-destination information matching:

Land use information within the coverage area of each stay point set, including factors such as POIs (points of interest) and population, is read through the API (application programming interface) of a map platform. This embodiment provides a double-dynamic POI similarity mapping algorithm, whose specific process is as follows:

Firstly, taking the longitude and latitude coordinates of the center point of the stay point set as the center, the land use information within the coverage area weighted radius of the stay point set is read, including population, building rectangularity, degree of urbanization, number of POIs and daily average travel volume, and the matching domain density Lρ is defined, specifically expressed as follows:

Wherein ADT is the daily average travel volume in the domain, AR is the building rectangularity, N_POI is the number of POIs in the domain, peo is the population in the domain, SR is the coverage area weighted radius, CLL is the urbanization coefficient in the domain, and λ is the dimension normalization coefficient;

Then, the matching domain is dynamically adjusted: the matching domain radius SR_ρ of the stay point set is corrected according to the matching domain density:

Then, dynamic similarity weighted matching is performed:

POIs within the matching domain radius are extracted to construct a POI candidate set; a dynamic and static set weighted similarity estimation algorithm is provided, which outputs the similarity estimate of every candidate POI; finally, the POI points of the origin-destination points and the complete travel chain are matched, specifically expressed as follows:

POIE_i＝max<SIM(k)>

PATH_Us＝{POIE₁,POIE₂,…,POIE_n}
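The final selection step, POIE_i = max⟨SIM(k)⟩ followed by assembling PATH_Us, can be illustrated with a short sketch. It assumes the similarity estimates SIM(k) have already been computed for each candidate POI of each stay point set. The similarity estimation itself is not reproduced here, and the function names and the POI names in the example are hypothetical.

```python
from typing import Dict, List, Sequence


def match_poi(sim_by_poi: Dict[str, float]) -> str:
    """Return the candidate POI with the largest similarity estimate SIM(k)."""
    if not sim_by_poi:
        raise ValueError("the POI candidate set is empty")
    return max(sim_by_poi, key=sim_by_poi.get)


def build_trip_chain(candidate_sets: Sequence[Dict[str, float]]) -> List[str]:
    """PATH_Us: the matched POI of every stay point set, in visiting order."""
    return [match_poi(candidates) for candidates in candidate_sets]


# Hypothetical similarity estimates for three consecutive stay point sets.
path = build_trip_chain([
    {"residential block": 0.82, "convenience store": 0.41},
    {"office tower": 0.77, "metro station": 0.69, "cafe": 0.30},
    {"shopping mall": 0.58, "car park": 0.55},
])
```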

Step four: data statistics and feedback.

All user data are read and processed; the travel OD matrix among traffic cells for the same day is divided by time and output, and the users' travel modes and patterns are counted. The resulting complete trip sequence data of the user and the motion state fuzzy weights {Tra} are returned to the historical database as a parameter update, and are used to perform the same data learning and data processing on the mobile phone signaling data of the next user. The parameters include the user's complete trip sequence data and the motion state fuzzy weights {Tra}.

The foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are also intended to fall within the scope of the invention.
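As an illustration of the time-divided OD matrix mentioned in step four, the sketch below counts trips between traffic cells per time slot. It assumes each extracted trip has been reduced to an origin cell, a destination cell and a departure hour. The tuple layout, the one-hour default slot and the function name are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Trip = Tuple[str, str, int]  # (origin cell, destination cell, departure hour 0-23)


def od_matrices(
    trips: Iterable[Trip], cells: Sequence[str], slot_hours: int = 1
) -> Dict[int, List[List[int]]]:
    """Return one OD count matrix per time slot.

    Row index = origin cell, column index = destination cell, both in the
    order given by `cells`.
    """
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)
    matrices: Dict[int, List[List[int]]] = {}
    for origin, destination, hour in trips:
        slot = hour // slot_hours
        matrix = matrices.setdefault(slot, [[0] * n for _ in range(n)])
        matrix[index[origin]][index[destination]] += 1
    return matrices


daily = od_matrices(
    [("A", "B", 8), ("A", "B", 8), ("B", "A", 18)], cells=["A", "B"]
)
```

Only slots that contain at least one trip appear in the result, which keeps the output sparse for days with few observations.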

Claims

1. The feature extraction method based on the reinforcement learning of the mobile phone signaling historical data is characterized by comprising the following steps:

2. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 1, wherein each piece of point data stored in the history database includes: a user identifier, the user's mobile phone number, a timestamp record containing date and time information, position information containing longitude and latitude coordinates and the base station attribution, signal strength information, and dynamic and static three-value weights, the recording format of the data being as follows:

MS(U₁,U₂,…,U_s)

U_s(i)＝{U_s,U_sp,DATE,MM,(L_o,L_a,L_CI),SI,Tra(j)}^T

Wherein MS(U_s) is a complete sample of user mobile phone signaling historical data, U_s(i) is the i-th point of the user travel sequence data U_s, U_sp is the mobile phone number of user U_s, DATE is date data in year/month/day format, MM is time length data in hour/minute/second format, L_o is the longitude coordinate of the user, L_a is the latitude coordinate of the user, L_CI is the base station cell attribution coordinate, SI is the signal strength, and Tra(j) is the j-th motion state fuzzy weight value; {X}^T denotes the transpose of the matrix formed by the elements X, where T is the transpose operator.
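For illustration only, the record U_s(i) defined above could be held in a typed structure such as the following. The field names, the string types chosen for DATE, MM and L_CI, and the three-element tuple for Tra(j) are assumptions and not part of the claimed format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SignalingPoint:
    """One point U_s(i) of a user's signaling sequence (field names are hypothetical)."""

    user_id: str                      # U_s
    phone_number: str                 # U_sp
    date: str                         # DATE, year/month/day
    time: str                         # MM, hour/minute/second
    longitude: float                  # L_o
    latitude: float                   # L_a
    cell: str                         # L_CI, base station cell attribution
    signal_strength: float            # SI
    tra: Tuple[float, float, float]   # Tra(1..3): travel, stationary, small-range activity


# Example values are invented purely to show the layout.
point = SignalingPoint(
    user_id="U001",
    phone_number="13800000000",
    date="2024/01/05",
    time="08:30:00",
    longitude=118.80,
    latitude=32.06,
    cell="CI-0421",
    signal_strength=-85.0,
    tra=(0.6, 0.3, 0.1),
)
```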

3. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 2, wherein the drift data is noise data in which a user data record suddenly changes to an abnormal value and then switches back to the original value within a short time; the ping-pong data is noise data in which the user data record switches back and forth between the coverage areas of two base stations; the first reinforcement learning includes the following sub-steps:

S21, performing format conversion on the mobile phone signaling historical data in S10 and the user initial mobile phone signaling data in S20, and normalizing the timestamp data, longitude and latitude coordinates and signal strength information to ensure consistent scales;

S22, constructing a multichannel convolution Bayesian learning algorithm, learning the mobile phone signaling historical data, calculating anomaly scores for the user data according to the learning results, and eliminating abnormal data and performing Bayesian processing according to dynamic thresholds.
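The scale normalization of sub-step S21 can be sketched as below. The claim does not state which normalization is used, so min-max scaling to [0, 1] is an assumption, as are the helper names and the hour:minute:second string format of the time field.

```python
from typing import List, Sequence


def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def min_max(values: Sequence[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant channel maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Each channel (time, longitude, latitude, signal strength) is scaled on its own.
times = min_max([to_seconds(t) for t in ["08:00:00", "08:30:00", "09:00:00"]])
signal = min_max([-100.0, -90.0, -80.0])
```

Scaling each channel independently keeps a channel with a large numeric range, such as the timestamp in seconds, from dominating the others in the later convolution steps.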

4. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 3, wherein the multichannel convolution bayesian learning algorithm in S22 specifically comprises the following steps:

Wherein X₁(i) is the one-dimensional convolution feature learning result of the i-th point, b_SI is the signal strength bias value, ω_SI is the signal strength weight, and P_SIl(i) is the signal loss probability of the i-th point; the expression also uses the one-dimensional convolution input;

S222, performing CNN second-order learning in combination with the time channel, and outputting the two-dimensional convolution feature learning result, expressed as follows:

Wherein X₂(i) is the two-dimensional convolution feature learning result of the i-th point, b_TIM is the timestamp bias value, σ_TIM is the time compensation parameter, ω_TIM is the time weight, and P_TIMl(i) is the time loss probability of the i-th point; the expression also uses the two-dimensional convolution input;

S223, performing CNN third-order learning by combining a coordinate channel, and outputting a three-dimensional convolution characteristic learning result, wherein the three-dimensional convolution characteristic learning result is expressed as follows:

S224, introducing the user initial mobile phone signaling data and calculating an anomaly score for each point:

Wherein:

S_d, D_d and T_d are the one-dimensional, two-dimensional and three-dimensional data forms, respectively, and ω_s, ω_d and ω_t are the one-dimensional, two-dimensional and three-dimensional anomaly correction weights, respectively;

Wherein,

the point data of the user's complete trip sequence are obtained, MAXA is the maximum permitted anomaly value, and α and β are dynamic parameters;

A(i) is the dynamic anomaly segmentation value of the i-th point, bay(U_S(i)) denotes the Bayesian posterior observation smoothing prediction processing applied to the i-th data point, and P_i^j(X(i)|Y(j)) is the posterior observation probability of data X(i) given data Y(j).

5. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 4, wherein the second reinforcement learning comprises the steps of:

S32, setting a coincidence degree dividing threshold CR_T according to the data quantity, extracting historical data meeting the condition that CR (i) is not less than CR_T, and constructing an available data set HC:

HC＝{HC₁,HC₂,HC₃,…,HC_k}

S33, constructing a travel covariance matrix incorporating the coincidence weight according to the available data set, expressed as follows:

S34, adopting an algorithm combining PCA with K-means clustering: principal component analysis is performed on R_cov HC, the data are divided on the new feature matrix into the three motion states of travel, stationary and small-range activity, and fuzzy weight values Tra(i) are assigned to them respectively:

Wherein,

Tra(1), Tra(2) and Tra(3) are the fuzzy weight values of the three motion states of travel, stationary and small-range activity, respectively; the expression also uses the time fuzzy balance coefficients of these three motion states; λC={λC₁,λC₂,…,λC_k} are the coincidence characteristic roots (eigenvalues) of R_cov HC; and N_m is the amount of data in the m-th class of the clustering result;

S35, constructing a fuzzy trip membership parameter set μ_f(i) according to the fuzzy weight values of the three motion states, expressed as:
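Step S34 above, PCA followed by K-means into three motion states, can be sketched with NumPy as follows. This is a generic illustration under stated assumptions and not the claimed procedure: it uses the plain sample covariance instead of the coincidence-weighted matrix R_cov HC, a deterministic farthest-point initialization for K-means, and synthetic feature vectors. It returns only cluster labels and the class sizes N_m, since the fuzzy weight expression for Tra(i) is not reproduced in this text.

```python
import numpy as np


def pca_project(X: np.ndarray, n_components: int = 2):
    """Project the rows of X onto the top principal components of its covariance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)            # plain covariance (assumption)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return Xc @ eigvecs[:, order], eigvals[order]


def kmeans(Z: np.ndarray, k: int = 3, n_iter: int = 100):
    """Lloyd's K-means with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[nearest.argmax()])   # farthest point from chosen centers
    centers = np.array(centers)
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers


# Synthetic features standing in for travel / stationary / small-range activity.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=c, scale=0.1, size=(20, 3))
    for c in ([0, 0, 0], [5, 5, 0], [0, 5, 5])
])
Z, eigenvalues = pca_project(X, n_components=2)
labels, _ = kmeans(Z, k=3)
class_sizes = np.bincount(labels, minlength=3)   # N_m for each class m
```

Which cluster corresponds to which motion state is not decided by the clustering itself and would have to be assigned from domain features such as speed or displacement.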

6. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 5, wherein in step S40, the three-way Gaussian mixture clustering algorithm based on fuzzy trip membership comprises the following sub-steps:

S41, adopting the idea of three-way decisions and combining the three types of travel behavior states to construct a three-weight GMM posterior probability function, and iterating the function parameters with the CS algorithm to obtain the three-weight GMM posterior probability value of each point in the user data, specifically expressed as follows:

Wherein,

Wherein the posterior term is the three-weight GMM posterior probability that the i-th point of the user's complete trip sequence belongs to the k-th Gaussian component; p_i is the probability of the data under the k-th Gaussian component; p_t(x) is the three-weight GMM probability density function; Tra(k) is the fuzzy weight value of the k-th Gaussian component; Σ_k is the tri-state covariance matrix of the k-th Gaussian component; the expression also uses the vector dimension ablation mean of the k-th Gaussian component and the data dimension coefficient;

Wherein σ_TLV and σ_LLV are the upper and lower thresholds for dividing the three travel states;

θ_Tf, θ_Sf and θ_Af are the loss costs of assigning a data point to travel, stationary and small-range activity, respectively, under the fuzzy decision, and θ_Tc, θ_Sc and θ_Ac are the corresponding loss costs under the clear decision;

Omega is the fuzzy state decision index; the expression also uses the clear state decision index;

S43, clustering each data point and dividing the data into the three classes of travel, stationary and edge motion;

Wherein,

TC_U＝{TC₁,TC₂,TC₃}

S44, merging the stationary set and the edge motion set to construct the stay point set, and calculating the coverage area weighted radius SR_i of each stay point set, expressed as:

S_U＝{TC₂}∪{TC₃}

Wherein S_U is the stay point set, SR_i is the coverage area weighted radius of the i-th stay point set, max⟨·⟩ denotes the maximum value operation, and sr_ij is the two-dimensional coverage radius of the j-th point in the i-th stay point set;

f(·) denotes the range mapping normalization operation, v is the transformation rate control parameter, τ_i is the point set clustering attribution coefficient, and ‖d_j−D_i‖² denotes the Euclidean distance operation between the j-th point d_j in the i-th stay point set and the center point D_i.

7. The feature extraction method based on reinforcement learning of mobile phone signaling history data according to claim 6, wherein in step S50, a dual dynamic POI similarity mapping algorithm specifically comprises the following steps:

S51, taking the longitude and latitude coordinates of the center point of the stay point set as the center, reading the land use information within the coverage area weighted radius of the stay point set, including population, building rectangularity, degree of urbanization, number of POIs and daily average travel volume, and defining the matching domain density Lρ, expressed as:

S52, dynamically adjusting the matching domain, and correcting the matching domain radius SR_ρ of the stay point set according to the matching domain density:

Wherein SR_ρ is the matching domain radius, SR(·) is the coverage area weighted radius calculation, sr_i is the two-dimensional coverage radius of the i-th point, τ_i is the clustering attribution coefficient, Δr is the radius adjustment step, and Tρ is the density division value;

POIE_i＝max<SIM(k)>

PATH_Us＝{POIE₁,POIE₂,…,POIE_n}

Wherein SIM(k) is the similarity estimate of the k-th POI in the matching domain corresponding to the i-th stay point set;

the expression uses the coordinate vector of the k-th POI in the matching domain corresponding to the i-th stay point set, the coordinate vector of the j-th point in the i-th stay point set, and the region radius of the k-th POI; ‖·‖ denotes the vector norm operation;

sr_ij is the two-dimensional coverage radius of the j-th point in the i-th stay point set, ω_Tj is the state set inclination weight, and ed(·) denotes the similarity-adjusted Euclidean distance operation;

POIE_i takes the maximum similarity estimate; the corresponding POI is the accurate POI point of the matching domain corresponding to the i-th stay point set and is regarded as the i-th origin-destination point; PATH_Us is the complete travel chain of user Us.

8. The feature extraction method based on mobile phone signaling history data reinforcement learning according to claim 1, wherein after all user data are read and processed, the obtained complete trip sequence data of the user and the motion state fuzzy weights {Tra} are returned to the historical database in S10 as a parameter update, for performing the same data learning and data processing on the mobile phone signaling data of the next user; the parameters include the user's complete trip sequence data and the motion state fuzzy weights {Tra}.

9. An electronic device, comprising:

One or more processors;

a storage device having one or more programs stored thereon;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.

10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps in the feature extraction method based on reinforcement learning of mobile phone signaling history data as claimed in any one of claims 1 to 8.