CN119521138A - User residence point identification method, device, electronic device and storage medium - Google Patents

User residence point identification method, device, electronic device and storage medium

Info

Publication number
CN119521138A
Authority
CN
China
Prior art keywords
user
signaling
signaling data
resident
residence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411679064.9A
Other languages
Chinese (zh)
Inventor
詹子琪
刘亚溪
陈立峰
郭珊妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Information Technology Co Ltd
Priority to CN202411679064.9A
Publication of CN119521138A

Abstract

Translated from Chinese


The present application discloses a method, a device, an electronic device and a storage medium for identifying user residence points. A target signaling record is added for each user in the acquired first user signaling data to obtain second user signaling data; the target signaling record is a record whose start time and end time are both the maximum timestamp and whose longitude and latitude are both invalid values. A recursive algorithm is applied to the second user signaling data, and a plurality of residence segments is identified for each user according to a distance threshold and a time threshold. Each signaling record in a residence segment is assigned a weight according to its signaling duration, and an offset-calibrated residence point is calculated for each residence segment. Finally, the offset-calibrated residence points and the second user signaling data are evaluated again by the recursive algorithm, and the residence points corresponding to the residence segments of each user are output. In this way, the accuracy of identifying a user's residence state can be improved.

Description

User resident point identification method and device, electronic equipment and storage medium
Technical Field
The application belongs to the field of information technology support, and particularly relates to a method and a device for identifying a user residence point, electronic equipment and a storage medium.
Background
In recent years, mobile phone signaling data, with its large sample size, short sampling period, long observation period, wide coverage and high information value, has become increasingly valuable for studying user behavior patterns, traffic surveys and similar tasks. Studying travel characteristics on the basis of mobile phone signaling data can overcome shortcomings such as insufficient sample size and high labor cost in traditional urban traffic surveys, traffic development strategy formulation and traffic scheme evaluation. Residence point identification is an important step in converting user activity patterns into recognizable traffic semantics, and it is of great significance for analyzing urban travel and understanding user behavior patterns using mobile phone signaling as spatio-temporal big data.
Existing techniques for identifying user residence points share some common disadvantages. First, algorithms based on spatio-temporal rules rely on fixed time and distance thresholds that are difficult to adapt to dynamic changes in user behavior, and a single rule setting may not meet the needs of all scenarios. Second, clustering-based algorithms are very sensitive to parameter selection and easily generate trajectory clusters without practical meaning; especially when the data is sparse or noisy, the accuracy of residence point identification suffers. In general, these techniques fall short in accuracy and require further improvement and optimization.
Disclosure of Invention
The embodiments of the application provide a method, a device, an electronic device and a storage medium for identifying user residence points, which can improve the accuracy of identifying a user's residence state.
In a first aspect, an embodiment of the present application provides a method for identifying a residence point of a user, where the method may include:
Acquiring first user signaling data, wherein the first user signaling data comprises continuous signaling data of at least one day of at least one user;
adding target signaling to each user in the first user signaling data to obtain second user signaling data, wherein the target signaling is a signaling record whose start time and end time are both the maximum timestamp and whose longitude and latitude are both invalid values;
Performing recursive calculation on the second user signaling data through a recursive algorithm, and identifying and obtaining a plurality of resident segments corresponding to each user according to a distance threshold value and a time threshold value;
According to the signaling duration of each signaling in each resident segment, corresponding weight is distributed to each signaling, and offset calibration resident points of each resident segment are obtained through calculation;
and judging the offset calibration dwell point of each dwell segment and second user signaling data through a recursion algorithm, and outputting corresponding dwell points of a plurality of dwell segments of each user.
In one embodiment, the adding the target signaling to each user in the first user signaling data to obtain second user signaling data includes:
Sorting the first user signaling data by user and then by signaling start time to obtain sorted first user signaling data;
Cleaning the sorted first user signaling data by removing abnormal records based on preset fields in the signaling data to obtain third user signaling data;
and the adding of target signaling to each user in the first user signaling data to obtain second user signaling data then includes:
adding target signaling to each user in the third user signaling data to obtain the second user signaling data.
In one embodiment, the performing, by a recursive algorithm, recursive calculation on the second user signaling data, and identifying, according to a distance threshold and a time threshold, a plurality of residence segments corresponding to each user includes:
Calculating, from the second user signaling data, the spherical distance between the base stations of every two adjacent signaling records of each user;
Taking each user's signaling records in order of signaling start time, comparing the spherical distance between the base stations of every two adjacent records with the distance threshold, and dividing the records into a plurality of candidate residence segments that satisfy the distance threshold;
And checking the plurality of candidate residence segments against the time threshold to obtain the plurality of residence segments corresponding to each user.
In one embodiment, the calculating, according to the second user signaling data, the spherical distance of the base station corresponding to each two adjacent signaling corresponding to each user in the second user signaling data includes:
Calculating to obtain the spherical distance of the base station corresponding to each two adjacent signaling corresponding to each user in the second user signaling data through a formula 1 and a formula 2;
Wherein $lat_i$ is the latitude of the base station position associated with the i-th signaling record of the user, distance is the distance between the base station positions of the user's i-th and j-th signaling records, in km, and 6371000 is the Earth's radius in meters.
In one embodiment, the assigning of a corresponding weight to each signaling record according to the signaling duration of each signaling record in each residence segment, and calculating the offset-calibrated residence point of each residence segment, includes:
Calculating the signaling duration of each signaling in a target resident segment according to signaling data of the target resident segment and the target user, wherein the target resident segment is any one of a plurality of resident segments, and the target user is a user corresponding to the target resident segment;
Calculating the time weight of each signaling according to the signaling duration of each signaling in the target residence section and the signaling data of the target user;
And calculating an offset calibration residence point of the target user in the target residence section according to the signaling data of the target user, the signaling duration of each signaling in the target residence section and the time weight.
In one embodiment, the evaluating of the offset-calibrated residence point of each residence segment and the second user signaling data by a recursive algorithm, and outputting the residence points corresponding to the residence segments of each user, includes:
Organizing and sorting the offset-calibrated residence points of the residence segments and the second user signaling data by user and signaling time to obtain a user signaling data set;
Inputting the user signaling data set into the recursive algorithm, and identifying corresponding residence points of a plurality of residence segments of each user according to the distance threshold and the time threshold.
In a second aspect, an embodiment of the present application provides a device for identifying a residence point of a user, where the device may include:
An acquisition module for acquiring first user signaling data comprising at least one day of consecutive signaling data of at least one user;
the adding module is used for adding target signaling to each user in the first user signaling data to obtain second user signaling data, wherein the target signaling is signaling with the starting time and the ending time being maximum time stamps and the longitude and latitude being invalid values;
The first calculation module is used for carrying out recursive calculation on the second user signaling data through a recursive algorithm, and identifying and obtaining a plurality of resident segments corresponding to each user according to a distance threshold value and a time threshold value;
The second calculation module is used for distributing corresponding weight to each signaling according to the signaling duration of each signaling in each resident segment, and calculating to obtain an offset calibration resident point of each resident segment;
And the judging module is used for judging the offset calibration resident points of the resident segments and the second user signaling data through a recursion algorithm and outputting corresponding resident points of a plurality of resident segments of each user.
In a third aspect, an embodiment of the present application provides an electronic device, including:
A processor;
A memory for storing processor-executable instructions;
wherein the processor is configured to execute instructions to implement a method of identifying a user residence point as shown in any one of the embodiments of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer storage medium having a computer program stored thereon, which when executed by a processor implements a method of identifying a user residence point as shown in any one of the embodiments of the first aspect.
In a fifth aspect, embodiments of the present application also provide a computer program product comprising a computer program stored in a readable storage medium, the at least one processor of the device reading and executing the computer program from the storage medium, causing the device to perform the method of identifying a user residence point as shown in any one of the embodiments of the first aspect.
The embodiment of the application provides a method, a device, electronic equipment and a storage medium for identifying a user resident point, which have the following beneficial effects compared with the prior art:
According to the method, the device, the electronic equipment and the storage medium for identifying the user residence point, target signaling is added to each user in the acquired first user signaling data to obtain second user signaling data, the target signaling is signaling with maximum time stamp and invalid longitude and latitude, the second user signaling data is subjected to recursive calculation through a recursive algorithm, and a plurality of residence sections corresponding to each user are obtained according to distance threshold and time threshold identification. And distributing corresponding weights to each signaling according to the signaling duration of each signaling in each resident segment, and calculating to obtain the offset calibration resident point of each resident segment. And finally, judging the offset calibration dwell point of each dwell segment and the second user signaling data through a recursion algorithm, and outputting corresponding dwell points of a plurality of dwell segments of each user.
In this way, the signaling durations within a residence segment are used to assign different weights to the longitude and latitude of the different signaling points: the longer a signaling record lasts, the closer the user's actual residence point is taken to be to the corresponding base station. This addresses the inaccuracy of traditional algorithms that use the plain spatial average position as the residence point, and improves the accuracy of identifying a user's residence state.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present application, the drawings that are needed to be used in the embodiments of the present application will be briefly described, and it is possible for a person skilled in the art to obtain other drawings according to these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a method for identifying a user residence point according to an embodiment of the present application;
FIG. 2 is a flow chart of another method for identifying a user residence point according to an embodiment of the present application;
FIG. 3 is a schematic diagram of user residence point offset calibration according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a device for identifying a user residence point according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings and the detailed embodiments. It should be understood that the particular embodiments described herein are meant to be illustrative of the application only and not limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the application by showing examples of the application.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising" does not exclude the presence of additional identical elements in a process, method, article, or apparatus that comprises an element.
Based on the background section, there are some common disadvantages to the prior art solutions. First, algorithms based on spatiotemporal rules rely on fixed temporal and spatial thresholds that are difficult to adapt to dynamic changes in user behavior, and rule settings may not meet the needs of all scenarios. Secondly, the clustering-based algorithm is very sensitive to parameter selection when processing data, and trace clusters without practical significance are easy to generate, and especially in the case of sparse data or large noise, the accuracy of stay point identification can be affected.
In order to solve the problems in the prior art, the embodiment of the application provides a method, a device, an electronic device and a storage medium for identifying a residence point of a user, which are used for adding a target signaling to each user in acquired first user signaling data to obtain second user signaling data, wherein the target signaling is signaling with maximum time stamp for start time and end time and invalid value for longitude and latitude, the second user signaling data is subjected to recursive calculation through a recursive algorithm, and a plurality of residence sections corresponding to each user are identified according to a distance threshold and a time threshold. And distributing corresponding weights to each signaling according to the signaling duration of each signaling in each resident segment, and calculating to obtain the offset calibration resident point of each resident segment. And finally, judging the offset calibration dwell point of each dwell segment and the second user signaling data through a recursion algorithm, and outputting corresponding dwell points of a plurality of dwell segments of each user.
In this way, the signaling durations within a residence segment are used to assign different weights to the longitude and latitude of the different signaling points: the longer a signaling record lasts, the closer the user's actual residence point is taken to be to the corresponding base station. This addresses the inaccuracy of traditional algorithms that use the plain spatial average position as the residence point, and improves the accuracy of identifying a user's residence state.
The embodiment of the application provides a method and a device for identifying a user resident point, electronic equipment and a storage medium. The following first describes a method for identifying a user residence point provided by an embodiment of the present application. As shown in fig. 1, the method for identifying a residence point of a user provided by the embodiment of the application includes the following steps:
S101, acquiring first user signaling data, wherein the first user signaling data comprises continuous signaling data of at least one user for at least one day;
S102, adding target signaling to each user in the first user signaling data to obtain second user signaling data, wherein the target signaling is signaling with maximum time stamp for start time and end time and invalid value for longitude and latitude;
S103, carrying out recursive calculation on the second user signaling data through a recursive algorithm, and identifying a plurality of resident segments corresponding to each user according to a distance threshold value and a time threshold value;
S104, distributing corresponding weights to each signaling according to the signaling duration of each signaling in each resident segment, and calculating to obtain offset calibration resident points of each resident segment;
S105, evaluating the offset-calibrated residence point of each residence segment and the second user signaling data through a recursive algorithm, and outputting the residence points corresponding to the residence segments of each user.
The embodiment of the application provides a method, a device, electronic equipment and a storage medium for identifying a user residence point, wherein target signaling is added to each user in acquired first user signaling data to obtain second user signaling data, the target signaling is signaling with maximum time stamp and invalid longitude and latitude of start time and end time, the second user signaling data is subjected to recursive calculation through a recursive algorithm, and a plurality of residence sections corresponding to each user are identified according to a distance threshold value and a time threshold value. And distributing corresponding weights to each signaling according to the signaling duration of each signaling in each resident segment, and calculating to obtain the offset calibration resident point of each resident segment. And finally, judging the offset calibration dwell point of each dwell segment and the second user signaling data through a recursion algorithm, and outputting corresponding dwell points of a plurality of dwell segments of each user.
In S101, user signaling data is acquired, grouped by user and sorted. Continuous signaling data of users over a single day or multiple days is acquired; the fields of the signaling data are detailed in Table 1.
TABLE 1 details of Signaling data
In one example, the adding the target signaling to each user in the first user signaling data to obtain the second user signaling data includes:
Sorting the first user signaling data by user and then by signaling start time to obtain sorted first user signaling data;
Cleaning the sorted first user signaling data by removing abnormal records based on preset fields in the signaling data to obtain third user signaling data;
and the adding of target signaling to each user in the first user signaling data to obtain second user signaling data then includes:
adding target signaling to each user in the third user signaling data to obtain the second user signaling data.
In the above embodiment, the records are first sorted by user and then by signaling start time to obtain the sorted first user signaling data. The sorted data is then cleaned: records with invalid values in the UID, longitude or latitude fields are screened out and manually removed from the first user signaling data as outliers, yielding the third user signaling data.
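The sorting and cleaning step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dictionary record layout, the `INVALID` marker and the function name `sort_and_clean` are assumptions, and the sketch filters before sorting, which yields the same result as sorting first and then removing the invalid records.

```python
from typing import Dict, List

INVALID = -1  # assumed marker for an invalid field value


def sort_and_clean(records: List[Dict]) -> List[Dict]:
    """Drop records with an invalid UID, latitude or longitude,
    then sort the remainder by user and signaling start time."""
    def is_valid(r: Dict) -> bool:
        return all(r.get(k) not in (None, INVALID)
                   for k in ("UID", "latitude", "longitude"))

    cleaned = [r for r in records if is_valid(r)]
    return sorted(cleaned, key=lambda r: (r["UID"], r["procedureStartTime"]))
```

The returned list plays the role of the third user signaling data: grouped by user, ordered by start time, and free of records that cannot be located.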
In S102, one extra signaling record is appended for each user in the first user signaling data. Its start time and end time are both the maximum timestamp (for example, the maximum time the system can represent), and its longitude and latitude are both invalid values. This solves the problem that the recursive algorithm cannot otherwise identify a final residence segment, whether continuous or intermittent, in the user's original signaling. The invalid value may be, for example, -1, which marks an invalid geographic location that the algorithm does not treat as a real position; the specific value is not limited here. The maximum timestamp ensures that the target signaling always sorts last among a user's signaling records. When the algorithm processes the user's last valid signaling record, this virtual record stands in as the next record, so the end condition of the residence segment can still be evaluated. The virtual record thus forces a decision on the last valid signaling record and ensures that no residence segment is missed for lack of subsequent signaling.
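A minimal sketch of appending the target (sentinel) signaling, assuming the records are dictionaries already sorted by user and start time. `MAX_TS`, `INVALID_COORD` and `add_target_signaling` are hypothetical names; `sys.maxsize` merely stands in for "the maximum time the system can represent".

```python
import sys
from itertools import groupby
from typing import Dict, List

MAX_TS = sys.maxsize   # assumed stand-in for the maximum timestamp
INVALID_COORD = -1     # invalid coordinate value suggested by the text


def add_target_signaling(sorted_records: List[Dict]) -> List[Dict]:
    """Append one sentinel record after each user's records.
    The input must already be sorted by UID, then by start time."""
    out: List[Dict] = []
    for uid, group in groupby(sorted_records, key=lambda r: r["UID"]):
        out.extend(group)
        out.append({
            "UID": uid,
            "latitude": INVALID_COORD,
            "longitude": INVALID_COORD,
            "procedureStartTime": MAX_TS,
            "procedureEndTime": MAX_TS,
        })
    return out
```

Because the sentinel carries the largest possible start time, re-sorting a user's records by start time keeps it in last position.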
In S103, the second user signaling data is first sorted by signaling time. A recursive algorithm then traverses each user's signaling data, calculating the distance between signaling records and the stay duration one by one and checking them against the distance threshold and the time threshold. Both thresholds may be set according to actual needs and are not limited here. For example, with a distance threshold of 0.5 km, two signaling records less than 0.5 km apart are considered to belong to the same residence segment. Likewise, with a time threshold of 30 minutes, a location where the user stays for at least 30 minutes is identified as a residence point.
The logic of the recursive algorithm may be as follows. After grouping by user, the signaling records are sorted by signaling start time and scanned toward later records until the first record that exceeds the distance threshold (0.5 km) is found; the record immediately before it is still within the distance threshold (0.5 km). Bounded by the distance threshold in this way, the largest possible residence segment is delimited. Once the distance condition is satisfied, the residence duration is checked against the time threshold (30 min), that is, whether the end time of that preceding record minus the start time of the first record is greater than or equal to 30 min. It should be noted that signaling may be discontinuous, so even if the duration up to the next record exceeds the time threshold, the preceding record may fail the time threshold check, and this effect is most pronounced for the final signaling records. For example, according to users' activity patterns, a user's last signaling record generally falls between 23:00 and 0:00; when that record belongs to a residence segment, the observed residence duration of some users does not reach the 30 min threshold. It is therefore necessary to append a final signaling record so that the detected candidate end of the residence segment can be checked against the 30 min residence duration.
In one example, the performing, by a recursive algorithm, the recursive calculation on the second user signaling data, and identifying, according to a distance threshold and a time threshold, a plurality of residence segments corresponding to each user includes:
Calculating, from the second user signaling data, the spherical distance between the base stations of every two adjacent signaling records of each user;
Taking each user's signaling records in order of signaling start time, comparing the spherical distance between the base stations of every two adjacent records with the distance threshold, and dividing the records into a plurality of candidate residence segments that satisfy the distance threshold;
And checking the plurality of candidate residence segments against the time threshold to obtain the plurality of residence segments corresponding to each user.
In the foregoing embodiment, the step of calculating, from the second user signaling data, the spherical distance between the base stations of every two adjacent signaling records of each user may include:
Calculating the spherical distance between the base stations of every two adjacent signaling records of each user in the second user signaling data through formula 1 and formula 2;
Wherein $lat_i$ is the latitude of the base station position associated with the i-th signaling record of the user, distance is the distance between the base station positions of the user's i-th and j-th signaling records, in km, and 6371000 is the Earth's radius in meters.
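Formulas 1 and 2 are not reproduced in this text. As an illustration only, the haversine formula is one common way to compute a spherical distance between two latitude/longitude pairs; whether the patent's formulas take exactly this form is an assumption. The sketch uses the quoted radius of 6371000 m and divides by 1000 to return kilometers; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000  # Earth radius in meters, as quoted in the text


def spherical_distance_km(lat1: float, lon1: float,
                          lat2: float, lon2: float) -> float:
    """Haversine great-circle distance between two points, in km.
    Inputs are latitudes and longitudes in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    meters = 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))
    return meters / 1000.0
```

With this radius, one degree of latitude corresponds to roughly 111.2 km, so a 0.5 km threshold is about 0.0045 degrees of latitude.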
The algorithm logic of the recursive algorithm defining the resident segments is then as follows:
Input parameters: the signaling data of the user, $bS_i = \{countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, geohash_i\}$; a distance threshold DISTTHREH (0.5 km); and a time threshold TIMETHREH (30 min);
Output parameters: the set $SP = \{S_i\}$ formed by the residence points, where $S_i = \{UID, countyId_i, orderedStart_i, orderedEnd_i, latitude\_loc_i, longitude\_loc_i, arriveTime_i, leaveTime_i\}$; the details of the residence data are shown in Table 2 below:
TABLE 2 details of resident data
Recursion logic: after grouping by user, the signaling records are sorted by signaling start time and scanned toward later records until the first record that exceeds the distance threshold (0.5 km) is found; the record immediately before it is still within the distance threshold (0.5 km). Bounded by the distance threshold in this way, the largest possible residence segment is delimited. Once the distance condition is satisfied, the residence duration is checked against the time threshold (30 min), that is, whether the end time of that preceding record minus the start time of the first record is greater than or equal to 30 min.
Supplementary logic: it should be noted that signaling may be discontinuous, so in the time threshold check, even if the duration up to the next record exceeds the time threshold, the preceding record may fail the check, and this effect is most pronounced for the final signaling records. For example, according to users' activity patterns, a user's last signaling record generally falls between 23:00 and 0:00; when that record belongs to a residence segment, the observed residence duration of some users does not reach the 30 min threshold. It is therefore necessary to append a final signaling record so that the detected candidate end of the residence segment can be checked against the 30 min residence duration.
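The segmentation logic can be sketched for a single user as below. Several points are assumptions rather than the patent's method: the scan is written as a loop instead of true recursion, timestamps are taken to be seconds, the adjacent-record distance uses the haversine formula, and a coordinate equal to -1 is treated as the sentinel. The function and constant names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

DIST_THRESHOLD_KM = 0.5      # example value from the text
TIME_THRESHOLD_S = 30 * 60   # example value from the text, in seconds
INVALID_COORD = -1           # assumed sentinel coordinate


def _dist_km(a: Dict, b: Dict) -> float:
    """Haversine distance in km; infinite if either record is the sentinel."""
    coords = (a["latitude"], a["longitude"], b["latitude"], b["longitude"])
    if INVALID_COORD in coords:
        return math.inf
    p1, p2 = math.radians(a["latitude"]), math.radians(b["latitude"])
    dl = math.radians(b["longitude"] - a["longitude"])
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))


def find_residence_segments(records: List[Dict]) -> List[Tuple[int, int]]:
    """Return (i, j) pairs such that records[i:j] is one residence segment.
    `records` holds one user's data, sorted by start time, sentinel last."""
    segments: List[Tuple[int, int]] = []
    i = 0
    while i < len(records) - 1:  # the final record is the sentinel
        j = i + 1
        # Extend while each adjacent pair stays within the distance threshold.
        while (j < len(records)
               and _dist_km(records[j - 1], records[j]) <= DIST_THRESHOLD_KM):
            j += 1
        # End time of the last in-range record minus start time of the first.
        stay = records[j - 1]["procedureEndTime"] - records[i]["procedureStartTime"]
        if stay >= TIME_THRESHOLD_S:
            segments.append((i, j))
        i = j
    return segments
```

The sentinel always breaks the distance condition, so the run containing the last valid record is closed and checked against the time threshold like any other run.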
In S104, the signaling duration of each signaling record, the time weight of each record, and the predicted residence point of the user's residence segment (that is, the offset-calibrated residence point) are calculated from each user's signaling data, giving the longitude and latitude of the residence segment. In other words, the signaling durations of the records from the first to the last record in the residence segment are used to assign different weights to the longitude and latitude of each record; the residence point of the segment is then calculated, which realizes the offset calibration of the residence point.
In one example, the allocating a corresponding weight to each signaling according to the signaling duration of each signaling in each residence segment, and calculating to obtain an offset calibration residence point of each residence segment includes:
Calculating the signaling duration of each signaling in a target resident segment according to signaling data of the target resident segment and the target user, wherein the target resident segment is any one of a plurality of resident segments, and the target user is a user corresponding to the target resident segment;
Calculating the time weight of each signaling according to the signaling duration of each signaling in the target residence section and the signaling data of the target user;
And calculating an offset calibration residence point of the target user in the target residence section according to the signaling data of the target user, the signaling duration of each signaling in the target residence section and the time weight.
In the above embodiment, the possible signaling start index i and end index j of the resident segment have already been obtained in the above recursive step by means of the distance threshold (0.5 km) and the time threshold (30 min). Within this resident segment interval, the following three indices are calculated from the signaling data of each user, bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i}:
1. The signaling duration of each signaling is calculated as follows:
duration_k = procedureEndTime_k - procedureStartTime_k
Adding it to the signaling data bS_i of the user gives
bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, duration_i}
2. The time weight of each signaling is calculated as follows:
where bS_k denotes the signaling duration of the k-th signaling;
3. The predicted dwell point of the user's resident segment between signaling i and j-1:
Here, following bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, duration_i}, bS_k[1] and bS_k[2] denote the latitude and longitude of the k-th signaling, respectively; in S_i = {UID, countyId_i, orderedStart_i, orderedEnd_i, latitude_loc_i, longitude_loc_i, arriveTime_i, leaveTime_i}, S_i[4] and S_i[5] denote the latitude and longitude of the resident segment S_i, respectively. That is, using the time weights W_k obtained from the signaling durations of the signaling from the first signaling to the tail signaling in the resident segment, a different weight is assigned to the latitude and longitude {bS_k[1], bS_k[2]} of each signaling, and the resident point S_i{[4], [5]} of the resident segment is calculated, which realizes offset calibration of the resident point. Time weighting shifts the dwell point toward the position where the user actually stayed, thereby calibrating the offset of the dwell point within the dwell area.
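The formulas for the time weight and the predicted dwell point are not reproduced in the text above. One reading that is consistent with the surrounding definitions, offered here as an assumption rather than as the application's own formulas, is a duration-normalized weight followed by a weighted average of the coordinates over signaling i to j-1:

```latex
\begin{aligned}
W_k &= \frac{\mathrm{duration}_k}{\sum_{m=i}^{j-1} \mathrm{duration}_m}, \qquad i \le k \le j-1,\\[4pt]
S_i[4] &= \sum_{k=i}^{j-1} W_k \, bS_k[1], \qquad
S_i[5] = \sum_{k=i}^{j-1} W_k \, bS_k[2].
\end{aligned}
```

Under this reading the weights sum to 1 over the segment, so the calibrated point always lies within the range of the base station coordinates of the segment and is pulled toward the base stations where the user spent the most time.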
In S105, signaling data for each user is obtained from the system, including daily base station communication records. Each record contains information such as base station ID, latitude and longitude, signaling start and end time, etc. The signaling records of each user are ordered according to the signaling start time, so that the data are ensured to be arranged in time sequence, and the subsequent processing is convenient. The ordered signaling data is passed to a recursive algorithm. The algorithm identifies the resident segments based on a preset distance threshold and time threshold. The algorithm checks the signaling record piece by piece, and determines if distance and time conditions are met to determine the resident segment. After processing by a recursion algorithm, the identification result of each user is output, and the identification result comprises a plurality of resident segment information of the user. Such information includes data such as time ranges and dwell points for each dwell segment.
In one embodiment, the determining, by a recursive algorithm, the offset calibration dwell point of each dwell segment and the second user signaling data, and outputting the result of the user dwell point, includes:
Sorting and organizing the offset calibration residence points of the residence segments and the second user signaling data by user and by signaling time to obtain a user signaling data set;
Inputting the user signaling data set into the recursive algorithm, and identifying corresponding residence points of a plurality of residence segments of each user according to the distance threshold and the time threshold.
In the above embodiments, the signaling data of the user is passed in. Each user communicates with base stations every day, so each user has a daily set of base station communication records, where the user is indexed by u ∈ UID, t ∈ T is the date of the user signaling data, N_{u,t} denotes the number of times the base station communication of user u is switched within the target time interval on date t, and bs_i (1 ≤ i ≤ N) denotes the base station information of each switch, bs_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, geohash_i}. All the sorted signaling data of each user (namely the user signaling data set) is passed into the recursive algorithm for identifying residence sections to obtain the identified residence result.
In order to better illustrate the method provided by the embodiments of the present application, the following description is based on a specific embodiment. Referring to a flow chart of a method for identifying a user residence point shown in fig. 2, the scheme includes:
Step one: user signaling data are acquired, grouped and sorted. Continuous signaling data of users over a single day or multiple days are acquired and sorted by user and, within each user, by signaling start time; detailed information of the signaling data is shown in Table 2.
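The grouping and sorting in step one can be sketched as follows. This is a minimal illustration, assuming each signaling record is a dictionary whose keys `UID` and `procedureStartTime` mirror the field names used in this description; the function name `group_and_sort` is hypothetical.

```python
from collections import defaultdict


def group_and_sort(records):
    """Group signaling records by user, then sort each group by start time.

    Assumes every record is a dict with "UID" and "procedureStartTime" keys
    (names borrowed from the description; the dict layout is an assumption).
    """
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["UID"]].append(rec)
    # Users are ordered by UID; each user's signaling is ordered in time.
    return {
        uid: sorted(recs, key=lambda r: r["procedureStartTime"])
        for uid, recs in sorted(by_user.items())
    }
```

Sorting once up front means every later step can rely on each user's list being in chronological order.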
Step two: data cleaning. Samples with invalid values in the UID, longitude or latitude fields of the signaling records are screened out and manually eliminated as outliers in the user signaling data.
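As a rough illustration of the screening in step two, the filter below drops records whose UID or coordinates are unusable. The description does not define "invalid value", so the criteria here (missing or empty UID, non-numeric or NaN coordinates, latitude outside ±90, longitude outside ±180) are assumptions, as are the names `is_valid` and `clean`.

```python
import math


def is_valid(rec):
    """Return True if the record has a usable UID, latitude and longitude.

    The validity rules are assumed for illustration only.
    """
    uid = rec.get("UID")
    if uid is None or uid == "":
        return False
    for key, bound in (("latitude", 90.0), ("longitude", 180.0)):
        value = rec.get(key)
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            return False
        if math.isnan(value) or abs(value) > bound:
            return False
    return True


def clean(records):
    """Keep only the records that pass is_valid."""
    return [rec for rec in records if is_valid(rec)]
```

The description speaks of eliminating outliers manually; an automated filter of this kind would only be a first pass before any manual review.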
Step three: data processing. For each user, one row of signaling with the maximum timestamp and the maximum latitude and longitude is appended, which solves the problem that the recursive algorithm cannot identify a residence section formed by the last continuous or intermittent signaling in the user's original signaling.
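Appending the extra row of step three might look like the sketch below. The concrete sentinel values are assumptions: `MAX_TS` stands in for "the maximum timestamp" and `SENTINEL_COORD` for a coordinate that no real base station can have; neither value is given in the description, and `append_sentinel` is a hypothetical name.

```python
MAX_TS = 2**63 - 1        # assumed stand-in for "the maximum timestamp"
SENTINEL_COORD = 999.0    # assumed out-of-range latitude/longitude marker


def append_sentinel(sorted_records, uid):
    """Return a new list: the user's sorted signaling plus one sentinel row.

    The sentinel can never be close to a real base station, so a scan that
    stops at the first out-of-range signaling always terminates on it and
    the user's final residence section is not lost.
    """
    sentinel = {
        "UID": uid,
        "latitude": SENTINEL_COORD,
        "longitude": SENTINEL_COORD,
        "procedureStartTime": MAX_TS,
        "procedureEndTime": MAX_TS,
    }
    return list(sorted_records) + [sentinel]
```

Because the sentinel carries the largest timestamp, it stays at the end of the list even if the data are sorted again later.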
Step four: define a function for calculating the spherical distance between two points, with the following calculation formula:
(Unit: km)
where lat_i is the latitude of the base station position associated with the i-th signaling of the user, distance is the distance between the base station positions of the i-th and j-th signaling records of the user, and 6371000 is the Earth's radius in meters.
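The distance formula itself is not reproduced in the text above. A common way to compute a spherical distance from two latitude/longitude pairs is the haversine formula, sketched below as an assumption; it uses the radius 6371000 m stated above and converts the result to km. The function name `spherical_distance_km` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000  # Earth radius in meters, as stated in the description


def spherical_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees.

    Uses the haversine formula, which is an assumed choice; the description
    only says that a spherical distance is computed.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding before asin.
    meters = 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))
    return meters / 1000.0
```

With this radius, one degree of latitude corresponds to roughly 111.2 km, so the 0.5 km distance threshold is a difference of about 0.0045 degrees of latitude.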
Step five: define a recursive algorithm for identifying the resident segments, with the following algorithm logic:
Input parameters: the signaling data of the user, bs_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, geohash_i}, a distance threshold DISTTHREH (0.5 km), and a time threshold TIMETHREH (30 min);
Output parameters: the set SP = {S} formed by dwell points, where S_i = {UID, countyId_i, orderedStart_i, orderedEnd_i, latitude_loc_i, longitude_loc_i, arriveTime_i, leaveTime_i}; detailed information of the dwell data is shown in Table 2 above.
Recursion logic: after grouping by user, the signaling of each user is sorted by signaling start time and then retrieved one by one, moving onward through the sequence, until a signaling that exceeds the distance threshold (0.5 km) is found; the signaling immediately before it is then the last one within the distance threshold (0.5 km). Controlling the distance threshold in this way divides out the largest possible residence section. Once the distance threshold is satisfied, it is judged whether the residence duration satisfies the time threshold (30 min), that is, whether the end time of the last signaling minus the start time of the first signaling is greater than or equal to 30 min.
Supplementary logic: note that, because signaling may be discontinuous, in the time threshold determination the last signaling may fail the time threshold even if the duration up to the next signaling exceeds it, and this effect is more pronounced for the tail signaling. For example, according to user activity patterns, a user's last signaling generally falls between 23:00 and 0:00 at night; when this last signaling forms a residence section, the residence duration of some users may not satisfy the 30 min time threshold. It is therefore necessary to add a check on the tail signaling that judges whether the detected possible ending signaling of the residence section satisfies the residence duration of 30 min. The pseudo code of the recursive algorithm is as follows:
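The pseudo code is not reproduced in this text. The sketch below is one possible reading of the recursion logic, not the application's own algorithm. It assumes timestamps in seconds, compares adjacent signaling (as in the device description of the first calculation unit), treats out-of-range coordinates as infinitely far away so that the appended sentinel row always closes the last segment, and returns half-open index pairs (i, j) covering signaling i to j-1. All function and constant names are hypothetical.

```python
import math

DIST_THRESH_KM = 0.5       # distance threshold from the description
TIME_THRESH_S = 30 * 60    # 30 min time threshold, assuming seconds
EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Haversine distance in km; infinite if either point is out of range."""
    for p in (a, b):
        if abs(p["latitude"]) > 90 or abs(p["longitude"]) > 180:
            return math.inf
    phi1, phi2 = math.radians(a["latitude"]), math.radians(b["latitude"])
    d_lambda = math.radians(b["longitude"] - a["longitude"])
    h = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def find_segments(recs, i=0, found=None):
    """Recursively collect (i, j) pairs for resident segments i..j-1.

    recs is one user's signaling, sorted by start time and ending with the
    sentinel row.
    """
    if found is None:
        found = []
    if i >= len(recs) - 1:          # only the sentinel (or nothing) remains
        return found
    j = i + 1
    # Advance until the next signaling jumps beyond the distance threshold.
    while j < len(recs) and distance_km(recs[j - 1], recs[j]) <= DIST_THRESH_KM:
        j += 1
    # End time of the last signaling minus start time of the first one.
    stay = recs[j - 1]["procedureEndTime"] - recs[i]["procedureStartTime"]
    if stay >= TIME_THRESH_S:
        found.append((i, j))
    return find_segments(recs, j, found)
```

Each recursive call starts at the signaling that broke the distance threshold, so the recursion depth equals the number of candidate segments rather than the number of signaling records; for very long inputs an iterative loop would avoid Python's recursion limit.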
Offset calibration logic: the possible signaling start index i and end index j of the resident segment have already been obtained in the above recursive step by means of the distance threshold (0.5 km) and the time threshold (30 min). Within this resident segment interval, the following three indices are calculated from the signaling data of each user, bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i}:
1. The signaling duration of each signaling is calculated as follows:
duration_k = procedureEndTime_k - procedureStartTime_k
Adding it to the signaling data bS_i of the user gives
bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, duration_i}
2. The time weight of each signaling is calculated as follows:
where bS_k denotes the signaling duration of the k-th signaling;
3. The predicted dwell point of the user's resident segment between signaling i and j-1:
Here, following bS_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, duration_i}, bS_k[1] and bS_k[2] denote the latitude and longitude of the k-th signaling, respectively; in S_i = {UID, countyId_i, orderedStart_i, orderedEnd_i, latitude_loc_i, longitude_loc_i, arriveTime_i, leaveTime_i}, S_i[4] and S_i[5] denote the latitude and longitude of the resident segment S_i, respectively. That is, using the time weights W_k obtained from the signaling durations of the signaling from the first signaling to the tail signaling in the resident segment, a different weight is assigned to the latitude and longitude {bS_k[1], bS_k[2]} of each signaling, and the resident point S_i{[4], [5]} of the resident segment is calculated, which realizes offset calibration of the resident point. Time weighting shifts the dwell point toward the position where the user actually stayed, thereby calibrating the offset of the dwell point within the dwell area.
As can be seen from fig. 3, the dwell point can be shifted to the dwell position of the actual residence by a time weighting method, so that the shift calibration of the dwell point in the dwell area is realized.
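The time-weighted calibration can be illustrated with a short sketch. It assumes that the weight of each signaling is its duration divided by the total duration of the segment and that the dwell point is the weighted average of the base station coordinates; the description's own formulas are not reproduced in this text, so both are assumptions. Durations are taken as end time minus start time so that they are non-negative, and `calibrated_dwell_point` is a hypothetical name.

```python
def calibrated_dwell_point(segment):
    """Return (latitude, longitude) of the time-weighted dwell point.

    segment is the list of signaling records i..j-1 of one resident segment,
    each a dict with latitude, longitude, procedureStartTime, procedureEndTime.
    """
    if not segment:
        raise ValueError("segment must contain at least one signaling record")
    durations = [r["procedureEndTime"] - r["procedureStartTime"] for r in segment]
    total = sum(durations)
    if total <= 0:
        # Degenerate case (all durations zero): fall back to a plain mean.
        n = len(segment)
        return (sum(r["latitude"] for r in segment) / n,
                sum(r["longitude"] for r in segment) / n)
    lat = sum(d * r["latitude"] for d, r in zip(durations, segment)) / total
    lon = sum(d * r["longitude"] for d, r in zip(durations, segment)) / total
    return lat, lon
```

For example, if a user spends 300 s on one base station and 900 s on a neighboring one, the weights are 0.25 and 0.75, so the calibrated point lies three quarters of the way toward the second base station.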
Step six: the signaling data of the user is passed in. Each user communicates with base stations every day, so each user has a daily set of base station communication records, where the user is indexed by u ∈ UID, t ∈ T is the date of the user signaling data, N_{u,t} denotes the number of times the base station communication of user u is switched within the target time interval on date t, and bs_i (1 ≤ i ≤ N) denotes the base station information of each switch, bs_i = {countyId_i, latitude_i, longitude_i, procedureStartTime_i, procedureEndTime_i, geohash_i}. All the sorted signaling data of each user is passed into the recursive algorithm for identifying residence sections to obtain the identified residence result.
Based on the method for identifying a user resident point provided in the foregoing embodiment, correspondingly, as shown in fig. 4, an embodiment of the present application provides a device 400 for identifying a user resident point, where the device may include:
An acquisition module 401, configured to acquire first user signaling data, where the first user signaling data includes continuous signaling data of at least one day of at least one user;
An adding module 402, configured to add a target signaling to each user in the first user signaling data to obtain second user signaling data, where the target signaling is a signaling with a start time and an end time being maximum time stamps and a longitude and latitude being invalid values;
A first calculation module 403, configured to perform a recursive calculation on the second user signaling data through a recursive algorithm, and identify a plurality of residence segments corresponding to each user according to a distance threshold and a time threshold;
a second calculation module 404, configured to allocate a corresponding weight to each signaling according to a signaling duration of each signaling in each residence segment, and calculate an offset calibration residence point of each residence segment;
and a judging module 405, configured to judge, by using a recursive algorithm, the offset calibration dwell point of each dwell segment and the second user signaling data, and output corresponding dwell points of a plurality of dwell segments of each user.
In one embodiment, the adding module may include:
The ordering module is used for ordering the first user signaling data according to user ordering and signaling starting time to obtain ordered first user signaling data;
the cleaning module is used for cleaning and removing the sequenced first user signaling data based on a preset field in the signaling data to obtain third user signaling data;
the adding module may be further specifically configured to:
and adding target signaling to each user in the third user signaling data to obtain second user signaling data.
In one embodiment, the first computing module may include:
The first calculation unit is used for calculating and obtaining the spherical distance of the base station corresponding to each two adjacent signaling corresponding to each user in the second user signaling data according to the second user signaling data;
the first comparison unit is used for comparing the spherical distances of the base stations corresponding to every two adjacent signaling corresponding to each user in the second user signaling data according to the distance threshold by taking the signaling starting time of the user as a sequence, and dividing the spherical distances to obtain a plurality of possible residence sections meeting the distance threshold;
And the first judging unit is used for judging the plurality of possible resident segments according to the time threshold value to obtain a plurality of resident segments corresponding to the users.
In one embodiment, the first computing unit may be specifically configured to:
Calculating, through Formula 1 and Formula 2, the spherical distance between the base stations corresponding to every two adjacent signaling of each user in the second user signaling data;
where lat_i is the latitude of the base station position associated with the i-th signaling of the user, distance is the distance between the base station positions of the i-th and j-th signaling records of the user, the unit of distance is km, and 6371000 is the Earth's radius in meters.
In one embodiment, the second computing module may be specifically configured to:
Calculating the signaling duration of each signaling in a target resident segment according to signaling data of the target resident segment and the target user, wherein the target resident segment is any one of a plurality of resident segments, and the target user is a user corresponding to the target resident segment;
Calculating the time weight of each signaling according to the signaling duration of each signaling in the target residence section and the signaling data of the target user;
And calculating an offset calibration residence point of the target user in the target residence section according to the signaling data of the target user, the signaling duration of each signaling in the target residence section and the time weight.
In one embodiment, the judging module may be specifically configured to:
Sorting and organizing the offset calibration residence points of the residence segments and the second user signaling data by user and by signaling time to obtain a user signaling data set;
Inputting the user signaling data set into the recursive algorithm, and identifying corresponding residence points of a plurality of residence segments of each user according to the distance threshold and the time threshold.
Based on the method and the device for identifying the residence point of the user provided in the foregoing embodiments, the embodiment of the present application further provides an electronic device 500, as shown in fig. 5:
comprises a processor 501, a memory 502, and a computer program stored in the memory 502 and capable of running on the processor 501, which when executed by the processor 501, realizes the processes of the above-mentioned embodiments of the method for identifying the user residence point, and can achieve the same technical effects.
In particular, the processor 501 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
Memory 502 may include mass storage for data or instructions. By way of example, and not limitation, memory 502 may comprise a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or Universal Serial Bus (USB) drive, or a combination of two or more of the foregoing. Memory 502 may include removable or non-removable (or fixed) media, where appropriate. Memory 502 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 502 is a non-volatile solid state memory.
In particular embodiments, the memory may include Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to a method in accordance with an aspect of the application.
The processor 501 implements any one of the above-described embodiments of the method for identifying a user resident point by reading and executing computer program instructions stored in the memory 502.
In one example, the electronic device may also include a communication interface 503 and a bus 510. As an example, as shown in fig. 5, a processor 501, a memory 502, and a communication interface 503 are connected and communicate with each other via a bus 510.
The communication interface 503 is mainly used to implement communication between each module, apparatus, unit and/or device in the embodiments of the present application.
Bus 510 includes hardware, software, or both that couple the components of the electronic device to each other. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of the above. Bus 510 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
The embodiment of the application also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the processes of the above embodiments of the method for identifying a user residence point and can achieve the same technical effects; to avoid repetition, no further description is given here. The computer-readable storage medium may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be understood that the application is not limited to the particular arrangements and processes described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the present application are not limited to the specific steps described and shown; those skilled in the art may make various changes, modifications and additions, or change the order of the steps, after appreciating the spirit of the present application.
The functional blocks shown in the above block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, a functional block may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the application are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the Internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this disclosure describe some methods or systems based on a series of steps or devices. The present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, or may be performed in a different order from the order in the embodiments, or several steps may be performed simultaneously.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the foregoing, only the specific embodiments of the present application are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present application is not limited thereto, and any equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present application, and they should be included in the scope of the present application.

Claims (10)

Application CN202411679064.9A | Priority date 2024-11-21 | Filing date 2024-11-21 | User residence point identification method, device, electronic device and storage medium | Pending | CN119521138A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202411679064.9A | 2024-11-21 | 2024-11-21 | User residence point identification method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202411679064.9A | 2024-11-21 | 2024-11-21 | User residence point identification method, device, electronic device and storage medium

Publications (1)

Publication Number | Publication Date
CN119521138A | 2025-02-25

Family

ID=94660972

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202411679064.9A (Pending) | CN119521138A (en): User residence point identification method, device, electronic device and storage medium | 2024-11-21 | 2024-11-21

Country Status (1)

Country | Link
CN (1) | CN119521138A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20140136414A1 (en) * | 2006-03-17 | 2014-05-15 | Raj Abhyanker | Autonomous neighborhood vehicle commerce network and community
CN105142106A (en) * | 2015-07-29 | 2015-12-09 | Southwest Jiaotong University | Traveler home-work location identification and trip chain depicting method based on mobile phone signaling data
CN105682025A (en) * | 2016-01-05 | 2016-06-15 | Chongqing University of Posts and Telecommunications | User residing location identification method based on mobile signaling data
CN107133318A (en) * | 2017-05-03 | 2017-09-05 | Beijing Municipal Transportation Information Center | A kind of population recognition methods based on mobile phone signaling data
CN111770452A (en) * | 2020-05-27 | 2020-10-13 | Sun Yat-sen University | A mobile phone signaling stop point identification method based on personal travel trajectory characteristics
CN114707616A (en) * | 2022-04-29 | 2022-07-05 | Alibaba Cloud Computing Co., Ltd. | Method, device and equipment for identifying incidental relationship between tracks
CN115412852A (en) * | 2021-05-27 | 2022-11-29 | China Mobile Information Technology Co., Ltd. | Method and system for determining motion trail of mobile terminal
CN118427457A (en) * | 2023-09-06 | 2024-08-02 | Xi'an University of Science and Technology | A hotspot identification method and system for trajectory data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wei Yaguo et al.: "Timers and Their Components" (《定时器及其元件》), 30 September 1985, National Defense Industry Press, pages 44-45 *

Similar Documents

Publication | Title
CN108574934B (en) | Pseudo base station positioning method and device
CN109951306B (en) | Alarm processing method, device, equipment and medium
CN114564370B (en) | Method, device and equipment for determining alarm threshold value and computer storage medium
CN110958599B (en) | One-machine multi-card user distinguishing method based on track similarity
CN115174355A (en) | Generation method of fault root cause positioning model, and fault root cause positioning method and device
CN109995886B (en) | Domain name identification method, device, equipment and medium
CN116009034B (en) | Satellite signal capturing method, baseband signal processing unit, receiver and medium
CN111414528B (en) | Method and device for determining equipment identification, storage medium and electronic equipment
CN116109145A (en) | Risk assessment method, risk assessment device, risk assessment terminal and risk assessment storage medium for vehicle driving route
CN109618281B (en) | A kind of identification method and device of high-speed rail community
CN119521138A (en) | User residence point identification method, device, electronic device and storage medium
CN110933601A (en) | Target area determination method, device, equipment and medium
CN109982392B (en) | Neighbor cell configuration method, device, device and medium of base station cell
CN112288050B (en) | Abnormal behavior identification method and device, terminal equipment and storage medium
CN111090642B (en) | Method for cleaning signaling data of mobile phone
CN111355630B (en) | Block chain performance quantitative analysis method, system, equipment and storage medium
CN116434534B (en) | Road congestion index determination method, device, equipment, medium and product
CN111431869A (en) | Method and device for acquiring vulnerability information heat
CN115209342B (en) | Subway driver identification method, system and readable storage medium
CN112566013B (en) | Target equipment positioning method, device, equipment and computer storage medium
CN111385731B (en) | Train user positioning method, device, equipment and medium
CN113808755B (en) | Method for training prediction model of infected population, prediction method, device and equipment
CN116567672B (en) | High-speed rail user identification method, device and storage medium
CN114579547B (en) | Data processing method, device, equipment, storage medium and computer program product
CN110798271A (en) | A Neural Network-Based Pseudo-Path Elimination Method in Wireless Channel Measurement

Legal Events

Date | Code | Title | Description
| PB01 | Publication | Publication
| SE01 | Entry into force of request for substantive examination | Entry into force of request for substantive examination
