CN103051637A - User identification method and device - Google Patents

User identification method and device

Info

Publication number
CN103051637A
Authority
CN
China
Prior art keywords
user
cookie
websites
information
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105932268A
Other languages
Chinese (zh)
Inventor
罗峰
黄苏支
李娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING IZP TECHNOLOGIES Co Ltd
Original Assignee
BEIJING IZP TECHNOLOGIES Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING IZP TECHNOLOGIES Co Ltd
Priority to CN2012105932268A
Publication of CN103051637A
Pending


Abstract

The invention provides a user identification method and device. The method comprises: selecting, from network access log messages, the messages that share the same user identifier within a set time period and in which the user identifier corresponds one-to-one with the website COOKIE of a set website; obtaining four-tuple information from the selected messages, wherein the four-tuple information comprises the domain name of a website accessed by the user indicated by the user identifier, the user identifier, a COOKIE field of the accessed website, and the value of the COOKIE field; compiling statistics on the four-tuple information to obtain access information for each accessed website; filtering the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user; and establishing a correspondence between the obtained COOKIE field and the user identifier, and identifying the user according to the correspondence. The invention achieves accurate identification of the user.

Description

User identification method and device
Technical field
The present invention relates to the field of network technology, and in particular to a user identification method and device.
Background technology
As Internet technology becomes more widely used, much of people's daily work and entertainment takes place on the network. In many network application scenarios, the server needs to identify the user when the user accesses the network. Widely used user identification methods include identification by IP address, identification by ADSL account, and identification by website COOKIE.
When users are identified by IP address, IP resources are limited and the number of Internet users keeps growing, so broadband users are now generally assigned dynamic IP addresses to avoid a user holding a scarce IP address while not online. As a result, the same IP address may be used by several different users, and the user cannot be identified accurately.
When users are identified by ADSL account, the account is usually combined with the UA (the browser version the user uses). However, the ADSL+UA combination is too coarse-grained: one ADSL account may correspond to several users, so it is likewise difficult to determine the user accurately.
When users are identified by website COOKIE, a website uses COOKIE technology to track the user's behavior on that website. However, each website can only track the user's access behavior on that website or on third-party websites that embed its COOKIE. It cannot track user behavior across the whole Internet, and therefore also cannot identify the user accurately.
In short, none of the above approaches identifies the user accurately. Only when the server accurately identifies the client and the user can it carry out subsequent high-precision operations, such as precisely targeted advertising, thereby reducing the cost and volume of information exchange and improving the user's network access experience.
Summary of the invention
The invention provides a user identification method and device to solve the problem that existing schemes cannot identify the user accurately.
To solve the above problem, the invention discloses a user identification method, comprising: obtaining, from network access log messages, the messages that have the same user identifier within a preset time period and in which the user identifier corresponds one-to-one with the website COOKIE of a preset website; obtaining four-tuple information from the obtained messages, wherein the four-tuple information comprises the domain name of a website accessed by the user indicated by the user identifier, the user identifier, a COOKIE field of the accessed website, and the value of the COOKIE field; compiling statistics on the four-tuple information to obtain access information for each accessed website; filtering the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user; and establishing a correspondence between the obtained COOKIE field and the user identifier, and identifying the user according to the correspondence.
Preferably, the user identifier comprises a user account and a browser version number; the access information of an accessed website comprises: the domain name of the accessed website, the page views of the domain name, the page-view share of the domain name, the COOKIE field of the accessed website, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers.
Preferably, before the step of filtering the access information of each accessed website, the method further comprises: sorting the access information of each accessed website by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers.
Preferably, the step of filtering the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user comprises: filtering the access information of each accessed website using the page views of the domain name, or mutual information, or information gain, to obtain the COOKIE field that each accessed website uses to identify a single user.
Preferably, the user identification method further comprises: obtaining, from the obtained messages, website access information for two websites that have the same COOKIE name, wherein the website access information comprises: the COOKIE field of the two websites, the value of the COOKIE field, the domain names of the two websites, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers; sorting the access information of the two websites by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers; filtering the sorted access information to determine whether the two websites use the same COOKIE field; and if so, establishing an association between the two websites, and identifying the user according to the association and the correspondence between the COOKIE field and the user identifier.
Preferably, the user identification method further comprises: if the COOKIE field used to identify the user contains the values of several COOKIE fields, associating the values of those COOKIE fields with one another; and identifying the user according to the association and the correspondence between the COOKIE field and the user identifier.
To solve the above problem, the invention also discloses a user identification device, comprising: a first acquisition module, configured to obtain, from network access log messages, the messages that have the same user identifier within a preset time period and in which the user identifier corresponds one-to-one with the website COOKIE of a preset website; a second acquisition module, configured to obtain four-tuple information from the obtained messages, wherein the four-tuple information comprises the domain name of a website accessed by the user indicated by the user identifier, the user identifier, a COOKIE field of the accessed website, and the value of the COOKIE field; a third acquisition module, configured to compile statistics on the four-tuple information to obtain access information for each accessed website; a fourth acquisition module, configured to filter the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user; and an identification module, configured to establish a correspondence between the obtained COOKIE field and the user identifier and to identify the user according to the correspondence.
Preferably, the user identifier comprises a user account and a browser version number; the access information of an accessed website comprises: the domain name of the accessed website, the page views of the domain name, the page-view share of the domain name, the COOKIE field of the accessed website, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers.
Preferably, the user identification device further comprises: a sorting module, configured to sort, before the fourth acquisition module filters the access information of each accessed website, the access information of each accessed website by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers.
Preferably, the fourth acquisition module is configured to filter the access information of each accessed website using the page views of the domain name, or mutual information, or information gain, to obtain the COOKIE field that each accessed website uses to identify a single user.
Compared with prior art, the present invention has the following advantages:
In the present invention, the one-to-one correspondence between the COOKIE of a preset website and the user identifier is determined first, so as to establish whether the user indicated by the user identifier is a single user, and the messages of that single user are then obtained.
The preset website is generally a website with a large number of visits whose COOKIE is known and unique. Whether the user identifier corresponds one-to-one with the COOKIEs of such websites determines whether the user indicated by the user identifier is a single user.
Once the user indicated by the user identifier is determined to be a single user, the network access log messages of the websites accessed by this user undergo a series of extraction, statistics, and filtering steps, yielding the COOKIE field that each accessed website can use to uniquely identify the user. A correspondence between this COOKIE field and the user identifier is then established, and in subsequent accesses the website can identify the user according to this correspondence.
Network access log messages contain a large amount of information, and some COOKIE fields in them can in fact serve as identity information. Exploiting this property of COOKIE information, the scheme of the invention automatically parses out, from the large amount of information, the COOKIE fields that can serve as identity information, then analyzes and determines which COOKIE field of each website uniquely identifies the user, establishes the correspondence between these COOKIE fields and the user identifier, and identifies the user accurately using this correspondence.
The invention thus solves the problem that existing schemes cannot identify the user accurately and achieves accurate user identification. A website can then carry out subsequent high-precision operations based on this recognition result, such as precisely targeted advertising, which reduces the cost and volume of information exchange and improves the user's network access experience.
Description of drawings
Fig. 1 is a flow chart of the steps of a user identification method according to Embodiment one of the invention;
Fig. 2 is a flow chart of the steps of a user identification method according to Embodiment two of the invention;
Fig. 3 is a flow chart of the steps of a user identification method according to Embodiment three of the invention;
Fig. 4 is a structural block diagram of a user identification device according to Embodiment four of the invention.
Embodiment
To make the above objects, features, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.
Embodiment one
Referring to Fig. 1, a flow chart of the steps of a user identification method according to Embodiment one of the invention is shown.
The user identification method of this embodiment comprises the following steps:
Step S102: obtain, from network access log messages, the messages that have the same user identifier within a preset time period and in which the user identifier corresponds one-to-one with the website COOKIE of a preset website.
The preset time period can be set by those skilled in the art according to actual conditions, for example one day, several hours, or several days; the invention places no restriction on this. The preset websites are usually websites with a large number of visits that users use frequently and whose identity-bearing COOKIE field can be obtained by statistical analysis, such as Baidu, Google, and Taobao. The COOKIE of such a website is known and unique, so whether the user identifier corresponds one-to-one with these websites' COOKIEs determines whether the user indicated by the user identifier is a single user, and the messages of that single user can then be obtained.
Step S104: obtain four-tuple information from the obtained messages.
The four-tuple information comprises the domain name of a website accessed by the user indicated by the user identifier, the user identifier, a COOKIE field of the accessed website, and the value of the COOKIE field.
The four-tuple information obtained may include four-tuples for the preset websites above, and may also include four-tuples for websites other than the preset websites.
Step S106: compile statistics on the four-tuple information to obtain the access information of each accessed website.
The access information of a website is information related to visits to that website, such as PV (Page View) information and UV (Unique Visitor) information. For example, from the access messages of each website, the number of distinct COOKIE values under the website and the number of distinct user identifiers (userid) can be counted to obtain the access information of each accessed website.
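The counting described for step S106 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the sample rows, the host names, and the function name `site_stats` are hypothetical, and each four-tuple is assumed to stand for one page view.

```python
from collections import Counter, defaultdict

# Hypothetical four-tuples: (host, userid, cookie_field, cookie_value).
# Assumption: every tuple counts as one page view.
tuples = [
    ("a.example", "ADSL1+UA1", "sid", "x1"),
    ("a.example", "ADSL1+UA1", "sid", "x1"),
    ("a.example", "ADSL2+UA2", "sid", "x2"),
    ("b.example", "ADSL1+UA1", "uid", "y1"),
]

def site_stats(rows):
    """Per host: page views, share of all page views, distinct users, distinct cookie values."""
    pv = Counter(host for host, _, _, _ in rows)
    users = defaultdict(set)
    values = defaultdict(set)
    for host, userid, field, value in rows:
        users[host].add(userid)
        values[host].add((field, value))
    total = sum(pv.values())
    return {
        host: {
            "pv": pv[host],
            "pv_share": pv[host] / total,
            "distinct_users": len(users[host]),
            "distinct_cookie_values": len(values[host]),
        }
        for host in pv
    }

stats = site_stats(tuples)
```

The distinct-user count plays the role of UV here; a real log would need its own rule for what counts as a page view.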
Step S108: filter the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user.
The purpose of filtering the access information is to filter out the COOKIE fields that cannot identify a user.
Step S110: establish a correspondence between the obtained COOKIE field and the user identifier, and identify the user according to the correspondence.
For example, suppose the above process determines that the COOKIE ID in the COOKIE field of a certain website uniquely identifies a user. The correspondence established is then UID-COOKIE ID, where UID denotes the user identifier. If a user's user identifier is ADSL1+UA1 and the COOKIE ID is 4500, then when this user accesses the website, the server obtains the website's access message and, on reading a COOKIE ID of 4500 from it, can determine that the corresponding user is ADSL1+UA1.
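The lookup in this example could look like the sketch below. The table layout, the host name `shop.example`, the field name `COOKIE_ID`, and the function `identify` are all assumptions made for illustration; only the values ADSL1+UA1 and 4500 come from the text.

```python
# Hypothetical correspondence table: (host, cookie field, cookie value) -> user ID.
correspondence = {
    ("shop.example", "COOKIE_ID", "4500"): "ADSL1+UA1",
}

def identify(host, cookies, table):
    """Return the user ID for the first cookie field found in the table, else None."""
    for field, value in cookies.items():
        user_id = table.get((host, field, value))
        if user_id is not None:
            return user_id
    return None

# A later access message from the same site carrying COOKIE_ID=4500.
user = identify("shop.example", {"lang": "zh", "COOKIE_ID": "4500"}, correspondence)
```

Keying the table on the host as well as the field keeps identically named fields on unrelated sites from colliding.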
In this embodiment, the one-to-one correspondence between the COOKIE of a preset website and the user identifier is determined first, so as to establish whether the user indicated by the user identifier is a single user, and the messages of that single user are then obtained. The preset website is generally a website with a large number of visits whose COOKIE is known and unique, so whether the user identifier corresponds one-to-one with the COOKIEs of such websites determines whether the user indicated by the user identifier is a single user. Once the user is determined to be a single user, the network access log messages of the websites accessed by this user undergo a series of extraction, statistics, and filtering steps, yielding the COOKIE field that each accessed website can use to uniquely identify the user. A correspondence between this COOKIE field and the user identifier is then established, and in subsequent accesses the website can identify the user according to this correspondence. Network access log messages contain a large amount of information, and some COOKIE fields in them can in fact serve as identity information. Exploiting this property, the scheme automatically parses out the COOKIE fields that can serve as identity information, determines which COOKIE field of each website uniquely identifies the user, establishes the correspondence between these COOKIE fields and the user identifier, and identifies the user accurately using this correspondence. This embodiment thus solves the problem that existing schemes cannot identify the user accurately and achieves accurate user identification. A website can then carry out subsequent high-precision operations based on this recognition result, such as precisely targeted advertising, which reduces the cost and volume of information exchange and improves the user's network access experience.
Embodiment two
Referring to Fig. 2, a flow chart of the steps of a user identification method according to Embodiment two of the invention is shown.
The user identification method of this embodiment comprises the following steps:
Step S202: the server obtains, from network access log messages, the messages that have the same user identifier within a preset time period and in which the user identifier corresponds one-to-one with the website COOKIE of a preset website.
In this embodiment, the user identifier comprises a user account and a browser version number. The user account includes, but is not limited to, an Internet access account such as an ADSL account or ADSL account+UA, a user mailbox, and the like.
Taking one user identifier as an example, suppose the user identifier is ADSL1+UA1; the server then obtains from the network access log messages all messages whose user identifier is ADSL1+UA1. The user account is of course not limited to the ADSL account, and other user accounts are equally applicable.
Then, among the messages with the same user identifier, the messages within the preset time period can be analyzed first, for example the messages of the current day for this user identifier. Because the COOKIE of a preset website is generally representative and can identify the user accessing that website, the server checks whether the user identifier corresponds one-to-one with the website COOKIE of the preset website. If it does, the user identifier can be taken to represent a single user, and the messages of this single user are obtained. If the user identifier and the website COOKIE do not correspond one-to-one, the user identifier does not represent only one user, and messages of this type can be discarded.
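The one-to-one check described above can be sketched as follows. The record layout, the cookie values, and the function name `single_user_ids` are hypothetical; the sketch assumes the anchor-site cookie value has already been pulled out of each message.

```python
from collections import defaultdict

# Hypothetical log records: (userid, host, anchor cookie value).
records = [
    ("ADSL1+UA1", "baidu.com", "B-111"),
    ("ADSL1+UA1", "baidu.com", "B-111"),
    ("ADSL2+UA2", "baidu.com", "B-222"),
    ("ADSL2+UA2", "baidu.com", "B-333"),  # second cookie: likely a shared line
]

def single_user_ids(rows):
    """Keep user IDs that map one-to-one to a cookie value on every anchor host."""
    by_user = defaultdict(set)    # (userid, host) -> cookie values
    by_cookie = defaultdict(set)  # (host, cookie value) -> userids
    for userid, host, value in rows:
        by_user[(userid, host)].add(value)
        by_cookie[(host, value)].add(userid)
    # Reject a user ID seen with several cookie values on one host ...
    rejected = {u for (u, _), vals in by_user.items() if len(vals) > 1}
    # ... and every user ID that shares a cookie value with another user ID.
    rejected |= {u for users in by_cookie.values() if len(users) > 1 for u in users}
    return {u for u, _, _ in rows} - rejected

singles = single_user_ids(records)
```

Messages whose user ID is not in `singles` would be the ones the text says can be discarded.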
Step S204: the server obtains four-tuple information from the obtained messages.
The four-tuple information comprises the domain name of a website accessed by the user indicated by the user identifier, the user identifier, a COOKIE field of the accessed website, and the value of the COOKIE field.
Step S206: the server compiles statistics on the four-tuple information to obtain the access information of each accessed website.
In this embodiment, the access information of an accessed website comprises: the domain name of the accessed website, the PV (page views) of the domain name, the PV share of the domain name, the COOKIE field of the accessed website, the number of PVs with the same user identifier, the number of PVs with different user identifiers, the ratio of PVs with different user identifiers, the number of UVs (unique visitors) with the same user identifier, the number of UVs with different user identifiers, and the ratio of UVs with different user identifiers.
Step S208: the server sorts the access information of each accessed website by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers.
This step is preferred. Sorting the website access information allows the subsequent filtering to be carried out more effectively and quickly. Filtering directly without sorting is of course also feasible.
Step S210: the server filters the access information of each accessed website to obtain the COOKIE field that each accessed website uses to identify the user.
Preferably, the server can filter the access information of each accessed website using the page views of the domain name, or mutual information, or information gain, to obtain the COOKIE field that each accessed website uses to identify a single user.
Mutual information and information gain are two terms from information theory, and these measures are commonly considered in text classification. Mutual information measures only the dependence between two sets of events; in this embodiment it is used to judge whether a COOKIE field is an effective field by calculating the correspondence between the user identifier (user ID) and the COOKIE value. Information gain is an asymmetric relation used to measure the difference between two probability distributions; here it is applied by setting different thresholds on the ratio for the user identifier and the ratio for the COOKIE. In other words, probabilities can be calculated along several dimensions to determine the cookie field of each accessed website that uniquely identifies the user.
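The text does not give a formula or a threshold, so the sketch below simply applies the standard definition of mutual information to observed (user ID, COOKIE value) pairs for one cookie field. The sample data and the function name `mutual_information` are hypothetical.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information (in bits) between user IDs and cookie values."""
    n = len(pairs)
    joint = Counter(pairs)
    users = Counter(u for u, _ in pairs)
    values = Counter(v for _, v in pairs)
    # I(U;V) = sum over (u, v) of p(u,v) * log2(p(u,v) / (p(u) * p(v)))
    return sum(
        (c / n) * math.log2(c * n / (users[u] * values[v]))
        for (u, v), c in joint.items()
    )

# Hypothetical observations for one cookie field on one site.
identifying = [("u1", "a"), ("u1", "a"), ("u2", "b"), ("u2", "b")]
uninformative = [("u1", "zh"), ("u2", "zh"), ("u1", "zh"), ("u2", "zh")]

mi_identifying = mutual_information(identifying)
mi_uninformative = mutual_information(uninformative)
```

A field whose value tracks the user ID scores high, while a field such as a language setting that is the same for everyone scores zero. Where to put the cutoff is left open by the source.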
Step S212: the server establishes a correspondence between the obtained COOKIE field and the user identifier, and identifies the user according to the correspondence.
Preferably, if the COOKIE field used to identify the user contains the values of several COOKIE fields, the values of those COOKIE fields are associated with one another, and the user is identified according to the association and the correspondence between the COOKIE field and the user identifier. For example, suppose the COOKIE field that uniquely identifies the user contains both the user's mailbox and the user's personal account. An association (that is, a correspondence) between the mailbox and the personal account can then be established, so that the user identifier can be obtained through the mailbox, through the personal account, or through the combination of the two.
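One way to hold such associations is a union-find structure, sketched below. The patent does not name a data structure, so this choice is an assumption, as are the identifier strings and the function name `link_identifiers`. Each input group is assumed to be non-empty.

```python
def link_identifiers(groups):
    """Union identifiers that appear together; return identifier -> representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for group in groups:
        first = find(group[0])
        for other in group[1:]:
            parent[find(other)] = first
    return {x: find(x) for x in parent}

# Hypothetical identifiers seen together in one cookie field or message.
groups = [
    ["mail:user@example.com", "acct:alice01"],
    ["acct:alice01", "uid:ADSL1+UA1"],
    ["mail:other@example.com", "uid:ADSL2+UA2"],
]
rep = link_identifiers(groups)
```

With this table, a mailbox, an account, or the two together resolve to the same representative, which matches the three lookup routes the paragraph describes.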
It should be noted that in some cases two or more websites are associated with one another, such as Taobao and Tmall, and may use the same COOKIE to identify the user. The scheme of this embodiment can then further comprise: obtaining, from the obtained messages, website access information for two websites that have the same COOKIE name, wherein the website access information comprises: the COOKIE field of the two websites, the value of the COOKIE field, the domain names of the two websites, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers; sorting the access information of the two websites by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers; filtering the sorted access information to determine whether the two websites use the same COOKIE field; and if so, establishing an association between the two websites and identifying the user according to the association and the correspondence between the COOKIE field and the user identifier. For example, suppose Taobao and Tmall have the same COOKIE name. After the website access information of the two websites is obtained, sorted, and filtered, it is determined that the two websites use the same COOKIE field, for example the same COOKIE ID, and a correspondence between Taobao and Tmall is established. Then, whether the user accesses Taobao or Tmall, the server can use this COOKIE field in the access message, together with the correspondence between the user identifier and the website COOKIE, to determine the user identifier and hence the user accessing the website.
This embodiment, on the basis of accurately identifying the user, also establishes associations between websites that use the same COOKIE field to identify the user and, when a COOKIE field has several values, between those values. This further unifies the organization and management of the associated information, improves the efficiency of user identification, and saves the resources occupied by the information.
Embodiment three
Referring to Fig. 3, a flow chart of the steps of a user identification method according to Embodiment three of the invention is shown.
The user identification method of this embodiment comprises the following steps:
Step S302: obtain the COOKIE fields with which a website can uniquely identify the user.
COOKIE information contains a large number of unlabeled COOKIE fields, such as YYID=D4A741CDC23704C21D8E99150E94F9C4; SUID=96F7B43C26420A0A4EA94973000407AE, and these fields can serve as identity information. In this step, these fields are automatically parsed out of the COOKIE, and it is then determined which COOKIE field of each website uniquely identifies the user.
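Splitting such a COOKIE string into fields can be sketched as below, using the example string from the text. The function name `parse_cookie` is hypothetical, and the parsing is deliberately simplified: it splits on `;` and the first `=` and ignores quoting rules.

```python
def parse_cookie(header):
    """Split a Cookie header value into a {field: value} dict."""
    fields = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:  # skip empty parts and parts without '='
            fields[name] = value
    return fields

raw = "YYID=D4A741CDC23704C21D8E99150E94F9C4; SUID=96F7B43C26420A0A4EA94973000407AE"
fields = parse_cookie(raw)
```

Each resulting field name becomes a candidate whose values are later tested for whether they identify a user.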
This step specifically comprises:
Step S3022: obtain one day of original ptu logs (logs generated from messages) and, using the online/offline information, label each log entry with its ADSL ID.
Step S3024: from the above log data, select the data of all single users.
A single user here means that the ADSL ID+UA is the same and that, within the day's access records, the COOKIEs of several mainstream websites, such as the baidu cookie and the taobao cookie, correspond one-to-one with the ADSL ID+UA.
Step S3026: from the single-user data, extract the four-tuple information { host, userid, cookie field, value of the cookie field }.
Here, host denotes the domain name of the website the user accesses; userid denotes the user identifier, which in this embodiment is ADSL ID+UA; cookie field denotes a COOKIE field of the website the user accesses; and value of the cookie field denotes the value of that COOKIE field.
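The extraction in step S3026 can be sketched as follows. The message layout and its key names (`host`, `adsl_id`, `ua`, `cookie`) are hypothetical, since the text does not describe the log format; only the four-tuple shape comes from the source.

```python
def four_tuples(message):
    """Yield (host, userid, cookie_field, cookie_value) for each cookie in a message."""
    userid = f"{message['adsl_id']}+{message['ua']}"  # userid is ADSL ID+UA
    for part in message["cookie"].split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            yield (message["host"], userid, name, value)

# Hypothetical log message; the key names are illustrative only.
message = {
    "host": "www.example.com",
    "adsl_id": "ADSL1",
    "ua": "UA1",
    "cookie": "sid=abc123; lang=zh",
}
rows = list(four_tuples(message))
```

One message yields one tuple per cookie field, so later statistics can be grouped by host and field.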
Step S3028: from the four-tuple information, compute the following statistics { host, host pv, host pv share, cookie field, number of pv with the same user id, number of pv with different user id, ratio of pv with different user id, number of uv with the same user id, number of uv with different user id, ratio of uv with different user id }.
Here, host denotes the domain name of the website the user accesses, as above; pv denotes page views; uv denotes unique visitors; and user id denotes the user identifier, that is, userid.
Step S30210: using the data from the previous step, sort the entries for each website by the user id different-count ratio and the unique user different ratio, then filter by a host pv threshold, or by mutual information or information gain, to find which COOKIE field under each host is used to identify a unique user.
The user id different-count ratio is expressed as numerator/denominator: the numerator is, for one cookie field under one website, the number of distinct user ids that each correspond to a single cookie value; the denominator is, for that cookie field under that website, the number of all distinct user ids, including user ids that correspond to several cookie values.
The unique user different ratio is expressed as numerator/denominator: the numerator is, for one cookie field under one website, the number of cookie values that each correspond to a single user id; the denominator is, for that cookie field under that website, the number of all distinct cookie values.
The data of the COOKIE fields that identify users can yield a large amount of information such as mailboxes and accounts.
Through the above process, the COOKIE fields that each website can use to identify users are found automatically.
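The two ratios defined in step S30210 can be computed as in the sketch below, which follows the numerator and denominator definitions given above. The sample rows and the function name `uniqueness_ratios` are hypothetical.

```python
from collections import defaultdict

def uniqueness_ratios(rows):
    """For each (host, cookie field) return (user-id ratio, cookie-value ratio)."""
    values_of = defaultdict(lambda: defaultdict(set))  # key -> userid -> cookie values
    users_of = defaultdict(lambda: defaultdict(set))   # key -> cookie value -> userids
    for host, userid, field, value in rows:
        values_of[(host, field)][userid].add(value)
        users_of[(host, field)][value].add(userid)
    result = {}
    for key in values_of:
        by_user, by_value = values_of[key], users_of[key]
        # user ids with exactly one cookie value / all distinct user ids
        user_ratio = sum(len(v) == 1 for v in by_user.values()) / len(by_user)
        # cookie values with exactly one user id / all distinct cookie values
        value_ratio = sum(len(u) == 1 for u in by_value.values()) / len(by_value)
        result[key] = (user_ratio, value_ratio)
    return result

# Hypothetical four-tuples: (host, userid, cookie_field, cookie_value).
rows = [
    ("a.example", "u1", "sid", "x1"),
    ("a.example", "u2", "sid", "x2"),
    ("a.example", "u1", "lang", "zh"),
    ("a.example", "u2", "lang", "zh"),
]
ratios = uniqueness_ratios(rows)
```

A field is a plausible user identifier only when both ratios are close to 1; the `lang` field passes the first ratio but fails the second because one value is shared by every user.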
Step S304: the COOKIE field that is used for identifying unique user of obtaining the inter-network station.
At many different web sites, such as " BAIDUID=" relevant information has been used in a lot of websites, at present except BAIDUID, also has the rule of some other websites such as taobao, cnzz, can use the method for statistics, excavate the similar inter-network station identifications of BAIDUID user's COOKIE field.
This step specifically comprises:
Step S3042: from the unique-user data, extract the following data: {cookie field, cookie field value, host1, host2, number of pv with the same user id, number of pv with different user ids, ratio of pv with different user ids, number of uv with the same user id, number of uv with different user ids, ratio of uv with different user ids}.
Preferably, the hosts, such as host1 and host2, may be represented by their top-level domains; user id is the userid.
Step S3044: sort by the user id difference ratio and the unique-user difference ratio, then filter by a host pv threshold, or alternatively by mutual information or information gain, and determine for each pair of websites whether they share a common COOKIE field.
Step S3046: if a pair of websites shares a common COOKIE field, merge the hosts according to the data from the previous step so as to find the COOKIE fields that may appear on multiple websites. This yields data entries of the form {cookie field, cookie field value, [host1, host2, ...]}.
Through the above process, it can be found automatically which websites use the same COOKIE-setting scheme, and these websites can then be associated with one another through the shared COOKIE fields.
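The host merging of step S3046 can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `shared_cookie_hosts` and the tuple layout are hypothetical, and the sorting and pv, mutual information, or information gain filtering of step S3044 is replaced by the plain condition that a cookie field and value pair belongs to a single user id.

```python
from collections import defaultdict


def shared_cookie_hosts(quads, min_hosts=2):
    """Return {(cookie_field, cookie_value): [host1, host2, ...]}.

    quads is an iterable of (host, userid, cookie_field, cookie_value)
    tuples. A pair is kept only if it is seen on at least min_hosts hosts
    and always with the same user id (a simplification of step S3044).
    """
    hosts_by_cookie = defaultdict(set)
    users_by_cookie = defaultdict(set)
    for host, userid, field, value in quads:
        hosts_by_cookie[(field, value)].add(host)
        users_by_cookie[(field, value)].add(userid)

    return {
        key: sorted(hosts)
        for key, hosts in hosts_by_cookie.items()
        if len(hosts) >= min_hosts and len(users_by_cookie[key]) == 1
    }
```

Each entry of the result corresponds to one data entry {cookie field, cookie field value, [host1, host2, ...]} described in the step.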
Step S306: establish the association between the COOKIE fields that identify users and the user identifier.
This comprises:
A single COOKIE field may contain two or more identifiers of the user at the same time (for example, both the user's mailbox address and account information). The relationship between these identifiers can be established, and from it the correspondence between these identifiers and the user identifier;
Bridging through cross-site COOKIEs to establish mappings among more identifiers. For example, if the cna cookie field of Taobao has the same value as the cna field of Tmall and Alibaba, the other cookie fields under these websites can be associated through this shared cna field;
Using the account information obtained in step S302 (including general information such as user mailbox addresses and login accounts), establishing the associations among different IDs (that is, user identifiers, userid) and different COOKIEs;
Using refer tree information to establish the correspondence among different COOKIEs and accounts. For example, a refer tree is built from the jump relationships of the user's website visits, such as a user jumping from a Baidu search to Sina and then from Sina to other websites, and the correspondence among different COOKIEs and accounts is then established according to this refer tree.
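One way to hold the associations described above is a union-find (disjoint-set) structure, in which every identifier, whether a user id, a cookie value, a mailbox address, or an account, is a node and every observed relationship merges two groups. The source does not name a data structure; the class `IdentifierLinker` and the (kind, value) tuple encoding below are assumptions made for illustration.

```python
class IdentifierLinker:
    """Groups identifiers that are believed to belong to the same user."""

    def __init__(self):
        self._parent = {}

    def _find(self, item):
        # Register unseen identifiers as their own group.
        self._parent.setdefault(item, item)
        root = item
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path at the root.
        while self._parent[item] != root:
            self._parent[item], item = root, self._parent[item]
        return root

    def link(self, a, b):
        """Record that identifiers a and b refer to the same user."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_user(self, a, b):
        """Return True if a and b have been linked, directly or indirectly."""
        return self._find(a) == self._find(b)


linker = IdentifierLinker()
# A user id seen together with a cross-site cna cookie value.
linker.link(("userid", "u1"), ("cna", "abc"))
# The same cna value seen on another site next to a mailbox address.
linker.link(("cna", "abc"), ("mail", "someone@example.com"))
```

With these two links, the user id and the mailbox address fall into the same group even though they never appeared in the same record, which is the bridging effect the cross-site COOKIE is meant to provide.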
Step S308: identify the user according to the established association between the COOKIE fields and the user identifier.
Through this embodiment, the COOKIE fields that serve as identity information can be parsed automatically out of a large amount of information. Analysis then determines which COOKIE fields in each website can uniquely identify the user, the correspondence between these COOKIE fields and the user identifier is established, and the user is accurately identified using this correspondence. This solves the problem that existing schemes cannot accurately identify users and achieves the effect of accurate user identification.
Embodiment four
Referring to Fig. 4, a structural block diagram of a user identification device according to Embodiment four of the present invention is shown.
The user identification device of this embodiment comprises: a first acquisition module 402, configured to obtain, from network access log messages, the messages that have the same user identifier within a set time period and in which the user identifier corresponds one-to-one with the website COOKIE of a set website; a second acquisition module 404, configured to obtain four-tuple information from the obtained messages, wherein the four-tuple information comprises the domain name of the website accessed by the user indicated by the user identifier, the user identifier, the COOKIE field of the accessed website, and the value of the COOKIE field; a third acquisition module 406, configured to compile statistics on the four-tuple information to obtain the access information of each accessed website; a fourth acquisition module 408, configured to filter the access information of each accessed website to obtain the COOKIE field of each accessed website that uniquely identifies the user; and an identification module 410, configured to establish the correspondence between the obtained COOKIE field and the user identifier and to identify the user according to the correspondence.
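The work of the second acquisition module 404 can be illustrated with a short sketch that turns one log message into four-tuples. The function name `extract_quads` and the assumption that the message exposes the host, the user identifier, and a raw Cookie header string are hypothetical; the source does not specify the log format.

```python
def extract_quads(host, userid, cookie_header):
    """Split a raw Cookie header into (host, userid, field, value) tuples.

    cookie_header is assumed to look like "name1=value1; name2=value2".
    Pairs without "=" or with an empty name are skipped.
    """
    quads = []
    for part in cookie_header.split(";"):
        # Split on the first "=" only, since cookie values may contain "=".
        name, sep, value = part.strip().partition("=")
        if sep and name:
            quads.append((host, userid, name, value))
    return quads
```

The resulting tuples are the input on which the third acquisition module 406 would compute its per-website statistics.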
Preferably, the user identifier comprises a user account and a browser version number. The access information of an accessed website comprises: the domain name of the accessed website, the page views of the domain name, the page-view share of the domain name, the COOKIE field of the accessed website, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers.
Preferably, the user identification device of this embodiment further comprises: a sorting module 412, configured to sort the access information of each accessed website by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers, before the fourth acquisition module 408 filters the access information of each accessed website.
Preferably, the fourth acquisition module 408 is configured to filter the access information of each accessed website using the page views of the domain name, or mutual information, or information gain, to obtain the COOKIE field of each accessed website that uniquely identifies the user.
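The source does not give the formula used for the information gain filter. One plausible reading, sketched here purely as an assumption, is the reduction in uncertainty about the user identifier once the value of a cookie field is known, IG = H(userid) - H(userid | cookie value), combined with a minimum page-view count. The names `information_gain` and `select_fields`, and the use of the number of observations of a field as a stand-in for the host pv, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(pairs):
    """IG of the cookie value about the user id.

    pairs is a list of (cookie_value, userid), one entry per page view,
    for a single cookie field on a single host.
    """
    pairs = list(pairs)
    h_user = entropy(Counter(user for _value, user in pairs).values())
    users_by_value = defaultdict(Counter)
    for value, user in pairs:
        users_by_value[value][user] += 1
    h_conditional = sum(
        sum(users.values()) / len(pairs) * entropy(users.values())
        for users in users_by_value.values()
    )
    return h_user - h_conditional


def select_fields(observations, min_pv, min_gain):
    """Keep cookie fields seen often enough and informative enough.

    observations maps a cookie field to its list of (cookie_value, userid).
    """
    return [
        field
        for field, pairs in observations.items()
        if len(pairs) >= min_pv and information_gain(pairs) >= min_gain
    ]
```

A field whose value is the same for every user (a language preference, say) yields a gain near zero and is dropped, while a field whose value determines the user yields a gain equal to the entropy of the user distribution.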
Preferably, the user identification device of this embodiment further comprises: a first association module 414, configured to obtain, from the obtained messages, the website access information of two websites that have the same COOKIE name, wherein the website access information comprises: the COOKIE field of the two websites, the value of the COOKIE field, the domain names of the two websites, the number of page views with the same user identifier, the number of page views with different user identifiers, the ratio of page views with different user identifiers, the number of unique visitors with the same user identifier, the number of unique visitors with different user identifiers, and the ratio of unique visitors with different user identifiers; to sort the access information of the two websites by the ratio of page views with different user identifiers and/or the ratio of unique visitors with different user identifiers; to filter the sorted access information and determine whether the two websites use the same COOKIE field; and, if so, to establish an association between the two websites and identify the user according to the association and the correspondence between the COOKIE field and the user identifier.
Preferably, the user identification device of this embodiment further comprises: a second association module 416, configured to, if the COOKIE field that identifies the user contains the values of multiple COOKIE fields, associate the values of the multiple COOKIE fields with one another, and to identify the user according to the association and the correspondence between the COOKIE field and the user identifier.
The user identification device of this embodiment is used to implement the corresponding user identification methods of the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar the embodiments may be referred to one another. Because the device embodiment is substantially similar to the method embodiments, its description is relatively brief; for the relevant parts, refer to the description of the method embodiments.
The user identification method and device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. At the same time, those of ordinary skill in the art may, in accordance with the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the content of this description should not be construed as limiting the invention.

Claims (10)

CN2012105932268A | 2012-12-31 | 2012-12-31 | User identification method and device | Pending | CN103051637A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2012105932268A CN103051637A (en) | 2012-12-31 | 2012-12-31 | User identification method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2012105932268A CN103051637A (en) | 2012-12-31 | 2012-12-31 | User identification method and device

Publications (1)

Publication Number | Publication Date
CN103051637A (en) | 2013-04-17

Family

ID=48064136

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2012105932268A Pending CN103051637A (en) | User identification method and device | 2012-12-31 | 2012-12-31

Country Status (1)

Country | Link
CN (1) | CN103051637A (en)

Legal Events

Date | Code | Title | Description
 | C06 | Publication |
 | PB01 | Publication |
 | C10 | Entry into substantive examination |
 | SE01 | Entry into force of request for substantive examination |
 | AD01 | Patent right deemed abandoned | Effective date of abandoning: 20161116
 | C20 | Patent right or utility model deemed to be abandoned or is abandoned |
