Detailed Description
The present disclosure is described in further detail below with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit it. In addition, for convenience of description, only the portions relevant to the invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
As the protection of user privacy receives increasing emphasis, it has become increasingly difficult to identify terminal devices using unique, non-modifiable identifiers associated with the terminal devices. The present disclosure proposes the following technical idea: the device source of a piece of application access information is identified by combining (i) device identification information included in the application access information, which may not be strongly unique, may be tampered with by technical means, and/or is authorization-friendly, with (ii) network identification information, also included in the application access information, about the network where the terminal device associated with the application access information is located.
Based on this technical idea, embodiments of the present disclosure provide a device source identification method and apparatus. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 is a flowchart illustrating a device source identification method 100 according to an embodiment of the present disclosure. As shown in fig. 1, the device source identification method 100 may include: step S102, acquiring a plurality of pieces of application access information, wherein each piece of application access information in the plurality of pieces of application access information comprises network identification information about a network where a terminal device associated with the piece of application access information is located and device identification information about the terminal device associated with the piece of application access information; step S104, dividing the plurality of pieces of application access information into one or more application information sets based on the pieces of network identification information included in the plurality of pieces of application access information, wherein the application access information in each application information set includes the same network identification information; and step S106, for each of the one or more application information sets, dividing the application information set into one or more application information subsets based on a plurality of pieces of device identification information included in the application access information in the application information set, and identifying the application access information in each application information subset as originating from the same terminal device, wherein the application access information in each application information subset includes the same device identification information.
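The flow of steps S102 to S106 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each piece of application access information is a dictionary with a single `network_id` field and a single `device_id` field and that "the same" means exact equality; these field names and the sample records are hypothetical and do not come from the disclosure.

```python
from collections import defaultdict


def partition(records, key):
    """Group records whose key(record) values are equal, in first-seen order."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return list(groups.values())


def identify_device_sources(records):
    """Split by network identification (S104), then by device identification (S106)."""
    subsets = []
    for info_set in partition(records, key=lambda r: r["network_id"]):
        subsets.extend(partition(info_set, key=lambda r: r["device_id"]))
    return subsets


# Step S102: hypothetical pieces of application access information.
records = [
    {"name": "r1", "network_id": "IP1", "device_id": "MAC1"},
    {"name": "r2", "network_id": "IP1", "device_id": "MAC1"},
    {"name": "r3", "network_id": "IP1", "device_id": "MAC2"},
    {"name": "r4", "network_id": "IP2", "device_id": "MAC1"},
]

result = [[r["name"] for r in subset] for subset in identify_device_sources(records)]
print(result)  # [['r1', 'r2'], ['r3'], ['r4']]
```

In this sketch, r4 carries the same device identifier as r1 and r2 but arrives from a different network, so it is not attributed to the same terminal device; this is the effect of combining the two kinds of identification information.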
In the device source identification method according to the embodiment of the present disclosure, the device source of the application access information is identified in combination with both the network identification information and the device identification information included in the application access information, which helps to avoid erroneous identification due to technical tampering or non-uniqueness of the device identifier when the device source of the application access information is identified with a single device identifier, resulting in higher reliability of the identification result.
In some embodiments, pieces of application access information of one or more terminal devices for one or more applications may be obtained from one or more application servers. Each piece of application access information may include one or more pieces of network identification information and one or more pieces of device identification information. Each piece of network identification information may include at least one of the following network information items: internet protocol address (IP), base station number (CID), location area code (LAC), basic service set identifier (BSSID), and wireless network address (WiFi-MAC). Each piece of device identification information may include at least one of the following device information items: system start-up time (uptime), Bluetooth address, network card physical address (MAC), advertisement identifier (IDFA/IDFV), and International Mobile Equipment Identity (IMEI).
Here, most of the network information items and the device information items are authorization-friendly (i.e., in most cases the user permits them to be acquired), so interruption of the device source identification flow because an individual item is missing from the application access information can be avoided.
In some embodiments, step S104 may include: for any two pieces of application access information in the plurality of pieces of application access information, determining whether the two pieces include at least one network information item in common and whether the values of the at least one common network information item are identical; if so, determining that the two pieces of application access information include the same network identification information, and dividing them into the same application information set. For example, application access information 1 includes an internet protocol address IP1, a base station number CID1, and a wireless network address WiFi-MAC1, and application access information 2 includes the internet protocol address IP1 and the wireless network address WiFi-MAC1. Since application access information 1 and 2 both include the two network information items internet protocol address and wireless network address, and the values of these two items are the same, it can be determined that the two pieces of application access information include the same network identification information, and the two pieces can be divided into the same application information set.
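The pairwise rule can be sketched as follows. The sketch assumes that every network information item present in both pieces must agree, and it uses a union-find structure to turn pairwise matches into sets; the disclosure only describes the comparison of two pieces, so the transitive grouping and the dictionary keys (`ip`, `cid`, `wifi_mac`) are assumptions made for illustration.

```python
def same_identification(a, b):
    """True if a and b share at least one item and agree on every shared item."""
    shared = a.keys() & b.keys()
    return bool(shared) and all(a[k] == b[k] for k in shared)


def group_by_pairwise_match(items):
    """Return lists of indices; indices in one list are linked by pairwise matches."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if same_identification(items[i], items[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Application access information 1 and 2 from the example, plus a hypothetical third.
info1 = {"ip": "IP1", "cid": "CID1", "wifi_mac": "WiFi-MAC1"}
info2 = {"ip": "IP1", "wifi_mac": "WiFi-MAC1"}
info3 = {"ip": "IP2", "wifi_mac": "WiFi-MAC1"}

groups = group_by_pairwise_match([info1, info2, info3])
print(groups)  # [[0, 1], [2]]
```

The same helper could be applied to device information items in step S106, since the pairwise rule described there has the same shape.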
Alternatively, in some embodiments, step S104 may include: for each piece of application access information in the plurality of pieces of application access information, calculating a weighted sum of the network information items included in that piece; and dividing the pieces of application access information whose weighted sums are equal into the same application information set, wherein the same network information item has the same weight value in different pieces of application access information, and different network information items have different weight values within the same piece of application access information. Here, it is not necessary to compare the individual network information items of the plurality of pieces of application access information one by one, so application access information originating from the same network segment can be identified relatively quickly.
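The disclosure does not state how a network information item such as an IP address is turned into a number before weighting. The sketch below assumes one possible approach: each value is mapped to an integer with CRC32 and multiplied by a per-item weight. The weight values, the `numeric_code` helper, and the dictionary keys are all hypothetical.

```python
import zlib
from collections import defaultdict

# Hypothetical weights: the same item always gets the same weight,
# and different items get different weights.
WEIGHTS = {"ip": 1, "cid": 3, "lac": 5, "bssid": 7, "wifi_mac": 11}


def numeric_code(value):
    """Map an item value to an integer (assumption: CRC32 of its UTF-8 bytes)."""
    return zlib.crc32(value.encode("utf-8"))


def weighted_sum(info):
    """Weighted sum over the network information items present in one piece."""
    return sum(WEIGHTS[item] * numeric_code(value) for item, value in info.items())


def group_by_weighted_sum(infos):
    """Return lists of indices whose weighted sums are equal."""
    groups = defaultdict(list)
    for index, info in enumerate(infos):
        groups[weighted_sum(info)].append(index)
    return list(groups.values())


infos = [
    {"ip": "IP1", "bssid": "BSSID1"},
    {"bssid": "BSSID1", "ip": "IP1"},
    {"ip": "IP1", "bssid": "BSSID2"},
]
groups = group_by_weighted_sum(infos)
print(groups)  # [[0, 1], [2]]
```

Under these assumptions each piece is reduced to one number in a single pass, which is where the speed advantage over pairwise comparison comes from. Two caveats of this particular sketch: pieces that carry different subsets of items will generally produce different sums, and distinct inputs could in principle collide on the same sum.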
Alternatively, in some embodiments, step S104 may include: dividing the pieces of application access information into one or more application information groups based on at least one network information item of internet protocol address, base station number and location area code in the pieces of network identification information included in the pieces of application access information; each of the one or more application information groups is further partitioned based on at least one of a basic service set identifier and a wireless network address in a plurality of pieces of network identification information included in the plurality of pieces of application access information to partition the one or more application information groups into one or more application information sets. Here, the application program access information from the same larger network section is first identified based on at least one network information item in the internet protocol address, the base station number and the location area code, and then the application program access information from the same smaller network section in the application program access information from the same larger network section is identified based on at least one network information item in the basic service set identifier and the wireless network address, so that the accuracy of identifying the application program access information from the same terminal device is potentially improved.
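The two-stage division can be sketched as a coarse split followed by a fine split. The sketch assumes exact matching on a tuple of items and treats a missing item as its own value (`None`); the record layout and key names are hypothetical.

```python
from collections import defaultdict

COARSE_ITEMS = ("ip", "cid", "lac")    # larger network segment
FINE_ITEMS = ("bssid", "wifi_mac")     # smaller network segment


def split(records, items):
    """Group records that agree on every listed item (missing items count as None)."""
    buckets = defaultdict(list)
    for record in records:
        key = tuple(record["network"].get(item) for item in items)
        buckets[key].append(record)
    return list(buckets.values())


def split_into_sets(records):
    """First form application information groups, then split each into sets."""
    info_sets = []
    for group in split(records, COARSE_ITEMS):
        info_sets.extend(split(group, FINE_ITEMS))
    return info_sets


records = [
    {"name": "a", "network": {"ip": "IP1", "bssid": "BSSID1"}},
    {"name": "b", "network": {"ip": "IP1", "bssid": "BSSID2"}},
    {"name": "c", "network": {"ip": "IP2", "bssid": "BSSID1"}},
    {"name": "d", "network": {"ip": "IP1", "bssid": "BSSID1"}},
]
names = [[r["name"] for r in s] for s in split_into_sets(records)]
print(names)  # [['a', 'd'], ['b'], ['c']]
```

Record c shares a BSSID with a and d but sits behind a different IP address, so the coarse stage already separates it before the fine stage runs.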
In some embodiments, step S106 may include: for any two pieces of application access information in each application information set, determining whether the two pieces include at least one device information item in common and whether the values of the at least one common device information item are identical; if so, determining that the two pieces of application access information include the same device identification information, and dividing them into the same application information subset. For example, application access information 1 includes a system start-up time uptime1, a Bluetooth address 1, and a network card physical address MAC1, and application access information 2 includes the Bluetooth address 1 and the network card physical address MAC1. Since application access information 1 and 2 both include the two device information items Bluetooth address and network card physical address, and the values of these two items are the same, it can be determined that the two pieces of application access information include the same device identification information, and the two pieces can be divided into the same application information subset.
Alternatively, in some embodiments, step S106 may include: for each application information set, calculating a weighted sum of the device information items included in each piece of application access information in the application information set; and dividing the pieces of application access information whose weighted sums are equal into the same application information subset, wherein the same device information item has the same weight value in different pieces of application access information, and different device information items have different weight values within the same piece of application access information. Here, it is not necessary to compare the individual device information items of the plurality of pieces of application access information one by one, so application access information originating from the same terminal device can be identified relatively quickly.
Alternatively, in some embodiments, step S106 may include: for each application information set, dividing the application information set into one or more application information clusters based on a system start time in a plurality of pieces of device identification information included in application access information in the application information set; one or more application information clusters are reorganized into one or more application information subsets based on at least one device information item of Bluetooth address, network card physical address, advertisement identifier, and International Mobile Equipment identity in a plurality of pieces of device identification information included in the application access information in the application information set. The application program access information from the terminal equipment with the same system starting time is firstly identified based on the system starting time, then the application program access information from the same terminal equipment is identified based on at least one equipment information item in the Bluetooth address, the network card physical address, the advertisement identifier and the international mobile equipment identification code, multiple association relations among the equipment information items included in the application program access information are fully considered, and the equipment identification is carried out based on a plurality of equipment information items, so that interruption of the equipment source identification process caused by lack of individual equipment information items is reduced.
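One possible reading of "reorganized" is that clusters formed by system start-up time are merged whenever they share a value for any of the other device information items. The sketch below implements that reading only; the merge rule, the key names, and the sample records are assumptions rather than the disclosed implementation.

```python
from collections import defaultdict

MERGE_ITEMS = ("bluetooth", "mac", "idfa", "imei")


def cluster_by_uptime(records):
    """Stage 1: one cluster per distinct system start-up time."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record.get("uptime")].append(record)
    return list(clusters.values())


def shares_item(cluster_a, cluster_b):
    """True if any record in each cluster has the same value for a merge item."""
    values_a = {(i, r[i]) for r in cluster_a for i in MERGE_ITEMS if i in r}
    values_b = {(i, r[i]) for r in cluster_b for i in MERGE_ITEMS if i in r}
    return bool(values_a & values_b)


def reorganize(clusters):
    """Stage 2: merge clusters that are linked by a shared device item value."""
    subsets = []
    for cluster in clusters:
        merged = list(cluster)
        remaining = []
        for subset in subsets:
            if shares_item(merged, subset):
                merged = subset + merged
            else:
                remaining.append(subset)
        subsets = remaining + [merged]
    return subsets


records = [
    {"name": "r1", "uptime": "uptime1", "mac": "MAC1"},
    {"name": "r2", "uptime": "uptime2", "mac": "MAC1", "idfa": "IDFA1"},
    {"name": "r3", "uptime": "uptime1", "idfa": "IDFA1"},
    {"name": "r4", "uptime": "uptime3", "mac": "MAC2"},
]
subsets = reorganize(cluster_by_uptime(records))
names = [[r["name"] for r in s] for s in subsets]
print(names)  # [['r1', 'r3', 'r2'], ['r4']]
```

Here r2 reports a different start-up time (for example, after a reboot) but is still attributed to the same device as r1 and r3 through the shared MAC1 value, while r4 stays separate.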
In the device source identification method according to the embodiment of the present disclosure, the network identification information and the device identification information included in the application access information are used in series to identify the application access information originating from the same terminal device, so the reliability of the identification result is higher. In addition, by identifying application access information associated with the same terminal device, information about APP usage on the terminal device can be analyzed to obtain a user profile (e.g., usage habits and interests) of the terminal device user, thereby facilitating targeted recommendation of APPs and/or specific functions in APPs to the terminal device user. Further, after the terminal device from which the application access information originates is identified, the large volume of information generated by the terminal device can be analyzed using big data technology, so that a more accurate user profile of the terminal device user is obtained.
Fig. 2 is a diagram illustrating an example process of identifying a device source of application access information using the device source identification method illustrated in fig. 1. A device source identification method according to an embodiment of the present disclosure is further described below with reference to examples. In the example shown in fig. 2:
In step S102, seven pieces of application access information (the seven pieces of application access information are identified as CUID1, CUID2, CUID3, CUID4, APP A, APP B, APP C, respectively) are acquired.
In step S104, the seven pieces of application access information are divided into a plurality of application information groups according to the internet protocol addresses included in the application access information CUID1, CUID2, CUID3, CUID4, APP A, APP B, APP C. Specifically, since the application access information CUID1, CUID2, APP A, APP B, and APP C include the same internet protocol address IP1, and the application access information CUID3 and CUID4 include the same internet protocol address IP2, the application access information CUID1, CUID2, APP A, APP B, and APP C are divided into the same application information group 1, and the application access information CUID3 and CUID4 are divided into the same application information group 2 (not shown). Then, the application information group 1 is divided into a plurality of application information sets according to the basic service set identifier included in the application access information in the application information group. Specifically, since the application access information CUID1, APP A, APP B, APP C includes the same basic service set identifier BSSID1 and the application access information CUID2 includes the basic service set identifier BSSID2, the application access information CUID1, APP A, APP B, APP C is divided into the application information set 1 and the application access information CUID2 is divided into the application information set 2 (not shown).
In step S106, the application information set 1 is first divided into a plurality of application information clusters according to the system start-up time included in the application access information in the application information set. Specifically, since the application access information CUID1 and APP C include the same system start-up time uptime1, the application access information APP A includes the system start-up time uptime2, and the application access information APP B includes the system start-up time uptime3, the application access information CUID1 and APP C are divided into the application information cluster 1, the application access information APP A is divided into the application information cluster 2, and the application access information APP B is divided into the application information cluster 3. Then, the application information clusters are reorganized according to the network card physical address and/or the advertisement identifier included in the application access information in the application information clusters 1-3 to finally obtain an application information subset 1, which comprises the application access information CUID1, APP A, APP B, and APP C. That is, the application access information CUID1, APP A, APP B, and APP C are identified as coming from the same terminal device.
It should be added that the application information group 2 only includes the application access information CUID3 and CUID4, and whether they come from the same terminal device may be determined directly based on the device identification information included in the application access information CUID3 and CUID4. In addition, since the application information set 2 includes only the application access information CUID2, it is not necessary to perform the subsequent identification processing based on the device identification information included in the application access information CUID2. Here, for clarity and brevity, the application information group 2 and the application information set 2 are not shown in fig. 2.
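The partitions in this example can be traced with a short sketch that uses only the values stated above (internet protocol address, basic service set identifier, and system start-up time). The dictionary layout and key names are illustrative assumptions. The final reorganization of clusters 1-3 is not traced, because the example does not give the network card physical address or advertisement identifier values.

```python
from collections import defaultdict

# Values taken from the example; items not stated in the text are omitted.
RECORDS = {
    "CUID1": {"ip": "IP1", "bssid": "BSSID1", "uptime": "uptime1"},
    "CUID2": {"ip": "IP1", "bssid": "BSSID2"},
    "CUID3": {"ip": "IP2"},
    "CUID4": {"ip": "IP2"},
    "APP A": {"ip": "IP1", "bssid": "BSSID1", "uptime": "uptime2"},
    "APP B": {"ip": "IP1", "bssid": "BSSID1", "uptime": "uptime3"},
    "APP C": {"ip": "IP1", "bssid": "BSSID1", "uptime": "uptime1"},
}


def group_by(ids, field):
    """Group record identifiers by the value of one field."""
    groups = defaultdict(list)
    for record_id in ids:
        groups[RECORDS[record_id][field]].append(record_id)
    return dict(groups)


groups = group_by(RECORDS, "ip")                # S104, first stage
sets = group_by(groups["IP1"], "bssid")         # S104, second stage on group 1
clusters = group_by(sets["BSSID1"], "uptime")   # S106, clustering on set 1

print(groups)    # {'IP1': ['CUID1', 'CUID2', 'APP A', 'APP B', 'APP C'], 'IP2': ['CUID3', 'CUID4']}
print(sets)      # {'BSSID1': ['CUID1', 'APP A', 'APP B', 'APP C'], 'BSSID2': ['CUID2']}
print(clusters)  # {'uptime1': ['CUID1', 'APP C'], 'uptime2': ['APP A'], 'uptime3': ['APP B']}
```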
It should be noted that, in the practical application process, the network identification information acquired in different scenarios is not limited to include only the above-mentioned network information items, but may include other network information items that are easy to obtain user authorization and relatively fixed; the device identification information obtained under the different operating systems may also include other device information items.
Fig. 3 is a block diagram illustrating a device source identification apparatus 200 according to an embodiment of the present disclosure. As shown in fig. 3, the device source identification apparatus 200 includes an information acquisition unit 202, a network identification unit 204, and a device identification unit 206. The information acquisition unit 202 is configured to acquire a plurality of pieces of application access information, each piece of application access information including network identification information about a network where a terminal device associated with the piece of application access information is located and device identification information about the terminal device associated with the piece of application access information. The network identification unit 204 is configured to divide the plurality of pieces of application access information into one or more application information sets based on the pieces of network identification information included in the plurality of pieces of application access information, the application access information in each application information set including the same network identification information. The device identification unit 206 is configured to, for each application information set, divide the application information set into one or more application information subsets based on the pieces of device identification information included in the application access information in the application information set, and identify the application access information in each application information subset as originating from the same terminal device, wherein the application access information in each application information subset includes the same device identification information.
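As a rough software analogy, the three units can be modeled as interchangeable callables held by one object, so that any of the optional implementations of units 204 and 206 could be plugged in. The class name, the `exact_match` helper, and the record fields below are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]
Splitter = Callable[[List[Record]], List[List[Record]]]


@dataclass
class DeviceSourceIdentificationApparatus:
    acquire: Callable[[], List[Record]]   # information acquisition unit 202
    split_by_network: Splitter            # network identification unit 204
    split_by_device: Splitter             # device identification unit 206

    def run(self) -> List[List[Record]]:
        """Acquire records, split into sets by network, then subsets by device."""
        subsets: List[List[Record]] = []
        for info_set in self.split_by_network(self.acquire()):
            subsets.extend(self.split_by_device(info_set))
        return subsets


def exact_match(field: str) -> Splitter:
    """Build a splitter that groups records with equal values for one field."""
    def split(records: List[Record]) -> List[List[Record]]:
        buckets: Dict[str, List[Record]] = {}
        for record in records:
            buckets.setdefault(record[field], []).append(record)
        return list(buckets.values())
    return split


apparatus = DeviceSourceIdentificationApparatus(
    acquire=lambda: [
        {"ip": "IP1", "mac": "MAC1"},
        {"ip": "IP1", "mac": "MAC2"},
        {"ip": "IP1", "mac": "MAC1"},
    ],
    split_by_network=exact_match("ip"),
    split_by_device=exact_match("mac"),
)
subsets = apparatus.run()
print([len(s) for s in subsets])  # [2, 1]
```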
In some optional implementations of this embodiment, each piece of network identification information in the plurality of pieces of network identification information may include at least one of the following plurality of network information items: an internet protocol address, a base station number, a location area code, a basic service set identifier, and a wireless network address.
In some optional implementations of this embodiment, each piece of device identification information in the plurality of pieces of device identification information includes at least one of the following items of device information: system start-up time, bluetooth address, network card physical address, advertisement identifier, and international mobile equipment identification code.
In some optional implementations of the present embodiment, the network identification unit 204 may be further configured to: for any two pieces of application access information in the plurality of pieces of application access information, determine whether the two pieces include at least one network information item in common and whether the values of the at least one common network information item are identical; if so, determine that the two pieces of application access information include the same network identification information, and divide them into the same application information set.
In some optional implementations of the present embodiment, the network identification unit 204 may be further configured to: for each piece of application access information in the plurality of pieces of application access information, calculate a weighted sum of the network information items included in that piece; and divide the pieces of application access information whose weighted sums are equal into the same application information set, wherein the same network information item has the same weight value in different pieces of application access information, and different network information items have different weight values within the same piece of application access information.
In some optional implementations of the present embodiment, the network identification unit 204 may be further configured to: divide the plurality of pieces of application access information into one or more application information groups based on at least one network information item of internet protocol address, base station number, and location area code in the pieces of network identification information; and further divide each of the one or more application information groups based on at least one of the basic service set identifier and the wireless network address in the pieces of network identification information, so as to divide the one or more application information groups into one or more application information sets.
In some optional implementations of the present embodiment, the device identification unit 206 may be further configured to: for any two pieces of application access information in each application information set, judging whether the two pieces of application access information comprise at least one identical equipment information item and judging whether the information of the at least one identical equipment information item is identical, if so, judging that the two pieces of application access information comprise identical equipment identification information, and dividing the two pieces of application access information into the same application information subset.
In some optional implementations of the present embodiment, the device identification unit 206 may be further configured to: for each application information set, calculate a weighted sum of the device information items included in each piece of application access information in the application information set; and divide the pieces of application access information whose weighted sums are equal into the same application information subset, wherein the same device information item has the same weight value in different pieces of application access information, and different device information items have different weight values within the same piece of application access information.
In some optional implementations of the present embodiment, the device identification unit 206 may be further configured to: for each application information set, dividing the application information set into one or more application information clusters based on a system start time in a plurality of pieces of device identification information included in application access information in the application information set; one or more application information clusters are reorganized into one or more application information subsets based on at least one device information item of Bluetooth address, network card physical address, advertisement identifier, and International Mobile Equipment identity in a plurality of pieces of device identification information included in the application access information in the application information set.
In this embodiment, for the technical effects brought by the device source identification apparatus 200 and its corresponding units, reference may be made to the related descriptions in the embodiment corresponding to fig. 1, which are not repeated herein.
In some embodiments, the device source identification method 100 and the device source identification apparatus 200 may be implemented as part of a cloud platform using cloud technology. For example, the device source identification apparatus 200 may be provided on a cloud platform that provides service information and/or commodity information and/or application programs for users, acquire application access information of a large number of terminal devices for one or more application programs using the device source identification method 100, and identify application access information originating from the same terminal device according to the network identification information and device identification information included in the application access information. Based on the application access information from the same terminal device, a user profile of the terminal device user can be obtained, so that at least one of an APP, a specific function in an APP, commodity information, and service information can be recommended to the terminal device user in a targeted manner.
Fig. 4 is a block diagram illustrating an exemplary computer system 300 that can be used in connection with the exemplary embodiments. A computer system 300 suitable for use in implementing embodiments of the present disclosure is described below in connection with fig. 4. It should be appreciated that the computer system 300 illustrated in fig. 4 is only one example and should not be taken as limiting the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the computer system 300 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various suitable actions and processes in accordance with programs stored in a read-only memory (ROM) 302 or loaded from a storage device 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data required for the operation of the computer system 300. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
In general, the following devices may be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, touchpad, camera, accelerometer, gyroscope, etc.; an output device 307 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; a storage device 308 including, for example, flash memory (Flash Card) or the like; and a communication device 309. The communication device 309 may allow the computer system 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 illustrates a computer system 300 having various devices, it should be understood that not all illustrated devices are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 4 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure provide a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method 100 shown in fig. 1. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 309, or installed from the storage device 308, or installed from the ROM 302. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 301.
It should be noted that, the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (Radio Frequency), and the like, or any suitable combination thereof.
The computer readable medium may be embodied in the computer system 300, or may exist alone without being assembled into the computer system 300. The computer readable medium carries one or more programs which, when executed by the computer system, cause the computer system to: acquire a plurality of pieces of application access information, each piece of application access information including network identification information about a network where a terminal device associated with the piece of application access information is located and device identification information about the terminal device associated with the piece of application access information; divide the plurality of pieces of application access information into one or more application information sets based on the pieces of network identification information included in the plurality of pieces of application access information, the application access information in each application information set including the same network identification information; and for each of the one or more application information sets, divide the application information set into one or more application information subsets based on a plurality of pieces of device identification information included in the application access information in the application information set, and identify the application access information in each application information subset as originating from the same terminal device, wherein the application access information in each application information subset includes the same device identification information.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor including an information acquisition unit, a network identification unit, and a device identification unit. In some cases, the names of these units do not constitute a limitation on the units themselves.
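As a rough illustration of a software implementation of such units, the following Python sketch composes three unit objects inside a processor object. The unit names come from the paragraph above. Everything else is an assumption made for illustration and is not specified by the disclosure: the method names `acquire`, `divide`, `identify` and `run`, the dictionary keys `network_id` and `device_id`, and the mapping of each unit to one step of the method.

```python
from collections import defaultdict


class InformationAcquisitionUnit:
    """Acquires pieces of application access information (assumed role)."""

    def __init__(self, source):
        self._source = source  # any iterable of record dictionaries

    def acquire(self):
        return list(self._source)


class NetworkIdentificationUnit:
    """Divides records into sets sharing one network identification."""

    def divide(self, records):
        sets = defaultdict(list)
        for record in records:
            sets[record["network_id"]].append(record)
        return dict(sets)


class DeviceIdentificationUnit:
    """Divides one set into subsets sharing one device identification."""

    def identify(self, info_set):
        subsets = defaultdict(list)
        for record in info_set:
            subsets[record["device_id"]].append(record)
        return dict(subsets)


class Processor:
    """Hosts the three units and runs them in sequence."""

    def __init__(self, acquisition, network, device):
        self.acquisition = acquisition
        self.network = network
        self.device = device

    def run(self):
        records = self.acquisition.acquire()
        sets = self.network.divide(records)
        # Result: network_id -> device_id -> records from one terminal device.
        return {net: self.device.identify(s) for net, s in sets.items()}


processor = Processor(
    InformationAcquisitionUnit([
        {"network_id": "net-A", "device_id": "dev-1"},
        {"network_id": "net-A", "device_id": "dev-2"},
        {"network_id": "net-B", "device_id": "dev-1"},
    ]),
    NetworkIdentificationUnit(),
    DeviceIdentificationUnit(),
)
result = processor.run()
```

Because each unit exposes only a single method here, a unit could be renamed or swapped for a hardware-backed equivalent without changing the processor, which is consistent with the remark that the unit names are not limiting.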
The foregoing description is only of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the invention. An example is a technical solution formed by mutually substituting the above-described features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.