Detailed Description
Referring to FIG. 1, an exemplary network architecture of the present application is suited to an e-commerce platform scenario and includes a terminal device 80, a security server 81, and a store server 82.
The terminal device 80 may be a smartphone, personal computer, notebook, tablet, or the like, and may be used to trigger network requests for the application services provided by the store server 82, such as registering a store. When a user performs a store registration event, the terminal device 80 monitors the user's registration behavior and sends the store sample generated by that behavior to the store server 82 for storage. The store sample includes store registration attribute information corresponding to the store registration event, such as registration time, registration mode (mobile phone/mailbox), mobile phone number, mailbox, IP, and device fingerprint, as well as store registration behavior feature data corresponding to the event. The store registration behavior feature data may be divided into offline features and real-time features.
The offline features are statistics computed for each IP, mobile phone number, device fingerprint, or user agent (UA) over a specific period of time: the number of registration applications, the registration pass rate, and the number of registrations identified as violations. Generating them involves analyzing the registration behavior of the same IP, mobile phone number, device fingerprint, or UA within that time frame. These offline features are produced by a T+1 data processing flow, meaning that the data are processed and summarized on the day after the event occurs. Specifically, the offline features are stored in four tables whose primary keys are the IP, the mobile phone number, the device fingerprint, and the user agent, respectively; each table stores the number of registration applications, the registration pass rate, and the number of identified illegal registrations associated with its primary key.
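As a rough illustration of the T+1 aggregation described above, the following sketch builds one such table (keyed by IP) from a day's registration records. The record layout, the field names (`ip`, `passed`, `violation`), and the function `build_offline_table` are assumptions made for this example and are not defined by the present application.

```python
from collections import defaultdict

def build_offline_table(records, key):
    """Aggregate one day's registration records by a primary key.

    `records` is a list of dicts; `key` is a hypothetical field name such as
    "ip", "phone", "fingerprint" or "user_agent".
    """
    table = defaultdict(lambda: {"applications": 0, "passed": 0, "violations": 0})
    for rec in records:
        row = table[rec[key]]
        row["applications"] += 1
        row["passed"] += 1 if rec["passed"] else 0
        row["violations"] += 1 if rec["violation"] else 0
    # Derive the pass rate once all counts for a key are known.
    result = {}
    for k, row in table.items():
        result[k] = dict(row, pass_rate=row["passed"] / row["applications"])
    return result

# Illustrative records only; a real pipeline would read the previous day's log.
records = [
    {"ip": "10.0.0.1", "passed": True, "violation": False},
    {"ip": "10.0.0.1", "passed": False, "violation": True},
    {"ip": "10.0.0.2", "passed": True, "violation": False},
]
ip_table = build_offline_table(records, "ip")
```

Calling the same function with a different key would produce the corresponding table for mobile phone number, device fingerprint, or user agent.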
The real-time features are generated dynamically. They focus on the store registration behavior itself and are closely tied to the circumstances in which the registration takes place. They include: the total duration of the registration process; the time the user takes to fill in the verification code; the time taken to fill in the mailbox or mobile phone number; the number of times the mailbox or mobile phone number is verified; whether the IP address of the registered mobile phone number is consistent with the registration IP address; whether the geographic location information filled in is consistent with the geographic location to which the registration IP address belongs; and, for the same IP, mobile phone number, device fingerprint, or user agent, the number of registration applications, the registration pass rate, and the number of registrations identified as illegal within a short time window before the present registration event. These real-time features are typically computed on the fly as the registration activity occurs.
In addition, the store sample includes a store unique identifier, such as a store ID, and rule-breaking registration tag data, which indicates whether the store corresponding to the store sample is a rule-breaking registration store. The rule-breaking registration tag data can be marked manually or obtained through prediction by the trained graph neural network model; the tag data corresponding to each store is packaged in the store sample and stored in the database of the store server 82. The store samples generated when the user performs store registration on the terminal device 80 are packaged into data packets and transmitted to the store server 82 over a secure protocol.
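One possible in-memory shape for a store sample is sketched below. The class name `StoreSample` and its field names are hypothetical and chosen only to mirror the components listed above; the application does not prescribe a data layout.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StoreSample:
    """Hypothetical container mirroring the parts of a store sample."""
    store_id: str                      # store unique identifier
    attributes: dict                   # registration time, mode, phone, mailbox, IP, fingerprint
    offline_features: dict = field(default_factory=dict)   # T+1 statistics
    realtime_features: dict = field(default_factory=dict)  # computed at registration time
    label: Optional[int] = None        # 1 = violation, 0 = not a violation, None = unlabeled

sample = StoreSample(
    store_id="store-001",
    attributes={"mode": "mailbox", "ip": "10.0.0.1"},
    realtime_features={"total_duration_s": 12.5},
)
```

A sample that has not yet been labeled manually or by the model keeps `label=None` until a tag is assigned.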
The security server 81 may serve as the main execution subject of the store wind control (risk control) system of the present application. The store wind control system uses a neural network algorithm to learn the association relationships between registered stores and thereby identify illegal registration behavior that exhibits aggregation characteristics. Specifically, before the embodiments of the present application are carried out, the store wind control system accesses the database in the store server 82 to obtain, for each store sample in the historical registration store sample set, the store unique identifier, the store registration attribute information, the store registration behavior feature data, and the rule-breaking registration tag data indicating whether the store is a rule-breaking registration store. It then uses the store samples to construct the nodes of a graph object and establishes edges between the nodes based on the store registration attribute information and preset edge connection rules, forming a complete graph object. The graph object constructed from the historical registration store sample set can be stored in the database of the security server 81 and used by the security server 81 to train the graph neural network model.
In the embodiment of the present application, in response to a store rule-breaking registration recognition event, the security server 81 receives a store sample set to be predicted and retrieves from the database the graph object constructed from the historical registration store sample set. It constructs each store sample in the store sample set to be predicted as a newly added node in the graph object, thereby updating the graph object, and then inputs the updated graph object into a graph neural network model trained in advance on the store samples of the historical registration store sample set. The model output is used to determine whether each store in the store sample set to be predicted is a rule-breaking registration store.
According to the determination of whether each store in the store sample set to be predicted is an illegal registration store, the illegal registration tag data of the corresponding store sample is marked as 0 (not an illegal registration store) or 1 (an illegal registration store). The marked tag data is associated with the corresponding store sample in the store sample set to be predicted, and the store samples in the database of the security server 81 are updated, so that the model can be iteratively trained and optimized to adapt to new changes in illegal registration behavior.
The store violation registration identification method can be implemented as a computer program product and embedded in the store wind control system. When the store wind control system predicts that an online store is a violation registration store, a wind control restriction limiting the store's permissions can be applied to the store directly, or the store can first be submitted to a manual management user for further checking and confirmation before the restriction is applied.
Referring to fig. 2, according to the method for identifying store violation registration provided by the present application, in one embodiment, the method includes the following steps:
Step S5100, acquiring a to-be-predicted registered store sample set, wherein the to-be-predicted registered store sample set comprises store samples of a plurality of stores;
In the daily operation of an e-commerce platform, a large number of new stores are typically registered every day. When a user intends to open a new store, the user enters the e-commerce platform to perform store registration, which triggers a store registration request submitted to the store server. The store server can then trigger a corresponding store registration event. The store unique identifier, store registration attribute information, and store registration behavior feature data generated by the registration behavior are taken as the store sample of that store, and the store sample is sent to the store server for storage. In response to the store violation registration identification event, the store wind control system sends the store server a request for a store sample set to be predicted. The store sample set to be predicted comprises store samples of a plurality of stores; the number of store samples in the set can be fixed as required by a person skilled in the art, or can be the number of store samples generated by stores registered within a specific time period.
It is further understood that the store sample includes the store registration attribute information recorded when the store is registered, such as registration time, registration mode (mobile phone/mailbox), mobile phone number, mailbox, IP, and device fingerprint. Previously identified illegal registrations show that such behavior tends to be aggregated: in order to register a large number of stores at once, an illegal registrant usually performs rapid registration with the same or similar store registration attribute information, so the attribute information generated by illegal stores during registration is often associated. For example, a registration should be regarded as high risk if its IP address registers a large number of stores in a short time, if its device fingerprint matches a known illegal registration device fingerprint, or if it uses a mailbox or mobile phone number similar to those used by illegal registrants. Exploiting this characteristic, the store registration attribute information is analyzed to identify potential association patterns, which provides rich context information for training the graph neural network model, so that a more accurate model can be trained to predict and combat illegal registration behavior.
The store sample also includes store registration behavior feature data, comprising offline features and real-time features, which forms part of the graph object used as the training sample for the graph neural network model of the present application. The offline features include the number of registration applications, the number of registrations passed, the registration pass rate, and the number of registrations identified as illegal for each IP, mobile phone number, device fingerprint, or user agent over a specific period of time. Mining and analyzing the offline features can uncover hidden links between different store registration attributes, and these links often reveal the behavior patterns of illegal registrants. For example, when a mobile phone number is associated with a plurality of illegal registration stores, the number may be in use for batch registration, a common tactic of illegal registrants; the number of registration applications, the number of registrations passed, the registration pass rate, and the number of identified illegal registrations for that mobile phone number can indicate whether a store registered with it is a high-risk illegal registration. Likewise, if a certain IP address has a large number of registration applications within a certain period of time, this is also a warning signal, because a normal user is unlikely to register a plurality of stores from the same IP address in a short time.
The real-time features include the total duration of the registration process, the time the user takes to complete the verification code, the time taken to fill in the mailbox or mobile phone number, the number of times the mailbox or mobile phone number is verified, whether the IP address of the registered mobile phone number is consistent with the registration IP address, whether the geographic location information filled in during registration is consistent with the geographic location to which the IP address belongs, and the number of registration applications, the number of registrations passed, the registration pass rate, and the number of identified illegal registrations of the same IP, mobile phone number, device fingerprint, or user agent in a short time window before the registration event. An unusually short total registration duration, verification code completion time, or mailbox or mobile phone number filling time is often regarded as a sign of illegal registration behavior, because illegal registrants tend to pursue speed and complete the registration procedure very quickly, whereas normal users tend to be more cautious and need more time to fill in the information. Similarly, a registration in which the IP address of the registered mobile phone number is inconsistent with the registration IP address, or in which the geographic location information filled in is inconsistent with the geographic location to which the IP address belongs, is often regarded as illegal registration behavior. Normally, the IP address of the user's registered mobile phone number should match the user's registration IP address; a mismatch is a sign of illegal registration behavior.
When the graph neural network model is trained, the illegal registration tag data of each store sample is used as supervision so that the model learns the patterns in these real-time features; the trained model then predicts the risk probability that each store in the store sample set to be predicted belongs to an illegal registration event.
In addition, the number of registration applications, the registration pass rate, and the number of identified illegal registrations of the same IP, mobile phone number, device fingerprint, or user agent are counted over a short time window before the present registration event. In one embodiment, the time window is kept within 3 hours. A window of this length is chosen because illegal registrants typically concentrate their registration activity in a short period, which a short window can capture and analyze quickly. A shorter time window focuses on the most recent registration behavior, and such data is more likely to show a direct link between registrations, since registrations that use the same registration attributes within a short time are likely to be correlated. A shorter window also reduces the consumption of computing resources, which helps maintain stability and efficiency when processing large amounts of data. Finally, limiting the time range avoids mistakenly treating registrations that are far apart in time, and not actually associated, as related, thereby improving the accuracy of identifying illegal registration behavior. The store sample also includes a store unique identifier, such as a store ID. In summary, these store samples provide rich context information for the graph neural network model, enabling it to understand registration behavior more fully and predict illegal registration behavior more accurately.
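The short-window statistic can be illustrated with a small counting routine. This is only a sketch: the function name `count_in_window`, the record fields, and the choice of a half-open window that excludes the current event are assumptions; the 3-hour default follows the embodiment described above.

```python
from datetime import datetime, timedelta

def count_in_window(history, key, value, event_time, window=timedelta(hours=3)):
    """Count earlier registrations sharing `value` for `key` within
    [event_time - window, event_time)."""
    start = event_time - window
    return sum(
        1 for rec in history
        if rec[key] == value and start <= rec["time"] < event_time
    )

now = datetime(2024, 1, 1, 12, 0)
history = [
    {"ip": "10.0.0.1", "time": now - timedelta(minutes=30)},
    {"ip": "10.0.0.1", "time": now - timedelta(hours=2)},
    {"ip": "10.0.0.1", "time": now - timedelta(hours=5)},     # outside the 3-hour window
    {"ip": "10.0.0.2", "time": now - timedelta(minutes=10)},  # different IP
]
recent_same_ip = count_in_window(history, "ip", "10.0.0.1", now)
```

The same routine could be applied with the mobile phone number, device fingerprint, or user agent as the key.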
Step S5200, for a graph object constructed based on store samples in a historical registration store sample set, constructing the store samples in the to-be-predicted registered store sample set as newly added nodes in the graph object, so as to update the graph object;
First, an initial graph object is constructed based on store samples in the historical registration store sample set. The constructed graph object consists of a plurality of nodes and edges, wherein each node represents one store sample in the historical registration store sample set, and the edges are connected according to the association relationship between the store samples. The association relationship can be determined through IP addresses, device fingerprints, mobile phone numbers, mailboxes, and the like, and its specific definition can be determined according to experience and data exploration. For example, if the IP addresses of two store samples are the same, or their device fingerprints, mobile phone numbers, or mailboxes are similar, an edge is added between the two nodes, indicating that there is some association between them.
Store samples in the store sample set to be predicted are added to the graph object constructed based on the historical store sample set one by one to serve as newly added nodes. Each newly added node also contains store registration attribute information, store registration behavior feature data, and a store unique identifier (e.g., a store ID) of the store sample. In order to ensure the integrity and consistency of the graph objects, the edge connection relationship between the newly added node and the history node needs to be defined and constructed according to the same rule. For example, if the IP address of the newly added node is the same as the IP address of a certain history node, or the device fingerprint, the phone number, the mailbox and other attributes are similar to those of the history node, an edge is added between the two nodes.
After the new node is built, the whole graph object needs to be updated. This includes updating store registration attribute information of the nodes, and store registration behavior feature data, connection relationships of edges, and topology of the graph. The updated graph object not only contains the information of the historical registration store sample, but also contains the latest registration store sample to be predicted, thereby realizing the dynamic update of the graph object. The updating mechanism enables the graph object to reflect the current registration behavior and association mode in real time, and provides a more comprehensive and accurate data basis for subsequent illegal registration identification.
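The update step can be sketched as follows, assuming a simple dictionary-based graph and a caller-supplied edge rule. The names `add_nodes`, `is_linked`, and `same_ip` are hypothetical, and the same-IP rule is used here only as a stand-in for whatever rule was applied when the historical graph was built.

```python
def add_nodes(graph, new_samples, is_linked):
    """Add each new sample as a node and link it to the nodes already present.

    `graph` is {"nodes": {store_id: sample}, "edges": set of frozenset pairs}.
    `is_linked(a, b)` must be the same rule used for the historical nodes.
    """
    for sample in new_samples:
        sid = sample["store_id"]
        for other_id, other in graph["nodes"].items():
            if is_linked(sample, other):
                graph["edges"].add(frozenset((sid, other_id)))
        # Insert after linking so a node is never compared with itself.
        graph["nodes"][sid] = sample
    return graph

def same_ip(a, b):
    return a["ip"] == b["ip"]

graph = {
    "nodes": {
        "h1": {"store_id": "h1", "ip": "1.1.1.1"},
        "h2": {"store_id": "h2", "ip": "2.2.2.2"},
    },
    "edges": set(),
}
to_predict = [
    {"store_id": "n1", "ip": "1.1.1.1"},
    {"store_id": "n2", "ip": "1.1.1.1"},
]
add_nodes(graph, to_predict, same_ip)
```

Because nodes are inserted one by one, a newly added node is also compared with the new nodes added before it, so clusters within the batch to be predicted are connected as well.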
After the graph object update is completed, further analysis and prediction may be performed using the updated graph object. For example, the newly added nodes are predicted by the graph neural network model, so that potential illegal registration behavior is identified and handled in time. The graph neural network model can capture complex relations among nodes through a message passing mechanism and predict, from the captured relations, the risk probability that a newly added node belongs to an illegal registration event. In this way, illegal registration behavior can be monitored and flagged in real time, improving the security and the user experience of the e-commerce platform.
Step S5300, inputting the updated graph object into a graph neural network model which is trained by store samples of the historical registration store sample set in advance, predicting the risk probability that each store belongs to an illegal registration event in the registration store sample set to be predicted, and judging whether the corresponding store belongs to the illegal registration store according to the risk probability.
First, the updated graph object is input into a graph neural network model trained in advance on store samples of the historical registration store sample set. The specific training process is described in the embodiments below and is not repeated here. After the updated graph object is input, the graph neural network model predicts, for each newly added node in the graph object (i.e., each registration store sample to be predicted), the risk probability of an illegal registration event. Specifically, the graph neural network model aggregates the feature information of each node with the feature information of its neighboring nodes through a message passing mechanism, thereby updating the node representation. This process is repeated multiple times so that the graph neural network model can adequately capture complex relationships between nodes.
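To make the aggregation step concrete, the sketch below performs one round of a deliberately simplified, parameter-free message passing: each node's new vector is the mean of its own vector and those of its neighbors. This mean aggregator is an assumption chosen for illustration; a trained graph neural network layer would additionally apply learned weights and a nonlinearity, and the application does not specify which aggregator is used.

```python
def message_passing_round(features, neighbors):
    """Return updated node vectors after one mean-aggregation round."""
    updated = {}
    for node, vec in features.items():
        # Collect the node's own vector together with its neighbors' vectors.
        group = [vec] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated

features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a"], "c": []}

h1 = message_passing_round(features, neighbors)
h2 = message_passing_round(h1, neighbors)  # repeating widens each node's view
```

After one round, nodes `a` and `b` share the same representation because each has absorbed the other's features, while the isolated node `c` is unchanged.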
After the message passing and the feature aggregation are completed, the graph neural network model performs classification prediction on each newly added node and outputs the risk probability that the node belongs to an illegal registration event. In one embodiment, the resulting risk probability is a value between 0 and 1 indicating the likelihood that the store corresponding to the node belongs to an illicit registration event; the higher the risk probability, the greater that likelihood. Whether the corresponding store is an illegal registration store can be judged from the predicted risk probability. In general, a risk probability threshold may be set, and when the risk probability of a store exceeds the threshold, the store is considered an illicit registration store. The threshold can be adjusted according to business needs and actual conditions to balance precision and recall.
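The thresholding step is small enough to show directly. In this sketch the function name `flag_violations` and the default threshold of 0.5 are assumptions; the application leaves the threshold to business needs.

```python
def flag_violations(risk_by_store, threshold=0.5):
    """Map each store ID to 1 if its risk probability exceeds the threshold, else 0."""
    return {sid: int(p > threshold) for sid, p in risk_by_store.items()}

risks = {"s1": 0.92, "s2": 0.10, "s3": 0.50}
labels = flag_violations(risks)
strict = flag_violations(risks, threshold=0.95)
```

Raising the threshold flags fewer stores (fewer false alarms, more misses), and lowering it does the opposite, which is the trade-off the adjustable threshold is meant to control.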
After an illicit registration store is determined, corresponding disposal measures may be taken, such as limiting certain functions of the store, requiring further authentication, or directly blocking the store. Such measures can help prevent illegally registered stores from engaging in malicious order brushing, promotion abuse, transaction fraud, and similar behavior, thereby maintaining the health of the e-commerce platform and the transaction satisfaction of buyers. In this way, the graph neural network model can not only monitor and give early warning of illegal registration behavior in real time, but can also be updated dynamically according to the latest registration behavior, improving the accuracy and timeliness of the model. The e-commerce platform can thus cope with illegal registration behavior more effectively, safeguarding normal platform operation and a good user experience.
In another embodiment, after it is determined whether each store in the store sample set to be predicted is a rule-breaking registration store, the rule-breaking registration tag data of the corresponding store sample is marked as 0 (not a rule-breaking registration store) or 1 (a rule-breaking registration store), and the store sample of the corresponding store is stored in the store server. The marked store samples can be added to the historical registration store sample set to train the graph neural network model of the present application iteratively.
As can be appreciated from the exemplary embodiments of the present application, the technical solution of the present application has various advantages, including but not limited to the following aspects:
The present application updates the graph object by constructing the store samples in the store sample set to be predicted as newly added nodes in a graph object built from the historical registration store sample set. The graph object can comprehensively reflect the complex associations and potential cluster characteristics among stores; because illegal registration behavior often forms clusters in the network, constructing the graph object allows illegal registration stores to be identified more accurately. In addition, the graph neural network model can capture inter-store association patterns through message passing among the store sample nodes in the graph object. Specifically, the updated graph object is input into a graph neural network model trained in advance on store samples of the historical registration store sample set, and the model's ability to process graph-structured data is used to learn the complex relations among the store samples in the graph object, so that potential patterns of illegal registration behavior are identified and the accuracy and reliability of the prediction result are improved.
The graph neural network model in the application shows better generalization capability when dealing with complex and changeable illegal registration behaviors, and can adapt to continuous evolution of the illegal registration behaviors. That is, the graph neural network model not only can handle the current illegal registration behavior, but also can adapt to the new illegal registration behavior which may occur in the future.
In general, the correlation information between stores is effectively captured by utilizing the graph neural network model, so that the accuracy of identifying illegal registration stores is improved.
On the basis of any embodiment of the method, before acquiring the registered store sample set to be predicted, the method comprises the following steps:
Step S6100, obtaining the historical registration store sample set, wherein the historical registration store sample set comprises a plurality of store samples, and each store sample comprises a store unique identifier of the corresponding store, store registration attribute information, store registration behavior feature data, and illegal registration tag data characterizing whether the store is an illegal registration store;
In this step, the store wind control system sends the store server a request for the historical registration store sample set. The historical registration store sample set is stored in the store server, and a person skilled in the art can request a plurality of store samples from the store server as the historical registration store sample set by setting a time range or a number range. Each store sample comprises the store unique identifier of the corresponding store, store registration attribute information, store registration behavior feature data, and rule-breaking registration tag data characterizing whether the store is a rule-breaking registration store. Specifically, the unique identifier, typically a store ID, distinguishes different store samples and ensures that each store sample is unique in the data set. The store registration attribute information describes the context of the store registration. The rule-breaking registration tag data marks whether the store sample belongs to a rule-breaking registration store: for example, tag data of 0 indicates that it does not, and tag data of 1 indicates that it does. Acquiring a detailed historical registration store sample set provides a rich data basis for training the graph neural network model, helping the model better understand store registration behavior and predict illegal registration behavior accurately.
Step S6200, constructing a graph object for training the graph neural network model according to the historical registration store sample set, wherein nodes of the graph object are constructed by the store sample;
A graph object for training a graph neural network model is constructed according to a historical registration store sample set, wherein the graph object consists of a plurality of nodes and edges, and each node represents one historical registration store sample. Specifically, each of the store samples in the history registration store sample set is a node of the graph object, and each node includes a store unique identifier of the store sample, store registration attribute information, store registration behavior feature data, and rule-breaking registration tag data indicating whether or not the store sample belongs to a rule-breaking registration store. In constructing a graph object, edge connection relationships between nodes need to be defined. The edge connection relationship is defined according to the association between store samples.
Specifically, the edges can be defined in the following ways: (1) the device fingerprint of the terminal device used in store registration is taken as store registration attribute information in the corresponding store sample, and it is detected whether the device fingerprints corresponding to any two nodes in the graph object are identical; if so, an edge connection is established between the two nodes; (2) the mobile phone number used for store registration is taken as store registration attribute information in the corresponding store sample, at least one digit of the mobile phone number is deleted, and the remaining parts of the mobile phone numbers corresponding to any two nodes in the graph object are compared; if they are identical, an edge connection is established between the corresponding nodes; (3) the mailbox and the registration time are taken as store registration attribute information in the corresponding store sample, and the similarity of the mailbox information corresponding to any two nodes in the graph object is calculated; if the similarity exceeds a preset first similarity threshold and the registration time interval between the two nodes does not exceed a preset first time threshold, an edge connection is established between the corresponding nodes; and (4) the IP used at registration and the registration time are taken as store registration attribute information in the corresponding store sample, and it is detected whether the IPs corresponding to any two nodes in the graph object are identical; if they are identical and the registration time interval between the two nodes does not exceed a preset second time threshold, an edge connection is established between the corresponding nodes.
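The edge rules can be sketched as a pairwise predicate. Several details here are assumptions made for illustration only: the phone rule drops the last digit before comparing, mailbox similarity is measured with the `difflib.SequenceMatcher` ratio from the Python standard library, the IP rule is read as "same IP and registration interval within a second time threshold", registration times are epoch seconds, and the threshold values and function names (`should_link`, `build_edges`) are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def should_link(a, b, sim_threshold=0.8, mail_window=3600, ip_window=3600):
    """Return True if any of the four illustrative edge rules holds."""
    dt = abs(a["time"] - b["time"])  # registration interval in seconds
    if a["fingerprint"] == b["fingerprint"]:
        return True                  # rule 1: identical device fingerprint
    if a["phone"][:-1] == b["phone"][:-1]:
        return True                  # rule 2: numbers match after dropping last digit
    sim = SequenceMatcher(None, a["mailbox"], b["mailbox"]).ratio()
    if sim > sim_threshold and dt <= mail_window:
        return True                  # rule 3: similar mailbox, close in time
    if a["ip"] == b["ip"] and dt <= ip_window:
        return True                  # rule 4: same IP, close in time
    return False

def build_edges(samples):
    """Compare every pair of samples and collect the linked pairs."""
    return {
        frozenset((a["store_id"], b["store_id"]))
        for a, b in combinations(samples, 2)
        if should_link(a, b)
    }

samples = [
    {"store_id": "s1", "fingerprint": "fpA", "phone": "13800000001",
     "mailbox": "shop001@example.com", "ip": "1.1.1.1", "time": 0},
    {"store_id": "s2", "fingerprint": "fpA", "phone": "13911112222",
     "mailbox": "zzz@other.org", "ip": "2.2.2.2", "time": 50000},
    {"store_id": "s3", "fingerprint": "fpC", "phone": "13722223333",
     "mailbox": "alice@mail.net", "ip": "3.3.3.3", "time": 900000},
    {"store_id": "s4", "fingerprint": "fpD", "phone": "13633334444",
     "mailbox": "bob@site.io", "ip": "1.1.1.1", "time": 600},
]
edges = build_edges(samples)
```

In this toy data, `s1` and `s2` are linked by a shared device fingerprint, `s1` and `s4` by the same IP within the time window, and `s3` remains isolated. The all-pairs comparison is quadratic in the number of samples; a production system would more likely index samples by attribute first.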
Through the definition of the association relation for the edge connection, a graph object containing rich association information can be constructed. The graph object not only comprises a store unique identifier of a historical registration store sample, store registration attribute information, store registration behavior feature data and rule-breaking registration tag data which characterizes whether the store belongs to a rule-breaking registration store or not, but also captures association modes among the store samples through an edge connection relationship. The graph object can provide a comprehensive data basis for training the graph neural network model, help the model to better understand store registration behaviors and accurately predict illegal registration behaviors. By the method, the constructed graph object can better reflect complex relations among store samples, and more accurate and comprehensive data support is provided for training of the graph neural network model.
Step S6300, inputting the constructed graph object into the graph neural network model, performing supervised training using the rule-breaking registration tag data of each store sample, and training the graph neural network model to a convergence state.
In the training process, the built graph object is input into a graph neural network model of a store wind control system, and the rule-breaking registration tag data of each store sample is used as a supervision signal to guide the training of the graph neural network model. Specifically, the graph neural network model aggregates the characteristic information of each node with the characteristic information of its neighbor nodes through a message passing mechanism, thereby updating the representation of the node. This process is repeated a number of times to ensure that the graph neural network model is able to adequately capture the complex relationships between nodes. In each iteration, the graph neural network model performs classified prediction according to the current node representation, and calculates the loss between the prediction result and the real label. The loss function typically employs cross entropy loss for measuring the difference between the graph neural network model prediction and the true label.
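For a two-class label (0 or 1), the cross entropy loss mentioned above takes the binary form shown in the following sketch. The function name `cross_entropy` and the clipping constant `eps` are assumptions added for numerical safety.

```python
import math

def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross entropy between predicted risk probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

uninformed = cross_entropy([0.5, 0.5], [1, 0])   # equals ln 2
confident = cross_entropy([0.9, 0.1], [1, 0])    # correct and confident
wrong = cross_entropy([0.1, 0.9], [1, 0])        # confident but wrong
```

The loss is smallest when the predicted probability is close to the true label and grows sharply for confident mistakes, which is what drives the parameter updates during training.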
After the loss is calculated, it is back-propagated through the graph neural network model, and the parameters of the graph neural network model are updated to minimize the loss function. This process is repeated until the loss of the graph neural network model no longer decreases significantly, i.e., the graph neural network model reaches a convergence state. The convergence state indicates that the graph neural network model has fully learned the characteristics and association patterns of the historical registration store samples, and can accurately predict illegal registration behavior.
In the training process, some optimization strategies such as learning rate adjustment, regularization, batch normalization and the like can be adopted to improve the training effect and generalization capability of the graph neural network model. Through the optimization strategies, the characteristics and the association modes of the historical registration store samples can be learned more stably and efficiently by the graph neural network model, so that the prediction accuracy of the graph neural network model is improved.
In this way, the built graph object is input to the graph neural network model, and the graph neural network model can be trained to a convergence state by performing supervised training using the rule-breaking registration tag data of each store sample. The trained graph neural network model can effectively capture complex relations among store samples and accurately predict illegal registration behaviors, so that the safety and the health degree of an e-commerce platform are improved.
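As a non-limiting illustration, the training procedure described above can be sketched in plain Python. The sketch assumes a deliberately simplified model: a single round of mean aggregation over neighbor nodes followed by a logistic classifier trained with binary cross-entropy and gradient descent. The function names, the learning rate, and the convergence tolerance are illustrative assumptions, not part of the described scheme; a practical embodiment would normally use a multi-layer graph neural network from a dedicated library.

```python
import math

def aggregate(features, neighbors):
    """One message-passing round: average each node's features with its neighbors'."""
    out = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def train(features, neighbors, labels, lr=0.5, tol=1e-6, max_epochs=5000):
    """Train until the cross-entropy loss stops decreasing by more than tol."""
    h = aggregate(features, neighbors)
    w, b = [0.0] * len(h[0]), 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):
        # Forward pass: predicted probability that each node is a violating store.
        probs = [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + b)))
                 for x in h]
        # Binary cross-entropy between predictions and the violation labels.
        loss = -sum(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
                    for p, y in zip(probs, labels)) / len(labels)
        if prev_loss - loss < tol:  # loss no longer decreases significantly
            break
        prev_loss = loss
        # Backward pass: gradient of the loss with respect to w and b.
        errs = [p - y for p, y in zip(probs, labels)]
        for k in range(len(w)):
            w[k] -= lr * sum(e * x[k] for e, x in zip(errs, h)) / len(h)
        b -= lr * sum(errs) / len(h)
    return w, b, loss

# Hypothetical toy graph: nodes 0-1 and nodes 2-3 are connected pairs.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_neighbors = {0: [1], 1: [0], 2: [3], 3: [2]}
toy_labels = [1, 1, 0, 0]
weights, bias, final_loss = train(toy_features, toy_neighbors, toy_labels)
```

The stopping test mirrors the convergence criterion in the text: training ends once an epoch improves the loss by less than the tolerance.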
On the basis of any embodiment of the method, constructing a graph object for training the graph neural network model according to the historical registration store sample set comprises the following steps:
Step S6210, constructing corresponding nodes in the graph object by using store unique identifiers of store samples in the history registration store sample set;
To construct the corresponding nodes in the graph object, the store unique identifier of each store sample, typically the store ID, is first extracted from the history registration store sample set. These store unique identifiers are then used as nodes in the graph object. Each node represents one store sample and stores the store unique identifier of that store sample, which distinguishes different store samples and ensures that each store sample is unique in the graph object. In this way, an initial node set of the graph object is constructed, which lays a foundation for subsequent edge connections and feature associations.
Step S6220, judging whether the store registration attribute information of any two nodes of the graph object meets an edge connection rule or not based on the store registration attribute information in the store sample and a preset edge connection rule, and if so, establishing edge connection between the two nodes;
First, store registration attribute information of each store sample is extracted from the historical registration store sample set. The store registration attribute information generally includes, but is not limited to, registration time, registration mode (mobile phone/mailbox), mobile phone number, mailbox, IP, device fingerprint, and the like, and is an important basis for judging the correlation between store samples. Second, the preset edge connection rules are derived from analysis and empirical summary of historical violation registration behavior. For example, the rules may include establishing an edge connection between two store samples if the IP addresses of the two nodes are the same, or if their device fingerprints, mobile phone numbers, mailboxes, etc. are similar. In addition, registration time intervals, geographic location information, etc. may also be considered to more fully capture correlations between store samples. Then, any two nodes in the graph object are compared to judge whether their store registration attribute information meets a preset edge connection rule. If the edge connection rule is satisfied, an edge connection is established between the two nodes, indicating that there is some association between them. An edge need not be based on a single attribute; a plurality of attributes may be considered together so as to capture the association patterns among store samples more accurately. For example, if the IP addresses of two nodes are the same and the registration time interval is short, an edge connection is established between the two nodes, indicating a strong association between them. Through the above steps, the edge connection relations of the graph object are constructed.
These edge connection relations capture the association patterns among store samples and provide rich context information for subsequent training of the graph neural network model.
This step ensures that the edge connection relations of the graph object accurately capture the relevance among store samples, and provides a solid foundation for subsequent illegal registration identification. By defining the connection conditions between nodes, the edge connection rules determine the topology of the graph object, i.e., which nodes are connected. A good topology better reflects the real association relations between store samples, thereby improving the training effect of the graph neural network model. For example, rule-breaking registration behavior generally exhibits local aggregation, i.e., a large number of registrations are performed in a short time using the same IP address, device fingerprint, and other store registration attribute information. Through the edge connection rules, a graph structure with local aggregation can be constructed, so that the graph neural network model can better capture this aggregation. The edge connection rules also directly influence the training effect of the graph neural network model. Specifically, the graph neural network model aggregates the behavior features of each node with the behavior features of its neighbor nodes through a message passing mechanism, and the edge connection rules determine which nodes are neighbor nodes, thereby affecting the effect of feature aggregation. Reasonable edge connection rules therefore ensure the effectiveness of feature aggregation and improve the prediction accuracy of the graph neural network model.
Edge connection rules also support dynamic updating and adaptation of graph objects. As new store samples are added, the graph objects need to be updated continually to reflect the latest registration behavior and association patterns. The edge connection rule can ensure reasonable and effective updating process of the graph object, thereby maintaining the real-time performance and accuracy of the graph neural network model. Through the operations, the edge connection rules play a vital role in the whole scheme, not only define the connection relation between nodes in the graph object, but also directly influence the training effect of the graph neural network model and the final accuracy of the illegal registration identification. By capturing the association mode among store samples, a reasonable graph object topological structure is constructed, the training effect of a graph neural network model is improved, the accuracy of illegal registration identification is enhanced, and the edge connection rule provides a solid foundation for successful implementation of the whole scheme.
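As a non-limiting illustration, the pairwise judgment of step S6220 can be sketched as a function that compares every two store samples against a pluggable rule. The field names `store_id`, `ip`, and `registered_at` (seconds since the epoch), and the one-hour window, are assumptions made for the example; the rule shown is the combined "same IP and short registration interval" case mentioned above.

```python
from itertools import combinations

def same_ip_within_window(a, b, max_gap_seconds=3600):
    """Example rule: identical IP address and registration times close together."""
    return (a["ip"] == b["ip"]
            and abs(a["registered_at"] - b["registered_at"]) <= max_gap_seconds)

def build_edges(samples, rule):
    """Compare every pair of store samples and connect those that satisfy the rule."""
    edges = set()
    for a, b in combinations(samples, 2):
        if rule(a, b):
            # Edges are undirected, so each is stored as an unordered pair of IDs.
            edges.add(frozenset((a["store_id"], b["store_id"])))
    return edges

# Hypothetical store samples.
demo_samples = [
    {"store_id": "s1", "ip": "203.0.113.7", "registered_at": 0},
    {"store_id": "s2", "ip": "203.0.113.7", "registered_at": 600},
    {"store_id": "s3", "ip": "198.51.100.2", "registered_at": 300},
]
demo_edges = build_edges(demo_samples, same_ip_within_window)
```

Passing the rule as a parameter keeps the graph construction unchanged when rules are added or adjusted.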
Step S6230, importing the store registration behavior feature data and the rule-breaking registration tag data corresponding to each store sample into the graph object, and associating with the corresponding node in the graph object to form a complete graph object.
Store registration behavior feature data for each store sample is extracted from the historical registration store sample set. The feature data comprises offline features and real-time features, comprehensively reflects the registration behavior of the store samples, and provides rich context information for subsequent analysis and prediction. The illegal registration tag data of each store sample is also extracted. The tag data marks whether a store sample belongs to an illegal registration store; for example, tag data of 0 indicates that the store sample does not belong to an illegal registration store, and tag data of 1 indicates that it does. The illegal registration tag data is an important supervisory signal for training the graph neural network model, and guides the model to learn the characteristics and association patterns of store samples, so that the accuracy of illegal registration identification is improved.
Store registration behavior characteristic data and violation registration tag data of each store sample are imported into the graph object and are associated with corresponding nodes in the graph object. Specifically, store registration behavior feature data of each store sample is associated with a node as a feature vector of the node. Meanwhile, the illegal registration tag data of each store sample is used as a tag of the node and is associated with the node. In this way, each node in the graph object not only contains the store unique identifier and the store registration attribute information of the store sample, but also contains store registration behavior feature data and violation registration tag data, so that a complete graph object is formed.
Through the steps, a graph object containing rich information is constructed. The graph object not only comprises a store unique identifier of a historical registration store sample, store registration attribute information, store registration behavior feature data and rule-breaking registration tag data which characterizes whether the store belongs to a rule-breaking registration store or not, but also captures association modes among the store samples through an edge connection relationship. The complete graph object can provide a comprehensive data basis for training of the graph neural network model, help the model to better understand store registration behaviors and accurately predict illegal registration behaviors. In this way, store registration behavior characteristic data and violation registration tag data corresponding to each store sample are imported into the graph object, and are associated with corresponding nodes in the graph object to form a complete graph object, so that a solid foundation is provided for subsequent violation registration identification.
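As a non-limiting illustration, step S6230 can be sketched as attaching a feature vector and a label to each existing node. The dictionary layout and the field names `offline_features`, `realtime_features`, and `label` are assumptions chosen for the example.

```python
def attach_features_and_labels(nodes, samples):
    """Associate each sample's feature vector and violation label with its node."""
    for sample in samples:
        node = nodes[sample["store_id"]]  # the node created from the store unique identifier
        # The feature vector concatenates offline and real-time features.
        node["features"] = list(sample["offline_features"]) + list(sample["realtime_features"])
        # 0 = not an illegal registration store, 1 = illegal registration store.
        node["label"] = sample["label"]
    return nodes

# Hypothetical nodes and samples.
demo_nodes = {"s1": {"store_id": "s1"}, "s2": {"store_id": "s2"}}
demo_history = [
    {"store_id": "s1", "offline_features": [12, 0.25], "realtime_features": [3], "label": 1},
    {"store_id": "s2", "offline_features": [1, 1.0], "realtime_features": [0], "label": 0},
]
attach_features_and_labels(demo_nodes, demo_history)
```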
On the basis of any embodiment of the method, based on the store registration attribute information in the store sample and a preset edge connection rule, judging whether the store registration attribute information of any two nodes of the graph object meets the edge connection rule, and if so, establishing edge connection between the two nodes, wherein the method comprises any one or more steps as follows:
Step S6221, using the device fingerprint of the terminal device used in store registration as the store registration attribute information in the corresponding store sample, detecting whether the device fingerprints corresponding to any two nodes in the graph object are identical, and if so, establishing edge connection between the two nodes;
In this step, the device fingerprint of the terminal device for each store sample is extracted from the history registration store sample set. The device fingerprint of the terminal device is a unique identifier generated by collecting and encoding various characteristics of the device, so that the device can be accurately and uniquely identified. The characteristics used typically include the hardware configuration of the device, operating system, browser type and version, plug-in information, etc. The device fingerprint of the terminal device of each store sample is imported into the graph object as a part of store registration attribute information. Each node represents a store sample, and the store unique identifier, store registration attribute information, store registration behavior feature data and violation registration tag data of the store sample are stored in the node. The device fingerprint of the terminal device is associated with the node as part of the store registration attribute information.
Any two nodes in the graph object are compared to detect whether the device fingerprints of their terminal devices are the same. Specifically, by comparing the device fingerprints of the terminal devices of the two nodes, it is determined whether they are completely identical. If the device fingerprints of the terminal devices of two nodes are identical, it is considered that there is some association between the two nodes, i.e. they may be registered by the same device. In this case, an edge is established between the two nodes, indicating that there is some association between them.
By the method, the edge connection relation of the graph object is ensured to accurately capture the association mode between store samples, and a solid foundation is provided for subsequent illegal registration identification. The identity of the device fingerprint of the terminal device is typically characteristic of the illicit registration activity, as illicit registrants often use the same device for bulk registration. By capturing the association mode, the graph object can better reflect complex relations among store samples, and more accurate and comprehensive data support is provided for training of the graph neural network model.
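As a non-limiting illustration, the identical-fingerprint rule of step S6221 can be sketched as follows. Because the rule is an exact match, the sketch groups store samples by fingerprint instead of comparing all pairs; this yields the same edges as a pairwise comparison. The field names `store_id` and `device_fingerprint` are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def fingerprint_edges(samples):
    """Connect every two store samples registered with the same device fingerprint."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["device_fingerprint"]].append(sample["store_id"])
    edges = set()
    for store_ids in groups.values():
        # All stores sharing one fingerprint are pairwise connected.
        for a, b in combinations(store_ids, 2):
            edges.add(frozenset((a, b)))
    return edges

# Hypothetical store samples.
fp_samples = [
    {"store_id": "s1", "device_fingerprint": "fp-aaa"},
    {"store_id": "s2", "device_fingerprint": "fp-aaa"},
    {"store_id": "s3", "device_fingerprint": "fp-bbb"},
    {"store_id": "s4", "device_fingerprint": "fp-aaa"},
]
fp_edges = fingerprint_edges(fp_samples)
```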
Step S6222, calculating the similarity between the contact information corresponding to any two nodes in the graph object by taking the contact information and the registration time at the time of store registration as the store registration attribute information in the corresponding store sample, and establishing edge connection between the two nodes if the similarity exceeds a preset first similarity threshold and the corresponding registration time interval between the two nodes does not exceed the preset first time threshold;
Contact information and registration time for each store sample is extracted from a historical registered store sample set. The contact information generally comprises a mobile phone number, a mailbox and the like, and is an important basis for judging the relevance between store samples. The registration time is a registration time stamp of the store samples, and is used for judging registration time intervals between the store samples. The contact information and registration time of each store sample are imported into the graph object as part of store registration attribute information. Each node represents a store sample, and store unique identification, store registration attribute information, store registration behavior feature data and violation registration tag data of the store sample are stored in the node. The contact information and registration time are associated with the node as part of store registration attribute information.
Any two nodes in the graph object are compared, and the similarity between the contact information of the nodes is detected. Specifically, by calculating the similarity between the contact information of two nodes, it is determined whether they are similar. The similarity calculation may employ various methods, such as string similarity calculation, cosine similarity calculation, and the like. If the similarity between the contact information of two nodes exceeds a preset first similarity threshold, then it is considered that there is some association between the two nodes, i.e., they may be registered by the same offending registrant or partner. The registration time intervals of the two nodes are further compared. Specifically, the difference between registration times of two nodes is calculated, and it is determined whether they are within a preset first time threshold. If the registration time interval of two nodes does not exceed the preset first time threshold, then a strong association is considered to exist between the two nodes, i.e., they may be registered by the same offending registrant or partner in a short period of time. Wherein the first similarity threshold and the first time threshold may be set by one skilled in the art based on experience or data exploration.
When the two conditions are satisfied, that is, the similarity between the contact information exceeds a preset first similarity threshold value, and the registration time interval does not exceed the preset first time threshold value, an edge is established between the two nodes, which indicates that a certain association exists between the two nodes. By the method, the edge connection relation of the graph object is ensured to accurately capture the association mode between store samples, and a solid foundation is provided for subsequent illegal registration identification. The similarity and time interval of contact information and registration time typically reflect the nature of the offending registration activity, as offending registrants often use similar contact information for batch registration and complete registration in a short period of time. By capturing the association mode, the graph object can better reflect complex relations among store samples, and more accurate and comprehensive data support is provided for training of the graph neural network model.
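As a non-limiting illustration, step S6222 can be sketched with a string similarity taken from the Python standard library. The text leaves the similarity measure open, so the use of `difflib.SequenceMatcher`, the threshold values 0.85 and 3600 seconds, and the field names are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

def contact_similarity(a, b):
    """String similarity in [0, 1] between two contact strings."""
    return SequenceMatcher(None, a, b).ratio()

def contact_time_edges(samples, sim_threshold=0.85, time_threshold=3600):
    """Connect two samples when contacts are similar AND registrations are close in time."""
    edges = set()
    for a, b in combinations(samples, 2):
        # "Exceeds" the first similarity threshold: strict comparison.
        similar = contact_similarity(a["contact"], b["contact"]) > sim_threshold
        # "Does not exceed" the first time threshold: inclusive comparison.
        close = abs(a["registered_at"] - b["registered_at"]) <= time_threshold
        if similar and close:
            edges.add(frozenset((a["store_id"], b["store_id"])))
    return edges

# Hypothetical store samples; registered_at is in seconds.
contact_samples = [
    {"store_id": "s1", "contact": "13800000001", "registered_at": 0},
    {"store_id": "s2", "contact": "13800000002", "registered_at": 1200},
    {"store_id": "s3", "contact": "13800000003", "registered_at": 100000},
    {"store_id": "s4", "contact": "15912345678", "registered_at": 60},
]
contact_edges = contact_time_edges(contact_samples)
```

In this toy data, s3 has a similar contact but registers far outside the time window, and s4 registers within the window but has a dissimilar contact, so only one edge is produced.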
Step S6223, comparing whether the corresponding IP addresses between any two nodes in the graph object are the same or not by taking the IP address, the contact information and the registration time at the time of store registration as the store registration attribute information in the corresponding store sample, and if the IP addresses are the same and the similarity of the corresponding contact information between the two nodes exceeds a preset second similarity threshold value and the registration time interval between the two nodes does not exceed the preset second time threshold value, establishing edge connection between the two nodes.
The IP address, contact information, and registration time for each store sample are extracted from the historical registered store sample set. The IP address is the network address used when the store was registered. The IP address, the contact information, and the registration time of each store sample are imported into the graph object as part of the store registration attribute information. Any two nodes in the graph object are compared to detect whether their IP addresses are the same. Specifically, by comparing the IP addresses of two nodes, it is determined whether they are completely identical. If the IP addresses of two nodes are the same, then it is considered that there is some association between the two nodes, i.e., they may have been registered by the same device or from the same network environment. The similarity between the contact information of the two nodes is then calculated to determine whether they are similar. If the similarity between the contact information of two nodes exceeds a preset second similarity threshold, then it is considered that there is some association between the two nodes, i.e., they may have been registered by the same offending registrant or group. Finally, the registration time interval of the two nodes is compared. If the registration time interval of two nodes does not exceed the preset second time threshold, then a strong association is considered to exist between the two nodes, i.e., they may have been registered by the same offending registrant or group in a short period of time.
When the above three conditions are satisfied, that is, the IP addresses are the same, the similarity between the contact information exceeds a preset second similarity threshold, and the registration time interval does not exceed the preset second time threshold, an edge is established between the two nodes, which indicates that there is a certain association between them. The similarity and time interval of IP addresses, contact information, and registration times typically reflect the nature of the offending registration activity, as offending registrants tend to register in batches using the same IP addresses and similar contact information and complete the registration in a short period of time. By capturing the association mode, the graph object can better reflect complex relations among store samples, and more accurate and comprehensive data support is provided for training of the graph neural network model.
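As a non-limiting illustration, the three-condition rule of step S6223 can be sketched by first bucketing store samples by IP address, so that the similarity and time checks run only on samples that already share an IP. The similarity measure (`difflib.SequenceMatcher`), the second thresholds 0.8 and 1800 seconds, and the field names are assumptions made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def ip_contact_time_edges(samples, sim_threshold=0.8, time_threshold=1800):
    """Connect samples with the same IP, similar contacts, and close registration times."""
    by_ip = defaultdict(list)
    for sample in samples:
        by_ip[sample["ip"]].append(sample)  # condition 1: identical IP address
    edges = set()
    for group in by_ip.values():
        for a, b in combinations(group, 2):
            similarity = SequenceMatcher(None, a["contact"], b["contact"]).ratio()
            gap = abs(a["registered_at"] - b["registered_at"])
            # Conditions 2 and 3: similarity above threshold, gap within threshold.
            if similarity > sim_threshold and gap <= time_threshold:
                edges.add(frozenset((a["store_id"], b["store_id"])))
    return edges

# Hypothetical store samples; registered_at is in seconds.
ip_samples = [
    {"store_id": "s1", "ip": "203.0.113.7", "contact": "shop01@example.com", "registered_at": 0},
    {"store_id": "s2", "ip": "203.0.113.7", "contact": "shop02@example.com", "registered_at": 600},
    {"store_id": "s3", "ip": "198.51.100.2", "contact": "shop03@example.com", "registered_at": 300},
]
ip_edges = ip_contact_time_edges(ip_samples)
```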
On the basis of any embodiment of the method of the present application, constructing the store samples in the registered store sample set to be predicted as newly added nodes in the graph object, which was constructed based on the store samples in the history registered store sample set, so as to update the graph object, includes:
Step S5210, obtaining store samples in the store sample set to be predicted and a graph object formed based on the historical registered store sample set, wherein each store sample comprises a store unique identifier, store registration attribute information and store registration behavior feature data of a corresponding store;
In the process of acquiring the store samples in the store sample set to be predicted and the graph object constituted based on the history-registered store sample set, it is first required to make clear that the store samples in the store sample set to be predicted are store samples newly registered by the electronic commerce platform in daily operation. These newly registered store samples require detection of illicit registration activities to ensure the security and health of the platform.
First, the store risk control system sends a request to the store server to obtain the registered store sample set to be predicted. The store server responds to the request and transmits the newly registered store sample data to the store risk control system. These store samples are the basis for identifying illegal registrations because they contain key information about store registration behavior. It is equally important to acquire the graph object constructed from the historical registered store sample set. The historical registered store sample set is a collection of all store samples registered over a period of time that have been labeled, manually or otherwise, as to whether they belong to an illegal registration store.
The purpose of acquiring store samples in a store sample set to be predicted and graph objects formed based on a historical registered store sample set is to correlate a newly registered store sample with the historical registered store sample, thereby constructing a graph object containing the latest registration behavior. The updated graph object not only contains the information of the historical registration store sample, but also contains the latest registration store sample to be predicted, thereby realizing the dynamic update of the graph object. The updating mechanism enables the graph object to reflect the current registration behavior and association mode in real time, and provides a more comprehensive and accurate data basis for subsequent illegal registration identification. By the method, the integrity and consistency of the graph objects can be ensured, so that the graph neural network model can better capture complex relations among store samples, and the risk probability prediction of the illegal registration event is carried out on newly added nodes according to the captured relations among the nodes. The method is beneficial to realizing real-time monitoring and early warning of illegal registration behaviors and improving the safety and user experience of an e-commerce platform.
Step S5220, using a store unique identifier in the store sample as a newly added node of the graph object, judging whether the two nodes of the graph object meet a connection rule according to store registration attribute information corresponding to each node in the graph object and the preset edge connection rule, and if so, establishing edge connection between the corresponding nodes to integrate the newly added node into the graph object;
Dynamic updating of the graph object is achieved by integrating newly registered store samples as newly added nodes into the graph object. The dynamic updating mechanism enables the graph object to reflect the current registration behavior and association mode in real time, and provides a more comprehensive and accurate data basis for subsequent illegal registration identification. With the continuous addition of new store samples, the graph objects can be continuously updated, and the timeliness and accuracy of the graph objects are kept, so that the timeliness and effectiveness of illegal registration identification are improved. The preset edge connection rules are obtained based on analysis and experience summary of historical violation registration behaviors, and can effectively capture complex association patterns among store samples.
After integrating the newly added nodes into the graph object, further analysis and prediction can be performed by using the updated graph object. For example, the newly added nodes are predicted by the graph neural network model, so that potential illegal registration behaviors are identified and handled in time. This real-time monitoring and early warning mechanism can effectively prevent illegally registered stores from conducting malicious order brushing, promotion abuse, transaction fraud and other such actions, so that the health of the e-commerce platform and the transaction satisfaction of buyers are maintained, and the safety and user experience of the e-commerce platform are improved. In one embodiment, after determining whether each store in the store sample set to be predicted is a rule-breaking registration store, the rule-breaking registration tag data of the corresponding store sample may be marked as 0 (not belonging to a rule-breaking registration store) or 1 (belonging to a rule-breaking registration store), and the marked store sample may be stored in the store server in association with the corresponding store. The marked store samples can be used to construct a historical registration store sample set for iteratively training the graph neural network model of the present application. In this way, the performance of the graph neural network model can be continuously optimized and improved, so that the graph neural network model better adapts to continuously changing illegal registration behavior patterns.
Step S5230, importing the store registration behavior feature data of the registered store sample set to be predicted into the graph object, associating it with corresponding nodes in the graph object, and updating the graph object.
Store registration behavior feature data of each store sample is extracted from a store sample set to be predicted, and is imported into a graph object to be associated with corresponding nodes in the graph object. In this way, each node in the graph object not only contains the store unique identifier and store registration attribute information of the store sample, but also contains store registration behavior feature data, so that a complete graph object is formed. The importing and associating of the store registration behavior feature data enables the graph object to comprehensively reflect the registration behavior feature of the store sample, and provides a more comprehensive and accurate data basis for subsequent illegal registration identification.
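As a non-limiting illustration, steps S5220 and S5230 can be sketched as a function that adds each store sample to be predicted to an existing graph object, connects it to existing nodes that satisfy an edge connection rule, and stores its feature data with an as-yet-unknown label. The graph layout (a dictionary with `nodes` and `edges`), the field names, and the same-IP rule used in the demonstration are assumptions.

```python
def add_new_stores(graph, new_samples, rule):
    """Integrate samples to be predicted into an existing graph object in place."""
    for sample in new_samples:
        new_id = sample["store_id"]
        # Compare the new sample with every node already in the graph,
        # including new nodes added earlier in this same call.
        for other_id, other in graph["nodes"].items():
            if rule(sample, other):
                graph["edges"].add(frozenset((new_id, other_id)))
        # Store attributes and features; the label is unknown until prediction.
        graph["nodes"][new_id] = {**sample, "label": None}
    return graph

# Hypothetical graph built from history, plus two stores to be predicted.
demo_graph = {
    "nodes": {"s1": {"store_id": "s1", "ip": "203.0.113.7", "features": [3, 0.2], "label": 1}},
    "edges": set(),
}
pending = [
    {"store_id": "n1", "ip": "203.0.113.7", "features": [5, 0.1]},
    {"store_id": "n2", "ip": "198.51.100.2", "features": [1, 0.9]},
]
add_new_stores(demo_graph, pending, lambda a, b: a["ip"] == b["ip"])
```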
On the basis of any embodiment of the method, after judging whether the corresponding store belongs to the illegal registration store according to the risk probability, the method comprises the following steps:
Step S5310, when the store is confirmed to be an illegal registration store, calling a large language model to generate an illegal registration report corresponding to the store according to store registration attribute information and store registration behavior characteristic data of the store, and sending the illegal registration report to a user corresponding to the store;
After confirming that a store is an illegally registered store, the store risk control system invokes a large language model. The large language model is a natural language processing model based on deep learning, and can understand and generate natural language text. By invoking the large language model, the store risk control system can automatically generate detailed illegal registration reports.
Specifically, the large language model generates an illegal registration report from the store registration attribute information and store registration behavior feature data of the store. The illegal registration report generally includes the unique identifier of the store, store registration attribute information, store registration behavior feature data, risk assessment results, corresponding handling suggestions, and the like. For example, the illegal registration report may indicate that the IP address of the store has registered a large number of stores in a short time, that the device fingerprint matches a known illegal registration device fingerprint, or that the IP address associated with the registered mobile phone number is inconsistent with the registration IP address. The report may also provide corresponding handling suggestions, such as limiting certain functions of the store, requiring further authentication, or directly blocking the store. Further, the store risk control system sends the generated illegal registration report to the user corresponding to the store, for example by email, short message, or platform message. By sending the illegal registration report, the user is informed in time that the store has been identified as an illegal registration store and is given corresponding handling suggestions, which helps the user take measures in time and prevents further malicious actions.
The step ensures the timely identification and processing of the illegal registration store, and improves the safety and user experience of the e-commerce platform. By generating detailed illegal registration reports, clear guidance can be provided for users to help the users understand and cope with illegal registration behaviors, so that normal operation of the platform and good experience of the users are maintained.
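As a non-limiting illustration, the report generation of step S5310 can be sketched as assembling a prompt from the store's data and passing it to a text-generation callable. The text does not name a specific large language model or interface, so `generate` below is a hypothetical placeholder supplied by the caller, and the prompt wording and field names are likewise assumptions.

```python
def build_report_prompt(store):
    """Assemble a prompt from store registration attributes and behavior features."""
    lines = [
        "Write an illegal registration report for the store below.",
        "Include a risk assessment result and handling suggestions.",
        f"Store ID: {store['store_id']}",
        "Registration attributes:",
    ]
    lines += [f"- {key}: {value}" for key, value in sorted(store["attributes"].items())]
    lines.append("Registration behavior features:")
    lines += [f"- {key}: {value}" for key, value in sorted(store["features"].items())]
    return "\n".join(lines)

def generate_report(store, generate):
    """Call the caller-supplied (hypothetical) model function with the prompt."""
    return generate(build_report_prompt(store))

# Stand-in for a real model call, used only to show the data flow.
def echo_model(prompt):
    return "REPORT\n" + prompt

flagged_store = {
    "store_id": "s1",
    "attributes": {"ip": "203.0.113.7", "registration_mode": "mobile phone"},
    "features": {"registrations_from_ip_24h": 42},
}
report = generate_report(flagged_store, echo_model)
```

Keeping the model behind a callable lets the same flow run against any model service, or against a stub during testing.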
Step S5320, iterating the illegal registration report generation event until the generated number of the illegal registration reports exceeds a preset report number threshold, and calling a large language model to generate corresponding illegal registration regulations according to the generated illegal registration reports, wherein the illegal registration regulations comprise a plurality of illegal registration details;
The store risk control system records each illegal registration report that is generated and counts the number of reports generated. When the number of generated illegal registration reports exceeds a preset report number threshold, the store risk control system triggers a further processing flow. The preset report number threshold can be adjusted according to service requirements and actual conditions, so that measures can be taken in time when illegal registration behavior occurs in concentrated bursts. The store risk control system then calls the large language model to generate corresponding illegal registration regulations according to the generated illegal registration reports. The illegal registration regulations summarize and generalize illegal registration behavior and generally include a plurality of illegal registration details. Each illegal registration detail corresponds to a specific illegal registration behavior pattern, for example, an IP address registering a large number of stores in a short time, a device fingerprint matching a known illegal registration device fingerprint, or the IP address associated with the registered mobile phone number being inconsistent with the registration IP address.
It can further be understood that the large language model may analyze the generated illegal registration reports, extract the key information and behavior patterns in them, and generate corresponding illegal registration details. For example, if a plurality of illegal registration reports mention that a certain IP address registered a large number of stores in a short time, the large language model extracts this behavior pattern as an illegal registration detail. In this way, illegal registration behavior can be systematically summarized and generalized, providing more comprehensive and accurate guidance for subsequent illegal registration identification.
Finally, the generated illegal registration regulations can be stored in the store risk control system for subsequent illegal registration identification and disposal. The illegal registration regulations can be used as training data for the graph neural network model, helping the model better understand illegal registration behavior and improving its prediction accuracy. In addition, the illegal registration regulations can be used to generate the platform's illegal registration prevention guidelines, which help users understand and respond to illegal registration behavior and thereby improve the security and user experience of the platform.
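The counting and triggering logic of step S5320 can be sketched as follows. This is a minimal illustration only: the class name `RegulationBuilder`, the `summarize` callable standing in for the large language model call, and the stub `stub_summarize` are hypothetical and are not defined by the present application.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RegulationBuilder:
    """Collects illegal registration reports and regenerates details past a threshold."""
    report_threshold: int                         # preset report number threshold
    summarize: Callable[[List[str]], List[str]]   # hypothetical wrapper around the LLM call
    reports: List[str] = field(default_factory=list)
    details: List[str] = field(default_factory=list)

    def add_report(self, report: str) -> bool:
        """Record one report; return True if the details were (re)generated."""
        self.reports.append(report)
        if len(self.reports) > self.report_threshold:
            # Threshold exceeded: ask the model to summarize all reports so far.
            self.details = self.summarize(self.reports)
            return True
        return False


def stub_summarize(reports: List[str]) -> List[str]:
    # Stand-in for the large language model: deduplicate the reported patterns.
    return sorted(set(reports))


builder = RegulationBuilder(report_threshold=2, summarize=stub_summarize)
results = [
    builder.add_report(r)
    for r in ["same IP, many stores", "known bad fingerprint", "same IP, many stores"]
]
```

In this sketch the details are regenerated on every report after the threshold is exceeded; a real system might instead regenerate them in batches.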
Step S5330, after determining that the number of illegal registration details exceeds a preset detail number threshold and after obtaining the to-be-predicted registration store sample set, calling the large language model to judge, according to the illegal registration details and the to-be-predicted registration store sample set, whether each store in the to-be-predicted registration store sample set is an illegal registration store.
The store risk control system first checks whether the number of illegal registration details exceeds the preset detail number threshold. If it does, the store risk control system acquires the latest to-be-predicted registration store sample set and calls the large language model with the illegal registration details and the to-be-predicted registration store sample set as inputs for comprehensive analysis. The large language model analyzes the store samples in the to-be-predicted registration store sample set one by one according to the illegal registration behavior patterns summarized in the illegal registration details. Specifically, the large language model extracts the store registration attribute information and store registration behavior feature data of each store sample and matches them against the behavior patterns in the illegal registration details. If the registration behavior features of a store sample closely match an item in the illegal registration details, the large language model judges the store sample to be an illegal registration store and generates a corresponding risk assessment result. In this way, the large language model uses the behavior patterns summarized in the illegal registration details to judge quickly and accurately whether each store in the to-be-predicted registration store sample set is an illegal registration store, which improves the efficiency and accuracy of illegal registration identification. Finally, the store risk control system can take corresponding disposal measures according to the judgment result of the large language model, such as limiting certain functions of the store, requiring further identity verification, or directly banning the store, so as to maintain the security and user experience of the e-commerce platform.
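One possible shape of the judgment flow in step S5330 is sketched below. The function `judge_stores`, the prompt wording, the sample field names, and the `stub_llm` callable are all assumptions made for illustration; the present application does not specify a prompt format or a model interface.

```python
from typing import Callable, Dict, List, Optional


def judge_stores(
    details: List[str],
    samples: List[dict],
    detail_threshold: int,
    llm: Callable[[str], str],
) -> Optional[Dict[str, bool]]:
    """Return {store_id: is_illegal}, or None if there are too few details."""
    # Gate: only judge once the number of details exceeds the preset threshold.
    if len(details) <= detail_threshold:
        return None
    rules = "\n".join(f"- {d}" for d in details)
    verdicts: Dict[str, bool] = {}
    for sample in samples:
        # One prompt per store sample: details plus the sample's attributes and features.
        prompt = (
            f"Illegal registration details:\n{rules}\n\n"
            f"Store sample: {sample}\n"
            "Does this store match any detail? Answer YES or NO."
        )
        verdicts[sample["store_id"]] = llm(prompt).strip().upper().startswith("YES")
    return verdicts


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call; flags samples whose fingerprint is marked bad.
    return "YES" if "'fingerprint_known_bad': True" in prompt else "NO"


details = [
    "device fingerprint matches a known illegal registration device fingerprint",
    "IP address registered a large number of stores in a short time",
]
samples = [
    {"store_id": "s1", "fingerprint_known_bad": True},
    {"store_id": "s2", "fingerprint_known_bad": False},
]
verdicts = judge_stores(details, samples, detail_threshold=1, llm=stub_llm)
skipped = judge_stores(details, samples, detail_threshold=5, llm=stub_llm)
```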
On the basis of any embodiment of the method, after judging whether each store in the to-be-predicted registration store sample set is an illegal registration store, the method further comprises the following steps:
Step S5331, obtaining an illegal registration store sample set corresponding to the illegal registration store judged by the large language model in the to-be-predicted registration store sample set;
After it has been judged whether each store in the to-be-predicted registration store sample set is an illegal registration store, the illegal registration store sample set corresponding to the stores that the large language model judged to be illegal registration stores is first obtained from the to-be-predicted registration store sample set. Specifically, the store risk control system identifies, according to the judgment result of the large language model, which stores in the to-be-predicted registration store sample set were judged to be illegal registration stores. It then collects all store samples judged to be illegal registration stores to form the illegal registration store sample set, which thus contains the store samples of the stores judged to be illegal registration stores by the large language model.
Step S5332, obtaining a graph object formed based on a historical registration store sample set, and updating the graph object by combining the illegal registration store sample set;
The illegal registration store sample set is the set of store samples of the stores judged to be illegal registration stores by the large language model in the previous step. The store risk control system first acquires the graph object formed based on the historical registration store sample set, and then adds the store samples in the illegal registration store sample set to that graph object one by one as newly added nodes, thereby updating the graph object. The updated graph object includes not only the information of the historical registration store samples but also the latest illegal registration store samples, that is, the stores judged to be illegally registered by the large language model. Subsequently, the store risk control system uses the trained graph neural network model to predict the risk probability of each store in the illegal registration store sample set, so as to evaluate the accuracy of the large language model's illegal registration judgments; this step is described in detail below and is not repeated here.
Step S5333, inputting the updated graph object into the trained graph neural network model, counting the number of stores in the illegal registration store sample set that the graph neural network model judges to be illegal registration stores, and dividing this number by the total number of stores in the illegal registration store sample set to obtain the accuracy with which the large language model judges illegal registration stores;
The graph object updated in the previous step is input into the graph neural network model that has been trained in advance with the store samples of the historical registration store sample set. After the updated graph object is input, the graph neural network model predicts the risk probability of an illegal registration event for each newly added node (i.e., each illegal registration store sample) in the graph object. The number of stores in the illegal registration store sample set that the graph neural network model judges to be illegal registration stores is then counted and divided by the total number of stores in the illegal registration store sample set, giving the accuracy with which the large language model judges illegal registration stores. This accuracy reflects the performance of the large language model in identifying illegal registration stores. In this way, the store risk control system can evaluate the accuracy of the large language model in illegal registration store identification, providing a basis for subsequent model optimization and decision-making.
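The accuracy calculation of step S5333 amounts to a simple ratio, sketched below. The function name `llm_judgment_accuracy` and the `risk_threshold` parameter (the cut-off at which a predicted risk probability counts as an illegal registration judgment) are assumptions for illustration; the predicted probabilities are taken as given rather than produced by an actual graph neural network.

```python
from typing import Dict


def llm_judgment_accuracy(risk_probs: Dict[str, float], risk_threshold: float = 0.5) -> float:
    """Fraction of LLM-flagged stores that the graph neural network also flags.

    risk_probs maps each store in the illegal registration store sample set
    (i.e. each store flagged by the large language model) to the risk
    probability predicted by the trained graph neural network model.
    """
    if not risk_probs:
        raise ValueError("the illegal registration store sample set is empty")
    # Count stores the graph model also judges to be illegal registration stores.
    confirmed = sum(1 for p in risk_probs.values() if p >= risk_threshold)
    return confirmed / len(risk_probs)


# Hypothetical predictions for four stores flagged by the large language model.
accuracy = llm_judgment_accuracy({"s1": 0.9, "s2": 0.7, "s3": 0.2, "s4": 0.6})
```

With these example values, three of the four flagged stores are confirmed, so the accuracy is 0.75.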
Step S5334, when the accuracy reaches a preset accuracy threshold, enabling the large language model to identify illegal registration stores.
The store risk control system sets a preset accuracy threshold to judge whether the performance of the large language model in illegal registration store identification reaches the expected standard. The accuracy threshold can be adjusted according to business requirements and actual conditions, so that the performance of the large language model in illegal registration store identification meets the security and user experience requirements of the platform. When the accuracy with which the large language model judges illegal registration stores reaches or exceeds the preset accuracy threshold, the store risk control system can enable the large language model for illegal registration store identification. Once the large language model is enabled, the store risk control system can identify and handle potential illegal registration stores in real time according to the judgment results of the large language model, improving the security and user experience of the e-commerce platform. In this way, the large language model is put into use only after its performance reaches the expected standard, which safeguards normal operation of the platform and a good user experience.
In one embodiment, the trained graph neural network model and the large language model can be used for prediction at the same time, and whether a store is illegally registered is then judged according to a preset illegal registration judgment condition. For example, a comprehensive risk probability threshold can be set, and when the comprehensive risk probability obtained from the prediction results of the graph neural network model and the large language model exceeds the threshold, the store sample is judged to belong to an illegal registration store. Alternatively, the store sample may be judged to belong to an illegal registration store when at least one of the graph neural network model and the large language model judges it to be an illegal registration store.
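The two judgment conditions described in this embodiment can be sketched as follows. The text does not state how the comprehensive risk probability is computed, so the weighted average used here, along with the function names and the default values of `threshold` and `gnn_weight`, is purely an assumption for illustration.

```python
def combined_verdict(
    gnn_prob: float,
    llm_prob: float,
    threshold: float = 0.5,
    gnn_weight: float = 0.5,
) -> bool:
    """Flag a store when the combined risk probability exceeds the threshold.

    Assumption: the combined probability is a weighted average of the two
    model outputs; other combination rules are equally possible.
    """
    combined = gnn_weight * gnn_prob + (1.0 - gnn_weight) * llm_prob
    return combined > threshold


def either_verdict(gnn_flag: bool, llm_flag: bool) -> bool:
    """Flag a store when at least one of the two models judges it illegal."""
    return gnn_flag or llm_flag


flag_a = combined_verdict(0.9, 0.3)   # combined probability about 0.6
flag_b = combined_verdict(0.4, 0.4)   # combined probability 0.4
flag_c = either_verdict(False, True)
```

The "at least one model" rule favors recall, while the combined threshold lets the platform trade precision against recall by tuning a single value.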
Referring to fig. 3, a store violation registration identifying apparatus provided according to an aspect of the present application includes a to-be-predicted sample set obtaining module 5100, a graph object updating module 5200, and a risk probability predicting module 5300. The to-be-predicted sample set obtaining module 5100 is configured to obtain a to-be-predicted registration store sample set, where the set includes store samples of a plurality of stores and each store sample describes the basic information and behavior characteristics of the corresponding store. The graph object updating module 5200 is configured to, for a graph object constructed based on the store samples in a historical registration store sample set, construct the store samples in the to-be-predicted registration store sample set as new nodes in the graph object so as to update the graph object. The risk probability predicting module 5300 is configured to input the updated graph object into a graph neural network model trained with the store samples of the historical registration store sample set, and to predict the risk probability that an illegal registration event exists for each store in the to-be-predicted registration store sample set, so as to determine according to the risk probability whether the corresponding store is an illegal registration store.
On the basis of any embodiment of the device, the to-be-predicted sample set obtaining module 5100 comprises a historical sample set obtaining sub-module, a graph object construction sub-module, and a graph neural network model training sub-module. The historical sample set obtaining sub-module is configured to obtain a historical registration store sample set, which comprises a plurality of store samples, each store sample comprising the store unique identifier of the corresponding store, store registration attribute information, store registration behavior feature data, and illegal registration label data characterizing whether the store is an illegal registration store. The graph object construction sub-module is configured to construct, according to the historical registration store sample set, a graph object used for training the graph neural network model, the nodes of the graph object being constructed from the store samples. The graph neural network model training sub-module is configured to input the constructed graph object into the graph neural network model and to perform supervised training using the illegal registration label data of each store sample until the graph neural network model reaches a convergence state.
On the basis of any embodiment of the device, the graph object construction sub-module comprises a node construction sub-module, an edge connection sub-module, and a graph object feature importing sub-module. The node construction sub-module is configured to construct corresponding nodes in the graph object using the store unique identifiers of the store samples in the historical registration store sample set. The edge connection sub-module is configured to judge, based on the store registration attribute information of the store samples and a preset edge connection rule, whether the store registration attribute information of any two nodes of the graph object satisfies the edge connection rule, and if so, to establish an edge connection between the two nodes. The graph object feature importing sub-module is configured to import the store registration behavior feature data and illegal registration label data corresponding to each store sample into the graph object and associate them with the corresponding nodes in the graph object to form the complete graph object.
On the basis of any embodiment of the device, the edge connection sub-module comprises any one or more of a device fingerprint judging sub-module, a contact information judging sub-module, and an IP address judging sub-module. The device fingerprint judging sub-module is configured to take the device fingerprint of the terminal device used in store registration as the store registration attribute information in the corresponding store sample, detect whether the device fingerprints corresponding to any two nodes in the graph object are identical, and if so, establish an edge connection between the two nodes. The contact information judging sub-module is configured to take the contact information and registration time in store registration as the store registration attribute information in the corresponding store sample, calculate the similarity between the contact information corresponding to any two nodes in the graph object, and establish an edge connection between the two nodes if the similarity exceeds a preset first similarity threshold and the registration time interval between the two nodes does not exceed a preset first time threshold. The IP address judging sub-module is configured to take the IP address, contact information, and registration time in store registration as the store registration attribute information in the corresponding store sample, detect whether the IP addresses corresponding to any two nodes in the graph object are identical, and establish an edge connection between the two nodes if the IP addresses are identical, the similarity between the corresponding contact information exceeds a preset similarity threshold, and the registration time interval between the two nodes does not exceed a preset time threshold.
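The device fingerprint rule and the contact information rule can be sketched as below. This is an illustration under stated assumptions: the `Node` fields, the use of `difflib.SequenceMatcher` as the similarity measure, and the default threshold values are not specified by the present application, and the IP address rule is left out of the sketch.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    store_id: str
    fingerprint: str
    contact: str
    registered_at: float  # registration time in seconds (hypothetical unit)


def similarity(a: str, b: str) -> float:
    # Assumed similarity measure: ratio of matching characters, in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def should_connect(a: Node, b: Node, sim_threshold: float = 0.8, max_gap: float = 3600.0) -> bool:
    # Rule 1: identical device fingerprints always produce an edge.
    if a.fingerprint == b.fingerprint:
        return True
    # Rule 2: similar contact information registered within a short interval.
    close_in_time = abs(a.registered_at - b.registered_at) <= max_gap
    return similarity(a.contact, b.contact) > sim_threshold and close_in_time


def build_edges(nodes: List[Node]) -> List[Tuple[str, str]]:
    # Check every unordered pair of nodes against the edge connection rules.
    return [(a.store_id, b.store_id) for a, b in combinations(nodes, 2) if should_connect(a, b)]


nodes = [
    Node("a", "fp1", "alice001@example.com", 0.0),
    Node("b", "fp1", "zzz@other.org", 100000.0),         # same fingerprint as "a"
    Node("c", "fp3", "alice002@example.com", 600.0),     # similar contact, close in time
    Node("d", "fp4", "alice003@example.com", 50000.0),   # similar contact, too far apart
]
edges = build_edges(nodes)
```

Checking every pair is quadratic in the number of nodes; a production system would more likely group nodes by fingerprint or by a contact key before comparing.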
On the basis of any embodiment of the device, the graph object updating module 5200 comprises a data acquisition sub-module, a new node construction sub-module, and a feature data importing sub-module. The data acquisition sub-module is configured to acquire the store samples in the to-be-predicted registration store sample set and the graph object formed based on the historical registration store sample set, where each store sample in the to-be-predicted registration store sample set comprises the store unique identifier of the corresponding store, store registration attribute information, and store registration behavior feature data. The new node construction sub-module is configured to take the store unique identifier in each store sample as a new node of the graph object, judge, according to the store registration attribute information corresponding to each node in the graph object and the preset edge connection rule, whether the edge connection rule is satisfied between every two nodes of the graph object, and if so, establish an edge connection between the corresponding nodes so as to integrate the new nodes into the graph object. The feature data importing sub-module is configured to import the store registration behavior feature data of the to-be-predicted registration store sample set into the graph object and associate it with the corresponding nodes, thereby updating the graph object.
On the basis of any embodiment of the device, the risk probability prediction module 5300 comprises an illegal registration report generation sub-module, an illegal registration regulations generation sub-module, and an illegal registration store judgment sub-module. The illegal registration report generation sub-module is configured to, when a store is confirmed to be an illegal registration store, generate an illegal registration report corresponding to the store according to the store registration attribute information and store registration behavior feature data of the store, and to send the illegal registration report to the user corresponding to the store. The illegal registration regulations generation sub-module is configured to iterate the illegal registration report generation event until the number of generated illegal registration reports exceeds a preset report number threshold, and to call the large language model to generate corresponding illegal registration regulations according to the generated illegal registration reports, the illegal registration regulations comprising a plurality of illegal registration details. The illegal registration store judgment sub-module is configured to, after determining that the number of illegal registration details exceeds a preset detail number threshold and after the to-be-predicted registration store sample set is obtained, call the large language model to judge, according to the illegal registration details and the to-be-predicted registration store sample set, whether each store in the to-be-predicted registration store sample set is an illegal registration store.
The device comprises an illegal registration store judging sub-module, a graph object updating sub-module, a judgment accuracy calculating sub-module, and a large language model enabling sub-module. The illegal registration store judging sub-module is configured to obtain the illegal registration store sample set corresponding to the stores in the to-be-predicted registration store sample set that the large language model judged to be illegal registration stores. The graph object updating sub-module is configured to obtain the graph object formed based on the historical registration store sample set and to update the graph object in combination with the illegal registration store sample set. The judgment accuracy calculating sub-module is configured to input the updated graph object into the trained graph neural network model, count the number of stores in the illegal registration store sample set that the graph neural network model judges to be illegal registration stores, and divide this number by the total number of stores in the illegal registration store sample set to obtain the accuracy with which the large language model judges illegal registration stores. The large language model enabling sub-module is configured to enable the large language model to identify illegal registration stores when the accuracy reaches a preset accuracy threshold.
To solve the above technical problems, an embodiment of the present application further provides a computer device. Fig. 4 schematically shows the internal structure of the computer device. The computer device includes a processor, a computer readable storage medium, a memory, and a network interface connected by a system bus. The computer readable storage medium of the computer device stores an operating system, a database, and computer readable instructions; the database can store a control information sequence, and the computer readable instructions, when executed by the processor, cause the processor to implement a store violation registration identification method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer readable instructions that, when executed by the processor, cause the processor to perform the store violation registration identification method of the present application. The network interface of the computer device is used to connect to and communicate with a terminal. Those skilled in the art will appreciate that the structure shown in fig. 4 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine some of the components, or have a different arrangement of components.
The processor in this embodiment is configured to execute the specific functions of each module and its sub-modules in fig. 3, and the memory stores the program code and the various data required to execute those modules or sub-modules. The network interface is used for data transmission with the user terminal or the server. The memory in this embodiment stores the program code and data required to execute all modules and sub-modules of the store violation registration identifying apparatus of the present application, and the server can call this program code and data to execute the functions of all sub-modules.
The present application also provides a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the store violation registration identification method of any of the embodiments of the present application.
Those skilled in the art will appreciate that all or part of the processes of the methods in the above embodiments of the present application may be implemented by a computer program instructing relevant hardware, where the computer program may be stored on a computer readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a computer readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
Those skilled in the art will appreciate that the various operations, methods, steps in the flows, measures, and schemes discussed in the present application may be alternated, altered, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and flows discussed in the present application may also be alternated, altered, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that involve the various operations, methods, and flows disclosed in the present application may also be alternated, altered, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are also intended to fall within the scope of protection of the present application.