Disclosure of Invention
The application provides an intelligent collaborative management method and platform based on artificial intelligence driving, which at least solve the technical problem of low post management efficiency caused by the inability of related technologies to efficiently and accurately distribute posts in an exchange platform to the objects best suited to process them.
According to one aspect of the application, an intelligent collaborative management method based on artificial intelligence driving is provided, which comprises the steps of: obtaining a target post to be processed in an exchange platform, and determining a multidimensional feature vector of the target post, wherein the multidimensional feature vector comprises a theme classification feature, an emotion feature and a keyword coding feature; constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and employee nodes, edges in the graph neural network comprise inter-department collaboration edges, department-employee membership edges and inter-employee collaboration edges, department node attributes at least comprise the number of posts to be processed, the average post processing duration and a professional field code, and employee node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; analyzing the multidimensional feature vector and the graph neural network by utilizing a first deep Q network to obtain a target node in the graph neural network for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post and the topology structure and node attributes of the graph neural network, and the actions in the first deep Q network comprise selecting a target department node, selecting a target employee node and performing multi-hop decisions along graph edges; and forwarding the target post to the target node, obtaining a processing result of the target node for the target post, and forwarding the processing result to an information receiving area or a target communication address associated with the target post.
Optionally, the reward function of the first deep Q network is determined by: determining a first sub-reward function according to a matching score, wherein the matching score is used for representing the semantic matching degree between the post content and the node's professional field and is obtained by calculating, through natural language processing, the similarity between post keywords and preset node labels; determining a second sub-reward function according to a time delay, wherein the time delay represents the time difference between the post creation timestamp and the successful allocation timestamp; determining a third sub-reward function according to a load index, wherein the load index represents the ratio of the node's current task queue length to the node's maximum processing capacity; determining a fourth sub-reward function according to a satisfaction score, wherein the satisfaction score represents the feedback rating provided by a user after the post processing is completed; and determining the reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function and the fourth sub-reward function.
Optionally, determining the reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function and the fourth sub-reward function comprises the steps of: obtaining an initial weight coefficient for each sub-reward function; continuously monitoring a post allocation success rate index, an average processing delay index and a user satisfaction mean index; increasing the first initial weight coefficient corresponding to the first sub-reward function and the fourth initial weight coefficient corresponding to the fourth sub-reward function when a decrease in the post allocation success rate index is detected; increasing the second initial weight coefficient corresponding to the second sub-reward function and the third initial weight coefficient corresponding to the third sub-reward function when an increase in the average processing delay index is detected; increasing the fourth initial weight coefficient when a decrease in the user satisfaction mean index is detected; performing a normalization operation after each weight coefficient update so that the sum of the updated weight coefficients is 1, thereby obtaining a first weight coefficient, a second weight coefficient, a third weight coefficient and a fourth weight coefficient; and determining the reward function according to the four sub-reward functions and the first, second, third and fourth weight coefficients.
Optionally, determining the multidimensional feature vector of the target post comprises: extracting a semantic embedding vector of the target post by using a pre-trained language model; calculating, through a topic attention layer, the probability distribution over different topic classifications and normalizing the distribution to obtain a first feature component; calculating a vocabulary strength score of the target post based on a financial emotion dictionary, wherein the financial emotion dictionary comprises a plurality of preset words representing positive, neutral and negative emotions, and the vocabulary strength score quantifies the strength of the emotional tendency explicitly expressed in the target post; determining an emotion probability of the target post by using a deep learning model, wherein the emotion probability represents the probability that the target post belongs to positive, neutral or negative emotion; dynamically weighting and fusing the vocabulary strength score and the emotion probability according to a financial term density to obtain a second feature component, wherein the financial term density is the ratio of the number of financial terms appearing in the target post to the total number of words of the target post; removing general stop words and high-frequency financial function words from the target post, retaining professional terms whose term frequency-inverse document frequency values exceed a preset threshold, jointly screening core words by using a term co-occurrence graph and a centrality measure, and determining a third feature component based on the screened core words; and splicing the first feature component, the second feature component and the third feature component, and performing dimension reduction on the splicing result through an orthogonal-constraint linear transformation layer to obtain the multidimensional feature vector.
Optionally, after the target post to be processed in the communication platform is obtained, the method further comprises: encoding the target post by utilizing a pre-trained language model to obtain a semantic feature vector; determining a normalized click rate value and a poster job-level weight of the target post; splicing the semantic feature vector, the normalized click rate value and the poster job-level weight, and determining the splicing result as a state vector; and analyzing the state vector by utilizing a second deep Q network to determine whether the target post is an essence post, wherein the action space of the second deep Q network comprises marking the post as an essence post and not marking it as an essence post, and the reward function of the second deep Q network comprises a compliance score, an expert review pass rate and a false-mark penalty; the compliance score is a quantified result of risk scanning of the post content against a preset financial rule base and equals the difference between 1 and a target ratio, the target ratio being the ratio of the number of occurrences of risk keywords to the total number of keywords; the expert review pass rate is the ratio of the number of posts marked and confirmed as essence posts to the total number of marked posts; and the false-mark penalty is the penalty cost of a wrong decision and comprises a first cost and a second cost, the first cost being the post exposure multiplied by a preset unit attention cost, and the second cost being the post knowledge value coefficient multiplied by a preset decay factor.
Optionally, if the target post is determined to be an essence post by using the second deep Q network, after the processing result of the target node for the target post is obtained, the processing result is sent to the target object for rechecking, the rechecking result is obtained, and the rechecking result is forwarded to an information receiving area or a target communication address associated with the target post.
Optionally, the processing result at least comprises target content, a processing conclusion type, a user privacy level identifier and a result sensitivity label, and forwarding the processing result to an information receiving area or a target communication address associated with the target post comprises: forwarding the target content to the information receiving area when the processing conclusion type is a first type representing a public reply request; forwarding the target content to the target communication address when the processing conclusion type is a second type representing a personal transaction; forwarding the target content to the information receiving area when the user privacy level identifier is a first identifier representing public authority; forwarding the target content to the target communication address when the user privacy level identifier is a second identifier representing private authority; and forwarding the target content to the target communication address when the result sensitivity label is a first label representing private data.
According to another aspect of the application, an intelligent collaborative management platform based on artificial intelligence driving is further provided, comprising an acquisition unit, a construction unit, an analysis unit and a forwarding unit, wherein the acquisition unit is used for acquiring a target post to be processed in the communication platform and determining a multidimensional feature vector of the target post, the multidimensional feature vector comprising a topic classification feature, an emotion feature and a keyword coding feature; the construction unit is used for constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and employee nodes, edges in the graph neural network comprise inter-department collaboration edges, department-employee membership edges and inter-employee collaboration edges, department node attributes at least comprise the number of posts to be processed, the average post processing duration and a professional field code, and employee node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; the analysis unit is used for analyzing the multidimensional feature vector and the graph neural network by means of a first deep Q network to obtain a target node for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post and the topology structure and node attributes of the graph neural network, and the actions in the first deep Q network comprise selecting a target department node, selecting a target employee node and performing multi-hop decisions along graph edges; and the forwarding unit is used for forwarding the target post to the target node, obtaining a processing result of the target node for the target post, and forwarding the processing result to an information receiving area or a target communication address associated with the target post.
According to still another aspect of the present application, there is also provided a non-volatile storage medium, the storage medium including a stored program, wherein the program, when running, controls a device in which the storage medium is located to execute the above intelligent collaborative management method based on artificial intelligence driving.
According to still another aspect of the present application, there is also provided an electronic device including a memory and a processor, the processor being configured to run a program stored in the memory, wherein the program, when run, performs the above intelligent collaborative management method based on artificial intelligence driving.
According to still another aspect of the present application, there is also provided a computer program, wherein the computer program, when executed by a processor, implements the above intelligent collaborative management method based on artificial intelligence driving.
According to yet another aspect of the present application, there is also provided a computer program product comprising a non-volatile computer readable storage medium, wherein the non-volatile computer readable storage medium stores a computer program which, when executed by a processor, implements the above intelligent collaborative management method based on artificial intelligence driving.
The method comprises the steps of: acquiring a target post to be processed in an exchange platform, and determining a multidimensional feature vector of the target post, wherein the multidimensional feature vector comprises a theme classification feature, an emotion feature and a keyword coding feature; constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and staff nodes, edges in the graph neural network comprise inter-department collaboration edges, department-staff membership edges and inter-staff collaboration edges, department node attributes at least comprise the number of posts to be processed, the average post processing duration and a professional field code, and staff node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; analyzing the multidimensional feature vector and the graph neural network by utilizing a first deep Q network to obtain a target node in the graph neural network for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post and the topological structure and node attributes of the graph neural network, and the actions in the first deep Q network comprise selecting a target department node, selecting a target staff node and performing multi-hop decisions along graph edges; and forwarding the target post to the target node, obtaining the processing result of the target node for the target post, and forwarding the processing result to an information receiving area or a target communication address associated with the target post. In this way, posts in the exchange platform are efficiently and accurately distributed to the objects best suited to process them, which improves post management efficiency and solves the technical problem of low post management efficiency caused by the inability of related technologies to distribute posts efficiently and accurately.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without making any inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an embodiment of the present application, there is provided a method embodiment of an intelligent collaborative management method based on artificial intelligence driving, it should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that herein.
FIG. 1 is a flow chart of an intelligent collaborative management method based on artificial intelligence driving according to an embodiment of the application, as shown in FIG. 1, the method comprises the following steps:
Step S102, obtaining a target post to be processed in the communication platform, and determining a multidimensional feature vector of the target post, wherein the multidimensional feature vector comprises a theme classification feature, an emotion feature and a keyword coding feature.
The communication platform comprises, but is not limited to, an employee forum, an instant messaging platform and a document collaboration platform.
It will be appreciated that in order to more accurately understand and process the mass content of posts in an exchange platform, particularly to identify information that contains potential task value or that requires further action, each post that requires significant attention, i.e., the target post, needs to be converted into a structured mathematical representation. This mathematical representation is known as a "multi-dimensional feature vector," which is essentially a list or array of numbers, where each location (dimension) represents the quantized feature value of the post in a particular aspect.
Specifically, the multi-dimensional feature vector includes three core components. Topic classification features refer to the business domain or content category mainly discussed in the post, such as "credit product," "internal flow optimization," "IT system problem," or "customer service feedback"; these are automatically judged by natural language processing techniques (e.g., topic models or text classifiers), and the topic category to which the post belongs is represented by a specific numerical code. Emotion features reflect the emotional tendency or attitude strength expressed by the author of the post, and may be a numerical value calculated by an emotion analysis model, such as a score ranging from -1 (extremely negative) to +1 (extremely positive), or a finer-grained classification with codes corresponding to positive, neutral and negative emotion. Keyword encoding features involve identifying and quantifying core terms or phrases appearing in the post (e.g., "slow approval", "system lag", "suggest adding function X"); word embedding techniques may be used to generate a sequence of values that captures the key concepts mentioned in the post and their importance.
And calculating and combining the characteristic values of the three dimensions to finally form a multidimensional characteristic vector. This vector becomes a "digital fingerprint" of the post.
Step S104, constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and employee nodes, edges in the graph neural network comprise department cooperation edges, department-employee membership edges and employee cooperation edges, department node attributes at least comprise post numbers to be processed, post average processing time length and professional field codes, and employee node attributes at least comprise current states, skill feature vectors and historical satisfaction scores.
The graph neural network is composed of nodes (or vertices) representing entities and edges representing connection relationships between the nodes. The nodes and the edges can have respective feature vectors describing attribute information, and the whole graph can also carry global information. By iteratively passing and aggregating the feature information of nodes and their neighbors and learning representations of nodes and of the graph, the graph neural network can capture the dependency relationships and topological structure features between nodes in the graph. A skill feature vector is a vector used to represent an employee's skills: each skill dimension (e.g., type of skill, proficiency, application scenario) is quantified as a numerical value, and the skills are fully described by the vector of these values.
In the above step S104, the nodes in the graph neural network represent basic constituent units of the enterprise and are mainly divided into two types, namely department nodes and employee nodes. Department nodes, such as credit, risk control and IT support, represent functional units in the enterprise, and employee nodes represent specific personnel. The connection relations between the nodes are embodied by edges: inter-department collaboration edges describe formal or informal cooperation relationships generated between different departments by projects or processes (such as cooperation of the credit department and the risk control department on an approval process); department-employee membership edges clearly identify the subordinate relationship between employees and the departments to which they belong; and inter-employee collaboration edges reflect the actual work network formed between individual employees based on historical project cooperation, knowledge sharing or social interaction (for example, employees of two different departments frequently cooperating to solve customer problems).
Further, in order for the graph neural network to learn and perform automatic task allocation efficiently, each node is given attribute information describing its state and capabilities. The attributes of a department node at least comprise: the number of posts to be processed, which quantifies the backlog of task requests the department has currently received from the forum and is a key index for measuring the department's load; the average post processing time; and the professional field code, which describes, through a vectorized representation, the department's core responsibilities and service types (such as credit approval, system operation and maintenance, and customer complaints), making it convenient to match tasks to the most suitable department. The attributes of an employee node at least comprise: the current state (such as idle, busy or on vacation); the skill feature vector, a multidimensional vector that encodes the various skills mastered by the employee and their proficiency (such as data analysis - advanced, client communication - intermediate, Python programming - proficient), which is the core basis for accurately matching the skills required by a task; and the historical satisfaction score, calculated based on the quality, timeliness or colleague evaluations of previously completed tasks, which measures the employee's reliability and work performance level.
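As a concrete illustration of the graph described above, the following minimal sketch builds such an enterprise graph with the networkx library; all node names, attribute values and vector lengths are hypothetical examples rather than values prescribed by this application.

```python
import networkx as nx

# Minimal illustrative enterprise graph; names and values are hypothetical.
g = nx.Graph()

# Department nodes: pending post count, average processing time (hours), domain code.
g.add_node("dept_credit", kind="department", pending_posts=12,
           avg_processing_hours=6.5, domain_code=[0.9, 0.1, 0.0])
g.add_node("dept_it", kind="department", pending_posts=4,
           avg_processing_hours=2.0, domain_code=[0.0, 0.2, 0.8])

# Employee nodes: current state, skill feature vector, historical satisfaction score.
g.add_node("emp_alice", kind="employee", state="idle",
           skills=[0.8, 0.3, 0.1], satisfaction=4.6)
g.add_node("emp_bob", kind="employee", state="busy",
           skills=[0.2, 0.1, 0.9], satisfaction=4.2)

# Edges: inter-department collaboration, department-employee membership,
# inter-employee collaboration.
g.add_edge("dept_credit", "dept_it", kind="dept_collab")
g.add_edge("dept_credit", "emp_alice", kind="membership")
g.add_edge("dept_it", "emp_bob", kind="membership")
g.add_edge("emp_alice", "emp_bob", kind="emp_collab")
```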
And S106, analyzing the multidimensional feature vector and the graph neural network by using a first depth Q network to obtain a target node for processing the target post in the graph neural network, wherein the state in the first depth Q network comprises the multidimensional feature vector of the current post, the topological structure and the node attribute of the graph neural network, and the actions in the first depth Q network comprise selecting a target department node, selecting a target employee node and executing multi-hop decision along the graph edge.
Step S106 combines the multidimensional feature vector representing the core information of the post to be processed from step S102 with the graph neural network describing the internal structure, capabilities and relationships of the enterprise from step S104, and automatically and intelligently decides, through reinforcement learning (specifically, a deep Q network, DQN), which department or employee should be responsible for processing the task requirement contained in the post.
In step S106, the state of the first deep Q network includes a multidimensional feature vector of the current target post, which provides core information of task requirements, such as subject of discussion (e.g., "credit approval process optimization"), expressed emotional tendency (e.g., "negative, indicating that there is a serious problem with the process"), and key appeal vocabulary (e.g., "simplified", "accelerated", "automated"). The state also comprises the current topological structure of the whole graph neural network and the real-time attribute of all nodes in the graph neural network, wherein the current topological structure of the graph neural network reflects the cooperative relationship among departments, the membership of the departments and staff and the possible cooperative links among staff, and the real-time attribute of all nodes such as the current post count (load) to be processed of each department, the average processing time (efficiency), the professional field (capability matching degree), the current state (whether idle) of the staff, the skill vector (whether the skill required by the processing task is provided) and the historical satisfaction score (reliability and performance). It will be appreciated that the above states fully describe what the task is and what the departments and employees can do at present.
The actions mainly include three categories: selecting a target department node (directly assigning the task to a department for processing), selecting a target employee node (directly assigning the task to a particular employee for processing), and performing multi-hop decisions along the graph edges. Multi-hop decisions are particularly important for modeling the process of posts (i.e., tasks in reinforcement learning) flowing through the graph neural network. For example: "first step: select department A -> second step: move to department B along the collaboration edge of department A and department B -> third step: select employee C within department B". Or: "first step: select employee X -> second step: transfer to employee Y along the collaboration edge between employee X and employee Y". This action design enables flexible handling of situations where cross-department collaboration is required or a specific specialist is sought, greatly enhancing the rationality and adaptability of the decision.
Notably, multi-hop decisions allow the agent to make continuous, path-like movements on the graph structure made up of department and employee nodes and the various edges connecting them (inter-department collaboration edges, department-employee membership edges, inter-employee collaboration edges), ultimately reaching one or more target nodes. Specifically, each "hop" represents an action that the DQN selects in the current state; it is not the final allocation, but rather instructs the system to move along a particular edge in the graph to the next adjacent node. For example, a first hop starts from a starting point (e.g., a department node A initially associated with the subject of the post) and moves along an inter-department collaboration edge to another department node B, simulating the task being handed over from department A to department B because cross-department collaboration is needed. A second hop, after reaching department node B, moves from department node B along a department-employee membership edge to an employee node C affiliated with that department, indicating that the task is assigned to a specific employee C within department B. A third hop may then be taken at employee node C: the DQN again decides to move along an inter-employee collaboration edge to another employee node D (even though D may belong to another department). This simulates employee C judging that the task cannot be completed alone, or that employee D is a more appropriate expert, and issuing a task transfer or collaboration request to D.
This "jump" process described above may continue until the DQN considers the most appropriate final processing node (target department or target employee) to be found, or a preset maximum number of hops limit is reached. The choice of each step "jump" (which edge to walk, to which neighboring node) depends on the DQN's analysis and evaluation of the current combined state (post feature vector + real-time structure/attributes of the whole graph) with the goal of maximizing long-term jackpot.
Step S108, forwarding the target posts to the target nodes, obtaining processing results of the target nodes aiming at the target posts, and forwarding the processing results to an information receiving area or a target communication address associated with the target posts.
According to the method, a target post to be processed in an exchange platform is obtained, and a multidimensional feature vector of the target post is determined, wherein the multidimensional feature vector comprises a theme classification feature, an emotion feature and a keyword coding feature; a graph neural network is constructed, wherein nodes in the graph neural network comprise department nodes and staff nodes, edges in the graph neural network comprise inter-department collaboration edges, department-staff membership edges and inter-staff collaboration edges, department node attributes at least comprise the number of posts to be processed, the average post processing duration and a professional field code, and staff node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; the multidimensional feature vector and the graph neural network are analyzed by a first deep Q network to obtain a target node in the graph neural network for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post and the topological structure and node attributes of the graph neural network, and the actions in the first deep Q network comprise selecting a target department node, selecting a target staff node and performing multi-hop decisions along graph edges; the target post is forwarded to the target node, the processing result of the target node for the target post is obtained, and the processing result is forwarded to an information receiving area or a target communication address associated with the target post. In this way, posts in the communication platform are accurately distributed to the objects best suited to process them and the processing results are returned to the associated receiving area or communication address, so that the efficiency of post management is improved.
The steps shown in fig. 1 are exemplarily illustrated and explained below.
According to some alternative embodiments of the present application, the reward function of the first deep Q network in step S106 is determined by determining a first sub-reward function according to a matching score, wherein the matching score is used for representing a semantic matching degree between the post content and the node professional field, the matching score is obtained by calculating a similarity between the post keyword and the node preset label through a natural language processing technology, determining a second sub-reward function according to a time delay, wherein the time delay is used for representing a time difference from the post creation time stamp to the successful allocation time stamp, determining a third sub-reward function according to a load index, wherein the load index is used for representing a ratio of a current task queue length of the node to a maximum processing capacity of the node, determining a fourth sub-reward function according to a satisfaction score, wherein the satisfaction score is used for representing a feedback rating provided by a user after the post processing is completed, and determining the reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function, and the fourth sub-reward function.
In this embodiment, the first sub-reward function is based on a matching score. The matching score is used to accurately measure the semantic matching degree between the task demand content contained in the target post (represented by the keyword codes, topic classifications and the like in the multidimensional feature vector) and the professional capabilities of a potential processing node (department or employee). Specifically, natural language processing techniques (e.g., word vector similarity calculation, topic model matching analysis) may be utilized to calculate a similarity score between the keywords or core semantics extracted from the post and the professional domain labels preset for the node (e.g., the "professional field code" of a department or the core capability labels in an employee's "skill feature vector"). The higher the matching score, the more specialized capability the node has for handling the task, and the greater the potential for high task completion quality and efficiency.
The time delay refers to the time difference between the timestamp at which the forum post was created and the timestamp at which the task demand represented by the post was successfully assigned to the target node. This index directly reflects the efficiency of the task allocation link; too long a delay leads to lagging responses, affecting business operations and the employee experience. The load index is used to evaluate how busy a target node (department or employee) is at the time of task allocation and its resource availability. The load index is defined as the ratio of the node's current pending task queue length (e.g., the department's "number of posts to be processed" or the number of tasks currently undertaken by the employee) to the node's maximum processing capacity. The higher the ratio, the more overloaded the node, and assigning new tasks at this point may lead to processing delays, quality degradation, or even system breakdown. The satisfaction score is the final quality feedback after task closure, typically derived from the feedback rating (e.g., a five-star score or a satisfied/dissatisfied label) provided by the post initiator or related user after task processing is complete. The satisfaction score directly reflects whether the result of task execution really solves the problem and meets the requirement.
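The four sub-reward signals can be pictured with the following sketch. The text fixes what each signal measures but not its exact functional form, so the cosine similarity for matching and the specific scalings below are assumptions.

```python
import numpy as np

def sub_rewards(post_kw_vec, node_label_vec, created_ts, assigned_ts,
                queue_len, max_capacity, feedback_stars):
    """Compute the four sub-reward signals described above (illustrative forms)."""
    # r1: semantic match between post keywords and the node's domain labels.
    r_match = float(np.dot(post_kw_vec, node_label_vec) /
                    (np.linalg.norm(post_kw_vec) * np.linalg.norm(node_label_vec) + 1e-9))
    # r2: penalize allocation delay (seconds from creation to successful assignment).
    r_delay = 1.0 / (1.0 + (assigned_ts - created_ts) / 3600.0)
    # r3: penalize load = current queue length / maximum processing capacity.
    r_load = 1.0 - min(queue_len / max_capacity, 1.0)
    # r4: user feedback rating after processing, scaled to [0, 1].
    r_satisfaction = feedback_stars / 5.0
    return r_match, r_delay, r_load, r_satisfaction
```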
Determining the reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function and the fourth sub-reward function can be achieved by: obtaining an initial weight coefficient for each sub-reward function; continuously monitoring a post allocation success rate index, an average processing delay index and a user satisfaction mean index; increasing the first initial weight coefficient corresponding to the first sub-reward function and the fourth initial weight coefficient corresponding to the fourth sub-reward function when a decrease in the post allocation success rate index is detected; increasing the second initial weight coefficient corresponding to the second sub-reward function and the third initial weight coefficient corresponding to the third sub-reward function when an increase in the average processing delay index is detected; increasing the fourth initial weight coefficient when a decrease in the user satisfaction mean index is detected; performing a normalization operation after each weight coefficient update so that the sum of the updated weight coefficients is 1, thereby obtaining a first weight coefficient, a second weight coefficient, a third weight coefficient and a fourth weight coefficient; and determining the reward function according to the four sub-reward functions and the first, second, third and fourth weight coefficients.
In this embodiment, first, an initial weight coefficient is set for each sub-reward function (e.g., based on business experience or preliminary experiments: a matching coefficient w1, a delay coefficient w2, a load coefficient w3 and a satisfaction coefficient w4). Several core business operation indexes are continuously monitored: the post allocation success rate index (the proportion of posts successfully converted into tasks and allocated), the average processing delay index (the total average time from post creation to task completion), and the user satisfaction mean index (the average user feedback on task processing results).
Then, according to the change trend of the monitoring indexes, the corresponding weight coefficients are dynamically adjusted:
1. When a decrease in the post allocation success rate index is detected, it indicates that the task allocation link is problematic: either the task requirement is not matched with the capability of the processor (low matching degree), or the processing result does not satisfy users (reducing their willingness to convert subsequent posts into tasks). Therefore, the weight coefficient w1 corresponding to the first sub-reward function (matching score) and the weight coefficient w4 corresponding to the fourth sub-reward function (satisfaction score) need to be increased. The purpose is to guide the DQN, when deciding, to pay more attention to selecting nodes with high professional-ability matching and to favor allocation paths that ultimately yield high satisfaction, thereby improving the probability that tasks are successfully received and efficiently completed.
2. When an increase in the average processing delay index is detected, it indicates that the overall period from presentation to completion of tasks has become longer and efficiency has dropped. The delay may result from the allocation process itself taking too long (a large time delay) or from tasks being allocated to nodes that are already too busy (high load), causing tasks to back up in queues. Therefore, the weight coefficient w2 corresponding to the second sub-reward function (time delay) and the weight coefficient w3 corresponding to the third sub-reward function (load index) need to be increased, causing the DQN to preferentially select nodes that can respond quickly to allocation requests (low time delay) and are currently lightly loaded (low load index), so as to shorten the overall processing cycle.
3. When a decrease in the user satisfaction mean index is detected, it reflects that the final quality of task processing results is poor and users are dissatisfied. Although there may be many reasons for reduced satisfaction, increasing the weight of the fourth sub-reward is the most direct countermeasure. Therefore, the weight coefficient w4 corresponding to the fourth sub-reward function (satisfaction score) needs to be increased.
Finally, after one or more weight coefficients are adjusted according to the above rules, a normalization operation is performed, i.e., the sum of the four updated weight coefficients is guaranteed to be strictly equal to 1. The reward function of the first deep Q network is then obtained by multiplying each of the four dynamically adjusted weight coefficients by the corresponding sub-reward function value and summing the products, i.e., R = w1*r1 + w2*r2 + w3*r3 + w4*r4.
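A minimal sketch of the weight adjustment, normalization and weighted summation described above follows; the additive step size is an assumption, since the text only specifies which weights increase for which indicator change. The `sub_r` tuple can be the output of the earlier `sub_rewards` sketch.

```python
def update_weights(w, alloc_success_down, delay_up, satisfaction_down, step=0.05):
    """Adjust (w1, w2, w3, w4) per the monitored indicators, then normalize to sum to 1."""
    w = list(w)
    if alloc_success_down:    # allocation success rate fell
        w[0] += step          # w1: matching
        w[3] += step          # w4: satisfaction
    if delay_up:              # average processing delay rose
        w[1] += step          # w2: time delay
        w[2] += step          # w3: load index
    if satisfaction_down:     # user satisfaction mean fell
        w[3] += step          # w4: satisfaction
    total = sum(w)
    return [x / total for x in w]   # normalization: weights sum to 1

def reward(sub_r, w):
    """R = w1*r1 + w2*r2 + w3*r3 + w4*r4."""
    return sum(wi * ri for wi, ri in zip(w, sub_r))
```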
Within this dynamically constructed reward function, the relative importance (weight coefficients) of the dimensions varies intelligently: matching and satisfaction are emphasized more when the allocation success rate is low, fast response and load balancing are emphasized more when processing delay is high, and final satisfaction is emphasized most when users are dissatisfied. This closed-loop feedback mechanism enables the DQN to continuously learn and optimize its allocation strategy to cope with changing business demands, ultimately driving the whole intelligent task allocation system to evolve toward optimal comprehensive performance.
According to other optional embodiments of the application, the multidimensional feature vector of the target post is determined by: extracting a semantic embedding vector of the target post by using a pre-trained language model; calculating, through a topic attention layer, the probability distribution over different topic classifications and normalizing the distribution to obtain a first feature component; calculating a vocabulary strength score of the target post based on a financial emotion dictionary, wherein the financial emotion dictionary comprises a plurality of preset words representing positive, neutral and negative emotions, and the vocabulary strength score quantifies the strength of the emotional tendency explicitly expressed in the target post; determining an emotion probability of the target post by using a deep learning model, wherein the emotion probability represents the probability that the target post belongs to positive, neutral or negative emotion; dynamically weighting and fusing the vocabulary strength score and the emotion probability according to a financial term density to obtain a second feature component, wherein the financial term density is the ratio of the number of financial terms appearing in the target post to the total number of words of the target post; removing general stop words and high-frequency financial function words from the target post, retaining professional terms whose term frequency-inverse document frequency values exceed a preset threshold, jointly screening core words by using a term co-occurrence graph and a centrality measure, and determining a third feature component based on the screened core words; and splicing the first feature component, the second feature component and the third feature component, and performing dimension reduction on the splicing result through an orthogonal-constraint linear transformation layer to obtain the multidimensional feature vector.
In this embodiment, first, the original text of the target post is encoded by using a pre-training language model (such as BERT, roBERTa, etc.), so as to generate a semantic embedded vector containing overall semantic information. Next, to capture the core business category of post discussion, a topic attention layer is introduced. The layer receives the semantic embedded vector and calculates the probability distribution that the post belongs to a preset different topic classification (e.g. "credit product", "risk control", "operation and maintenance", "customer service", "internal flow", etc.). After normalization processing (ensuring that the sum of all probabilities is 1), the probability distribution forms a first characteristic component representing the main discussion direction of the post, and the theme tendency of the post content is clearly quantized.
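A minimal sketch of such a topic attention layer is given below in PyTorch; the encoder output size, the number of topics and the dot-product attention form are assumptions, and the topic names are only examples.

```python
import torch
import torch.nn as nn

class TopicAttention(nn.Module):
    """Maps a semantic embedding to a normalized topic distribution (first component)."""

    def __init__(self, embed_dim=768, num_topics=5):
        super().__init__()
        # One learnable query vector per preset topic class.
        self.topic_queries = nn.Parameter(torch.randn(num_topics, embed_dim))

    def forward(self, semantic_embedding):                   # shape: (embed_dim,)
        scores = self.topic_queries @ semantic_embedding      # (num_topics,)
        return torch.softmax(scores, dim=0)                   # probabilities sum to 1

# Illustrative topics and input; a real encoder (e.g. BERT) would supply the embedding.
topics = ["credit product", "risk control", "IT system", "customer service", "internal flow"]
first_component = TopicAttention()(torch.randn(768))
```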
Further, in emotion analysis, a dual-track strategy is adopted to improve accuracy in the financial domain context. On one hand, the vocabulary strength score of the post is calculated based on a specially constructed financial emotion dictionary (containing a large number of preset, labeled positive, neutral or negative words related to the financial domain). The score directly quantifies the strength of the emotional tendency explicitly expressed in the text by counting and analyzing the emotion words that appear in the post and their strength (possibly considering word frequency, position, modifiers, etc.). On the other hand, a deep learning emotion analysis model is used to predict the probability distribution that the post as a whole belongs to positive, neutral or negative emotion. Taking a recurrent neural network as an example, by introducing a recurrent structure into the network, information can be passed along the sequence so that sequential information in the text is captured. For a post, the model may process word by word (or character by character), with each word transformed by the embedding layer into a vector of fixed dimension that represents the word's semantic information. The recurrent neural network then lets the information of preceding words influence the processing of following words, so that the context of the text can be understood. The model gradually extracts higher-level text features through a multi-layer neural network structure. In the final stage of the model there is an output layer, which may be a fully connected layer whose output nodes correspond in number to the emotion categories, such as the three categories positive, neutral and negative, with each node outputting a probability value. These probability values are processed through a softmax function so that their sum is 1, forming a probability distribution. The probability distribution intuitively represents the probability that a post belongs to each emotion category; for example, a post may have a probability of 0.7 of belonging to positive emotion, 0.2 of belonging to neutral emotion and 0.1 of belonging to negative emotion, so the emotional tendency of the whole post can be judged from these probabilities while also reflecting the degree of uncertainty of the emotion classification.
Notably, to combine the advantages of both methods and to accommodate the nature of financial text, the financial term density (i.e., the ratio of the number of financial terms appearing in the post to the total number of words of the post) is introduced as a dynamic weighting factor. When the financial term density is higher than a threshold, the text is more domain-specific and the fusion relies more on the vocabulary strength score obtained from the financial emotion dictionary; when the density is lower than the threshold, the fusion leans more toward the emotion probability predicted by the deep learning emotion analysis model. The two results are dynamically weighted and fused according to the term density, finally generating a second feature component reflecting the emotional tendency and strength of the post.
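The dynamic weighted fusion can be pictured with the following sketch; the linear interpolation scheme, the density threshold and the way the model's probability distribution is collapsed into a signed score are assumptions.

```python
def fuse_emotion(vocab_strength, emotion_probs, term_density, threshold=0.15):
    """Dynamically weighted fusion of dictionary and model signals (second component).

    `vocab_strength` is the dictionary score scaled to [-1, 1]; `emotion_probs`
    is (p_positive, p_neutral, p_negative) from the deep learning model.
    """
    # Collapse the model distribution to a signed score for fusion.
    model_score = emotion_probs[0] - emotion_probs[2]
    # More financial terms -> trust the domain dictionary more.
    alpha = min(term_density / threshold, 1.0) if threshold > 0 else 1.0
    fused = alpha * vocab_strength + (1.0 - alpha) * model_score
    return [fused, *emotion_probs]   # second feature component

# Illustrative call: a negative post with high financial term density.
second_component = fuse_emotion(-0.4, (0.1, 0.2, 0.7), term_density=0.22)
```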
Further, to identify the core complaints and professional focus in the post, the emphasis is on extracting key terminology. The post text is preprocessed to remove general stop words and high-frequency financial function words (such as certain high-frequency but low-information financial connective words), and the professional terms whose term frequency-inverse document frequency values exceed a preset threshold are retained. These screened terms constitute a candidate keyword set. To further identify the most core and representative words among them, a term co-occurrence graph is constructed. The nodes in the term co-occurrence graph are the selected professional terms, and edges between nodes are established according to the co-occurrence frequency or semantic association degree (which can be calculated through word vectors) of the terms within the post's context window. A centrality metric (e.g., degree centrality, closeness centrality or eigenvector centrality) is then calculated for each term node in the graph. Highly central terms are considered to be at core positions in the post's semantic network and more representative of the post's core issues. These filtered core words are used to construct a third feature component, which may be a vector obtained by averaging or weighted averaging (with weights determined by centrality) their word vectors.
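A minimal sketch of this keyword screening step follows, using networkx for the co-occurrence graph. The TF-IDF scores are assumed to be computed elsewhere, and the sentence-level co-occurrence window, threshold and top-k cutoff are illustrative choices; degree centrality is used here, though closeness or eigenvector centrality would fit the description equally well.

```python
import itertools
import networkx as nx

def core_terms(sentences, tfidf, stop_words, tfidf_threshold=0.2, top_k=5):
    """Screen core terms for the third feature component.

    `sentences` is a list of sentence strings from the post and
    `tfidf` maps term -> TF-IDF score (computed elsewhere).
    """
    candidates = {t for t, s in tfidf.items()
                  if s >= tfidf_threshold and t not in stop_words}
    cooc = nx.Graph()
    cooc.add_nodes_from(candidates)
    for sent in sentences:                       # co-occurrence within one sentence
        present = [t for t in candidates if t in sent]
        for a, b in itertools.combinations(present, 2):
            w = cooc[a][b]["weight"] + 1 if cooc.has_edge(a, b) else 1
            cooc.add_edge(a, b, weight=w)
    centrality = nx.degree_centrality(cooc)      # highly central terms are kept
    return sorted(candidates, key=lambda t: centrality.get(t, 0.0), reverse=True)[:top_k]
```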
Finally, the first feature component representing the topic tendency, the second feature component representing the emotion intensity and the third feature component containing the core professional appeal are spliced to form a higher-dimensional intermediate vector. In order to compress this high-dimensional intermediate vector into a final, practical, information-dense and dimensionally controllable multidimensional feature vector while reducing redundancy between features, a linear transformation layer with an orthogonal constraint (e.g., a linear layer whose weight matrix is constrained to be approximately orthogonal) is used for dimension reduction. The orthogonal constraint helps preserve as much of the original feature-space information as possible and reduces correlation among features during dimension reduction, finally outputting a low-dimensional, high-information-retention multidimensional feature vector that jointly encodes the post's information along the three key dimensions of topic, emotion and core professional appeal.
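A minimal sketch of the splice-and-reduce step follows, using PyTorch's orthogonal parametrization (available in PyTorch 1.10 and later) to keep the projection matrix approximately orthogonal; the input and output dimensions are illustrative and must match the total size of the concatenated components.

```python
import torch
import torch.nn as nn
from torch.nn.utils import parametrizations

class OrthogonalReducer(nn.Module):
    """Dimension reduction with an (approximately) orthogonal weight matrix."""

    def __init__(self, in_dim=780, out_dim=64):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        # Constrain the weight matrix to be (semi-)orthogonal.
        parametrizations.orthogonal(self.proj, name="weight")

    def forward(self, first, second, third):
        concat = torch.cat([first, second, third], dim=-1)  # splice the three components
        return self.proj(concat)                            # multidimensional feature vector
```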
After the target post to be processed in the communication platform is obtained, the method further comprises: encoding the target post by utilizing a pre-trained language model to obtain a semantic feature vector; determining a normalized click rate value and a poster job-level weight of the target post; splicing the semantic feature vector, the normalized click rate value and the poster job-level weight, and determining the splicing result as a state vector; and analyzing the state vector by utilizing a second deep Q network to determine whether the target post is an essence post, wherein the action space of the second deep Q network comprises marking the post as an essence post and not marking it as an essence post, and the reward function of the second deep Q network comprises a compliance score, an expert review pass rate and a false-mark penalty; the compliance score is a quantified result of risk scanning of the post content against a preset financial rule base and equals the difference between 1 and a target ratio, the target ratio being the ratio of the number of occurrences of risk keywords to the total number of keywords; the expert review pass rate is the ratio of the number of posts marked and confirmed as essence posts to the total number of marked posts; and the false-mark penalty is the penalty cost of a wrong decision and comprises a first cost and a second cost, the first cost being the post exposure multiplied by a preset unit attention cost, and the second cost being the post knowledge value coefficient multiplied by a preset decay factor.
In this embodiment, firstly, a pre-trained language model is utilized to perform deep semantic encoding of the target post, generating a semantic feature vector containing context information; this vector captures the core value of the post in terms of business insight, knowledge depth and similar aspects. Meanwhile, two key auxiliary indexes are determined: a normalized click rate value (the historical click rate of the post converted into a relative heat index in the [0,1] interval, eliminating the influence of traffic differences between boards), and a poster job-level weight (a preset weight coefficient for the poster's job level in the enterprise architecture, e.g., department director = 0.9, ordinary employee = 0.5, reflecting differences in authority). The three types of information are fused into a comprehensive state vector through a splicing operation, forming a complete input representing the value potential of the post.
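A minimal sketch of assembling this state vector follows; the min-max style click normalization and the lookup-table keys are assumptions (the 0.9 and 0.5 weights come from the example in the text).

```python
import numpy as np

# Job-level weights from the example in the text; other levels are hypothetical.
JOB_LEVEL_WEIGHTS = {"department_director": 0.9, "ordinary_employee": 0.5}

def build_state_vector(semantic_vec, clicks, max_clicks, poster_level):
    """Splice semantic features, normalized click rate and job-level weight."""
    click_norm = clicks / max_clicks if max_clicks else 0.0        # relative heat in [0, 1]
    level_weight = JOB_LEVEL_WEIGHTS.get(poster_level, 0.5)        # default is an assumption
    return np.concatenate([semantic_vec, [click_norm, level_weight]])

# Illustrative call with a random stand-in for the language-model encoding.
state = build_state_vector(np.random.rand(768), clicks=320, max_clicks=1000,
                           poster_level="department_director")
```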
The state vector is then input into the second deep Q network for decision analysis. The action space of the network comprises the binary choice of marking the post as an essence post or not. The reward function adopts a triple-constraint mechanism to ensure decision quality. The compliance score is calculated by scanning against a preset financial rule base, with the specific formula 1 - (number of occurrences of risk keywords / total number of keywords); a score approaching 1 indicates full compliance, and a negative reward is triggered when the score falls below a threshold. The expert review pass rate serves as a supervised feedback signal and is calculated as the proportion of marked posts confirmed as essence posts by domain experts to the total number marked (for example, if 80 of 100 marked posts are approved by experts, the reward is 0.8). The false-mark penalty quantifies the cost of a wrong decision and comprises two key cost items: the first cost is the post exposure multiplied by a preset unit attention cost (the unit cost being converted into monetary value according to the man-hours employees lose reading a typical document), used to penalize the attention resources occupied by low-quality posts; the second cost is the post knowledge value coefficient multiplied by a preset decay factor (the knowledge value coefficient being estimated retrospectively from indicators such as the number of knowledge base references, and the decay factor growing exponentially with the time taken to discover the false mark), used to penalize the implicit loss of missing high-value content.
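The reward terms for the second deep Q network can be pictured with the following sketch. The individual formulas follow the text, while the unweighted way the three terms are combined and the zero reward for unmarked posts are assumptions.

```python
def essence_reward(marked, risk_hits, total_keywords, expert_confirmed, total_marked,
                   exposure, unit_attention_cost, knowledge_coeff, decay_factor,
                   wrongly_marked):
    """Illustrative reward for the essence-marking decision."""
    # Compliance: 1 - (risk keyword occurrences / total keywords).
    compliance = 1.0 - (risk_hits / total_keywords if total_keywords else 0.0)
    # Expert review pass rate: confirmed essence posts / total marked posts.
    review_pass_rate = (expert_confirmed / total_marked) if total_marked else 0.0
    penalty = 0.0
    if wrongly_marked:
        penalty = (exposure * unit_attention_cost        # attention wasted on a low-quality post
                   + knowledge_coeff * decay_factor)     # implicit loss around high-value content
    return compliance + review_pass_rate - penalty if marked else 0.0
```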
Finally, intelligent decision-making is realized through dynamically balanced reward factors: when the compliance score is low, the second deep Q network avoids risky content; when the expert review pass rate decreases, the second deep Q network tightens its marking standard; and the double-cost design of the false-mark penalty ensures a balance between reducing noise interference and preventing the loss of knowledge assets. Through continuous training, the second deep Q network learns the optimal strategy, namely triggering an essence mark only when the post's comprehensive state vector predicts a positive net benefit (expected reward > 0), thereby accurately capturing knowledge value.
Further, if the target post is determined to be an essence post by using the second depth Q network, after the processing result of the target node for the target post is obtained, the processing result is sent to the target object for rechecking, the rechecking result is obtained, and the rechecking result is forwarded to an information receiving area or a target communication address associated with the target post.
As some optional embodiments of the application, the processing result at least comprises target content, processing conclusion type, user privacy level identification and result sensitivity label.
Further, forwarding the processing result to the information receiving area or the target communication address associated with the target post may be achieved by: forwarding the target content to the information receiving area if the processing conclusion type is a first type representing a public reply request; forwarding the target content to the target communication address if the processing conclusion type is a second type representing a personal transaction; forwarding the target content to the information receiving area if the user privacy level identification is a first identification representing public rights; forwarding the target content to the target communication address if the user privacy level identification is a second identification representing private rights; and forwarding the target content to the target communication address if the result sensitivity label is a first label representing private data.
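A minimal routing sketch for this embodiment follows; the field names and return labels are hypothetical, and the precedence given to the sensitivity and privacy checks over the conclusion type is an assumption, since the text does not specify an order.

```python
def route_result(result):
    """Route the processing result per conclusion type, privacy level and sensitivity.

    `result` is a dict with illustrative keys: "sensitivity_label", "privacy_level",
    "conclusion_type".
    """
    if result["sensitivity_label"] == "private_data":        # first sensitivity label
        return "target_communication_address"
    if result["privacy_level"] == "private_rights":          # second privacy identifier
        return "target_communication_address"
    if result["conclusion_type"] == "personal_transaction":  # second conclusion type
        return "target_communication_address"
    if result["conclusion_type"] == "public_reply":          # first conclusion type
        return "information_receiving_area"
    if result["privacy_level"] == "public_rights":           # first privacy identifier
        return "information_receiving_area"
    return "information_receiving_area"                      # assumed default
```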
FIG. 2 is a block diagram of an intelligent collaborative management platform based on artificial intelligence drivers, according to an embodiment of the application, as shown in FIG. 2, the platform comprising:
the obtaining unit 22 is configured to obtain a target post to be processed in the communication platform, and determine a multidimensional feature vector of the target post, where the multidimensional feature vector includes a topic classification feature, an emotion feature, and a keyword coding feature.
The construction unit 24 is configured to construct a graph neural network, where nodes in the graph neural network include department nodes and employee nodes, edges in the graph neural network include inter-department collaboration edges, department-employee membership edges, and inter-employee collaboration edges, and the department node attributes include at least a post count to be processed, a post average processing duration, and a professional field code, and the employee node attributes include at least a current state, a skill feature vector, and a historical satisfaction score.
The analysis unit 26 is configured to analyze the multidimensional feature vector and the graph neural network by using a first deep Q network, so as to obtain a target node in the graph neural network for processing the target post, where a state in the first deep Q network includes the multidimensional feature vector of the current post, a topology structure of the graph neural network, and node attributes, and an action in the first deep Q network includes selecting a target department node, selecting a target employee node, and performing a multi-hop decision along a graph edge.
And the forwarding unit 28 is configured to forward the target post to the target node, obtain a processing result of the target node for the target post, and forward the processing result to an information receiving area or a target communication address associated with the target post.
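A minimal sketch, under simplifying assumptions, of how the construction unit 24 and the analysis unit 26 could interact: a small collaboration graph is assembled with the listed node and edge types, crudely pooled into a state vector together with the post features, and scored by a first deep Q network whose actions are the candidate nodes plus a multi-hop move. Dimensions, attribute values and the pooling scheme are illustrative only and are not taken from the application.

```python
# Hedged sketch of units 24 and 26; all attribute values and sizes are assumptions.
import torch
import torch.nn as nn

# Construction unit 24: department / employee nodes with simplified attributes.
nodes = {
    "dept_risk": {"kind": "department", "pending_posts": 12, "avg_minutes": 45.0},
    "dept_it":   {"kind": "department", "pending_posts": 7,  "avg_minutes": 30.0},
    "emp_alice": {"kind": "employee", "busy": 0, "skills": [0.9, 0.1], "satisfaction": 4.6},
    "emp_bob":   {"kind": "employee", "busy": 1, "skills": [0.2, 0.8], "satisfaction": 4.1},
}
edges = [  # (u, v, edge type)
    ("dept_risk", "dept_it", "dept_collab"),
    ("dept_risk", "emp_alice", "membership"),
    ("dept_it", "emp_bob", "membership"),
    ("emp_alice", "emp_bob", "emp_collab"),
]

def pool_graph(nodes, edges) -> torch.Tensor:
    """Crude graph encoding: means of a few numeric attributes plus the edge count."""
    depts = [n for n in nodes.values() if n["kind"] == "department"]
    emps = [n for n in nodes.values() if n["kind"] == "employee"]
    return torch.tensor([
        sum(d["pending_posts"] for d in depts) / len(depts),
        sum(d["avg_minutes"] for d in depts) / len(depts),
        sum(e["satisfaction"] for e in emps) / len(emps),
        float(len(edges)),
    ])

# Analysis unit 26: the first deep Q network maps the state to one Q value per
# candidate action (a department node, an employee node, or a multi-hop move).
post_features = torch.randn(16)                    # multidimensional feature vector
state = torch.cat([post_features, pool_graph(nodes, edges)])
actions = list(nodes) + ["hop_along_edge"]
dqn = nn.Sequential(nn.Linear(state.numel(), 64), nn.ReLU(), nn.Linear(64, len(actions)))
with torch.no_grad():
    target = actions[dqn(state).argmax().item()]   # greedy choice of target node
print(target)
```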
Optionally, the reward function of the first deep Q network is determined by determining a first sub-reward function according to a matching score, wherein the matching score is used for representing the semantic matching degree between the post content and the node professional field, the matching score is obtained by calculating the similarity between a post keyword and a node preset label through a natural language processing technology, determining a second sub-reward function according to a time delay, wherein the time delay is used for representing the time difference from the creation of the post to the successful allocation of the time stamp, determining a third sub-reward function according to a load index, wherein the load index is used for representing the ratio of the current task queue length of the node to the maximum processing capacity of the node, determining a fourth sub-reward function according to a satisfaction score, wherein the satisfaction score is used for representing the feedback rating provided by a user after the post processing is completed, and determining the reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function and the fourth sub-reward function.
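As an illustrative sketch only: the exact functional forms of the four sub-rewards are not specified above, so the snippet below assumes simple monotone shapes (linear match, exponential delay decay, linear load relief, normalized rating). The scale and maximum-rating parameters are assumptions.

```python
# Hedged sketch of the four sub-rewards of the first deep Q network.
import math

def r_match(similarity: float) -> float:
    """Semantic match between post keywords and the node's preset labels (0..1)."""
    return similarity

def r_delay(delay_seconds: float, scale: float = 600.0) -> float:
    """Decays toward 0 as the creation-to-allocation delay grows."""
    return math.exp(-delay_seconds / scale)

def r_load(queue_length: int, max_capacity: int) -> float:
    """Higher reward when the node's queue-to-capacity ratio is lower."""
    return 1.0 - min(queue_length / max_capacity, 1.0)

def r_satisfaction(rating: float, max_rating: float = 5.0) -> float:
    """User feedback rating normalized to 0..1."""
    return rating / max_rating

print(r_match(0.85), r_delay(120), r_load(3, 10), r_satisfaction(4.5))
```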
Optionally, determining a reward function according to the first sub-reward function, the second sub-reward function, the third sub-reward function and the fourth sub-reward function comprises the steps of obtaining initial weight coefficients of each sub-reward function, continuously detecting post distribution success rate indexes, average processing delay indexes and user satisfaction average indexes, increasing first initial weight coefficients corresponding to the first sub-reward function and fourth initial weight coefficients corresponding to the fourth sub-reward function when detecting post distribution success rate indexes to be reduced, increasing second initial weight coefficients corresponding to the second sub-reward function and third initial weight coefficients corresponding to the third sub-reward function when detecting average processing delay indexes to be increased, increasing fourth initial weight coefficients when detecting user satisfaction average indexes to be reduced, executing normalization operation after each time of weight coefficient updating, enabling the sum of the updated weight coefficients to be 1, obtaining first weight coefficients, second weight coefficients, third weight coefficients and fourth weight coefficients, and determining a reward function according to the first sub-reward function, the first weight coefficients, the second weight coefficients, the third sub-reward function and the fourth sub-reward function.
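The adaptive weighting rule can be sketched as follows; the step size and the initial equal weights are assumptions, while the nudge-then-normalize behaviour mirrors the description above.

```python
# Hedged sketch of the adaptive weight update; step size and initial weights are assumptions.
def update_weights(w, success_rate_dropped, delay_increased, satisfaction_dropped, step=0.05):
    w = list(w)                      # [w1, w2, w3, w4] for the four sub-rewards
    if success_rate_dropped:
        w[0] += step; w[3] += step   # favour matching and satisfaction terms
    if delay_increased:
        w[1] += step; w[2] += step   # favour delay and load terms
    if satisfaction_dropped:
        w[3] += step
    total = sum(w)
    return [x / total for x in w]    # normalization keeps the weights summing to 1

def total_reward(w, r1, r2, r3, r4):
    return w[0] * r1 + w[1] * r2 + w[2] * r3 + w[3] * r4

w = update_weights([0.25, 0.25, 0.25, 0.25],
                   success_rate_dropped=True, delay_increased=False,
                   satisfaction_dropped=False)
print(w, total_reward(w, 0.9, 0.7, 0.6, 0.8))
```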
Optionally, determining the multidimensional feature vector of the target post comprises the steps of: extracting a semantic embedding vector of the target post by using a pre-trained language model; calculating, through a topic attention layer, the probability distributions corresponding to different topic classifications, and normalizing these distributions to obtain a first feature component; calculating a vocabulary intensity score of the target post based on a financial emotion dictionary, wherein the financial emotion dictionary comprises a plurality of preset words representing positive, neutral and negative emotions, and the vocabulary intensity score quantifies the strength of the emotional tendency expressed in the target post; determining the emotion probabilities of the target post by using a deep learning model, wherein the emotion probabilities represent the probabilities that the target post belongs to positive, neutral and negative emotions; dynamically weighting and fusing the vocabulary intensity score and the emotion probabilities according to the financial term density to obtain a second feature component, wherein the financial term density is the ratio of the number of financial terms appearing in the target post to the total number of words in the target post; removing general stop words and high-frequency function words from the target post, and retaining core vocabulary whose term frequency-inverse document frequency exceeds a preset threshold; determining a third feature component according to the co-occurrence relations of the core vocabulary in a co-occurrence graph; and splicing the first feature component, the second feature component and the third feature component, and performing dimension reduction on the splicing result through a linear transformation layer with orthogonal constraints to obtain the multidimensional feature vector.
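A hedged NumPy sketch of this feature construction is given below; the pre-trained language model, topic attention layer and deep sentiment model are replaced with placeholder vectors, and the orthogonal-constraint projection is approximated by a QR-orthogonalized random matrix, so the snippet only illustrates the fusion and dimension-reduction flow, not the claimed models.

```python
# Hedged sketch of the three feature components and their fusion; all models are
# replaced by placeholder vectors and all sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

topic_logits = rng.normal(size=8)                                     # topic attention output
first_component = np.exp(topic_logits) / np.exp(topic_logits).sum()   # normalized topic distribution

lexicon_intensity = 0.6                    # score from the financial emotion dictionary
model_probs = np.array([0.7, 0.2, 0.1])    # positive / neutral / negative from a deep model
term_density = 0.3                         # financial terms / total words
second_component = np.concatenate(
    [term_density * np.array([lexicon_intensity]), (1 - term_density) * model_probs])

third_component = rng.normal(size=16)      # encoding of high-TF-IDF core vocabulary co-occurrence

spliced = np.concatenate([first_component, second_component, third_component])
q, _ = np.linalg.qr(rng.normal(size=(spliced.size, 12)))   # orthonormal columns
multidim_feature = spliced @ q                             # dimension-reduced feature vector
print(multidim_feature.shape)                              # (12,)
```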
Optionally, after the target post to be processed in the communication platform is obtained, the method further comprises: encoding the target post by using a pre-trained language model to obtain a semantic feature vector; determining a click rate standardization value and a poster job level weight of the target post; splicing the semantic feature vector, the click rate standardization value and the poster job level weight, and determining the splicing result as a state vector; and analyzing the state vector by using a second deep Q network to determine whether the target post is an essence post, wherein the action space of the second deep Q network comprises marking the post as an essence post and not marking it, and the reward function of the second deep Q network comprises a compliance score, an expert review pass rate and a wrong-mark penalty. The compliance score is the quantified result of risk scanning of the post content against a preset financial rule base and equals the difference between 1 and a target ratio, the target ratio being the ratio of the number of risk keyword occurrences to the total number of keywords. The expert review pass rate is the ratio of the number of marked posts confirmed as essence posts by experts to the total number of marked posts. The wrong-mark penalty comprises a first cost and a second cost, wherein the first cost is the post exposure amount multiplied by a preset unit attention cost, and the second cost is the post knowledge value coefficient multiplied by a preset attenuation factor.
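The state-vector splicing and the binary essence decision can be sketched as follows; the embedding dimension, the example click rate and job-level weight, and the untrained network are illustrative assumptions.

```python
# Hedged sketch of the second deep Q network's state vector and binary decision.
import torch
import torch.nn as nn

semantic = torch.randn(64)                  # pre-trained-LM encoding of the post
click_rate_norm = torch.tensor([0.42])      # click rate standardization value
job_level_weight = torch.tensor([0.8])      # poster job level weight
state = torch.cat([semantic, click_rate_norm, job_level_weight])   # spliced state vector

essence_dqn = nn.Sequential(nn.Linear(66, 64), nn.ReLU(), nn.Linear(64, 2))
with torch.no_grad():
    q_values = essence_dqn(state)           # [Q(mark as essence), Q(do not mark)]
is_essence = bool(q_values.argmax().item() == 0)
print(is_essence)
```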
Optionally, if the target post is determined to be an essence post by using the second deep Q network, after the processing result of the target node for the target post is obtained, the processing result is sent to the target object for rechecking, the rechecking result is obtained, and the rechecking result is forwarded to an information receiving area or a target communication address associated with the target post.
Optionally, the processing result at least comprises target content, a processing conclusion type, a user privacy level identification and a result sensitivity label. Forwarding the processing result to the information receiving area or the target communication address associated with the target post comprises: forwarding the target content to the information receiving area when the processing conclusion type is a first type representing a public reply request; forwarding the target content to the target communication address when the processing conclusion type is a second type representing a personal transaction; forwarding the target content to the information receiving area when the user privacy level identification is a first identification representing public rights; forwarding the target content to the target communication address when the user privacy level identification is a second identification representing private rights; and forwarding the target content to the target communication address when the result sensitivity label is a first label representing private data.
It should be noted that each module in fig. 2 may be a program module (for example, a set of program instructions for implementing a specific function) or a hardware module; for the latter, each module may be implemented, for example but without limitation, in the form of one processor, or the functions of several modules may be implemented by one processor.
It should be noted that, the preferred implementation manner of the embodiment shown in fig. 2 may refer to the related description of the embodiment shown in fig. 1, which is not repeated herein.
Fig. 3 shows a block diagram of a hardware architecture of a computer terminal for implementing the intelligent collaborative management method based on artificial intelligence driving. As shown in fig. 3, the computer terminal 30 may include one or more processors 302 (shown in the figure as 302a, 302b, ..., 302n), which may include but are not limited to processing devices such as a microprocessor (MCU) or a programmable logic device (FPGA), a memory 304 for storing data, and a transmission module 306 for communication functions. In addition, the computer terminal 30 may further include a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 3 is merely illustrative and does not limit the configuration of the electronic device described above. For example, the computer terminal 30 may also include more or fewer components than shown in fig. 3, or have a different configuration than shown in fig. 3.
It should be noted that the one or more processors 302 and/or other data processing circuits described above may be referred to generally herein as a "data processing circuit". The data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuit may be a single stand-alone processing module, or may be incorporated, in whole or in part, into any of the other elements in the computer terminal 30. In the embodiments of the application, the data processing circuit may act as a kind of processor control (for example, selection of the path of a variable resistance termination connected to the interface).
The memory 304 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the intelligent collaborative management method based on artificial intelligence driving in the embodiment of the present application, and the processor 302 executes various functional applications and data processing by running the software programs and modules stored in the memory 304, that is, implements the intelligent collaborative management method based on artificial intelligence driving. Memory 304 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 304 may further include memory remotely located relative to the processor 302, which may be connected to the computer terminal 30 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 306 is used to receive or transmit data via a network. The specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 30. In one example, the transmission module 306 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission module 306 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 30.
It should be noted here that, in some alternative embodiments, the computer terminal shown in fig. 3 may include hardware elements (including circuits), software elements (including computer code stored on a computer readable medium), or a combination of both hardware and software elements. It should be noted that fig. 3 is only one example of a specific example, and is intended to illustrate the types of components that may be present in the computer terminal described above.
It should be noted that the computer terminal shown in fig. 3 is configured to execute the intelligent collaborative management method based on artificial intelligence driving shown in fig. 1, so the above explanation of the method is also applicable to the electronic device and is not repeated herein.
An embodiment of the application also provides a nonvolatile storage medium, which comprises a stored program, wherein the program, when running, controls the device where the storage medium is located to execute the intelligent collaborative management method based on artificial intelligence driving.
The method comprises the steps of: acquiring a target post to be processed in an exchange platform, and determining a multidimensional feature vector of the target post, wherein the multidimensional feature vector comprises a topic classification feature, an emotion feature and a keyword coding feature; constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and employee nodes, edges in the graph neural network comprise inter-department collaboration edges, department-employee membership edges and inter-employee collaboration edges, the department node attributes at least comprise the post count to be processed, the post average processing duration and a professional field code, and the employee node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; analyzing the multidimensional feature vector and the graph neural network by using a first deep Q network to obtain a target node in the graph neural network for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post, the topology structure of the graph neural network and the node attributes, and the actions in the first deep Q network comprise selecting a target department node, selecting a target employee node and performing a multi-hop decision along a graph edge; and forwarding the target post to the target node, obtaining a processing result of the target node for the target post, and forwarding the processing result to an information receiving area or a target communication address associated with the target post.
An embodiment of the application also provides an electronic device, which comprises a memory and a processor, wherein the processor is configured to run a program stored in the memory, and the intelligent collaborative management method based on artificial intelligence driving is executed when the program runs.
The processor is configured to run a program that performs the following functions: acquiring a target post to be processed in an exchange platform, and determining a multidimensional feature vector of the target post, wherein the multidimensional feature vector comprises a topic classification feature, an emotion feature and a keyword coding feature; constructing a graph neural network, wherein nodes in the graph neural network comprise department nodes and employee nodes, edges in the graph neural network comprise inter-department collaboration edges, department-employee membership edges and inter-employee collaboration edges, the department node attributes at least comprise the post count to be processed, the post average processing duration and a professional field code, and the employee node attributes at least comprise a current state, a skill feature vector and a historical satisfaction score; analyzing the multidimensional feature vector and the graph neural network by using a first deep Q network to obtain a target node in the graph neural network for processing the target post, wherein the state in the first deep Q network comprises the multidimensional feature vector of the current post, the topology structure of the graph neural network and the node attributes, and the actions in the first deep Q network comprise selecting a target department node, selecting a target employee node and performing a multi-hop decision along a graph edge; and forwarding the target post to the target node, obtaining a processing result of the target node for the target post, and forwarding the processing result to an information receiving area or a target communication address associated with the target post.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the above embodiments of the present application, the collected information is information and data authorized by the user or sufficiently authorized by all parties; the collection, storage, use, processing, transmission, provision, disclosure and application of the related data comply with the relevant laws, regulations and standards; necessary protection measures are taken without violating the public interest; and corresponding operation entries are provided for the user to choose to authorize or refuse.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied, in essence or in the part contributing to the related art, or in whole or in part, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a USB flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium that can store program code.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and improvements without departing from the principles of the present application, and such modifications and improvements shall also fall within the scope of protection of the present application.