Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the invention provides a knowledge question-and-answer intention recognition method which, as shown in Fig. 1, comprises the following steps:
101. Acquire sentence information of the knowledge question and answer to be identified.
In the embodiment of the invention, the sentence information of the knowledge question and answer is the sentence content of the question to be answered, which the user inputs from a front-end application. It may be, for example, a Chinese sentence or an English sentence, and the embodiment of the invention is not particularly limited in this respect. The sentence information may include text contents such as subject words, predicate words, and object words obtained through natural language processing, so that the intention of the sentence information can be identified based on the specific text contents.
It should be noted that, because the embodiment of the invention extracts the intention from sentence information, the sentence information to be identified is preferably input by a user in real time, for example a sentence entered through an interactive platform for which an answer is required. It may also be sentence information stored in a knowledge question-and-answer library, that is, sentence information that a user has submitted and that is awaiting an answer.
102. Perform a first-level intention query on the sentence information according to intention entity features, and determine at least one intention logic content in a second-level intention matched with the queried first-level intention.
The intention entity features are the features used to assign sentence information to a first-level intention. They are word contents that carry distinct meanings in a sentence, and they are determined by comparing preset intention entities with the sentence information; the embodiment of the invention is not particularly limited in this respect. In the embodiment of the invention, each first-level intention corresponds to an intention entity feature. When the first-level intention query is performed, the intention entity features are compared one by one with each word in the sentence information, and if a word in the sentence information matches an intention entity feature, the corresponding first-level intention is found. For example, with { application intention_entity feature a } as the basis for comparison, the application intention entity feature a corresponding to the application intention, a first-level intention, is compared one by one with each word in the sentence information; if the comparison matches, the application intention is determined to be the first-level intention found for the sentence information.
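The one-by-one comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the mapping `FIRST_LEVEL_FEATURES`, the intention names, and the feature words are hypothetical, and the sentence is assumed to be already split into words.

```python
# Hypothetical mapping from each first-level intention to its intention entity features.
FIRST_LEVEL_FEATURES = {
    "insurance_application": {"apply", "application", "underwriting", "insure"},
    "claim": {"claim", "reimburse"},
}


def query_first_level(words):
    """Return every first-level intention whose entity features match a word."""
    matched = []
    for intention, features in FIRST_LEVEL_FEATURES.items():
        # Compare each word of the sentence with the entity features one by one.
        if any(word.lower() in features for word in words):
            matched.append(intention)
    return matched


words = ["what", "is", "the", "lung", "cancer", "application"]
print(query_first_level(words))  # ['insurance_application']
```

Returning a list rather than a single value leaves room for sentences that touch more than one first-level intention; how such cases are resolved is not specified in the description.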
In addition, the matching relationship between first-level intentions and second-level intentions is established in advance by hierarchical division based on the intention categories, and each first-level intention in the embodiment of the invention corresponds to at least one second-level intention. Therefore, in order to accurately find the second-level intention that the sentence information matches, at least one intention logic content is determined for each second-level intention contained in the first-level intention that was found. The intention logic content characterizes the specific word content that, combined according to a logical relationship, can match different second-level intentions. For example, after the first-level intention is found to be the application intention, different disease names, insurance names, age information, practice names, and the like can serve as intention logic contents, so that whether the corresponding second-level intention is matched can be judged by combining them according to the intention logic relationship, and the second-level intention is thereby determined.
103. If the sentence information is judged, according to the intention logic relationship in the second-level intention, to match the intention logic content, determine the second-level intention as the intention information of the sentence information.
In the embodiment of the invention, in order to accurately find the second-level intention that serves as the detailed intention, after the intention logic content is determined, whether the words in the sentence information match the second-level intention is judged by combining the intention logic contents according to the intention logic relationship. The intention logic relationship characterizes the and/or logical relationship between the different intention logic contents corresponding to the second-level intention. In other words, the judgment is whether the words in the sentence information match the intention logic contents as combined by the intention logic relationship. For example, suppose the intention logic contents include disease name 1 and disease name 2 and the logical relationship between them is "and". If the sentence to be judged is "intention to apply for disease 1 and disease 2", the sentence matches disease name 1 and disease name 2 combined according to the intention logic relationship, and the application intention for disease 1 and disease 2 is determined as the intention information of the sentence information.
For example, if the sentence information "what is the lung cancer application" is judged to match { application intention_application product } and { application intention_application product_not }, that is, the intention logic content and intention logic relationship of the insurance application in the corresponding second-level intention, then the insurance application intention is the intention of the sentence information.
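As a rough sketch of step 103, the and/or combination can be evaluated as a small rule tree over named templates. Everything concrete here is an assumption made for illustration: the template names, the regular expressions, the rule encoding as nested `("and" | "or", [operands])` tuples, and the two second-level intentions.

```python
import re

# Hypothetical templates: intention logic content name -> regular expression.
TEMPLATES = {
    "application_entity": r"\b(application|apply|underwriting)\b",
    "disease_name": r"\b(lung cancer|diabetes)\b",
    "age_info": r"\b\d+\s*years? old\b",
}

# Hypothetical second-level intentions under one first-level intention.
# Each rule is a nested (operator, operands) tree; leaves are template names.
SECOND_LEVEL_RULES = {
    "disease_application": ("and", ["application_entity", "disease_name"]),
    "age_application": ("and", ["application_entity", "age_info"]),
}


def evaluate(rule, sentence):
    """Evaluate an and/or rule tree against a sentence."""
    if isinstance(rule, str):
        # Leaf: does the sentence match this intention logic content?
        return re.search(TEMPLATES[rule], sentence, re.IGNORECASE) is not None
    operator, operands = rule
    results = (evaluate(operand, sentence) for operand in operands)
    return all(results) if operator == "and" else any(results)


def match_second_level(sentence):
    """Return the first second-level intention whose rule matches, else None."""
    for intention, rule in SECOND_LEVEL_RULES.items():
        if evaluate(rule, sentence):
            return intention
    return None


print(match_second_level("What is the lung cancer application"))  # disease_application
```

Returning `None` when no rule matches leaves the caller free to fall back to another strategy.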
In an embodiment of the invention, in order to search the intentions level by level and thereby accurately match the intention of the knowledge question and answer, before the sentence information of the knowledge question and answer to be identified is acquired, the method further includes: acquiring sentence information labeled with intention categories; hierarchically dividing the intention categories according to the classification levels matched with the intention categories, wherein the classification levels include a first-level classification and a second-level classification; and establishing, according to the intention matching expressions of the sentence information, an intention logic relationship between at least two intention logic contents corresponding to the second-level intention.
Before the sentence information of a knowledge question and answer is identified, sentence information whose intention is already known is first labeled with intention categories. The labeling may be done manually or automatically by a machine learning model, and the embodiment of the invention is not particularly limited in this respect. In the embodiment of the invention, the intention categories are divided into two levels, so the classification levels matched with the intention categories are a first-level classification and a second-level classification. Specifically, the labeled intention categories may include first-range intentions and second-range intentions predetermined according to the business content, where each first-range intention contains second-range intentions. For example, the first-range intentions include, but are not limited to, warranty, claim, product information, and chat, and the second-range intentions under the first-range intention of warranty include, but are not limited to, disease application, age application, occupation application, comprehensive application, and other applications. The first-range and second-range intentions are assigned to the corresponding classification levels, namely the first-level classification and the second-level classification, and the first-level intentions and second-level intentions are obtained after the division. In addition, the sentence information labeled with an intention category is matched with at least one intention matching expression, so that an intention logic relationship between the intention logic contents matched by each second-level intention can be established according to the intention matching expression.
For each second-level intention, an intention logic relationship can be established among a plurality of intention logic contents, which amounts to splitting the logical content of the intention matching expression.
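The two-level division of labeled sentences can be pictured as a nested mapping from first-level intention to second-level intention to sentences. The sketch below assumes each labeled sample is a `(sentence, first_level, second_level)` tuple; the sample sentences and the "claim process" label are hypothetical.

```python
from collections import defaultdict

# Hypothetical labeled samples: (sentence, first-level label, second-level label).
LABELED = [
    ("Can I apply with lung cancer?", "application", "disease application"),
    ("Can a 70-year-old apply?", "application", "age application"),
    ("How do I file a claim?", "claim", "claim process"),
]


def build_hierarchy(labeled):
    """Group labeled sentences under first-level, then second-level intentions."""
    hierarchy = defaultdict(lambda: defaultdict(list))
    for sentence, first_level, second_level in labeled:
        hierarchy[first_level][second_level].append(sentence)
    # Convert to plain dicts so missing keys raise instead of being created.
    return {first: dict(seconds) for first, seconds in hierarchy.items()}


hierarchy = build_hierarchy(LABELED)
print(sorted(hierarchy["application"]))  # ['age application', 'disease application']
```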
In an embodiment of the invention, for further explanation and limitation, before establishing, according to the intention matching expressions of the sentence information, the intention logic relationship between at least two intention logic contents corresponding to the second-level intention, the method further includes: acquiring at least one intention matching expression matched with the sentence information labeled with intention categories; and extracting at least two intention logic contents from the intention matching expression according to sentence entity features, and parsing the number of "and" relationships, the number of "or" relationships, and the logical relationship positions of the and/or logical relationships in the intention matching expression.
Specifically, the purpose of the embodiment of the invention is to split sentence information labeled with intention categories and build from it a hierarchically structured knowledge question-and-answer intention recognition system, so that intentions can be found more accurately during recognition and the recognized intentions can be updated more efficiently. The sentence information labeled with an intention category is preset with at least one intention matching expression, which is a judging expression for deciding whether the sentence information conforms to the intention content. For example, the disease application intention includes intention matching expression 1: disease name; intention matching expression 2: insurance name and insurance class; and intention matching expression 3: application name and underwriting name. The and/or logical relationship corresponding to each intention logic content is parsed from the intention matching expressions in order to establish the intention logic relationship. When at least one intention matching expression is obtained, at least two intention logic contents are extracted based on the sentence entity features, which are the feature contents common to the intention matching expressions. For example, if the intention matching expressions contain insurance, insurance underwriting, and insurance buying, the sentence entity feature is insurance, and the content extracted according to this feature is the insurance intention, expressed as the intention logic content { insurance intention_entity }.
In addition, the intention of a sentence is identified by combining a plurality of intention matching expressions, and the combination is governed by "and" and "or" logical relationships. The and/or logical relationship corresponding to each intention logic content can therefore be parsed from the intention matching expressions, including the number of "and" relationships, the number of "or" relationships, and the logical relationship positions of the and/or logical relationships. The intention logic relationship between at least two intention logic contents is then established based on these counts and positions.
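To make the parsing step concrete, the sketch below extracts the intention logic contents, the number of "and" relationships, the number of "or" relationships, and the operator positions from an expression string. The textual syntax is an assumption: contents are written in braces, `&` denotes "and" (as in the example expression later in this description), and `|` is assumed to denote "or". A position is recorded as the index of the content on the operator's left.

```python
import re

# A token is either a {content} placeholder or an operator (& or |).
TOKEN = re.compile(r"\{([^{}]+)\}|([&|])")


def parse_expression(expression):
    """Extract contents, and/or counts, and operator positions from an expression."""
    contents, positions = [], []
    for match in TOKEN.finditer(expression):
        name, operator = match.groups()
        if name is not None:
            contents.append(name.strip())
        else:
            # Position = index of the content immediately to the operator's left.
            kind = "and" if operator == "&" else "or"
            positions.append((len(contents) - 1, kind))
    and_count = sum(1 for _, kind in positions if kind == "and")
    return {
        "contents": contents,
        "and_count": and_count,
        "or_count": len(positions) - and_count,
        "positions": positions,
    }


parsed = parse_expression(
    "{application intention_entity} & {disease name_wildcard} | {insurance name_wildcard}"
)
print(parsed["and_count"], parsed["or_count"], parsed["positions"])
# 1 1 [(0, 'and'), (1, 'or')]
```

The sketch does not handle parentheses or operator precedence, which the description does not specify.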
It should be noted that, after the hierarchical division, the intention logic contents in the matching expressions need to be extracted. An intention logic content can also be regarded as templated logic for matching an intention, so that it can be reused as a template, with the specific content of the intention logic content serving as the content of the template. That is, each extracted intention logic content may be a template corresponding to a second-level intention: one template corresponds to one intention logic content, and the relationship between templates is the intention logic relationship.
TABLE 1
As shown in Table 1, each second-level intention may match a plurality of intention logic contents, with one template for each intention logic content. The specific content of an intention logic content, that is, the template content of each template, can therefore be updated directly for each second-level intention. The specific content of an intention logic content is what is checked, in sequence, when judging that intention logic content for a second-level intention. For example, the specific content of { disease name_wildcard } is # # disease name, that of { application intention_entity } is (application |underwriting|buying insurance|..+ -.), that of { application intention_entity_non } is (why|how.), and that of { risk type_entity } is (why insurance product|risk|belongs to|. Templating decouples rule logic from specific content, which improves reusability across rules and reduces the cost of maintaining and updating rules.
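The decoupling of rule logic from template content can be illustrated with a small sketch in which two rules share one template, so that editing the template content once changes the behaviour of both rules. The regular expressions below are illustrative stand-ins, not the template contents of Table 1, and the rule named "product application" is hypothetical; for brevity every rule here is a pure "and" of its templates.

```python
import re

# Template content: intention logic content name -> pattern (illustrative values).
TEMPLATE_CONTENT = {
    "application intention_entity": r"(application|underwriting|buying insurance)",
    "disease name_wildcard": r"(lung cancer|hypertension)",
    "risk type_entity": r"(insurance product|risk)",
}

# Rule logic: each second-level intention lists the templates it requires.
RULES = {
    "disease application": ["application intention_entity", "disease name_wildcard"],
    "product application": ["application intention_entity", "risk type_entity"],
}


def rule_matches(intention, sentence):
    """A rule matches when every template it references matches the sentence."""
    return all(
        re.search(TEMPLATE_CONTENT[name], sentence) is not None
        for name in RULES[intention]
    )


sentence = "can I insure against lung cancer"
print(rule_matches("disease application", sentence))  # False

# Updating one shared template changes both rules without touching RULES.
TEMPLATE_CONTENT["application intention_entity"] = (
    r"(application|underwriting|buying insurance|insure)"
)
print(rule_matches("disease application", sentence))  # True
```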
Further, by way of supplementary description, establishing the intention logic relationship between at least two intention logic contents corresponding to the second-level intention according to the intention matching expressions of the sentence information includes: establishing the intention logic relationship between the at least two intention logic contents corresponding to the second-level intention according to the number of "and" relationships, the number of "or" relationships, and the logical relationship positions.
Specifically, the intention logic relationship between at least two intention logic contents corresponding to the second-level intention is established by combining the extracted intention logic contents with the number of "and" relationships, the number of "or" relationships, and the logical relationship positions determined from the intention matching expressions. Each second-level intention may therefore have a plurality of intention logic relationships, each comprising different intention logic contents together with the matching number of "and" relationships, number of "or" relationships, and logical relationship positions.
For example, if the number of "and" relationships is 1, the number of "or" relationships is 0, and the logical relationship position spans 3 intention logic contents, namely { intention_entity }, { disease name_wildcard }, and { insurance name_wildcard }, the established intention logic relationship is { intention_entity } & { disease name_wildcard } & { insurance name_wildcard }, and the second-level intention is disease application.
In one embodiment of the invention, for further definition and explanation, extracting at least two intention logic contents from the intention matching expression according to the sentence entity features includes: parsing the sentence entity features in the sentence information based on a dependency analysis algorithm, and sequentially screening, from the at least one intention matching expression and based on the sentence entity features, at least two intention logic contents that match the sentence meaning of the second-level intention.
In the embodiment of the invention, the intention logic contents are extracted from the intention matching expressions based on the sentence entity features, so that the recognition of intentions at each level has the effect of a template and the judging rules for recognizing different intentions can be rewritten. Specifically, the sentence entity features in the sentence information labeled with intention categories are parsed based on a dependency analysis algorithm, and the intention logic contents matching the sentence meaning of the second-level intention are screened in turn from each intention matching expression based on the parsed sentence entity features. As shown in Fig. 2, the specific steps of parsing the sentence entity features by dependency analysis are as follows. First, the core relation word in the sentence information is located and taken as the central node. Then, all nodes within one unit distance or two unit distances of the central node are selected (in Fig. 2, an attributive-head relation or an adverbial-head relation counts as one unit distance), and verb phrases and entity words/nouns are mined according to the semantic relations and the parts of speech of the nodes. Finally, the verb phrases and entity words/nouns unique to each category are counted by a weighted TF-IDF algorithm or a BM25 similarity algorithm and taken as the intention logic contents of the second-level intention classification.
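The final counting step can be sketched with a plain TF-IDF score in which each intention category is treated as one document. This is one simple variant chosen for illustration (term frequency times an unsmoothed log inverse document frequency); the description does not specify the weighting, and BM25 could be substituted. The mined terms below are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_category(terms_by_category):
    """Score each mined term per category; a category acts as one document."""
    n_categories = len(terms_by_category)
    document_frequency = Counter()
    for terms in terms_by_category.values():
        document_frequency.update(set(terms))
    scores = {}
    for category, terms in terms_by_category.items():
        counts = Counter(terms)
        total = len(terms)
        scores[category] = {
            term: (count / total) * math.log(n_categories / document_frequency[term])
            for term, count in counts.items()
        }
    return scores


def distinctive_terms(terms_by_category, top_k=2):
    """Keep the top-scoring terms per category, dropping terms shared by all."""
    result = {}
    for category, term_scores in tfidf_by_category(terms_by_category).items():
        ranked = sorted(term_scores.items(), key=lambda item: (-item[1], item[0]))
        result[category] = [term for term, score in ranked[:top_k] if score > 0]
    return result


mined = {
    "disease application": ["apply", "lung cancer", "lung cancer"],
    "age application": ["apply", "age", "years old"],
}
print(distinctive_terms(mined))
# {'disease application': ['lung cancer'], 'age application': ['age', 'years old']}
```

A term that appears in every category, such as "apply" here, scores zero and is dropped, which matches the goal of keeping only the words unique to each category.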
Specifically, as shown in Fig. 2, the nodes are the words, together with their parts of speech, obtained from the sentence content of all labeled intention categories; a unit distance is one relation between nodes; and the core relation word is circled in red. In the result of the dependency analysis, a "core relation" is extracted from each sentence; it is a verb and represents the core intention of the sentence. With the core relation word as the center, different verb phrases and entity words/nouns can be mined. The verb phrases represent the actions, states, and the like mentioned in the sentence, and their semantic relations with the core relation word are mostly coordinate relations, adverbial-head structures, and the like. The entity words/nouns represent the entities mentioned in the sentence, and their semantic relations with the core relation word are mainly subject-predicate relations, verb-object relations, attributive-head relations, and the like. The dependency analysis is thereby completed and the intention logic contents are obtained.
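The mining around the core relation word can be sketched as a breadth-first search over the dependency tree, limited to two unit distances. No real parser is called: the parse below is written by hand, and the relation labels (HED for the core relation, SBV subject-predicate, VOB verb-object, ATT attributive-head, ADV adverbial-head, COO coordinate) and part-of-speech tags are an assumed label set used only for illustration.

```python
from collections import deque

# Hand-written parse of "can lung cancer patients buy insurance".
# Each node: (word, part of speech, head index, relation); head -1 marks the core.
PARSE = [
    ("lung cancer", "n", 1, "ATT"),  # attributive modifier of "patients"
    ("patients", "n", 3, "SBV"),     # subject of "buy"
    ("can", "v", 3, "ADV"),          # adverbial modifier of "buy"
    ("buy", "v", -1, "HED"),         # core relation word
    ("insurance", "n", 3, "VOB"),    # object of "buy"
]

VERB_RELATIONS = {"COO", "ADV"}
NOUN_RELATIONS = {"SBV", "VOB", "ATT"}


def mine(parse, max_distance=2):
    """Collect verbs and nouns within max_distance edges of the core relation word."""
    root = next(i for i, node in enumerate(parse) if node[2] == -1)
    # Undirected adjacency list, so distance counts dependency edges.
    neighbours = {i: [] for i in range(len(parse))}
    for i, (_, _, head, _) in enumerate(parse):
        if head >= 0:
            neighbours[i].append(head)
            neighbours[head].append(i)
    distance = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        if distance[current] == max_distance:
            continue  # do not expand beyond the distance limit
        for nxt in neighbours[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    verbs, nouns = [], []
    for i in sorted(distance):
        word, pos, _, relation = parse[i]
        if i == root:
            continue
        if pos == "v" and relation in VERB_RELATIONS:
            verbs.append(word)
        elif pos == "n" and relation in NOUN_RELATIONS:
            nouns.append(word)
    return parse[root][0], verbs, nouns


print(mine(PARSE))
# ('buy', ['can'], ['lung cancer', 'patients', 'insurance'])
```

With `max_distance=1` the word "lung cancer" is excluded, because it is two edges away from the core relation word.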
In an embodiment of the invention, in order to support intention recognition for different sentences and reduce the maintenance difficulty, thereby improving the accuracy of intention recognition, the method further includes: if an update request for the intention logic relationship is detected, acquiring the intention logic content and/or the and/or logical relationship to be updated; and updating the intention logic content, the number of "and" relationships, the number of "or" relationships, and the logical relationship positions according to the intention logic content and the and/or logical relationship to be updated.
Specifically, the user can choose which intention logic contents and intention logic relationships to maintain and update, so that they can serve as the judging basis for recognizing the intentions of other sentences. Therefore, when an update request for the intention logic relationship is detected, the intention logic content and/or the and/or logical relationship to be updated is acquired, and the original intention logic content, number of "and" relationships, number of "or" relationships, and logical relationship positions are updated accordingly. A new logical relationship, new intention logic contents, and the like matching the second-level intention are thereby generated, which makes the maintenance of intention recognition more efficient.
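One way to picture the update is a rule object whose "and" count, "or" count, and operator positions are derived from its operator list, so that replacing the contents or operators updates all of them together. The class `IntentRule` and the function `apply_update` are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class IntentRule:
    contents: list   # intention logic contents, in order
    operators: list  # operators[i] joins contents[i] and contents[i + 1]

    @property
    def and_count(self):
        return self.operators.count("and")

    @property
    def or_count(self):
        return self.operators.count("or")

    @property
    def positions(self):
        # (index of the left-hand content, operator kind) for each operator.
        return list(enumerate(self.operators))


def apply_update(rule, new_contents=None, new_operators=None):
    """Return an updated rule; omitted parts are carried over unchanged."""
    contents = list(rule.contents if new_contents is None else new_contents)
    operators = list(rule.operators if new_operators is None else new_operators)
    if len(operators) != len(contents) - 1:
        raise ValueError("need exactly one operator between adjacent contents")
    return IntentRule(contents, operators)


rule = IntentRule(["application_entity", "disease_name"], ["and"])
updated = apply_update(
    rule,
    new_contents=["application_entity", "disease_name", "insurance_name"],
    new_operators=["and", "or"],
)
print(updated.and_count, updated.or_count, updated.positions)
# 1 1 [(0, 'and'), (1, 'or')]
```

Returning a new object instead of mutating the old one keeps the previous rule available, for example for rollback; this is a design choice of the sketch, not something the description requires.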
In an embodiment of the invention, in order to accurately obtain the intention entity features and improve the accuracy of establishing the intention logic relationship in the second-level intention, thereby improving the efficiency of intention recognition, after the sentence information of the knowledge question and answer to be identified is acquired, the method further includes: comparing preset intention entity features with at least one split word of the sentence information; and determining the split words found to be the same as the preset intention entity features as the intention entity features.
Specifically, the intention entity features used to obtain the intention of the sentence information, namely the preset intention entity features, are preset according to the business information and historical intention entity features. The preset intention entity features are then compared with each split word in the sentence information, and during the comparison the split words found to be the same as a preset intention entity feature are directly determined to be intention entity features. The split words are obtained by splitting the sentence information based on a word library of a natural language processing algorithm. Specifically, a Python word library for natural language processing (NLP) is selected, which stores a large number of words organized by part of speech such as noun, adjective, and verb, and the splitting is completed by comparing each word in the sentence information with the words in the Python word library; the embodiment of the invention is not particularly limited in this respect.
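The splitting and comparison can be sketched with a greedy longest-match over a small lexicon, followed by an exact comparison with the preset features. Greedy longest-match is a stand-in chosen for this illustration; the description only says that words are compared with a word library. `LEXICON` and `PRESET_FEATURES` are hypothetical, and the sentence is assumed to be whitespace-separated English.

```python
# Hypothetical word library and preset intention entity features.
LEXICON = {"lung cancer", "insurance", "application", "what", "is", "the"}
PRESET_FEATURES = {"application", "underwriting"}


def split_words(sentence, lexicon, max_len=3):
    """Greedy longest-match split; unknown single tokens are kept as they are."""
    tokens = sentence.lower().split()
    words, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, falling back to a single token.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in lexicon or size == 1:
                words.append(candidate)
                i += size
                break
    return words


def intent_entity_features(sentence):
    """Keep the split words that equal a preset intention entity feature."""
    return [w for w in split_words(sentence, LEXICON) if w in PRESET_FEATURES]


print(split_words("what is the lung cancer application", LEXICON))
# ['what', 'is', 'the', 'lung cancer', 'application']
print(intent_entity_features("what is the lung cancer application"))
# ['application']
```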
In one embodiment of the invention, to meet diversified requirements for intention recognition, the method further includes: if the sentence information is judged, according to the intention logic relationship in the second-level intention, not to match the intention logic content, splitting the sentence information by natural language processing, and combining the entity words obtained by the splitting that serve as subject, predicate, and object to determine the intention information of the sentence information.
Specifically, if the sentence information is judged not to match the intention logic content according to the intention logic relationship, an accurate intention cannot be recognized. In order to give the user timely feedback, the sentence information is split by natural language processing, and the entity words obtained as subject, predicate, and object are combined to determine the intention of the sentence information, which is fed back to the user. The natural language processing includes, but is not limited to, methods such as dependency analysis. When the entity words serving as subject, predicate, and object are split out and combined into a group of words, the group is fed back to the user directly, so that the user can make a judgment based on the returned intention information.
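The fallback can be sketched as follows, assuming a dependency analysis has already tagged each word with a grammatical role. The role names and the function names `fallback_intention` and `recognise` are hypothetical; no parser is invoked here.

```python
ROLES = ("subject", "predicate", "object")


def fallback_intention(parsed_words):
    """Combine the first subject, predicate, and object into one phrase."""
    first_by_role = {}
    for word, role in parsed_words:
        if role in ROLES and role not in first_by_role:
            first_by_role[role] = word
    return " ".join(first_by_role[role] for role in ROLES if role in first_by_role)


def recognise(matched_intention, parsed_words):
    """Use the matched second-level intention, or fall back to the word group."""
    if matched_intention is not None:
        return matched_intention
    return fallback_intention(parsed_words)


parsed = [
    ("patients", "subject"),
    ("can", "adverbial"),
    ("buy", "predicate"),
    ("insurance", "object"),
]
print(recognise(None, parsed))  # patients buy insurance
```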
Compared with the prior art, the embodiment of the invention acquires the sentence information of the knowledge question and answer to be identified; performs a first-level intention query on the sentence information according to the intention entity features, and determines at least one intention logic content in a second-level intention matched with the queried first-level intention; and, if the sentence information is judged to match the intention logic content according to the intention logic relationship in the second-level intention, determines the second-level intention as the intention information of the sentence information, where the intention logic relationship characterizes the and/or logical relationship between the different intention logic contents corresponding to the second-level intention. Because intentions are searched level by level, the rule judgment for intention recognition is templated, so the recognition of question-and-answer sentences can be reused by modifying the intention logic contents and the intention logic relationships. This makes maintenance more convenient and intention recognition more accurate.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a knowledge question and answer intention recognition device, as shown in fig. 3, including:
An acquisition module 21, configured to acquire sentence information of knowledge questions and answers to be identified;
A first determining module 22, configured to query the sentence information for a first level of intent according to the intent entity feature, and determine at least one intent logic content in a second level of intent matched with the queried first level of intent, where the first level of intent and the second level of intent are matched based on a hierarchical division of an intent category;
A second determining module 23, configured to determine the second-level intention as the intention information of the sentence information if the sentence information is judged to match the intention logic content according to the intention logic relationship in the second-level intention, where the intention logic relationship characterizes the and/or logical relationship between the different intention logic contents corresponding to the second-level intention.
Further, the apparatus further comprises: a dividing module and an establishing module,
The acquisition module is also used for acquiring statement information marked with intention categories;
The dividing module is configured to hierarchically divide the intention categories according to the classification levels matched with the intention categories, where the classification levels include a first-level classification and a second-level classification;
the establishing module is used for establishing an intention logic relationship between at least two intention logic contents corresponding to the second-level intention according to an intention matching expression of the statement information, wherein the intention matching expression is an expression used for judging whether the statement information accords with the intention contents.
Further, the apparatus further comprises: a parsing module,
The acquisition module is also used for acquiring at least one intention matching expression matched with the sentence information of the marked intention category;
The parsing module is configured to extract at least two intention logic contents from the intention matching expression according to sentence entity features, and to parse the number of "and" relationships, the number of "or" relationships, and the logical relationship positions of the and/or logical relationships in the intention matching expression;
The establishing module is specifically configured to establish the intention logic relationship between at least two intention logic contents corresponding to the second-level intention according to the number of "and" relationships, the number of "or" relationships, and the logical relationship positions.
Further, the parsing module is specifically configured to parse statement entity features in the statement information based on a dependency analysis algorithm, and sequentially screen at least two intention logic contents matched with the statement meaning of the second-level intention from the at least one intention matching expression based on the statement entity features.
Further, the apparatus further comprises: an updating module,
The acquisition module is further configured to acquire the intention logic content and/or the logic relationship to be updated if the update request of the intention logic relationship is detected;
The updating module is configured to update the intention logic content, the number of "and" relationships, the number of "or" relationships, and the logical relationship positions according to the intention logic content and the and/or logical relationship to be updated.
Further, the apparatus further comprises:
A comparison module, configured to compare preset intention entity features with at least one split word of the sentence information, where the split word is obtained by splitting the sentence information based on a word library of a natural language processing algorithm;
A third determining module, configured to determine the split words found to be the same as the preset intention entity features as the intention entity features.
Further, the apparatus further comprises:
And a fourth determining module, configured to split the sentence information according to a natural language processing technique if it is determined that the sentence information does not match the intended logic content according to the intended logic relationship in the second-level intent, and combine the entity words obtained by splitting as subjects, predicates, and objects to determine the intent information of the sentence information.
Compared with the prior art, the embodiment of the invention acquires the sentence information of the knowledge question and answer to be identified; performs a first-level intention query on the sentence information according to the intention entity features, and determines at least one intention logic content in a second-level intention matched with the queried first-level intention; and, if the sentence information is judged to match the intention logic content according to the intention logic relationship in the second-level intention, determines the second-level intention as the intention information of the sentence information, where the intention logic relationship characterizes the and/or logical relationship between the different intention logic contents corresponding to the second-level intention. Because intentions are searched level by level, the rule judgment for intention recognition is templated, so the recognition of question-and-answer sentences can be reused by modifying the intention logic contents and the intention logic relationships. This makes maintenance more convenient and intention recognition more accurate.
According to one embodiment of the present invention, there is provided a computer storage medium storing at least one executable instruction that can perform the knowledge question and answer intention recognition method in any of the above method embodiments.
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the invention; the specific embodiment of the invention does not limit the specific implementation of the computer device.
As shown in fig. 4, the computer device may include: a processor (processor) 302, a communication interface (Communications Interface) 304, a memory (memory) 306, and a communication bus 308.
The processor 302, the communication interface 304, and the memory 306 communicate with one another via the communication bus 308.
The communication interface 304 is used for communicating with network elements of other devices, such as clients or other servers.
The processor 302 is configured to execute the program 310, and may specifically perform relevant steps in the above-described embodiment of the knowledge question and answer intention recognition method.
In particular, program 310 may include program code including computer-operating instructions.
The processor 302 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the invention. The one or more processors included in the computer device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
Memory 306 for storing programs 310. Memory 306 may comprise high-speed RAM memory or may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
Program 310 may be specifically operable to cause processor 302 to:
acquiring statement information of knowledge questions and answers to be identified;
performing first-level intention query on the statement information according to the intention entity characteristics, and determining at least one intention logic content in a second-level intention matched with the queried first-level intention, wherein the first-level intention and the second-level intention are matched based on the hierarchical division of the intention category;
and, if the sentence information is judged to match the intention logic content according to the intention logic relationship in the second-level intention, determining the second-level intention as the intention information of the sentence information, where the intention logic relationship characterizes the and/or logical relationship between the different intention logic contents corresponding to the second-level intention.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may alternatively be implemented in program code executable by computing devices, so that they may be stored in a memory device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module for implementation. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.