CN112650878A - Retrieval method, system, device and medium - Google Patents

Retrieval method, system, device and medium

Info

Publication number
CN112650878A
Authority
CN
China
Prior art keywords
retrieval
search
instruction
category
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910966829.XA
Other languages
Chinese (zh)
Inventor
刘涛
苏少炜
陈孝良
常乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN201910966829.XA
Publication of CN112650878A


Abstract

A retrieval method, system, device and medium. The method comprises: acquiring a retrieval instruction; analyzing the retrieval instruction to obtain the retrieval category corresponding to the retrieval instruction; calling the retrieval mode of the retrieval category; and performing retrieval according to the retrieval mode. Because a retrieval mode is set for each retrieval category, and the retrieval mode of the matching category is called for each retrieval instruction, retrieval accuracy and user experience are both improved.

Description

Retrieval method, system, device and medium
Technical Field
The present disclosure relates to the field of data retrieval, and in particular, to a retrieval method, system, device, and medium.
Background
In the field of intelligent voice retrieval, for example with voice assistants and smart speakers, accurate responses are what make the device appear intelligent. In the related art, when a user issues an instruction whose intention is ambiguous, such as "get one to send a red packet", the instruction is recognized as chat, and a general retrieval model is then called to retrieve content related to the instruction. The result may be a response such as "I cannot hear what you say", whereas in practice "send a red packet" is the title of a short work. When a general retrieval model is used for instructions with ambiguous intentions, retrieval accuracy is greatly reduced, and because the general model searches across all resources, retrieval speed is reduced as well.
Disclosure of Invention
The present disclosure provides a retrieval method, system, device, and medium that address the problem in the prior art that retrieving a user's ambiguous instruction with a general retrieval template degrades both retrieval accuracy and retrieval speed.
A first aspect of an embodiment of the present disclosure provides a retrieval method, including: acquiring a retrieval instruction; analyzing the retrieval instruction to obtain a retrieval category corresponding to the retrieval instruction; calling a retrieval mode of the retrieval category; and performing retrieval according to the retrieval mode.
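The four steps of the first aspect can be read as a small pipeline. The sketch below is illustrative only: the function names, the toy classifier, and the dictionary layout of a retrieval mode are assumptions made for this example and do not come from the disclosure.

```python
def retrieve(instruction, classify, modes_by_category, run_search):
    """Run steps 2-4 on an instruction that has already been acquired (step 1)."""
    category = classify(instruction)        # step 2: obtain the retrieval category
    mode = modes_by_category[category]      # step 3: call that category's retrieval mode
    return run_search(instruction, mode)    # step 4: retrieve according to the mode


# Toy stand-ins, for illustration only (all hypothetical).
def toy_classify(text):
    return "music" if text.startswith("play") else "chat"


toy_modes = {"music": {"elements": ["song_title"]}, "chat": {"elements": []}}


def toy_search(text, mode):
    return {"instruction": text, "elements": mode["elements"]}


result = retrieve("play Sunny Day", toy_classify, toy_modes, toy_search)
print(result)
```

Passing the classifier and the search routine in as parameters keeps the sketch independent of any particular model or search backend.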
Optionally, the method further comprises: presetting a correspondence between retrieval categories and retrieval modes, wherein each retrieval category corresponds to at least one retrieval mode, and each retrieval mode corresponds to one retrieval category.
Optionally, the retrieval mode includes at least one retrieval element, each retrieval element corresponds to a matching mode, and the method further includes: analyzing the retrieval instruction to obtain one or more retrieval fields, wherein each retrieval field corresponds to one retrieval element. Performing retrieval according to the retrieval mode then comprises: retrieving the one or more retrieval fields according to the matching modes of the retrieval elements corresponding to those fields.
Optionally, the method further comprises: establishing a resource list corresponding to each retrieval category, wherein the resource list comprises the at least one retrieval element and at least one retrieval field corresponding to each retrieval element; and storing the resource list in shards.
Optionally, analyzing the retrieval instruction to obtain one or more retrieval fields includes: extracting one or more keywords from the retrieval instruction; querying the synonyms of each keyword; and indexing the resource list corresponding to the retrieval category according to the keywords and their synonyms to obtain a retrieval field corresponding to each keyword.
Optionally, when the retrieval category has more than one retrieval mode, performing retrieval according to the retrieval mode includes: performing retrieval with the retrieval modes in sequence. The method further comprises: executing the target file when retrieval with one of the retrieval modes yields a target file whose matching score against the retrieval instruction is greater than a threshold.
Optionally, analyzing the retrieval instruction to obtain a retrieval category corresponding to the retrieval instruction includes: analyzing the retrieval instruction with a preset neural network model to obtain the retrieval category corresponding to the retrieval instruction.
A second aspect of an embodiment of the present disclosure provides a retrieval system, including: an acquisition module, configured to acquire a retrieval instruction; an analysis module, configured to analyze the retrieval instruction to obtain a retrieval category corresponding to the retrieval instruction; a calling module, configured to call the retrieval mode of the retrieval category; and a retrieval module, configured to perform retrieval according to the retrieval mode.
Optionally, the system further comprises: a setting module, configured to preset the correspondence between retrieval categories and retrieval modes, wherein each retrieval category corresponds to at least one retrieval mode, and each retrieval mode corresponds to one retrieval category.
Optionally, the retrieval mode includes at least one retrieval element, each retrieval element corresponds to a matching mode, and the system further includes: a parsing module, configured to analyze the retrieval instruction to obtain one or more retrieval fields, each retrieval field corresponding to one retrieval element. The retrieval module is further configured to retrieve the one or more retrieval fields according to the matching modes of the retrieval elements corresponding to those fields.
Optionally, the system further comprises: an establishing module, configured to establish a resource list corresponding to each retrieval category, wherein the resource list comprises the at least one retrieval element and at least one retrieval field corresponding to each retrieval element; and a storage module, configured to store the resource list in shards.
Optionally, the parsing module includes: an extraction sub-module, configured to extract one or more keywords from the retrieval instruction; a query sub-module, configured to query the synonyms of each keyword; and an indexing sub-module, configured to index the resource list corresponding to the retrieval category according to the keywords and their synonyms to obtain the retrieval field corresponding to each keyword.
A third aspect of the embodiments of the present disclosure provides an electronic device, including: a processor; a memory storing a computer executable program which, when executed by the processor, causes the processor to perform the above-described retrieval method.
A fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the above-described retrieval method.
As can be seen from the foregoing embodiments, the retrieval method, system, device, and medium provided by the present disclosure acquire a retrieval instruction, analyze it to obtain the corresponding retrieval category, call a retrieval mode of that category, and perform retrieval according to the retrieval mode. Because a customized retrieval mode is set for each retrieval category and the mode of the matching category is called for each retrieval instruction, retrieval accuracy and extensibility are improved, as is the user experience. In addition, a resource list is set for each retrieval category and a synonym file is maintained, which improves the performance of instruction queries and further improves retrieval accuracy.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Fig. 1 schematically shows a flow diagram of a retrieval method according to an embodiment of the present disclosure;
Fig. 2 schematically shows a resource list stored in a retrieval method according to an embodiment of the present disclosure;
Fig. 3 schematically shows the structure of a retrieval system according to an embodiment of the present disclosure;
Fig. 4 shows a block diagram of a hardware configuration of an electronic device.
Detailed Description
In order to make the objects, features and advantages of the present disclosure more apparent and understandable, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
Referring to fig. 1, fig. 1 schematically shows a flow chart of a retrieval method according to an embodiment of the present disclosure. The method mainly includes operations S101 to S104.
S101, a retrieval instruction is obtained.
The retrieval instruction may be a voice instruction issued by the user to search for content. Specific retrieval instructions are, for example, "to send a red packet", "play Sunny Day by Zhou Jielun", "give a gift from lai dupu", and the like.
S102, analyzing the retrieval instruction to obtain the retrieval category corresponding to the retrieval instruction.
Specifically, the retrieval instruction is analyzed with a preset neural network model to obtain the retrieval category corresponding to the retrieval instruction. The retrieval category is, for example, "music category", "literature category", "science category", "gourmet category", "news category", "history category", "movie & TV series category", or the like. A person skilled in the art can derive other retrieval categories by analogy from the description of this embodiment, and each retrieval category can be configured in advance.
Taking the retrieval instruction "play Sunny Day by Zhou Jielun" as an example, the instruction may be analyzed with the preset neural network model, and the corresponding retrieval category is determined to be "music category". It is understood that a retrieval instruction may also correspond to more than one retrieval category. For example, the retrieval instruction "Li Bai" may correspond to the "literature category" and also to the "music category". In that case, the category the user is more interested in can be determined from the user's usage data, such as play records, and the retrieval instruction is mapped to that category.
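One simple way to break such a tie is to count how often each candidate category appears in the user's history. The following is a minimal sketch under that assumption; the function name, the shape of the play history, and the frequency heuristic are all hypothetical, since the disclosure does not specify how usage data is weighed.

```python
from collections import Counter


def pick_category(candidates, play_history):
    """Return the candidate category that appears most often in the user's history.

    candidates:   categories proposed for the instruction (hypothetical input).
    play_history: category names of past plays (hypothetical input).
    """
    counts = Counter(play_history)
    # max() returns the first maximal item, so the candidate order is the fallback on ties.
    return max(candidates, key=lambda category: counts[category])


history = ["music", "music", "literature", "news"]
print(pick_category(["literature", "music"], history))  # music
```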
The neural network is an algorithmic mathematical model for distributed parallel information processing formed by widely interconnecting a large number of simple neurons, and has wide application prospects in the fields of system identification, pattern recognition, intelligent control and the like.
In the embodiment of the disclosure, a large number of retrieval instructions of known retrieval categories can be used for training the neural network model, and parameters of the neural network model are adjusted in the training process until the analysis accuracy of the neural network model meets the requirement, and the obtained neural network model is the preset neural network model. For example, when the analysis accuracy of the adjusted neural network model on the retrieval instruction reaches 99.99%, the adjusted neural network model is used as a preset neural network model.
S103, calling the retrieval mode of the retrieval category.
In the embodiment of the disclosure, each retrieval mode comprises at least one retrieval element, and each retrieval element corresponds to one matching mode.
A retrieval element is the type of retrieval content required when retrieving with the retrieval mode to which the element belongs, and each retrieval element corresponds to a large number of specific retrieval contents. For example, the retrieval contents "Zhou Jielun", "Li Yuchun", "TFBOYS", etc. are all singer names, so the corresponding retrieval element may be defined as "singer name".
Taking the retrieval category "music category" as an example, the retrieval elements of its retrieval mode include, for example, "song title", "singer name", "genre", and the like. Those skilled in the art can derive other retrieval elements of the "music category", and the retrieval elements of other retrieval categories, from the description of this embodiment.
Further, a matching mode can be set for each retrieval element according to how relevant that element is within the retrieval category. The matching modes are, for example, "must", "should", "no_must", etc. A retrieval element with the matching mode "must" has to be matched completely; a retrieval element with the matching mode "should" has to reach a certain matching degree, for example 80%; a retrieval element with the matching mode "no_must" may or may not be matched. Other matching modes can be derived by those skilled in the art from the description of this embodiment.
The retrieval mode of the "music category" is illustrated below with the retrieval elements "song title", "singer name", and "genre". For example, the retrieval element "song title" uses the matching mode "should" with a minimum matching degree of 80% and a score (i.e., weight) of 400; the retrieval element "singer name" uses the matching mode "no_must" with a score of 100; and the retrieval element "genre" uses the matching mode "must" with a score of 200. These three retrieval elements and their matching modes together form the retrieval mode of the "music category": a resource obtains 400 points when its matching degree with the song title reaches 80%, 200 points when it completely matches the genre, and 100 points when it matches the singer name, and the points obtained are added to give the final score. It is understood that this example merely illustrates the retrieval mode in the embodiments of the present disclosure, and those skilled in the art may derive other customized retrieval modes from the description of the embodiments.
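The scoring in this example can be sketched in a few lines. The dictionary layout, the key names, and the use of `difflib.SequenceMatcher` as the measure of "matching degree" are assumptions made for illustration. So is the choice to give a resource a score of 0 when a "must" element fails, since the text only says such an element must match completely.

```python
from difflib import SequenceMatcher

# Hypothetical encoding of the "music category" retrieval mode from the example.
MUSIC_MODE = {
    "song_title": {"match": "should", "min_ratio": 0.8, "score": 400},
    "singer_name": {"match": "no_must", "score": 100},
    "genre": {"match": "must", "score": 200},
}


def similarity(a, b):
    """Matching degree in [0, 1]; SequenceMatcher is a stand-in, not the patent's measure."""
    return SequenceMatcher(None, a, b).ratio()


def score_resource(resource, fields, mode=MUSIC_MODE):
    """Add up the weights of the retrieval elements that the resource satisfies."""
    total = 0
    for element, rule in mode.items():
        wanted = fields.get(element)
        if wanted is None:          # the instruction did not mention this element
            continue
        actual = resource.get(element, "")
        if rule["match"] == "should":
            hit = similarity(wanted, actual) >= rule["min_ratio"]
        else:                       # "must" and "no_must" compare exactly in this sketch
            hit = wanted == actual
        if hit:
            total += rule["score"]
        elif rule["match"] == "must":
            return 0                # assumption: a failed "must" disqualifies the resource
    return total


song = {"song_title": "Sunny Day", "singer_name": "Zhou Jielun", "genre": "pop"}
print(score_resource(song, {"song_title": "Sunny Day", "singer_name": "Zhou Jielun", "genre": "pop"}))  # 700
print(score_resource(song, {"song_title": "Sunny Days", "genre": "pop"}))  # 600
print(score_resource(song, {"song_title": "Sunny Day", "genre": "rock"}))  # 0
```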
In the embodiment of the present disclosure, the correspondence between retrieval categories and retrieval modes is preset: each retrieval category corresponds to at least one retrieval mode, and each retrieval mode corresponds to one retrieval category. It is understood that one retrieval category may correspond to several retrieval modes, each emphasizing different retrieval elements. The retrieval elements included in each retrieval mode may differ, and the matching mode assigned to the same retrieval element may also differ between modes, so that each mode searches with a different emphasis.
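The two constraints on this correspondence (every category has at least one mode, and every mode belongs to exactly one category) can be checked when the mapping is built. The sketch below is one possible way to do so; the function name and the mode names are hypothetical.

```python
def build_registry(categories, mode_definitions):
    """Map each category to its retrieval modes and enforce both constraints.

    mode_definitions: iterable of (mode_name, category) pairs (hypothetical format).
    """
    registry = {category: [] for category in categories}
    seen = set()
    for mode_name, category in mode_definitions:
        if mode_name in seen:
            raise ValueError(f"mode {mode_name!r} is defined more than once")
        if category not in registry:
            raise ValueError(f"unknown category {category!r}")
        seen.add(mode_name)
        registry[category].append(mode_name)
    empty = [category for category, modes in registry.items() if not modes]
    if empty:
        raise ValueError(f"categories without a retrieval mode: {empty}")
    return registry


registry = build_registry(
    ["music", "literature"],
    [("title_first", "music"), ("singer_first", "music"), ("poem_lookup", "literature")],
)
print(registry["music"])  # ['title_first', 'singer_first']
```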
In addition, in the embodiment of the disclosure, when a retrieval mode is called, it is checked whether the retrieval mode is in JSON format. If it is not, the call is rejected and a corresponding feedback result is output, so that maintenance staff can correct the retrieval mode according to the feedback and upload it again.
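A format check of this kind can be sketched with the standard `json` module. The function name and the wording of the feedback are illustrative, and the extra requirement that the top-level value be a JSON object is an assumption of this sketch.

```python
import json


def load_retrieval_mode(raw_text):
    """Return (mode, None) for a valid definition, or (None, feedback) if it is rejected."""
    try:
        mode = json.loads(raw_text)
    except json.JSONDecodeError as err:
        feedback = f"rejected: not valid JSON (line {err.lineno}, column {err.colno}: {err.msg})"
        return None, feedback
    if not isinstance(mode, dict):
        # Assumption: a retrieval mode is expected to be a JSON object.
        return None, "rejected: top-level value must be a JSON object"
    return mode, None


mode, feedback = load_retrieval_mode('{"genre": {"match": "must", "score": 200}}')
print(mode, feedback)
mode_bad, feedback_bad = load_retrieval_mode("{genre: must}")
print(mode_bad, feedback_bad)
```

Returning the parser's line and column gives maintenance staff a concrete location to correct before uploading again.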
S104, performing retrieval according to the retrieval mode.
In the disclosed embodiment, operation S104 may include operations S104A through S104B.
S104A, analyzing the retrieval instruction to obtain one or more retrieval fields, each corresponding to one retrieval element.
A retrieval field is specific content obtained from the retrieval instruction, for example "Zhou Jielun", "Sunny Day", "Li Bai", "rock", and the like. The retrieval field "Zhou Jielun" corresponds, for example, to the retrieval element "singer name", and the retrieval field "Sunny Day" corresponds to the retrieval element "song title". Other retrieval fields can be derived by those skilled in the art from the description of the present embodiment.
In the embodiment of the present disclosure, a resource list corresponding to each retrieval category is established in advance. The resource list includes the at least one retrieval element and at least one retrieval field corresponding to each retrieval element, and the resource list is stored in shards.
Fig. 2 schematically shows a resource list stored in a retrieval method according to an embodiment of the present disclosure. Referring to Fig. 2, each retrieval category is stored with its own resource list. Taking the "music category" as an example, its resource list includes the retrieval elements "song title", "singer name", "genre", "lyrics", etc., and each retrieval element includes a large number of retrieval fields. For example, the retrieval element "singer name" includes the retrieval fields "Zhou Jielun", "TFBOYS", "Janis Joplin", etc. Retrieval fields can be added to the resource list in real time. Because resource lists are established per retrieval category, later maintenance is easier, no huge index has to be built when indexing retrieval instructions, and retrieval efficiency is improved.
In the embodiment of the present disclosure, the resource list may be stored in shards on an Elasticsearch (ES) server. ES is a Lucene-based search server that provides a distributed, multi-user full-text search engine. The resource list can be stored and scheduled in shards across the distributed ES servers. Since an ES cluster is a distributed search engine, the parts of a resource list distributed across the nodes of the cluster are shards, and the size and number of shards affect search performance overhead. In the embodiment of the present disclosure, the sharding principle is, for example: keep the number of shards on each node below 20 to 25 per GB of configured heap memory; or, when the data in the resource list is expected to grow quickly, limit the shard capacity first and then the number of shards. For example, if the resource list is estimated to reach 100 GB, then for an ES cluster with a maximum heap memory of 32 GB, the maximum shard capacity can be limited to 30 GB, so that 4 to 5 shards are a reasonable setting for the 100 GB resource list.
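The arithmetic behind the 100 GB example can be written out as a short helper. The function name and parameter names are hypothetical, and the helper only reproduces the two rules of thumb stated above (a capacity cap per shard and a per-node ceiling of about 20 shards per GB of heap).

```python
import math


def plan_shards(estimated_gb, max_shard_gb=30, heap_gb=32, shards_per_heap_gb=20):
    """Return (minimum shard count, whether that count fits under the per-node ceiling)."""
    shard_count = math.ceil(estimated_gb / max_shard_gb)  # capacity rule: no shard above the cap
    node_limit = heap_gb * shards_per_heap_gb             # count rule: about 20 shards per GB of heap
    return shard_count, shard_count <= node_limit


print(plan_shards(100))  # (4, True)
```

With a 30 GB cap, 100 GB needs at least 4 shards; choosing 5 leaves headroom for growth, which is consistent with the range of 4 to 5 given in the text.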
In the embodiment of the disclosure, a synonym list is also established in advance, in which a large number of fields with the same entity meaning are stored. Synonyms here are words that point to the same entity, such as "Li Bai", "Li Taibai", "the Poet Immortal", "Qinglian Jushi", and the like.
Further, operation S104A includes: extracting one or more keywords from the retrieval instruction, querying the synonyms of each keyword, and indexing the resource list corresponding to the retrieval category according to each keyword and its synonyms to obtain the retrieval field corresponding to each keyword.
Specifically, one or more keywords are first extracted from the retrieval instruction, for example using the index mode of the jieba word segmenter. Keywords are, for example, the fields in the retrieval instruction that carry an entity meaning. Taking the retrieval instruction "want to listen to the national anthem of the People's Republic of China" as an example, the extracted keywords are, for example, {"China", "the People's Republic of China", "Chinese", "people", "republic", "national anthem"}.
Next, the synonym list is queried with the extracted keywords to obtain the synonyms of each keyword. For example, the synonym of "the People's Republic of China" is "China", the synonym of "Chinese" is "China", and the synonym of "national anthem" is "March of the Volunteers".
Then, for example using the search mode of the jieba word segmenter, the resource list corresponding to the retrieval category is queried with each keyword and its synonyms, which yields the retrieval field "March of the Volunteers"; the retrieval element corresponding to this retrieval field is "song title". It is understood that, because only retrieval elements that influence the retrieval result are stored in the resource list, not every field in the retrieval instruction corresponds to a retrieval element.
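The lookup from keywords to retrieval fields can be sketched as follows, using the national anthem example (whose title is "March of the Volunteers"). The keywords are assumed to have been extracted already, so no word segmenter is called here. The in-memory dictionaries stand in for the synonym list and the resource list, and their contents are illustrative only.

```python
# Hypothetical in-memory stand-ins for the synonym list and a "music" resource list.
SYNONYMS = {
    "the People's Republic of China": ["China"],
    "national anthem": ["March of the Volunteers"],
}
MUSIC_RESOURCES = {
    "song_title": {"March of the Volunteers", "Sunny Day"},
    "singer_name": {"Zhou Jielun", "TFBOYS"},
}


def find_retrieval_fields(keywords, synonyms, resource_list):
    """Map each retrieval element to the first keyword or synonym found under it."""
    fields = {}
    for keyword in keywords:
        for term in [keyword, *synonyms.get(keyword, [])]:
            for element, known_fields in resource_list.items():
                if term in known_fields:
                    fields.setdefault(element, term)
    return fields


keywords = ["China", "the People's Republic of China", "people", "national anthem"]
print(find_retrieval_fields(keywords, SYNONYMS, MUSIC_RESOURCES))
# {'song_title': 'March of the Volunteers'}
```

Keywords such as "people" produce no retrieval field because neither they nor their synonyms appear in the resource list.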
In the embodiment of the present disclosure, operation S104A is not performed on every retrieval instruction; that is, some retrieval instructions are not split into fields. For example, an instruction with an explicit resource type, such as "play the movie Tomb-Robbing Notes", is retrieved directly without splitting, which saves storage space and improves retrieval performance.
S104B, retrieving the one or more retrieval fields according to the matching modes of the retrieval elements corresponding to those fields.
Taking the retrieval fields obtained in operation S104A as "March of the Volunteers" and "*" as an example, the attribute of the retrieval element "song title" in the retrieval mode is replaced by "March of the Volunteers", the attribute of the retrieval element "singer name" is replaced by "*", and the attributes of the other retrieval elements are likewise replaced by "*", where "*" stands for any match. "March of the Volunteers" and "*" are then searched in combination according to their matching modes, that is, a compound search is run over the resources of the retrieval category. A score is obtained for each resource; the higher the score, the better the resource matches the retrieval fields overall, and the resources with the highest scores can be output in order.
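The template filling and ranking described here can be sketched as below, with a wildcard (written "*") standing for "any match". The weights reuse the scores from the music example, exact string comparison replaces the matching modes for brevity, and the assumption that a wildcard contributes no points is made for this sketch only.

```python
WEIGHTS = {"song_title": 400, "singer_name": 100, "genre": 200}  # scores from the music example


def fill_template(elements, fields):
    """Use the retrieval field where one was found and the wildcard '*' elsewhere."""
    return {element: fields.get(element, "*") for element in elements}


def rank(resources, query, weights, top_n=2):
    """Score every resource against the filled template and return the best ones."""
    scored = []
    for resource in resources:
        score = sum(
            weight
            for element, weight in weights.items()
            if query[element] != "*" and resource.get(element) == query[element]
        )
        scored.append((score, resource["id"]))
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps the input order on ties
    return scored[:top_n]


resources = [
    {"id": 1, "song_title": "Sunny Day", "singer_name": "Zhou Jielun"},
    {"id": 2, "song_title": "March of the Volunteers", "singer_name": "Choir A"},
]
query = fill_template(WEIGHTS, {"song_title": "March of the Volunteers"})
print(query)
print(rank(resources, query, WEIGHTS))  # [(400, 2), (0, 1)]
```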
In this embodiment of the disclosure, when the retrieval category corresponding to the retrieval instruction has more than one retrieval mode, operation S104B specifically includes: performing retrieval with the retrieval modes in sequence, for example in a preset order; and, when retrieval with one of the retrieval modes yields a target file whose matching score against the retrieval instruction is greater than a threshold, executing the target file. The threshold, for example 500, is used to ensure the accuracy of the target file.
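Trying the retrieval modes in a preset order and stopping at the first sufficiently good hit can be sketched as follows. The callback `run_mode`, the mode names, and the canned results are hypothetical; the threshold of 500 is the example value from the text.

```python
def search_in_order(modes, run_mode, threshold=500):
    """Try each mode in its preset order; return the first target scoring above the threshold."""
    for mode in modes:
        best = run_mode(mode)  # expected to return (score, target) or None
        if best is not None and best[0] > threshold:
            return best[1]
    return None  # no mode produced a confident match


# Toy results per mode, for illustration only.
canned = {"title_first": (400, "song_a.mp3"), "singer_first": (700, "song_b.mp3")}
print(search_in_order(["title_first", "singer_first"], canned.get))  # song_b.mp3
```

The first mode's best score (400) does not exceed the threshold, so the second mode is tried and its target is returned.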
In addition, the retrieval method in this embodiment may use log4j2 for logging, which offers high speed, asynchronous writing, and good compression. With asynchronous writing, performance scales linearly, which keeps the logging system responsive under frequent retrieval and highly concurrent access.
In the embodiment of the disclosure, a retrieval instruction is acquired and analyzed to obtain the corresponding retrieval category, a retrieval mode of that category is called, and retrieval is performed according to the retrieval mode. Because a customized retrieval mode is set for each retrieval category and the mode of the matching category is called for each retrieval instruction, retrieval accuracy and extensibility are improved, as is the user experience. In addition, a resource list is set for each retrieval category and a synonym file is maintained, which improves the performance of instruction queries and further improves retrieval accuracy.
Referring to Fig. 3, Fig. 3 schematically shows the structure of a retrieval system according to an embodiment of the present disclosure. The system mainly comprises an acquisition module 301, an analysis module 302, a calling module 303 and a retrieval module 304.
The acquisition module 301 is configured to acquire a retrieval instruction.
The retrieval instruction may be a voice instruction issued by the user to search for content. Specific retrieval instructions are, for example, "to send a red packet", "play Sunny Day by Zhou Jielun", "give a gift from lai dupu", and the like.
The analysis module 302 is configured to analyze the retrieval instruction to obtain a retrieval category corresponding to the retrieval instruction.
Specifically, the retrieval instruction is analyzed with a preset neural network model to obtain the retrieval category corresponding to the retrieval instruction. The retrieval category is, for example, "music category", "literature category", "science category", "gourmet category", "news category", "history category", "movie & TV series category", or the like. A person skilled in the art can derive other retrieval categories by analogy from the description of this embodiment, and each retrieval category can be configured in advance.
Taking the retrieval instruction "play Sunny Day by Zhou Jielun" as an example, the instruction may be analyzed with the preset neural network model, and the corresponding retrieval category is determined to be "music category". It is understood that a retrieval instruction may also correspond to more than one retrieval category. For example, the retrieval instruction "Li Bai" may correspond to the "literature category" and also to the "music category". In that case, the category the user is more interested in can be determined from the user's usage data, such as play records, and the retrieval instruction is mapped to that category.
In the embodiment of the disclosure, a large number of retrieval instructions of known retrieval categories can be used for training the neural network model, and parameters of the neural network model are adjusted in the training process until the analysis accuracy of the neural network model meets the requirement, and the obtained neural network model is the preset neural network model. For example, when the analysis accuracy of the adjusted neural network model on the retrieval instruction reaches 99.99%, the adjusted neural network model is used as a preset neural network model.
The calling module 303 is configured to call the retrieval mode of the retrieval category.
In the embodiment of the disclosure, each retrieval mode comprises at least one retrieval element, and each retrieval element corresponds to one matching mode.
A retrieval element is the type of retrieval content required when retrieving with the retrieval mode to which the element belongs, and each retrieval element corresponds to a large number of specific retrieval contents. For example, the retrieval contents "Zhou Jielun", "Li Yuchun", "TFBOYS", etc. are all singer names, so the corresponding retrieval element may be defined as "singer name".
Taking the retrieval category "music category" as an example, the retrieval elements of its retrieval mode include, for example, "song title", "singer name", "genre", and the like. Those skilled in the art can derive other retrieval elements of the "music category", and the retrieval elements of other retrieval categories, from the description of this embodiment.
Further, a matching mode can be set for each retrieval element according to how relevant that element is within the retrieval category. The matching modes are, for example, "must", "should", "no_must", etc. A retrieval element with the matching mode "must" has to be matched completely; a retrieval element with the matching mode "should" has to reach a certain matching degree, for example 80%; a retrieval element with the matching mode "no_must" may or may not be matched. Other matching modes can be derived by those skilled in the art from the description of this embodiment.
The retrieval mode of the "music category" is illustrated below with the retrieval elements "song title", "singer name", and "genre". For example, the retrieval element "song title" uses the matching mode "should" with a minimum matching degree of 80% and a score (i.e., weight) of 400; the retrieval element "singer name" uses the matching mode "no_must" with a score of 100; and the retrieval element "genre" uses the matching mode "must" with a score of 200. These three retrieval elements and their matching modes together form the retrieval mode of the "music category": a resource obtains 400 points when its matching degree with the song title reaches 80%, 200 points when it completely matches the genre, and 100 points when it matches the singer name, and the points obtained are added to give the final score. It is understood that this example merely illustrates the retrieval mode in the embodiments of the present disclosure, and those skilled in the art may derive other customized retrieval modes from the description of the embodiments.
In the embodiment of the present disclosure, the retrieval system further includes a setting module, configured to preset the correspondence between each retrieval category and its retrieval modes, where each retrieval category corresponds to at least one retrieval mode, and each retrieval mode corresponds to one retrieval category. It is understood that one retrieval category may correspond to a plurality of retrieval modes, each of which emphasizes different retrieval elements. The retrieval elements included in each retrieval mode may differ, and the matching mode assigned to the same retrieval element may also differ between retrieval modes, so that each retrieval mode searches with a different emphasis.
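The correspondence maintained by the setting module can be pictured with the following minimal sketch. The class name `ModeRegistry`, its methods, and the two mode names are hypothetical; the sketch only enforces the two constraints stated above, namely that a category has at least one retrieval mode and that a retrieval mode belongs to exactly one category.

```python
from collections import defaultdict


class ModeRegistry:
    """Hypothetical store mapping retrieval categories to retrieval modes."""

    def __init__(self):
        self._modes = defaultdict(list)  # category -> modes, in preset order
        self._owner = {}                 # mode name -> owning category

    def register(self, category, name, elements):
        if name in self._owner:
            # Each retrieval mode corresponds to exactly one category.
            raise ValueError(f"mode {name!r} already belongs to {self._owner[name]!r}")
        self._owner[name] = category
        self._modes[category].append({"name": name, "elements": elements})

    def modes_for(self, category):
        modes = self._modes.get(category)
        if not modes:
            # Each category must have at least one retrieval mode.
            raise KeyError(f"no retrieval mode preset for {category!r}")
        return list(modes)


registry = ModeRegistry()
# Two modes for one category, each emphasizing a different element.
registry.register("music", "title_first", {"song_title": "must", "singer_name": "no_must"})
registry.register("music", "singer_first", {"singer_name": "must", "song_title": "should"})
print([m["name"] for m in registry.modes_for("music")])  # ['title_first', 'singer_first']
```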
And a retrieval module 304, configured to perform retrieval according to the retrieval mode.
In the embodiment of the present disclosure, the retrieval system further comprises a parsing module for parsing the retrieval instruction to obtain one or more retrieval fields, each of which corresponds to one retrieval element.
The retrieval field is specific content obtained from the retrieval instruction, such as "Zhou Jielun", "sunny day", "Li Bai", "rock", and the like. For example, the retrieval field "Zhou Jielun" corresponds to the retrieval element "singer name", and the retrieval field "sunny day" corresponds to the retrieval element "song title". Those skilled in the art can derive other retrieval fields from the description of this embodiment.
In the embodiment of the present disclosure, the retrieval system further includes an establishing module and a storage module. The establishing module is configured to establish in advance a resource list corresponding to each retrieval category, where the resource list includes the at least one retrieval element and at least one retrieval field corresponding to each retrieval element. The storage module is configured to store the resource list in shards. The storage module stores, for example, a resource storage list as shown in fig. 2.
In the embodiment of the present disclosure, the storage module may store the shards of the resource list on Elasticsearch (ES for short) servers. ES is a Lucene-based search server that provides a distributed, multi-tenant-capable full-text search engine. The resource list can be stored and scheduled in shards across the distributed ES servers. Since an ES cluster is a distributed search engine, the portions of a resource list distributed across different nodes of the ES cluster are shards, and the size and number of the shards affect the performance overhead of retrieval.
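For illustration only, the request body used to create an ES index for one resource list might look like the sketch below. `number_of_shards` and `number_of_replicas` are standard Elasticsearch index settings and `text` is a standard field type; the helper name, the field names, and the shard and replica counts are assumptions chosen for the example, and no request is actually sent.

```python
import json


def build_index_body(num_shards, num_replicas, elements):
    """Build a create-index body for one resource list (hypothetical helper)."""
    if num_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return {
        "settings": {
            # Shard count and size influence retrieval overhead.
            "number_of_shards": num_shards,
            "number_of_replicas": num_replicas,
        },
        "mappings": {
            # One full-text field per retrieval element.
            "properties": {name: {"type": "text"} for name in elements},
        },
    }


body = build_index_body(3, 1, ["song_title", "singer_name", "genre"])
print(json.dumps(body, indent=2))
```

Because the number of primary shards is fixed when an index is created, the value would normally be chosen with the expected size of the resource list in mind.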
In the embodiment of the present disclosure, the establishing module is further configured to establish a near-synonym word list in advance, where the near-synonym word list stores a large number of fields that have the same entity meaning. Near-synonyms are words that have the same entity meaning, i.e., words that point to the same entity, such as "Li Bai", "Li Taibai", "Poet Immortal", "Qinglian Jushi", and the like.
Further, the parsing module comprises an extraction sub-module, a query sub-module and an indexing sub-module.
The extraction submodule is used for extracting one or more keywords from the retrieval instruction. Specifically, the extraction submodule extracts the keywords by using, for example, the index mode of jieba word segmentation. Keywords are, for example, the fields in the retrieval instruction that carry entity meaning. Taking the retrieval instruction "I want to listen to the national anthem of the People's Republic of China" as an example, the extracted keywords are, for example, {"China", "the People's Republic of China", "Chinese", "people", "republic", "national anthem"}.
The query submodule is used for querying the near-synonyms of each keyword. Specifically, the query submodule looks up the extracted keywords in the near-synonym word list to obtain the near-synonyms corresponding to each keyword, such as the near-synonym "China" for "the People's Republic of China", the near-synonym "Chinese" for "Chinese", the near-synonym "hero song" for "heroic military song", and the like.
And the indexing submodule is used for indexing the resource list corresponding to the retrieval category according to each keyword and its near-synonyms, to obtain the retrieval field corresponding to each keyword. Specifically, the indexing submodule, using the search mode of jieba word segmentation, queries the resource list corresponding to the retrieval category with each obtained keyword and its near-synonyms, thereby obtaining, for example, the retrieval field "heroic military song", whose corresponding retrieval element is "song title". It is understood that, since only the retrieval elements that influence the retrieval result are stored in the resource list, not every field in the retrieval instruction corresponds to a retrieval element.
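The cooperation of the extraction, query, and indexing submodules can be sketched as follows. This is a simplified stand-in under stated assumptions: keyword extraction is done by a naive substring scan against a known vocabulary rather than by jieba, and the resource list, the near-synonym entry "Jay Chou", and all function names are hypothetical.

```python
# Hypothetical resource list for the "music" category:
# retrieval element -> known retrieval fields.
RESOURCE_LIST = {
    "song_title": {"sunny day"},
    "singer_name": {"Zhou Jielun"},
    "genre": {"rock"},
}

# Hypothetical near-synonym word list: alias -> entry used in the resource list.
SYNONYMS = {"Jay Chou": "Zhou Jielun"}


def extract_keywords(instruction):
    """Extraction submodule stand-in: find known words inside the instruction."""
    vocabulary = set(SYNONYMS) | set(SYNONYMS.values())
    for fields in RESOURCE_LIST.values():
        vocabulary |= fields
    lowered = instruction.lower()
    return [word for word in sorted(vocabulary) if word.lower() in lowered]


def parse(instruction):
    """Return {retrieval element: retrieval field} for one instruction."""
    fields = {}
    for keyword in extract_keywords(instruction):
        # Query submodule: the keyword plus its near-synonym, if any.
        candidates = {keyword, SYNONYMS.get(keyword, keyword)}
        # Indexing submodule: look the candidates up in the resource list.
        for element, known in RESOURCE_LIST.items():
            hit = candidates & known
            if hit:
                fields[element] = next(iter(hit))
    return fields


print(parse("I want to hear sunny day by Jay Chou"))
```

Words of the instruction that appear in neither the resource list nor the near-synonym word list, such as "want" or "hear", produce no retrieval field, which mirrors the remark that not every field in the instruction corresponds to a retrieval element.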
In the embodiment of the present disclosure, the parsing module is not executed for every retrieval instruction; that is, some retrieval instructions are not split into fields. For example, an instruction with a definite resource type, such as "play the movie Tomb Stealing Note", is retrieved directly without splitting, which saves storage space and improves retrieval performance.
In the embodiment of the present disclosure, the retrieval module 304 is further configured to retrieve the one or more retrieval fields according to the matching modes of the retrieval elements corresponding to the one or more retrieval fields.
Taking the retrieval fields "heroic military song" and "group star" obtained by the parsing module as an example, the attribute corresponding to the retrieval element "song title" in the retrieval mode is replaced by "heroic military song", the attribute corresponding to the retrieval element "singer name" is replaced by "group star", and the attributes corresponding to the other retrieval elements are replaced by "*", where "*" represents any match. A composite retrieval of "heroic military song" and "group star" is then performed according to their respective matching modes, i.e., within the resources corresponding to the retrieval category. Further, a score can be obtained for each resource; the higher the score, the higher the overall matching degree between the resource and the retrieval fields, and the several resources with the highest scores can be output in order.
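A minimal sketch of the composite retrieval is given below. It assumes exact string comparison for every element (the per-element matching modes are omitted for brevity), reuses the scores 400, 100, and 200 from the earlier music example, and uses hypothetical function names and sample resources.

```python
WILDCARD = "*"  # represents "any match" for elements without a retrieval field


def build_query(elements, fields):
    """Fill the retrieval mode: known fields as-is, every other element with '*'."""
    return {element: fields.get(element, WILDCARD) for element in elements}


def rank(resources, query, scores, top_n=3):
    """Score each resource and return the top_n (score, resource) pairs."""
    ranked = []
    for resource in resources:
        total = sum(
            scores[element]
            for element, wanted in query.items()
            if wanted != WILDCARD and resource.get(element) == wanted
        )
        if total > 0:
            ranked.append((total, resource))
    ranked.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
    return ranked[:top_n]


scores = {"song_title": 400, "singer_name": 100, "genre": 200}
query = build_query(
    ["song_title", "singer_name", "genre"],
    {"song_title": "heroic military song", "singer_name": "group star"},
)
resources = [
    {"id": "c", "song_title": "other song", "singer_name": "group star"},
    {"id": "a", "song_title": "heroic military song", "singer_name": "group star"},
    {"id": "b", "song_title": "heroic military song", "singer_name": "someone else"},
]
for total, resource in rank(resources, query, scores):
    print(total, resource["id"])
```

Here the element "genre" receives the wildcard because the instruction supplied no genre, so it neither adds to nor subtracts from any resource's score.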
In the embodiment of the present disclosure, when the retrieval category corresponding to the retrieval instruction has more than one retrieval mode, the retrieval module 304 is further configured to perform retrieval with the retrieval modes one after another; for example, an order of the retrieval modes is preset, and the retrieval modes are applied in that order. When retrieval according to one of the retrieval modes yields a target file whose matching score with the retrieval instruction is greater than a threshold, the target file is executed. The threshold, for example 500, is used to ensure the accuracy of the target file.
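The sequential use of several retrieval modes can be sketched as follows. The function name, the `search` callable, and the file names are hypothetical; the sketch assumes that the remaining retrieval modes are skipped once a result above the threshold is found, which the text implies but does not state outright.

```python
def retrieve_with_fallback(modes, search, threshold=500):
    """Try each retrieval mode in its preset order.

    `search` is a hypothetical callable that takes one mode and returns
    (target_file, score), or None when nothing is found.
    """
    for mode in modes:
        result = search(mode)
        if result is None:
            continue
        target, score = result
        if score > threshold:
            return target, score, mode  # accurate enough: stop here
    return None  # no mode produced a result above the threshold


# Canned results standing in for real retrieval runs.
canned = {"title_first": ("a.mp3", 400), "singer_first": ("b.mp3", 600)}
calls = []


def fake_search(mode):
    calls.append(mode)
    return canned.get(mode)


print(retrieve_with_fallback(["title_first", "singer_first", "genre_first"], fake_search))
# ('b.mp3', 600, 'singer_first')
```

In this run the third mode is never tried, because the second mode already returns a score above 500.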
In the embodiment of the present disclosure, the obtaining module 301 obtains a retrieval instruction, the analyzing module 302 analyzes the retrieval instruction to obtain the retrieval category corresponding to the retrieval instruction, the calling module 303 calls the retrieval mode of the retrieval category, and the retrieval module 304 performs retrieval according to the retrieval mode. Customizing a retrieval mode for each retrieval category and calling the retrieval mode of the corresponding category for each retrieval instruction improves retrieval accuracy, extensibility, and user experience. Setting a resource list for each retrieval category and maintaining a near-synonym word list improves the performance of instruction queries and further improves retrieval accuracy.
Referring to fig. 4, fig. 4 shows a hardware configuration diagram of an electronic device.
The electronic device described in this embodiment includes:
a memory 41, a processor 42, and a computer program stored on the memory 41 and executable on the processor 42, the processor 42 implementing the retrieval method described in the foregoing embodiment shown in fig. 1 when executing the program.
Further, the electronic device further includes:
at least one input device 43; at least one output device 44.
The memory 41, the processor 42, the input device 43, and the output device 44 are connected by a bus 45.
The input device 43 may be a camera, a touch panel, a physical button, or a mouse. The output device 44 may specifically be a display screen.
The memory 41 may be a high-speed random access memory (RAM) or a non-volatile memory, such as a magnetic disk memory. The memory 41 is used for storing a set of executable program code, and the processor 42 is coupled to the memory 41.
Further, an embodiment of the present disclosure also provides a computer-readable storage medium, which may be provided in the terminal of the foregoing embodiments and may be the memory of the foregoing embodiment shown in fig. 4. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the retrieval method described in the foregoing embodiment shown in fig. 1. Further, the computer-readable storage medium may be any of various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described embodiments are merely illustrative: the division of the modules is merely a logical division, and an actual implementation may use another division; a plurality of modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication link shown or discussed may be an indirect coupling or communication link between modules through some interfaces, and may be in an electrical, mechanical, or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present disclosure may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
It is noted that, for simplicity of explanation, the foregoing method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will appreciate that the present disclosure is not limited by the described order of acts, since some steps may, in accordance with the present disclosure, be performed in other orders or concurrently. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the present disclosure.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In summary, those skilled in the art may, based on the concepts of the embodiments of the present disclosure, make changes to the specific implementations and the scope of application; the contents of this specification should therefore not be construed as limiting the present disclosure.

Claims (14)

Application Number: CN201910966829.XA; Priority Date: 2019-10-11; Filing Date: 2019-10-11; Title: Retrieval method, system, device and medium; Status: Pending; Publication: CN112650878A (en)

Priority Applications (1)

Application Number: CN201910966829.XA (CN112650878A (en)); Priority Date: 2019-10-11; Filing Date: 2019-10-11; Title: Retrieval method, system, device and medium

Applications Claiming Priority (1)

Application Number: CN201910966829.XA (CN112650878A (en)); Priority Date: 2019-10-11; Filing Date: 2019-10-11; Title: Retrieval method, system, device and medium

Publications (1)

Publication Number: CN112650878A (en); Publication Date: 2021-04-13

Family

ID=75343656

Family Applications (1)

Application Number: CN201910966829.XA; Status: Pending; Publication: CN112650878A (en); Priority Date: 2019-10-11; Filing Date: 2019-10-11; Title: Retrieval method, system, device and medium

Country Status (1)

Country: CN (1); Link: CN112650878A (en)

Legal Events

Code: PB01; Title: Publication
Code: SE01; Title: Entry into force of request for substantive examination
