CN114625845A - Information retrieval method, intelligent terminal and computer readable storage medium - Google Patents


Info

Publication number
CN114625845A
Authority
CN
China
Prior art keywords
sentence
target
word
replacement
social
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011444290.0A
Other languages
Chinese (zh)
Other versions
CN114625845B (en
Inventor
王妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd
Priority to CN202011444290.0A
Publication of CN114625845A
Application granted
Publication of CN114625845B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses an information retrieval method, an intelligent terminal and a computer-readable storage medium. The method comprises: acquiring an instruction sentence sent by a user; determining the replacement words corresponding to the instruction sentence according to the target words in the instruction sentence; and determining and outputting the target information corresponding to the instruction sentence according to the replacement words. The invention can improve the accuracy of information retrieval results.

Description

Translated from Chinese
Information retrieval method, intelligent terminal and computer-readable storage medium

Technical field

The present invention relates to the field of computer technology, and in particular to an information retrieval method, an intelligent terminal and a computer-readable storage medium.

Background

With the development of natural language processing technology, users can issue many kinds of instructions by voice or text, such as powering a device on or off or opening a specific application. Such instructions usually contain few words and follow fixed patterns: a user who wants to power the device on or off generally says "power on" or "power off", and a user who wants to open software A generally says "open software A".

For information retrieval by voice or text, however, the terminal can only look up information according to the target words that appear in the user's instruction, so the results are crude and often fail to meet the user's needs. For example, if a user wants to listen to a band's songs and issues the instruction "find songs by a certain band", the terminal may recognize it normally. But the band may have an abbreviation, a transliterated name, a translated name or a nickname, and may have been known by different names in different periods. If the user refers to the band by its abbreviation, the terminal may fail to recognize it. Therefore, when the user's instruction is a retrieval-related instruction, referred to here as an instruction sentence, the terminal often cannot recognize it precisely or provide effective retrieval results.

Summary of the invention

The main purpose of the present invention is to provide an information retrieval method, an intelligent terminal and a computer-readable storage medium, aiming to solve the problem in the prior art that effective retrieval results cannot be provided according to the user's instruction sentence.

To achieve the above purpose, the present invention provides an information retrieval method comprising the following steps:

acquiring an instruction sentence sent by a user;

determining, according to the target words in the instruction sentence, the replacement words corresponding to the instruction sentence;

determining and outputting, according to the replacement words, the target information corresponding to the instruction sentence.

Optionally, in the information retrieval method, determining the replacement words corresponding to the instruction sentence according to the target words in the instruction sentence specifically comprises:

determining the target words in the instruction sentence according to a preset target lexicon;

determining the replacement words corresponding to each target word according to a preset replacement lexicon.

Optionally, in the information retrieval method, determining the target words in the instruction sentence according to the preset target lexicon specifically comprises:

performing word segmentation on the instruction sentence to generate a plurality of text strings;

determining the target words among the text strings according to the target word similarity values between the text strings and the keywords in the preset target lexicon.

Optionally, in the information retrieval method, the replacement words include hypernyms, hyponyms, synonyms and near-synonyms of the keywords in the target lexicon; before determining the replacement words corresponding to each target word according to the preset replacement lexicon, the method further comprises:

for each keyword, taking the hypernyms, hyponyms, synonyms and near-synonyms of that keyword, as given by a preset established knowledge base, as its replacement words.

Optionally, in the information retrieval method, the replacement words include aliases of the keywords in the target lexicon; before determining the replacement words corresponding to each target word according to the preset replacement lexicon, the method further comprises:

collecting social sentences and clustering them to generate a plurality of social sentence groups;

determining the reference sentence in each social sentence group according to a preset reference sentence rule;

for each social sentence group, determining, according to the reference sentence in that group, the aliases in the group's social sentences that correspond to the reference string in the reference sentence, where the reference string is a string corresponding to a keyword in the keyword library;

determining, according to the correspondence between the reference string and the keyword, the aliases of the keyword and taking them as its replacement words.

Optionally, in the information retrieval method, determining the reference sentence in each social sentence group according to the preset reference sentence rule specifically comprises:

for each social sentence group, calculating the sentence similarity values between the social sentences in that group;

determining the reference sentence in that group according to the sentence similarity values.

Optionally, in the information retrieval method, determining and outputting the target information corresponding to the instruction sentence according to the replacement words specifically comprises:

combining the replacement words corresponding to the target words to generate a plurality of replacement word sets;

taking the target words as a target word set, and determining and outputting the target information corresponding to the instruction sentence according to the replacement word sets and the target word set.

Optionally, in the information retrieval method, determining and outputting the target information corresponding to the instruction sentence according to the replacement word sets and the target word set specifically comprises:

for each replacement word set, determining the corresponding data information according to the replacement words in that set; and

determining the corresponding data information according to the target words in the target word set;

sorting the data information according to a preset sorting rule to generate the target information corresponding to the instruction sentence, and outputting it.
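The retrieve-then-sort steps above can be sketched as follows. The text does not state the preset sorting rule, so this sketch assumes one: a result ranks higher the more word sets retrieve it. The `make_search` helper, the toy documents and the all-words-must-match retrieval are likewise hypothetical stand-ins for whatever search backend the terminal uses.

```python
from collections import Counter


def make_search(documents):
    """Build a toy search function: a document matches if it contains every word in the set."""
    tokenised = {doc_id: set(text.split()) for doc_id, text in documents.items()}

    def search(word_set):
        return [doc_id for doc_id, tokens in tokenised.items()
                if all(word in tokens for word in word_set)]

    return search


def retrieve_and_rank(target_set, replacement_sets, search):
    """Query with the target word set and every replacement word set, then rank the hits."""
    scores = Counter()
    for word_set in [target_set, *replacement_sets]:
        for doc_id in search(word_set):
            scores[doc_id] += 1
    # Assumed sorting rule: more matching word sets first, ties broken by document id.
    return [doc_id for doc_id, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


documents = {
    "doc1": "B latest album",
    "doc2": "band-A latest album review",
    "doc3": "C concert",
    "doc4": "B band-A album",
}
search = make_search(documents)
print(retrieve_and_rank(["band-A", "album"], [["B", "album"], ["C", "album"]], search))
```

Counting hits across word sets is only one plausible rule; a real system might instead weight the target word set above the replacement word sets or use the backend's own relevance score.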

In addition, to achieve the above purpose, the present invention further provides an intelligent terminal comprising a memory, a processor, and an information retrieval program stored in the memory and executable on the processor, where the information retrieval program, when executed by the processor, implements the steps of the information retrieval method described above.

In addition, to achieve the above purpose, the present invention further provides a computer-readable storage medium storing an information retrieval program, where the information retrieval program, when executed by a processor, implements the steps of the information retrieval method described above.

After acquiring the instruction sentence, the present invention does not follow the conventional approach of retrieving information directly from the target words, which are the words that carry the meaning of the instruction sentence. Instead, it first determines the possible replacement words for the target words in the instruction sentence, and then looks up the corresponding target information using both the target words and the replacement words. Substituting a replacement word into the instruction sentence does not change how the sentence is understood, but it does widen the scope of retrieval, so more accurate retrieval results can be provided.

Description of drawings

FIG. 1 is a flowchart of a preferred embodiment of the information retrieval method of the present invention;

FIG. 2 is a flowchart of step S200 in the preferred embodiment of the information retrieval method of the present invention;

FIG. 3 is a flowchart of step S210 in the preferred embodiment of the information retrieval method of the present invention;

FIG. 4 is a schematic diagram of the operating environment of a preferred embodiment of the intelligent terminal of the present invention.

Detailed description

To make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.

As shown in FIG. 1, the information retrieval method of the preferred embodiment of the present invention comprises the following steps:

Step S100: acquire the instruction sentence sent by the user.

Specifically, in this embodiment the information retrieval method is executed by assistant software installed on the intelligent terminal. A user can issue instructions through the terminal's physical keyboard, virtual keyboard or microphone, for example "find the latest album of band A". When the instruction sentence is spoken, the terminal captures it through the microphone, uses speech recognition to convert it into a text instruction sentence for ease of processing, and stores it locally. The assistant software then reads the instruction sentence from local storage. The assistant software may be running continuously or dormant; if it is dormant, the terminal wakes it up once the instruction sentence has been acquired.

Step S200: determine, according to the target words in the instruction sentence, the replacement words corresponding to the instruction sentence.

Specifically, after acquiring the instruction sentence, the assistant software first determines its target words. A target word is a word that plays a key role in understanding the instruction sentence. A replacement lexicon is preset, storing the possible replacement words for each word. A replacement word is a word that can be substituted for the corresponding word in the instruction sentence without changing the meaning of the whole sentence. Because the replacement lexicon maps words to their replacement words, this mapping and the target words in the instruction sentence together determine the replacement words corresponding to the instruction sentence.

Further, referring to FIG. 2, step S200 comprises:

Step S210: determine the target words in the instruction sentence according to a preset target lexicon.

Specifically, a target lexicon containing a plurality of keywords is preset. In the first implementation of this embodiment, target words are determined as follows: the instruction sentence is traversed in sentence order against the keywords of the target lexicon, and whenever a string in the instruction sentence exactly matches a keyword in the target lexicon, that string is taken as a target word of the instruction sentence.
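A minimal sketch of this first implementation, assuming the target lexicon is a plain set of strings and that "exact match" means substring containment. The function name and both lexicons are hypothetical. The second call shows how exact matching can pick up a keyword that merely straddles two real words.

```python
def find_exact_targets(sentence: str, lexicon: set[str]) -> list[str]:
    """Return lexicon keywords found verbatim in the sentence, in sentence order."""
    hits = [(sentence.find(keyword), keyword) for keyword in lexicon if keyword in sentence]
    return [keyword for _, keyword in sorted(hits)]


# Hypothetical target lexicon for the running example "find the latest album of band A".
music_lexicon = {"查找", "A乐队", "最新", "专辑"}
print(find_exact_targets("查找A乐队的最新专辑", music_lexicon))

# Pitfall: "室内" (indoor) matches inside "办公室内有电池" even though the
# sentence is about an office ("办公室"), not about anything being indoors.
print(find_exact_targets("办公室内有电池", {"室内", "电池"}))
```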

Further, word boundaries in Chinese cannot be determined from character order alone. For example, reading "办公室内有电池" ("there is a battery in the office") strictly in order would split it into "办公" (work), "室内" (indoor), "有电" (has power) and "池" (pool), which does not reflect the real meaning of the sentence at all. The first implementation is therefore seriously limited. Referring to FIG. 3, to improve the accuracy of target word identification, this embodiment determines the target words as follows:

Step S211: perform word segmentation on the instruction sentence to generate a plurality of text strings.

Specifically, the instruction sentence is first segmented into a plurality of text strings. Word segmentation means splitting text into its smallest units of expression. This embodiment preferably uses a statistical word segmentation method, such as one based on conditional random fields, hidden Markov models or deep learning. Using such a method, the instruction sentence is split into a plurality of text strings; for example, "查找A乐队的最新专辑" ("find the latest album of band A") is split into "查找" (find), "A乐队" (band A), "的" (of), "最新" (latest) and "专辑" (album).
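To make the segmentation step concrete, here is a small sketch that uses forward maximum matching against a dictionary. This is a deliberately simpler technique than the statistical methods (CRF, HMM, deep learning) the embodiment prefers; it is swapped in only because it fits in a few lines. The function name and the dictionaries are assumptions for illustration.

```python
def forward_max_match(sentence: str, dictionary: set[str]) -> list[str]:
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    max_len = max([len(word) for word in dictionary] + [1])
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest window first; an unknown single character becomes its own token.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + length]
            if length == 1 or piece in dictionary:
                tokens.append(piece)
                i += length
                break
    return tokens


print(forward_max_match("查找A乐队的最新专辑", {"查找", "A乐队", "的", "最新", "专辑"}))
print(forward_max_match("办公室内有电池", {"办公室", "办公", "室内", "有", "电池"}))
```

With "办公室" (office) in the dictionary, the second sentence is split as office / in / has / battery rather than into the misleading pieces that strictly sequential matching produces.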

Step S212: determine the target words among the text strings according to the target word similarity values between the text strings and the keywords in the preset target lexicon.

Specifically, a target word similarity value is then calculated between each text string and each keyword in the preset target lexicon. There are many ways to calculate this value. A common one is similarity based on word vectors: the text strings and the keywords in the target lexicon are first converted to vectors, for example with word2vec, and because vectors have magnitude and direction, the gap between two of them can be measured. The gap is usually described by the word mover's distance or the cosine value. The larger the word mover's distance, the farther apart the two are and the smaller the similarity value; the smaller the distance, the closer the two are and the larger the similarity value. The cosine value describes the similarity of two vectors by the cosine of the angle between them. The closer the cosine value is to 1, the closer the angle is to 0 degrees, the more similar the vectors are, and the larger the similarity value.
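The cosine measure described above can be written in a few lines. This sketch works on plain Python lists; in practice the vectors would come from a trained embedding model such as word2vec, which is not shown here. Returning 0.0 for a zero vector is a choice made for this sketch.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between u and v; 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```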

In this embodiment, not every word in the instruction sentence is necessarily a target word (for example, "的" ("of") in the example sentence), so a target word similarity threshold is preset. For each text string, the target word similarity values between that string and every keyword in the target lexicon are calculated, and the string is taken as a target word if its largest similarity value exceeds the threshold.
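The thresholding rule can be sketched as below. The similarity function is passed in as a parameter, and the toy exact-match similarity and the threshold of 0.8 are placeholders rather than values from the source. The function returns each selected text string together with its best-matching keyword, which is an added convenience for the later lexicon lookup.

```python
def select_target_words(tokens, keywords, similarity, threshold=0.8):
    """Map each token to its best-matching keyword if the best score exceeds the threshold."""
    targets = {}
    for token in tokens:
        scored = [(similarity(token, keyword), keyword) for keyword in keywords]
        if not scored:
            continue
        best_score, best_keyword = max(scored)
        if best_score > threshold:
            targets[token] = best_keyword
    return targets


def toy_similarity(a: str, b: str) -> float:
    """Placeholder similarity: 1.0 for identical strings, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


tokens = ["查找", "A乐队", "的", "最新", "专辑"]
keywords = ["查找", "A乐队", "最新", "专辑"]
print(select_target_words(tokens, keywords, toy_similarity))
```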

Step S220: determine the replacement words corresponding to each target word according to a preset replacement lexicon.

Specifically, a replacement lexicon is preset that contains the replacement words for every keyword in the target lexicon. A replacement word is a word that can replace the keyword in a sentence without changing the meaning of the sentence. For example, replacement words for "find" above may be "look for" and "retrieve". Each replacement word is associated with its keyword in the target lexicon. Because the target words are determined through the target lexicon and each has a corresponding keyword there, once a target word has been determined, its replacement words can be looked up in the replacement lexicon.

Further, the replacement words include hypernyms, hyponyms, synonyms and near-synonyms of the keywords in the target lexicon. Before step S220, the method further comprises: for each keyword, taking the hypernyms, hyponyms, synonyms and near-synonyms of that keyword, as given by a preset established knowledge base, as its replacement words.

Specifically, the hypernym-hyponym relation is a linguistic concept. A more general word is called a hypernym of a more specific word, and the more specific word is called a hyponym of the more general word. Near-synonyms are words with similar meanings, while synonyms are words with exactly the same meaning. For example, the hypernyms of "video" include "audio", the hyponyms of "audio" include "video", and the synonyms and near-synonyms of "find" include "look for" and "seek". An established knowledge base is preset. In this embodiment it records the currently known hypernym, hyponym, synonym and near-synonym relations between words, forming a network of related words around each keyword. From these relations, the hypernyms, hyponyms, synonyms and near-synonyms of each keyword in the target lexicon can be determined, taken as that keyword's replacement words, and stored in the replacement lexicon.
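One way to picture how the replacement lexicon is derived from the knowledge base is the sketch below. The knowledge base layout (a dictionary of relation lists per word), the relation names and the sample entries are all assumptions made for illustration; the source does not specify a storage format.

```python
RELATIONS = ("hypernyms", "hyponyms", "synonyms", "near_synonyms")


def build_replacement_lexicon(keywords, knowledge_base):
    """Collect every related word of each keyword as its replacement words."""
    lexicon = {}
    for keyword in keywords:
        relations = knowledge_base.get(keyword, {})
        words = []
        for relation in RELATIONS:
            for word in relations.get(relation, []):
                if word != keyword and word not in words:
                    words.append(word)  # keep first-seen order, skip duplicates
        lexicon[keyword] = words
    return lexicon


# Hypothetical knowledge base entries.
knowledge_base = {
    "find": {"synonyms": ["look for"], "near_synonyms": ["retrieve", "look for"]},
    "album": {"hypernyms": ["music collection"], "synonyms": ["record"]},
}
print(build_replacement_lexicon(["find", "album", "latest"], knowledge_base))
```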

Further, the replacement words include aliases of the target words. Before step S220, the method further comprises:

Step A10: collect social sentences and cluster them to generate a plurality of social sentence groups.

Specifically, hypernyms, hyponyms, synonyms and near-synonyms are all derived from established relations between words. With the growth of the internet, however, the meanings of many words are changing rapidly, and many of these changes originate on social networks. For example, "band A" may be abbreviated as "B" and transliterated as "C", and both "B" and "C" are equivalent to "band A". A large number of social sentences are therefore collected first.

After collection, the social sentences are clustered. The preferred approach in this embodiment is to first classify the social sentences by domain using a supervised method, which yields several domain-based social sentence groups, and then cluster the sentences within each domain-based group using a supervised or unsupervised method. Domains are distinguished because the same word can mean different things in different domains. For example, "纲目" means an outline or detailed rules in literature, whereas in biology it refers to two ranks (class and order) of Linnaean taxonomy. The social sentences are therefore divided by domain first. The domain can be derived from the platform on which a sentence was collected, for example a medical, astronomy or entertainment platform. The sentences within one domain are then grouped by supervised or unsupervised clustering to obtain a plurality of social sentence groups.
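The two-stage grouping can be sketched as follows, under two assumptions that are not in the source: each social sentence arrives already labelled with its domain (standing in for the platform-based division), and the within-domain step is a simple greedy pass that puts a sentence into the first cluster whose first member shares enough tokens with it. A production system would more likely use a proper clustering algorithm over sentence embeddings.

```python
from collections import defaultdict


def token_overlap(a, b):
    """Share of distinct tokens that two segmented sentences have in common."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cluster_social_sentences(labelled_sentences, threshold=0.5):
    """Stage 1: split by domain label. Stage 2: greedy clustering inside each domain."""
    by_domain = defaultdict(list)
    for domain, tokens in labelled_sentences:
        by_domain[domain].append(tokens)

    groups = {}
    for domain, sentences in by_domain.items():
        clusters = []
        for tokens in sentences:
            for cluster in clusters:
                if token_overlap(tokens, cluster[0]) >= threshold:
                    cluster.append(tokens)
                    break
            else:
                clusters.append([tokens])  # no similar cluster yet, so start a new one
        groups[domain] = clusters
    return groups


sentences = [
    ("entertainment", ["A乐队", "新", "专辑", "好听"]),
    ("entertainment", ["B", "新", "专辑", "好听"]),
    ("entertainment", ["今晚", "电影", "首映"]),
    ("medical", ["感冒", "多", "喝水"]),
]
for domain, clusters in cluster_social_sentences(sentences).items():
    print(domain, clusters)
```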

Step A20: determine the reference sentence in each social sentence group according to a preset reference sentence rule.

Specifically, a reference sentence is a sentence in the social sentence group that contains a string corresponding to a keyword in the keyword library. Because the social sentences in a group are related to one another, if one social sentence (the reference sentence) contains a keyword, similar social sentences are likely to contain an alias of that keyword. The reference sentence can therefore be used to determine which aliases of the keyword appear in the other social sentences of the group.

Further, step A20 comprises:

Step A21: for each social sentence group, calculate the sentence similarity values between the social sentences in that group.

Specifically, taking each social sentence group as a unit, the sentence similarity values between the social sentences in the same group are calculated. Sentence similarity is calculated in much the same way as the keyword similarity described above. The difference is that a word's vector is small and quick to compute with, whereas a sentence is longer, so its representation generally takes the form of a matrix and the computation is more complex. Available methods include the Jaccard coefficient and edit distance, which are not detailed here.
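For reference, the two measures named above can be sketched as follows. Both operate on characters here, which is a simplifying assumption; they could equally be applied to segmented tokens. The edit distance shown is the standard Levenshtein distance.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient of the two character sets: |intersection| / |union|."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


print(jaccard("abc", "bcd"))
print(edit_distance("kitten", "sitting"))
```

A higher Jaccard coefficient means more similar sentences, while a higher edit distance means less similar ones, so the two are not interchangeable without normalising and inverting one of them.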

Step A22: determine the reference sentence in the social sentence group according to the sentence similarity values.

Specifically, the sentence similarity values are compared, and the social sentence whose similarity values with the other social sentences in the group are all relatively high is selected as the reference sentence.
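One plausible reading of "relatively high similarity with all the others" is to pick the sentence with the highest average similarity to the rest of its group, which the sketch below does. The averaging rule and the character-level Jaccard similarity used in the example are assumptions; the source does not fix either.

```python
def pick_reference(sentences, similarity):
    """Return the sentence with the highest mean similarity to the other sentences."""
    best_sentence, best_score = None, -1.0
    for i, sentence in enumerate(sentences):
        others = [similarity(sentence, other)
                  for j, other in enumerate(sentences) if j != i]
        score = sum(others) / len(others) if others else 0.0
        if score > best_score:
            best_sentence, best_score = sentence, score
    return best_sentence


def char_jaccard(a: str, b: str) -> float:
    """Character-set Jaccard coefficient, used here only as a stand-in similarity."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


group = ["abc", "abcd", "abcde", "xyz"]
print(pick_reference(group, char_jaccard))
```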

Note that, depending on how finely the sentences are grouped, a social sentence group may contain several smaller subgroups. Taking medicine as an example, a medical social sentence group can be divided into traditional Chinese medicine and Western medicine; Western medicine can be divided into clinical medicine and basic medicine; and basic medicine can be divided into medical biochemistry, human immunology and so on. Each lowest-level social sentence group is contained in the group one level above it. The reference sentence is generally selected within a lowest-level social sentence group.

Step A30: for each social sentence group, determine, according to the reference sentence in that group, the aliases in the group's social sentences that correspond to the reference string in the reference sentence, where the reference string is a string corresponding to a keyword.

Specifically, once the reference sentence of each social sentence group has been determined, the method described above for finding target words in an instruction sentence is applied to it: using the keywords in the keyword library, the string in the reference sentence that equals a keyword is identified and taken as the reference string. Each social sentence is then segmented into a plurality of strings, an alias similarity value is calculated between each string and the reference string, and every string whose alias similarity value exceeds a preset alias similarity threshold is taken as an alias of the reference string.
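The alias extraction step can be sketched as below. How the alias similarity value is computed is not specified in the source, so the similarity function is a parameter, and the example plugs in a hand-written score table that is purely hypothetical (in practice the scores might come from embeddings of the strings in context). The threshold of 0.7 is likewise a placeholder.

```python
def extract_aliases(reference_string, segmented_sentences, similarity, threshold=0.7):
    """Collect strings whose similarity to the reference string exceeds the threshold."""
    aliases = []
    for tokens in segmented_sentences:
        for token in tokens:
            if token == reference_string or token in aliases:
                continue  # skip the reference string itself and repeats
            if similarity(token, reference_string) > threshold:
                aliases.append(token)
    return aliases


# Hypothetical alias similarity scores; any pair not listed is treated as dissimilar.
SCORES = {("B", "A乐队"): 0.9, ("C", "A乐队"): 0.85}


def table_similarity(a: str, b: str) -> float:
    return SCORES.get((a, b), 0.1)


group = [
    ["A乐队", "新", "专辑"],
    ["B", "的", "新", "专辑"],
    ["C", "新", "专辑"],
]
print(extract_aliases("A乐队", group, table_similarity))
```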

步骤A40,根据所述参考字符串与所述关键词之间的对应关系,确定所述关键词对应的别称并作为对应的替换词。Step A40, according to the correspondence between the reference character string and the keyword, determine the alias corresponding to the keyword and use it as a corresponding replacement word.

具体地,由于参考字符串与关键词是对应关系,因此,确定参考字符串对应的别名后,根据其与关键词之间的对应关系,可确定关键词对应的别名,并将该别名作为关键词对应的替换词。Specifically, since the reference string and the keyword are in a corresponding relationship, after the alias corresponding to the reference string is determined, according to the corresponding relationship between the reference string and the keyword, the alias corresponding to the keyword can be determined, and the alias is used as the key The replacement word for the word.

Step S300: determine and output the target information corresponding to the instruction sentence according to the replacement words and the target words.

Specifically, suppose the replacement words corresponding to the keywords have been determined; for example, the replacement words for "A band" above are "B" and "C". Data retrieval is then performed with both the replacement words and the target words, so as to determine and output the target information corresponding to the instruction sentence.

Further, step S300 includes:

Step S310: combine the replacement words corresponding to each target word to generate multiple replacement word sets.

Specifically, take the target words "find", "A band", "latest" and "album" as an example. The replacement words for "find" include "look for" and "retrieve"; the replacement words for "A band" include "B" and "C"; the replacement word for "latest" is "recent"; and the replacement words for "album" are "LP" and "music collection". One replacement word for "find" is selected at random, for example "look for"; one replacement word for "A band" is selected at random, for example "B"; and "recent" and "LP" are selected as candidates. "Look for", "B", "recent" and "LP" are then combined into one replacement word set. Because different replacement words can be selected for the combination, the resulting sets differ, so multiple replacement word sets are generated.
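The combination in step S310 can be sketched as below. The text describes picking one replacement word per target word at random and repeating; this sketch instead enumerates every combination with `itertools.product`, which is an assumption chosen to make the result deterministic. The function name and the sample data are illustrative only.

```python
from itertools import product
from typing import Dict, List, Tuple


def build_replacement_sets(replacements: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """Return every combination that takes one replacement word per target word."""
    return list(product(*replacements.values()))


# Illustrative data modelled on the example in the text.
replacements = {
    "find": ["look for", "retrieve"],
    "A band": ["B", "C"],
    "latest": ["recent"],
    "album": ["LP", "music collection"],
}
replacement_sets = build_replacement_sets(replacements)
```

With these inputs the sketch yields 2 x 2 x 1 x 2 = 8 sets. When the replacement lists are long, sampling a bounded number of combinations (as the random selection in the text suggests) keeps the number of later database queries manageable.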

Step S320: take the target words as a target word set, and determine and output the target information corresponding to the instruction sentence according to the replacement word sets and the target word set.

Specifically, after the multiple replacement word sets have been obtained, the target words are treated as one target word set. Using the target word set and the replacement word sets, the data information corresponding to them is obtained from a pre-connected database and serves as the target information corresponding to the instruction sentence.

Further, because the replacement word sets and the target word set can yield a large amount of data information, step S320 includes:

Step S321: for each replacement word set, determine the corresponding data information according to the replacement words in that set; and determine the corresponding data information according to the target words in the target word set.

Specifically, the target words in the target word set are used as search terms to search the pre-connected database, which determines the data information corresponding to the target word set. Likewise, for each replacement word set, the replacement words in that set are used as search terms to search the pre-connected database, which determines the data information corresponding to that replacement word set. The result is multiple pieces of related data information.
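The retrieval in step S321 can be sketched with a toy in-memory "database". Everything here is hypothetical: the record layout (a title plus a set of tags), the all-terms-must-match rule, and the function names are assumptions made for illustration, since the text only says that each word set is used as search terms against a pre-connected database.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def search(database: List[dict], terms: Iterable[str]) -> List[str]:
    """Return titles of records whose tag set contains every search term."""
    wanted = set(terms)
    return [rec["title"] for rec in database if wanted <= rec["tags"]]


def retrieve(database: List[dict], target_set: Sequence[str],
             replacement_sets: Iterable[Sequence[str]]) -> Dict[str, List[Tuple[str, ...]]]:
    """Query once with the target word set and once per replacement word set."""
    hits: Dict[str, List[Tuple[str, ...]]] = {}
    for word_set in [tuple(target_set)] + [tuple(s) for s in replacement_sets]:
        for title in search(database, word_set):
            # Remember which word set produced each hit, for later ranking.
            hits.setdefault(title, []).append(word_set)
    return hits


database = [
    {"title": "Doc 1", "tags": {"A band", "latest", "album"}},
    {"title": "Doc 2", "tags": {"B", "recent", "record"}},
    {"title": "Doc 3", "tags": {"C", "old", "single"}},
]
hits = retrieve(database, ("A band", "latest", "album"),
                [("B", "recent", "record"), ("C", "recent", "record")])
```

Keeping the originating word set alongside each hit is a design choice of the sketch; it lets a later ranking step weight results by the words that retrieved them.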

Step S322: sort the data information according to a preset sorting rule to generate the target information corresponding to the instruction sentence.

Specifically, the sorting rule is the rule by which the data information is ordered. One option is to calculate a similarity value between each piece of data information and the instruction sentence and to sort by that value. Another option is to assign different weight values to the keywords or replacement words in advance; for example, if a particular alias is the most frequently and most widely used, its weight is increased. After the data information has been obtained, a weight value is calculated for each piece of data information from the weight values of the replacement words or keywords it corresponds to, and the data information is sorted by these weight values to generate the target information corresponding to the instruction sentence.
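The weight-based variant of the sorting rule can be sketched as follows. The text does not say how the per-word weights are combined into a weight for a piece of data information, so summing them is an assumption, as are the default weight of 1.0, the function name, and the sample values.

```python
from typing import Dict, List, Tuple


def rank_by_weight(hits: Dict[str, Tuple[str, ...]],
                   weights: Dict[str, float],
                   default: float = 1.0) -> List[str]:
    """Order records by the summed weight of the words that retrieved them."""
    def score(record: str) -> float:
        # Words without an explicit weight fall back to the default (assumption).
        return sum(weights.get(word, default) for word in hits[record])

    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(hits, key=score, reverse=True)


# Each record maps to the word set that retrieved it (illustrative data).
hits = {
    "Doc 1": ("A band", "latest"),
    "Doc 2": ("B", "recent"),
}
# Suppose the alias "B" is the most widely used name, so it gets a larger weight.
ranking = rank_by_weight(hits, {"B": 3.0})
```

Here "Doc 2" scores 3.0 + 1.0 = 4.0 and "Doc 1" scores 1.0 + 1.0 = 2.0, so the alias-driven result is ranked first.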

Further, after the target information has been obtained, the corresponding instruction is executed according to it. For example, if the target information obtained is the "list of songs in the latest album of A band" mentioned above, the album is played according to that song list. This solution can therefore be combined with instruction execution, so that the corresponding operations are performed in sequence according to the target information.

Further, as shown in FIG. 4, based on the above information retrieval method, the present invention also provides an intelligent terminal that includes a processor 10, a memory 20 and a display 30. FIG. 4 shows only some components of the intelligent terminal; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

In some embodiments, the memory 20 may be an internal storage unit of the intelligent terminal, such as its hard disk or internal memory. In other embodiments, the memory 20 may be an external storage device of the intelligent terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the intelligent terminal. Further, the memory 20 may include both an internal storage unit and an external storage device of the intelligent terminal. The memory 20 is used to store the application software installed on the intelligent terminal and various types of data, such as the program code installed on the intelligent terminal. The memory 20 may also be used to temporarily store data that has been output or is about to be output. In one embodiment, an information retrieval program 40 is stored in the memory 20, and the information retrieval program 40 can be executed by the processor 10 to implement the information retrieval method of the present application.

In some embodiments, the processor 10 may be a Central Processing Unit (CPU), a microprocessor or another data processing chip, and is used to run the program code stored in the memory 20 or to process data, for example to execute the information retrieval method.

In some embodiments, the display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 30 is used to display information on the intelligent terminal and to display a visual user interface. The components 10-30 of the intelligent terminal communicate with one another over a system bus.

In one embodiment, when the processor 10 executes the information retrieval program 40 in the memory 20, the following steps are implemented:

acquiring an instruction sentence sent by a user;

determining, according to the target words in the instruction sentence, the replacement words corresponding to the instruction sentence;

determining and outputting, according to the replacement words, the target information corresponding to the instruction sentence.

Determining the replacement words corresponding to the instruction sentence according to the target words in the instruction sentence specifically includes:

determining the target words corresponding to the instruction sentence according to a preset target word library;

determining the replacement words corresponding to each target word according to a preset replacement word library.

Determining the target words corresponding to the instruction sentence according to the preset target word library specifically includes:

performing word segmentation on the instruction sentence to generate multiple text strings;

determining the target words among the text strings according to the target word similarity values between the text strings and the keywords in the preset target word library.

The replacement words include the hypernyms, hyponyms, synonyms and near-synonyms corresponding to the keywords in the target word library; before determining the replacement words corresponding to each target word according to the preset replacement word library, the steps further include:

for each keyword, taking the hypernyms, hyponyms, synonyms and near-synonyms corresponding to that keyword as its replacement words, according to a preset established knowledge base.

The replacement words include the aliases corresponding to the keywords in the target word library; before determining the replacement words corresponding to each target word according to the preset replacement word library, the steps further include:

acquiring collected social sentences and clustering them to generate multiple social sentence groups;

determining the reference sentence in each social sentence group according to a preset reference sentence rule;

for each social sentence group, determining, according to the reference sentence in that group, the aliases in each social sentence of the group that correspond to the reference string in the reference sentence, where the reference string is the character string corresponding to a keyword in the keyword library;

determining, according to the correspondence between the reference string and the keyword, the alias corresponding to the keyword and using it as the corresponding replacement word.

Determining the reference sentence in each social sentence group according to the preset reference sentence rule specifically includes:

for each social sentence group, calculating the sentence similarity values between the social sentences in that group;

determining the reference sentence in the social sentence group according to the sentence similarity values.

Determining and outputting the target information corresponding to the instruction sentence according to the replacement words specifically includes:

combining the replacement words corresponding to each target word to generate multiple replacement word sets;

taking the target words as a target word set, and determining and outputting the target information corresponding to the instruction sentence according to the replacement word sets and the target word set.

Determining and outputting the target information corresponding to the instruction sentence according to the replacement word sets and the target word set specifically includes:

for each replacement word set, determining the corresponding data information according to the replacement words in that set; and

determining the corresponding data information according to the target words in the target word set;

sorting the data information according to a preset sorting rule to generate and output the target information corresponding to the instruction sentence.
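Among the steps listed above, the reference sentence of a social sentence group is chosen from pairwise sentence similarity values. A minimal sketch of one way to do this follows. It assumes that the reference sentence is the one with the highest mean similarity to the other sentences in the group (a medoid-style choice) and uses `difflib` as a stand-in similarity; neither choice is stated in this passage, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def sentence_similarity(a: str, b: str) -> float:
    """Stand-in sentence similarity in [0, 1]; the metric is an assumption."""
    return SequenceMatcher(None, a, b).ratio()


def pick_reference_sentence(group: List[str]) -> str:
    """Return the sentence with the highest mean similarity to the rest of the group."""
    def mean_similarity(i: int) -> float:
        others = [sentence_similarity(group[i], s)
                  for j, s in enumerate(group) if j != i]
        return sum(others) / len(others) if others else 0.0

    # max() returns the first index on ties; an empty group raises ValueError.
    best = max(range(len(group)), key=mean_similarity)
    return group[best]


reference = pick_reference_sentence(["abc", "abc", "xyz"])
```

The pairwise comparison is quadratic in the group size, which is acceptable for small clusters but would need sampling or an index for large ones.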

The present invention also provides a computer-readable storage medium that stores an information retrieval program; when the information retrieval program is executed by a processor, the steps of the information retrieval method described above are implemented.

Of course, those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program that instructs the relevant hardware (such as a processor or a controller). The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The computer-readable storage medium may be a memory, a magnetic disk, an optical disc, or the like.

It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art can make improvements or modifications based on the above description, and all such improvements and modifications fall within the scope of protection of the appended claims of the present invention.

Claims (10)

CN202011444290.0A | 2020-12-11 | 2020-12-11 | Information retrieval method, intelligent terminal and computer-readable storage medium | Active | CN114625845B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011444290.0A (CN114625845B (en)) | 2020-12-11 | 2020-12-11 | Information retrieval method, intelligent terminal and computer-readable storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011444290.0A (CN114625845B (en)) | 2020-12-11 | 2020-12-11 | Information retrieval method, intelligent terminal and computer-readable storage medium

Publications (2)

Publication Number | Publication Date
CN114625845A (en) | 2022-06-14
CN114625845B (en) | 2025-07-11

Family

ID=81895140

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011444290.0A (Active, CN114625845B (en)) | Information retrieval method, intelligent terminal and computer-readable storage medium | 2020-12-11 | 2020-12-11

Country Status (1)

Country | Link
CN (1) | CN114625845B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117093604A (en) * | 2023-10-20 | 2023-11-21 | CITIC Securities Co., Ltd. | Search information generation method, apparatus, electronic device, and computer-readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101872351A (en) * | 2009-04-27 | 2010-10-27 | Alibaba Group Holding Ltd. | Method, device for identifying synonyms, and method and device for searching by using same
US20110035403A1 (en) * | 2005-12-05 | 2011-02-10 | Emil Ismalon | Generation of refinement terms for search queries
CN102737021A (en) * | 2011-03-31 | 2012-10-17 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Search engine and realization method thereof
CN107704474A (en) * | 2016-08-08 | 2018-02-16 | Huawei Technologies Co., Ltd. | Attribute alignment schemes and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Sheng Guo et al.: "Mining linguistic cues for query expansion: applications to drug interaction search", CIKM '09: Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2 November 2009 (2009-11-02), page 335, XP058596326, DOI: 10.1145/1645953.1645998 *
喻靖民 (Yu Jingmin): "基于词向量的自然语言隐写分析方法研究" [Research on natural language steganalysis methods based on word vectors], 《中国优秀硕士学位论文全文数据库信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology], no. 06, 15 June 2019 (2019-06-15), pages 138 - 47 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117093604A (en) * | 2023-10-20 | 2023-11-21 | CITIC Securities Co., Ltd. | Search information generation method, apparatus, electronic device, and computer-readable medium
CN117093604B (en) * | 2023-10-20 | 2024-02-02 | CITIC Securities Co., Ltd. | Search information generation method, apparatus, electronic device, and computer-readable medium

Also Published As

Publication number | Publication date
CN114625845B (en) | 2025-07-11


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
