CN109460461A - Text matching method and system based on text similarity model - Google Patents


Publication number
CN109460461A
Authority
CN
China
Prior art keywords
text
similarity
default
string
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811344782.5A
Other languages
Chinese (zh)
Inventor
朱钦佩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd
Priority to CN201811344782.5A
Publication of CN109460461A


Abstract

An embodiment of the present invention provides a text matching method based on a text similarity model. The method includes: receiving text information and determining a feature vector of the text information, wherein the feature vector includes at least a text string, text pinyin and a word vector; inputting the feature vector into a text similarity model; obtaining the feature similarity output by the text similarity model; and determining, according to the feature similarity, at least one preset sentence that reaches a preset feature threshold as the matching text of the text information. Embodiments of the present invention also provide a text matching system based on a text similarity model, and a training method and system for the text similarity model. By using a text similarity model that considers feature vectors of multiple dimensions, the embodiments determine the feature similarity between the user's input sentence and each preset sentence in the text similarity model, and thereby determine a relatively accurate matching text.

Description

Text matching method and system based on text similarity model
Technical field
The present invention relates to the field of natural language processing, and in particular to a text matching method and system based on a text similarity model.
Background
Text similarity computation is a basic problem in natural language processing, and many fields rely on text similarity algorithms. In everyday use, because of colloquial phrasing, input-method errors, slips of the hand and the like, a user's description is not as standardized as a document, yet the text still implies the information the user wants. Accurately capturing this weak information requires a text similarity algorithm. For example, a user enters "where is the Yangtze River bridging" when the user really wants to ask "where is the Yangtze River Bridge". How to find "Yangtze River Bridge" in a preset corpus from "where is the Yangtze River bridging" is an important application scenario for text similarity algorithms. As another example, a user says "navigate to North Medical No. 6 Hospital" (a colloquial short name), and the task is to find "Peking University Sixth Hospital" in the preset corpus from that short name. To handle these problems, text similarity is generally measured by counting the similar characters between strings, or a statistical model computes text similarity from the multiple words a user produces in one dialogue, or the data are collected manually.
In the course of implementing the present invention, the inventor found at least the following problems in the related art:
Counting the similar characters between strings solves part of the problem, but it has difficulty identifying similar text caused by misspelling. For example, it rates "Qianye Ramen" (qian ye la mian) as more similar to "Weiqian Ramen" (wei qian la mian) than "Weixian Ramen" (wei xian la mian, literally "dangerous ramen") is, although the pinyin of the latter pair differs by only one letter. Statistical models, in turn, often depend on session sampling tools (such as various input methods and search engines) and have narrow coverage, while manual collection is costly.
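The contrast in this example can be reproduced with a short sketch. The Chinese strings below are reconstructed from the pinyin quoted in the text and are an assumption, as are the helper names `shared_chars` and `edit_distance`. Neither measure is claimed to be the one used by the invention; they only illustrate why character counting and pinyin comparison can disagree.

```python
# Illustrative sketch only: helper names and the Chinese strings are assumptions.

def shared_chars(a: str, b: str) -> int:
    """Count the distinct characters that occur in both strings."""
    return len(set(a) & set(b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


target = ("味千拉面", "weiqianlamian")
qianye = ("千叶拉面", "qianyelamian")
weixian = ("危险拉面", "weixianlamian")

# Character counting: shared characters with the target for each candidate.
print(shared_chars(qianye[0], target[0]), shared_chars(weixian[0], target[0]))
# Pinyin comparison: edit distance to the target for each candidate.
print(edit_distance(qianye[1], target[1]), edit_distance(weixian[1], target[1]))
```

By character count the first candidate shares more characters with the target, while by pinyin the second candidate is a single substitution away from it.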
Summary of the invention
The embodiments of the present invention aim to solve at least the following problems in the prior art: similarity is inaccurate when only the number of similar characters between strings is considered, statistical methods have narrow coverage, and manual collection is costly.
In a first aspect, an embodiment of the present invention provides a training method for a text similarity model, comprising:
receiving a dictionary training set, performing word segmentation on each preset sentence in the dictionary training set, and determining the text string of the preset sentence;
determining, according to the text string of each preset sentence, the word vector corresponding to the text string and the text pinyin corresponding to the text string;
determining the feature vector corresponding to each preset sentence according to the text string, text pinyin and word vector corresponding to that preset sentence, and training the text similarity model.
In a second aspect, an embodiment of the present invention provides a text matching method based on a text similarity model, comprising:
receiving text information and determining the feature vector of the text information, wherein the feature vector includes at least: a text string, text pinyin and a word vector;
inputting the feature vector into the text similarity model;
obtaining the feature similarity output by the text similarity model;
determining, according to the feature similarity, at least one preset sentence that reaches a preset feature threshold as the matching text of the text information.
In a third aspect, an embodiment of the present invention provides a training system for a text similarity model, comprising:
a text string determination program module, configured to receive a dictionary training set, perform word segmentation on each preset sentence in the dictionary training set, and determine the text string of the preset sentence;
a word vector and text pinyin determination program module, configured to determine, according to the text string of each preset sentence, the word vector corresponding to the text string and the text pinyin corresponding to the text string;
a text similarity model training program module, configured to determine the feature vector corresponding to each preset sentence according to the text string, text pinyin and word vector corresponding to that preset sentence, and to train the text similarity model.
In a fourth aspect, an embodiment of the present invention provides a text matching system based on a text similarity model, comprising:
a feature vector determination program module, configured to receive text information and determine the feature vector of the text information, wherein the feature vector includes at least: a text string, text pinyin and a word vector;
a feature vector input program module, configured to input the feature vector into the text similarity model;
a feature similarity acquisition program module, configured to obtain the feature similarity output by the text similarity model;
a text matching program module, configured to determine, according to the feature similarity, at least one preset sentence that reaches a preset feature threshold as the matching text of the text information.
In a fifth aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the training method of the text similarity model and of the text matching method based on the text similarity model of any embodiment of the present invention.
In a sixth aspect, an embodiment of the present invention provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the training method of the text similarity model and of the text matching method based on the text similarity model of any embodiment of the present invention.
The beneficial effects of the embodiments of the present invention are as follows. Training the text similarity model on multiple feature vectors of words gives the model richer parameters and more features, so the determined text similarity is more accurate. Using a text similarity model that considers feature vectors of multiple dimensions to determine the feature similarity between the user's input sentence and each preset sentence in the model then yields a relatively accurate matching text. The preset dictionary is relatively easy to collect and relatively low in cost.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a training method for a text similarity model provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a text matching method based on a text similarity model provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a training system for a text similarity model provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a text matching system based on a text similarity model provided by an embodiment of the present invention.
Detailed description of the embodiments
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 shows a flowchart of a training method for a text similarity model provided by an embodiment of the present invention, which includes the following steps:
S11: receive a dictionary training set, perform word segmentation on each preset sentence in the dictionary training set, and determine the text string of the preset sentence;
S12: determine, according to the text string of each preset sentence, the word vector corresponding to the text string and the text pinyin corresponding to the text string;
S13: determine the feature vector corresponding to each preset sentence according to the text string, text pinyin and word vector corresponding to that preset sentence, and train the text similarity model.
In this embodiment, the method no longer compares only the number of directly similar characters in text strings but introduces new parameters and considers several aspects together, so the text similarity model used needs further training.
For step S11, a dictionary training set is received. The dictionary training set contains words and phrases that a large number of users may use in daily life, for example "Peking University First Affiliated Hospital", "Peking University Second Affiliated Hospital", "Peking University Third Affiliated Hospital", "Peking University Fourth Affiliated Hospital", "KFC", "McDonald's", "Weiqian Ramen", "Pepper Work Mill", "Friendship Bridge", "Shahe Bridge", "Yongding River Bridge", "Zhenyang Bridge", "Yangtze River Bridge", "Caobai River Bridge", and so on. After the dictionary training set is received, word segmentation is performed on each preset sentence in it and the text string of the preset sentence is determined; for example, the text string of "Yangtze River Bridge" is s1 = Yangtze River_Bridge, with the segments joined by an underscore. The word segmentation system may split out either a single character or a whole word.
For step S12, the word vector and text pinyin corresponding to the text string are determined according to the text string of each preset sentence. After step S11, the text string s1 = Yangtze River_Bridge has been determined. Its text pinyin p1 and word vector w1 are then determined from the text string, giving p1 = chang jiang|da qiao and w1 = (0.323, 0.123, ...)(0.564, 0.348, ...). When the text string contains Chinese characters, they are mapped to the corresponding text pinyin; when the text string contains English characters, the text pinyin of an English character is the English character itself.
For step S13, the feature vector corresponding to each preset sentence is determined from its text string, text pinyin and word vector. The feature vector covers the text string feature, the text pinyin feature and the word vector feature of the preset sentence, and the text similarity model is then trained with these feature vectors.
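Steps S11 to S13 can be sketched as follows, assuming word segmentation has already been done. The lookup tables `TOY_PINYIN` and `TOY_VECTORS` and the function names are hypothetical stand-ins for a real pinyin dictionary and trained word embeddings. The vectors keep only the components quoted in the text; the elided components are not filled in.

```python
# Hypothetical toy tables; a real system would use a pinyin dictionary and
# trained word embeddings. Vector values are truncated placeholders.
TOY_PINYIN = {"长": "chang", "江": "jiang", "大": "da", "桥": "qiao"}
TOY_VECTORS = {"长江": (0.323, 0.123), "大桥": (0.564, 0.348)}


def to_pinyin(word):
    """Map Chinese characters to pinyin; an ASCII (English) word maps to itself."""
    if word.isascii():
        return word
    # Characters missing from the toy table are kept unchanged.
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in word)


def build_features(words):
    """Combine the three features of one segmented sentence."""
    return {
        "text": "_".join(words),                          # text string
        "pinyin": "|".join(to_pinyin(w) for w in words),  # text pinyin
        "vectors": [TOY_VECTORS[w] for w in words],       # one vector per word
    }


print(build_features(["长江", "大桥"]))
```

The `_` and `|` separators follow the notation of s1 and p1 in the text; how the three features are encoded for the actual model is not specified there.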
As this embodiment shows, training the text similarity model on multiple feature vectors of words gives the model richer parameters and more features, so the determined text similarity is more accurate.
Fig. 2 shows a flowchart of a text matching method based on a text similarity model provided by an embodiment of the present invention, which includes the following steps:
S21: receive text information and determine the feature vector of the text information, wherein the feature vector includes at least: a text string, text pinyin and a word vector;
S22: input the feature vector into the text similarity model;
S23: obtain the feature similarity output by the text similarity model;
S24: determine, according to the feature similarity, at least one preset sentence that reaches a preset feature threshold as the matching text of the text information.
In this embodiment, the text similarity model trained as described in claim 1 is put to practical use.
For step S21, text information is received. The text information may be obtained by speech recognition on the corresponding device when the user inputs by voice, or may be entered by the user through the input method of the corresponding device. For example, the user enters text through an input method and, because of a shaky hand, carelessness or some other reason, types "Yangtze River bridging". The feature vector of the entered "Yangtze River bridging" is then determined, including the text string, text pinyin and word vector: text string s2 = Yangtze River_bridging, text pinyin p2 = chang jiang|da qiao, word vector w2 = (0.1234, 0.2133, ...)(0.823, 0.234, ...).
For step S22, the feature vector determined in step S21 is input into the text similarity model and compared, feature by feature, with the preset sentences in the text similarity model.
For step S23, after step S22 the feature similarity output by the text similarity model is obtained. The feature similarity includes the feature similarity between the user's input and each preset sentence in the text similarity model.
For step S24, according to the feature similarity determined in step S23, at least one preset sentence that reaches the preset threshold is determined as the matching text of the text information.
As this embodiment shows, a text similarity model that considers feature vectors of multiple dimensions is used to determine the feature similarity between the user's input sentence and each preset sentence in the model, which yields a relatively accurate matching text. The preset dictionary is relatively easy to collect and relatively low in cost.
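The flow of steps S21 to S24 can be sketched as a threshold filter over scored preset sentences. The scoring function `char_jaccard` is only a stand-in for the trained text similarity model, which the text does not specify, and the Chinese strings and the threshold value are assumptions chosen for illustration.

```python
def char_jaccard(a, b):
    """Stand-in similarity: Jaccard overlap of the two character sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def match(query, presets, score_fn, threshold):
    """Score every preset sentence and keep those that reach the threshold."""
    scored = [(p, score_fn(query, p)) for p in presets]
    hits = [(p, s) for p, s in scored if s >= threshold]
    # Highest similarity first.
    return sorted(hits, key=lambda ps: ps[1], reverse=True)


presets = ["长江大桥", "沙河大桥", "友谊桥"]
print(match("长江搭桥", presets, char_jaccard, threshold=0.5))
```

Passing the scoring function as a parameter keeps the thresholding logic independent of how the similarity is actually computed.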
As an implementation, in this embodiment the preset feature threshold includes a preset text threshold, and obtaining the feature similarity output by the text similarity model includes:
when the feature vector includes at least the text string, determining the text similarity between the text information and each preset sentence according to the text string of the text information and the text string of each preset sentence in the text similarity model;
determining the preset sentences whose text similarity exceeds the preset text threshold as a matching string set;
determining the feature similarity between the text information and the preset sentences in the matching string set according to at least the text string, text pinyin and word vector in the feature vector.
In this embodiment, the preset feature threshold includes a preset text threshold, and when the feature vector includes at least the text string, the text similarity between the text information and each preset sentence is determined according to the text string of the text information entered by the user and the text string of each preset sentence in the text similarity model. In other words, one of the several features is used first for a similarity comparison, which yields a smaller matching string set of preset sentences that exceed the preset text threshold.
After the matching string set is determined, the feature similarity between the text information entered by the user and the preset sentences in the matching string set is determined using all the features together.
As this embodiment shows, a single feature is first used to pre-screen the preset sentences of the text similarity model. A relatively small matching string set is filtered out, and the corresponding matching text is then determined with the multiple features, which makes determining the matching text more efficient.
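The two-stage idea can be sketched as below: a cheap text-string comparison narrows the candidates, and only the survivors receive the combined score. Each sentence is represented here as a hypothetical `(text, pinyin)` pair; the overlap measures, the equal weighting in `full_score`, the thresholds and the Chinese strings are all assumptions rather than the method of the invention.

```python
def char_overlap(a, b):
    """Share of distinct characters the two strings have in common."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / (max(len(sa), len(sb)) or 1)


def syllable_overlap(p1, p2):
    """Jaccard overlap of the pinyin syllable sets."""
    s1, s2 = set(p1.split()), set(p2.split())
    return len(s1 & s2) / (len(s1 | s2) or 1)


def full_score(q, p):
    """Placeholder multi-feature score: equal weight on text and pinyin."""
    return 0.5 * char_overlap(q[0], p[0]) + 0.5 * syllable_overlap(q[1], p[1])


def two_stage_match(query, presets, text_threshold, feature_threshold):
    # Stage 1: pre-screen with the text string only.
    candidates = [p for p in presets
                  if char_overlap(query[0], p[0]) > text_threshold]
    # Stage 2: score only the surviving candidates with all features.
    scored = [(p[0], full_score(query, p)) for p in candidates]
    hits = [(t, s) for t, s in scored if s >= feature_threshold]
    return candidates, sorted(hits, key=lambda ts: ts[1], reverse=True)


query = ("长江搭桥", "chang jiang da qiao")
presets = [("长江大桥", "chang jiang da qiao"), ("沙河大桥", "sha he da qiao")]
print(two_stage_match(query, presets, text_threshold=0.5, feature_threshold=0.8))
```

The function returns the candidate set as well as the final hits so that the effect of the pre-screen is visible.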
As an implementation, in this embodiment the preset feature threshold includes a preset pinyin threshold, and obtaining the feature similarity output by the text similarity model includes:
when the feature vector includes at least the text pinyin, determining the pinyin similarity between the text information and each preset sentence according to the text pinyin of the text information and the text pinyin of each preset sentence in the text similarity model;
determining the preset sentences whose pinyin similarity exceeds the preset pinyin threshold as a matching pinyin set;
determining the feature similarity between the text information and the preset sentences in the matching pinyin set according to at least the text string, text pinyin and word vector in the feature vector.
In this embodiment, the preset feature threshold includes a preset pinyin threshold, and when the feature vector includes at least the text pinyin, the pinyin similarity between the text information and each preset sentence is determined according to the text pinyin of the text information entered by the user and the text pinyin of each preset sentence in the text similarity model. As before, one of the several features is used first for a similarity comparison, which yields a smaller matching pinyin set of preset sentences that exceed the preset pinyin threshold.
After the matching pinyin set is determined, the feature similarity between the text information entered by the user and the preset sentences in the matching pinyin set is determined using all the features together.
As this embodiment shows, a single feature is first used to pre-screen the preset sentences of the text similarity model. A relatively small matching pinyin set is filtered out, and the corresponding matching text is then determined with the multiple features, which makes determining the matching text more efficient.
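A pinyin pre-screen might look like the following sketch. It compares syllable sequences with `difflib.SequenceMatcher` from the Python standard library, which is an assumed choice since the text does not say how pinyin similarity is computed. The pinyin given for "Shahe Bridge" and the threshold are likewise assumptions.

```python
from difflib import SequenceMatcher


def pinyin_similarity(p1, p2):
    """Similarity of two pinyin strings, compared syllable by syllable.

    The '|' word separators are treated like spaces.
    """
    s1 = p1.replace("|", " ").split()
    s2 = p2.replace("|", " ").split()
    return SequenceMatcher(None, s1, s2).ratio()


def pinyin_prefilter(query_pinyin, preset_pinyin, threshold):
    """Keep the preset sentences whose pinyin similarity exceeds the threshold."""
    return [name for name, py in preset_pinyin.items()
            if pinyin_similarity(query_pinyin, py) > threshold]


preset_pinyin = {
    "Yangtze River Bridge": "chang jiang|da qiao",
    "Shahe Bridge": "sha he|da qiao",  # assumed pinyin
}
print(pinyin_prefilter("chang jiang|da qiao", preset_pinyin, threshold=0.6))
```

Comparing whole syllables rather than letters means a homophone typo such as the one in "Yangtze River bridging" still scores as a full match.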
As an implementation, in this embodiment the preset feature threshold includes a preset vector threshold, and obtaining the feature similarity output by the text similarity model includes:
when the feature vector includes at least the word vector, determining the vector similarity between the text information and each preset sentence according to the word vector of the text information and the word vector of each preset sentence in the text similarity model;
determining the preset sentences whose vector similarity exceeds the preset vector threshold as a matching vector set;
determining the feature similarity between the text information and the preset sentences in the matching vector set according to at least the text string, text pinyin and word vector in the feature vector.
In this embodiment, the preset feature threshold includes a preset vector threshold, and when the feature vector includes at least the word vector, the vector similarity between the text information and each preset sentence is determined according to the word vector of the text information entered by the user and the word vector of each preset sentence in the text similarity model. As before, one of the several features is used first for a similarity comparison, which yields a smaller matching vector set of preset sentences that exceed the preset vector threshold.
After the matching vector set is determined, the feature similarity between the text information entered by the user and the preset sentences in the matching vector set is determined using all the features together.
As this embodiment shows, a single feature is first used to pre-screen the preset sentences of the text similarity model. A relatively small matching vector set is filtered out, and the corresponding matching text is then determined with the multiple features, which makes determining the matching text more efficient.
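For the word-vector pre-screen, one common choice is cosine similarity between sentence vectors. The sketch below averages the word vectors of a sentence and compares the averages; both the averaging and the cosine measure are assumptions, since the text does not state how vector similarity is computed, and the vectors shown are made-up toy values.

```python
import math


def mean_vector(vectors):
    """Average word vectors component-wise into one sentence vector."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vector_prefilter(query_vecs, preset_vecs, threshold):
    """Keep the preset sentences whose vector similarity exceeds the threshold."""
    q = mean_vector(query_vecs)
    return [name for name, vecs in preset_vecs.items()
            if cosine(q, mean_vector(vecs)) > threshold]


# Toy two-dimensional vectors, invented for illustration only.
presets = {"a": [(1.0, 0.1)], "b": [(0.0, 1.0)]}
print(vector_prefilter([(1.0, 0.0)], presets, threshold=0.5))
```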
As an implementation, in this embodiment, determining, according to the feature similarity, at least one preset sentence that reaches the preset feature threshold as the matching text of the text information includes:
when, in descending order of similarity, only one preset sentence exceeding the preset feature threshold is determined as the matching text of the text information, taking that preset sentence as the matching text of the text information; or
when, in descending order of similarity, at least two preset sentences exceeding the preset feature threshold are determined as matching texts of the text information, sending the at least two preset sentences to the user;
receiving the preset sentence selected by the user;
taking the selected preset sentence as the matching text of the text information.
In this embodiment, the preset sentences that reach the preset feature threshold can be determined as matching texts of the text information in descending order of similarity. When only one matching text is determined, for example when the text information entered by the user is "Yangtze River bridging" and the single matching text determined in descending order of similarity is "Yangtze River Bridge", "Yangtze River Bridge" is taken as the matching text of the "Yangtze River bridging" entered by the user.
When at least two matching texts are determined, for example when the text information entered by the user is "Peking University Hospital" and the matching texts determined by similarity are "Peking University First Hospital", "Peking University Second Hospital", "Peking University Third Hospital", and so on, they are fed back to the user and the preset sentence selected by the user is received. If the user selects "Peking University Third Hospital", for example, the selected preset sentence is taken as the matching text of the text information.
As this embodiment shows, determining a specified number of matching texts gives the user more ways to match, widens the matching range and improves the user experience.
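The selection logic can be sketched as follows. The function name `choose_match` and the `ask_user` callback are hypothetical, and returning `None` when nothing reaches the threshold is an assumption, since the text does not describe that case.

```python
def choose_match(ranked, ask_user):
    """Pick the matching text from preset sentences ranked best first.

    ranked:   preset sentences that exceeded the preset feature threshold
    ask_user: callback that shows the candidates and returns the user's choice
    """
    if not ranked:
        return None               # assumed behaviour, not covered by the text
    if len(ranked) == 1:
        return ranked[0]          # a single match is used directly
    choice = ask_user(ranked)     # several matches: let the user decide
    if choice not in ranked:
        raise ValueError("selection is not one of the offered sentences")
    return choice


hospitals = [
    "Peking University First Hospital",
    "Peking University Second Hospital",
    "Peking University Third Hospital",
]
# Simulate a user who picks the third option.
print(choose_match(hospitals, ask_user=lambda options: options[2]))
```

Using a callback keeps the sketch independent of how the candidates are actually presented, for example on screen or by voice prompt.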
Fig. 3 shows a schematic structural diagram of a training system for a text similarity model provided by an embodiment of the present invention. The system can perform the training method of the text similarity model described in any of the above embodiments and is configured in a terminal.
The training system for a text similarity model provided in this embodiment includes: a text string determination program module 11, a word vector and text pinyin determination program module 12, and a text similarity model training program module 13.
The text string determination program module 11 is configured to receive a dictionary training set, perform word segmentation on each preset sentence in the dictionary training set, and determine the text string of the preset sentence. The word vector and text pinyin determination program module 12 is configured to determine, according to the text string of each preset sentence, the word vector corresponding to the text string and the text pinyin corresponding to the text string. The text similarity model training program module 13 is configured to determine the feature vector corresponding to each preset sentence according to the text string, text pinyin and word vector corresponding to that preset sentence, and to train the text similarity model.
Fig. 4 shows a schematic structural diagram of a text matching system based on a text similarity model provided by an embodiment of the present invention. The system can perform the text matching method based on a text similarity model described in any of the above embodiments and is configured in a terminal.
The text matching system based on a text similarity model provided in this embodiment includes: a feature vector determination program module 21, a feature vector input program module 22, a feature similarity acquisition program module 23 and a text matching program module 24.
The feature vector determination program module 21 is configured to receive text information and determine the feature vector of the text information, wherein the feature vector includes at least: a text string, text pinyin and a word vector. The feature vector input program module 22 is configured to input the feature vector into the text similarity model. The feature similarity acquisition program module 23 is configured to obtain the feature similarity output by the text similarity model. The text matching program module 24 is configured to determine, according to the feature similarity, at least one preset sentence that reaches a preset feature threshold as the matching text of the text information.
Further, the default characteristic threshold value includes pre-set text threshold value, and the characteristic similarity obtains program moduleFor:
When described eigenvector include at least text-string when, according to the text-string of the text information with it is describedThe text-string of each default sentence determines the text of the text information and each default sentence in text similarity modelSimilarity;
The default sentence that the text similarity is more than pre-set text threshold value is determined as matched character string set;
According at least to text-string, the text phonetic, term vector in described eigenvector, determine the text information withThe characteristic similarity of default sentence in the matched character string set.
Further, the default characteristic threshold value includes default phonetic threshold value, and the characteristic similarity obtains program moduleFor:
When described eigenvector includes at least text phonetic, according to the text phonetic of the text information and the textThe text phonetic of each default sentence determines the pinyin similarity of the text information and each default sentence in similarity model;
The pinyin similarity is determined to be more than to preset the default sentence of phonetic threshold value as matching phonetic set;
According at least to text-string, the text phonetic, term vector in described eigenvector, determine the text information withThe characteristic similarity of default sentence in the matching phonetic set.
Further, the default characteristic threshold value includes default vector threshold, and the characteristic similarity obtains program moduleFor:
It is similar to the text according to the term vector of the text information when described eigenvector includes at least term vectorThe term vector of each default sentence determines the vector similarity of the text information and each default sentence in degree model;
The vector similarity is determined to be more than to preset the default sentence of vector threshold as matching vector set;
According at least to text-string, the text phonetic, term vector in described eigenvector, determine the text information withThe characteristic similarity of default sentence in the matching vector set.
Further, the text matching program module is configured to:
When, ranking by similarity from high to low, only one preset sentence is determined to exceed the preset feature threshold as the matching text of the text information, use that preset sentence as the matching text of the text information; or
When, ranking by similarity from high to low, at least two preset sentences are determined to exceed the preset feature threshold as the matching text of the text information, send the at least two preset sentences to the user;
Receive the preset sentence selected by the user;
Use the selected preset sentence as the matching text of the text information.
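The selection logic above can be sketched as follows. The function name, the `ask_user` callback, and the handling of the case where no preset sentence exceeds the threshold (returning `None`) are assumptions; the passage above describes only the one-candidate and multiple-candidate cases.

```python
from typing import Callable, Dict, List, Optional

def select_matching_text(
    similarities: Dict[str, float],
    threshold: float,
    ask_user: Callable[[List[str]], str],
) -> Optional[str]:
    """Pick the matching text from feature similarities of preset sentences."""
    # Rank preset sentences by similarity, highest first.
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    candidates = [sentence for sentence, score in ranked if score > threshold]

    if not candidates:
        return None  # assumption: not covered by the passage above
    if len(candidates) == 1:
        return candidates[0]
    # Two or more candidates: send them to the user and use the selection.
    return ask_user(candidates)

# Usage with a stand-in for the user interaction that picks the first option.
choice = select_matching_text(
    {"set an alarm": 0.91, "set a reminder": 0.84, "play music": 0.12},
    threshold=0.8,
    ask_user=lambda options: options[0],
)
print(choice)
```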
An embodiment of the present invention further provides a non-volatile computer storage medium storing computer-executable instructions that can execute the training method of the text similarity model in any of the above method embodiments;
As an implementation, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
Receive a dictionary training set and perform word segmentation on each preset sentence in the dictionary training set to determine the text string of the preset sentence;
Determine, according to the text string of each preset sentence, the word vector corresponding to the text string and the text pinyin corresponding to the text string;
Determine the feature vector corresponding to each preset sentence according to the text string, text pinyin, and word vector corresponding to that preset sentence, and train the text similarity model.
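The feature construction part of these training steps can be sketched as below. This is only an illustration of assembling the three features per preset sentence; it does not train a model. The segmenter (whitespace splitting), the pinyin lookup table, the word-vector table, and the averaging of word vectors into one sentence vector are all assumptions standing in for components the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class SentenceFeatures:
    tokens: List[str]    # text strings produced by word segmentation
    pinyin: List[str]    # pinyin syllables of the tokens
    vector: List[float]  # sentence vector (assumed: mean of word vectors)

def build_features(sentence, segment, pinyin_table, vector_table, dim):
    """Build the feature record of one preset sentence."""
    tokens = segment(sentence)
    pinyin = [syllable for tok in tokens for syllable in pinyin_table.get(tok, [])]
    vectors = [vector_table[tok] for tok in tokens if tok in vector_table]
    if vectors:
        mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    else:
        mean = [0.0] * dim  # no known word: fall back to a zero vector
    return SentenceFeatures(tokens, pinyin, mean)

def build_feature_index(training_set, segment, pinyin_table, vector_table, dim):
    """Map every preset sentence in the training set to its features."""
    return {
        sentence: build_features(sentence, segment, pinyin_table, vector_table, dim)
        for sentence in training_set
    }

# Hypothetical lookup tables and a pre-segmented (space-separated) sentence.
pinyin_table = {"打开": ["da", "kai"], "空调": ["kong", "tiao"]}
vector_table = {"打开": [1.0, 0.0], "空调": [0.0, 1.0]}
index = build_feature_index(["打开 空调"], str.split, pinyin_table, vector_table, dim=2)
print(index["打开 空调"])
```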
An embodiment of the present invention further provides a non-volatile computer storage medium storing computer-executable instructions that can execute the text matching method based on the text similarity model in any of the above method embodiments;
As an implementation, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
Receive text information and determine the feature vector of the text information, where the feature vector includes at least: text string, text pinyin, and word vector;
Input the feature vector into the text similarity model;
Obtain the feature similarity output by the text similarity model;
Determine, according to the feature similarity, at least one preset sentence that reaches the preset feature threshold as the matching text of the text information.
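The matching flow above can be sketched end to end. The source obtains the feature similarity from a trained text similarity model; the fixed weighted sum used here is a deliberately simple stand-in, not the model of the source. The weights, the threshold of 0.8, the dictionary layout of the features, and all function names are assumptions.

```python
import math
from difflib import SequenceMatcher

def sequence_similarity(a, b):
    """Generic similarity ratio for strings or lists of pinyin syllables."""
    return SequenceMatcher(None, a, b).ratio()

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def feature_similarity(query, preset, weights=(0.4, 0.3, 0.3)):
    """Stand-in for the model output: weighted sum over the three features."""
    w_text, w_pinyin, w_vector = weights
    return (
        w_text * sequence_similarity(query["text"], preset["text"])
        + w_pinyin * sequence_similarity(query["pinyin"], preset["pinyin"])
        + w_vector * cosine_similarity(query["vector"], preset["vector"])
    )

def match_text(query, presets, threshold=0.8):
    """Return (sentence, score) pairs that reach the threshold, best first."""
    scored = [(name, feature_similarity(query, feats)) for name, feats in presets.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical preset sentences with hand-written features.
presets = {
    "打开空调": {"text": "打开空调", "pinyin": ["da", "kai", "kong", "tiao"], "vector": [1.0, 0.0]},
    "播放音乐": {"text": "播放音乐", "pinyin": ["bo", "fang", "yin", "yue"], "vector": [0.0, 1.0]},
}
query = {"text": "打开空调", "pinyin": ["da", "kai", "kong", "tiao"], "vector": [1.0, 0.0]}
matches = match_text(query, presets)
print([name for name, _ in matches])
```

Combining string, pinyin, and vector evidence in one score is what lets a query that is misspelled but phonetically close, or worded differently but semantically close, still reach the threshold.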
As a non-volatile computer-readable storage medium, it can be used to store non-volatile software programs and non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the method of the test software in the embodiments of the present invention. One or more program instructions are stored in the non-volatile computer-readable storage medium and, when executed by a processor, perform the training method of the text similarity model and the text matching method based on the text similarity model in any of the above method embodiments.
The non-volatile computer-readable storage medium may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store data created through use of the test software device, and so on. In addition, the non-volatile computer-readable storage medium may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some embodiments, the non-volatile computer-readable storage medium optionally includes memory located remotely from the processor, and such remote memory can be connected to the test software device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device comprising at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the training method of the text similarity model and of the text matching method based on the text similarity model of any embodiment of the present invention.
The client of the embodiments of the present application exists in a variety of forms, including but not limited to:
(1) Mobile communication devices: these devices are characterized by mobile communication functions, with voice and data communication as their main goal. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (such as the iPod), handheld devices, e-books, smart toys, and portable in-vehicle navigation devices.
(4) Other electronic devices with data processing functions.
In this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include" and "comprise" cover not only the listed elements but also other elements not explicitly listed, as well as elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes that element.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in each embodiment or in certain parts of an embodiment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

CN201811344782.5A | 2018-11-13 | 2018-11-13 | Text matching technique and system based on text similarity model | Pending | CN109460461A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201811344782.5A / CN109460461A (en) | 2018-11-13 | 2018-11-13 | Text matching technique and system based on text similarity model

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201811344782.5A / CN109460461A (en) | 2018-11-13 | 2018-11-13 | Text matching technique and system based on text similarity model

Publications (1)

Publication Number | Publication Date
CN109460461A (en) | 2019-03-12

Family

ID=65610191

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811344782.5A (Pending) / CN109460461A (en) | Text matching technique and system based on text similarity model | 2018-11-13 | 2018-11-13

Country Status (1)

Country | Link
CN (1) | CN109460461A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US8996515B2 (en)* | 2008-06-24 | 2015-03-31 | Microsoft Corporation | Consistent phrase relevance measures
CN103605694A (en)* | 2013-11-04 | 2014-02-26 | 北京奇虎科技有限公司 | Device and method for detecting similar texts
CN104102626A (en)* | 2014-07-07 | 2014-10-15 | 厦门推特信息科技有限公司 | Method for computing semantic similarities among short texts
CN104239512A (en)* | 2014-09-16 | 2014-12-24 | 电子科技大学 | Text recommendation method
CN104699763A (en)* | 2015-02-11 | 2015-06-10 | 中国科学院新疆理化技术研究所 | Text similarity measuring system based on multi-feature fusion
CN106095928A (en)* | 2016-06-12 | 2016-11-09 | 国家计算机网络与信息安全管理中心 | A kind of event type recognition methods and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
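Liang Jingdong et al.: "Sentence similarity calculation based on word2vec and LSTM and its ..." (基于word2vec和LSTM的句子相似度计算及其), Journal of Nanjing Agricultural University (《南京农业大学学报》)*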

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110245606A (en)* | 2019-06-13 | 2019-09-17 | 广东小天才科技有限公司 | Text recognition method, device, equipment and storage medium
CN110245606B (en)* | 2019-06-13 | 2021-07-20 | 广东小天才科技有限公司 | A text recognition method, device, device and storage medium
CN110413988A (en)* | 2019-06-17 | 2019-11-05 | 平安科技(深圳)有限公司 | Method, apparatus, server and the storage medium of text information matching measurement
CN110413988B (en)* | 2019-06-17 | 2023-01-31 | 平安科技(深圳)有限公司 | Text information matching measurement method, device, server and storage medium
CN110390015A (en)* | 2019-07-23 | 2019-10-29 | 中国工商银行股份有限公司 | A kind of data information processing method, apparatus and system
CN110516125A (en)* | 2019-08-28 | 2019-11-29 | 拉扎斯网络科技(上海)有限公司 | Method, device and equipment for identifying abnormal character string and readable storage medium
CN110516125B (en)* | 2019-08-28 | 2020-05-08 | 拉扎斯网络科技(上海)有限公司 | Method, apparatus, device and readable storage medium for identifying abnormal character string
CN110717158A (en)* | 2019-09-06 | 2020-01-21 | 平安普惠企业管理有限公司 | Information verification method, device, equipment and computer readable storage medium
CN110717158B (en)* | 2019-09-06 | 2024-03-01 | 冉维印 | Information verification method, device, equipment and computer readable storage medium
CN111009244A (en)* | 2019-12-06 | 2020-04-14 | 贵州电网有限责任公司 | Voice recognition method and system
CN111159338A (en)* | 2019-12-23 | 2020-05-15 | 北京达佳互联信息技术有限公司 | Malicious text detection method and device, electronic equipment and storage medium
CN111159339A (en)* | 2019-12-24 | 2020-05-15 | 北京亚信数据有限公司 | Text matching processing method and device
CN111753551B (en)* | 2020-06-29 | 2022-06-14 | 北京字节跳动网络技术有限公司 | Information generation method and device based on word vector generation model
WO2022001888A1 (en)* | 2020-06-29 | 2022-01-06 | 北京字节跳动网络技术有限公司 | Information generation method and device based on word vector generation model
CN111753551A (en)* | 2020-06-29 | 2020-10-09 | 北京字节跳动网络技术有限公司 | Information generation method and device based on word vector generation model
CN112000767A (en)* | 2020-07-31 | 2020-11-27 | 深思考人工智能科技(上海)有限公司 | Text-based information extraction method and electronic equipment
CN112000767B (en)* | 2020-07-31 | 2024-07-23 | 深思考人工智能科技(上海)有限公司 | Text-based information extraction method and electronic equipment
WO2022095370A1 (en)* | 2020-11-06 | 2022-05-12 | 平安科技(深圳)有限公司 | Text matching method and apparatus, terminal device, and storage medium
CN113932518A (en)* | 2021-06-02 | 2022-01-14 | 海信(山东)冰箱有限公司 | Refrigerator and food material management method thereof
CN113932518B (en)* | 2021-06-02 | 2023-08-18 | 海信冰箱有限公司 | Refrigerator and food management method thereof
CN117095410A (en)* | 2023-09-05 | 2023-11-21 | 金卫医保信息管理(中国)有限公司 | Text matching method, system and medium based on deep learning model

Legal Events

Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
CB02 | Change of applicant information
    Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province
    Applicant after: Sipic Technology Co.,Ltd.
    Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province
    Applicant before: AI SPEECH Co.,Ltd.
RJ01 | Rejection of invention patent application after publication
    Application publication date: 20190312
