Summary of the invention
In view of the drawbacks of the prior art, the purpose of the present invention is to provide an intelligent semantic matching method based on phonetic (pinyin) conversion. During semantic matching, a wrong word in a sentence is converted to pinyin and then revised to the correct word, thereby correcting homophone errors in the sentence, so that semantic matching can still be carried out after error correction even when the original text contains errors.
To achieve the above object, the present invention provides an intelligent semantic matching method based on phonetic conversion, the method including:
A semantic processing system obtains first target text data;
Semantic matching is performed on the first target text data to obtain first semantic matching result data;
When the first semantic matching result data is empty, the extensive object information and the contextual data in the first target text data are obtained;
Phonetic conversion is performed on the extensive object information in the first target text data to obtain the Pinyin information of the extensive object information;
According to the Pinyin information of the extensive object information, corresponding replacement object information is matched in the lexical information database corresponding to the contextual data;
The first target text data is updated according to the replacement object information to obtain second target text data;
Semantic matching is performed on the second target text data to obtain second semantic matching result data, which is then output.
Preferably, before the semantic processing system obtains the first target text data, the method further includes:
The semantic processing system receives phrase data and performs speech recognition on the phrase data to obtain the first target text data.
Further preferably, the phrase data includes sentence voice data and sentence text data; the semantic processing system receiving the phrase data and performing speech recognition on the phrase data to obtain the first target text data specifically includes:
The speech converter of the semantic processing system receives the phrase data, recognizes the sentence voice data in the phrase data to obtain the sentence text data of the sentence voice data, and inserts the sentence text data of the sentence voice data at the end of the input queue of the semantic processing system;
The interrogator of the semantic processing system monitors data insertion into the input queue and obtains, from the input queue, the sentence text data at the end of the input queue, thereby obtaining the first target text data.
Preferably, performing phonetic conversion on the extensive object information in the first target text data specifically includes:
The extensive object information in the first target text data is split into one or more phrase units, and phonetic conversion is performed on each phrase unit.
Further preferably, the phrase units include binary phrase units and ternary phrase units.
Preferably, performing semantic matching on the first target text data to obtain the first semantic matching result data specifically includes:
Clause generalization processing is performed on the first target text data, and the fixed language information and the extensive object information in the first target text data are extracted;
The first semantic matching result data is obtained according to the fixed language information and the extensive object information.
Further preferably, obtaining the first semantic matching result data according to the fixed language information and the extensive object information specifically includes:
The contextual data of the first target text data is determined according to the fixed language information;
The extensive object information is brought into the contextual data to obtain the first semantic matching result data.
Preferably, matching the corresponding replacement object information in the lexical information database corresponding to the contextual data according to the Pinyin information of the extensive object information specifically includes:
One or more items of word information whose pinyin is identical to the Pinyin information of the extensive object information are searched for in the lexical information database corresponding to the contextual data;
The word information with the highest priority among the one or more items of word information is determined to be the replacement object information.
Further preferably, the lexical information database includes a user data information database and a default lexical information database; before the one or more items of word information identical to the Pinyin information of the extensive object information are searched for in the lexical information database corresponding to the contextual data, the method further includes:
The processor in the semantic processing system obtains local user information data through an application interface in the semantic processing system, and generates the user data information database according to the local user information data;
In addition, default term data is obtained from a server, and the default lexical information database is generated according to the default term data.
With the intelligent semantic matching method based on phonetic conversion provided by the embodiments of the present invention, a wrong word in a sentence can be converted to pinyin and revised to the correct word during semantic matching, thereby correcting homophone errors in the sentence, so that semantic matching can still be carried out after error correction even when the original text contains errors.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
An embodiment of the present invention first provides an intelligent semantic matching method based on phonetic conversion, which is implemented in a semantic processing system and is used to correct homophone errors in sentences. Its method flow chart is shown in Figure 1 and includes the following steps:
Step 101, the semantic processing system obtains first target text data;
Specifically, the semantic processing system can be understood as a system with sentence input, processing and output functions. The semantic processing system includes a speech converter, an input queue, an interrogator and a processor. When the semantic processing system starts, the monitor configured in the system's output page is activated. The monitor loads the configuration file for the voice service, the domain classes and the user configuration file corresponding to each domain, and the output statements used by the semantic processing system under specific conditions, and at the same time starts the speech converter, the input queue, the interrogator and the processor.
Phrase data includes sentence voice data or sentence text data; that is, the user can input phrase data to the semantic processing system by voice or by text. When the user inputs sentence voice data by voice, the speech converter receives the sentence voice data, recognizes it to obtain the sentence text data of the sentence voice data, and inserts that sentence text data at the end of the input queue of the semantic processing system. When the user inputs sentence text data by text, the speech converter inserts the sentence text data input by the user directly at the end of the input queue of the semantic processing system.
The interrogator continuously monitors whether the input queue has a new message, that is, whether sentence text data has entered the queue, and obtains from the input queue the sentence text data at the end of the input queue, thereby obtaining the first target text data. The first target text data can be understood as the original text on which phonetic conversion has not been performed.
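The hand-off between the speech converter, the input queue and the interrogator described above can be sketched as follows. This is only an illustrative sketch: the function names, the `(kind, payload)` tuple and the `recognize` callback are assumptions made for the example and do not come from the embodiment, and the speech recognizer itself is left as a caller-supplied stub. The sketch follows the text literally by taking the item at the end of the queue.

```python
from collections import deque


def speech_converter(input_queue, phrase_data, recognize=None):
    """Insert sentence text data at the end of the input queue.

    phrase_data is a hypothetical (kind, payload) pair, where kind is
    "voice" or "text". recognize is a caller-supplied speech recognizer.
    """
    kind, payload = phrase_data
    if kind == "voice":
        if recognize is None:
            raise ValueError("a recognizer is required for voice input")
        payload = recognize(payload)  # sentence voice data -> sentence text data
    input_queue.append(payload)       # text input is inserted directly


def interrogator_poll(input_queue):
    """Return the sentence text at the end of the queue, or None if it is empty."""
    return input_queue.pop() if input_queue else None
```

In a running system the interrogator would call `interrogator_poll` repeatedly; the value it returns plays the role of the first target text data.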
Step 102, semantic matching is performed on the first target text data to obtain first semantic matching result data;
Specifically, the processor first performs clause generalization processing on the current first target text data and extracts the fixed language information and the extensive object information in the first target text data. It then determines the contextual data corresponding to the first target text data according to the fixed language information, determines the interest point data corresponding to the first target text data according to the extensive object information, and finally obtains the first semantic matching result data according to the contextual data and the interest point data corresponding to the first target text data.
More specifically, after the interrogator obtains the current first target text data, it sends the current first target text data to the processor. The processor performs clause generalization processing on the phrase data according to a syntax rule tree and extracts the fixed language information and the extensive object information in the phrase data. Clause generalization processing can be understood as the process of expanding one sentence into multiple sentence expression forms according to the syntax rule tree and extracting the key elements of the sentence. The key elements extracted in this process include the fixed language information and the extensive object information in the sentence.
In a specific example, the user inputs by voice the sentence voice data "I want to phone Yao Ming". The speech converter performs speech recognition on the sentence voice data and obtains the first target text data "I want to phone Yao Ming". The processor in the semantic processing system first recognizes "make a phone call", then extends the match to recognize "want to make a phone call", and in turn recognizes the clause "I want to make a phone call ...". According to the clause chain "make a phone call" - "want to make a phone call" - "I want to make a phone call" in the preset syntax rule tree, the processor extracts the fixed language information "I want to make a phone call" from the phrase data and takes the word "Yao Ming" that follows "I want to make a phone call" as the extensive object information.
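The widening match in this example can be sketched roughly as below. The embodiment does not specify how the syntax rule tree is stored, so the flat `RULE_CHAIN` list, the English stand-in strings and the `extract` function are all assumptions made for illustration; a real rule tree would hold many branches rather than a single chain.

```python
# Hypothetical rule chain: each entry widens the previous clause, mirroring
# "make a phone call" - "want to make a phone call" - "I want to make a phone call".
RULE_CHAIN = ["phone", "want to phone", "I want to phone"]


def extract(sentence, rule_chain=RULE_CHAIN):
    """Return (fixed_language, extensive_object), or (None, None) if nothing matches."""
    fixed = None
    for pattern in rule_chain:
        if pattern in sentence:
            fixed = pattern  # keep widening while the longer clause still matches
        else:
            break
    if fixed is None:
        return None, None
    end = sentence.index(fixed) + len(fixed)
    # Whatever follows the fixed clause is treated as the extensive object.
    return fixed, sentence[end:].strip()
```

For the sentence "I want to phone Yao Ming", the sketch yields the fixed clause "I want to phone" and the extensive object "Yao Ming".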
Each item of fixed language information can be mapped to one item of contextual data. Here, each item of contextual data can be understood as an independent user behavior scene. While the processor determines the contextual data according to the fixed language information, it also matches, in the interest point library, the interest point data corresponding to the extensive object information in the first target text data. The interest point library can be understood as a database set by the user that stores the matching relationships between interest points and extensive objects in the current semantic processing system. The interest point data in the interest point library can be set by the user as needed. If interest point data corresponding to the extensive object information is matched in the interest point library, the processor directly obtains the interest point data corresponding to the current first target text data. If no interest point data corresponding to the current extensive object information is matched in the interest point library, the processor connects to an external database through an external interface, matches the interest point data corresponding to the extensive object information in the external database, and uses the interest point data from the external database as the interest point data corresponding to the current first target text data.
Finally, the processor brings the interest point data corresponding to the first target text data into the data set of the contextual data to obtain the first semantic matching result. The first semantic matching result can be understood as the matching result obtained from the original text. A semantic matching result can be a matching result within the semantic processing system itself, or a matching result displayed after the semantic processing system opens another application.
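A minimal sketch of this lookup chain is given below, assuming simple in-memory dictionaries. The table names, the `make_call` scene key, the result strings and the `external_lookup` callback are hypothetical placeholders; the embodiment does not describe the storage format of the interest point library or of the contextual data set.

```python
# Hypothetical tables; all keys and values are illustrative only.
SCENES = {"I want to phone": "make_call"}            # fixed language -> contextual data
POI_LIBRARY = {"Yao Ming": "contact:Yao Ming"}       # extensive object -> interest point
SCENE_DATASETS = {"make_call": {"contact:Yao Ming": "dial Yao Ming"}}


def first_semantic_match(fixed, obj, external_lookup=None):
    """Return the semantic matching result, or None when the result is empty."""
    scene = SCENES.get(fixed)
    poi = POI_LIBRARY.get(obj)
    if poi is None and external_lookup is not None:
        poi = external_lookup(obj)  # fall back to the external database
    if scene is None or poi is None:
        return None
    # Bring the interest point into the data set of the contextual data.
    return SCENE_DATASETS.get(scene, {}).get(poi)
```

Returning `None` here stands for the empty first semantic matching result that triggers the phonetic correction in step 103.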
Step 103, when the first semantic matching result data is empty, phonetic conversion is performed on the extensive object information in the first target text data to obtain the Pinyin information of the extensive object information;
Specifically, after the processor brings the interest point data corresponding to the first target text data into the data set of the contextual data, if the interest point data cannot be matched with the contextual data, this indicates that no corresponding result was found when semantic matching was performed on the original text. In this case the semantic matching result is empty, that is, the first semantic matching result data is empty.
When the first semantic matching result data is not empty, a semantic matching result can be obtained from the original text, and the processor directly outputs the current first semantic matching result data. When the first semantic matching result data is empty, no semantic matching result can be obtained from the original text, and the current first target text data needs further processing. The processor then obtains the extensive object information in the first target text data and the contextual data determined according to the fixed language information in the first target text data, and performs phonetic conversion on the extensive object information in the first target text data to obtain the Pinyin information of the extensive object information.
When performing phonetic conversion on the extensive object information, the processor first splits the extensive object information in the first target text data into one or more phrase units and then performs phonetic conversion on each phrase unit. The phrase units include binary phrase units and ternary phrase units. In other words, the phonetic conversion of the extensive object information is a process of converting words to pinyin on the basis of binary phrases and ternary phrases.
In a specific example, the first target text data recognized by the processor is "I want to phone terribly", in which the fixed language information is "I want to make a phone call" and the extensive object information is "terribly". The first semantic matching result data obtained after the processor brings "terribly" into "I want to make a phone call" is empty. The processor therefore obtains the contextual data "make a phone call" corresponding to the fixed language information "I want to make a phone call", together with the extensive object information "terribly", and performs phonetic conversion on the extensive object information. The resulting Pinyin information of the extensive object information is "yao ming".
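The splitting and conversion can be illustrated with the sketch below. Several points are assumptions: the word rendered as "terribly" is taken to be the Chinese word 要命, binary and ternary phrase units are read as two-character and three-character windows, and the tiny `PINYIN` table is a hand-written stand-in for a full toneless pinyin dictionary (a library such as pypinyin could supply one in practice).

```python
# Hand-written toneless pinyin table covering only the characters in the example.
PINYIN = {"要": "yao", "命": "ming", "姚": "yao", "明": "ming"}


def phrase_units(text, sizes=(2, 3)):
    """Split text into binary (2-character) and ternary (3-character) units."""
    units = []
    for n in sizes:
        units.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return units or [text]  # text shorter than two characters stays whole


def to_pinyin(unit):
    """Convert one phrase unit to space-separated toneless pinyin."""
    return " ".join(PINYIN[ch] for ch in unit)
```

Because tones are dropped, 要命 and 姚明 both convert to "yao ming", which is what allows the misrecognized word to be matched against the intended name.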
Step 104, according to the Pinyin information of the extensive object information, corresponding replacement object information is matched in the lexical information database corresponding to the contextual data;
Specifically, after the Pinyin information of the extensive object information has been obtained, the processor first determines, according to the contextual data of the first target text data, the lexical information database corresponding to that contextual data.
The lexical information database includes a user data information database and a default lexical information database. Before the processor matches the corresponding replacement object information in the lexical information database corresponding to the contextual data according to the Pinyin information of the extensive object information, the processor obtains local user information data through an application interface in the semantic processing system and generates the user data information database according to the local user information data. Meanwhile, default term data is obtained from a server, and the default lexical information database is generated according to the default term data. That is, the word information in the lexical information database includes both the local user information data in the user's local applications and some default term data.
Each user data information database and each default lexical information database corresponds to different contextual data. For example, the contextual data corresponding to the "user phone book" user data information database is "make a phone call", and the contextual data corresponding to the "medical terminology" default lexical information database is "health" and "medical treatment". In addition, the word information in each user data information database and each default lexical information database has a priority. The priority can be determined according to the frequency of occurrence of the word information counted during use.
The processor searches the lexical information database corresponding to the contextual data for one or more items of word information whose pinyin is identical to the Pinyin information of the extensive object information, and then determines the item with the highest current priority among them as the replacement object information. The replacement object information can be understood as a word that is a homophone of the extensive object information in the first target text data.
In a specific example, the Pinyin information of the extensive object information in the first target text data is "yao ming", and the contextual data of the first target text data is "make a phone call". The processor determines, according to the contextual data "make a phone call", that the lexical information database is the "user phone book" user data information database, and searches the "user phone book" user data information database for word information whose pinyin is "yao ming". When the processor finds that the "user phone book" user data information database contains the word information "Yao Ming" whose pinyin is "yao ming", "Yao Ming" is used as the current replacement object information.
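The lookup by pinyin and the priority rule can be sketched as follows. The `LEXICAL_DBS` structure, the `make_call` key, the second phone-book entry and the numeric priorities are invented for the illustration; the embodiment only states that each database is tied to contextual data and that priority may reflect usage frequency.

```python
# Hypothetical lexical information databases keyed by contextual data.
LEXICAL_DBS = {
    "make_call": [  # stands in for the "user phone book" database
        {"word": "姚明", "pinyin": "yao ming", "priority": 5},
        {"word": "姚鸣", "pinyin": "yao ming", "priority": 1},
    ],
}


def find_replacement(scene, pinyin):
    """Return the highest-priority word with the given pinyin, or None."""
    candidates = [e for e in LEXICAL_DBS.get(scene, []) if e["pinyin"] == pinyin]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e["priority"])["word"]
```

Restricting the search to the database tied to the contextual data keeps homophones from unrelated scenes out of the candidate list.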
Step 105, the first target text data is updated according to the replacement object information to obtain second target text data;
Specifically, the processor replaces the original extensive object information in the first target text data with the obtained replacement object information, and obtains the second target text data from the original fixed language information in the first target text data and the replacement object information. Relative to the first target text data, the second target text data can be understood as the new text obtained after phonetic conversion.
In a specific example, the first target text data is "I want to phone terribly" and the current replacement object information obtained by the processor is "Yao Ming". The first target text data "I want to phone terribly" is then updated according to the replacement object information "Yao Ming" to obtain the second target text data "I want to phone Yao Ming".
Step 106, semantic matching is performed on the second target text data to obtain second semantic matching result data;
Specifically, after obtaining the second target text data produced by phonetic conversion, the processor performs semantic matching on the second target text data, and obtains and outputs the second semantic matching result data. For the process of performing semantic matching on the second target text data, refer to step 102 above; the details are not repeated here.
When the second semantic matching result data is output, the processor in the semantic processing system encapsulates the second semantic matching result data and inserts it at the end of the output queue in the semantic processing system. The interrogator monitors data insertion into the output queue, obtains the second semantic matching result data at the end of the output queue, and outputs it.
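Steps 102 to 106 can be tied together in a short sketch such as the one below. The helper functions are passed in as parameters so that the block stays self-contained; their names and signatures are assumptions, and the behavior when no replacement is found (returning an empty result) is also an assumption, since the embodiment does not describe that case.

```python
def correct_and_match(fixed, obj, match, scene_of, to_pinyin, find_replacement):
    """Return (text, result) for the first or the corrected second target text."""
    first_text = f"{fixed} {obj}"
    result = match(fixed, obj)                       # step 102
    if result is not None:                           # non-empty: output directly
        return first_text, result
    pinyin = to_pinyin(obj)                          # step 103
    replacement = find_replacement(scene_of(fixed), pinyin)  # step 104
    if replacement is None:                          # assumed: give up, result stays empty
        return first_text, None
    second_text = f"{fixed} {replacement}"           # step 105
    return second_text, match(fixed, replacement)    # step 106


# Usage with stub helpers; all data here is illustrative.
contacts = {"Yao Ming": "dial Yao Ming"}
text, result = correct_and_match(
    "I want to phone", "terribly",
    match=lambda fixed, obj: contacts.get(obj),
    scene_of=lambda fixed: "make_call",
    to_pinyin=lambda obj: {"terribly": "yao ming"}.get(obj, obj),
    find_replacement=lambda scene, py: {"yao ming": "Yao Ming"}.get(py),
)
```

With these stubs, `text` is "I want to phone Yao Ming" and `result` is "dial Yao Ming".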
In some preferred embodiments, when performing semantic matching on the first target text data, the processor first records the first time at which the interrogator obtained the first target text data at the end of the input queue, and monitors the system time. The semantic processing time is obtained from the first time and the system time, and can be understood as the time, calculated by the processor, that the semantic processing system has spent processing the first target text data. When the processor detects that the semantic processing time is greater than a preset time, the processor outputs a feedback sentence. The preset time can be the maximum time that the user, as needed, allows for waiting for a semantic matching result. When the processor detects that the semantic processing time is greater than the preset time, the time the semantic processing system has spent processing the first target text data has exceeded the maximum allowed waiting time, and the processor outputs a feedback sentence such as "I need more information" to inform the user that the current semantic matching process has timed out.
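The timeout check can be sketched as below. The function name and parameters are hypothetical, and the use of a monotonic clock is a choice made for the sketch rather than something stated in the embodiment, which only refers to the system time.

```python
import time


def check_timeout(first_time, preset_seconds, now=None):
    """Return the feedback sentence if processing has exceeded the preset time.

    first_time is the moment the interrogator fetched the first target text
    data; now defaults to the current monotonic clock reading.
    """
    now = time.monotonic() if now is None else now
    semantic_processing_time = now - first_time
    if semantic_processing_time > preset_seconds:
        return "I need more information"
    return None
```

The caller would record `first_time = time.monotonic()` when the text is fetched and call `check_timeout` periodically while matching is in progress.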
With the intelligent semantic matching method based on phonetic conversion provided by the embodiments of the present invention, a wrong word in a sentence can be converted to pinyin and revised to the correct word during semantic matching, thereby correcting homophone errors in the sentence, so that semantic matching can still be carried out after error correction even when the original text contains errors.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a user terminal, or in a combination of the two. The software module can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further explain the purpose, technical solution and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.