CN109308896A - Method of speech processing and device, storage medium and processor - Google Patents

Method of speech processing and device, storage medium and processor

Info

Publication number
CN109308896A
CN109308896A (application CN201710633042.2A)
Authority
CN
China
Prior art keywords
speech, vector, model, multiple moment, parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710633042.2A
Other languages
Chinese (zh)
Other versions
CN109308896B (en)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Huitong Jinke Data Co ltd
Original Assignee
Kuang Chi Innovative Technology Ltd
Shenzhen Guangqi Hezhong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kuang Chi Innovative Technology Ltd and Shenzhen Guangqi Hezhong Technology Co Ltd
Priority to CN201710633042.2A (patent CN109308896B)
Priority to PCT/CN2018/079848 (WO2019019667A1)
Publication of CN109308896A
Application granted
Publication of CN109308896B
Status: Active
Anticipated expiration

Abstract

The invention discloses a speech processing method and apparatus, a storage medium, and a processor. The method comprises: obtaining speech vectors for multiple moments within a preset time period; processing the speech vectors for the multiple moments with a preset speech model to obtain multiple pieces of text information corresponding to the speech vectors, wherein the preset speech model processes the speech vectors based on pre-stored parameter vectors for the multiple moments; and outputting the multiple pieces of text information. The invention solves the technical problem of low processing efficiency of prior-art speech processing methods.

Description

Method of speech processing and device, storage medium and processor
Technical field
The present invention relates to the field of data processing, and in particular to a speech processing method and apparatus, a storage medium, and a processor.
Background technique
Natural language processing is an important direction in computer science and artificial intelligence. It studies theories and methods for achieving efficient communication between humans and computers in natural language, and it is a science that integrates linguistics, computer science, and mathematics. Research in this field therefore concerns natural language, i.e., the language people use every day, so it is closely connected with linguistic research, though with important differences.
Commonly used natural language processing methods include conditional random fields (CRF), hidden Markov models (HMM), recurrent neural network models (RNN), and long short-term memory (LSTM) models. However, improving processing accuracy requires increasing model depth, which leads to high processing complexity and low processing efficiency.
For the problem of low processing efficiency of prior-art speech processing methods, no effective solution has yet been proposed.
Summary of the invention
The embodiments of the present invention provide a speech processing method and apparatus, a storage medium, and a processor, so as to at least solve the technical problem of low processing efficiency of prior-art speech processing methods.
According to one aspect of the embodiments of the present invention, a speech processing method is provided, comprising: obtaining speech vectors for multiple moments within a preset time period; processing the speech vectors for the multiple moments with a preset speech model to obtain multiple pieces of text information corresponding to the speech vectors for the multiple moments, wherein the preset speech model processes the speech vectors based on pre-stored parameter vectors for the multiple moments; and outputting the multiple pieces of text information.
Further, the preset speech model comprises a speech processing model and a parameter matrix. The parameter matrix pre-stores the parameter vectors for the multiple moments, and the speech processing model processes the speech vectors for the multiple moments based on those parameter vectors to obtain the multiple pieces of text information corresponding to the speech vectors.
Further, processing the speech vectors for the multiple moments with the preset speech model comprises: obtaining first parameter vectors for the multiple moments from the parameter matrix by a read operation; modifying the speech processing model with the first parameter vectors for the multiple moments to obtain a modified speech processing model; and processing the speech vectors for the multiple moments with the modified speech processing model to obtain the multiple pieces of text information.
Further, while the speech vectors for the multiple moments are processed with the modified speech processing model to obtain the multiple pieces of text information, the method further comprises: obtaining second parameter vectors for the multiple moments with the modified speech processing model; and writing the second parameter vectors for the multiple moments into the parameter matrix by a write operation.
Further, obtaining the second parameter vectors for the multiple moments with the modified speech processing model comprises: updating the first parameter vectors for the multiple moments with the modified speech processing model to obtain the second parameter vectors for the multiple moments.
Further, before the speech vectors for the multiple moments are processed with the preset speech model to obtain the multiple pieces of text information, the method further comprises: establishing an initial preset model comprising the speech processing model and an initial parameter matrix; obtaining training data comprising multiple training speech vectors and the text information corresponding to each training speech vector; and training the initial preset model with the training data to obtain the preset speech model.
Further, training the initial preset model with the training data to obtain the preset speech model comprises: inputting the training data into the speech processing model to obtain preset parameter vectors; and writing the preset parameter vectors into the initial parameter matrix by a write operation to obtain the parameter matrix.
Further, the speech processing model is an LSTM model, and the parameter matrix is a memory matrix.
Further, the preset time period is determined according to the processing capability of the preset speech model.
According to another aspect of the embodiments of the present invention, a speech processing apparatus is also provided, comprising: a first obtaining module for obtaining speech vectors for multiple moments within a preset time period; a processing module for processing the speech vectors for the multiple moments with a preset speech model to obtain multiple pieces of text information corresponding to the speech vectors, wherein the preset speech model processes the speech vectors based on pre-stored parameter vectors for the multiple moments; and an output module for outputting the multiple pieces of text information.
Further, the preset speech model comprises a speech processing model and a parameter matrix. The parameter matrix pre-stores the parameter vectors for the multiple moments, and the speech processing model processes the speech vectors for the multiple moments based on those parameter vectors to obtain the multiple pieces of text information.
Further, the processing module comprises: an obtaining submodule for obtaining first parameter vectors for the multiple moments from the parameter matrix by a read operation; a modification submodule for modifying the speech processing model with the first parameter vectors for the multiple moments to obtain a modified speech processing model; and a first processing submodule for processing the speech vectors for the multiple moments with the modified speech processing model to obtain the multiple pieces of text information.
Further, the processing module further comprises: a second processing submodule for obtaining second parameter vectors for the multiple moments with the modified speech processing model; and a first storage submodule for writing the second parameter vectors for the multiple moments into the parameter matrix by a write operation.
Further, the second processing submodule is also used to update the first parameter vectors for the multiple moments with the modified speech processing model to obtain the second parameter vectors for the multiple moments.
Further, the apparatus further comprises: an establishing module for establishing an initial preset model comprising the speech processing model and an initial parameter matrix; a second obtaining module for obtaining training data comprising multiple training speech vectors and the text information corresponding to each training speech vector; and a training module for training the initial preset model with the training data to obtain the preset speech model.
Further, the training module comprises: a third processing submodule for inputting the training data into the speech processing model to obtain preset parameter vectors; and a second storage submodule for writing the preset parameter vectors into the initial parameter matrix by a write operation to obtain the parameter matrix.
Further, the speech processing model is an LSTM model, and the parameter matrix is a memory matrix.
Further, the apparatus further comprises a determining module for determining the preset time period according to the processing capability of the preset speech model.
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium comprises a stored program, and when the program runs, the device where the storage medium is located is controlled to execute the speech processing method in the above embodiments.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is used to run a program, and the program, when running, executes the speech processing method in the above embodiments.
In the embodiments of the present invention, speech vectors for multiple moments within a preset time period are obtained, the speech vectors are processed with a preset speech model to obtain multiple pieces of text information corresponding to them, and the text information is output, thereby realizing natural language processing. It is worth noting that, because the input comprises the speech vectors for multiple moments within the preset time period, and the preset speech model processes them based on pre-stored parameter vectors for those moments, the sequential character of natural speech is exploited: natural speech is processed by combining the memory matrix of a neural Turing machine with an LSTM model, which solves the technical problem of low processing efficiency of prior-art speech processing methods. The scheme provided by the above embodiments of the present invention can therefore achieve the effects of improving processing efficiency and accuracy while reducing processing complexity and processing time.
Brief description of the drawings
The drawings described herein are used to provide a further understanding of the present invention and constitute a part of this application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not constitute improper limitations of the present invention. In the drawings:
Fig. 1 is a flowchart of a speech processing method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an optional preset speech model according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a repeating module of an optional speech processing model according to an embodiment of the present invention; and
Fig. 4 is a schematic diagram of a speech processing apparatus according to an embodiment of the present invention.
Specific embodiment
In order to enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", etc. in the description, the claims, and the above drawings are used to distinguish similar objects, not to describe a particular order or precedence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in sequences other than those illustrated or described herein. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are intrinsic to that process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a speech processing method is provided. It should be noted that the steps illustrated in the flowcharts of the drawings can be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowchart, in some cases the steps shown or described can be executed in a different order.
Fig. 1 is a flowchart of a speech processing method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain speech vectors for multiple moments within a preset time period.
Optionally, in the above embodiment of the present invention, the preset time period is determined according to the processing capability of the preset speech model.
Specifically, the preset time period can be set according to the processing capability of the actual speech processing model, and the multiple moments can be multiple equally spaced sampling instants. For example, if the preset time period is 100 s and the sampling interval is 10 s, then speech vectors for 10 moments are obtained within the 100 s.
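The sampling arithmetic described above can be sketched in a few lines; the function name and return format are illustrative assumptions, not from the patent:

```python
def sampling_moments(period_s: float, interval_s: float) -> list:
    """Return the equally spaced sampling instants within a preset time period.

    With period_s=100 and interval_s=10 this yields the 10 moments
    mentioned in the example above.
    """
    count = int(period_s / interval_s)            # number of moments in the window
    return [interval_s * (i + 1) for i in range(count)]

moments = sampling_moments(100, 10)
print(len(moments), moments[0], moments[-1])      # 10 moments: 10 s ... 100 s
```

A speech vector would then be extracted at each of these instants; the patent leaves the feature-extraction step unspecified.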
Step S104: process the speech vectors for the multiple moments with a preset speech model to obtain multiple pieces of text information corresponding to the speech vectors for the multiple moments, wherein the preset speech model processes the speech vectors based on pre-stored parameter vectors for the multiple moments.
Optionally, in the above embodiment of the present invention, the preset speech model comprises a speech processing model and a parameter matrix. The parameter matrix pre-stores the parameter vectors for the multiple moments, and the speech processing model processes the speech vectors for the multiple moments based on those parameter vectors to obtain the multiple pieces of text information.
Optionally, in the above embodiment of the present invention, the speech processing model is an LSTM model, and the parameter matrix is a memory matrix.
Specifically, the preset speech model can be a neural Turing machine. As shown in Fig. 2, a neural Turing machine includes two components: a controller (i.e., the above speech processing model) and a memory matrix (i.e., the above parameter matrix). The memory matrix is an external storage matrix that stores the parameter vectors the speech processing model requires for speech processing, and the controller can read and write the parameter vectors in the memory matrix. The speech processing model can be an LSTM model, a special type of RNN that can learn long-term dependencies; LSTM avoids the long-term dependency problem by deliberate design. Specifically, like other RNNs, an LSTM has the form of a chain of repeating neural network modules; however, unlike a single neural network layer, the repeating module has a different structure. As shown in Fig. 3, it can be composed of an input gate, a forget gate, and an output gate that interact in a very particular way, solving the vanishing and exploding gradient problems of RNNs.
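The controller/memory split described above can be illustrated with a minimal sketch of NTM-style memory access. The class and variable names are our own assumptions; a real neural Turing machine uses learned, content-based addressing weights rather than the fixed uniform weights used here:

```python
class MemoryMatrix:
    """Minimal external memory in the NTM style: the controller reads a
    blend of slots under addressing weights w, and writes with an
    erase-then-add update."""

    def __init__(self, slots, width):
        self.M = [[0.0] * width for _ in range(slots)]

    def read(self, w):
        # r_j = sum_i w_i * M[i][j]  (weighted blend of all slots)
        width = len(self.M[0])
        return [sum(w[i] * self.M[i][j] for i in range(len(self.M)))
                for j in range(width)]

    def write(self, w, erase, add):
        # M[i][j] <- M[i][j] * (1 - w_i * e_j) + w_i * a_j
        for i, row in enumerate(self.M):
            for j in range(len(row)):
                row[j] = row[j] * (1 - w[i] * erase[j]) + w[i] * add[j]

mem = MemoryMatrix(slots=8, width=4)
w = [1.0 / 8] * 8                          # uniform addressing weights
mem.write(w, erase=[0.0] * 4, add=[1.0] * 4)
print(mem.read(w))                         # [0.125, 0.125, 0.125, 0.125]
```

In the patent's scheme, the rows of such a matrix would hold the pre-stored parameter vectors for the multiple moments, with the LSTM controller driving the read and write heads.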
Step S106: output the multiple pieces of text information.
In an optional scheme, according to the sequential character of natural speech, natural speech data for multiple sampling instants within the preset time period can be obtained to produce the speech vectors for the multiple moments; a pre-trained neural Turing machine is then obtained and used to recognize the speech vectors for the multiple moments, obtaining the corresponding text information, and the recognized text information is output.
According to the above embodiment of the present invention, speech vectors for multiple moments within a preset time period are obtained, the speech vectors are processed with a preset speech model to obtain multiple pieces of text information corresponding to them, and the text information is output, thereby realizing natural language processing. It is worth noting that, because the input comprises the speech vectors for multiple moments within the preset time period, and the preset speech model processes them based on pre-stored parameter vectors for those moments, the sequential character of natural speech is exploited: natural speech is processed by combining the memory matrix of a neural Turing machine with an LSTM model, which solves the technical problem of low processing efficiency of prior-art speech processing methods. The scheme provided by the above embodiments of the present invention can therefore achieve the effects of improving processing efficiency and accuracy while reducing processing complexity and processing time.
Optionally, in the above embodiment of the present invention, step S104 of processing the speech vectors for the multiple moments with the preset speech model to obtain the multiple pieces of text information comprises:
Step S1040: obtain first parameter vectors for the multiple moments from the parameter matrix by a read operation.
Specifically, as shown in Fig. 2, the neural Turing machine may include a read head and a write head. A read operation through the read head can fetch from the memory matrix the W parameters to be read into the LSTM model, and a write operation through the write head can write new W parameters into the memory matrix.
Step S1042: modify the speech processing model with the first parameter vectors for the multiple moments to obtain a modified speech processing model.
Step S1044: process the speech vectors for the multiple moments with the modified speech processing model to obtain the multiple pieces of text information.
In an optional scheme, after the speech vectors for the multiple moments are obtained, for the natural speech processing at each moment, a W parameter vector can be read from the memory matrix by the read head and input into the LSTM model to modify it, obtaining a modified LSTM model. The speech vector can then be input into the modified LSTM model as the input vector to obtain the LSTM model's output vector, i.e., the text information for the speech vector. After the speech vectors for all the multiple moments have been processed, the multiple pieces of text information corresponding to them are obtained.
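The per-moment read/modify/process loop just described can be sketched as follows. All function names are illustrative placeholders, and the toy stand-ins show only the data flow, not a real LSTM:

```python
def process_speech(vectors, memory, read_params, apply_params, run_model):
    """Per-moment loop: read the first parameter vector from memory,
    modify the speech processing model with it, then run the speech
    vector through the modified model to get its text information."""
    texts = []
    for v in vectors:
        w = read_params(memory)            # read head fetches W
        model = apply_params(w)            # modified speech processing model
        texts.append(run_model(model, v))  # output for this moment
    return texts

memory = {"W": 2}
out = process_speech(
    [1, 2, 3],
    memory,
    read_params=lambda m: m["W"],
    apply_params=lambda w: (lambda v: w * v),
    run_model=lambda model, v: model(v),
)
print(out)  # [2, 4, 6]
```

In the patented scheme, `run_model` would be a forward pass of the modified LSTM and the outputs would be pieces of text information rather than numbers.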
Optionally, in the above embodiment of the present invention, while step S1044 processes the speech vectors for the multiple moments with the modified speech processing model to obtain the multiple pieces of text information, the method further comprises:
Step S1046: obtain second parameter vectors for the multiple moments with the modified speech processing model.
Optionally, in the above embodiment of the present invention, step S1046 of obtaining the second parameter vectors for the multiple moments with the modified speech processing model comprises:
Step S10462: update the first parameter vectors for the multiple moments with the modified speech processing model to obtain the second parameter vectors for the multiple moments.
Step S1048: write the second parameter vectors for the multiple moments into the parameter matrix by a write operation.
In an optional scheme, for the natural speech processing at each moment, while the LSTM model processes the speech vector, not only can the text information for the speech vector be obtained, but a new W parameter vector can also be obtained and written into the memory matrix by the write head, the new W parameter vector serving as the W parameter vector for the next moment.
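The write-back just described, where the updated W parameter becomes the next moment's parameter, can be sketched as below. The names and the scalar update rule are illustrative assumptions; the patent does not specify how the second parameter vector is computed:

```python
def step_with_writeback(memory, v, update_rule):
    """Process one moment's speech vector, then write the updated
    W parameter back so the next moment reads the new value."""
    w = memory["W"]                  # read head: first parameter vector
    y = w * v                        # process the speech vector with current W
    memory["W"] = update_rule(w, v)  # write head: second parameter vector
    return y

memory = {"W": 1}
outputs = [step_with_writeback(memory, 3, lambda w, v: w + 1) for _ in range(3)]
print(outputs)  # [3, 6, 9] -- W advances 1 -> 2 -> 3 across the three moments
```

This is the key sequential coupling: each moment's processing both consumes and refreshes the stored parameters.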
Optionally, in the above embodiment of the present invention, before step S104 processes the speech vectors for the multiple moments with the preset speech model to obtain the multiple pieces of text information, the method further comprises:
Step S108: establish an initial preset model comprising the speech processing model and an initial parameter matrix.
Step S110: obtain training data comprising multiple training speech vectors and the text information corresponding to each training speech vector.
Step S112: train the initial preset model with the training data to obtain the preset speech model.
In an optional scheme, the LSTM model in the neural Turing machine can be established in advance according to actual processing needs, with the W parameter vectors in the memory matrix set to initial values; the pre-established neural Turing machine is then trained with the training data to obtain a neural Turing machine of higher accuracy.
Optionally, in the above embodiment of the present invention, step S112 instructs initial preset model according to training dataPractice, obtaining default speech model includes:
Training data is inputted speech processes model, obtains parameter preset vector by step S1122.
Step S1124 obtains parameter matrix by write operation by parameter preset vector write-in initial parameter matrix.
In a kind of optional scheme, the higher neural Turing machine of accuracy in order to obtain can will be in training dataMultiple trained speech vectors as input input vectors, the corresponding text information of each trained speech vector as output vector,It is input in LSTM model, obtains the default W parameter vector of LSTM model, and default W parameter vector is written by writing head and is rememberedMatrix is recalled, to obtain the higher neural Turing machine of accuracy.
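The train-then-write-back flow might be sketched as below. The patent does not specify a training objective; the squared-error loss, learning rate, and function names here are our own illustrative assumptions, with a single scalar standing in for the W parameter vector:

```python
def train_preset_model(train_x, train_y, slots=4, epochs=100, lr=0.01):
    """Fit a single scalar 'W parameter' by gradient descent on squared
    error over (x, y) training pairs, then write it into a zeroed
    initial parameter matrix, mirroring steps S1122 and S1124."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    init_matrix = [[0.0] for _ in range(slots)]
    init_matrix[0][0] = w                # write operation into the initial matrix
    return init_matrix

M = train_preset_model([1.0, 2.0], [2.0, 4.0])
print(round(M[0][0], 2))                 # 2.0 -- the learned W for y = 2x
```

In the real scheme, the learned quantity would be the LSTM's full preset W parameter vectors, written slot by slot through the write head.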
Embodiment 2
According to an embodiment of the present invention, an embodiment of a speech processing apparatus is provided.
Fig. 4 is a schematic diagram of a speech processing apparatus according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
A first obtaining module 41 for obtaining speech vectors for multiple moments within a preset time period.
Optionally, in the above embodiment of the present invention, the apparatus further comprises a determining module for determining the preset time period according to the processing capability of the preset speech model.
Specifically, the preset time period can be set according to the processing capability of the model, and the multiple moments can be multiple equally spaced sampling instants. For example, if the preset time period is 100 s and the sampling interval is 10 s, then speech vectors for 10 moments are obtained within the 100 s.
A processing module 43 for processing the speech vectors for the multiple moments with a preset speech model to obtain multiple pieces of text information corresponding to the speech vectors for the multiple moments, wherein the preset speech model processes the speech vectors based on pre-stored parameter vectors for the multiple moments.
Optionally, in the above embodiment of the present invention, the preset speech model comprises a speech processing model and a parameter matrix. The parameter matrix pre-stores the parameter vectors for the multiple moments, and the speech processing model processes the speech vectors for the multiple moments based on those parameter vectors to obtain the multiple pieces of text information.
Optionally, in the above embodiment of the present invention, the speech processing model is an LSTM model, and the parameter matrix is a memory matrix.
Specifically, the preset speech model can be a neural Turing machine. As shown in Fig. 2, a neural Turing machine includes two components: a controller (i.e., the above speech processing model) and a memory matrix (i.e., the above parameter matrix). The memory matrix is an external storage matrix that stores the parameter vectors the speech processing model requires for speech processing, and the controller can read and write the parameter vectors in the memory matrix. The speech processing model can be an LSTM model, a special type of RNN that can learn long-term dependencies; LSTM avoids the long-term dependency problem by deliberate design. Specifically, like other RNNs, an LSTM has the form of a chain of repeating neural network modules; however, unlike a single neural network layer, the repeating module has a different structure. As shown in Fig. 3, it can be composed of an input gate, a forget gate, and an output gate that interact in a very particular way, solving the vanishing and exploding gradient problems of RNNs.
An output module 45 for outputting the multiple pieces of text information.
In an optional scheme, according to the sequential character of natural speech, natural speech data for multiple sampling instants within the preset time period can be obtained to produce the speech vectors for the multiple moments; a pre-trained neural Turing machine is then obtained and used to recognize the speech vectors for the multiple moments, obtaining the corresponding text information, and the recognized text information is output.
According to the above embodiment of the present invention, speech vectors for multiple moments within a preset time period are obtained, the speech vectors are processed with a preset speech model to obtain multiple pieces of text information corresponding to them, and the text information is output, thereby realizing natural language processing. It is worth noting that, because the input comprises the speech vectors for multiple moments within the preset time period, and the preset speech model processes them based on pre-stored parameter vectors for those moments, the sequential character of natural speech is exploited: natural speech is processed by combining the memory matrix of a neural Turing machine with an LSTM model, which solves the technical problem of low processing efficiency of prior-art speech processing methods. The scheme provided by the above embodiments of the present invention can therefore achieve the effects of improving processing efficiency and accuracy while reducing processing complexity and processing time.
Optionally, in the above embodiment of the present invention, the processing module 43 comprises:
An obtaining submodule for obtaining first parameter vectors for the multiple moments from the parameter matrix by a read operation.
Specifically, as shown in Fig. 2, the neural Turing machine may include a read head and a write head. A read operation through the read head can fetch from the memory matrix the W parameters to be read into the LSTM model, and a write operation through the write head can write new W parameters into the memory matrix.
A modification submodule for modifying the speech processing model with the first parameter vectors for the multiple moments to obtain a modified speech processing model.
A first processing submodule for processing the speech vectors for the multiple moments with the modified speech processing model to obtain the multiple pieces of text information.
In an optional scheme, after the speech vectors for the multiple moments are obtained, for the natural speech processing at each moment, a W parameter vector can be read from the memory matrix by the read head and input into the LSTM model to modify it, obtaining a modified LSTM model. The speech vector can then be input into the modified LSTM model as the input vector to obtain the LSTM model's output vector, i.e., the text information for the speech vector. After the speech vectors for all the multiple moments have been processed, the multiple pieces of text information corresponding to them are obtained.
Optionally, in the above embodiment of the present invention, the processing module 43 further comprises:
A second processing submodule for obtaining second parameter vectors for the multiple moments with the modified speech processing model.
Optionally, in the above embodiment of the present invention, the second processing submodule is also used to update the first parameter vectors for the multiple moments with the modified speech processing model to obtain the second parameter vectors for the multiple moments.
A first storage submodule for writing the second parameter vectors for the multiple moments into the parameter matrix by a write operation.
In an optional scheme, for the natural speech processing at each moment, while the LSTM model processes the speech vector, not only can the text information for the speech vector be obtained, but a new W parameter vector can also be obtained and written into the memory matrix by the write head, the new W parameter vector serving as the W parameter vector for the next moment.
Optionally, in the above embodiment of the present invention, the device further includes:
an establishing module, configured to establish an initial preset model, where the initial preset model includes a speech processing model and an initial parameter matrix;
a second obtaining module, configured to obtain training data, where the training data includes multiple training speech vectors and the text information corresponding to each training speech vector; and
a training module, configured to train the initial preset model according to the training data to obtain the preset speech model.
In an optional scheme, the LSTM model in the neural Turing machine can be established in advance according to actual processing needs, and the parameter vector W in the memory matrix is set to an initial value; the pre-established neural Turing machine is then trained according to the training data to obtain a neural Turing machine of higher accuracy.
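A minimal sketch of establishing the initial preset model, assuming a toy controller parameter block in place of the LSTM and a memory matrix whose parameter vectors start at an initial value of zero; all names and sizes here are hypothetical.

```python
import numpy as np

def build_initial_model(num_slots=8, vec_width=4, seed=0):
    """Establish the initial preset model: an (untrained) controller
    parameter block standing in for the LSTM, plus a memory matrix whose
    parameter vectors are all set to an initial value (zero)."""
    rng = np.random.default_rng(seed)
    controller_params = rng.standard_normal((vec_width, vec_width)) * 0.01
    memory = np.zeros((num_slots, vec_width))   # initial value for every slot
    return controller_params, memory

controller, memory = build_initial_model()
print(memory.shape, float(memory.sum()))        # (8, 4) 0.0
```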
Optionally, in the above embodiment of the present invention, the training module includes:
a third processing submodule, configured to input the training data into the speech processing model to obtain a preset parameter vector; and
a second storage submodule, configured to write the preset parameter vector into the initial parameter matrix through a write operation to obtain the parameter matrix.
In an optional scheme, to obtain a neural Turing machine of higher accuracy, the multiple training speech vectors in the training data can be taken as input vectors, and the text information corresponding to each training speech vector as the output vector; these are input into the LSTM model to obtain the preset parameter vector W of the LSTM model, and the preset parameter vector W is written into the memory matrix through the write head, thereby obtaining a neural Turing machine of higher accuracy.
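The training procedure can be sketched as follows, with a simple linear least-mean-squares learner standing in for the LSTM model (the patent does not specify a training algorithm); after training, the learned parameter vector is written into the memory matrix as the preset parameter vector W. All names, shapes, targets, and the choice of learner are assumptions.

```python
import numpy as np

def train_preset_model(pairs, memory, lr=0.05, epochs=200):
    """Toy stand-in for the training step: a linear map plays the role of
    the LSTM controller; once training finishes, the learned parameter
    vector is written into slot 0 of the memory matrix."""
    w = np.zeros(memory.shape[1])
    for _ in range(epochs):
        for x, y in pairs:              # x: training speech vector, y: target
            err = w @ x - y
            w -= lr * err * x           # gradient step on the squared error
    memory = memory.copy()
    memory[0] = w                       # "write head" stores the preset vector
    return memory

rng = np.random.default_rng(2)
true_w = np.array([1.0, -2.0, 0.5, 3.0])
xs = rng.standard_normal((20, 4))
pairs = [(x, float(true_w @ x)) for x in xs]
memory = train_preset_model(pairs, np.zeros((8, 4)))
print(memory[0].round(2))
```

On this noiseless synthetic data the learned vector recovers `true_w`, illustrating only the shape of the loop (fit the controller, then persist its parameter vector to memory), not the accuracy claims of the patent.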
Embodiment 3
According to an embodiment of the present invention, an embodiment of a storage medium is provided. The storage medium includes a stored program, where when the program runs, the device where the storage medium is located is controlled to execute the speech processing method in Embodiment 1 above.
Embodiment 4
According to an embodiment of the present invention, an embodiment of a processor is provided. The processor is configured to run a program, where the program, when running, executes the speech processing method in Embodiment 1 above.
The serial number of the above embodiments of the invention is only for description, does not represent the advantages or disadvantages of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be another division manner in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each of the units may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also fall within the protection scope of the present invention.

Claims (12)

CN201710633042.2A2017-07-282017-07-28Voice processing method and device, storage medium and processorActiveCN109308896B (en)

Priority Applications (2)

Application NumberPriority DateFiling DateTitle
CN201710633042.2ACN109308896B (en)2017-07-282017-07-28Voice processing method and device, storage medium and processor
PCT/CN2018/079848WO2019019667A1 (en)2017-07-282018-03-21Speech processing method and apparatus, storage medium and processor

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN201710633042.2ACN109308896B (en)2017-07-282017-07-28Voice processing method and device, storage medium and processor

Publications (2)

Publication NumberPublication Date
CN109308896Atrue CN109308896A (en)2019-02-05
CN109308896B CN109308896B (en)2022-04-15

Family

ID=65040955

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201710633042.2AActiveCN109308896B (en)2017-07-282017-07-28Voice processing method and device, storage medium and processor

Country Status (2)

CountryLink
CN (1)CN109308896B (en)
WO (1)WO2019019667A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN113836270A (en)*2021-09-282021-12-24深圳格隆汇信息科技有限公司Big data processing method and related product

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN112489630A (en)*2019-09-122021-03-12Wuhan TCL Group Industrial Research Institute Co., Ltd.Speech recognition method and device
CN113095559B (en)*2021-04-022024-04-09京东科技信息技术有限公司Method, device, equipment and storage medium for predicting hatching time

Citations (6)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN1231742A (en)*1996-07-191999-10-13Microsoft CorporationIntelligent user aids
CN1623183A (en)*2002-03-272005-06-01Nokia CorporationPattern recognition
CN101123090A (en)*2006-08-112008-02-13Harman Becker Automotive Systems GmbHSpeech recognition by statistical language using square-root discounting
WO2013054347A2 (en)*2011-07-202013-04-18Tata Consultancy Services LimitedA method and system for detecting boundary of coarticulated units from isolated speech
CN105070300A (en)*2015-08-122015-11-18Southeast UniversityVoice emotion characteristic selection method based on speaker standardization change
CN106157950A (en)*2016-09-292016-11-23Hefei Hualing Co., Ltd.Speech control system and wake-up method, wake-up apparatus, household appliance, and coprocessor

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
JP2010204391A (en)*2009-03-032010-09-16Nippon Telegr & Teleph Corp <Ntt>Voice signal modeling method, signal recognition device and method, parameter learning device and method, and feature value generating device, method, and program
US9378729B1 (en)*2013-03-122016-06-28Amazon Technologies, Inc.Maximum likelihood channel normalization
CN105989839B (en)*2015-06-032019-12-13Lerong Zhixin Electronic Technology (Tianjin) Co., Ltd.Speech recognition method and device
DE102015211101B4 (en)*2015-06-172025-02-06Volkswagen Aktiengesellschaft Speech recognition system and method for operating a speech recognition system with a mobile unit and an external server

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN1231742A (en)*1996-07-191999-10-13Microsoft CorporationIntelligent user aids
CN1623183A (en)*2002-03-272005-06-01Nokia CorporationPattern recognition
CN101123090A (en)*2006-08-112008-02-13Harman Becker Automotive Systems GmbHSpeech recognition by statistical language using square-root discounting
WO2013054347A2 (en)*2011-07-202013-04-18Tata Consultancy Services LimitedA method and system for detecting boundary of coarticulated units from isolated speech
CN105070300A (en)*2015-08-122015-11-18Southeast UniversityVoice emotion characteristic selection method based on speaker standardization change
CN106157950A (en)*2016-09-292016-11-23Hefei Hualing Co., Ltd.Speech control system and wake-up method, wake-up apparatus, household appliance, and coprocessor

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
D. Yu et al.: "Deep Convolutional Neural Networks with Layer-wise", Proc. Interspeech*
Xumcas: "Neural Networks: CNN Structure and Speech Recognition Applications", CSDN*
Ni Weimin: "Research on the Application of Orthogonality-Based Neural Networks in Speech Recognition", Computer Development & Applications*
Xi Xuefeng et al.: "Research on Deep Learning for Natural Speech Processing", Acta Automatica Sinica*

Cited By (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN113836270A (en)*2021-09-282021-12-24深圳格隆汇信息科技有限公司Big data processing method and related product

Also Published As

Publication numberPublication date
WO2019019667A1 (en)2019-01-31
CN109308896B (en)2022-04-15

Similar Documents

PublicationPublication DateTitle
CN109685202A (en)Data processing method and device, storage medium and electronic device
WO2019232772A1 (en)Systems and methods for content identification
US20190266442A1 (en)Tunable generative adversarial networks
CN109523014B (en)News comment automatic generation method and system based on generative confrontation network model
Prager et al.The modified Kanerva model for automatic speech recognition
CN110287297A (en)Dialogue replies method, apparatus, computer equipment and computer readable storage medium
CN109344395A (en)A kind of data processing method, device, server and storage medium
CN109977428A (en)A kind of method and device that answer obtains
CN109299264A (en)File classification method, device, computer equipment and storage medium
CN109308896A (en)Method of speech processing and device, storage medium and processor
CN110069781B (en)Entity label identification method and related equipment
CN109101492A (en)Method and system for entity extraction using historical conversation activity in natural language processing
CN107657313B (en)System and method for transfer learning of natural language processing task based on field adaptation
CN108170676B (en)Method, system and the terminal of story creation
CN108959388A (en)information generating method and device
CN115881103A (en)Voice emotion recognition model training method, voice emotion recognition method and device
CN109857865A (en)A kind of file classification method and system
CN118036736A (en)Legal knowledge graph construction method, legal knowledge graph construction device, legal knowledge graph construction equipment and storage medium
CN108171148A (en)The method and system that a kind of lip reading study cloud platform is established
Shah et al.Problem solving chatbot for data structures
CN110570844A (en)Speech emotion recognition method and device and computer readable storage medium
CN110321430A (en)Domain name identification and domain name identification model generation method, device and storage medium
CN110489744B (en)Corpus processing method and device, electronic equipment and storage medium
CN110472230A (en)The recognition methods of Chinese text and device
CN111090740B (en)Knowledge graph generation method for dialogue system

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
TA01Transfer of patent application right
TA01Transfer of patent application right

Effective date of registration:20220310

Address after:215000 Floor 9, building 5, Asia Pacific Plaza, No. 18, Zhaofeng Road, Huaqiao Town, Kunshan City, Suzhou City, Jiangsu Province

Applicant after:Jiangsu Huitong Jinke Data Co.,Ltd.

Address before:518000 Guangdong, Shenzhen, Nanshan District, Nanhai Road, West Guangxi Temple Road North Sunshine Huayi Building 1 15D-02F

Applicant before:SHEN ZHEN KUANG-CHI HEZHONG TECHNOLOGY Ltd.

Applicant before:Shenzhen Guangqi Innovation Technology Co., Ltd

GR01Patent grant
GR01Patent grant
