
Emotion recognition network model, method and electronic equipment

Info

Publication number
CN110390956A
CN110390956A (application CN201910751541.0A)
Authority
CN
China
Prior art keywords
emotion
text
emotion recognition
speech
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910751541.0A
Other languages
Chinese (zh)
Inventor
聂镭
徐泓洋
聂颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dragon Horse Zhixin (Zhuhai Hengqin) Technology Co Ltd
Original Assignee
Dragon Horse Zhixin (Zhuhai Hengqin) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dragon Horse Zhixin (Zhuhai Hengqin) Technology Co Ltd
Priority to CN201910751541.0A
Publication of CN110390956A
Legal status: Pending

Abstract

The invention discloses an emotion recognition network model, method, and electronic device. Through a speech emotion recognition module and a text emotion recognition module, the network model can call the speech emotion recognition module and/or the text emotion recognition module according to the type of the target input, solving the technical problems in the related art that emotion recognition models are complex, training is cumbersome, and application scenarios are limited.

Description

Emotion recognition network model, method and electronic equipment
Technical field
The present invention relates to the technical field of emotion recognition, and in particular to an emotion recognition network model, method, and electronic device.
Background technique
Emotion recognition from speech and emotion recognition from text are the two main current approaches. In speech scenarios, the usual practice for better results is to first recognize the emotion of the speech, then perform emotion recognition on the text corresponding to the speech, and finally combine the two emotion scores into a comprehensive rating to obtain the final recognition result. This method requires manually constructed scoring rules for the final comprehensive rating; the formulation of those rules can be overly subjective and strongly affected by human factors, the approach generalizes poorly, and its recognition results are not very accurate.
To address this problem, patent document CN108305641A proposes a multi-modal feature fusion emotion recognition method. It first extracts an audio feature vector from the audio and, via speech recognition, a text feature vector from the resulting text; it then merges the audio and text feature vectors, feeds the combination into a neural network for training, and finally obtains a model that predicts emotion from audio and text. This avoids the poor generality caused by subjective, manually formulated scoring rules influenced by human factors, and also improves the accuracy of emotion recognition. However, the resulting emotion prediction model is relatively complex: a text-classification CNN model and a text-audio DNN model must be trained separately, so the training process is cumbersome. Moreover, because the input of the text-audio DNN model must be the combined features of audio and text, the method suits only a single kind of application scenario and is therefore limited.
Summary of the invention
Embodiments of the invention provide an emotion recognition network model, method, and electronic device, to at least partially solve the technical problems in the related art that emotion recognition models are complex, training is cumbersome, and application scenarios are limited.
To achieve the above objectives, one embodiment of the invention provides an emotion recognition network model. The network model includes a speech emotion recognition module and a text emotion recognition module. The speech emotion recognition module performs speech emotion feature extraction on speech input and outputs a speech emotion feature vector; the text emotion recognition module performs text emotion feature extraction on text input and outputs a text emotion feature vector. The network model performs emotion recognition according to the speech emotion feature vector and/or the text emotion feature vector.
According to the type of the target input, the network model can call the speech emotion recognition module and/or the text emotion recognition module to perform emotion recognition, where the type of the target input includes: speech input, text input, or speech together with the corresponding text.
Further, the speech emotion recognition module includes a speech feature extraction layer and a first multi-layer bidirectional long short-term memory (Bi-LSTM) network layer; the text emotion recognition module includes a preprocessing layer, a second multi-layer Bi-LSTM network layer, and an attention layer.
Further, the network model also includes:
An input layer, serving as the common input of the speech emotion recognition module and the text emotion recognition module;
A fusion layer, for merging the speech emotion feature vector and the text emotion feature vector into a fused emotion feature vector;
A classification network layer, for outputting the emotion recognition result of the target input according to the fused emotion feature vector.
Further, the fusion layer merges the speech emotion feature vector and the text emotion feature vector by element-wise addition or by concatenation.
Further, the speech emotion recognition module and the text emotion recognition module are connected in parallel.
Further, the network parameters of the speech emotion recognition module and the text emotion recognition module are obtained by one-shot training.
Further, the network parameters of the speech emotion recognition module and the text emotion recognition module are obtained by one-shot training, specifically:
Training set data is input into the emotion recognition model to obtain an emotion prediction result, where each training sample includes: speech, the text corresponding to the speech, and an emotion label;
The emotion prediction result is compared with the emotion label. When the emotion prediction result does not match the emotion label, a gradient descent algorithm with backpropagation is used to adjust the values of the network parameters of the speech emotion recognition module and the text emotion recognition module respectively; through multiple iterations, the training of the network parameters of the two modules is completed.
According to one embodiment of the invention, an emotion recognition method is provided, comprising:
Obtaining a target input, the type of which is one of the following: speech input, text input, or speech with corresponding text input;
According to the type of the target input, calling the speech emotion recognition module and/or the text emotion recognition module of the network model as claimed in any one of claims 1 to 7 to perform emotion recognition;
Outputting the emotion recognition result of the target input.
Further, calling the speech emotion recognition module and/or the text emotion recognition module of the network model as claimed in any one of claims 1 to 7 according to the type of the target input comprises:
When the target input is speech input, calling the speech emotion recognition module to perform emotion recognition;
When the target input is text input, calling the text emotion recognition module to perform emotion recognition;
When the target input is speech with corresponding text input, calling the speech emotion recognition module and the text emotion recognition module simultaneously to perform emotion recognition.
According to still another embodiment of the invention, an electronic device is provided, comprising a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to execute any of the methods described above.
The emotion recognition network model provided by the invention, through its speech emotion recognition module and text emotion recognition module, can call the speech emotion recognition module and/or the text emotion recognition module according to the type of the target input, solving the technical problems in the related art that emotion recognition models are complex, training is cumbersome, and application scenarios are limited.
Detailed description of the invention
The above and other objects, features, and advantages of the present invention will become apparent from the following description of embodiments of the invention with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of an emotion recognition network model provided by one embodiment of the invention;
Fig. 2 is a flowchart of an emotion recognition method provided by another embodiment of the invention;
Fig. 3 is a hardware block diagram of an electronic device for an emotion recognition method provided by one embodiment of the invention.
Specific embodiment
Below based on embodiment, present invention is described, but the present invention is not restricted to these embodiments.UnderText is detailed to describe some specific detail sections in datail description of the invention, in order to avoid obscuring essence of the invention,There is no narrations in detail for well known method, process, process, element.
In addition, it should be understood by one skilled in the art that provided herein attached drawing be provided to explanation purpose, andWhat attached drawing was not necessarily drawn to scale.
Unless the context clearly requires otherwise, "include", "comprise" otherwise throughout the specification and claims etc. are similarWord should be construed as the meaning for including rather than exclusive or exhaustive meaning;That is, be " including but not limited to " containsJustice.
In the description of the present invention, it is to be understood that, term " first ", " second " etc. are used for description purposes only, withoutIt can be interpreted as indication or suggestion relative importance.In addition, in the description of the present invention, unless otherwise indicated, the meaning of " multiple "It is two or more.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an emotion recognition network model 20 provided by one embodiment of the invention. The network model 20 includes:
a speech emotion recognition module 202 and a text emotion recognition module 204. The speech emotion recognition module 202 performs speech emotion feature extraction on speech input and outputs a speech emotion feature vector V1; the text emotion recognition module 204 performs text emotion feature extraction on text input and outputs a text emotion feature vector V2. The network model 20 performs emotion recognition according to the speech emotion feature vector V1 and/or the text emotion feature vector V2, and can call the speech emotion recognition module 202 and/or the text emotion recognition module 204 according to the type of the target input, where the type of the target input includes: speech input, text input, or speech with corresponding text input.
It should be noted that, in the prior art, to solve emotion recognition under a given scenario, a network model specific to that scenario is generally constructed. The network structure and the input of such a model are relatively fixed; if the emotion recognition scenario changes and the input changes with it, the model no longer applies, and a new emotion recognition network model must be built, at additional cost. For example, in text scenarios such as SMS chat, email correspondence, or plain WeChat text chat, the target of emotion recognition is text, so a network model that recognizes emotion from text input must be constructed. In speech scenarios such as telephone voice chat, WeChat voice chat, or conversation recordings, the target of emotion recognition is speech, so a network model that recognizes emotion from speech input is required. In addition, in scenarios where speech and the corresponding text appear together, for example voice chat platforms with built-in speech recognition that output both speech and its transcript, the target of emotion recognition is speech plus text, so a network model taking speech and corresponding text as input must be constructed. To cover all three scenarios at once, the art currently resorts to multiple emotion recognition network models, which requires constructing and training several models separately and collecting different training data; this is time-consuming, laborious, and very costly.
The emotion recognition network model provided by the embodiments of the invention, through its speech emotion recognition module and text emotion recognition module, can call the speech emotion recognition module and/or the text emotion recognition module according to the type of the target input, solving the technical problem in the prior art that emotion recognition models apply only to a single scenario; at the same time, the training process of the network model is simple, and training set data is relatively easy to collect.
Specifically, the speech emotion recognition module 202 includes a speech feature extraction layer and a first multi-layer bidirectional long short-term memory network layer (Bi-LSTM); the text emotion recognition module 204 includes a preprocessing layer, a second multi-layer Bi-LSTM network layer, and an attention layer (Attention). In the text emotion recognition module 204, because emotional expression is mostly concentrated in a few key words or phrases, an attention mechanism (Attention Model) is needed to focus on the keywords or phrases that convey emotion, which helps improve the accuracy of text emotion recognition. In the speech emotion recognition module 202, by contrast, the expression of emotion is mostly related to changes of tone and intonation over time, so a multi-layer Bi-LSTM structure that learns the audio context before and after each frame is sufficient, and no attention mechanism is needed. The speech feature extraction layer may use various methods, including Linear Prediction Coefficients (LPC), Perceptual Linear Predictive coefficients (PLP), Linear Predictive Cepstral Coefficients (LPCC), and Mel-Frequency Cepstrum Coefficients (MFCC); the embodiments of the invention use MFCC features. These are prior art and not the focus of the invention, so they are not detailed here. Likewise, the network structures of the multi-layer Bi-LSTM layer and the attention layer are prior art, not the focus of the invention, and are not repeated here. In particular, the invention focuses on the design of the overall structure of the network model 20, not on changes to the components within it; the specification therefore concentrates on the overall structure and design principles of the network model 20.
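The specification treats these layers as prior art and gives no code. Purely as an illustration, the following PyTorch sketch shows one plausible shape for the two branches: a multi-layer Bi-LSTM speech encoder over MFCC frames, and a multi-layer Bi-LSTM plus attention text encoder over word vectors. All class names, dimensions, and pooling choices here are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

# MFCC extraction (the feature named in the specification) could be done,
# e.g., with librosa (shown only as a comment; the patent prescribes no tool):
#   import librosa
#   y, sr = librosa.load("sample.wav", sr=16000)
#   mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)

class SpeechEmotionEncoder(nn.Module):
    """Speech branch 202: MFCC frames -> multi-layer Bi-LSTM -> V1 (sketch)."""
    def __init__(self, n_mfcc=13, hidden=64, layers=2, out_dim=128):
        super().__init__()
        self.bilstm = nn.LSTM(n_mfcc, hidden, num_layers=layers,
                              bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, out_dim)

    def forward(self, mfcc):                      # mfcc: (batch, frames, n_mfcc)
        out, _ = self.bilstm(mfcc)                # learns context before/after each frame
        return self.proj(out[:, -1, :])           # V1: (batch, out_dim)

class TextEmotionEncoder(nn.Module):
    """Text branch 204: word vectors -> multi-layer Bi-LSTM -> attention -> V2 (sketch)."""
    def __init__(self, emb_dim=100, hidden=64, layers=2, out_dim=128):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, num_layers=layers,
                              bidirectional=True, batch_first=True)
        self.att = nn.Linear(2 * hidden, 1)       # scores emotion-bearing words/phrases
        self.proj = nn.Linear(2 * hidden, out_dim)

    def forward(self, words):                     # words: (batch, tokens, emb_dim)
        out, _ = self.bilstm(words)
        w = torch.softmax(self.att(out), dim=1)   # attention weights over tokens
        return self.proj((w * out).sum(dim=1))    # V2: (batch, out_dim)
```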
Further, the network model 20 also includes: an input layer 206, serving as the common input of the speech emotion recognition module 202 and the text emotion recognition module 204; a fusion layer 208, for merging the speech emotion feature vector V1 with the text emotion feature vector V2 into a fused emotion feature vector V3; and a classification network layer (Softmax) 210, for outputting the emotion recognition result of the target input according to the fused emotion feature vector V3. The input layer 206 routes input data to the speech emotion recognition module 202 and/or the text emotion recognition module 204 according to its type: speech input goes to the speech emotion recognition module 202, text input goes to the text emotion recognition module 204, and speech with corresponding text goes to both the speech emotion recognition module 202 and the text emotion recognition module 204. The Softmax classification layer is prior art, not the focus of the invention, and is not repeated here.
Specifically, the fusion layer merges the speech emotion feature vector and the text emotion feature vector by element-wise addition or by concatenation. The speech emotion feature vector V1 is a vector of dimension 1×M, and the text emotion feature vector V2 is a vector of dimension 1×N. When M = N, V1 and V2 can be merged by element-wise addition, and the final fused emotion feature vector is V3 = V1 + V2. When M ≠ N, V1 and V2 are merged by concatenation: V3 = [V1, V2]. In the M ≠ N case, note that during backpropagation the parameter updates must follow the corresponding dimensions: the M dimensions update the network parameters of the speech emotion recognition module 202, and the N dimensions update those of the text emotion recognition module 204.

Specifically, the speech emotion recognition module 202 and the text emotion recognition module 204 are connected in parallel. With this parallel design, the backpropagation pass of training updates the network parameters of both modules simultaneously, so the training of both modules' parameters is completed in a single, one-shot training run; the training process is simple and efficient, and the cost of collecting training data is reduced.

In addition, with the parallel design, during training of the emotion recognition network model 20 the text emotion information contained in the training text also participates in updating the network parameters of the speech emotion recognition module 202, and the speech emotion information in the training speech likewise participates in updating the network parameters of the text emotion recognition module 204. The two networks can therefore learn more emotion feature information in their respective domains at the same time than a separately trained text emotion model or a separately trained speech emotion model in the prior art, so the network parameters converge better and the model's predictions are more accurate. In short, the emotion recognition network model provided by the embodiments of the invention, through its speech emotion recognition module and text emotion recognition module, can call either or both modules according to the type of the target input, solving the single-scenario limitation of prior-art models; and because the two modules are connected in parallel, the network structure is simple, the parameters of both modules are trained in one pass, the training process is simple, and training set data is relatively easy to collect.
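Reading the fusion rule literally, the sketch below wires the two illustrative encoders in parallel, merges V1 and V2 by element-wise addition when M = N and by concatenation when M ≠ N, and ends with the classification layer (Softmax is applied at inference or inside the training loss). It reuses the imports and encoder classes from the previous sketch and remains an assumption-laden illustration, not the patented implementation.

```python
class EmotionRecognitionNet(nn.Module):
    """Parallel speech/text branches -> fusion layer 208 -> classifier 210 (sketch)."""
    def __init__(self, speech_enc, text_enc, m_dim=128, n_dim=128, n_emotions=3):
        super().__init__()
        self.speech_enc, self.text_enc = speech_enc, text_enc
        # single-branch calls below assume M == N so V1/V2 match the classifier input
        fused_dim = m_dim if m_dim == n_dim else m_dim + n_dim
        self.classifier = nn.Linear(fused_dim, n_emotions)

    def fuse(self, v1, v2):
        if v1 is None:
            return v2                             # text-only input
        if v2 is None:
            return v1                             # speech-only input
        if v1.shape[-1] == v2.shape[-1]:
            return v1 + v2                        # M == N: V3 = V1 + V2
        return torch.cat([v1, v2], dim=-1)        # M != N: V3 = [V1, V2]

    def forward(self, mfcc=None, words=None):
        v1 = self.speech_enc(mfcc) if mfcc is not None else None
        v2 = self.text_enc(words) if words is not None else None
        v3 = self.fuse(v1, v2)                    # fused emotion feature vector V3
        return self.classifier(v3)                # logits; torch.softmax(...) at inference
```

Because the two branches only meet at the fusion layer, they share no parameters, which is what allows either branch to be detached later and used on its own.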
In the embodiment of the invention, the network parameters of the speech emotion recognition module 202 and the text emotion recognition module 204 of the network model 20 are obtained by one-shot training. The specific training process is as follows:
Training set data is input into the emotion recognition model 20 to obtain an emotion prediction result, where each training sample includes: speech, the text corresponding to the speech, and an emotion label;
The emotion prediction result is compared with the emotion label. When the emotion prediction result does not match the emotion label, a gradient descent algorithm with backpropagation is used to adjust the values of the network parameters of the speech emotion recognition module 202 and the text emotion recognition module 204 respectively; through multiple iterations, the training of the network parameters of the two modules is completed.
Specifically, each sample in the training set contains speech, the text corresponding to the speech, and an emotion label, in a format shaped like {"wav", "txt", "emotion label"}, where "wav" is a speech audio file in WAV format (other audio formats may also be used); "txt" is the text obtained from the speech by speech recognition, after manual review; and "emotion label" is the emotion polarity of the speech and its corresponding text, such as "happy", "sad", or "calm".
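For illustration only, one sample in this format might be represented as below; the file name, transcript, label value, and key names are invented placeholders consistent with the {"wav", "txt", "emotion label"} format described above.

```python
# One training sample in the {"wav", "txt", "emotion label"} format (all values invented):
sample = {
    "wav": "call_0001.wav",    # speech audio file, WAV format (other formats possible)
    "txt": "今天真是太开心了",    # text obtained by speech recognition, after manual review
    "label": "happy",          # emotion polarity, e.g. "happy" / "sad" / "calm"
}
```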
The training set data is input into the network model of the invention to obtain an emotion prediction result. The detailed process is as follows: the speech part "wav" of a training sample serves as the input of the speech emotion recognition module 202; the speech feature extraction layer extracts speech features such as MFCC features, and the first multi-layer Bi-LSTM layer then forms the speech emotion feature vector V1, a vector of dimension 1×M. The text part "txt" serves as the input of the text emotion recognition module 204; the text is first preprocessed (the preprocessing steps include word segmentation and word-vector generation), and the second multi-layer Bi-LSTM layer and the attention layer then form the text emotion feature vector V2, a vector of dimension 1×N. Then the fusion layer 208 merges V1 with V2 into the fused emotion feature vector V3, by element-wise addition or by concatenation. Finally, based on V3, the classification network layer (Softmax) outputs the emotion prediction result.
The emotion prediction result is compared with the emotion label. When they do not match, for example when the emotion prediction for the speech and its corresponding text is "happy" while the emotion label is "calm", a gradient descent algorithm with backpropagation is used to adjust the values of the network parameters of the speech emotion recognition module 202 and the text emotion recognition module 204 respectively; through multiple iterations, the training of the network parameters of both modules is completed. Training network parameters with gradient descent is prior art and not the focus of the invention, so it is not described in detail here.
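A minimal sketch of this one-shot training loop, assuming the illustrative EmotionRecognitionNet above, a data loader that yields batched MFCC tensors, word-vector tensors, and integer emotion labels, and cross-entropy as the concrete measure of the prediction/label mismatch (the specification says only that predictions are compared with labels and parameters are adjusted by gradient descent with backpropagation):

```python
import torch.optim as optim

def train(model, loader, epochs=10, lr=1e-3):
    opt = optim.SGD(model.parameters(), lr=lr)      # plain gradient descent
    loss_fn = nn.CrossEntropyLoss()                 # penalizes prediction/label mismatch
    for _ in range(epochs):                         # "multiple iterations"
        for mfcc, words, labels in loader:
            logits = model(mfcc=mfcc, words=words)  # forward through both parallel branches
            loss = loss_fn(logits, labels)
            opt.zero_grad()
            loss.backward()                         # one backprop pass updates BOTH modules
            opt.step()
```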
It should be noted that, owing to the unique structure of the emotion recognition network model 20 of the invention, namely that the speech emotion recognition module 202 and the text emotion recognition module 204 are connected in parallel, the backpropagation pass of training updates the network parameters of both modules simultaneously; the training of both modules' parameters is thus completed in a single, one-shot training run, making the training process simple and efficient and reducing the cost of collecting training data.
In addition, during training of the emotion recognition network model 20, the text emotion information contained in the training text also participates in updating the network parameters of the speech emotion recognition module 202, and the speech emotion information in the training speech likewise participates in updating the network parameters of the text emotion recognition module 204. The two networks can therefore learn more emotion feature information in their respective domains at the same time than a separately trained text emotion model or a separately trained speech emotion model in the prior art, so the network parameters converge better and the model's predictions are more accurate. On the other hand, since only the classification layer (Softmax) follows the fusion layer 208, and the speech emotion recognition module 202 and the text emotion recognition module 204 are connected in parallel, the two networks share no parameters; they are mutually independent and can be taken apart. The speech emotion recognition module 202 can be extracted on its own as an independent speech emotion recognition model whose network parameters contain prior text emotion information, that is, it has learned the emotion features of text while treating speech as the main feature, and its emotion recognition is more accurate than that of a separately trained speech emotion recognition model in the prior art. Similarly, the text emotion recognition module 204 can be extracted on its own as an independent text emotion recognition model whose network parameters contain prior speech emotion information, that is, it has learned the emotion features of speech while treating text as the main feature, and its emotion recognition is more accurate than that of a separately trained text emotion recognition model in the prior art.
The speech emotion recognition module 202 and the text emotion recognition module 204 of the emotion recognition network model 20 provided by the embodiments of the invention can be called individually or together, suiting the emotion recognition needs of several scenarios; at the same time, the network structure is simple, the training process is straightforward, and training set data is relatively easy to collect.
Referring to Fig. 2, Fig. 2 is a flowchart of an emotion recognition method provided by another embodiment of the invention. The emotion recognition method includes:
S100: obtain a target input, whose type is one of the following: speech input, text input, or speech with corresponding text input;
S200: according to the type of the target input, call the speech emotion recognition module 202 and/or the text emotion recognition module 204 of the network model 20 described in the above embodiments to perform emotion recognition;
S300: output the emotion recognition result of the target input.
Specifically, step S200 includes the following cases (a sketch of this dispatch is given after the list):
When the target input is speech input, the speech emotion recognition module 202 is called to perform emotion recognition;
When the target input is text input, the text emotion recognition module 204 is called to perform emotion recognition;
When the target input is speech with corresponding text input, the speech emotion recognition module 202 and the text emotion recognition module 204 are called simultaneously to perform emotion recognition.
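The dispatch of step S200 can be read as the following sketch, reusing the illustrative model above (extraction of MFCC frames and word vectors from the raw input is omitted):

```python
def recognize_emotion(model, mfcc=None, words=None):
    """Route the target input to one or both branches according to its type (sketch)."""
    if mfcc is not None and words is not None:   # speech with corresponding text
        logits = model(mfcc=mfcc, words=words)
    elif mfcc is not None:                       # speech-only input: module 202
        logits = model(mfcc=mfcc)
    elif words is not None:                      # text-only input: module 204
        logits = model(words=words)
    else:
        raise ValueError("target input must contain speech, text, or both")
    return logits.argmax(dim=-1)                 # index of the predicted emotion class
```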
Referring to Fig. 3, Fig. 3 is a hardware block diagram of an electronic device for an emotion recognition method provided by one embodiment of the invention.
The method embodiments provided in this application can be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, as shown in Fig. 3, the mobile terminal 10 may include one or more processors 102 (only one is shown; the processor 102 may include, but is not limited to, a processing unit such as a microcontroller (MCU) or a programmable logic device such as an FPGA) and a memory 104 for storing data; optionally, the mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108. Those skilled in the art will appreciate that the structure shown in Fig. 3 is only illustrative and does not limit the structure of the above mobile terminal; for example, the mobile terminal 10 may include more or fewer components than shown in Fig. 3, or have a different configuration.
The memory 104 can be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the emotion recognition method in the embodiments of the invention. By running the computer program stored in the memory 104, the processor 102 executes various functional applications and data processing, thereby realizing the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102; such remote memory may be connected to the mobile terminal 10 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. Specific examples of such a network may include the wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can connect to other network devices through a base station so as to communicate with the Internet. In another example, the transmission device 106 can be a Radio Frequency (RF) module used to communicate with the Internet wirelessly.
Those skilled in the art will readily recognize that the above preferred embodiments can be freely combined and superimposed provided they do not conflict.
It should be understood that the above embodiments are merely exemplary and not restrictive. Without departing from the basic principles of the invention, those skilled in the art can make various obvious or equivalent modifications or replacements to the above details, all of which fall within the scope of the claims of the invention.

Claims (10)

CN201910751541.0A | 2019-08-15 (priority) | 2019-08-15 (filed) | Emotion recognition network model, method and electronic equipment | Pending | CN110390956A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910751541.0A | 2019-08-15 | 2019-08-15 | Emotion recognition network model, method and electronic equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910751541.0A | 2019-08-15 | 2019-08-15 | Emotion recognition network model, method and electronic equipment

Publications (1)

Publication Number | Publication Date
CN110390956A (en) | 2019-10-29

Family

ID=68288786

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910751541.0A (Pending) | CN110390956A (en) | 2019-08-15 | 2019-08-15

Country Status (1)

Country | Link
CN (1) | CN110390956A (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
KR20120121298A (en)* | 2011-04-26 | 2012-11-05 | 한국과학기술원 | Assistive robot with emotional speech synthesizing function, method of synthesizing emotional speech for the assistive robot, and recording medium
JP2014106313A (en)* | 2012-11-27 | 2014-06-09 | Nippon Telegr & Teleph Corp <Ntt> | Spoken language analyzer and method and program for the same
CN105427869A (en)* | 2015-11-02 | 2016-03-23 | 北京大学 | Session emotion autoanalysis method based on depth learning
CN108305641A (en)* | 2017-06-30 | 2018-07-20 | 腾讯科技(深圳)有限公司 | The determination method and apparatus of emotion information
CN108039181A (en)* | 2017-11-02 | 2018-05-15 | 北京捷通华声科技股份有限公司 | The emotion information analysis method and device of a kind of voice signal
CN107729569A (en)* | 2017-11-17 | 2018-02-23 | 杭州师范大学 | A kind of social networks Forecasting Methodology of UNE structure and text message
CN108564942A (en)* | 2018-04-04 | 2018-09-21 | 南京师范大学 | One kind being based on the adjustable speech-emotion recognition method of susceptibility and system
CN108681562A (en)* | 2018-04-26 | 2018-10-19 | 第四范式(北京)技术有限公司 | Category classification method and system and Classification Neural training method and device
CN108763325A (en)* | 2018-05-04 | 2018-11-06 | 北京达佳互联信息技术有限公司 | A kind of network object processing method and processing device
CN108985358A (en)* | 2018-06-29 | 2018-12-11 | 北京百度网讯科技有限公司 | Emotion identification method, apparatus, equipment and storage medium
CN109614895A (en)* | 2018-10-29 | 2019-04-12 | 山东大学 | A multimodal emotion recognition method based on attention feature fusion
CN109933664A (en)* | 2019-03-12 | 2019-06-25 | 中南大学 | An improved method for fine-grained sentiment analysis based on sentiment word embedding
CN109948156A (en)* | 2019-03-13 | 2019-06-28 | 青海师范大学 | A Tibetan word vector representation method fused with component and word information
CN110083716A (en)* | 2019-05-07 | 2019-08-02 | 青海大学 | Multi-modal affection computation method and system based on Tibetan language

Cited By (14)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
WO2021068843A1 (en)* | 2019-10-08 | 2021-04-15 | 平安科技(深圳)有限公司 | Emotion recognition method and apparatus, electronic device, and readable storage medium
CN110909131A (en)* | 2019-11-26 | 2020-03-24 | 携程计算机技术(上海)有限公司 | Model generation method, emotion recognition method, system, device and storage medium
CN110910902A (en)* | 2019-12-04 | 2020-03-24 | 杭州哲信信息技术有限公司 | Mixed model speech emotion recognition method and system based on ensemble learning
CN110910902B (en)* | 2019-12-04 | 2022-09-06 | 杭州哲信信息技术有限公司 | Mixed model speech emotion recognition method and system based on ensemble learning
WO2021128741A1 (en)* | 2019-12-24 | 2021-07-01 | 深圳壹账通智能科技有限公司 | Voice emotion fluctuation analysis method and apparatus, and computer device and storage medium
CN111081279A (en)* | 2019-12-24 | 2020-04-28 | 深圳壹账通智能科技有限公司 | Voice emotion fluctuation analysis method and device
CN111081280A (en)* | 2019-12-30 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Text-independent speech emotion recognition method and device and emotion recognition algorithm model generation method
CN111179945A (en)* | 2019-12-31 | 2020-05-19 | 中国银行股份有限公司 | Voiceprint recognition-based safety door control method and device
CN111930940A (en)* | 2020-07-30 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Text emotion classification method and device, electronic equipment and storage medium
CN111930940B (en)* | 2020-07-30 | 2024-04-16 | 腾讯科技(深圳)有限公司 | Text emotion classification method and device, electronic equipment and storage medium
CN112017758A (en)* | 2020-09-15 | 2020-12-01 | 龙马智芯(珠海横琴)科技有限公司 | Emotion recognition method and device, emotion recognition system and analysis decision terminal
CN112017758B (en)* | 2020-09-15 | 2021-04-30 | 龙马智芯(珠海横琴)科技有限公司 | Emotion recognition method and device, emotion recognition system and analysis decision terminal
CN112733546A (en)* | 2020-12-28 | 2021-04-30 | 科大讯飞股份有限公司 | Expression symbol generation method and device, electronic equipment and storage medium
CN114420168A (en)* | 2022-02-14 | 2022-04-29 | 平安科技(深圳)有限公司 | Emotion recognition method, device, equipment and storage medium

Similar Documents

Publication | Title
CN110390956A (en) | Emotion recognition network model, method and electronic equipment
CN108305641B (en) | Method and device for determining emotion information
CN111177310B (en) | Intelligent scene conversation method and device for power service robot
CN108305643B (en) | Method and device for determining emotion information
CN110459210A (en) | Answering method, device, equipment and storage medium based on speech analysis
CN112233680B (en) | Speaker character recognition method, speaker character recognition device, electronic equipment and storage medium
CN109920414A (en) | Man-machine interrogation's method, apparatus, equipment and storage medium
CN108806667A (en) | The method for synchronously recognizing of voice and mood based on neural network
CN108711420A (en) | Multilingual hybrid model foundation, data capture method and device, electronic equipment
CN114596844A (en) | Acoustic model training method, voice recognition method and related equipment
CN109670166A (en) | Collection householder method, device, equipment and storage medium based on speech recognition
CN113223509A (en) | Fuzzy statement identification method and system applied to multi-person mixed scene
CN107464566A (en) | Audio recognition method and device
CN110992959A (en) | Voice recognition method and system
US20180308501A1 (en) | Multi speaker attribution using personal grammar detection
Palaskar et al. | Learned in speech recognition: Contextual acoustic word embeddings
CN116108856B (en) | Emotion recognition method and system based on long-short circuit cognition and explicit-implicit emotion interaction
CN110019741A (en) | Request-answer system answer matching process, device, equipment and readable storage medium storing program for executing
Amiriparian et al. | On the impact of word error rate on acoustic-linguistic speech emotion recognition: An update for the deep learning era
CN118134049A (en) | Conference decision support condition prediction method, device, equipment, medium and product
CN109003600B (en) | Message processing method and device
CN118260711A (en) | Multi-mode emotion recognition method and device
Barkur et al. | EnsembleWave: an ensembled approach for automatic speech emotion recognition
KR101941924B1 (en) | Method for providing association model based intention nano analysis service using cognitive neural network
CN113889149B (en) | Speech emotion recognition method and device

Legal Events

Code | Title | Description
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 2019-10-29
