Summary of the invention
The invention provides a call voice monitoring method and system that monitor a customer service agent's call voice comprehensively and in real time, so as to improve the service quality of service-side agents.
To this end, the invention provides the following technical solution:
A call voice monitoring method, comprising:
collecting, separately and in real time, the voice data of the service side and of the client side of a call;
performing speech recognition on the service-side and client-side voice data to obtain service-side recognized text and client-side recognized text, respectively;
monitoring the service-side call voice in real time according to the service-side call voice data and its recognized text, and the client-side call voice data and its recognized text.
Preferably, collecting the voice data of the service side and of the client side separately and in real time comprises:
collecting the service-side voice data in real time by recording directly from a physical sound card;
collecting the client-side voice data in real time by recording from a configured virtual sound card.
Preferably, monitoring the service-side call voice in real time comprises: monitoring whether the service-side call voice contains prohibited service phrases, and/or monitoring the validity of the agent's replies in the service-side call voice.
Preferably, monitoring whether the service-side call voice contains prohibited service phrases comprises:
converting the current service-side recognized text into a semantic vector;
calculating the distance between that semantic vector and the semantic vector of each prohibited phrase in a pre-built prohibited-phrase database;
if the distance is less than a set distance threshold, reminding the agent that a prohibited service phrase has been used.
Monitoring the validity of the agent's replies in the service-side call voice comprises:
looking up, in a pre-built answer library and according to the current client-side recognized text, the answer corresponding to the question raised by the client;
extracting the keywords of the answer;
checking the number of those keywords that appear in the current service-side recognized text;
if that number is less than a set quantity threshold, reminding the agent to pay attention to the validity of the reply.
Preferably, monitoring the validity of the agent's replies in the service-side call voice further comprises:
if the number is less than the set quantity threshold, sending the answer to the service side and displaying it there, so as to prompt the agent with the content of the correct answer.
Preferably, the method further comprises monitoring any one or more of the following properties of the service-side call voice: volume, speech rate, tone.
Monitoring the volume of the service-side call voice comprises:
calculating the average energy value of the current service-side voice data;
if the average energy value falls outside an energy reference range, reminding the agent to pay attention to the call volume.
Monitoring the speech rate of the service-side call voice comprises:
calculating the average speech rate of the current service-side voice data;
if the average speech rate falls outside a speech-rate reference range, reminding the agent to pay attention to the speech rate of the call.
Monitoring the tone of the service-side call voice comprises:
extracting stress-related features of the current service-side voice data, syllable by syllable;
detecting the stressed syllables of the current service-side call voice from the stress-related features, using a pre-trained stress detection model;
if the number of stressed syllables in the current service-side call voice data is greater than a stressed-syllable quantity threshold, reminding the agent to pay attention to the tone of the call.
Preferably, reminding the agent to pay attention to the call volume or speech rate comprises any one or more of the following:
converting the energy value or speech rate of the current service-side call voice data into a curve displayed in real time, with the portions of the curve that correspond to a volume or speech rate needing the agent's attention shown in a different color;
representing the energy value or speech rate of the current service-side call voice data as signal bars, and marking the bars that correspond to a volume or speech rate needing the agent's attention;
representing the energy value or speech rate of the current service-side call voice data as a color gradient, and marking the colors that correspond to an energy value or speech rate needing the agent's attention;
representing the energy value or speech rate of the current service-side call voice data as a bubble, the bubble bursting when the agent needs to pay attention to the volume or speech rate;
reminding the agent with a personalized sound.
Reminding the agent to pay attention to the tone of the call comprises:
displaying the stressed syllables in the current service-side call voice data; or
reminding the agent with a personalized sound.
A call voice monitoring system, comprising:
a first voice acquisition module, for collecting the voice data of the service side of a call in real time;
a second voice acquisition module, for collecting the voice data of the client side of the call in real time;
a speech recognition module, for performing speech recognition on the service-side call voice data collected by the first voice acquisition module and on the client-side call voice data collected by the second voice acquisition module, to obtain service-side recognized text and client-side recognized text, respectively;
a voice monitoring module, for monitoring the service-side call voice in real time according to the service-side call voice data and its recognized text, and the client-side call voice data and its recognized text.
Preferably, the first voice acquisition module is specifically configured to collect the service-side voice data in real time by recording directly from a physical sound card;
the second voice acquisition module is specifically configured to collect the client-side voice data in real time by recording from a configured virtual sound card.
Preferably, the voice monitoring module comprises a prohibited-phrase monitoring submodule and/or a reply-validity monitoring submodule;
the prohibited-phrase monitoring submodule is configured to monitor whether the service-side call voice contains prohibited service phrases;
the reply-validity monitoring submodule is configured to monitor the validity of the agent's replies in the service-side call voice.
Preferably, the prohibited-phrase monitoring submodule comprises:
a semantic vector conversion unit, for converting the current service-side recognized text into a semantic vector;
a distance calculation unit, for calculating the distance between that semantic vector and the semantic vector of each prohibited phrase in a pre-built prohibited-phrase database;
a prohibited-phrase reminding unit, for reminding the agent that a prohibited service phrase has been used when the distance is less than a set distance threshold.
The reply-validity monitoring submodule comprises:
an answer lookup unit, for looking up, in a pre-built answer library and according to the current client-side recognized text, the answer corresponding to the question raised by the client;
a keyword extraction unit, for extracting the keywords of the answer;
a quantity checking unit, for checking the number of those keywords that appear in the current service-side recognized text;
a reply-validity reminding unit, for reminding the agent to pay attention to the validity of the reply when that number is less than a set quantity threshold.
Preferably, the reply-validity reminding unit further comprises:
an answer sending unit, for sending the answer to the service side when the number is less than the set quantity threshold;
an answer display unit, for displaying the answer, so as to prompt the agent with the content of the correct answer.
Preferably, the system further comprises any one or more of the following modules: a volume monitoring module, a speech-rate monitoring module, a tone monitoring module;
the volume monitoring module is configured to monitor the volume of the service-side call voice;
the speech-rate monitoring module is configured to monitor the speech rate of the service-side call voice;
the tone monitoring module is configured to monitor the tone of the service-side call voice.
The volume monitoring module comprises:
an energy value calculation submodule, for calculating the average energy value of the current service-side voice data;
a volume reminding submodule, for reminding the agent to pay attention to the call volume when the average energy value falls outside an energy reference range.
The speech-rate monitoring module comprises:
a speech-rate calculation submodule, for calculating the average speech rate of the current service-side voice data;
a speech-rate reminding submodule, for reminding the agent to pay attention to the speech rate of the call when the average speech rate falls outside a speech-rate reference range.
The tone monitoring module comprises:
a stress-related feature extraction submodule, for extracting stress-related features of the current service-side voice data, syllable by syllable;
a stressed-syllable detection submodule, for detecting the stressed syllables of the current service-side call voice from the stress-related features, using a pre-trained stress detection model;
a tone reminding submodule, for reminding the agent to pay attention to the tone of the call when the number of stressed syllables in the current service-side call voice data is greater than a stressed-syllable quantity threshold.
Preferably, the volume reminding submodule or the speech-rate reminding submodule reminds the agent in any one or more of the following ways:
converting the energy value or speech rate of the current service-side call voice data into a curve displayed in real time, with the portions of the curve that correspond to a volume or speech rate needing the agent's attention shown in a different color;
representing the energy value or speech rate of the current service-side call voice data as signal bars, and marking the bars that correspond to a volume or speech rate needing the agent's attention;
representing the energy value or speech rate of the current service-side call voice data as a color gradient, and marking the colors that correspond to an energy value or speech rate needing the agent's attention;
representing the energy value or speech rate of the current service-side call voice data as a bubble, the bubble bursting when the agent needs to pay attention to the volume or speech rate;
reminding the agent with a personalized sound.
The tone reminding submodule is specifically configured to display the stressed syllables in the current service-side call voice data, or to remind the agent with a personalized sound.
The call voice monitoring method and system provided by the invention monitor the agent's call voice comprehensively and effectively. Different analyses are carried out for different factors, and once the results are available, reminders with different content are issued for the problems found, helping the agent correct them promptly and effectively. Call quality is thereby ensured and the service quality of the service side is improved.
Detailed description of the embodiments
To help those skilled in the art better understand the solutions of the embodiments of the invention, the embodiments are described in further detail below with reference to the drawings and implementations.
Figure 1 is a flowchart of the call voice monitoring method of an embodiment of the invention, which comprises the following steps:
Step 101: collect, separately and in real time, the voice data of the service side and of the client side of the call.
When collecting voice data, the service-side and client-side voice data may be collected in real time through two different recording channels. Specifically, the service-side voice data may be collected in real time by recording directly from a physical sound card, while a virtual sound card is installed on the client side and the client-side voice data is collected through that virtual sound card. Of course, other ways of separately collecting the service-side and client-side voice data in real time may also be used; the embodiments of the invention are not limited in this respect.
Step 102: perform speech recognition on the service-side and client-side voice data to obtain service-side recognized text and client-side recognized text, respectively.
It should be noted that, to improve recognition accuracy, a dialect-specific model may be used for speech recognition once the service side or the client side is detected to be speaking a dialect. Likewise, once the service side or the client side is detected to be speaking a foreign language, a model customized for that language may be used. The speech recognition method itself may be any existing technique; the embodiments of the invention are not limited in this respect.
Step 103: monitor the service-side call voice in real time according to the service-side call voice data and its recognized text, and the client-side call voice data and its recognized text.
To improve the agent's service quality, a corresponding reminder is issued when a problem arises in the agent's service, so that the agent can correct the problem promptly and effectively. In this way the service-side call voice is monitored in real time, comprehensively and effectively.
Specifically, monitoring the service-side call voice in real time may comprise: monitoring whether the service-side call voice contains prohibited service phrases, and/or monitoring the validity of the agent's replies in the service-side call voice.
Monitoring whether the service-side call voice contains prohibited service phrases may comprise the following steps:
(1) Convert the current service-side recognized text into a semantic vector.
Specifically, existing techniques may be used: the semantic vector of each word is obtained through training, and the semantic vectors of all words in the recognized text are then combined to obtain the semantic vector of the whole sentence. For example, a Sentence2Vec model may be used to convert the current service-side recognized text into a semantic vector. Other conversion methods are of course possible; the embodiments of the invention are not limited in this respect.
(2) Calculate the distance between that semantic vector and the semantic vector of each prohibited phrase in a pre-built prohibited-phrase database.
The distance may be the cosine distance or some other distance; the embodiments of the invention are not limited in this respect. The prohibited-phrase database may be built by collecting words, phrases or sentences that tend to provoke a negative emotional reaction in clients. These words, phrases or sentences may be stored together with their semantic vectors, or only the corresponding semantic vectors may be stored. In addition, the database may be updated continuously by adding newly appearing words, phrases or sentences that tend to provoke a negative emotional reaction in clients.
(3) If the distance is less than a set distance threshold, remind the agent that a prohibited service phrase has been used.
The main purpose of monitoring the service-side call voice for prohibited service phrases is to prevent the agent, when answering a client's questions or handling a complaint, from using statements that carry personal emotion or impatience, such as "Hurry up and say it, don't waste my time" or "Any more questions? I have other things to do", which lower the service quality.
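Steps (1) to (3) above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the sentence vector is formed by averaging word vectors (one simple way of "combining" them, standing in for a trained Sentence2Vec model), the distance is taken to be the cosine distance, and all function names, the toy word vectors and the threshold are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def sentence_vector(words: List[str], word_vectors: Dict[str, Vector]) -> List[float]:
    """Average the vectors of the known words (assumed combination rule)."""
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is empty or all zeros."""
    if not a or not b:
        return 1.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def closest_prohibited(
    utterance_vec: Vector, phrase_db: Dict[str, Vector], threshold: float
) -> Optional[Tuple[str, float]]:
    """Return (phrase, distance) for the nearest stored phrase whose distance
    is below the threshold, or None when no phrase is close enough."""
    best: Optional[Tuple[str, float]] = None
    for phrase, vec in phrase_db.items():
        d = cosine_distance(utterance_vec, vec)
        if d < threshold and (best is None or d < best[1]):
            best = (phrase, d)
    return best
```

A caller would build `phrase_db` once from the prohibited-phrase database, convert each newly recognized service-side sentence with `sentence_vector`, and issue the reminder whenever `closest_prohibited` returns a match.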
Monitoring the validity of the agent's replies in the service-side call voice may comprise the following steps:
(1) According to the current client-side recognized text, look up in a pre-built answer library the answer corresponding to the question raised by the client.
It should be noted that, when the answer library is built, the questions and answers in it may be expanded, for example by synonym expansion, hypernym expansion or hyponym expansion. For instance, the question "I want to sign up for CRBT" may be expanded to "I want to activate CRBT", "I want to use CRBT" and so on. Many other expansion methods exist; the embodiments of the invention are not limited in this respect.
(2) Extract the keywords of the answer.
(3) Check the number of those keywords that appear in the current service-side recognized text.
(4) If the number of keywords is less than a set quantity threshold, remind the agent to pay attention to the validity of the reply.
It should be noted that, if the number of keywords found in step (4) is less than the set quantity threshold, the answer may also be sent to the service side and displayed there, so as to prompt the agent with the content of the correct answer. This helps the agent find the answer to the client's question in time, avoids misunderstandings in communication and the negative emotional reactions they would cause in the client, and also reduces the time the agent spends thinking about the answer, making the service more efficient.
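The reply-validity check in steps (1) to (4) can be sketched as below. The sketch makes several assumptions that the text leaves open: the answer library is a plain dictionary looked up by exact (normalized) question text, keywords are the non-stopword tokens of the answer, and the stopword list, function names and sample data are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical stopword list; a real system would use its own keyword extractor.
STOPWORDS: Set[str] = {"a", "an", "the", "to", "by", "for", "of", "and", "is", "in", "on"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(answer: str) -> Set[str]:
    """Treat every non-stopword token of the answer as a keyword (assumption)."""
    return {t for t in tokenize(answer) if t not in STOPWORDS}


def check_reply(
    client_text: str, agent_text: str, answer_library: Dict[str, str], min_hits: int
) -> Optional[str]:
    """Return the library answer to show the agent when the reply contains
    fewer than min_hits answer keywords; return None when the reply is
    adequate or the question is not in the library."""
    answer = answer_library.get(client_text.strip().lower())
    if answer is None:
        return None
    hits = len(extract_keywords(answer) & set(tokenize(agent_text)))
    return answer if hits < min_hits else None
```

Returning the answer itself, rather than a boolean, matches the optional behaviour described above, in which the correct answer is sent to the service side and displayed whenever the keyword count falls below the threshold.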
In another embodiment of the invention, the call voice monitoring method may further comprise the following step:
monitoring the volume, speech rate and tone of the service-side call voice in real time; of course, only one or two of these three may be selected for monitoring.
Specifically, the volume of the service-side call voice may be monitored as follows: calculate the average energy value of the current service-side voice data, and if the average energy value falls outside an energy reference range, remind the agent to pay attention to the call volume.
The energy reference range may be determined by collecting in advance a large amount of service-side call voice data of good quality and computing the average energy of that data. This average is used as the energy reference value at which a client can hear comfortably, and the reference value is then extended upward and downward by a set percentage, for example 10%, to obtain the energy reference range. Other methods of obtaining the energy reference range are of course possible; the embodiments of the invention are not limited in this respect.
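The volume check and the derivation of the reference range can be illustrated with the short sketch below. It assumes that "energy" means the mean of the squared sample amplitudes, which the text does not specify; the function names and the alert labels are hypothetical, and the 10% margin is the example value given above.

```python
from typing import Optional, Sequence, Tuple


def average_energy(samples: Sequence[float]) -> float:
    """Mean of squared sample amplitudes (assumed definition of energy)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def energy_reference_range(
    good_recordings: Sequence[Sequence[float]], margin: float = 0.10
) -> Tuple[float, float]:
    """Average the energy of good-quality recordings and widen the result
    by the given fraction in both directions."""
    if not good_recordings:
        raise ValueError("at least one reference recording is required")
    energies = [average_energy(r) for r in good_recordings]
    reference = sum(energies) / len(energies)
    return reference * (1.0 - margin), reference * (1.0 + margin)


def volume_alert(samples: Sequence[float], low: float, high: float) -> Optional[str]:
    """Return 'too loud', 'too quiet', or None when the energy is in range."""
    energy = average_energy(samples)
    if energy > high:
        return "too loud"
    if energy < low:
        return "too quiet"
    return None
```

Returning distinct labels for the two cases lets the reminder say whether the volume is too high or too low, rather than only that it is out of range.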
The agent may be reminded separately when the volume is detected to be too high and when it is detected to be too low, so that the agent knows exactly what the problem in the current call is.
The agent may be reminded to pay attention to the call volume in any one or more of the following ways:
(1) Convert the energy value of the current service-side call voice data into a curve displayed in real time, with the portions of the curve that correspond to a volume needing the agent's attention shown in a different color.
(2) Represent the energy value of the current service-side call voice data as signal bars, and mark the bars that correspond to a volume needing the agent's attention.
(3) Represent the energy value of the current service-side call voice data as a color gradient, and mark the colors that correspond to an energy value needing the agent's attention.
(4) Represent the energy value of the current service-side call voice data as a bubble, the bubble bursting when the agent needs to pay attention to the volume.
(5) Remind the agent with a personalized sound.
It should be noted that other ways of reminding the agent to pay attention to the call volume or speech rate are also possible and are not listed one by one here; the embodiments of the invention are not limited in this respect.
Monitoring the volume of the service-side call voice prevents the agent's call volume from being too high or too low: an excessively high volume may damage the client's hearing, while an excessively low volume prevents the client from hearing what the agent says. The method provided by the embodiments of the invention avoids these problems by issuing a timely and effective reminder when the agent's call volume is detected to be too high or too low, helping the agent correct the problem.
Monitoring the speech rate of the service-side call voice may comprise: calculating the average speech rate of the current service-side voice data, and if the average speech rate falls outside a speech-rate reference range, reminding the agent to pay attention to the speech rate of the call.
Specifically, the speech-rate reference range may be determined by collecting in advance a large amount of service-side call voice data of good quality and computing the average number of words spoken per minute in that data. The resulting average is used as the standard speech rate for service-side calls, and the standard speech rate is then extended upward and downward, for example by several words slower or faster than the standard rate, to obtain the speech-rate reference range. Other methods of obtaining the speech-rate reference range are of course possible; the embodiments of the invention are not limited in this respect.
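A minimal sketch of the speech-rate check follows. It assumes the speech rate is measured in words per minute from the recognized word count and the duration of the audio, and that the reference range is the standard rate plus or minus a fixed number of words per minute; the function names and the alert labels are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def speech_rate(word_count: int, duration_seconds: float) -> float:
    """Words per minute for a stretch of recognized speech."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return word_count * 60.0 / duration_seconds


def rate_reference_range(
    good_calls: Sequence[Tuple[int, float]], margin_words: float
) -> Tuple[float, float]:
    """Average the rates of good-quality calls, given as (word_count,
    duration_seconds) pairs, and widen the result by margin_words
    words per minute in both directions."""
    if not good_calls:
        raise ValueError("at least one reference call is required")
    rates = [speech_rate(n, d) for n, d in good_calls]
    standard = sum(rates) / len(rates)
    return standard - margin_words, standard + margin_words


def rate_alert(rate: float, low: float, high: float) -> Optional[str]:
    """Return 'too fast', 'too slow', or None when the rate is in range."""
    if rate > high:
        return "too fast"
    if rate < low:
        return "too slow"
    return None
```

In a running system the word count would come from the service-side recognized text for the most recent window of audio, so the alert reflects the agent's current rate rather than the average over the whole call.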
The agent may be reminded separately when the speech rate is detected to be too fast and when it is detected to be too slow, so that the agent knows exactly what the problem with the speech rate is.
The agent may be reminded to pay attention to the speech rate in many ways, for example in ways similar to those described above for the call volume.
The method provided by the embodiments of the invention issues a timely and effective reminder when the agent's speech rate is detected to be too fast or too slow. This prevents the client from being unable to follow what is said, helps the agent correct the problem, and makes the conversation with the agent more comfortable for the client.
Monitoring the tone of the service-side call voice may comprise:
(1) Extract stress-related features of the current service-side voice data, syllable by syllable.
The stress-related features may include the fundamental frequency of the current syllable, the fundamental frequency of the syllable before it, the fundamental frequency of the syllable after it, the position of the current syllable in the sentence, and whether the current syllable should be stressed. Whether the current syllable should be stressed can be obtained by manually annotating a large amount of voice data in advance.
(2) Detect the stressed syllables of the current service-side call voice from the stress-related features, using a pre-trained stress detection model.
The stress detection model may be obtained by training on stress-related features extracted from a large amount of good-quality service-side call voice data. The model may take any form commonly used in statistics, such as a support vector machine or a neural network. It should be noted that there are many ways of training and representing the stress detection model, which are not listed one by one here; the embodiments of the invention are not limited in this respect.
(3) If the number of stressed syllables in the current service-side call voice data is greater than a stressed-syllable quantity threshold, remind the agent to pay attention to the tone of the call.
The agent may be reminded to pay attention to the tone of the call by displaying the stressed syllables in the current service-side call voice data or by a personalized sound. Other ways of reminding the agent are of course possible and are not listed one by one here; the embodiments of the invention are not limited in this respect.
Monitoring the tone of the service-side call voice helps the agent, when providing voice service to a client, avoid placing stress where it does not belong and thereby affecting the client's mood.
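The three tone-monitoring steps can be sketched as follows. This is an illustration only: the per-syllable fundamental frequencies are assumed to be already available, the manually annotated "should be stressed" item is omitted because it exists only for annotated data, and `placeholder_model` is a hypothetical hand-written rule standing in for the trained support vector machine or neural network described above.

```python
from typing import Callable, List, Sequence, Tuple

# (f0 of current syllable, f0 of previous, f0 of next, relative position in sentence)
Features = Tuple[float, float, float, float]


def syllable_features(f0_per_syllable: Sequence[float]) -> List[Features]:
    """Build one feature tuple per syllable; at the sentence boundaries the
    missing neighbour is replaced by the syllable's own f0."""
    n = len(f0_per_syllable)
    feats: List[Features] = []
    for i, f0 in enumerate(f0_per_syllable):
        prev_f0 = f0_per_syllable[i - 1] if i > 0 else f0
        next_f0 = f0_per_syllable[i + 1] if i < n - 1 else f0
        position = i / (n - 1) if n > 1 else 0.0
        feats.append((f0, prev_f0, next_f0, position))
    return feats


def placeholder_model(feat: Features) -> bool:
    """Stand-in for a trained detector: flags a syllable whose f0 is at
    least 20% above both neighbours. Not a trained model."""
    f0, prev_f0, next_f0, _ = feat
    return f0 >= 1.2 * prev_f0 and f0 >= 1.2 * next_f0


def tone_alert(
    f0_per_syllable: Sequence[float],
    model: Callable[[Features], bool],
    max_stressed: int,
) -> Tuple[bool, List[int]]:
    """Return (alert, indices of stressed syllables); the alert is raised
    when the number of stressed syllables exceeds max_stressed."""
    stressed = [i for i, feat in enumerate(syllable_features(f0_per_syllable)) if model(feat)]
    return len(stressed) > max_stressed, stressed
```

The returned indices make it possible to display the stressed syllables to the agent, which is one of the reminder methods described above.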
In the call voice monitoring method provided by the embodiments of the invention, the voice data of the service side and of the client side are collected separately in real time and subjected to speech recognition, and the service-side call voice is monitored in real time according to the service-side call voice data and its recognized text and the client-side call voice data and its recognized text. The agent's call voice can thus be monitored comprehensively and effectively. Different analyses are carried out for different factors, and once the results are available, reminders with different content are issued for the problems found, helping the agent correct them promptly and effectively. Call quality is thereby ensured and the service quality of the service side is improved.
Further, in another embodiment of the method, one or more of the volume, speech rate and tone of the service-side call voice may also be monitored, and the agent is reminded promptly when a problem arises, so that the service side can adjust in time and service quality is better ensured.
Correspondingly, the embodiments of the invention also provide a call voice monitoring system. Figure 2 is a structural diagram of the call voice monitoring system of an embodiment of the invention, which may comprise the following modules:
A first voice acquisition module 201, for collecting the voice data of the service side of a call in real time.
A second voice acquisition module 202, for collecting the voice data of the client side of the call in real time.
A speech recognition module 203, for performing speech recognition on the service-side call voice data collected by the first voice acquisition module 201 and on the client-side voice data collected by the second voice acquisition module 202, to obtain service-side recognized text and client-side recognized text, respectively.
A voice monitoring module 204, for monitoring the service-side call voice in real time according to the service-side call voice data and its recognized text, and the client-side call voice data and its recognized text.
The first voice acquisition module 201 may specifically collect the service-side voice data in real time by recording directly from a physical sound card.
The second voice acquisition module 202 may specifically collect the client-side voice data in real time by configuring a virtual sound card on the client side and recording from that virtual sound card.
The voice monitoring module 204 may specifically monitor whether the service-side call voice contains prohibited service phrases and/or monitor the validity of the agent's replies in the service-side call voice. Correspondingly, the voice monitoring module 204 may comprise a prohibited-phrase monitoring submodule and/or a reply-validity monitoring submodule. The prohibited-phrase monitoring submodule is configured to monitor whether the service-side call voice contains prohibited service phrases. The reply-validity monitoring submodule is configured to monitor the validity of the agent's replies in the service-side call voice.
The prohibited-phrase monitoring submodule may comprise:
A semantic vector conversion unit, for converting the current service-side recognized text into a semantic vector.
A distance calculation unit, for calculating the distance between that semantic vector and the semantic vector of each prohibited phrase in a pre-built prohibited-phrase database.
A prohibited-phrase reminding unit, for reminding the agent that a prohibited service phrase has been used when the distance is less than a set distance threshold.
The reply-validity monitoring submodule may comprise:
An answer lookup unit, for looking up, in a pre-built answer library and according to the current client-side recognized text, the answer corresponding to the question raised by the client.
A keyword extraction unit, for extracting the keywords of the answer.
A quantity checking unit, for checking the number of those keywords that appear in the current service-side recognized text.
A reply-validity reminding unit, for reminding the agent to pay attention to the validity of the reply when the number of keywords is less than a set quantity threshold.
It should be noted that the reply-validity reminding unit may further comprise an answer sending unit and an answer display unit. The answer sending unit sends the answer to the service side when the number of keywords is less than the set quantity threshold. The answer display unit displays the answer, so as to prompt the agent with the content of the correct answer. This helps the agent find the answer to the client's question in time, avoids misunderstandings in communication and the negative emotional reactions they would cause in the client, and also reduces the time the agent spends thinking about the answer, making the service more efficient.
In another embodiment of the invention, the call voice monitoring system may further comprise any one or more of the following modules:
A volume monitoring module, for monitoring the volume of the service-side call voice according to the service-side call voice data and the client-side call voice data.
A speech-rate monitoring module, for monitoring the speech rate of the service-side call voice according to the service-side call voice data and the client-side call voice data.
A tone monitoring module, for monitoring the tone of the service-side call voice according to the service-side call voice data and the client-side call voice data.
Specifically, the volume monitoring module may comprise:
An energy value calculation submodule, for calculating the average energy value of the current service-side voice data.
A volume reminding submodule, for reminding the agent to pay attention to the call volume when the average energy value falls outside an energy reference range.
The speech-rate monitoring module may comprise:
A speech-rate calculation submodule, for calculating the average speech rate of the current service-side voice data.
A speech-rate reminding submodule, for reminding the agent to pay attention to the speech rate of the call when the average speech rate falls outside a speech-rate reference range.
The volume reminding submodule or the speech-rate reminding submodule may remind the agent in any one or more of the following ways:
(1) Convert the energy value or speech rate of the current service-side call voice data into a curve displayed in real time, with the portions of the curve that correspond to a volume or speech rate needing the agent's attention shown in a different color.
(2) Represent the energy value or speech rate of the current service-side call voice data as signal bars, and mark the bars that correspond to a volume or speech rate needing the agent's attention.
(3) Represent the energy value or speech rate of the current service-side call voice data as a color gradient, and mark the colors that correspond to an energy value or speech rate needing the agent's attention.
(4) Represent the energy value or speech rate of the current service-side call voice data as a bubble, the bubble bursting when the agent needs to pay attention to the volume or speech rate.
(5) Remind the agent with a personalized sound.
Of course, the volume reminding submodule or the speech-rate reminding submodule may also remind the agent to pay attention to the call volume or speech rate in other ways, which are not listed one by one here; the embodiments of the invention are not limited in this respect.
The tone monitoring module may comprise:
A stress-related feature extraction submodule, configured to extract the stress-related features of the current service-side speech data in units of syllables.
A stressed syllable detection submodule, configured to detect the stressed syllables of the current service-side call voice from the stress-related features by means of a pre-trained stress detection model.
A tone reminding submodule, configured to remind the customer service staff to pay attention to the tone of the call when the number of stressed syllables in the current service-side call voice data is greater than the stressed syllable quantity threshold.
The tone reminding submodule may specifically remind the customer service staff by displaying the stressed syllables in the current service-side call voice data or by using a personalized voice.
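The counting step of the tone check can be sketched as follows. The pre-trained stress detection model is not specified in this description, so the sketch accepts any callable that classifies the features of one syllable; `toy_stress_model`, its feature names, and its cutoffs are a hypothetical stand-in used only to make the example runnable.

```python
def stressed_syllables(syllable_features, is_stressed):
    """Indices of the syllables that the supplied model classifies as stressed."""
    return [i for i, feats in enumerate(syllable_features) if is_stressed(feats)]


def tone_reminder(syllable_features, is_stressed, max_stressed):
    """Return (remind, indices): remind is True when the stressed count exceeds the threshold."""
    indices = stressed_syllables(syllable_features, is_stressed)
    return len(indices) > max_stressed, indices


def toy_stress_model(feats):
    # Hypothetical stand-in for the pre-trained stress detection model.
    return feats["energy"] > 0.6 and feats["duration"] > 0.2


if __name__ == "__main__":
    syllables = [
        {"energy": 0.9, "duration": 0.30},
        {"energy": 0.3, "duration": 0.15},
        {"energy": 0.8, "duration": 0.25},
    ]
    print(tone_reminder(syllables, toy_stress_model, 1))
```

The returned indices could be used to highlight the stressed syllables on the display, in line with the reminder by display described above.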
The call voice monitoring system provided by the embodiments of the present invention collects the speech data of the service side and of the client of a call separately and in real time, performs speech recognition on that data, and monitors the service-side call voice in real time according to the service-side call voice data and its recognized text together with the client call voice data and its recognized text. The system can thus monitor the call voice of the customer service staff comprehensively and effectively, perform different analyses for different factors, and, based on the analysis results, send reminders with different content for the problems found. This helps the staff correct problems promptly and effectively, thereby improving the service quality of the service side while ensuring call quality.
Further, the call voice monitoring system of the present invention may also monitor one or more of the volume, speech rate, and tone of the service-side call voice and remind the customer service staff promptly when a problem occurs, so that the service side can adjust in time and service quality is better ensured.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for relevant details reference may be made to the description of the method embodiment. The system embodiment described above is merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The embodiments of the present invention have been described in detail above, and specific embodiments have been used herein to explain the invention; the description of the above embodiments is only intended to help in understanding the method and system of the present invention. Meanwhile, those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and to the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.