
A kind of voice-based data processing method, device and electronic equipment

Info

Publication number
CN108962253A
CN108962253A (application CN201710384412.3A)
Authority
CN
China
Prior art keywords
data
text
voice
interrogation
text data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710384412.3A
Other languages
Chinese (zh)
Inventor
李明修
银磊
卜海亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201710384412.3A
Priority to PCT/CN2018/082702 (published as WO2018214663A1)
Publication of CN108962253A
Legal status: Pending

Abstract

Embodiments of the present invention provide a voice-based data processing method, device, and electronic equipment for completely recording an interrogation process. The method includes: obtaining interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation; performing recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user; and obtaining interrogation information according to the first text data and the second text data. With embodiments of the present invention, the statements of the doctor and of the patients during an interrogation can be distinguished automatically, the interrogation process can be recorded completely, and content such as case records can be compiled automatically, saving the time spent organizing interrogation records.

Description

A kind of voice-based data processing method, device and electronic equipment
Technical field
The present invention relates to the field of data processing technology, and more particularly to a voice-based data processing method, device, and electronic equipment.
Background art
Speech recognition usually converts speech into text. Traditional recording and speech recognition devices can only convert voice data into the corresponding text and cannot distinguish between speakers. Therefore, when multiple people speak, an effective record cannot be produced by speech recognition alone.
For example, in the actual diagnosis and treatment process of a hospital, at least two people communicate, i.e., at least a doctor and a patient, and sometimes the patient's family members are present as well. Existing speech recognition equipment cannot attribute the collected interrogation records to the corresponding speakers, so the entire interrogation process cannot be recorded comprehensively.
Summary of the invention
Embodiments of the present invention provide a voice-based data processing method for completely recording an interrogation process.
Correspondingly, embodiments of the invention also provide a voice-based data processing device, an electronic device, and a readable storage medium, to guarantee the implementation and application of the above method.
To solve the above problems, an embodiment of the invention discloses a voice-based data processing method, comprising: obtaining interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation; performing recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user; and obtaining interrogation information according to the first text data and the second text data.
Optionally, the interrogation process data is voice data, and performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data comprises: separating first voice data and second voice data from the voice data according to voiceprint features; and performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Optionally, separating the first voice data and the second voice data from the voice data according to voiceprint features comprises: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to voiceprint features.
Optionally, determining the first voice data and the second voice data from the voice segments according to voiceprint features comprises: matching each voice segment against a reference voiceprint feature, wherein the reference voiceprint feature is the voiceprint feature of the target user; obtaining the voice segments consistent with the reference voiceprint feature to obtain the corresponding first voice data; and obtaining the voice segments not consistent with the reference voiceprint feature to obtain the corresponding second voice data.
Optionally, determining the first voice data and the second voice data from the voice segments according to voiceprint features comprises: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments, and generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments not belonging to the first voice data.
Optionally, performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data comprises: performing speech recognition on each voice segment in the first voice data, and generating the first text data from the text fragments obtained by recognition; and performing speech recognition on each voice segment in the second voice data, and generating the second text data from the text fragments obtained by recognition. Obtaining interrogation information according to the first text data and the second text data then comprises: sorting the text fragments according to the time order of the voice segments to which the text fragments in the first text data and the second text data respectively correspond, to obtain the interrogation information.
Optionally, the interrogation process data is a text recognition result recognized from voice data, and performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data comprises: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
Optionally, performing feature recognition on the text recognition result and separating the first text data and the second text data according to language features comprises: dividing the text recognition result to obtain corresponding text fragments; recognizing the text fragments with a preset model to determine the language features of the text fragments, the language features including a target-user language feature and a non-target-user language feature; and generating the first text data from the text fragments having the target-user language feature, and generating the second text data from the text fragments having the non-target-user language feature.
An embodiment of the invention also discloses a voice-based data processing device, comprising: a data acquisition module for obtaining interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation; a text recognition module for performing recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user; and an information determination module for obtaining interrogation information according to the first text data and the second text data.
Optionally, the interrogation process data is voice data, and the text recognition module comprises: a separation submodule for separating first voice data and second voice data from the voice data according to voiceprint features; and a speech recognition submodule for performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Optionally, the separation submodule is configured to divide the voice data into multiple voice segments, and to determine the first voice data and the second voice data from the voice segments according to voiceprint features.
Optionally, the separation submodule is configured to match each voice segment against a reference voiceprint feature, wherein the reference voiceprint feature is the voiceprint feature of the target user; to obtain the voice segments consistent with the reference voiceprint feature to obtain the corresponding first voice data; and to obtain the voice segments not consistent with the reference voiceprint feature to obtain the corresponding second voice data.
Optionally, the separation submodule is configured to identify the voiceprint feature of each voice segment; to count, for each voiceprint feature, the voice segments having that feature; to generate the first voice data from the voice segments of the voiceprint feature with the largest count, wherein the voiceprint feature with the largest count is the voiceprint feature of the target user; and to generate the second voice data from the remaining voice segments.
Optionally, the speech recognition submodule is configured to perform speech recognition on each voice segment in the first voice data and generate the first text data from the text fragments obtained by recognition, and to perform speech recognition on each voice segment in the second voice data and generate the second text data from the text fragments obtained by recognition. The information determination module is configured to sort the text fragments according to the time order of the voice segments to which the text fragments in the first text data and the second text data respectively correspond, to obtain the interrogation information.
Optionally, the interrogation process data is a text recognition result recognized from voice data, and the text recognition module is configured to perform feature recognition on the text recognition result and to separate the first text data and the second text data according to language features.
Optionally, the text recognition module comprises: a fragment division submodule for dividing the text recognition result to obtain corresponding text fragments; a fragment recognition submodule for recognizing the text fragments with a preset model to determine the language features of the text fragments, the language features including a first language feature and a second language feature; and a text generation submodule for generating the first text data from the text fragments having the first language feature and generating the second text data from the text fragments having the second language feature.
An embodiment of the invention also discloses a readable storage medium. When the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the voice-based data processing method described in one or more of the embodiments of the present invention.
Optionally, an electronic device comprises a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for: obtaining interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation; performing recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user; and obtaining interrogation information according to the first text data and the second text data.
Optionally, the interrogation process data is voice data, and performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data comprises: separating first voice data and second voice data from the voice data according to voiceprint features; and performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Optionally, separating the first voice data and the second voice data from the voice data according to voiceprint features comprises: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to voiceprint features.
Optionally, determining the first voice data and the second voice data from the voice segments according to voiceprint features comprises: matching each voice segment against a reference voiceprint feature, wherein the reference voiceprint feature is the voiceprint feature of the target user; obtaining the voice segments consistent with the reference voiceprint feature to obtain the corresponding first voice data; and obtaining the voice segments not consistent with the reference voiceprint feature to obtain the corresponding second voice data.
Optionally, determining the first voice data and the second voice data from the voice segments according to voiceprint features comprises: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments, and generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments not belonging to the first voice data.
Optionally, performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data comprises: performing speech recognition on each voice segment in the first voice data, and generating the first text data from the text fragments obtained by recognition; and performing speech recognition on each voice segment in the second voice data, and generating the second text data from the text fragments obtained by recognition. Obtaining interrogation information according to the first text data and the second text data then comprises: sorting the text fragments according to the time order of the voice segments to which the text fragments in the first text data and the second text data respectively correspond, to obtain the interrogation information.
Optionally, the interrogation process data is a text recognition result recognized from voice data, and performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data comprises: performing feature recognition on the text recognition result, and separating the first text data and the second text data according to language features.
Optionally, performing feature recognition on the text recognition result and separating the first text data and the second text data according to language features comprises: dividing the text recognition result to obtain corresponding text fragments; recognizing the text fragments with a preset model to determine the language features of the text fragments, the language features including a target-user language feature and a non-target-user language feature; and generating the first text data from the text fragments having the target-user language feature, and generating the second text data from the text fragments having the non-target-user language feature.
Embodiments of the present invention include the following advantages:
By obtaining interrogation process data determined from voice collected during an interrogation, embodiments of the present invention can recognize from the interrogation process data first text data and second text data attributed to different users, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user, so the statements of the doctor and of the patients during the interrogation can be distinguished automatically. Interrogation information is then obtained according to the first text data and the second text data. The interrogation process can thus be recorded completely, and content such as case records can be compiled automatically, saving the time spent organizing interrogation records.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of a voice-based data processing method embodiment of the invention;
Fig. 2 is a flow chart of the steps of another voice-based data processing method embodiment of the invention;
Fig. 3 is a flow chart of the steps of yet another voice-based data processing method embodiment of the invention;
Fig. 4 is a structural block diagram of a voice-based data processing device embodiment of the invention;
Fig. 5 is a structural block diagram of another voice-based data processing device embodiment of the invention;
Fig. 6 is a structural block diagram of an electronic device for voice-based data processing according to an exemplary embodiment of the invention;
Fig. 7 is a structural schematic diagram of an electronic device for voice-based data processing according to another exemplary embodiment of the invention.
Specific embodiment
In order to make the foregoing objectives, features, and advantages of the present invention clearer and more comprehensible, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of a voice-based data processing method embodiment of the invention is shown; the method may specifically include the following steps:
Step 102: obtain interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation.
During an interrogation, voice can be collected from the interrogation process by various electronic devices, and the interrogation process data is obtained based on the collected voice data; that is, the interrogation process data may be the collected voice data itself, or may be the text recognition result obtained by converting the collected voice data. Embodiments of the present invention can thus perform recognition on data collected from various interrogation processes.
Step 104: perform recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user.
The interrogation process data can be recognized with different recognition methods according to its data type: for example, voice data can be processed by means of voiceprint features, speech recognition, and the like, while text data can be recognized by text features, so as to obtain the first text data and the second text data distinguished by user. At least two users communicate during the interrogation: one user is the doctor, and the other users are the patient, the patient's family members, and so on. For example, if the data is collected over a doctor's one-day outpatient service, it will include one doctor and several patients, and possibly one or several family members. The doctor can therefore be taken as the target user of the interrogation record, so that the first text data is the interrogation text data corresponding to the doctor, while the text data of at least one other user serves as the second text data, i.e., the interrogation text data corresponding to the patients and family members.
Step 106: obtain interrogation information according to the first text data and the second text data.
Since an interrogation is usually a question-and-answer process, the first text data and the second text data may each be composed of multiple text fragments, so the interrogation information can be obtained based on the time of each text fragment and the user to which it corresponds.
An example of interrogation information is as follows:
2017-4-23 10:23AM
Doctor A: What symptoms do you have?
Patient B: My XXX is uncomfortable.
Doctor A: Do you have XXX?
Patient B: Yes.
……
In actual processing, the hospital's outpatient records and the like may also be consulted to obtain patient information, so that different patients can be distinguished in the interrogation information.
In conclusion, for interrogation process data determined from voice collected during an interrogation, first text data and second text data can be recognized from the interrogation process data according to different users, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user, so the statements of the doctor and of the patients during the interrogation can be distinguished automatically. Interrogation information is then obtained according to the first text data and the second text data. The interrogation process can thus be recorded completely, and content such as case records can be compiled automatically, saving the time spent organizing interrogation records.
In embodiments of the present invention, the interrogation process data includes voice data and/or the text recognition result recognized from voice data. Different types of interrogation process data are recognized in different ways, so the embodiments of the present invention discuss the processing of each type of interrogation process data separately.
Referring to Fig. 2, a flow chart of the steps of another voice-based data processing method embodiment of the invention is shown. In this embodiment, the interrogation process data is voice data. The method may specifically include the following steps:
Step 202: obtain interrogation process data, the interrogation process data being the voice data collected during the interrogation.
During an interrogation, voice data can be collected from the interrogation process by various electronic devices, for example by recording audio with devices such as a recording pen, a mobile phone, or a computer, to obtain the voice data collected during the interrogation. The voice data may be collected during a single outpatient visit, or may be the voice data collected over multiple outpatient visits of one doctor; embodiments of the present invention impose no restriction on this. The voice data therefore includes the voice data of one doctor and the voice data of at least one patient, and may also include the voice data of at least one family member.
Here, the above step 104 of performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data may include the following steps 204-206.
Step 204: separate first voice data and second voice data from the voice data according to voiceprint features.
A voiceprint is the sound-wave spectrum, carrying verbal information, that an electro-acoustic instrument displays. Voiceprints are both specific and stable: after adulthood, a person's voiceprint remains relatively stable for a long time, so different people can be identified by their voiceprints. Voice data can therefore be recognized by voiceprint features to determine the voice segments corresponding to different users (voiceprint features) in the voice data, so as to obtain the first voice data of the target user and the second voice data of the other users.
Here, separating the first voice data and the second voice data from the voice data according to voiceprint features includes: dividing the voice data into multiple voice segments; and determining the first voice data and the second voice data from the voice segments according to voiceprint features.
Specifically, the voice data can be divided into multiple voice segments. The division may follow a voice division rule, for example dividing at the pauses between sounds; alternatively, the voiceprint feature corresponding to each sound can be determined, and the segments divided according to the different voiceprint features. One piece of voice data can thus be divided into multiple voice segments that stand in sequential order, and different voice segments may have the same or different voiceprint features. Based on the voiceprint features it is then determined whether each voice segment belongs to the first voice data or the second voice data: the voiceprint feature of each voice segment is determined, the voice segments having the target user's voiceprint feature are composed into the first voice data, and the remaining voice segments are composed into the second voice data.
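To make the pause-based division rule concrete, here is a minimal Python sketch, assuming single-channel 16 kHz audio held in a NumPy array normalized to [-1, 1]; the frame length, energy threshold, and minimum pause length are illustrative assumptions, not values specified in this disclosure.

```python
import numpy as np

def split_on_pauses(samples, rate=16000, frame_ms=30,
                    energy_thresh=1e-4, min_pause_frames=10):
    """Divide voice data into voice segments at the pauses between sounds.

    Returns a list of (start_time, end_time) tuples in seconds.
    Assumes `samples` is a 1-D float array normalized to [-1, 1].
    """
    frame_len = rate * frame_ms // 1000
    n_frames = len(samples) // frame_len
    # Mean energy per frame; frames below the threshold count as silence.
    energy = np.array([np.mean(samples[i*frame_len:(i+1)*frame_len] ** 2)
                       for i in range(n_frames)])
    voiced = energy >= energy_thresh

    segments, start, silence_run = [], None, 0
    for i, v in enumerate(voiced):
        if v:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_pause_frames:  # a long enough pause ends a segment
                segments.append((start * frame_len / rate,
                                 (i - silence_run + 1) * frame_len / rate))
                start, silence_run = None, 0
    if start is not None:  # flush the final segment
        segments.append((start * frame_len / rate, n_frames * frame_len / rate))
    return segments
```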
In an embodiment of the present invention, before voice data is collected during the interrogation, a segment of the doctor's (target user's) voice can first be collected as reference data, in order to identify the doctor's voiceprint feature, i.e., the reference voiceprint feature, from the reference data. A speech recognition model can also be provided: after the voice data is input into the speech recognition model, the voice segments that match the reference voiceprint data can be separated from the voice segments with other voiceprint features, so as to obtain the voice segments of the target user and the voice segments of the other users. In a doctor's outpatient procedure, the compiled case information usually involves only one doctor while there may be multiple patients, so a large number of case samples corresponding to a particular doctor can be obtained in the above way.
In an alternative embodiment of the invention, the voiceprint feature of the target user can be collected in advance as the reference voiceprint feature, with which the voice data is divided. That is, determining the first voice data and the second voice data from the voice segments according to voiceprint features includes: matching each voice segment against the reference voiceprint feature, wherein the reference voiceprint feature is the voiceprint feature of the target user; obtaining the voice segments consistent with the reference voiceprint feature to obtain the corresponding first voice data; and obtaining the voice segments not consistent with the reference voiceprint feature to obtain the corresponding second voice data. For a target user such as a doctor, voice data can be collected in advance to extract the voiceprint feature, which serves as the reference voiceprint feature. For new voice data of the target user, each voice segment is then matched against the reference voiceprint feature to determine whether the segment's voiceprint feature is consistent with it. If consistent, the voice segment is considered to match the reference voiceprint feature and is added to the first voice data (the voice data corresponding to the target user). If the voiceprint feature of a voice segment is inconsistent with the reference voiceprint feature, the segment does not match and is added to the second voice data (the voice data corresponding to the non-target users). The first voice data and the second voice data are thus composed of the corresponding voice segments, each voice segment also retaining its ordinal position, which facilitates accurately determining the interrogation information later.
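A sketch of the reference-voiceprint matching described above could look as follows; the speaker-embedding function, cosine similarity, and the 0.7 threshold are illustrative assumptions standing in for whichever voiceprint model is actually used.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two voiceprint feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def separate_by_reference(segments, reference_voiceprint, embed, threshold=0.7):
    """Match each voice segment against the reference voiceprint feature.

    segments:              voice segments in time order
    reference_voiceprint:  feature vector extracted in advance from the
                           target user's (doctor's) enrollment recording
    embed:                 any callable mapping a segment to a voiceprint
                           feature vector (a speaker-embedding model)

    Segments consistent with the reference voiceprint form the first voice
    data; the rest form the second voice data. Order is preserved.
    """
    first_voice_data, second_voice_data = [], []
    for seg in segments:
        if cosine(embed(seg), reference_voiceprint) >= threshold:
            first_voice_data.append(seg)
        else:
            second_voice_data.append(seg)
    return first_voice_data, second_voice_data
```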
In another alternative embodiment of the invention, the voice data can also be divided according to the number of voice segments corresponding to each voiceprint feature. That is, determining the first voice data and the second voice data from the voice segments according to voiceprint features includes: identifying the voiceprint feature of each voice segment; counting the number of voice segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of voice segments and generating the first voice data from the voice segments corresponding to that voiceprint feature; and generating the second voice data from the voice segments not belonging to the first voice data. Owing to the nature of the interrogation process, the interrogation process data may be the recorded data of a doctor's multiple outpatient visits. In this process the doctor tends to occupy more of the time, exchanging interrogations with different patients and their family members, i.e., the doctor (target user) contributes the most voice in the voice data. The target user and the other users can therefore be distinguished by the number of voice segments corresponding to each user, yielding the first voice data and the second voice data. Specifically, the voiceprint features in the voice segments are identified to determine the voiceprint feature contained in each segment; the number of voice segments corresponding to each voiceprint feature is counted; the voiceprint feature with the largest number of segments is determined and taken as the voiceprint feature of the target user, the other voiceprint features being those of the other users; the voice segments with the target user's voiceprint feature are then composed in order into the first voice data, and the other voice segments (those not belonging to the first voice data) are composed in order into the second voice data.
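The majority-count heuristic itself reduces to a frequency count, as in this sketch, assuming each voice segment has already been assigned a voiceprint label (for example by clustering the voiceprint feature vectors; the clustering step is outside the sketch):

```python
from collections import Counter

def separate_by_count(segments, voiceprint_labels):
    """Take the voiceprint with the most segments as the target user (the
    doctor speaks most across the outpatient recordings) and split the
    segments accordingly, preserving their time order."""
    counts = Counter(voiceprint_labels)
    target_label, _ = counts.most_common(1)[0]  # most frequent voiceprint
    first_voice_data = [s for s, l in zip(segments, voiceprint_labels)
                        if l == target_label]
    second_voice_data = [s for s, l in zip(segments, voiceprint_labels)
                         if l != target_label]
    return first_voice_data, second_voice_data
```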
In embodiments of the present invention, since the voice data is collected in a scene where multiple people converse, one voice segment may contain the voiceprint features of multiple users. For the case where multiple voiceprint features are identified in one voice segment: when the different voiceprint features occur at different times, then if a voiceprint feature belongs to another user, the voice segment can be added to the second voice data; and if the voiceprint features include both the target user's voiceprint feature and other users' voiceprint features, the voice segment can be subdivided into sub-segments, which are then added to the corresponding voice data. When the different voiceprint features occur at the same time, i.e., at least two users are speaking at the same time, then if the voiceprint features belong to other users, the voice segment can be added to the second voice data; and if the voiceprint features include both the target user's voiceprint feature and other users' voiceprint features, the segment can be assigned as required, for example classified as a voice segment of the target user to obtain the first voice data, classified as a voice segment of the other users to obtain the second voice data, or added to the voice data of both kinds of users.
Step 206: perform speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
After the first voice data and the second voice data are obtained, the two kinds of voice data can be recognized respectively, to obtain the first text data of the target user and the second text data of the other users.
In an alternative embodiment, performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data includes: performing speech recognition on each voice segment in the first voice data, and generating the first text data from the text fragments obtained by recognition; and performing speech recognition on each voice segment in the second voice data, and generating the second text data from the text fragments obtained by recognition. Recognizing each voice segment of the first voice data yields the text data corresponding to that segment, so the first text data is composed according to the order of the voice segments; the second text data is obtained in the corresponding way. Since the doctor's questions and the patient's answers during the interrogation are all sequential, the corresponding time order is recorded when the voice data is divided into voice segments, so the resulting first text data and second text data also retain the ordinal relation, which facilitates accurately compiling the interrogation information later.
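A sketch of this per-segment recognition follows, with the speech-to-text engine left abstract as a callable (a hypothetical stand-in, not an API named in this disclosure); each text fragment keeps the start time of its voice segment so the ordinal relation survives recognition.

```python
def voice_data_to_text_data(voice_data, speaker, recognize):
    """Recognize each voice segment and keep its start time and speaker.

    voice_data: list of (start_time, samples) tuples for one speaker,
                in time order
    recognize:  any callable mapping segment samples to text
                (a speech-to-text engine)

    Returns a list of (start_time, speaker, text) fragments.
    """
    return [(start, speaker, recognize(samples))
            for start, samples in voice_data]
```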
Step 208: obtain interrogation information according to the first text data and the second text data.
According to the time order of the voice segments corresponding to the first text data and the second text data, the text fragments in the first text data and those in the second text data can be sorted in the corresponding order, such as time order, to obtain the corresponding interrogation information. The interrogation information can record the doctor's questions in the interrogation and the corresponding answers of the patient (or family members), as well as various information such as the doctor's diagnosis and medical advice.
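Given fragments tagged with segment start times as above, merging the first and second text data into interrogation information reduces to a sort, as in this sketch (the speaker labels and times are illustrative):

```python
def build_interrogation_info(first_text_data, second_text_data):
    """Interleave the doctor's and the patients' text fragments by the time
    order of their source voice segments, yielding a dialog transcript."""
    fragments = sorted(first_text_data + second_text_data)  # sorts by start time
    return "\n".join(f"{speaker}: {text}" for _, speaker, text in fragments)

# Example with illustrative fragments:
doctor = [(0.0, "Doctor A", "What symptoms do you have?"),
          (9.5, "Doctor A", "Do you have XXX?")]
patient = [(4.2, "Patient B", "My XXX is uncomfortable."),
           (12.8, "Patient B", "Yes.")]
print(build_interrogation_info(doctor, patient))
```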
Step 210: analyze the interrogation information to obtain a corresponding analysis result, the analysis result being related to disease diagnosis.
After the interrogation information is compiled, embodiments of the present invention can also analyze the interrogation information as required to obtain a corresponding analysis result. Since the interrogation is related to disease diagnosis, the analysis result is also related to disease diagnosis, the specifics being determined by the analysis requirements.
For example, the questions doctors commonly ask about each kind of disease can be counted and provided as a reference to less experienced doctors; the interrogation information can be analyzed to develop an artificial intelligence question-answering system for traditional Chinese medicine (or Western medicine); and the symptoms, treatment methods, and the like corresponding to each kind of disease can be determined by statistics, analysis, and similar means.
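As one concrete form of such statistics, the following sketch counts the questions doctors most commonly ask per disease, assuming compiled interrogation records already carry a diagnosed-disease label and the doctor's question sentences (both assumptions for illustration):

```python
from collections import Counter, defaultdict

def common_questions_by_disease(records, top_n=5):
    """records: iterable of (disease, doctor_questions) pairs, where
    doctor_questions is the list of question sentences the doctor asked
    in one interrogation. Returns the top-N questions per disease."""
    per_disease = defaultdict(Counter)
    for disease, questions in records:
        per_disease[disease].update(questions)
    return {disease: counter.most_common(top_n)
            for disease, counter in per_disease.items()}
```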
Referring to Fig. 3, a flow chart of the steps of yet another voice-based data processing method embodiment of the invention is shown. In this embodiment, the interrogation process data is the text recognition result recognized from voice data. The method may specifically include the following steps:
Step 302: obtain the text recognition result recognized from voice data.
The voice data is collected during the interrogation, and the collected voice data is converted into a text recognition result by speech recognition; the text recognition result can be obtained directly.
Here, the above step 104 of performing recognition according to the interrogation process data to obtain the corresponding first text data and second text data may include the following step 304.
Step 304: perform feature recognition on the text recognition result, and separate the first text data and the second text data according to language features.
For data that has been recognized as text, it is unknown which person said each passage, so the text cannot serve directly as interrogation information. Embodiments of the present invention therefore identify the different users from the text recognition result and compile the interrogation information. During an interrogation, the doctor usually asks about symptoms, the user replies with symptom manifestations, and the doctor diagnoses the corresponding disease and states the examinations to be done, the drugs needed, and so on. Based on these features, the sentences of the doctor and of the patient can be identified from the text recognition result, and the first text data and the second text data can then be separated.
That is, embodiments of the present invention can collect in advance the texts of doctors' interrogations and the texts of patients' interrogations, together with the interrogation information already analyzed for each, so as to derive by statistics the language features of the doctor (i.e., the target user) and the language features of the patients and their family members (i.e., the other users), and to establish a corresponding model, so that the texts of different users can be distinguished based on these language features. The language features of different users can be determined, and the preset model established, by means such as machine learning and probability statistics.
Here, embodiments of the present invention can obtain a large number of separated case texts as training data. A separated case text is interrogation information in which the target user and the other users have been identified, such as text information obtained from past recognition. The doctor content data it contains (the first text data of the target user) and the patient content data (the second text data of the other users) can be trained on separately to obtain a doctor content model and a patient content model; the two models can of course also be combined into one preset model, based on which the sentences of the doctor and the sentences of the patient can be recognized.
For example, in the case information obtained from interrogations, doctor content is generally a question containing symptom-class vocabulary, such as 'How do you feel?', 'What symptoms do you have?', or 'Where is it uncomfortable?'; patient content is generally a question containing symptom manifestations or disease-class vocabulary, such as 'Have I caught a cold?' or 'Is it XX disease?'; and doctor content also generally includes declarative sentences containing symptoms and drugs, such as 'You have viral influenza' or 'You can take some XX medicine'. The sentence content of the doctor and that of the patient thus both have fairly distinctive language features, so the doctor content model and the patient content model can be trained from the separated case information.
Performing feature recognition on the text recognition result and separating the first text data and the second text data according to language features includes: dividing the text recognition result to obtain corresponding text fragments; recognizing the text fragments with the preset model to determine the language features of the text fragments, the language features including a first language feature and a second language feature; and generating the first text data from the text fragments having the first language feature, and generating the second text data from the text fragments having the second language feature. The text recognition result can first be divided into sentences according to Chinese sentence features, or divided into multiple text fragments in other ways. Each text fragment is then input into the preset model in order, and the preset model recognizes the text fragments, so that the language feature of each text fragment can be identified. The preset model may of course also be configured to assign an owning user to each text fragment based on the identified language feature. Taking the language feature of the target user as the first language feature and the language feature of the other users as the second language feature, the preset model can determine whether a text fragment has the first language feature or the second language feature. Then, following the division order of the text fragments, the first text data can be generated from the text fragments having the first language feature, and the second text data from the text fragments having the second language feature.
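A minimal sketch of such a preset model follows, assuming separated case texts are available as labeled sentences; character n-grams and a naive Bayes classifier (via scikit-learn) are illustrative choices, since this disclosure only requires that the model be built by machine learning or probability statistics.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Illustrative separated case texts: label 1 = doctor (first language
# feature), label 0 = patient/family (second language feature).
train_sentences = ["你感觉怎么样", "有什么症状", "你是病毒性流感",
                   "我是不是感冒了", "我头疼", "是不是XX病"]
train_labels = [1, 1, 1, 0, 0, 0]

# Character n-grams avoid needing a Chinese word segmenter here.
preset_model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 2)),
    MultinomialNB(),
)
preset_model.fit(train_sentences, train_labels)

def separate_text_data(text_fragments):
    """Assign each text fragment a language feature with the preset model,
    then compose the first and second text data in fragment order."""
    labels = preset_model.predict(text_fragments)
    first_text_data = [t for t, l in zip(text_fragments, labels) if l == 1]
    second_text_data = [t for t, l in zip(text_fragments, labels) if l == 0]
    return first_text_data, second_text_data
```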
Step 306: obtain interrogation information according to the first text data and the second text data.
Step 308: analyze the interrogation information to obtain a corresponding analysis result, the analysis result being related to disease diagnosis.
According to the order of the fragments in the first text data and the second text data, the text fragments in the first text data and those in the second text data can be sorted in the corresponding order to obtain the corresponding interrogation information, which can record the doctor's questions in the interrogation and the corresponding answers of the patient (or family members), as well as various information such as the doctor's diagnosis and medical advice.
After the interrogation information is compiled, embodiments of the present invention can also analyze the interrogation information as required to obtain a corresponding analysis result. Since the interrogation is related to disease diagnosis, the analysis result is also related to disease diagnosis, the specifics being determined by the analysis requirements.
For example, the questions doctors commonly ask about each kind of disease can be counted and provided as a reference to less experienced doctors; the interrogation information can be analyzed to develop an artificial intelligence question-answering system for traditional Chinese medicine (or Western medicine); and the symptoms, treatment methods, and the like corresponding to each kind of disease can be determined by statistics, analysis, and similar means.
Given doctors' habit of, and need for, recording cases, based on the above scheme the communication process with a patient can be captured by recording, the sentences of the doctor and of the patient can then be separated, distinguished, and compiled, and the result can be provided to the doctor as a case record in dialog form, which can effectively reduce the time a doctor spends compiling case records.
It should be noted that, for simplicity of description, the method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that embodiments of the present invention are not limited by the described order of actions, because according to the embodiments some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by embodiments of the present invention.
Referring to Fig. 4, a structural block diagram of a voice-based data processing device embodiment of the invention is shown, which may specifically include the following modules:
A data acquisition module 402, for obtaining interrogation process data, the interrogation process data being determined according to voice data collected during an interrogation.
A text recognition module 404, for performing recognition according to the interrogation process data to obtain corresponding first text data and second text data, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user.
An information determination module 406, for obtaining interrogation information according to the first text data and the second text data.
At least two users communicate during the interrogation: one user is the doctor, and the other users are the patient, the patient's family members, and so on. For example, if the data is collected over a doctor's one-day outpatient service, it will include one doctor and several patients, and possibly one or several family members. The doctor can therefore be taken as the target user of the interrogation record, so that the first text data is the interrogation text data corresponding to the doctor, while the text data of at least one other user serves as the second text data, i.e., the interrogation text data corresponding to the patients and family members. Since an interrogation is usually a question-and-answer process, the first text data and the second text data may each be composed of multiple text fragments, so the interrogation information can be obtained based on the time of each text fragment and the user to which it corresponds.
An example of interrogation information is as follows:
2017-4-23 10:23AM
Doctor A: What symptoms do you have?
Patient B: My XXX is uncomfortable.
Doctor A: Do you have XXX?
Patient B: Yes.
……
In actual processing, the hospital's outpatient records and the like may also be consulted to obtain patient information, so that different patients can be distinguished in the interrogation information.
In conclusion, for interrogation process data determined by collection during an interrogation, first text data and second text data can be recognized from the interrogation process data according to different users, wherein the first text data belongs to a target user and the second text data belongs to other users besides the target user, so the statements of the doctor and of the patients during the interrogation can be distinguished automatically. Interrogation information is then obtained according to the first text data and the second text data. The interrogation process can thus be recorded completely, and content such as case records can be compiled automatically, saving the time spent organizing interrogation records.
Referring to Fig. 5, a structural block diagram of another voice-based data processing device embodiment of the invention is shown, which may specifically include the following modules:
Here, the interrogation process data includes voice data and/or the text recognition result recognized from voice data.
When the interrogation process data is voice data, the text recognition module 404 may include:
A separation submodule 40402, for separating first voice data and second voice data from the voice data according to voiceprint features.
A speech recognition submodule 40404, for performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Here, the separation submodule 40402 is configured to divide the voice data into multiple voice segments, and to determine the first voice data and the second voice data from the voice segments according to voiceprint features.
Preferably, the separation submodule 40402 is configured to match each voice segment against a reference voiceprint feature, wherein the reference voiceprint feature is the voiceprint feature of the target user; to obtain the voice segments consistent with the reference voiceprint feature to obtain the corresponding first voice data; and to obtain the voice segments not consistent with the reference voiceprint feature to obtain the corresponding second voice data.
In an embodiment of the present invention, before voice data is collected during the interrogation, a segment of the doctor's (target user's) voice can first be collected as reference data, in order to identify the doctor's voiceprint feature, i.e., the reference voiceprint feature, from the reference data. A speech recognition model can also be provided: after the voice data is input into the speech recognition model, the voice segments that match the reference voiceprint data can be separated from the voice segments with other voiceprint features, so as to obtain the voice segments of the target user and the voice segments of the other users. In a doctor's outpatient procedure, the compiled case information usually involves only one doctor while there may be multiple patients, so a large number of case samples corresponding to a particular doctor can be obtained in the above way.
Preferably, the separation submodule 40402 is configured to identify the voiceprint feature of each voice segment; to count the number of voice segments corresponding to each voiceprint feature; to determine the voiceprint feature with the largest number of voice segments and generate the first voice data from the voice segments corresponding to that voiceprint feature, wherein the voiceprint feature with the largest count is the voiceprint feature of the target user; and to generate the second voice data from the voice segments not belonging to the first voice data.
Owing to the nature of the interrogation process, the interrogation process data may be the recorded data of a doctor's multiple outpatient visits. In this process the doctor tends to occupy more of the time, exchanging interrogations with different patients and their family members, i.e., the doctor (target user) contributes the most voice in the voice data. The target user and the other users can therefore be distinguished by the number of voice segments corresponding to each user, yielding the first voice data and the second voice data.
In embodiments of the present invention, since the voice data is collected in a scene where multiple people converse, one voice segment may contain the voiceprint features of multiple users. For the case where multiple voiceprint features are identified in one voice segment, the separation submodule 40402 can perform the following processing. When the different voiceprint features occur at different times: if a voiceprint feature belongs to another user, the voice segment can be added to the second voice data; and if the voiceprint features include both the target user's voiceprint feature and other users' voiceprint features, the voice segment can be subdivided into sub-segments, which are then added to the corresponding voice data. When the different voiceprint features occur at the same time, i.e., at least two users are speaking at the same time: if the voiceprint features belong to other users, the voice segment can be added to the second voice data; and if the voiceprint features include both the target user's voiceprint feature and other users' voiceprint features, the segment can be assigned as required, for example classified as a voice segment of the target user to obtain the first voice data, classified as a voice segment of the other users to obtain the second voice data, or added to the voice data of both kinds of users.
Preferably, the speech recognition submodule 40404 is configured to perform speech recognition on each voice segment in the first voice data and generate the first text data from the text fragments obtained by recognition, and to perform speech recognition on each voice segment in the second voice data and generate the second text data from the text fragments obtained by recognition. The information determination module 406 is then configured to sort the text fragments according to the time order of the voice segments to which the text fragments in the first text data and the second text data respectively correspond, to obtain the interrogation information.
Preferably, the interrogation process data is the text recognition result recognized from voice data, and the text recognition module 404 is configured to perform feature recognition on the text recognition result and to separate the first text data and the second text data according to language features.
The text recognition module 404 comprises:
A fragment division submodule 40406, for dividing the text recognition result to obtain corresponding text fragments.
A fragment recognition submodule 40408, for recognizing the text fragments with a preset model to determine the language features of the text fragments, the language features including a first language feature and a second language feature.
Here, embodiments of the present invention can obtain a large number of separated case texts as training data. A separated case text is interrogation information in which the target user and the other users have been identified, such as text information obtained from past recognition. The doctor content data it contains (the first text data of the target user) and the patient content data (the second text data of the other users) can be trained on separately to obtain a doctor content model and a patient content model; the two models can of course also be combined into one preset model, based on which the sentences of the doctor and the sentences of the patient can be recognized. For example, in the case information obtained from interrogations, doctor content is generally a question containing symptom-class vocabulary, such as 'How do you feel?', 'What symptoms do you have?', or 'Where is it uncomfortable?'; patient content is generally a question containing symptom manifestations or disease-class vocabulary, such as 'Have I caught a cold?' or 'Is it XX disease?'; and doctor content also generally includes declarative sentences containing symptoms and drugs, such as 'You have viral influenza' or 'You can take some XX medicine'. The sentence content of the doctor and that of the patient thus both have fairly distinctive language features, so the doctor content model and the patient content model can be trained from the separated case information.
A text generation submodule 40410, for generating the first text data from the text fragments having the first language feature and generating the second text data from the text fragments having the second language feature.
Preferably, the device further include: analysis module 408 obtains phase for analyzing the interrogation informationThe analysis answered is as a result, the analysis result is related to medical diagnosis on disease.
Correspond to the sequence of sound bite according to the first text data and the second text data, it can will be in the first text data respectivelyEach text fragments in text fragments and the second text data, are ranked up according to corresponding sequence, to obtain corresponding interrogationInformation can record doctor in the interrogation information in the problems in interrogation and the answer of respective patient (family members), and doctorThe various information such as raw diagnosis, doctor's advice.
After sorting out interrogation information, the embodiment of the present invention can also be analyzed interrogation information according to demand, obtain phaseThe analysis answered as a result, due to interrogation be it is relevant to medical diagnosis on disease, the analysis result is also related to medical diagnosis on disease, specifically according toIt is determined according to analysis demand.
For example, the common problem of doctor can be counted to every kind of disease, it is supplied to the less doctor's behaviours reference of experience;Interrogation information can be analyzed, develop Chinese medicine (doctor trained in Western medicine) artificial intelligence question answering system etc.;It can also be by counting, analyzingEtc. modes determine the corresponding symptom of every kind of disease, treatment method etc..
With respect to doctors' habits and needs in recording cases, based on the above scheme, the communication process with the patient can be captured by recording; the sentences of the doctor and of the patient are then separated, distinguished, and organized, and provided to the doctor as a case record in dialogue form, which can effectively reduce the time the doctor spends on organizing cases.
As the device embodiments are substantially similar to the method embodiments, the description is relatively simple; for related details, refer to the description of the method embodiments.
Fig. 6 is a structural block diagram of an electronic equipment 600 for voice-based data processing according to an exemplary embodiment. For example, the electronic equipment 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like; it may also be a server-side device, such as a server.
Referring to Fig. 6, the electronic equipment 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operation of the electronic equipment 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 602 may include one or more processors 620 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation on the device 600. Examples of such data include instructions for any application or method operated on the electronic equipment 600, contact data, phone book data, messages, pictures, videos, and the like. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 606 provides power for the various components of the electronic equipment 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic equipment 600.
The multimedia component 608 includes a screen providing an output interface between the electronic equipment 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the electronic equipment 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC), which is configured to receive external audio signals when the electronic equipment 600 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and a peripheral interface module, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to: a home button, volume buttons, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing state assessments of various aspects of the electronic equipment 600. For example, the sensor component 614 can detect the open/closed state of the device 600 and the relative positioning of components, for example the display and keypad of the electronic equipment 600; the sensor component 614 can also detect a position change of the electronic equipment 600 or of a component of the electronic equipment 600, the presence or absence of user contact with the electronic equipment 600, the orientation or acceleration/deceleration of the electronic equipment 600, and a temperature change of the electronic equipment 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the electronic equipment 600 and other devices. The electronic equipment 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra wide band (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic equipment 600 may be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 604 including instructions, where the instructions can be executed by the processor 620 of the electronic equipment 600 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided, where, when the instructions in the storage medium are executed by a processor of an electronic equipment, the electronic equipment is enabled to perform a voice-based data processing method, the method comprising: obtaining interrogation process data, where the interrogation process data is determined according to voice data acquired during an interrogation; performing identification according to the interrogation process data to obtain corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to other users except the target user; and obtaining interrogation information according to the first text data and the second text data.
Optionally, the interrogation process data includes voice data and/or a text identification result identified from the voice data.
Optionally, the interrogation process data is voice data; the performing identification according to the interrogation process data to obtain corresponding first text data and second text data comprises: separating first voice data and second voice data from the voice data according to voiceprint features; and performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Optionally, the separating first voice data and second voice data from the voice data according to voiceprint features comprises: dividing the voice data into multiple speech segments; and determining the first voice data and the second voice data using the speech segments according to voiceprint features.
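A minimal sketch of the division step follows, using pydub's silence-based splitting as one plausible way to cut the recording into speech segments at pauses; the library choice and the thresholds are assumptions and would need tuning for real interrogation audio.

```python
# A minimal sketch of dividing recorded voice data into speech segments by
# splitting at silent pauses. Thresholds here are illustrative only.
from pydub import AudioSegment
from pydub.silence import split_on_silence

def divide_voice_data(path):
    audio = AudioSegment.from_file(path)
    return split_on_silence(
        audio,
        min_silence_len=500,                # pause length in ms that ends a segment
        silence_thresh=audio.dBFS - 16,     # relative loudness treated as silence
    )
```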
Optionally, the determining the first voice data and the second voice data using the speech segments according to voiceprint features comprises: matching each speech segment respectively using a benchmark voiceprint feature, where the benchmark voiceprint feature is the voiceprint feature of the target user; obtaining the speech segments consistent with the benchmark voiceprint feature to obtain the corresponding first voice data; and obtaining the speech segments not consistent with the benchmark voiceprint feature to obtain the corresponding second voice data.
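As an illustration, matching against a benchmark voiceprint could look like the sketch below, assuming each segment already has a voiceprint embedding vector; the cosine-similarity criterion and the 0.75 threshold are assumptions, not values from this embodiment.

```python
# A minimal sketch of splitting segments by similarity to the target user's
# benchmark voiceprint embedding.
import numpy as np

def split_by_benchmark(segments, benchmark, threshold=0.75):
    """segments: list of (audio, embedding); benchmark: embedding vector."""
    first, second = [], []
    for audio, emb in segments:
        sim = np.dot(emb, benchmark) / (np.linalg.norm(emb) * np.linalg.norm(benchmark))
        (first if sim >= threshold else second).append(audio)
    return first, second  # first voice data, second voice data
```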
Optionally, the determining the first voice data and the second voice data using the speech segments according to voiceprint features comprises: identifying the voiceprint feature of each speech segment; counting the number of speech segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of speech segments, and generating the first voice data using the speech segments corresponding to that voiceprint feature; and generating the second voice data using the speech segments not belonging to the first voice data.
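The benchmark-free variant could be sketched as below: cluster the segment voiceprints, then treat the largest cluster as the target user's speech. Agglomerative clustering and its distance threshold are assumed choices for illustration; the embodiment itself only requires counting segments per voiceprint feature.

```python
# A minimal sketch of the majority-voiceprint variant: the speaker with the
# most speech segments is taken as the target user (e.g., the doctor).
from collections import Counter
from sklearn.cluster import AgglomerativeClustering

def split_by_majority(segments):
    """segments: list of (audio, embedding) pairs."""
    embeddings = [emb for _, emb in segments]
    labels = AgglomerativeClustering(
        n_clusters=None, distance_threshold=1.0  # threshold is illustrative
    ).fit_predict(embeddings)
    majority = Counter(labels).most_common(1)[0][0]
    first = [a for (a, _), l in zip(segments, labels) if l == majority]
    second = [a for (a, _), l in zip(segments, labels) if l != majority]
    return first, second
```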
Optionally, the performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data comprises: performing speech recognition on each speech segment in the first voice data respectively, and generating the first text data using the text fragments obtained by recognition; and performing speech recognition on each speech segment in the second voice data respectively, and generating the second text data using the text fragments obtained by recognition.
Optionally, the interrogation process data is a text identification result identified from voice data; the performing identification according to the interrogation process data to obtain corresponding first text data and second text data comprises: performing feature identification on the text identification result, and separating the first text data and the second text data according to language features.
Optionally, the performing feature identification on the text identification result and separating the first text data and the second text data according to language features comprises: dividing the text identification result to obtain corresponding text fragments; identifying the text fragments using a preset model to determine the language features of the text fragments, where the language features include a first language feature and a second language feature; and generating the first text data using the text fragments with the first language feature, and generating the second text data using the text fragments with the second language feature.
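Putting the division and identification steps together, a sketch might split the recognized text at sentence-ending punctuation and label each fragment with the preset model from the earlier sketch; the punctuation-based splitting rule is an assumed division strategy, not one prescribed by this embodiment.

```python
# A minimal sketch of separating a text identification result by language
# feature, reusing a trained doctor/patient "preset model".
import re

def split_by_language_feature(text_result, model):
    fragments = [f for f in re.split(r"(?<=[。？！?!])", text_result) if f.strip()]
    labels = model.predict(fragments)
    first = [f for f, l in zip(fragments, labels) if l == "doctor"]   # first text data
    second = [f for f, l in zip(fragments, labels) if l == "patient"] # second text data
    return first, second
```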
Optionally, the method further comprises: analyzing the interrogation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
Fig. 7 is a structural schematic diagram of an electronic equipment 700 for voice-based data processing according to another exemplary embodiment of the present invention. The electronic equipment 700 may be a server, which may vary considerably due to differences in configuration or performance, and may include one or more central processing units (CPU) 722 (for example, one or more processors), a memory 732, and one or more storage media 730 (such as one or more mass storage devices) storing application programs 742 or data 744. The memory 732 and the storage medium 730 may be transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 722 may be configured to communicate with the storage medium 730 and execute, on the server, the series of instruction operations in the storage medium 730.
The server may also include one or more power supplies 726, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, one or more keyboards 756, and/or one or more operating systems 741, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
In an exemplary embodiment, the server is configured such that one or more central processing units 722 execute one or more programs including instructions for performing the following operations: obtaining interrogation process data, where the interrogation process data is determined according to voice data acquired during an interrogation; performing identification according to the interrogation process data to obtain corresponding first text data and second text data, where the first text data belongs to a target user and the second text data belongs to other users except the target user; and obtaining interrogation information according to the first text data and the second text data.
Optionally, the interrogation process data includes voice data and/or a text identification result identified from the voice data.
Optionally, the interrogation process data is voice data; the performing identification according to the interrogation process data to obtain corresponding first text data and second text data comprises: separating first voice data and second voice data from the voice data according to voiceprint features; and performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data.
Optionally, the separating first voice data and second voice data from the voice data according to voiceprint features comprises: dividing the voice data into multiple speech segments; and determining the first voice data and the second voice data using the speech segments according to voiceprint features.
Optionally, the determining the first voice data and the second voice data using the speech segments according to voiceprint features comprises: matching each speech segment respectively using a benchmark voiceprint feature, where the benchmark voiceprint feature is the voiceprint feature of the target user; obtaining the speech segments consistent with the benchmark voiceprint feature to obtain the corresponding first voice data; and obtaining the speech segments not consistent with the benchmark voiceprint feature to obtain the corresponding second voice data.
Optionally, the determining the first voice data and the second voice data using the speech segments according to voiceprint features comprises: identifying the voiceprint feature of each speech segment; counting the number of speech segments corresponding to each voiceprint feature; determining the voiceprint feature with the largest number of speech segments, and generating the first voice data using the speech segments corresponding to that voiceprint feature; and generating the second voice data using the speech segments not belonging to the first voice data.
Optionally, the performing speech recognition on the first voice data and the second voice data respectively to obtain the corresponding first text data and second text data comprises: performing speech recognition on each speech segment in the first voice data respectively, and generating the first text data using the text fragments obtained by recognition; and performing speech recognition on each speech segment in the second voice data respectively, and generating the second text data using the text fragments obtained by recognition.
Optionally, the interrogation process data is a text identification result identified from voice data; the performing identification according to the interrogation process data to obtain corresponding first text data and second text data comprises: performing feature identification on the text identification result, and separating the first text data and the second text data according to language features.
Optionally, the performing feature identification on the text identification result and separating the first text data and the second text data according to language features comprises: dividing the text identification result to obtain corresponding text fragments; identifying the text fragments using a preset model to determine the language features of the text fragments, where the language features include a first language feature and a second language feature; and generating the first text data using the text fragments with the first language feature, and generating the second text data using the text fragments with the second language feature.
Optionally, the one or more programs executed by the one or more processors 722 of the server further include instructions for performing the following operation: analyzing the interrogation information to obtain a corresponding analysis result, where the analysis result is related to disease diagnosis.
All embodiments in this specification are described in a progressive manner; each embodiment focuses on the differences from the other embodiments, and for the same or similar parts between the embodiments, the embodiments can be referred to each other.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a device, or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, the terminal device (system), and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, which realizes the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, so that a series of operation steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable terminal device provide steps for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they grasp the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present invention.
Finally, it should be noted that, herein, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without more restrictions, an element defined by the sentence "including a ..." does not exclude the existence of other identical elements in the process, method, article, or terminal device including the element.
A voice-based data processing method, a voice-based data processing device, and an electronic equipment provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the descriptions of the above embodiments are only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and application scope according to the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.
