Disclosure of Invention
In view of the above, the present application provides a headphone noise reduction method, apparatus, device, and readable storage medium. The method addresses the drawback of existing headphone noise reduction approaches, which eliminate all ambient sound and thereby cause considerable inconvenience to the headphone wearer.
In order to achieve the above object, the following solutions are proposed:
a headphone noise reduction method, comprising:
acquiring ambient sound data;
determining whether the ambient sound data contains content of interest to the headphone wearer;
and if so, eliminating all content in the ambient sound data other than the content of interest to the headphone wearer, and playing the content of interest.
Preferably, the determining whether the ambient sound data contains content of interest to the headphone wearer includes:
determining, according to pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information.
Preferably, the determining, according to the pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information includes:
converting the ambient sound data into a recognition text;
and searching the recognition text for a pre-configured interest word corresponding to the headphone wearer; if the interest word is found, determining that the ambient sound data contains sound data corresponding to the interest information, and if not, determining that it does not.
Preferably, the determining, according to the pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information includes:
determining whether sound data matching an interest voiceprint exists in the ambient sound data;
and if so, determining that the ambient sound data contains sound data corresponding to the interest information, and if not, determining that it does not.
Preferably, the determining whether the ambient sound data contains content of interest to the headphone wearer includes:
acquiring feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and determining whether the ambient sound data matches the feature information; if so, determining that the ambient sound data contains content of interest to the headphone wearer, and if not, determining that it does not.
Preferably, if it is determined, according to the pre-configured interest information corresponding to the headphone wearer, that the ambient sound data does not contain sound data corresponding to the interest information, the method further includes:
acquiring feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and determining whether the ambient sound data matches the feature information; if so, determining that the ambient sound data contains content of interest to the headphone wearer, and if not, determining that it does not.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring pre-configured personal information of the headphone wearer as the feature information;
and the determining whether the ambient sound data matches the feature information includes:
acquiring a label corresponding to the personal information;
and determining whether the recognition text corresponding to the ambient sound data contains the label; if so, determining a match, and if not, determining a mismatch.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring, as the feature information, key time points and event information contained in historical speech of the headphone wearer;
and the determining whether the ambient sound data matches the feature information includes:
determining whether the recognition text corresponding to the ambient sound data contains the key time points and/or the event information; if so, determining a match, and if not, determining a mismatch.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring, as the feature information, a correspondence between interest content and interest degree determined from historical speech of the headphone wearer;
and the determining whether the ambient sound data matches the feature information includes:
determining whether the recognition text corresponding to the ambient sound data contains target interest content;
if so, querying the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and determining whether the ambient sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold.
Preferably, the playing the content of interest to the headphone wearer includes:
identifying, in the ambient sound data, a target voiceprint corresponding to the content of interest to the headphone wearer;
and entering a dialog mode for the speaker corresponding to the target voiceprint, the dialog mode comprising: playing the sound data in the ambient sound data that corresponds to the target voiceprint.
Preferably, the playing the content of interest to the headphone wearer further includes:
recording the start time at which acquisition of the sound data corresponding to the target voiceprint begins;
detecting whether sound data corresponding to the target voiceprint is acquired within a set duration after the start time;
and if not, exiting the dialog mode for the speaker corresponding to the target voiceprint.
A headphone noise reduction apparatus comprising:
an ambient sound data acquisition unit, configured to acquire ambient sound data;
a content of interest determination unit, configured to determine whether the ambient sound data contains content of interest to the headphone wearer;
a noise elimination unit, configured to eliminate, when the ambient sound data is determined to contain content of interest to the headphone wearer, all content in the ambient sound data other than that content of interest;
and a content of interest playing unit, configured to play the content of interest when the ambient sound data is determined to contain content of interest to the headphone wearer.
Preferably, the content of interest determination unit includes:
a first content of interest determination subunit, configured to determine, according to pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information.
Preferably, the interest information is an interest word, and the first content of interest determination subunit includes:
a recognition text acquisition unit, configured to convert the ambient sound data into a recognition text;
and an interest word search unit, configured to search the recognition text for a pre-configured interest word corresponding to the headphone wearer; if the interest word is found, it is determined that the ambient sound data contains sound data corresponding to the interest information, and if not, that it does not.
Preferably, the interest information is an interest voiceprint, and the first content of interest determination subunit includes:
an interest voiceprint determination unit, configured to determine whether sound data matching the interest voiceprint exists in the ambient sound data; if so, it is determined that the ambient sound data contains sound data corresponding to the interest information, and if not, that it does not.
Preferably, the content of interest determination unit includes:
a second content of interest determination subunit, configured to acquire feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and a third content of interest determination subunit, configured to determine whether the ambient sound data matches the feature information; if so, it is determined that the ambient sound data contains content of interest to the headphone wearer, and if not, that it does not.
Preferably, the content of interest determination unit further includes:
a second content of interest determination subunit, configured to acquire, when the first content of interest determination subunit determines that the ambient sound data does not contain sound data corresponding to the interest information, feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and a third content of interest determination subunit, configured to determine whether the ambient sound data matches the feature information; if so, it is determined that the ambient sound data contains content of interest to the headphone wearer, and if not, that it does not.
Preferably, the second content of interest determination subunit includes:
a first feature information acquisition unit, configured to acquire pre-configured personal information of the headphone wearer as the feature information;
and the third content of interest determination subunit includes:
a label acquisition unit, configured to acquire a label corresponding to the personal information;
and a label search unit, configured to determine whether the recognition text corresponding to the ambient sound data contains the label; if so, a match is determined, and if not, a mismatch.
Preferably, the second content of interest determination subunit includes:
a second feature information acquisition unit, configured to acquire, as the feature information, key time points and event information contained in historical speech of the headphone wearer;
and the third content of interest determination subunit includes:
a recognition text reference unit, configured to determine whether the recognition text corresponding to the ambient sound data contains the key time points and/or the event information; if so, a match is determined, and if not, a mismatch.
Preferably, the second content of interest determination subunit includes:
a third feature information acquisition unit, configured to acquire, as the feature information, a correspondence between interest content and interest degree determined from historical speech of the headphone wearer;
and the third content of interest determination subunit includes:
an interest content determination unit, configured to determine whether the recognition text corresponding to the ambient sound data contains target interest content;
a target interest degree query unit, configured to query, when the determination result of the interest content determination unit is yes, the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and an interest degree comparison unit, configured to determine whether the ambient sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold.
Preferably, the content of interest playing unit includes:
a target voiceprint identification unit, configured to identify, in the ambient sound data, a target voiceprint corresponding to the content of interest to the headphone wearer;
and a dialog mode entering unit, configured to enter a dialog mode for the speaker corresponding to the target voiceprint, the dialog mode comprising: playing the sound data in the ambient sound data that corresponds to the target voiceprint.
Preferably, the content of interest playing unit further includes:
a start time recording unit, configured to record the start time at which acquisition of the sound data corresponding to the target voiceprint begins;
a set duration detection unit, configured to detect whether sound data corresponding to the target voiceprint is acquired within a set duration after the start time;
and a dialog mode exit unit, configured to exit the dialog mode for the speaker corresponding to the target voiceprint when the detection result of the set duration detection unit is negative.
A headphone noise reduction device comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of any one of the headphone noise reduction methods described above.
A readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of any one of the headphone noise reduction methods described above.
It can be seen from the foregoing technical solutions that the headphone noise reduction method provided in the embodiments of the present application acquires ambient sound data from the environment in which the headphone wearer is located and determines whether that data contains content of interest to the wearer, i.e., content the wearer wants to hear. Once such content is identified, everything else in the ambient sound data is eliminated and only the content of interest is played. The wearer thus remains in a quiet state, undisturbed by environmental noise, without missing the sound content he or she wants to hear, which improves the wearer's convenience.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments. It is obvious that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present application.
The headphone noise reduction method provided by the present application can be applied to noise-canceling headphones and is suited to scenarios in which a user wears them to suppress ambient sound, for example in a noisy public place where the user wants a quiet environment in which to rest. In such a case, a conventional noise reduction method suppresses all ambient sound, but in doing so it also eliminates ambient sounds the headphone wearer cares about, such as a conversation with another person or a public announcement. The method of the present application therefore plays the content of interest to the wearer while reducing the remaining noise in the ambient sound data, so that the wearer stays in a quiet state yet still hears the content of interest in the ambient sound.
Next, the headphone noise reduction method of the present application is described with reference to fig. 1, which illustrates a flowchart of the method. In detail, the method includes:
Step S100: acquiring ambient sound data.
Specifically, the ambient sound data of the environment in which the headphone wearer is located is acquired. In one optional implementation, a recording module is provided; when sound from the external environment is detected, the recording module is started to collect the ambient sound signal.
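The application does not specify how external sound is detected before the recording module starts. The sketch below assumes a simple per-frame energy threshold; the function names and the threshold value are illustrative choices, not part of the application.

```python
# Toy sound-detection gate for the hypothetical recording module: compute the
# mean squared amplitude of each audio frame and start recording once it
# exceeds a fixed threshold. The threshold 0.01 is an assumed value.

def frame_energy(samples):
    """Mean squared amplitude of one audio frame (samples in [-1.0, 1.0])."""
    return sum(s * s for s in samples) / len(samples)

def should_start_recording(frame, threshold=0.01):
    """Start collecting the ambient sound signal once sound is detected."""
    return frame_energy(frame) > threshold

# A near-silent frame should not trigger recording; a louder one should.
quiet = [0.001] * 160
speech = [0.5, -0.4, 0.3, -0.5] * 40
```

In practice a production system would use a proper voice activity detector rather than a raw energy gate, but the control flow — detect, then start the recording module — is the same.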
Step S110: determining whether the ambient sound data contains content of interest to the headphone wearer; if so, executing step S120.
Specifically, the content of interest to the headphone wearer is the content in the ambient sound data that the wearer wants to hear, for example an announcement in a waiting hall, or a conversation between another person and the wearer. After the ambient sound data has been acquired, this step determines whether it contains such content.
Step S120: eliminating the content in the ambient sound data other than the content of interest to the headphone wearer, and playing the content of interest.
Specifically, when step S110 determines that the ambient sound data contains content of interest to the headphone wearer, the rest of the ambient sound data is eliminated as noise. The elimination may follow the same principle as existing noise reduction methods, namely generating a sound wave opposite in phase to the noise so that the two cancel. The content of interest is then played, so that the wearer hears the sound content he or she wants to hear.
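The cancellation principle above can be illustrated with a toy calculation: the anti-noise wave is the phase-inverted noise, so superposing noise, anti-noise, and the content of interest leaves only the content of interest. This is a sketch of the principle only, not a real-time noise-cancellation implementation.

```python
# Phase-inversion cancellation in miniature: heard = noise + content - noise.

def anti_noise(noise):
    """Generate a sound wave opposite in phase to the noise."""
    return [-s for s in noise]

def mix(*signals):
    """Superpose equally long sample streams, as the air does acoustically."""
    return [sum(samples) for samples in zip(*signals)]

noise = [0.2, -0.3, 0.1, 0.4]
content_of_interest = [0.05, 0.1, -0.05, 0.0]
heard = mix(noise, content_of_interest, anti_noise(noise))
# `heard` equals `content_of_interest` up to floating-point rounding.
```

Real systems must estimate the noise and emit the anti-noise with near-zero latency; this sketch assumes the noise is known exactly.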
It can be seen from the foregoing that the headphone noise reduction method provided in the embodiments of the present application acquires ambient sound data from the environment in which the headphone wearer is located and determines whether that data contains content of interest to the wearer, i.e., sound content the wearer wants to hear. Once such content is identified, everything else in the ambient sound data is eliminated and only the content of interest is played. The wearer thus remains in a quiet state, undisturbed by environmental noise, without missing the sound content he or she wants to hear, which improves the wearer's convenience.
Next, an embodiment of the present application is provided that describes step S110, determining whether the ambient sound data contains content of interest to the headphone wearer.
As described above, the content of interest to the headphone wearer is the sound data content the wearer wants to hear. There are various ways to determine whether the ambient sound data contains such content; this embodiment describes one optional implementation.
Specifically, this implementation is based on interest information, i.e., pre-configured information corresponding to the headphone wearer that describes the sound content the wearer wants to hear. Whether the ambient sound data contains content of interest to the wearer can therefore be determined by checking whether it contains sound data corresponding to the interest information.
Depending on the type of pre-configured interest information, there are several ways to determine whether the ambient sound data contains sound data corresponding to it. Two optional implementations are described below.
First, the interest information is an interest word.
Here, an interest word is one or more words pre-configured by the headphone wearer, and may denote a time, an event, a person, a location, and so on. An interest word is a word contained in the sound content the wearer wants to hear: for example, the wearer's own name or a companion's name may be configured as an interest word, or, in a waiting hall, keywords such as "passenger", "announcement", or "train number" may be configured in advance if the wearer wants to hear the announcements.
Based on this, the process of determining whether the ambient sound data contains sound data corresponding to the interest information may include:
first, converting the ambient sound data into a recognition text.
Specifically, the acquired ambient sound data is converted into a recognition text by a speech recognition technique. Any existing speech recognition technique may be selected; this embodiment does not describe it in detail.
Then, the recognition text is searched for a pre-configured interest word corresponding to the headphone wearer.
Specifically, the pre-configured interest word is searched for in the recognition text of the ambient sound data. Since the interest word is the interest information, if the word is found, it is determined that the ambient sound data contains sound data corresponding to the interest information; if not, it is determined that it does not.
For example, suppose the headphone wearer wants to hear any sound content containing his name, and therefore configures the name "Li Ming" as an interest word in advance. Once the headphones are in use, ambient sound data is continuously acquired and converted into recognition text, which is searched for the pre-configured interest word "Li Ming". If the word is found, the ambient sound data is determined to contain sound data corresponding to the wearer's name, i.e., sound data corresponding to the interest information.
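The interest-word search can be sketched as follows. The recognition text is assumed to come from an unspecified speech recognition step and is passed in directly; the function name, variable names, and example words are illustrative, not from the application.

```python
# Interest-word check: the ambient sound data is considered to contain sound
# data corresponding to the interest information if any pre-configured
# interest word occurs in the recognition text.

def contains_interest_info(recognition_text, interest_words):
    """Return True if any pre-configured interest word occurs in the text."""
    return any(word in recognition_text for word in interest_words)

interest_words = {"Li Ming", "passenger", "train number"}
text = "Li Ming, your train is about to depart"
```

A substring scan is the simplest realization; tokenized or fuzzy matching could be substituted without changing the surrounding flow.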
Second, the interest information is an interest voiceprint.
Here, an interest voiceprint is the voiceprint of a voice the headphone wearer wants to hear. For example, if the wearer wants to hear a companion's conversation with him, the voiceprint of the companion's voice can be configured in advance as an interest voiceprint.
In this case, the process of determining whether the ambient sound data contains sound data corresponding to the interest information may include:
determining whether sound data matching the interest voiceprint exists in the ambient sound data.
Specifically, the ambient sound data is searched to determine whether it contains sound data matching the pre-configured interest voiceprint. Since the interest voiceprint is the interest information configured in advance by the headphone wearer, if matching sound data is found, it is determined that the ambient sound data contains sound data corresponding to the interest information; if not, it is determined that it does not.
For example, suppose the headphone wearer wants to hear his companion when the companion speaks to him, and therefore configures the companion's voice as an interest voiceprint in advance. Once the headphones are in use, ambient sound data is continuously acquired and checked for the pre-configured interest voiceprint. If it is present, the ambient sound data is determined to contain sound data corresponding to the companion's voiceprint, i.e., sound data corresponding to the interest information.
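The application leaves the voiceprint-matching algorithm unspecified. One common approach, sketched below under that assumption, represents each voice by a fixed-length embedding and compares embeddings by cosine similarity against a threshold; the embeddings and threshold here are invented for illustration.

```python
# Voiceprint match as cosine similarity between speaker embeddings.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def matches_interest_voiceprint(segment_embedding, interest_embedding,
                                threshold=0.8):
    """Treat the segment as the enrolled voice if similarity is high enough."""
    return cosine_similarity(segment_embedding, interest_embedding) >= threshold

companion = [0.9, 0.1, 0.4]          # enrolled interest voiceprint
same_speaker = [0.85, 0.15, 0.38]    # a segment close to the enrolled voice
stranger = [-0.2, 0.9, -0.4]         # a very different voice
```

Producing the embeddings themselves requires a trained speaker-verification model, which is outside the scope of this sketch.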
It should be noted that both methods of determining whether the ambient sound data contains sound data corresponding to the interest information can be implemented either online or offline.
In the online implementation, a background server acquires and stores the pre-configured interest information. After acquiring the ambient sound data, the headphones send it to the server; the server determines whether the data contains sound data corresponding to the interest information and returns the result to the headphones.
In the offline implementation, the headphones acquire the pre-configured interest information, store it locally, and determine locally whether the ambient sound data contains sound data corresponding to the interest information.
In both methods, a pre-configured interest word or interest voiceprint serves as the interest information, and the check is whether the ambient sound data contains that word or voiceprint. Since the interest information is configured in advance and relates to the content the headphone wearer wants to hear, the result of the check directly answers whether the ambient sound data contains content of interest to the wearer: if the data contains sound data corresponding to the interest information, it contains content of interest, and otherwise it does not.
Next, this embodiment introduces another optional implementation of step S110. Determining whether the ambient sound data contains content of interest to the headphone wearer may include:
A1, acquiring feature information of the headphone wearer.
Specifically, the feature information relates to the content of interest to the headphone wearer. For example, if the wearer likes watching movies and doing sports, topics in the ambient sound data related to movies or sports can be considered content the wearer wants to hear. Information related to the wearer's tastes, such as movies and movie titles, or sports and sporting events, may therefore serve as the feature information; it is by construction related to the content of interest to the wearer.
A2, determining whether the ambient sound data matches the feature information.
Specifically, since the feature information relates to the content of interest to the headphone wearer, if the ambient sound data is determined to match the feature information, the data is determined to contain content of interest to the wearer; otherwise it is determined not to.
It should be noted that when the ambient sound data is determined to match the feature information, a secondary confirmation step may be added: a voice or vibration prompt is issued, and the headphone wearer confirms whether the ambient sound data indeed contains content of interest. This secondary confirmation makes the determination more accurate.
The two embodiments described above describe two different implementations of determining whether the ambient sound data includes a content that is of interest to the earphone wearing object, and this embodiment may also combine the two implementations, and the specific process may include:
b1, determining whether the environmental sound data includes sound data corresponding to the information of interest based on information of interest corresponding to the headset wearing object configured in advance.
Specifically, the interest information corresponding to the earphone wearing object may include a preconfigured interest word or an interest voiceprint, and when it is determined that the environmental sound data includes sound data corresponding to the interest information, it is determined that the environmental sound data includes content in which the earphone wearing object is interested.
If it is determined that the sound data corresponding to the information of interest is not included in the environmental sound data, B2 is executed. If it is determined that the environmental sound data includes sound data corresponding to the information of interest, it may be determined that the environmental sound data includes content of interest to the headphone-worn object.
And B2, acquiring characteristic information of the earphone wearing object.
Wherein the characteristic information is related to the content of interest of the earphone wearing object. This step may refer to the implementation process of A1, which is not described herein.
B3, determining whether the environmental sound data is matched with the characteristic information.
It is to be understood that if the determination result of B3 is yes, it may be determined that the content of interest to the headphone-wearing object is included in the ambient sound data, and otherwise, it may be determined that the content of interest to the headphone-wearing object is not included in the ambient sound data.
The implementation process of step B3 can refer to step A2, and the detailed process is not described herein.
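The combined flow of steps B1 to B3 above can be sketched as follows. This is a minimal illustration only; the function name and the `feature_match` callback are assumptions for the sketch, not part of the embodiment:

```python
def contains_interest(recognition_text, interest_words, feature_match):
    """Combined flow of steps B1 to B3.

    B1: check the preconfigured interest words against the recognition
    text of the ambient sound data; a hit decides immediately.
    B2/B3: otherwise fall back to feature-information matching,
    supplied here as the callable feature_match.
    """
    if any(word in recognition_text for word in interest_words):
        return True  # B1 hit: content of interest is present
    return feature_match(recognition_text)  # B2/B3 fallback
```

A hit on the preconfigured interest words short-circuits the flow, so the (typically more expensive) feature-information matching of B2/B3 runs only when B1 finds nothing.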
It is understood that there are many alternative ways to obtain the feature information; for example, the feature information may be determined according to the personal information of the headset wearing object, or according to the historical speech content of the headset wearing object. Depending on how the feature information is obtained, there are correspondingly different ways of determining whether the environmental sound data matches the feature information; three optional implementations are described below.
First implementation:
And acquiring personal information of a preset earphone wearing object as characteristic information.
Here, the personal information refers to personal information set in advance, such as occupation, age, and home address. It will be appreciated that ambient sound data associated with such personal information may be content of interest to the headset wearing object. For example, if the occupation "programmer" is set in advance as the feature information, ambient sound data related to programmers may be content the headset wearing object wants to hear.
Based on this, the process of determining in A2 or B3 whether the ambient sound data matches the feature information may include:
first, a tag corresponding to the personal information is acquired.
Specifically, labels corresponding to the preset personal information are obtained; the labels are words related to the content of the personal information, and each piece of personal information may correspond to one or more labels. For example, if the occupation of the headset wearing object is programmer, labels related to "programmer" are acquired, such as "code farmer", "program ape", and "program".
Further, it is determined whether the identification text corresponding to the environmental sound data contains a label; if so, a match is determined, and if not, a mismatch is determined.
Specifically, the identification text corresponding to the environmental sound data is searched for tags corresponding to the personal information. When the identification text contains one or more tags, it is determined that the environmental sound data matches the feature information, and further that the environmental sound data contains content of interest to the earphone wearing object. For example, if the occupation of the earphone wearing object is programmer and the identification text corresponding to the ambient sound data includes the label "program ape", it is determined that the ambient sound data matches the feature information, and further that the ambient sound data contains content related to the wearer's own occupation, which is of interest to the earphone wearing object.
And when the identification text corresponding to the environmental sound data does not contain the label, determining that the environmental sound data is not matched with the characteristic information.
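The label-matching step of this first implementation reduces to a substring search over the recognition text. A minimal sketch follows; the function name is illustrative, and the tag list for the occupation "programmer" mirrors the example above:

```python
def matches_personal_info(recognition_text, tags):
    """Match when any tag tied to the preset personal information
    appears in the recognition text of the ambient sound data."""
    return any(tag in recognition_text for tag in tags)

# Tags corresponding to the occupation "programmer", per the example above.
programmer_tags = ["code farmer", "program ape", "program"]
```

One or more tag hits means the ambient sound data matches the feature information; no hit means a mismatch.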
Second implementation:
And acquiring key time points and event information contained in the historical speech content of the earphone wearing object as characteristic information. Here, the historical speech content may include recognition text into which collected historical speech audio is converted.
Specifically, the key time point and event information refer to time points and events contained in the historical speech content of the headphone wearing object over a period of time. It can be understood that this information is related to the content of interest to the headphone wearing object, so the acquired key time point and event information can be used as feature information. In general, a key time point or an item of event information may be used alone as feature information, or a key time point and its corresponding event information may be used together as one set of feature information. For example, if the earphone wearing object discussed a "meeting at two in the afternoon" a week earlier, "two in the afternoon" may be used as the key time point and "meeting" as the event information corresponding to that key time point.
Further, after the key time point and the event information are determined from the historical speech content of the earphone wearing object, a voice prompt may be set to ask the earphone wearing object whether the key time point should be set as a timed alarm. After confirmation by the earphone wearing object is obtained, the key time point is set as a timed alarm, and when the system time reaches the alarm time, the earphone plays a corresponding prompt tone to remind the earphone wearing object to handle the event corresponding to the key time point.
When the feature information contains key time point and event information, the process in A2 or B3 of determining whether the ambient sound data matches the feature information may include:
and determining whether the recognition text corresponding to the environmental sound data contains the key time point and/or the event information.
Alternatively, it may be determined that the ambient sound data matches the feature information as long as at least one of the key time point or the event information is included in the recognition text corresponding to the ambient sound data. For example, if the recognition text corresponding to the environmental sound data includes the key time point "two in the afternoon next week", it may be determined that the environmental sound data matches the feature information.
Alternatively, it is determined that the ambient sound data matches the feature information only when the recognition text corresponding to the ambient sound data includes both the key time point and the event information corresponding to that key time point. For example, if the identification text corresponding to the environmental sound data includes the key time point "two in the afternoon next week" and the corresponding event information "meeting", it may be determined that the environmental sound data matches the feature information.
Since the feature information is determined based on the historical speech content of the earphone wearing object, when it is determined that the ambient sound data matches the feature information, it can be further determined that the ambient sound data contains the content of interest to the earphone wearing object.
It should be noted that, because key time points and event information are time-sensitive, a time tag may be set for each key time point and item of event information, and the validity period of the time tag is determined by the timeliness of that key time point and event information.
The process of determining whether the ambient sound data matches the characteristic information may include:
and determining whether the identification text corresponding to the environmental sound data contains a key time point and event information whose time tag is within the validity period; if so, the match is determined to be successful, and otherwise unsuccessful.
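The validity-period check above can be sketched as follows. The tuple layout, names, and use of a plain expiry datetime for the time tag are assumptions made for illustration:

```python
from datetime import datetime

def match_key_event(recognition_text, key_events, now):
    """key_events: iterable of (time_point, event, expiry) tuples, where
    expiry is the datetime at which the time tag lapses.

    A match requires the time tag to still be within its validity period
    and both the key time point and the corresponding event information
    to appear in the recognition text of the ambient sound data."""
    for time_point, event, expiry in key_events:
        if now <= expiry and time_point in recognition_text and event in recognition_text:
            return True
    return False
```

This implements the stricter variant above (both the key time point and its event information must appear); the looser variant would accept either one alone.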
Third implementation:
And acquiring the corresponding relation between the interest content and the interest degree determined according to the historical speech content of the earphone wearing object as the characteristic information. Here, the historical speech content may include recognition text into which collected historical speech audio is converted.
Specifically, the interest content comprises interest items under an interest category, where the interest categories may include movies, music, books, and the like, and the interest items may include movie names, book names, music names, and the like. The set of interest types and the set of interest items of the earphone wearing object may be determined by analyzing its historical speech content, and any interest type in the interest type set together with any interest item in the interest item set forms one piece of interest content.
It can be understood that the degree of interest of the headset wearing object in each piece of interest content may differ, so the interest degree of each piece of interest content is further determined from the historical speech content of the headset wearing object. The interest degree may be determined in various ways, for example by counting the number of times or the frequency with which the interest content appears in the historical speech content. This embodiment introduces one optional method for calculating the interest degree, as follows:
1) Calculating the support degree of the interest content.
The support degree of a piece of interest content refers to: the frequency with which the interest type and the interest item of that interest content appear together in one sentence of the historical speech content of the headset wearing object. That is, a high support degree indicates that the interest content occurs frequently; for example, the support degree of "programmer" and "Java" being higher than that of "programmer" and "tea" indicates that "programmer" and "Java" appear together in a sentence more often than "programmer" and "tea". With the interest type and the interest item denoted X and Y respectively, the support degree is calculated as:

Support(X, Y) = (number of times X and Y appear in the same sentence) / (total word count)

Here, the total word count refers to the total number of words contained in the historical speech content of the earphone wearing object. For example, if the acquired historical speech content contains 100 words in total and "programmer" and "Java" appear in the same sentence 8 times, the support degree of the interest content comprising "programmer" and "Java" is: Support(programmer, Java) = 8/100 = 0.08.
2) Calculating the confidence degree of the interest content.
Specifically, the confidence degree of a piece of interest content refers to: the probability that the interest item of that interest content appears given that its interest type appears in the historical speech content of the headset wearing object. With the interest type denoted X and the interest item denoted Y, the confidence degree is calculated as:

Confidence(X, Y) = (number of times X and Y appear in the same sentence) / (total number of times X appears)

Here, the total number of times X appears refers to the total number of occurrences of X in the historical speech content of the earphone wearing object. For example, if the acquired historical speech content contains 100 words in total, "programmer" appears 20 times, and "programmer" and "Java" appear in the same sentence 8 times, the confidence degree of the interest content comprising "programmer" and "Java" is: Confidence(programmer, Java) = 8/20 = 0.4.
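The two ratios above reduce to one-line calculations. A sketch consistent with the worked examples (the function and parameter names are illustrative):

```python
def support(cooccurrence_count, total_word_count):
    """Support(X, Y): number of times the interest type X and the
    interest item Y appear in the same sentence, divided by the total
    word count of the historical speech content."""
    return cooccurrence_count / total_word_count

def confidence(cooccurrence_count, type_count):
    """Confidence(X, Y): probability that Y appears given that X
    appears, i.e. the same co-occurrence count divided by the total
    number of occurrences of X."""
    return cooccurrence_count / type_count
```

With the example figures (8 co-occurrences, 100 words, 20 occurrences of "programmer"), these give 0.08 and 0.4 respectively.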
Based on the method, the support degree and the confidence degree of each piece of interest content are calculated. Table 1 below illustrates a correspondence table of interest content support and confidence.
TABLE 1
3) Calculating the interest degree of the interest content.
After the support degree and the confidence degree of each piece of interest content are determined, they are stored per piece of interest content. The interest degree may directly use the support degree or the confidence degree of the interest content; alternatively, as in this method, both may be taken into account, and the correspondence between interest content and interest degree is determined from the support degree and the confidence degree together.
Specifically, the interest degree refers to the degree of interest of the headset wearing object in the interest content. With the interest type and the interest item represented as X and Y respectively, the interest degree is generally denoted I(X, Y), and a specific calculation formula can be as follows:
for example, if the support of the interest content of the above "programmer" and "Java" is 0.08 and the confidence is 0.4, the interest level can be calculated by the following formula:
the corresponding relation between each interest content and the interest degree is obtained by the method, and the corresponding relation is used as the characteristic information. It is understood that the correspondence between the interest content and the interest level reflects the interest level of the earphone wearing object in the interest content, and the correspondence is continuously updated as the historical speech content increases.
Based on this, the process of determining in A2 or B3 whether the ambient sound data matches the feature information may include:
and C1, determining whether the corresponding identification text of the environmental sound data contains the target interesting content.
Specifically, the recognition text corresponding to the environmental sound data is searched for the target interest content determined above. As long as the identification text contains any one or more items of the target interest type set or the target interest item set, it is determined that the identification text contains target interest content. For example, if searching finds that the identification text corresponding to the environmental sound data contains "Avengers" from the target interest item set, or that it contains "Hollywood movie" from the target interest type set, it may be determined that the identification text contains target interest content.
It can be understood that, if the identification text corresponding to the environmental sound data only contains the target interest type, the interest content containing the target interest type may be used as the target interest content contained in the identification text; if the identification text corresponding to the environmental sound data only contains the target interest item, the interest content containing the target interest item can be used as the target interest content contained in the identification text; if the identification text corresponding to the environmental sound data contains the target interest type and the target interest item at the same time, the interest content containing the target interest type and the target interest item can be used as the target interest content contained in the identification text.
C2, if it is determined that target interest content is contained, querying the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree.
It is understood that the determined target interest content may include multiple pieces. For example, if it is determined that the identification text corresponding to the environmental sound data contains "Avengers" from the target interest item set, it can be determined from Table 1 that there are two pieces of target interest content containing the target interest item "Avengers", whose target interest types are "Hollywood movie" and "romance", respectively.
Further, the target interest degree corresponding to each piece of target interest content may be obtained by querying, and when there are multiple pieces of target interest content, the overall target interest degree may be obtained from the multiple target interest degrees, for example by adding them or by a weighted sum. For example, if the identification text corresponding to the current environmental sound data contains "Avengers", the target interest content containing "Avengers" is queried; one piece is found to be "Avengers" under the interest type "romance" and the other "Avengers" under the interest type "Hollywood movie", and the two corresponding interest degrees may then be added to obtain the target interest degree.
And C3, determining whether the environmental sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold value.
Specifically, an interest degree threshold may be preset according to the usage habits of the headset wearing object. When the target interest degree corresponding to the target interest content is greater than or equal to the preset threshold, it is determined that the environmental sound data matches the feature information; otherwise, it is determined that they do not match. Since the correspondence between interest content and interest degree is determined from the historical speech content of the earphone wearing object, it can then be further determined that the environmental sound data contains content of interest to the earphone wearing object.
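Steps C1 to C3 can be sketched as one function. Summing the interest degrees of all hit target interest contents follows the "adding" option mentioned above; the table layout and all names are illustrative:

```python
def match_by_interest(recognition_text, interest_table, threshold):
    """C1: find target interest contents whose interest type or interest
    item appears in the recognition text of the ambient sound data.
    C2: look up and combine (here: add) their interest degrees.
    C3: compare the resulting target interest degree with the preset
    interest degree threshold."""
    target_degree = 0.0
    for (interest_type, interest_item), degree in interest_table.items():
        if interest_type in recognition_text or interest_item in recognition_text:
            target_degree += degree
    return target_degree >= threshold
```

A weighted sum, or taking the maximum hit, would be equally valid combination rules; the threshold comparison in C3 is unchanged either way.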
The above describes three ways of determining whether the ambient sound data matches the feature information. Since the feature information is determined from the preset personal information or the historical speech content of the earphone wearing object, it is related to the content of interest of the earphone wearing object; therefore, when the ambient sound data is determined to match the feature information by any of the above methods, it can be determined that the ambient sound data contains content of interest to the earphone wearing object.
It should be noted that, the matching process of the three kinds of feature information may have an online implementation process or an offline implementation process.
In the online implementation, the background server calls an open application program interface (API) to acquire the preset personal information or the historical speech content, and obtains and stores the feature information. When matching is needed, the earphone sends the identification text corresponding to the environmental sound data to the background server, which performs the process of determining whether the environmental sound data matches the feature information and returns the matching result to the earphone.
In the offline implementation, the earphone downloads the feature information stored on the background server to the local device in advance, and the process of determining whether the environmental sound data matches the feature information is performed locally based on that feature information.
An embodiment of the present application is described below for the process in step S120 of playing the content of interest to the earphone wearing object, which may include:
and D1, identifying a target voiceprint corresponding to the content which is interested by the earphone wearing object in the environment sound data.
Specifically, there may be one or more target voiceprints corresponding to the content of interest to the earphone wearing object, and they are identified and marked in sequence. For example, in a multi-person conversation scene where A and B speak to the earphone wearing object at the same time and the content of interest in the environmental sound data comes from both A and B, the voiceprints of A and B are identified by voiceprint recognition technology and determined in sequence as target voiceprint 1 and target voiceprint 2.
Specifically, when the information of interest is an interest voiceprint, if it is determined that sound data corresponding to the interest voiceprint exists in the environment sound data, the interest voiceprint is directly used as a target voiceprint corresponding to the content of interest of the earphone wearing object.
And D2, entering a dialogue mode of the object corresponding to the target voiceprint.
Specifically, the sound data corresponding to the target voiceprint includes content that is of interest to the earphone wearing object, so the dialog mode is to play the sound data corresponding to each target voiceprint in the ambient sound data.
On this basis, after entering the dialogue mode of the object corresponding to the target voiceprint, the earphone wearing object can hear the sound of that object. It can be understood that, as time passes, the environmental sound data may no longer contain sound data corresponding to the target voiceprint; for example, in a dialogue scene, after the earphone wearing object finishes talking with object A and object A leaves, the environmental sound data no longer contains sound data corresponding to object A. Therefore, the process of playing the content of interest to the earphone wearing object in step S120 may further include:
First, the start time of beginning to acquire the sound data corresponding to each target voiceprint is recorded. Further, for each target voiceprint, it is detected whether sound data corresponding to that voiceprint is acquired within a set time length after the start time; if not, the dialogue mode of the object corresponding to that target voiceprint is exited, without affecting the dialogue modes of the other target voiceprints.
For example, if the content of interest to the earphone wearing object in the environmental sound data comes from object A, the voiceprint of object A is identified as the target voiceprint by voiceprint recognition technology. If the start time of acquiring sound data corresponding to the target voiceprint is 10:00 and the set time length is 5 minutes, then if no sound data corresponding to the target voiceprint is acquired within 5 minutes from 10:00, the dialogue mode of object A corresponding to the target voiceprint is exited.
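The per-voiceprint timeout above can be sketched as a small session object. The 300-second window follows the 5-minute example, and the class and method names are illustrative; each session is independent, so exiting one dialogue mode does not affect the others:

```python
class VoiceprintSession:
    """Dialogue mode state for one target voiceprint."""

    def __init__(self, start_time, timeout_seconds=300):
        self.last_heard = start_time      # start time of acquiring sound data
        self.timeout = timeout_seconds    # set time length (5 minutes here)
        self.active = True

    def on_sound(self, t):
        """Sound data for this voiceprint was acquired at time t."""
        self.last_heard = t

    def check(self, now):
        """Exit the dialogue mode if no sound data was acquired within
        the set time length; return whether the mode is still active."""
        if now - self.last_heard > self.timeout:
            self.active = False
        return self.active
```

A separate `VoiceprintSession` would be kept for each marked target voiceprint (target voiceprint 1, target voiceprint 2, and so on).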
The following describes the noise reduction device for a headphone provided in an embodiment of the present application, and the noise reduction device for a headphone described below and the noise reduction method for a headphone described above may be referred to in correspondence.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a noise reduction device for a headset disclosed in an embodiment of the present application.
As shown in fig. 2, the apparatus may include:
an ambient sound data acquisition unit 11 for acquiring ambient sound data;
an interested content judging unit 12 for determining whether the ambient sound data contains content of interest to the earphone wearing object;
a noise canceling unit 13 configured to, when it is determined that the ambient sound data includes content of interest to the headphone wearing object, cancel the content of the ambient sound data other than the content of interest to the headphone wearing object;
an interesting content playing unit 14, configured to play the content of interest to the earphone wearing object when it is determined that the ambient sound data includes that content.
Optionally, the embodiment of the present application introduces a first one of a plurality of structures that the interested content determining unit 12 may include:
the content of interest determination unit may include:
and the first interesting content judging subunit is used for determining whether the environmental sound data contains sound data corresponding to the interesting information or not according to the preset interesting information corresponding to the earphone wearing object.
Optionally, when the information of interest is an interest word, the first interested content determining subunit may include:
an identification text acquisition unit for converting the environmental sound data into an identification text;
and the interest word searching unit is used for searching whether a preset interest word corresponding to the earphone wearing object exists in the identification text, if so, determining that the environmental sound data contains sound data corresponding to the interest information, and if not, determining that the environmental sound data does not contain the sound data corresponding to the interest information.
Optionally, when the information of interest is an interest voiceprint, the first content of interest determining subunit may include:
an interest voiceprint determining unit, configured to determine whether sound data corresponding to the interest voiceprint exists in the environmental sound data; and if the environmental sound data exists, determining that the environmental sound data contains the sound data corresponding to the interesting information, and if the environmental sound data does not exist, determining that the environmental sound data does not contain the sound data corresponding to the interesting information.
Optionally, the present embodiment introduces a second structure of multiple structures that the interested content determining unit may include:
the content of interest determination unit may include:
the second interested content judging subunit is used for acquiring the characteristic information of the earphone wearing object, wherein the characteristic information is related to the interested content of the earphone wearing object;
and the third interested content judging subunit, configured to determine whether the environmental sound data matches the feature information; if so, determine that the environmental sound data contains content of interest to the earphone wearing object, and if not, determine that it does not.
Optionally, two different structures of the interested content determining unit are introduced above, in this embodiment, the two structures may be combined to provide a third optional structure of the interested content determining unit, as follows:
the interested content judging unit may include:
the first interesting content judging subunit is configured to determine, according to preset interesting information corresponding to the earphone wearing object, whether the environmental sound data includes sound data corresponding to the interesting information, and if so, determine that the environmental sound data includes interesting content of the earphone wearing object;
a second interested content judging subunit, configured to, when the first interested content judging subunit determines that the environmental sound data does not include sound data corresponding to the interested information, acquire feature information of the earphone wearing object, where the feature information is related to the interested content of the earphone wearing object;
and the third interested content judging subunit is used for determining whether the environmental sound data is matched with the feature information, if so, determining that the environmental sound data contains the interested content of the earphone wearing object, and if not, determining that the environmental sound data does not contain the interested content of the earphone wearing object.
Optionally, in the embodiment of the present application, a plurality of structures that the third interested content determining subunit may include are introduced:
a first kind,
The second content of interest determination subunit may include:
a first characteristic information acquisition unit configured to acquire personal information of the headphone wearing object set in advance as characteristic information.
Based on this, the third content of interest determination subunit may include:
a tag acquisition unit configured to acquire a tag corresponding to the personal information;
and the tag searching unit is used for determining whether the identification text corresponding to the environmental sound data contains the tag or not, if so, determining matching, and if not, determining mismatching.
Second structure:
The second content of interest determination subunit may include:
and the second characteristic information acquisition unit is used for acquiring key time points and event information contained in the historical speech content of the earphone wearing object as characteristic information.
Based on this, the third content of interest determination subunit may include:
and the identification text reference unit is used for determining whether the identification text corresponding to the environmental sound data contains the key time point and/or the event information, if so, determining matching, and if not, determining mismatching.
Third structure:
The second content of interest determination subunit may include:
and the third characteristic information acquisition unit is used for acquiring the correspondence between interest content and interest degree determined according to the historical speech content of the earphone wearing object as characteristic information.
Based on this, the third content of interest determination subunit may include:
an interest content judging unit, configured to determine whether the recognition text corresponding to the environmental sound data contains target interest content;
a target interest degree query unit, configured to, if the judgment result of the interest content judging unit is yes, query the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and an interest degree comparison unit, configured to determine whether the environmental sound data matches the characteristic information according to the magnitude relation between the target interest degree and a preset interest degree threshold.
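The three units of this third implementation can be sketched together as follows; the mapping, threshold value, and function name are hypothetical assumptions for illustration, not values from the disclosure:

```python
def matches_by_interest_degree(recognition_text: str,
                               interest_degree_map: dict,
                               threshold: float = 0.5) -> bool:
    """(1) Find target interest content in the recognition text,
    (2) query its interest degree from the correspondence map,
    (3) compare the degree against the preset threshold."""
    for content, degree in interest_degree_map.items():
        if content in recognition_text:      # target interest content found
            return degree >= threshold       # match only at or above threshold
    return False                             # no target interest content
```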
Optionally, in an embodiment of the present application, the content of interest playing unit may include:
the target voiceprint identification unit is used for identifying a target voiceprint corresponding to the content which is interested by the earphone wearing object in the environment sound data;
a dialog mode entering unit, configured to enter a dialog mode for the object corresponding to the target voiceprint, where the dialog mode includes: playing the sound data corresponding to the target voiceprint in the environmental sound data.
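Assuming ambient sound has already been segmented and each segment labeled with the voiceprint that produced it (an assumption of this sketch, not a detail of the disclosure), the dialog mode's playback filter could look like:

```python
def dialog_mode_playback(ambient_segments: list, target_voiceprint: str) -> list:
    """Dialog mode sketch: from (voiceprint_id, audio_segment) pairs,
    keep only the segments whose voiceprint matches the target,
    i.e., the speech of the person the wearer is talking with."""
    return [audio for voiceprint, audio in ambient_segments
            if voiceprint == target_voiceprint]
```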
Optionally, the content of interest playing unit may further include:
the starting time recording unit is used for recording the starting time for starting to acquire the sound data corresponding to the target voiceprint;
a set time detection unit, configured to detect whether sound data corresponding to the target voiceprint is obtained within a set time length after the start time;
and a dialog mode exit unit, configured to exit the dialog mode for the object corresponding to the target voiceprint when the detection result of the set time detection unit is negative.
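The timeout logic of the set time detection unit and the dialog mode exit unit can be sketched as below; the parameter names and the use of plain timestamps are illustrative assumptions:

```python
def should_exit_dialog_mode(start_time: float,
                            last_target_audio_time,
                            set_duration: float,
                            now: float) -> bool:
    """Exit the dialog mode if no sound data for the target voiceprint
    has been acquired within the set time length after the start time.
    last_target_audio_time is None when no target audio arrived."""
    if last_target_audio_time is None:
        return (now - start_time) > set_duration
    return False  # target audio was received, so stay in dialog mode
```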
The earphone noise reduction device provided by the embodiment of the application can be applied to earphone noise reduction equipment, such as a PC terminal, a cloud platform, a server cluster and the like. Alternatively, fig. 3 shows a block diagram of a hardware structure of the headphone noise reduction device; referring to fig. 3, the hardware structure may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the application, the number of each of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 communicate with one another through the communication bus 4;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), one or more integrated circuits configured to implement the embodiments of the present invention, or the like;
the memory 3 may include a high-speed RAM memory, and may further include a non-volatile memory, such as at least one disk memory;
wherein the memory stores a program, and the processor may call the program stored in the memory, the program being configured to:
acquiring environmental sound data;
determining whether the ambient sound data contains content of interest to the headset wearing object;
and if so, eliminating, from the environmental sound data, the content other than the content of interest to the earphone wearing object, and playing the content of interest to the earphone wearing object.
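The three program steps above can be sketched end to end; this toy version uses text in place of real audio and hypothetical names, purely to illustrate the control flow:

```python
def run_noise_reduction(ambient_text: str, interest_keywords: list):
    """(1) acquire environmental sound data (here, a transcript),
    (2) determine whether it contains content of interest,
    (3) if so, eliminate everything else and return only the
    content of interest for playback; otherwise suppress all sound."""
    sentences = [s.strip() for s in ambient_text.split(".") if s.strip()]
    of_interest = [s for s in sentences
                   if any(k in s for k in interest_keywords)]
    if not of_interest:
        return None  # no content of interest: suppress all ambient sound
    return ". ".join(of_interest)
```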
Alternatively, the detailed function and the extended function of the program may be as described above.
Embodiments of the present application further provide a readable storage medium storing a program executable by a processor, the program being configured to:
acquiring environmental sound data;
determining whether the ambient sound data contains content of interest to the headset wearing object;
and if so, eliminating, from the environmental sound data, the content other than the content of interest to the earphone wearing object, and playing the content of interest to the earphone wearing object.
Alternatively, the detailed function and the extended function of the program may be as described above.
Finally, it should also be noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.