Disclosure of Invention
In view of the above, the present application provides a headphone noise reduction method, apparatus, device, and readable storage medium. The method addresses the drawback of existing headphone noise reduction approaches, which eliminate all ambient sound and thereby cause considerable inconvenience to the headphone wearer.
In order to achieve the above object, the following solutions are proposed:
a headphone noise reduction method, comprising:
acquiring ambient sound data;
determining whether the ambient sound data contains content of interest to the headphone wearer;
and if so, eliminating all content in the ambient sound data other than the content of interest to the headphone wearer, and playing the content of interest.
Preferably, the determining whether the ambient sound data contains content of interest to the headphone wearer includes:
determining, according to pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information.
Preferably, the determining, according to the pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information includes:
converting the ambient sound data into a recognition text;
and searching the recognition text for a pre-configured interest word corresponding to the headphone wearer; if the interest word is found, determining that the ambient sound data contains sound data corresponding to the interest information, and if not, determining that it does not.
Preferably, the determining, according to the pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information includes:
determining whether sound data matching an interest voiceprint exists in the ambient sound data;
and if so, determining that the ambient sound data contains sound data corresponding to the interest information, and if not, determining that it does not.
Preferably, the determining whether the ambient sound data contains content of interest to the headphone wearer includes:
acquiring feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and determining whether the ambient sound data matches the feature information; if so, determining that the ambient sound data contains content of interest to the headphone wearer, and if not, determining that it does not.
Preferably, if it is determined, according to the pre-configured interest information corresponding to the headphone wearer, that the ambient sound data does not contain sound data corresponding to the interest information, the method further includes:
acquiring feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and determining whether the ambient sound data matches the feature information; if so, determining that the ambient sound data contains content of interest to the headphone wearer, and if not, determining that it does not.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring pre-configured personal information of the headphone wearer as the feature information;
and the determining whether the ambient sound data matches the feature information includes:
acquiring a label corresponding to the personal information;
and determining whether the recognition text corresponding to the ambient sound data contains the label; if so, determining a match, and if not, determining a mismatch.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring, as the feature information, key time points and event information contained in historical speech of the headphone wearer;
and the determining whether the ambient sound data matches the feature information includes:
determining whether the recognition text corresponding to the ambient sound data contains the key time points and/or the event information; if so, determining a match, and if not, determining a mismatch.
Preferably, the acquiring the feature information of the headphone wearer includes:
acquiring, as the feature information, a correspondence between interest content and interest degree determined from historical speech of the headphone wearer;
and the determining whether the ambient sound data matches the feature information includes:
determining whether the recognition text corresponding to the ambient sound data contains target interest content;
if so, querying the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and determining whether the ambient sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold.
Preferably, the playing the content of interest to the headphone wearer includes:
identifying, in the ambient sound data, a target voiceprint corresponding to the content of interest to the headphone wearer;
and entering a dialog mode for the speaker corresponding to the target voiceprint, the dialog mode comprising: playing the sound data in the ambient sound data that corresponds to the target voiceprint.
Preferably, the playing the content of interest to the headphone wearer further includes:
recording the start time at which acquisition of the sound data corresponding to the target voiceprint begins;
detecting whether sound data corresponding to the target voiceprint is acquired within a set duration after the start time;
and if not, exiting the dialog mode for the speaker corresponding to the target voiceprint.
A headphone noise reduction apparatus comprising:
an ambient sound data acquisition unit, configured to acquire ambient sound data;
a content of interest determination unit, configured to determine whether the ambient sound data contains content of interest to the headphone wearer;
a noise elimination unit, configured to eliminate, when the ambient sound data is determined to contain content of interest to the headphone wearer, all content in the ambient sound data other than that content of interest;
and a content of interest playing unit, configured to play the content of interest when the ambient sound data is determined to contain content of interest to the headphone wearer.
Preferably, the content of interest determination unit includes:
a first content of interest determination subunit, configured to determine, according to pre-configured interest information corresponding to the headphone wearer, whether the ambient sound data contains sound data corresponding to the interest information.
Preferably, the interest information is an interest word, and the first content of interest determination subunit includes:
a recognition text acquisition unit, configured to convert the ambient sound data into a recognition text;
and an interest word search unit, configured to search the recognition text for a pre-configured interest word corresponding to the headphone wearer; if the interest word is found, it is determined that the ambient sound data contains sound data corresponding to the interest information, and if not, that it does not.
Preferably, the interest information is an interest voiceprint, and the first content of interest determination subunit includes:
an interest voiceprint determination unit, configured to determine whether sound data matching the interest voiceprint exists in the ambient sound data; if so, it is determined that the ambient sound data contains sound data corresponding to the interest information, and if not, that it does not.
Preferably, the content of interest determination unit includes:
a second content of interest determination subunit, configured to acquire feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and a third content of interest determination subunit, configured to determine whether the ambient sound data matches the feature information; if so, it is determined that the ambient sound data contains content of interest to the headphone wearer, and if not, that it does not.
Preferably, the content of interest determination unit further includes:
a second content of interest determination subunit, configured to acquire, when the first content of interest determination subunit determines that the ambient sound data does not contain sound data corresponding to the interest information, feature information of the headphone wearer, the feature information being related to the content of interest to the headphone wearer;
and a third content of interest determination subunit, configured to determine whether the ambient sound data matches the feature information; if so, it is determined that the ambient sound data contains content of interest to the headphone wearer, and if not, that it does not.
Preferably, the second content of interest determination subunit includes:
a first feature information acquisition unit, configured to acquire pre-configured personal information of the headphone wearer as the feature information;
and the third content of interest determination subunit includes:
a label acquisition unit, configured to acquire a label corresponding to the personal information;
and a label search unit, configured to determine whether the recognition text corresponding to the ambient sound data contains the label; if so, a match is determined, and if not, a mismatch.
Preferably, the second content of interest determination subunit includes:
a second feature information acquisition unit, configured to acquire, as the feature information, key time points and event information contained in historical speech of the headphone wearer;
and the third content of interest determination subunit includes:
a recognition text reference unit, configured to determine whether the recognition text corresponding to the ambient sound data contains the key time points and/or the event information; if so, a match is determined, and if not, a mismatch.
Preferably, the second content of interest determination subunit includes:
a third feature information acquisition unit, configured to acquire, as the feature information, a correspondence between interest content and interest degree determined from historical speech of the headphone wearer;
and the third content of interest determination subunit includes:
an interest content determination unit, configured to determine whether the recognition text corresponding to the ambient sound data contains target interest content;
a target interest degree query unit, configured to query, when the determination result of the interest content determination unit is yes, the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and an interest degree comparison unit, configured to determine whether the ambient sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold.
Preferably, the content of interest playing unit includes:
a target voiceprint identification unit, configured to identify, in the ambient sound data, a target voiceprint corresponding to the content of interest to the headphone wearer;
and a dialog mode entering unit, configured to enter a dialog mode for the speaker corresponding to the target voiceprint, the dialog mode comprising: playing the sound data in the ambient sound data that corresponds to the target voiceprint.
Preferably, the content of interest playing unit further includes:
a start time recording unit, configured to record the start time at which acquisition of the sound data corresponding to the target voiceprint begins;
a set duration detection unit, configured to detect whether sound data corresponding to the target voiceprint is acquired within a set duration after the start time;
and a dialog mode exit unit, configured to exit the dialog mode for the speaker corresponding to the target voiceprint when the detection result of the set duration detection unit is negative.
A headphone noise reduction device comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of any one of the headphone noise reduction methods described above.
A readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of any one of the headphone noise reduction methods described above.
It can be seen from the foregoing technical solutions that the headphone noise reduction method provided in the embodiments of the present application acquires ambient sound data from the environment in which the headphone wearer is located and determines whether that data contains content of interest to the wearer, i.e., content the wearer wants to hear. Once such content is identified, everything else in the ambient sound data is eliminated and only the content of interest is played. The wearer thus remains in a quiet state, undisturbed by environmental noise, without missing the sound content he or she wants to hear, which improves the wearer's convenience.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments. It is obvious that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present application.
The headphone noise reduction method provided by the present application can be applied to noise-canceling headphones and is suited to scenarios in which a user wears them to suppress ambient sound, for example in a noisy public place where the user wants a quiet environment in which to rest. In such a case, a conventional noise reduction method suppresses all ambient sound, but in doing so it also eliminates ambient sounds the headphone wearer cares about, such as a conversation with another person or a public announcement. The method of the present application therefore plays the content of interest to the wearer while reducing the remaining noise in the ambient sound data, so that the wearer stays in a quiet state yet still hears the content of interest in the ambient sound.
Next, the headphone noise reduction method of the present application is described with reference to fig. 1, which illustrates a flowchart of the method. In detail, the method includes:
Step S100: acquiring ambient sound data.
Specifically, the ambient sound data of the environment in which the headphone wearer is located is acquired. In one optional implementation, a recording module is provided; when sound from the external environment is detected, the recording module is started to collect the ambient sound signal.
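The application does not specify how external sound is detected before the recording module starts. The sketch below assumes a simple per-frame energy threshold; the function names and the threshold value are illustrative choices, not part of the application.

```python
# Toy sound-detection gate for the hypothetical recording module: compute the
# mean squared amplitude of each audio frame and start recording once it
# exceeds a fixed threshold. The threshold 0.01 is an assumed value.

def frame_energy(samples):
    """Mean squared amplitude of one audio frame (samples in [-1.0, 1.0])."""
    return sum(s * s for s in samples) / len(samples)

def should_start_recording(frame, threshold=0.01):
    """Start collecting the ambient sound signal once sound is detected."""
    return frame_energy(frame) > threshold

# A near-silent frame should not trigger recording; a louder one should.
quiet = [0.001] * 160
speech = [0.5, -0.4, 0.3, -0.5] * 40
```

In practice a production system would use a proper voice activity detector rather than a raw energy gate, but the control flow — detect, then start the recording module — is the same.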
Step S110: determining whether the ambient sound data contains content of interest to the headphone wearer; if so, executing step S120.
Specifically, the content of interest to the headphone wearer is the content in the ambient sound data that the wearer wants to hear, for example an announcement in a waiting hall, or a conversation between another person and the wearer. After the ambient sound data has been acquired, this step determines whether it contains such content.
Step S120: eliminating the content in the ambient sound data other than the content of interest to the headphone wearer, and playing the content of interest.
Specifically, when step S110 determines that the ambient sound data contains content of interest to the headphone wearer, the rest of the ambient sound data is eliminated as noise. The elimination may follow the same principle as existing noise reduction methods, namely generating a sound wave opposite in phase to the noise so that the two cancel. The content of interest is then played, so that the wearer hears the sound content he or she wants to hear.
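The cancellation principle above can be illustrated with a toy calculation: the anti-noise wave is the phase-inverted noise, so superposing noise, anti-noise, and the content of interest leaves only the content of interest. This is a sketch of the principle only, not a real-time noise-cancellation implementation.

```python
# Phase-inversion cancellation in miniature: heard = noise + content - noise.

def anti_noise(noise):
    """Generate a sound wave opposite in phase to the noise."""
    return [-s for s in noise]

def mix(*signals):
    """Superpose equally long sample streams, as the air does acoustically."""
    return [sum(samples) for samples in zip(*signals)]

noise = [0.2, -0.3, 0.1, 0.4]
content_of_interest = [0.05, 0.1, -0.05, 0.0]
heard = mix(noise, content_of_interest, anti_noise(noise))
# `heard` equals `content_of_interest` up to floating-point rounding.
```

Real systems must estimate the noise and emit the anti-noise with near-zero latency; this sketch assumes the noise is known exactly.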
It can be seen from the foregoing that the headphone noise reduction method provided in the embodiments of the present application acquires ambient sound data from the environment in which the headphone wearer is located and determines whether that data contains content of interest to the wearer, i.e., sound content the wearer wants to hear. Once such content is identified, everything else in the ambient sound data is eliminated and only the content of interest is played. The wearer thus remains in a quiet state, undisturbed by environmental noise, without missing the sound content he or she wants to hear, which improves the wearer's convenience.
Next, an embodiment of the present application is provided that describes step S110, determining whether the ambient sound data contains content of interest to the headphone wearer.
As described above, the content of interest to the headphone wearer is the sound data content the wearer wants to hear. There are various ways to determine whether the ambient sound data contains such content; this embodiment describes one optional implementation.
Specifically, this implementation is based on interest information, i.e., pre-configured information corresponding to the headphone wearer that describes the sound content the wearer wants to hear. Whether the ambient sound data contains content of interest to the wearer can therefore be determined by checking whether it contains sound data corresponding to the interest information.
Depending on the type of pre-configured interest information, there are several ways to determine whether the ambient sound data contains sound data corresponding to it. Two optional implementations are described below.
First, the interest information is an interest word.
Here, an interest word is one or more words pre-configured by the headphone wearer, and may denote a time, an event, a person, a location, and so on. An interest word is a word contained in the sound content the wearer wants to hear: for example, the wearer's own name or a companion's name may be configured as an interest word, or, in a waiting hall, keywords such as "passenger", "announcement", or "train number" may be configured in advance if the wearer wants to hear the announcements.
Based on this, the process of determining whether the ambient sound data contains sound data corresponding to the interest information may include:
first, converting the ambient sound data into a recognition text.
Specifically, the acquired ambient sound data is converted into a recognition text by a speech recognition technique. Any existing speech recognition technique may be selected; this embodiment does not describe it in detail.
Then, the recognition text is searched for a pre-configured interest word corresponding to the headphone wearer.
Specifically, the pre-configured interest word is searched for in the recognition text of the ambient sound data. Since the interest word is the interest information, if the word is found, it is determined that the ambient sound data contains sound data corresponding to the interest information; if not, it is determined that it does not.
For example, suppose the headphone wearer wants to hear any sound content containing his name, and therefore configures the name "Li Ming" as an interest word in advance. Once the headphones are in use, ambient sound data is continuously acquired and converted into recognition text, which is searched for the pre-configured interest word "Li Ming". If the word is found, the ambient sound data is determined to contain sound data corresponding to the wearer's name, i.e., sound data corresponding to the interest information.
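The interest-word search can be sketched as follows. The recognition text is assumed to come from an unspecified speech recognition step and is passed in directly; the function name, variable names, and example words are illustrative, not from the application.

```python
# Interest-word check: the ambient sound data is considered to contain sound
# data corresponding to the interest information if any pre-configured
# interest word occurs in the recognition text.

def contains_interest_info(recognition_text, interest_words):
    """Return True if any pre-configured interest word occurs in the text."""
    return any(word in recognition_text for word in interest_words)

interest_words = {"Li Ming", "passenger", "train number"}
text = "Li Ming, your train is about to depart"
```

A substring scan is the simplest realization; tokenized or fuzzy matching could be substituted without changing the surrounding flow.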
Second, the interest information is an interest voiceprint.
Here, an interest voiceprint is the voiceprint of a voice the headphone wearer wants to hear. For example, if the wearer wants to hear a companion's conversation with him, the voiceprint of the companion's voice can be configured in advance as an interest voiceprint.
In this case, the process of determining whether the ambient sound data contains sound data corresponding to the interest information may include:
determining whether sound data matching the interest voiceprint exists in the ambient sound data.
Specifically, the ambient sound data is searched to determine whether it contains sound data matching the pre-configured interest voiceprint. Since the interest voiceprint is the interest information configured in advance by the headphone wearer, if matching sound data is found, it is determined that the ambient sound data contains sound data corresponding to the interest information; if not, it is determined that it does not.
For example, suppose the headphone wearer wants to hear his companion when the companion speaks to him, and therefore configures the companion's voice as an interest voiceprint in advance. Once the headphones are in use, ambient sound data is continuously acquired and checked for the pre-configured interest voiceprint. If it is present, the ambient sound data is determined to contain sound data corresponding to the companion's voiceprint, i.e., sound data corresponding to the interest information.
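The application leaves the voiceprint-matching algorithm unspecified. One common approach, sketched below under that assumption, represents each voice by a fixed-length embedding and compares embeddings by cosine similarity against a threshold; the embeddings and threshold here are invented for illustration.

```python
# Voiceprint match as cosine similarity between speaker embeddings.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def matches_interest_voiceprint(segment_embedding, interest_embedding,
                                threshold=0.8):
    """Treat the segment as the enrolled voice if similarity is high enough."""
    return cosine_similarity(segment_embedding, interest_embedding) >= threshold

companion = [0.9, 0.1, 0.4]          # enrolled interest voiceprint
same_speaker = [0.85, 0.15, 0.38]    # a segment close to the enrolled voice
stranger = [-0.2, 0.9, -0.4]         # a very different voice
```

Producing the embeddings themselves requires a trained speaker-verification model, which is outside the scope of this sketch.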
It should be noted that both methods of determining whether the ambient sound data contains sound data corresponding to the interest information can be implemented either online or offline.
In the online implementation, a background server acquires and stores the pre-configured interest information. After acquiring the ambient sound data, the headphones send it to the server; the server determines whether the data contains sound data corresponding to the interest information and returns the result to the headphones.
In the offline implementation, the headphones acquire the pre-configured interest information, store it locally, and determine locally whether the ambient sound data contains sound data corresponding to the interest information.
In both methods, a pre-configured interest word or interest voiceprint serves as the interest information, and the check is whether the ambient sound data contains that word or voiceprint. Since the interest information is configured in advance and relates to the content the headphone wearer wants to hear, the result of the check directly answers whether the ambient sound data contains content of interest to the wearer: if the data contains sound data corresponding to the interest information, it contains content of interest, and otherwise it does not.
Next, this embodiment introduces another optional implementation of step S110. Determining whether the ambient sound data contains content of interest to the headphone wearer may include:
A1, acquiring feature information of the headphone wearer.
Specifically, the feature information relates to the content of interest to the headphone wearer. For example, if the wearer likes watching movies and doing sports, topics in the ambient sound data related to movies or sports can be considered content the wearer wants to hear. Information related to the wearer's tastes, such as movies and movie titles, or sports and sporting events, may therefore serve as the feature information; it is by construction related to the content of interest to the wearer.
A2, determining whether the ambient sound data matches the feature information.
Specifically, since the feature information relates to the content of interest to the headphone wearer, if the ambient sound data is determined to match the feature information, the data is determined to contain content of interest to the wearer; otherwise it is determined not to.
It should be noted that when the ambient sound data is determined to match the feature information, a secondary confirmation step may be added: a voice or vibration prompt is issued, and the headphone wearer confirms whether the ambient sound data indeed contains content of interest. This secondary confirmation makes the determination more accurate.
The two embodiments described above describe two different implementations of determining whether the ambient sound data includes a content that is of interest to the earphone wearing object, and this embodiment may also combine the two implementations, and the specific process may include:
b1, determining whether the environmental sound data includes sound data corresponding to the information of interest based on information of interest corresponding to the headset wearing object configured in advance.
Specifically, the interest information corresponding to the earphone wearing object may include a preconfigured interest word or an interest voiceprint, and when it is determined that the environmental sound data includes sound data corresponding to the interest information, it is determined that the environmental sound data includes content in which the earphone wearing object is interested.
If it is determined that the sound data corresponding to the information of interest is not included in the environmental sound data, B2 is executed. If it is determined that the environmental sound data includes sound data corresponding to the information of interest, it may be determined that the environmental sound data includes content of interest to the headphone-worn object.
And B2, acquiring characteristic information of the earphone wearing object.
Wherein the characteristic information is related to the content of interest of the earphone wearing object. This step may refer to the implementation process of A1, which is not described herein.
B3, determining whether the environmental sound data is matched with the characteristic information.
It is to be understood that if the determination result of B3 is yes, it may be determined that the content of interest to the headphone-wearing object is included in the ambient sound data, and otherwise, it may be determined that the content of interest to the headphone-wearing object is not included in the ambient sound data.
The implementation process of step B3 can refer to step A2, and the detailed process is not described herein.
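The combined flow of steps B1 to B3 above can be sketched as follows. This is a minimal illustration only; the function name and the `feature_match` callback are assumptions for the sketch, not part of the embodiment:

```python
def contains_interest(recognition_text, interest_words, feature_match):
    """Combined flow of steps B1 to B3.

    B1: check the preconfigured interest words against the recognition
    text of the ambient sound data; a hit decides immediately.
    B2/B3: otherwise fall back to feature-information matching,
    supplied here as the callable feature_match.
    """
    if any(word in recognition_text for word in interest_words):
        return True  # B1 hit: content of interest is present
    return feature_match(recognition_text)  # B2/B3 fallback
```

A hit on the preconfigured interest words short-circuits the flow, so the (typically more expensive) feature-information matching of B2/B3 runs only when B1 finds nothing.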
It is understood that there are many alternative ways to obtain the feature information; for example, the feature information may be determined according to the personal information of the headset wearing object, or according to the historical speech content of the headset wearing object. Depending on how the feature information is obtained, there are correspondingly different ways of determining whether the environmental sound data matches the feature information; three optional implementations are described below.
First implementation:
And acquiring personal information of a preset earphone wearing object as characteristic information.
Here, the personal information refers to personal information set in advance, such as occupation, age, and home address. It will be appreciated that ambient sound data associated with such personal information may be content of interest to the headset wearing object. For example, if the occupation "programmer" is set in advance as the feature information, ambient sound data related to programmers may be content the headset wearing object wants to hear.
Based on this, the process of determining in A2 or B3 whether the ambient sound data matches the feature information may include:
first, a tag corresponding to the personal information is acquired.
Specifically, labels corresponding to the preset personal information are obtained; the labels are words related to the content of the personal information, and each piece of personal information may correspond to one or more labels. For example, if the occupation of the headset wearing object is programmer, labels related to "programmer" are acquired, such as "code farmer", "program ape", and "program".
Further, it is determined whether the identification text corresponding to the environmental sound data contains a label; if so, a match is determined, and if not, a mismatch is determined.
Specifically, the identification text corresponding to the environmental sound data is searched for tags corresponding to the personal information. When the identification text contains one or more tags, it is determined that the environmental sound data matches the feature information, and further that the environmental sound data contains content of interest to the earphone wearing object. For example, if the occupation of the earphone wearing object is programmer and the identification text corresponding to the ambient sound data includes the label "program ape", it is determined that the ambient sound data matches the feature information, and further that the ambient sound data contains content related to the wearer's own occupation, which is of interest to the earphone wearing object.
And when the identification text corresponding to the environmental sound data does not contain the label, determining that the environmental sound data is not matched with the characteristic information.
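The label-matching step of this first implementation reduces to a substring search over the recognition text. A minimal sketch follows; the function name is illustrative, and the tag list for the occupation "programmer" mirrors the example above:

```python
def matches_personal_info(recognition_text, tags):
    """Match when any tag tied to the preset personal information
    appears in the recognition text of the ambient sound data."""
    return any(tag in recognition_text for tag in tags)

# Tags corresponding to the occupation "programmer", per the example above.
programmer_tags = ["code farmer", "program ape", "program"]
```

One or more tag hits means the ambient sound data matches the feature information; no hit means a mismatch.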
Second implementation:
And acquiring key time points and event information contained in the historical speech content of the earphone wearing object as characteristic information. Here, the historical speech content may include recognition text into which collected historical speech audio is converted.
Specifically, the key time point and event information refer to time points and events contained in the historical speech content of the headphone wearing object over a period of time. It can be understood that this information is related to the content of interest to the headphone wearing object, so the acquired key time point and event information can be used as feature information. In general, a key time point or an item of event information may be used alone as feature information, or a key time point and its corresponding event information may be used together as one set of feature information. For example, if the earphone wearing object discussed a "meeting at two in the afternoon" a week earlier, "two in the afternoon" may be used as the key time point and "meeting" as the event information corresponding to that key time point.
Further, after the key time point and the event information are determined from the historical speech content of the earphone wearing object, a voice prompt may be set to ask the earphone wearing object whether the key time point should be set as a timed alarm. After confirmation by the earphone wearing object is obtained, the key time point is set as a timed alarm, and when the system time reaches the alarm time, the earphone plays a corresponding prompt tone to remind the earphone wearing object to handle the event corresponding to the key time point.
When the feature information contains key time point and event information, the process in A2 or B3 of determining whether the ambient sound data matches the feature information may include:
and determining whether the recognition text corresponding to the environmental sound data contains the key time point and/or the event information.
Alternatively, it may be determined that the ambient sound data matches the feature information as long as at least one of the key time point or the event information is included in the recognition text corresponding to the ambient sound data. For example, if the recognition text corresponding to the environmental sound data includes the key time point "two in the afternoon next week", it may be determined that the environmental sound data matches the feature information.
Alternatively, it is determined that the ambient sound data matches the feature information only when the recognition text corresponding to the ambient sound data includes both the key time point and the event information corresponding to that key time point. For example, if the identification text corresponding to the environmental sound data includes the key time point "two in the afternoon next week" and the corresponding event information "meeting", it may be determined that the environmental sound data matches the feature information.
Since the feature information is determined based on the historical speech content of the earphone wearing object, when it is determined that the ambient sound data matches the feature information, it can be further determined that the ambient sound data contains the content of interest to the earphone wearing object.
It should be noted that, because key time points and event information are time-sensitive, a time tag may be set for each key time point and item of event information, and the validity period of the time tag is determined by the timeliness of that key time point and event information.
The process of determining whether the ambient sound data matches the characteristic information may include:
and determining whether the identification text corresponding to the environmental sound data contains a key time point and event information whose time tag is within the validity period; if so, the match is determined to be successful, and otherwise unsuccessful.
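The validity-period check above can be sketched as follows. The tuple layout, names, and use of a plain expiry datetime for the time tag are assumptions made for illustration:

```python
from datetime import datetime

def match_key_event(recognition_text, key_events, now):
    """key_events: iterable of (time_point, event, expiry) tuples, where
    expiry is the datetime at which the time tag lapses.

    A match requires the time tag to still be within its validity period
    and both the key time point and the corresponding event information
    to appear in the recognition text of the ambient sound data."""
    for time_point, event, expiry in key_events:
        if now <= expiry and time_point in recognition_text and event in recognition_text:
            return True
    return False
```

This implements the stricter variant above (both the key time point and its event information must appear); the looser variant would accept either one alone.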
Third implementation:
And acquiring the corresponding relation between the interest content and the interest degree determined according to the historical speech content of the earphone wearing object as the characteristic information. Here, the historical speech content may include recognition text into which collected historical speech audio is converted.
Specifically, the interest content comprises interest items under an interest category, where the interest categories may include movies, music, books, and the like, and the interest items may include movie names, book names, music names, and the like. The set of interest types and the set of interest items of the earphone wearing object may be determined by analyzing its historical speech content, and any interest type in the interest type set together with any interest item in the interest item set forms one piece of interest content.
It can be understood that the degree of interest of the headset wearing object in each piece of interest content may differ, so the interest degree of each piece of interest content is further determined from the historical speech content of the headset wearing object. The interest degree may be determined in various ways, for example by counting the number of times or the frequency with which the interest content appears in the historical speech content. This embodiment introduces one optional method for calculating the interest degree, as follows:
1) Calculating the support degree of the interest content.
The support degree of a piece of interest content refers to: the frequency with which the interest type and the interest item of that interest content appear together in one sentence of the historical speech content of the headset wearing object. That is, a high support degree indicates that the interest content occurs frequently; for example, the support degree of "programmer" and "Java" being higher than that of "programmer" and "tea" indicates that "programmer" and "Java" appear together in a sentence more often than "programmer" and "tea". With the interest type and the interest item denoted X and Y respectively, the support degree is calculated as:

Support(X, Y) = (number of times X and Y appear in the same sentence) / (total word count)

Here, the total word count refers to the total number of words contained in the historical speech content of the earphone wearing object. For example, if the acquired historical speech content contains 100 words in total and "programmer" and "Java" appear in the same sentence 8 times, the support degree of the interest content comprising "programmer" and "Java" is: Support(programmer, Java) = 8/100 = 0.08.
2) Calculating the confidence degree of the interest content.
Specifically, the confidence degree of a piece of interest content refers to: the probability that the interest item of that interest content appears given that its interest type appears in the historical speech content of the headset wearing object. With the interest type denoted X and the interest item denoted Y, the confidence degree is calculated as:

Confidence(X, Y) = (number of times X and Y appear in the same sentence) / (total number of times X appears)

Here, the total number of times X appears refers to the total number of occurrences of X in the historical speech content of the earphone wearing object. For example, if the acquired historical speech content contains 100 words in total, "programmer" appears 20 times, and "programmer" and "Java" appear in the same sentence 8 times, the confidence degree of the interest content comprising "programmer" and "Java" is: Confidence(programmer, Java) = 8/20 = 0.4.
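The two ratios above reduce to one-line calculations. A sketch consistent with the worked examples (the function and parameter names are illustrative):

```python
def support(cooccurrence_count, total_word_count):
    """Support(X, Y): number of times the interest type X and the
    interest item Y appear in the same sentence, divided by the total
    word count of the historical speech content."""
    return cooccurrence_count / total_word_count

def confidence(cooccurrence_count, type_count):
    """Confidence(X, Y): probability that Y appears given that X
    appears, i.e. the same co-occurrence count divided by the total
    number of occurrences of X."""
    return cooccurrence_count / type_count
```

With the example figures (8 co-occurrences, 100 words, 20 occurrences of "programmer"), these give 0.08 and 0.4 respectively.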
Based on the method, the support degree and the confidence degree of each piece of interest content are calculated. Table 1 below illustrates a correspondence table of interest content support and confidence.
TABLE 1
3) Calculating the interest degree of the interest content.
After the support degree and the confidence degree of each piece of interest content are determined, they are stored per piece of interest content. The interest degree may directly use the support degree or the confidence degree of the interest content; alternatively, as in this method, both may be taken into account, and the correspondence between interest content and interest degree is determined from the support degree and the confidence degree together.
Specifically, the interest degree refers to the degree of interest of the headset wearing object in the interest content. With the interest type and the interest item represented as X and Y respectively, the interest degree is generally denoted I(X, Y), and a specific calculation formula can be as follows:
for example, if the support of the interest content of the above "programmer" and "Java" is 0.08 and the confidence is 0.4, the interest level can be calculated by the following formula:
the corresponding relation between each interest content and the interest degree is obtained by the method, and the corresponding relation is used as the characteristic information. It is understood that the correspondence between the interest content and the interest level reflects the interest level of the earphone wearing object in the interest content, and the correspondence is continuously updated as the historical speech content increases.
Based on this, the process of determining in A2 or B3 whether the ambient sound data matches the feature information may include:
and C1, determining whether the corresponding identification text of the environmental sound data contains the target interesting content.
Specifically, the recognition text corresponding to the environmental sound data is searched for the target interest content determined above. As long as the identification text contains any one or more items of the target interest type set or the target interest item set, it is determined that the identification text contains target interest content. For example, if searching finds that the identification text corresponding to the environmental sound data contains "Avengers" from the target interest item set, or that it contains "Hollywood movie" from the target interest type set, it may be determined that the identification text contains target interest content.
It can be understood that, if the identification text corresponding to the environmental sound data only contains the target interest type, the interest content containing the target interest type may be used as the target interest content contained in the identification text; if the identification text corresponding to the environmental sound data only contains the target interest item, the interest content containing the target interest item can be used as the target interest content contained in the identification text; if the identification text corresponding to the environmental sound data contains the target interest type and the target interest item at the same time, the interest content containing the target interest type and the target interest item can be used as the target interest content contained in the identification text.
C2, if it is determined that target interest content is contained, querying the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree.
It is understood that the determined target interest content may include multiple pieces. For example, if it is determined that the identification text corresponding to the environmental sound data contains "Avengers" from the target interest item set, it can be determined from Table 1 that there are two pieces of target interest content containing the target interest item "Avengers", whose target interest types are "Hollywood movie" and "romance", respectively.
Further, the target interest degree corresponding to each piece of target interest content may be obtained by querying, and when there are multiple pieces of target interest content, the overall target interest degree may be obtained from the multiple target interest degrees, for example by adding them or by a weighted sum. For example, if the identification text corresponding to the current environmental sound data contains "Avengers", the target interest content containing "Avengers" is queried; one piece is found to be "Avengers" under the interest type "romance" and the other "Avengers" under the interest type "Hollywood movie", and the two corresponding interest degrees may then be added to obtain the target interest degree.
And C3, determining whether the environmental sound data matches the feature information by comparing the target interest degree with a preset interest degree threshold value.
Specifically, an interest degree threshold may be preset according to the usage habits of the headset wearing object. When the target interest degree corresponding to the target interest content is greater than or equal to the preset threshold, it is determined that the environmental sound data matches the feature information; otherwise, it is determined that they do not match. Since the correspondence between interest content and interest degree is determined from the historical speech content of the earphone wearing object, it can then be further determined that the environmental sound data contains content of interest to the earphone wearing object.
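Steps C1 to C3 can be sketched as one function. Summing the interest degrees of all hit target interest contents follows the "adding" option mentioned above; the table layout and all names are illustrative:

```python
def match_by_interest(recognition_text, interest_table, threshold):
    """C1: find target interest contents whose interest type or interest
    item appears in the recognition text of the ambient sound data.
    C2: look up and combine (here: add) their interest degrees.
    C3: compare the resulting target interest degree with the preset
    interest degree threshold."""
    target_degree = 0.0
    for (interest_type, interest_item), degree in interest_table.items():
        if interest_type in recognition_text or interest_item in recognition_text:
            target_degree += degree
    return target_degree >= threshold
```

A weighted sum, or taking the maximum hit, would be equally valid combination rules; the threshold comparison in C3 is unchanged either way.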
The above describes three ways of determining whether the ambient sound data matches the feature information. Since the feature information is determined from the preset personal information or the historical speech content of the earphone wearing object, it is related to the content of interest of the earphone wearing object; therefore, when the ambient sound data is determined to match the feature information by any of the above methods, it can be determined that the ambient sound data contains content of interest to the earphone wearing object.
It should be noted that, the matching process of the three kinds of feature information may have an online implementation process or an offline implementation process.
In the online implementation, the background server calls an open application program interface (API) to acquire the preset personal information or the historical speech content, and obtains and stores the feature information. When matching is needed, the earphone sends the identification text corresponding to the environmental sound data to the background server, which performs the process of determining whether the environmental sound data matches the feature information and returns the matching result to the earphone.
In the offline implementation, the earphone downloads the feature information stored on the background server to the local device in advance, and the process of determining whether the environmental sound data matches the feature information is performed locally based on that feature information.
An embodiment of the present application is described below for the process in step S120 of playing the content of interest to the earphone wearing object, which may include:
and D1, identifying a target voiceprint corresponding to the content which is interested by the earphone wearing object in the environment sound data.
Specifically, there may be one or more target voiceprints corresponding to the content of interest to the earphone wearing object, and they are identified and marked in sequence. For example, in a multi-person conversation scene where A and B speak to the earphone wearing object at the same time and the content of interest in the environmental sound data comes from both A and B, the voiceprints of A and B are identified by voiceprint recognition technology and determined in sequence as target voiceprint 1 and target voiceprint 2.
Specifically, when the information of interest is an interest voiceprint, if it is determined that sound data corresponding to the interest voiceprint exists in the environment sound data, the interest voiceprint is directly used as a target voiceprint corresponding to the content of interest of the earphone wearing object.
And D2, entering a dialogue mode of the object corresponding to the target voiceprint.
Specifically, the sound data corresponding to the target voiceprint includes content that is of interest to the earphone wearing object, so the dialog mode is to play the sound data corresponding to each target voiceprint in the ambient sound data.
On this basis, after entering the dialogue mode of the object corresponding to the target voiceprint, the earphone wearing object can hear the sound of that object. It can be understood that, as time passes, the environmental sound data may no longer contain sound data corresponding to the target voiceprint; for example, in a dialogue scene, after the earphone wearing object finishes talking with object A and object A leaves, the environmental sound data no longer contains sound data corresponding to object A. Therefore, the process of playing the content of interest to the earphone wearing object in step S120 may further include:
First, the start time of beginning to acquire the sound data corresponding to each target voiceprint is recorded. Further, for each target voiceprint, it is detected whether sound data corresponding to that voiceprint is acquired within a set time length after the start time; if not, the dialogue mode of the object corresponding to that target voiceprint is exited, without affecting the dialogue modes of the other target voiceprints.
For example, if the content of interest to the earphone wearing object in the environmental sound data comes from object A, the voiceprint of object A is identified as the target voiceprint by voiceprint recognition technology. If the start time of acquiring sound data corresponding to the target voiceprint is 10:00 and the set time length is 5 minutes, then if no sound data corresponding to the target voiceprint is acquired within 5 minutes from 10:00, the dialogue mode of object A corresponding to the target voiceprint is exited.
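The per-voiceprint timeout above can be sketched as a small session object. The 300-second window follows the 5-minute example, and the class and method names are illustrative; each session is independent, so exiting one dialogue mode does not affect the others:

```python
class VoiceprintSession:
    """Dialogue mode state for one target voiceprint."""

    def __init__(self, start_time, timeout_seconds=300):
        self.last_heard = start_time      # start time of acquiring sound data
        self.timeout = timeout_seconds    # set time length (5 minutes here)
        self.active = True

    def on_sound(self, t):
        """Sound data for this voiceprint was acquired at time t."""
        self.last_heard = t

    def check(self, now):
        """Exit the dialogue mode if no sound data was acquired within
        the set time length; return whether the mode is still active."""
        if now - self.last_heard > self.timeout:
            self.active = False
        return self.active
```

A separate `VoiceprintSession` would be kept for each marked target voiceprint (target voiceprint 1, target voiceprint 2, and so on).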
The following describes the noise reduction device for a headphone provided in an embodiment of the present application, and the noise reduction device for a headphone described below and the noise reduction method for a headphone described above may be referred to in correspondence.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a noise reduction device for a headset disclosed in an embodiment of the present application.
As shown in fig. 2, the apparatus may include:
an ambient sound data acquisition unit 11 for acquiring ambient sound data;
an interested content judging unit 12 for determining whether the ambient sound data contains content of interest to the earphone wearing object;
a noise canceling unit 13 configured to, when it is determined that the ambient sound data includes content of interest to the headphone wearing object, cancel the content of the ambient sound data other than the content of interest to the headphone wearing object;
an interesting content playing unit 14, configured to play the content of interest to the earphone wearing object when it is determined that the ambient sound data includes that content.
Optionally, the embodiment of the present application introduces a first one of a plurality of structures that the interested content determining unit 12 may include:
the content of interest determination unit may include:
and the first interesting content judging subunit is used for determining whether the environmental sound data contains sound data corresponding to the interesting information or not according to the preset interesting information corresponding to the earphone wearing object.
Optionally, when the information of interest is an interest word, the first interested content determining subunit may include:
an identification text acquisition unit for converting the environmental sound data into an identification text;
and the interest word searching unit is used for searching whether a preset interest word corresponding to the earphone wearing object exists in the identification text, if so, determining that the environmental sound data contains sound data corresponding to the interest information, and if not, determining that the environmental sound data does not contain the sound data corresponding to the interest information.
Optionally, when the information of interest is an interest voiceprint, the first content of interest determining subunit may include:
an interest voiceprint determining unit, configured to determine whether sound data corresponding to the interest voiceprint exists in the environmental sound data; and if the environmental sound data exists, determining that the environmental sound data contains the sound data corresponding to the interesting information, and if the environmental sound data does not exist, determining that the environmental sound data does not contain the sound data corresponding to the interesting information.
Optionally, the present embodiment introduces a second structure of multiple structures that the interested content determining unit may include:
the content of interest determination unit may include:
the second interested content judging subunit is used for acquiring the characteristic information of the earphone wearing object, wherein the characteristic information is related to the interested content of the earphone wearing object;
and the third interested content judging subunit, configured to determine whether the environmental sound data matches the feature information; if so, determine that the environmental sound data contains content of interest to the earphone wearing object, and if not, determine that it does not.
Optionally, two different structures of the interested content determining unit are introduced above, in this embodiment, the two structures may be combined to provide a third optional structure of the interested content determining unit, as follows:
the interested content judging unit may include:
the first interesting content judging subunit is configured to determine, according to preset interesting information corresponding to the earphone wearing object, whether the environmental sound data includes sound data corresponding to the interesting information, and if so, determine that the environmental sound data includes interesting content of the earphone wearing object;
a second interested content judging subunit, configured to, when the first interested content judging subunit determines that the environmental sound data does not include sound data corresponding to the interested information, acquire feature information of the earphone wearing object, where the feature information is related to the interested content of the earphone wearing object;
and the third interested content judging subunit is used for determining whether the environmental sound data is matched with the feature information, if so, determining that the environmental sound data contains the interested content of the earphone wearing object, and if not, determining that the environmental sound data does not contain the interested content of the earphone wearing object.
Optionally, in the embodiment of the present application, a plurality of structures that the third interested content determining subunit may include are introduced:
a first kind,
The second content of interest determination subunit may include:
a first characteristic information acquisition unit configured to acquire personal information of the headphone wearing object set in advance as characteristic information.
Based on this, the third content of interest determination subunit may include:
a tag acquisition unit configured to acquire a tag corresponding to the personal information;
and the tag searching unit is used for determining whether the identification text corresponding to the environmental sound data contains the tag or not, if so, determining matching, and if not, determining mismatching.
Second structure:
The second content of interest determination subunit may include:
and the second characteristic information acquisition unit is used for acquiring key time points and event information contained in the historical speech content of the earphone wearing object as characteristic information.
Based on this, the third content of interest determination subunit may include:
and the identification text reference unit is used for determining whether the identification text corresponding to the environmental sound data contains the key time point and/or the event information, if so, determining matching, and if not, determining mismatching.
Third structure:
The second content of interest determination subunit may include:
and the third characteristic information acquisition unit is used for acquiring the correspondence between interest content and interest degree determined according to the historical speech content of the earphone wearing object as characteristic information.
Based on this, the third content of interest determination subunit may include:
an interest content judging unit, configured to determine whether the recognition text corresponding to the environmental sound data contains target interest content;
a target interest degree query unit, configured to, if the judgment result of the interest content judging unit is yes, query the target interest degree corresponding to the target interest content according to the correspondence between interest content and interest degree;
and an interest degree comparison unit, configured to determine whether the environmental sound data matches the characteristic information according to the magnitude relation between the target interest degree and a preset interest degree threshold.
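The three units of this third implementation can be sketched together as follows; the mapping, threshold value, and function name are hypothetical assumptions for illustration, not values from the disclosure:

```python
def matches_by_interest_degree(recognition_text: str,
                               interest_degree_map: dict,
                               threshold: float = 0.5) -> bool:
    """(1) Find target interest content in the recognition text,
    (2) query its interest degree from the correspondence map,
    (3) compare the degree against the preset threshold."""
    for content, degree in interest_degree_map.items():
        if content in recognition_text:      # target interest content found
            return degree >= threshold       # match only at or above threshold
    return False                             # no target interest content
```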
Optionally, in an embodiment of the present application, the content of interest playing unit may include:
the target voiceprint identification unit is used for identifying a target voiceprint corresponding to the content which is interested by the earphone wearing object in the environment sound data;
a dialog mode entering unit, configured to enter a dialog mode for the object corresponding to the target voiceprint, where the dialog mode includes: playing the sound data corresponding to the target voiceprint in the environmental sound data.
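Assuming ambient sound has already been segmented and each segment labeled with the voiceprint that produced it (an assumption of this sketch, not a detail of the disclosure), the dialog mode's playback filter could look like:

```python
def dialog_mode_playback(ambient_segments: list, target_voiceprint: str) -> list:
    """Dialog mode sketch: from (voiceprint_id, audio_segment) pairs,
    keep only the segments whose voiceprint matches the target,
    i.e., the speech of the person the wearer is talking with."""
    return [audio for voiceprint, audio in ambient_segments
            if voiceprint == target_voiceprint]
```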
Optionally, the content of interest playing unit may further include:
the starting time recording unit is used for recording the starting time for starting to acquire the sound data corresponding to the target voiceprint;
a set time detection unit, configured to detect whether sound data corresponding to the target voiceprint is obtained within a set time length after the start time;
and a dialog mode exit unit, configured to exit the dialog mode for the object corresponding to the target voiceprint when the detection result of the set time detection unit is negative.
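The timeout logic of the set time detection unit and the dialog mode exit unit can be sketched as below; the parameter names and the use of plain timestamps are illustrative assumptions:

```python
def should_exit_dialog_mode(start_time: float,
                            last_target_audio_time,
                            set_duration: float,
                            now: float) -> bool:
    """Exit the dialog mode if no sound data for the target voiceprint
    has been acquired within the set time length after the start time.
    last_target_audio_time is None when no target audio arrived."""
    if last_target_audio_time is None:
        return (now - start_time) > set_duration
    return False  # target audio was received, so stay in dialog mode
```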
The earphone noise reduction device provided by the embodiment of the application can be applied to earphone noise reduction equipment, such as a PC terminal, a cloud platform, a server cluster and the like. Alternatively, fig. 3 shows a block diagram of a hardware structure of the headphone noise reduction device; referring to fig. 3, the hardware structure may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the application, the number of each of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 communicate with one another through the communication bus 4;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), one or more integrated circuits configured to implement the embodiments of the present invention, or the like;
the memory 3 may include a high-speed RAM memory, and may further include a non-volatile memory, such as at least one disk memory;
wherein the memory stores a program, and the processor may call the program stored in the memory, the program being configured to:
acquiring environmental sound data;
determining whether the ambient sound data contains content of interest to the headset wearing object;
and if so, eliminating, from the environmental sound data, the content other than the content of interest to the earphone wearing object, and playing the content of interest to the earphone wearing object.
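The three program steps above can be sketched end to end; this toy version uses text in place of real audio and hypothetical names, purely to illustrate the control flow:

```python
def run_noise_reduction(ambient_text: str, interest_keywords: list):
    """(1) acquire environmental sound data (here, a transcript),
    (2) determine whether it contains content of interest,
    (3) if so, eliminate everything else and return only the
    content of interest for playback; otherwise suppress all sound."""
    sentences = [s.strip() for s in ambient_text.split(".") if s.strip()]
    of_interest = [s for s in sentences
                   if any(k in s for k in interest_keywords)]
    if not of_interest:
        return None  # no content of interest: suppress all ambient sound
    return ". ".join(of_interest)
```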
Alternatively, the detailed function and the extended function of the program may be as described above.
Embodiments of the present application further provide a readable storage medium storing a program executable by a processor, the program being configured to:
acquiring environmental sound data;
determining whether the ambient sound data contains content of interest to the headset wearing object;
and if so, eliminating, from the environmental sound data, the content other than the content of interest to the earphone wearing object, and playing the content of interest to the earphone wearing object.
Alternatively, the detailed function and the extended function of the program may be as described above.
Finally, it should also be noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.