CN112883350A - Data processing method and device, electronic equipment and storage medium - Google Patents

Data processing method and device, electronic equipment and storage medium

Info

Publication number
CN112883350A
Authority
CN
China
Prior art keywords
identity
target
user
template
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911206373.3A
Other languages
Chinese (zh)
Other versions
CN112883350B (en)
Inventor
杨广煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201911206373.3A
Publication of CN112883350A
Application granted
Publication of CN112883350B
Status: Active
Anticipated expiration

Abstract

Translated from Chinese

The embodiments of the present application disclose a data processing method and apparatus, an electronic device, and a storage medium. The method includes: when a primary identity identifier is in a valid state, acquiring target biological information; identifying a business intention and a target user identity corresponding to the target biological information; acquiring a target secondary identity identifier corresponding to the target user identity, where the target secondary identity identifier is a sub-identifier of the primary identity identifier; and executing a service instruction corresponding to the business intention based on the target secondary identity identifier. With the present application, the service behavior performed by the terminal device can be matched with the user identity.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, and a related device.
Background
With the rapid development of internet technology, the number of internet users continues to increase, and among all application software, video software is one of the most frequently used. Data shows that time spent in video software accounts for up to 34.5% of total mobile device usage time.
In a family scenario, a terminal device (e.g., a smart television) is often shared by multiple family members. When family member A logs in to a video application on the terminal device with account a and watches video 1, the terminal device records member A's viewing progress on video 1. If another family member B then uses the same video application to watch video 1 without switching the login account, the terminal device automatically jumps to member A's viewing progress on video 1, so the service behavior executed by the terminal device does not match the user identity.
Disclosure of Invention
The embodiments of the present application provide a data processing method, a data processing apparatus, and related devices, so that service behaviors executed by a terminal device can be matched with user identities.
An embodiment of the present application provides a data processing method, including:
when the primary identity identifier is in a valid state, acquiring target biological information;
identifying a business intention and a target user identity corresponding to the target biological information;
acquiring a target secondary identity corresponding to the target user identity; the target secondary identity is a sub-identity of the primary identity;
and executing a service instruction corresponding to the service intention based on the target secondary identity.
Wherein the target bio-information includes target voice data;
the identifying the business intention and the target user identity corresponding to the target biological information comprises:
converting the target voice data into text data, and performing semantic recognition on the text data to obtain the service intention;
calling an identity recognition model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated according to the at least one template user identity and template voice data respectively corresponding to the at least one template user identity;
if at least one matching result meets the matching condition, taking the template user identity corresponding to the matching result meeting the matching condition as the target user identity;
the obtaining of the target secondary identity corresponding to the target user identity includes:
extracting a target identity corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity; and the secondary identification in the secondary identification set is a sub-identification of the primary identification.
The method further includes:
if the matching result meeting the matching condition does not exist in the at least one matching result, establishing the target user identity;
identifying age information corresponding to the target voice data, and searching an image material library for an identity avatar matching the age information;
the obtaining of the target secondary identity corresponding to the target user identity includes:
creating the target secondary identity for the target user identity;
setting the target secondary identity as a sub-identity of the primary identity;
and storing the target user identity, the target secondary identity, and the identity avatar in association.
Wherein the identity recognition model comprises a feature generator and a pattern matcher;
the calling of the identity recognition model corresponding to the primary identity to determine the matching result between the target voice data and at least one template user identity comprises:
extracting a target voiceprint feature of the target voice data based on the feature generator;
determining the matching probability between the target voiceprint feature and at least one template voiceprint feature based on the pattern matcher, and taking the obtained matching probabilities as matching results; the at least one template voiceprint feature is a voiceprint feature corresponding to the at least one template voice data respectively.
Wherein the extracting the target voiceprint feature of the target voice data based on the feature generator comprises:
extracting a spectrum parameter and a linear prediction parameter of the target voice data based on the feature generator; the frequency spectrum parameter is a short-time spectrum characteristic parameter of the target voice data; the linear prediction parameters are frequency spectrum fitting characteristic parameters of the target voice data;
and obtaining the target voiceprint characteristics according to the frequency spectrum parameters and the linear prediction parameters.
The method further includes:
acquiring template voice data corresponding to the identity of a template user;
generating an identity tag vector corresponding to the template voice data;
acquiring an initial classification model, predicting the matching degree between the template voice data and the at least one template user identity based on the initial classification model, and obtaining an identity prediction vector according to the acquired matching degree;
and determining a classification error according to the identity label vector and the identity prediction vector, and training the initial classification model according to the classification error to obtain the identity recognition model.
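As an illustration of this training step, the sketch below (not from the patent) assumes a one-hot identity label vector, an identity prediction vector of matching degrees, and a cross-entropy classification error; the patent does not fix the concrete model or loss.

```python
import numpy as np

# Identity label vector: one-hot over template user identities (here,
# template user identity 2 is the correct label).
label = np.array([0.0, 1.0, 0.0])
# Identity prediction vector: matching degrees from the initial model.
prediction = np.array([0.2, 0.7, 0.1])

# Classification error as cross-entropy between label and prediction;
# this error would be backpropagated to train the initial model.
classification_error = -np.sum(label * np.log(prediction + 1e-12))
print(round(float(classification_error), 3))  # 0.357
```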
The method further includes:
when the matching result meeting the matching condition exists in the at least one matching result, sending an animation playing instruction to a client to instruct the client to play a target animation;
and when the execution of the service instruction is finished, sending an animation playing stopping instruction to the client, and indicating the client to close the target animation.
The service intention comprises a client secondary login object switching intention;
the executing the business instruction corresponding to the business intention based on the target secondary identity comprises the following steps:
generating a switching instruction corresponding to the switching intention of the secondary login object of the client; the switching instruction belongs to the service instruction;
and according to the switching instruction, taking the target secondary identity as a secondary login object of the client.
The method further includes:
acquiring behavior data of a user in the client corresponding to the target secondary identity; the behavior data is used for generating recommended service data for the user;
and performing associated storage on the behavior data and the target secondary identity.
Wherein the business intent comprises a business data query intent;
the executing the business instruction corresponding to the business intention based on the target secondary identity comprises the following steps:
generating a query instruction corresponding to the business data query intention; the query instruction belongs to the service instruction;
and inquiring target service data corresponding to the target secondary identity, and returning the target service data to the client.
And the user authority of the target secondary identity is the same as the user authority of the primary identity.
Another aspect of the embodiments of the present application provides a data processing apparatus, including:
the first acquisition module is used for acquiring the target biological information when the primary identity is in an effective state;
the identification module is used for identifying the business intention and the target user identity corresponding to the target biological information;
the second acquisition module is used for acquiring a target secondary identity corresponding to the target user identity; the target secondary identity is a sub-identity of the primary identity;
and the determining module is used for executing the service instruction corresponding to the service intention based on the target secondary identity.
Wherein the target bio-information includes target voice data;
the identification module comprises:
the conversion unit is used for converting the target voice data into text data, and semantically identifying the text data to obtain the service intention;
the calling unit is used for calling an identity recognition model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated according to the at least one template user identity and template voice data respectively corresponding to the at least one template user identity;
the first determining unit is used for taking the template user identity corresponding to the matching result meeting the matching condition as the target user identity if the matching result meeting the matching condition exists in at least one matching result;
the second obtaining module includes:
a first extraction unit, configured to extract a target identity corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity; and the secondary identification in the secondary identification set is a sub-identification of the primary identification.
The apparatus further includes:
a second determining unit, configured to create the target user identity if there is no matching result that meets the matching condition in the at least one matching result, identify age information corresponding to the target voice data, and search an image material library for an identity avatar matching the age information;
the second obtaining module includes:
and the second extraction unit is used for creating the target secondary identity for the target user identity, setting the target secondary identity as a sub-identity of the primary identity, and storing the target user identity, the target secondary identity, and the identity avatar in association.
Wherein the identity recognition model comprises a feature generator and a pattern matcher;
the calling unit comprises:
an extraction subunit configured to extract a target voiceprint feature of the target speech data based on the feature generator;
a matching subunit, configured to determine, based on the pattern matcher, matching probabilities between the target voiceprint feature and at least one template voiceprint feature, and take all the obtained matching probabilities as matching results; the at least one template voiceprint feature is a voiceprint feature corresponding to the at least one template voice data respectively.
The extracting subunit is specifically configured to extract, based on the feature generator, a spectrum parameter and a linear prediction parameter of the target speech data, and obtain the target voiceprint feature according to the spectrum parameter and the linear prediction parameter; the frequency spectrum parameter is a short-time spectrum characteristic parameter of the target voice data; the linear prediction parameters are spectrum fitting characteristic parameters of the target speech data.
The apparatus further includes:
the training module is used for obtaining template voice data corresponding to the identity of a template user, generating an identity tag vector corresponding to the template voice data, obtaining an initial classification model, predicting the matching degree between the template voice data and the identity of the at least one template user based on the initial classification model, obtaining an identity prediction vector according to the obtained matching degree, determining a classification error according to the identity tag vector and the identity prediction vector, and training the initial classification model according to the classification error to obtain the identity recognition model.
The apparatus further includes:
the playing module is used for sending an animation playing instruction to a client to indicate the client to play the target animation when the matching result meeting the matching condition exists in the at least one matching result;
and the playing module is further used for sending an animation playing stopping instruction to the client to indicate the client to close the target animation when the execution of the service instruction is completed.
The service intention comprises a client secondary login object switching intention;
the determining module includes:
the first generation unit is used for generating a switching instruction corresponding to the switching intention of the secondary login object of the client, and taking the target secondary identity as the secondary login object of the client according to the switching instruction; the switching instruction belongs to the service instruction.
The apparatus further includes:
the storage module is used for acquiring behavior data of a user in the client corresponding to the target secondary identity, and storing the behavior data and the target secondary identity in an associated manner; the behavior data is used for generating recommended service data for the user.
Wherein the business intent comprises a business data query intent;
the determining module includes:
the second generation unit is used for generating a query instruction corresponding to the service data query intention, querying target service data corresponding to the target secondary identity and returning the target service data to the client; the query instruction belongs to the service instruction.
And the user authority of the target secondary identity is the same as the user authority of the primary identity.
Another aspect of the embodiments of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the computer program, when executed by the processor, causes the processor to execute the method according to one aspect of the embodiments of the present application.
Another aspect of the embodiments of the present application provides a computer storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, perform a method as in one aspect of the embodiments of the present application.
According to the method and device of the present application, by identifying the user identity and the business intention behind the current biological information, the target secondary identity corresponding to that user identity can be determined, so that a service instruction executed based on the target secondary identity both satisfies the user's current business intention and matches the user identity. Furthermore, only the user's target biological information needs to be acquired to determine the business intention and the user identity at the same time; the user does not need to perform two separate operations to specify them. This reduces the user's operation cost and improves the efficiency with which the terminal executes a service instruction that matches both the user's business intention and the user's identity.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a block diagram of a system architecture for data processing according to an embodiment of the present disclosure;
FIGS. 2a-2d are schematic diagrams of a data processing scenario provided by an embodiment of the present application;
fig. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic flow chart diagram of another data processing method provided in the embodiments of the present application;
fig. 5 is a timing diagram of a data processing method according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of determining a target user identity and a target secondary identity according to an embodiment of the present disclosure;
FIG. 7 is a schematic flow chart diagram of another data processing method provided in the embodiments of the present application;
FIG. 8 is a timing diagram of another data processing method provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive subject covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
The solution provided in the embodiments of the present application relates to Speech Technology, Natural Language Processing (NLP), and Machine Learning (ML), all belonging to the field of artificial intelligence.
The key technologies of Speech Technology are automatic speech recognition (ASR), text-to-speech synthesis (TTS), and voiceprint recognition. Enabling computers to listen, see, speak, and feel is the development direction of future human-computer interaction.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science, and mathematics; research in this field involves natural language, i.e., the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
In the present application, speech technology is involved in converting a user's speech into text, and natural language processing is involved in semantically recognizing the text to determine the user's intent.
Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its performance. Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and is applied across all fields of artificial intelligence. In this application, machine learning is used to identify the user identity of the current user; the specific technical means involve artificial neural networks, logistic regression, and other machine learning techniques.
Fig. 1 is a block diagram of a system architecture for data processing according to an embodiment of the present disclosure. The application relates to a server 10d and a terminal device cluster, and the terminal device cluster may include: terminal device 10a, terminal device 10b, ..., terminal device 10c.
Taking the terminal device 10a as an example, when the primary identity is in the valid state, the terminal device 10a collects the biological information of the user, and sends the collected biological information to the server 10d. The server 10d performs semantic recognition on the biological information to determine the intention of the biological information; the server 10d determines the user identity of the biological information, and extracts the secondary identity of that user identity, which is a sub-identity of the primary identity. The server 10d executes the instruction associated with the intention based on the determined secondary identity. Subsequently, the server 10d may return the execution result of the instruction to the terminal device 10a.
Identifying the intent of the biometric information, determining the user identity of the biometric information, and executing instructions related to the intent may also be accomplished by the terminal device 10a.
The terminal device 10a, the terminal device 10b, ..., the terminal device 10c, etc. shown in fig. 1 may include a smart television, a mobile phone, a tablet computer, a notebook computer, a palm computer, a Mobile Internet Device (MID), a wearable device (e.g., a smart watch, a smart band, etc.), etc. The server 10d shown in fig. 1 may refer to a single server device, or to a server cluster including a plurality of server devices.
Figs. 2a to 2d below specifically describe how the terminal device 10a recognizes the intention of the biometric information, determines the user identity of the biometric information, and executes the instruction related to the intention; these processes may be embodied in a video client in the terminal device 10a:
Please refer to figs. 2a-2d, which are schematic diagrams illustrating a data processing scenario according to an embodiment of the present application. When the current user starts the video client in the terminal device 10a, and the video client detects that the primary account "01" has logged in but no secondary account under the primary account "01" has logged in, the video client can display prompt information on the screen prompting the current user to log in to a secondary account, either by voice input or by clicking an available secondary account; the primary account "01" is the primary account of user 1.
The current user can input by voice: "log in to the secondary account". The video client acquires the voice data 20b of this utterance, converts the voice data 20b into text data, semantically recognizes the text data, and determines that the intention corresponding to the voice data 20b is: "secondary account login".
The video client inputs the voice data 20b into a trained prediction model 20d corresponding to the primary account "01". The prediction model 20d can extract the voiceprint features of the voice data 20b and match them against a plurality of template voiceprint features; if a template voiceprint feature matching the voiceprint features of the voice data 20b exists among the plurality of template voiceprint features, the user identity corresponding to the matched template voiceprint feature is extracted (assume the extracted user identity is user 2).
Each template voiceprint feature in the prediction model 20d corresponds to 1 user identity. Assume that the prediction model 20d is trained from 2 template voiceprint features, and the user identities corresponding to the 2 template voiceprint features are: user 1 and user 2; the secondary account of user 1 and the secondary account of user 2 are sub-accounts of the current primary account "01".
As shown in fig. 2b, the secondary account corresponding to user 2 is looked up in the user information record table 20e corresponding to the primary account "01" and found to be: 002.
As can be seen from fig. 2b, the user information record table 20e includes 3 user records, which correspond to a primary account "01" and 2 secondary accounts (secondary account "001" and secondary account "002") subordinate to the primary account "01"; the primary account "01" and the secondary account "001" are both accounts of user 1, and the secondary account "002" is the account of user 2; each user's history is stored in association with the secondary account.
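The record table can be pictured as a small keyed structure. The following is a minimal sketch, assuming a simple in-memory layout; the field names and lookup helper are illustrative, not from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional

# One row of the user information record table 20e (illustrative fields).
@dataclass
class UserRecord:
    user_identity: str             # e.g. "user 1"
    account_id: str                # e.g. "01", "001", "002"
    parent_account: Optional[str]  # None for the primary account
    history: list = field(default_factory=list)

record_table = [
    UserRecord("user 1", "01",  None),   # primary account
    UserRecord("user 1", "001", "01"),   # secondary account of user 1
    UserRecord("user 2", "002", "01"),   # secondary account of user 2
]

def find_secondary_account(user_identity: str) -> Optional[str]:
    """Look up the secondary account of a recognized user identity."""
    for rec in record_table:
        if rec.user_identity == user_identity and rec.parent_account is not None:
            return rec.account_id
    return None

print(find_secondary_account("user 2"))  # -> "002"
```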
The video client extracts the secondary account of user 2: 002. Since the intention of the voice data 20b determined above is "secondary account login", the secondary account "002" can be taken as the secondary login object of the video client.
As shown in page 20f in fig. 2b, the video client may display an animation on the screen during the process of determining the intention of the voice data 20b and determining the secondary account "002"; when the video client has taken the secondary account "002" as the secondary login object, the animation stops playing and the video client jumps to the home page.
As shown in page 20g, the video client is currently logged in to secondary account "002" and primary account "01".
Alternatively, the foregoing assumes that there is a template voiceprint feature matching the voiceprint feature of the voice data 20b among the plurality of template voiceprint features, thereby determining that the user identity corresponding to the voice data 20b is user 2.
As shown in fig. 2c, assume instead that no template voiceprint feature among the plurality of template voiceprint features matches the voiceprint feature of the voice data 20b; that is, the current user who produced the voice data 20b has no corresponding user identity and no corresponding user record in the user information record table 20c. Because there is no corresponding user record, the video client may create 1 new user record for the current user, which includes: user identity "user 2", secondary account "002", avatar, level "level 2", and history (of course, the history at this time is empty).
The video client adds this user record to the user information record table 20c, and a new user information record table 20h is obtained after the addition.
The video client extracts the newly created secondary account of user 2: 002. Since the intention of the voice data 20b determined above is "secondary account login", the secondary account "002" can be taken as the secondary login object of the video client.
As shown in page 20i in fig. 2d, after the video client creates the secondary account "002" and logs in to it, a prompt message may be displayed on the screen, such as: "Your secondary account was not detected; a new secondary account has been created and logged in for you", informing the current user that a new secondary account has been created.
As shown in page 20j, the video client is currently logged in to the newly created secondary account "002" and the primary account "01".
The specific processes of acquiring the target biological information (the voice data 20b of "log in to the secondary account" in the above embodiment), identifying the business intention (the intention "secondary account login" in the above embodiment) and the target user identity (user 2 in the above embodiment) can be found in the following embodiments corresponding to figs. 3 to 8.
Referring to fig. 3, which is a schematic flow chart of a data processing method according to an embodiment of the present application, as shown in fig. 3, the data processing method may include the following steps:
and S101, when the primary identity mark is in an effective state, acquiring target biological information.
Specifically, the server (e.g., the server 10d in the embodiment corresponding to fig. 1) detects whether the primary identity is the primary login object of the current corresponding client (e.g., the video client in the embodiments corresponding to figs. 2a-2d); if so, the primary identity is in a valid state. The login objects of the client may include a primary login object and a secondary login object, where the primary login object corresponds to a primary identity, the secondary login object corresponds to a secondary identity, and the secondary identity is a sub-identity of the primary identity.
The client may specifically be a video client, an instant messaging client, or a mail client.
When the primary identity is in the valid state, the server may receive the biometric information sent by the client (referred to as target biometric information, such as the voice data 20b of the voice "log in to the secondary account" in the embodiments corresponding to figs. 2a-2d).
The target bio-information may include voice data (referred to as target voice data), and the target bio-information may also include voice data (referred to as target voice data) and image data (referred to as target image data), wherein the target image data may be image data of the face of the current user.
Step S102, identifying the service intention and the target user identity corresponding to the target biological information.
Specifically, when the target biological information includes the target voice data, the server may convert the target voice data into text data and semantically recognize the text data to determine the business intention of the target voice data.
The server may also determine the user identity of the current user (referred to as the target user identity, e.g., user 2 in the embodiments corresponding to figs. 2a-2d) through the identity recognition model corresponding to the primary identity (e.g., the prediction model 20d in the embodiments corresponding to figs. 2a-2d).
The sequence of the server determining the service intention and determining the identity of the target user is not limited.
The conversion of the target voice data into text data may employ an acoustic model (which may be a model established by a dynamic time warping method based on pattern matching, a model established by an artificial neural network recognition method, etc.) to determine the state of each audio frame of the target voice data, combine a plurality of states into phonemes, and then combine a plurality of phonemes into words.
A language model (which may be an N-Gram language model, a Markov N-Gram, an exponential model, or a decision tree model) is then used to combine the words into a correct, unambiguous, and logical sentence, yielding the text data.
To semantically identify the text data and determine its service intention, pattern matching is performed on the text data using an entity-predicate knowledge graph to determine the entity and the predicate in the text data. The server can then combine the identified entity and predicate into a business intention.
For example, the current user inputs by voice: "query the history play record". After the voice data is converted into text data, the knowledge graph can be used to determine that the entity is "history play record" and the predicate is "query", so the business intention is: history play record - query.
The identity recognition model is a classification model trained by at least one template user identity and voice data (referred to as template voice data) corresponding to the template user identity, each template user identity has a secondary identity corresponding to the template user identity (such as the secondary account "001" and the secondary account "002" in the corresponding embodiments of fig. 2 a-2 d, and the identity may be a user account), and the secondary identity of each template user identity is a sub-identity of the primary identity.
When the target biological information includes the target voice data and the target image data, the server may also determine the service intention of the target voice data in the above manner, and determine the user identity (referred to as a first user identity) of the target voice data according to the identity recognition model;
the server may also determine a user identity (referred to as a second user identity) of the target image data according to an image recognition model, where the image recognition model and the identity recognition model are similar and are classification models trained by at least one template user identity and image data (referred to as template image data) corresponding to the template user identity.
The server can determine the final target user identity according to the first user identity determined by the identity recognition model and the second user identity determined by the image recognition model, and the target user identity determined based on the two models has higher accuracy.
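The patent does not specify how the first and second user identities are combined; the sketch below assumes one simple possibility, a weighted average of the per-identity matching probabilities produced by the two models.

```python
# Hedged sketch: fuse the identity recognition model's output (voice) with
# the image recognition model's output (face) by weighted averaging.
# The fusion rule and weight are assumptions, not from the patent.
def fuse_identities(voice_probs: dict, image_probs: dict,
                    voice_weight: float = 0.5) -> str:
    """Both inputs map template user identities to matching probabilities."""
    fused = {
        identity: voice_weight * voice_probs.get(identity, 0.0)
                  + (1 - voice_weight) * image_probs.get(identity, 0.0)
        for identity in set(voice_probs) | set(image_probs)
    }
    return max(fused, key=fused.get)

print(fuse_identities({"user 1": 0.2, "user 2": 0.7},
                      {"user 1": 0.3, "user 2": 0.6}))  # -> "user 2"
```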
Alternatively, when the target bio-information includes the target voice data and the target image data, the server may also determine the business intention of the target voice data in the above-described manner, and determine the target user identity of the target image data only based on the image recognition model.
Step S103, acquiring a target secondary identity corresponding to the target user identity; the target secondary identity is a sub-identity of the primary identity.
Specifically, the server may obtain the target secondary identity of the target user identity (e.g., the secondary account "002" in the embodiments corresponding to figs. 2a-2d) from the secondary identity set corresponding to the at least one template user identity, or create a new target secondary identity; the target secondary identity is a sub-identity of the primary identity.
Step S104, executing a service instruction corresponding to the service intention based on the target secondary identity.
Specifically, when the service intention is a switching intention of a secondary login object of the client, the server generates a switching instruction corresponding to the switching intention of the secondary login object of the client, wherein the switching instruction is used for indicating the server to switch the current secondary login object of the client; the handover command belongs to a service command.
The server can take the target secondary identity as the current secondary login object of the client according to the switching instruction. Subsequently, the server may issue the switching notification message to the client, so that after receiving the switching notification message, the client may display a prompt message for prompting the user that the current secondary login object is the target secondary identity.
Subsequently, when the secondary login object of the client is the target secondary identity, the server may receive behavior data (the behavior data may include at least one of viewing behavior data, browsing behavior data, searching behavior data, and comment behavior data) reported by the client, where the behavior data is user behavior data of the user collected by the client when the secondary login object of the client is the target secondary identity.
The server can store the target secondary identity and the behavior data reported by the client in a correlation manner. Subsequently, the server can generate recommended service data for the user based on the behavior data so as to achieve the purpose of personalized recommendation.
When the business intention is a business data query intention, the server generates a query instruction corresponding to the business data query intention, wherein the query instruction is used for indicating the server to query the business data; the query command belongs to a service command. For example, query history viewing records, query viewing progress records, query search records, and the like.
The server may query, according to the query instruction, service data (referred to as target service data) related to the target secondary identity, and subsequently, the server may return the queried target service data to the client, so that the client may display the target service data after receiving the target service data.
Or, after the server generates the query instruction, the target secondary identity may be used as a secondary login object of the client, and meanwhile, the query operation related to the query instruction is executed.
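To make the two instruction paths concrete, the following sketch dispatches a switching instruction and a query instruction against illustrative in-memory stores; the function and store names are assumptions, not part of the patent.

```python
# Illustrative session and service-data stores.
client_session = {"primary_login": "01", "secondary_login": None}
service_data_store = {"002": {"watch_progress": {"video 1": "00:12:34"}}}

def execute_instruction(intention: str, target_secondary_id: str):
    if intention == "switch_secondary_login":
        # Switching instruction: set the target secondary identity as the
        # current secondary login object of the client.
        client_session["secondary_login"] = target_secondary_id
        return {"notify": f"secondary login is now {target_secondary_id}"}
    if intention == "query_service_data":
        # Query instruction: return service data bound to the secondary id.
        return service_data_store.get(target_secondary_id, {})
    raise ValueError(f"unknown intention: {intention}")

execute_instruction("switch_secondary_login", "002")
print(execute_instruction("query_service_data", "002"))
```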
It should be noted that the user right of the target secondary identity is the same as the user right of the primary identity, and further, the user right of all the sub-identities of the primary identity is the same as the user right of the primary identity.
For example, if the primary id has member VIP rights, all the sub-ids of the primary id (including the target secondary id and the secondary id of the template user id mentioned above) have member VIP rights.
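A minimal sketch of this permission rule, assuming rights are stored only on the primary identifier and resolved upward from any sub-identifier:

```python
# Rights live on the primary id; every sub-id resolves rights from it,
# so member VIP rights propagate to all secondary ids.
permissions = {"01": {"member_vip"}}
parent_of = {"001": "01", "002": "01"}

def rights_of(identity_id: str) -> set:
    primary = parent_of.get(identity_id, identity_id)
    return permissions.get(primary, set())

assert rights_of("002") == {"member_vip"}  # secondary inherits VIP
```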
The user identities at different levels correspond to different functional architectures, and the primary identity can be used for managing membership rights and counting statistical information (for example, total viewing time, etc.) of all secondary identities. The secondary identity is personalized information for managing the identity of each user.
It should be noted that steps S101 to S104 above are described with a server as the execution subject; the execution subject may also be a client installed in a terminal device (such as the terminal device 10a in the embodiments corresponding to figs. 2a-2d). The terminal device may be a smart television, and the client may be a video client installed in the smart television.
When the primary identity mark is in an effective state, the client acquires target biological information, and the client identifies the service intention of the target biological information and calls an identity identification model to determine the identity of a target user; the client obtains a target secondary identity of the target user identity, and executes a service instruction corresponding to the service intention based on the target secondary identity, for example, the target secondary identity is used as a service instruction of a secondary login object of the client, and the service instruction of target service data corresponding to the target secondary identity is inquired.
According to the method and device of the present application, by identifying the user identity and the business intention behind the current biological information, the target secondary identity corresponding to that user identity can be determined, so that a service instruction executed based on the target secondary identity both satisfies the user's current business intention and matches the user identity. Furthermore, only the user's target biological information needs to be acquired to determine the business intention and the user identity at the same time; the user does not need to perform two separate operations to specify them. This reduces the user's operation cost and improves the efficiency with which the terminal executes a service instruction that matches both the user's business intention and the user's identity.
Please refer to fig. 4, which is a schematic flow chart of another data processing method provided in the embodiment of the present application, where the data processing includes the following steps:
in step S201, the flow starts.
In step S202, the server acquires voice data.
Specifically, when the primary account (which may correspond to the primary identity in the present application) is logged in to the client, i.e., the primary account is in a valid state, the server receives the voice data sent by the client; the voice data is collected by the client when the user voice-inputs "log in to the sub-account" in the client.
In step S203, the server determines whether the voiceprint already exists.
Specifically, the server semantically recognizes the voice data and determines that the service intention is to log in to a sub-account.
The server judges whether template voiceprint characteristics matched with the voiceprint characteristics of the voice data exist or not by calling an identity recognition model of the primary account, and if the template voiceprint characteristics exist, the server executes the step S204 and the step S206; if not, step S205-step S206 are executed.
Step S204, the server logs in the secondary account on the client.
Specifically, the server sets the secondary account (which may correspond to the target secondary id in the present application) corresponding to the matched voiceprint feature as the secondary login account of the client, and at this time, the client logs in a primary account and a secondary account.
Step S205, the server creates a new secondary account (which may correspond to the target secondary identity in the present application); this secondary account is a sub-account of the primary account. The server stores the newly created secondary account in association with the voiceprint feature and uses it as the secondary login account of the client.
Step S206, the flow ends.
Please further refer to fig. 5, which is a timing diagram of a data processing method according to an embodiment of the present application, where the video background server, the voice recognition server, and the voiceprint recognition server described below all belong to the servers in the present application, and the data processing includes the following steps:
step S301, the primary account is in an active state, and the client collects voice data 'enter sub-account' input by the user.
Step S302, the client sends the voice data to a video background server.
Step S303, the video background server sends the voice data to the voice recognition server.
Step S304, the video background server sends the voice data to a voiceprint recognition server.
Step S305, the voice recognition server carries out semantic recognition on the voice data, determines that the service intention is to access a secondary account, and sends the determined service intention back to the video background server.
Step S306, the voiceprint recognition server performs voiceprint recognition on the voice data according to the identity recognition model corresponding to the primary account to obtain a voiceprint recognition result, and sends the voiceprint recognition result back to the video background server.
Step S307, the video background server generates a secondary account access instruction corresponding to the service intention.
Step S308, the video background server judges whether a corresponding secondary account exists according to the voiceprint recognition result, and if so, returns service data corresponding to the secondary account to the client according to the instruction of accessing the secondary account; if not, a new secondary account is created, and the secondary account is a sub-account of the primary account.
According to the method and device of the present application, by identifying the user identity and the business intention behind the current biological information, the target secondary identity corresponding to that user identity can be determined, so that a service instruction executed based on the target secondary identity both satisfies the user's current business intention and matches the user identity. Furthermore, only the user's target biological information needs to be acquired to determine the business intention and the user identity at the same time; the user does not need to perform two separate operations to specify them. This reduces the user's operation cost and improves the efficiency with which the terminal executes a service instruction that matches both the user's business intention and the user's identity.
Please refer to fig. 6, which is a schematic flowchart illustrating a process of determining a target user identity and a target secondary identity provided in an embodiment of the present application, where determining the target user identity and the target secondary identity includes the following steps S401 to S404, and steps S401 to S404 are specific embodiments of steps S102 to S103 in the embodiment corresponding to fig. 3:
step S401, converting the target voice data into text data, and performing semantic recognition on the text data to obtain the service intention.
Specifically, when the target biological information is the target voice data, the server divides the target voice data into a plurality of audio frames according to a preset frame length and a preset frame shift; adjacent audio frames partially overlap, and the overlap length equals the frame length minus the frame shift.
For example, target voice data with a time dimension of 0-30 ms, divided with a frame length of 20 ms and a frame shift of 10 ms, yields audio frame 1: the voice data between 0-20 ms, and audio frame 2: the voice data between 10-30 ms.
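The framing operation can be written directly; the sketch below reproduces the example above (20 ms frames, 10 ms shift) and assumes a 16 kHz sample rate, which the patent does not specify.

```python
import numpy as np

def split_frames(signal: np.ndarray, sr: int = 16000,
                 frame_ms: int = 20, shift_ms: int = 10) -> np.ndarray:
    frame_len = sr * frame_ms // 1000   # 320 samples at 16 kHz
    shift_len = sr * shift_ms // 1000   # 160 samples at 16 kHz
    n_frames = 1 + (len(signal) - frame_len) // shift_len
    # Adjacent frames overlap by frame_len - shift_len samples (10 ms here).
    return np.stack([signal[i * shift_len: i * shift_len + frame_len]
                     for i in range(n_frames)])

frames = split_frames(np.random.randn(480))  # 30 ms of audio at 16 kHz
print(frames.shape)  # (2, 320): audio frame 1 and audio frame 2
```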
Extracting a spectrum parameter of each audio frame, wherein the spectrum parameter is a short-time spectrum characteristic parameter of the audio frame, and the short-time spectrum characteristic parameter is a parameter extracted based on a physiological structure of a sounding organ such as a glottis, a vocal tract or a nasal cavity.
The short-time spectrum characteristic parameters may include: at least one of parameters such as a pitch spectrum and its contour, an energy of a pitch frame, a spectrum envelope, an appearance frequency of a pitch formant and its locus.
Linear prediction parameters of each audio frame are also extracted; the linear prediction parameters are spectrum fitting characteristic parameters of the audio frame. The spectrum fitting characteristic parameters are obtained, from an auditory perspective, by simulating how the human ear perceives sound frequency; they estimate speech characteristics with approximation parameters, where, mathematically, several "past" audio frames are used to approximate the current audio frame.
The spectrum fitting characteristic parameters may include at least one of: linear prediction cepstral coefficients (LPCC), line spectrum pairs (LSP), autocorrelation and log-area ratios, Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP), etc.
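As a rough illustration of the two parameter families, the sketch below uses the librosa library, with MFCCs standing in for the short-time spectral parameters and LPC coefficients for the linear prediction parameters; the patent names no concrete parameters or library, so all choices here (n_mfcc=13, order=12, 16 kHz) are assumptions.

```python
import librosa
import numpy as np

def frame_features(frame: np.ndarray, sr: int = 16000) -> np.ndarray:
    # Short-time spectral parameters: 13 MFCCs averaged over the frame.
    mfcc = librosa.feature.mfcc(y=frame, sr=sr, n_mfcc=13,
                                n_fft=256, hop_length=128).mean(axis=1)
    # Linear prediction parameters: 12th-order LPC coefficients.
    lpc = librosa.lpc(frame, order=12)
    # Combine both parameter sets into the frame's feature vector.
    return np.concatenate([mfcc, lpc])
```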
In the above manner, the spectrum parameters and linear prediction parameters extracted for each audio frame are combined into a vector, so that each audio frame can be expressed as a multi-dimensional vector (also referred to as a feature vector). The acoustic model is used to determine the state to which the feature vector corresponding to each audio frame belongs; generally, the states of adjacent audio frames should be the same, because the frame length of each audio frame is short, on the order of milliseconds (ms).
The states corresponding to several audio frames (generally 3) are combined into a phoneme. The phoneme is the smallest unit of speech, divided from the perspective of timbre; a single phoneme alone, or several phonemes combined, form a syllable.
Then, several phonemes are combined into words (or characters). Because of the time-varying nature of the speech signal, noise, and other unstable factors, each word is closely related to its context; to further improve the accuracy of speech-to-text conversion, adaptive adjustment is performed according to the contexts of all the words. The server can therefore adopt a language model to combine the recognized words into a logical and unambiguous sentence, obtaining the text data corresponding to the target voice data.
The server may obtain an entity-predicate knowledge graph, where the entity-predicate knowledge graph includes a plurality of entity strings and predicate strings, and each string is marked as having an entity attribute or a predicate attribute. The server may match the text data against the entity-predicate knowledge graph using a multi-pattern string matching algorithm (which may include an AC automaton, hash function matching, etc.), determining which strings in the text data match and whether each matched string has an entity attribute or a predicate attribute. The server may take a string with an entity attribute in the text data as an entity and a string with a predicate attribute as a predicate, and combine the entity and predicate identified from the text data into a business intention.
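A simplified stand-in for this matching step is sketched below; a production system would use an AC automaton, but a linear scan over a toy entity-predicate vocabulary (an assumption, not the patent's graph) shows the idea.

```python
# Toy entity-predicate knowledge graph: pattern -> attribute.
KNOWLEDGE_GRAPH = {
    "history play record": "entity",
    "secondary account": "entity",
    "query": "predicate",
    "log in": "predicate",
}

def match_graph(text: str) -> dict:
    """Find all graph strings occurring in the text, grouped by attribute."""
    hits = {"entity": [], "predicate": []}
    for pattern, attribute in KNOWLEDGE_GRAPH.items():
        if pattern in text:
            hits[attribute].append(pattern)
    return hits

print(match_graph("query the history play record"))
# -> {'entity': ['history play record'], 'predicate': ['query']}
```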
Step S402, calling an identity recognition model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated according to the at least one template user identity and template voice data corresponding to the at least one template user identity respectively.
Specifically, the server obtains an identity recognition model corresponding to the primary identity, where the identity recognition model is a classification model trained according to at least one template user identity and template voice data corresponding to each template user identity, and the template user identity may be understood as a user identity that has already been created by the server, and each template user identity has a secondary identity corresponding to the template user identity, and the secondary identity is a sub-identity of the primary identity.
The identity recognition model comprises a feature generator and a pattern matcher:
the feature generator is configured to divide the target speech data into a plurality of audio frames, extract the spectral parameters and linear prediction parameters of each audio frame (the process of extracting the spectral parameters and linear prediction parameters of each audio frame may refer to step S401 described above), combine the spectral parameters of all audio frames into the spectral parameters of the target speech data, and combine the linear prediction parameters of all audio frames into the linear prediction parameters of the target speech data. The spectral parameters of the target speech data and the linear prediction parameters of the target speech data are combined into the voiceprint features of the target speech data (referred to as target voiceprint features) in a predetermined order.
The pattern matcher is used to compute the similarity (or matching probability) between the target voiceprint feature and at least one template voiceprint feature, and the obtained at least one matching probability is taken as the matching result; a template voiceprint feature is the voiceprint feature of template voice data (extracted in the same way as the target voiceprint feature).
Since the template voice data is the voice data corresponding to a template user identity, the similarity (or matching result) between the target voiceprint feature and the at least one template voiceprint feature is equivalent to the matching degree between the target voice data and the at least one template user identity.
The pattern matcher can be any model with a predictive classification capability, such as a back-propagation (BP) neural network, a convolutional neural network, or a regression model (e.g., a linear regression model or a logistic regression model).
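The application leaves the matcher's internals open (BP network, CNN, regression); a minimal similarity-based stand-in, assuming fixed-length voiceprint vectors, could look like this:

```python
import numpy as np

def matching_probabilities(target, templates):
    """Cosine similarity between the target voiceprint feature and each
    template voiceprint feature, normalized into matching probabilities."""
    sims = np.array([np.dot(target, t) /
                     (np.linalg.norm(target) * np.linalg.norm(t))
                     for t in templates])
    exp = np.exp(sims - sims.max())  # softmax over similarities
    return exp / exp.sum()

target = np.random.randn(213)
templates = [np.random.randn(213) for _ in range(3)]
print(matching_probabilities(target, templates))  # 3 probabilities summing to 1
```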
Step S403, if there is a matching result that meets the matching condition in at least one matching result, taking the template user identity corresponding to the matching result that meets the matching condition as the target user identity.
Specifically, the server obtains a preset probability threshold; a matching result greater than the preset probability threshold is a matching result that satisfies the matching condition.
When a matching result satisfying the matching condition exists among the acquired matching results, the template user identity corresponding to that matching result is taken as the target user identity.
Step S404, extracting the target identity identifier corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity; the secondary identities in the secondary identity set are sub-identities of the primary identity.
Specifically, each template user identity has a secondary identity corresponding to the template user identity, the secondary identity of each template user identity is a sub-identity of the primary identity, and the secondary identities of all template user identities can be combined into a secondary identity set.
The server may extract, from the secondary identity set, a secondary identity of the target user identity (i.e., the template user identity corresponding to the matching result that satisfies the matching condition) as the target secondary identity.
For example, the matching probability between the target voiceprint feature of the target speech data and template voiceprint feature 1 (corresponding to template user identity 1) is 0.1, the matching probability between the target voiceprint feature and template voiceprint feature 2 (corresponding to template user identity 2) is 0.8, and the matching probability between the target voiceprint feature and template voiceprint feature 3 (corresponding to template user identity 3) is 0.1. If the preset probability threshold is 0.5, the matching result between the target voiceprint feature and template voiceprint feature 2 satisfies the matching condition, so the server may take template user identity 2 as the target user identity and the secondary identity of template user identity 2 as the target secondary identity.
Optionally, when at least one obtained matching result has a matching result that meets the matching condition, the server may send an animation playing instruction to the client, so that the client plays the target animation according to the animation playing instruction, where the target animation may be a lightweight animation.
Subsequently, when the execution of the service instruction corresponding to the service intention is completed, the server may send an animation playing stopping instruction to the client, so that the client stops the target animation according to the animation playing stopping instruction.
The above-described steps S403 to S404 describe the case when there is a matching result satisfying the matching condition in the acquired at least one matching result, and the following describes the case when there is no matching result satisfying the matching condition in the acquired at least one matching result:
If a matching result is less than or equal to the preset probability threshold, it is a matching result that does not satisfy the matching condition.
When none of the acquired matching results satisfies the matching condition, the server may create a user identity for the current user (referred to as the target user identity), create a secondary identity for the target user identity (referred to as the target secondary identity), and set the target secondary identity as a sub-identity of the primary identity.
The server can also identify age information corresponding to the target voice data, and search an image material library for an image matching the age information to use as the identity avatar.
The server can store the target user identity, the target secondary identity, and the identity avatar in association.
Subsequently, the server may take the target user identity as a new template user identity and add the target secondary identity to the secondary identity set.
For example, the matching probability between the target voiceprint feature of the target speech data and template voiceprint feature 1 (corresponding to template user identity 1) is 0.1, the matching probability between the target voiceprint feature and template voiceprint feature 2 (corresponding to template user identity 2) is 0.2, and the matching probability between the target voiceprint feature and template voiceprint feature 3 (corresponding to template user identity 3) is 0.2. If the preset probability threshold is 0.5, none of the 3 matching results satisfies the matching condition, so the server may create a new target user identity (e.g., user identity 4), create a new target secondary identity for it, and set the newly created target secondary identity as a sub-identity of the primary identity.
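The select-or-create decision described in steps S403 to S404 and in the no-match branch above can be sketched as follows, assuming a probability threshold of 0.5 and hypothetical identity names:

```python
THRESHOLD = 0.5  # preset probability threshold (illustrative value)

def resolve_identity(match_probs, template_identities, create_identity):
    """Return the template user identity whose matching result satisfies
    the matching condition, or create a new identity when none does."""
    best = max(range(len(match_probs)), key=lambda i: match_probs[i])
    if match_probs[best] > THRESHOLD:
        return template_identities[best]  # existing target user identity
    return create_identity()              # no match: create a new identity

# Mirrors the example above: [0.1, 0.2, 0.2] all fall below 0.5, so a new
# identity ("user identity 4") is created.
print(resolve_identity([0.1, 0.2, 0.2],
                       ["template user identity 1",
                        "template user identity 2",
                        "template user identity 3"],
                       lambda: "user identity 4"))
```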
Optionally, the above describes the usage process of the identity recognition model; the following describes its training process, illustrated with one round of model training using a single template user identity and its corresponding template voice data:
the server acquires template voice data of the identity of the template user, and generates a tag vector (called an identity tag vector) of the template voice data, wherein the identity tag vector is used for identifying the identity of the template user to which the template voice data belongs.
The server then acquires an initial classification model, predicts the matching degree between the template voice data and each of the at least one template user identity based on the initial classification model, and combines the obtained matching degrees into an identity prediction vector.
The difference between the identity label vector and the identity prediction vector is determined as the classification error, which is back-propagated to the initial classification model to adjust the model parameters in the initial classification model.
For example, there are 3 template user identities (template user identity 1, template user identity 2, and template user identity 3), and template user identity 2 is currently being trained, so the identity label vector of the template voice data of template user identity 2 is [0, 1, 0]. If the initial classification model predicts that the matching degree between the template voice data of template user identity 2 and template user identity 1 is 0.4, the matching degree with template user identity 2 is 0.3, and the matching degree with template user identity 3 is 0.3, the identity prediction vector is [0.4, 0.3, 0.3]. The classification error may then be: (0 − 0.4)² + (1 − 0.3)² + (0 − 0.3)² = 0.16 + 0.49 + 0.09 = 0.74. The calculated classification error is back-propagated to the initial classification model to adjust the model parameters in the initial classification model.
The server can keep training the initial classification model in the above manner, and when the number of training iterations reaches a threshold, or when the change in the model parameters between two successive adjustments is sufficiently small, the trained initial classification model can be used as the identity recognition model.
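The following numpy sketch shows one such training round under simplifying assumptions: the initial classification model is reduced to a single softmax layer, the classification error is the squared difference used in the example above, and the weights, features, and learning rate are all hypothetical.

```python
import numpy as np

def train_step(model_w, feature, label_vec, lr=0.1):
    """One training round: predict an identity vector, measure the squared
    classification error, and back-propagate it to adjust the parameters."""
    logits = model_w @ feature
    pred = np.exp(logits - logits.max())
    pred /= pred.sum()                          # identity prediction vector
    error = pred - label_vec
    loss = float(np.sum(error ** 2))            # squared classification error
    jac = np.diag(pred) - np.outer(pred, pred)  # d(pred)/d(logits) for softmax
    grad = np.outer(jac @ (2 * error), feature) # chain rule back to weights
    return model_w - lr * grad, loss

# Worked example from the text: label [0, 1, 0], prediction [0.4, 0.3, 0.3].
label = np.array([0.0, 1.0, 0.0])
print(np.sum((label - np.array([0.4, 0.3, 0.3])) ** 2))  # ≈ 0.74

w, loss = train_step(np.random.randn(3, 8) * 0.1, np.random.randn(8), label)
```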
As can be seen from the foregoing, the server may create a new target user identity and target secondary identity, and the newly created target user identity then serves as a new template user identity. In this case, the server needs to retrain the identity recognition model: the new identity recognition model adds a new category output on top of the original model, and this new category output gives the probability that voice data belongs to the newly created target user identity.
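The application does not specify the mechanism for adding the new category output; one common way, continuing the single-layer sketch above, is to append a freshly initialized output row before retraining (all shapes and values here are illustrative):

```python
import numpy as np

def add_class_output(model_w, scale=0.01):
    """Append one randomly initialized output row so the classifier can
    emit a probability for the newly created target user identity."""
    new_row = np.random.randn(1, model_w.shape[1]) * scale
    return np.vstack([model_w, new_row])

w = np.zeros((3, 213))   # 3 existing template user identities
w = add_class_output(w)  # 4 outputs: one per identity, old and new
print(w.shape)           # (4, 213)
```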
According to the method and the device of the present application, by identifying the user identity and business intention behind the current biological information, the target secondary identity corresponding to that user identity can be determined, so that the business instruction executed based on the target secondary identity both satisfies the user's current business intention and matches the user's identity. Furthermore, the present application only needs to acquire the user's target biological information to determine both the business intention and the user identity at the same time; the user does not need to perform two separate operations to establish them, which reduces the user's operation cost and improves the efficiency with which the terminal executes business instructions that match both the business intention and the identity of the user.
Please refer to fig. 7, which is a schematic flow chart of another data processing method provided in an embodiment of the present application; the data processing method may include the following steps:
in step S501, the flow starts.
Step S502, the server acquires voice data.
Specifically, when the primary account (which may correspond to the primary identity in the present application) is logged in to the client, i.e., the primary account is in an active state, the server receives the voice data sent by the client, where the voice data is collected by the client when the user voices "enter sub-account" into the client.
In step S503, the server recognizes the service intention of the voice data as an intention to access the secondary account, the server extracts the target voiceprint feature of the voice data based on the identity recognition model, and the specific process of extracting the target voiceprint feature may refer to step S402 in the embodiment corresponding to fig. 6.
Step S504, the server carries out pattern matching on the extracted target voiceprint characteristics and the existing template voiceprint characteristics.
Step S505, the server determines, according to the pattern matching result, whether any existing template voiceprint feature matches the target voiceprint feature; if yes, steps S507 and S508 are executed; if not, steps S506 and S508 are executed.
Step S506, the server creates a new secondary account (which may correspond to the target secondary identity in the present application) according to the intention of accessing the secondary account, establishes an association between the secondary account and the extracted target voiceprint feature, and logs the newly created secondary account in to the client.
Step S507, the server searches for a secondary account (which may correspond to the target secondary identity in the present application) corresponding to the matched template voiceprint feature, where the secondary account is an existing secondary account, and the server returns service data under the secondary account to the client.
Step S508 ends the flow.
When the client in the above steps is a video client installed in a smart television, each family member sharing the smart television uniquely corresponds, through their voiceprint feature, to one user identity and one secondary account (which may correspond to a secondary identity in the present application); the server can determine each family member's viewing history and followed videos based on the unique secondary account, thereby achieving personalized recommendation.
The following scenario takes as an example that user A has already created a secondary account on the client while user B has not: the client collects the voice data "enter sub-account" input by the user.
Please further refer to fig. 8, which is a timing chart of another data processing method according to an embodiment of the present application, where the data processing method includes the following steps:
step S601, the client collects the voice data of the user A.
Specifically, when the current primary account is logged in to the client in the terminal device, i.e., the primary account is in an active state, the user voices "log in to sub-account" to the client, and the client collects the voice data of the user inputting "log in to sub-account".
Step S602, the client sends the voice data to the server.
Step S603, the server determines the service intention through semantic recognition, matches a corresponding secondary account through voiceprint recognition, logs the secondary account in to the client, finds the behavior data under the secondary account (e.g., historical viewing records, followed videos, commented videos, search records, etc.), and generates recommendation data according to the behavior data.
In step S604, the server returns the recommended data to the client.
Step S605, the client acquires the voice data of the user B: "login to sub-account".
Step S606, the client uploads the voice data of the user B to the server.
Step S607, the server determines the service intention through semantic recognition, finds through voiceprint recognition that no secondary account matches, creates a secondary account, uses the created secondary account as a sub-account of the primary account, and logs the created secondary account in to the client.
In step S608, user B generates viewing behavior data (watched videos, followed videos, commented videos, search records, etc.) in the client based on the newly created secondary account.
In step S609, the client uploads the viewing behavior data of the user B to the server.
Step S610, the server stores the viewing behavior data in association with the newly created secondary account; this data is used to subsequently generate personalized recommendation data for user B.
Further, please refer to fig. 9, which is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 9, the data processing apparatus 1 may be applied to the server in the corresponding embodiments of fig. 3 to 8, and the data processing apparatus 1 may include: a first obtaining module 11, a recognition module 12, a second obtaining module 13, and a determination module 14.
The first obtaining module 11 is configured to obtain target biological information when the primary identity is in an active state;
the identification module 12 is used for identifying the business intention and the target user identity corresponding to the target biological information;
the second obtaining module 13 is configured to obtain a target secondary identity identifier corresponding to the target user identity; the target secondary identity is a sub-identity of the primary identity;
and the determination module 14 is configured to execute a service instruction corresponding to the service intention based on the target secondary identity.
For specific functional implementations of the first obtaining module 11, the identification module 12, the second obtaining module 13, and the determination module 14, reference may be made to steps S101 to S104 in the embodiment corresponding to fig. 3, which are not described again here.
Referring to fig. 9, the target bio-information includes target voice data;
the identification module 12 may include: a conversion unit 121, a calling unit 122, and a first determination unit 123.
A conversion unit 121, configured to convert the target voice data into text data, and semantically identify the text data to obtain the service intention;
the calling unit 122 is configured to call an identity recognition model corresponding to the primary identity to determine a matching result between the target voice data and at least one template user identity; the identity recognition model is a classification model generated according to the at least one template user identity and template voice data respectively corresponding to the at least one template user identity;
a first determining unit 123, configured to, if a matching result meeting a matching condition exists in at least one matching result, take the template user identity corresponding to the matching result meeting the matching condition as the target user identity;
the second obtaining module 13 may include: a first extraction unit 131.
A first extracting unit 131, configured to extract the target identity identifier corresponding to the target user identity from a secondary identity set corresponding to the at least one template user identity; the secondary identities in the secondary identity set are sub-identities of the primary identity.
The identification module 12 may further include: a second determination unit 124.
A second determining unit 124, configured to create the target user identity if there is no matching result that meets the matching condition in the at least one matching result, identify age information corresponding to the target voice data, and search an image material library for an identity avatar matching the age information;
the second obtaining module 13 may include: a second extraction unit 132.
A second extracting unit 132, configured to create the target secondary user identifier for the target user identity, set the target secondary user identifier as a sub-identifier of the primary identity identifier, and store the target user identity, the target secondary identity identifier, and the identity avatar in association.
For specific processes of the conversion unit 121, the calling unit 122, the first determining unit 123, the second determining unit 124, the first extracting unit 131, and the second extracting unit 132, reference may be made to steps S401 to S404 in the embodiment corresponding to fig. 6, which are not described again here.
When the first determining unit 123 and the first extracting unit 131 determine the target user identity and the target secondary identity, the second determining unit 124 and the second extracting unit 132 do not perform their corresponding steps; when the second determining unit 124 and the second extracting unit 132 determine the target user identity and the target secondary identity, the first determining unit 123 and the first extracting unit 131 do not perform their corresponding steps.
Referring to fig. 9, the identity recognition model includes a feature generator and a pattern matcher;
the calling unit 122 may include: an extraction subunit 1221 and a matching subunit 1222.
An extracting subunit 1221, configured to extract a target voiceprint feature of the target speech data based on the feature generator;
a matching subunit 1222, configured to determine, based on the pattern matcher, matching probabilities between the target voiceprint feature and at least one template voiceprint feature, where the obtained matching probabilities are all used as matching results; the at least one template voiceprint feature is the voiceprint feature corresponding to each of the at least one piece of template voice data;
the extracting subunit 1221 is specifically configured to extract, based on the feature generator, a spectrum parameter and a linear prediction parameter of the target voice data, and obtain the target voiceprint feature according to the spectrum parameter and the linear prediction parameter; the spectrum parameter is a short-time spectral feature parameter of the target voice data; the linear prediction parameter is a spectrum-fitting feature parameter of the target speech data.
The specific processes of the extracting subunit 1221 and the matching subunit 1222 may refer to step S402 in the embodiment corresponding to fig. 6, which are not described again here.
Referring to fig. 9, the service intention includes a client-side secondary login object switching intention;
the determination module 14 includes: a first generating unit 141.
A first generating unit 141, configured to generate a switching instruction corresponding to the switching intention of the client secondary login object, and use the target secondary identity as a secondary login object of the client according to the switching instruction; the switching instruction belongs to the service instruction.
The specific process of the first generating unit 141 may refer to step S104 in the embodiment corresponding to fig. 3, which is not described herein again.
Referring to fig. 9, the business intent includes a business data query intent;
the determination module 14 may include: a second generating unit 142.
A second generating unit 142, configured to generate a query instruction corresponding to the service data query intention, query target service data corresponding to the target secondary identity, and return the target service data to a client; the query instruction belongs to the service instruction.
The specific process of the second generating unit 142 may refer to step S104 in the embodiment corresponding to fig. 3, which is not described herein again.
Referring to fig. 9, the data processing apparatus 1, in addition to the first obtaining module 11, the identification module 12, the second obtaining module 13, and the determination module 14, may further include: a storage module 15, a training module 16, and a playing module 17.
The storage module 15 is configured to acquire behavior data of the user in the client corresponding to the target secondary identity, and store the behavior data and the target secondary identity in association; the behavior data is used for generating recommended service data for the user.
The training module 16 is configured to obtain template voice data corresponding to a template user identity, generate an identity tag vector corresponding to the template voice data, obtain an initial classification model, predict the matching degree between the template voice data and the at least one template user identity based on the initial classification model, obtain an identity prediction vector according to the obtained matching degrees, determine a classification error according to the identity tag vector and the identity prediction vector, and train the initial classification model according to the classification error to obtain the identity recognition model.
The playing module 17 is configured to send an animation playing instruction to a client to instruct the client to play a target animation when a matching result meeting the matching condition exists in the at least one matching result;
the playing module 17 is further configured to send an animation playing stopping instruction to the client when the execution of the service instruction is completed, and instruct the client to close the target animation.
The specific processes of the storage module 15, the training module 16, and the playing module 17 may refer to step S404 in the embodiment corresponding to fig. 6, which are not described again here.
Further, please refer to fig. 10, which is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The server in the embodiments corresponding to fig. 3 to fig. 8 may be an electronic device 1000. As shown in fig. 10, the electronic device 1000 may include: a user interface 1002, a processor 1004, an encoder 1006, and a memory 1008. A signal receiver 1016 is used to receive or transmit data via a cellular interface 1010 or a WiFi interface 1012. The encoder 1006 encodes the received data into a computer-processed data format. The memory 1008 stores a computer program, through which the processor 1004 is arranged to perform the steps of any of the method embodiments described above. The memory 1008 may include volatile memory (e.g., dynamic random access memory, DRAM) and may also include non-volatile memory (e.g., one-time programmable read-only memory, OTPROM). In some examples, the memory 1008 may further include memory located remotely from the processor 1004, which may be connected to the electronic device 1000 via a network. The user interface 1002 may include: a keyboard 1018 and a display 1020.
In the electronic device 1000 shown in fig. 10, the processor 1004 may be configured to call the computer program stored in the memory 1008 to implement:
when the primary identity identifier is in a valid state, acquiring target biological information;
identifying a business intention and a target user identity corresponding to the target biological information;
acquiring a target secondary identity corresponding to the target user identity; the target secondary identity is a sub-identity of the primary identity;
and executing a service instruction corresponding to the service intention based on the target secondary identity.
It should be understood that the electronic device 1000 described in the embodiment of the present invention may perform the description of the data processing method in the embodiments corresponding to fig. 3 to fig. 8, and may also perform the description of the data processing apparatus 1 in the embodiment corresponding to fig. 9, which are not described again here. In addition, for the same methods, the beneficial effects are not described again.
Further, here, it is to be noted that: an embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores the aforementioned computer program executed by the data processing apparatus 1, and the computer program includes program instructions, and when the processor executes the program instructions, the description of the data processing method in the embodiment corresponding to fig. 3 to 8 can be performed, so that details are not repeated here. In addition, the beneficial effects of the same method are not described in detail. For technical details not disclosed in the embodiments of the computer storage medium to which the present invention relates, reference is made to the description of the method embodiments of the present invention.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is only the preferred embodiments of the present invention, which of course cannot be used to limit the scope of rights of the present invention; therefore, equivalent changes made according to the claims of the present invention still fall within the scope covered by the present invention.

Claims (14)

1. A data processing method, comprising:
when a primary identity identifier is in a valid state, acquiring target biological information;
identifying a business intention and a target user identity corresponding to the target biological information;
acquiring a target secondary identity identifier corresponding to the target user identity, the target secondary identity identifier being a sub-identifier of the primary identity identifier; and
executing a business instruction corresponding to the business intention based on the target secondary identity identifier.

2. The method according to claim 1, wherein the target biological information comprises target voice data;
the identifying a business intention and a target user identity corresponding to the target biological information comprises:
converting the target voice data into text data, and semantically recognizing the text data to obtain the business intention;
invoking an identity recognition model corresponding to the primary identity identifier to determine matching results between the target voice data and at least one template user identity, the identity recognition model being a classification model generated from the at least one template user identity and template voice data respectively corresponding to the at least one template user identity; and
if a matching result satisfying a matching condition exists among the at least one matching result, taking the template user identity corresponding to the matching result satisfying the matching condition as the target user identity;
and the acquiring a target secondary identity identifier corresponding to the target user identity comprises:
extracting, from a secondary identity identifier set corresponding to the at least one template user identity, the target identity identifier corresponding to the target user identity, the secondary identity identifiers in the secondary identity identifier set being sub-identifiers of the primary identity identifier.

3. The method according to claim 2, further comprising:
if no matching result satisfying the matching condition exists among the at least one matching result, creating the target user identity; and
identifying age information corresponding to the target voice data, and searching an image material library for an identity avatar matching the age information;
and the acquiring a target secondary identity identifier corresponding to the target user identity comprises:
creating the target secondary user identifier for the target user identity;
setting the target secondary user identifier as a sub-identifier of the primary identity identifier; and
storing the target user identity, the target secondary identity identifier, and the identity avatar in association.

4. The method according to claim 2, wherein the identity recognition model comprises a feature generator and a pattern matcher;
the invoking an identity recognition model corresponding to the primary identity identifier to determine matching results between the target voice data and at least one template user identity comprises:
extracting a target voiceprint feature of the target voice data based on the feature generator; and
determining matching probabilities between the target voiceprint feature and at least one template voiceprint feature based on the pattern matcher, and taking each acquired matching probability as a matching result, the at least one template voiceprint feature being the voiceprint features respectively corresponding to the at least one piece of template voice data.

5. The method according to claim 4, wherein the extracting a target voiceprint feature of the target voice data based on the feature generator comprises:
extracting spectrum parameters and linear prediction parameters of the target voice data based on the feature generator, the spectrum parameters being short-time spectral feature parameters of the target voice data, and the linear prediction parameters being spectrum-fitting feature parameters of the target voice data; and
obtaining the target voiceprint feature from the spectrum parameters and the linear prediction parameters.

6. The method according to claim 2, further comprising:
acquiring template voice data corresponding to a template user identity;
generating an identity label vector corresponding to the template voice data;
acquiring an initial classification model, predicting matching degrees between the template voice data and the at least one template user identity based on the initial classification model, and obtaining an identity prediction vector from the acquired matching degrees; and
determining a classification error from the identity label vector and the identity prediction vector, and training the initial classification model according to the classification error to obtain the identity recognition model.

7. The method according to claim 2, further comprising:
when a matching result satisfying the matching condition exists among the at least one matching result, sending an animation playing instruction to a client to instruct the client to play a target animation; and
when execution of the business instruction is completed, sending an animation stopping instruction to the client to instruct the client to close the target animation.

8. The method according to claim 1, wherein the business intention comprises a client secondary login object switching intention;
the executing a business instruction corresponding to the business intention based on the target secondary identity identifier comprises:
generating a switching instruction corresponding to the client secondary login object switching intention, the switching instruction belonging to the business instruction; and
taking the target secondary identity identifier as a secondary login object of the client according to the switching instruction.

9. The method according to claim 8, further comprising:
acquiring behavior data, in the client, of the user corresponding to the target secondary identity identifier, the behavior data being used to generate recommended business data for the user; and
storing the behavior data and the target secondary identity identifier in association.

10. The method according to claim 1, wherein the business intention comprises a business data query intention;
the executing a business instruction corresponding to the business intention based on the target secondary identity identifier comprises:
generating a query instruction corresponding to the business data query intention, the query instruction belonging to the business instruction; and
querying target business data corresponding to the target secondary identity identifier, and returning the target business data to a client.

11. The method according to claim 1, wherein the user permissions possessed by the target secondary identity identifier are the same as the user permissions possessed by the primary identity identifier.

12. A data processing apparatus, comprising:
a first acquiring module, configured to acquire target biological information when a primary identity identifier is in a valid state;
a recognition module, configured to identify a business intention and a target user identity corresponding to the target biological information;
a second acquiring module, configured to acquire a target secondary identity identifier corresponding to the target user identity, the target secondary identity identifier being a sub-identifier of the primary identity identifier; and
a determining module, configured to execute a business instruction corresponding to the business intention based on the target secondary identity identifier.

13. An electronic device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1-11.

14. A computer storage medium, storing a computer program comprising program instructions which, when executed by a processor, perform the method according to any one of claims 1-11.
CN201911206373.3A2019-11-292019-11-29Data processing method, device, electronic equipment and storage mediumActiveCN112883350B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN201911206373.3ACN112883350B (en)2019-11-292019-11-29Data processing method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN201911206373.3ACN112883350B (en)2019-11-292019-11-29Data processing method, device, electronic equipment and storage medium

Publications (2)

Publication NumberPublication Date
CN112883350Atrue CN112883350A (en)2021-06-01
CN112883350B CN112883350B (en)2024-12-17

Family

ID=76039056

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201911206373.3AActiveCN112883350B (en)2019-11-292019-11-29Data processing method, device, electronic equipment and storage medium

Country Status (1)

CountryLink
CN (1)CN112883350B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN114550265A (en)*2022-02-282022-05-27上海商汤智能科技有限公司 Image processing method, face recognition method and system
WO2025039693A1 (en)*2023-08-212025-02-27腾讯科技(深圳)有限公司Service processing methods, apparatus, device, storage medium and program product

Citations (8)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104050401A (en)*2013-03-122014-09-17腾讯科技(深圳)有限公司User permission management method and system
CN104301498A (en)*2013-07-152015-01-21联想(北京)有限公司Information processing method and electronic equipment
CN104866774A (en)*2015-05-292015-08-26北京瑞星信息技术有限公司Method and system for managing account authorities
CN104899206A (en)*2014-03-052015-09-09电信科学技术研究院Method and system for equipment operation
CN105915491A (en)*2015-11-182016-08-31乐视网信息技术(北京)股份有限公司Account number login method and device
CN107357875A (en)*2017-07-042017-11-17北京奇艺世纪科技有限公司A kind of voice search method, device and electronic equipment
CN108075892A (en)*2016-11-092018-05-25阿里巴巴集团控股有限公司The method, apparatus and equipment of a kind of speech processes
CN110138712A (en)*2018-02-092019-08-16中国移动通信有限公司研究院Identity identifying method, device, medium, robot and system

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104050401A (en)*2013-03-122014-09-17腾讯科技(深圳)有限公司User permission management method and system
CN104301498A (en)*2013-07-152015-01-21联想(北京)有限公司Information processing method and electronic equipment
CN104899206A (en)*2014-03-052015-09-09电信科学技术研究院Method and system for equipment operation
CN104866774A (en)*2015-05-292015-08-26北京瑞星信息技术有限公司Method and system for managing account authorities
CN105915491A (en)*2015-11-182016-08-31乐视网信息技术(北京)股份有限公司Account number login method and device
CN108075892A (en)*2016-11-092018-05-25阿里巴巴集团控股有限公司The method, apparatus and equipment of a kind of speech processes
CN107357875A (en)*2017-07-042017-11-17北京奇艺世纪科技有限公司A kind of voice search method, device and electronic equipment
CN110138712A (en)*2018-02-092019-08-16中国移动通信有限公司研究院Identity identifying method, device, medium, robot and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN114550265A (en)*2022-02-282022-05-27上海商汤智能科技有限公司 Image processing method, face recognition method and system
WO2025039693A1 (en)*2023-08-212025-02-27腾讯科技(深圳)有限公司Service processing methods, apparatus, device, storage medium and program product

Also Published As

Publication numberPublication date
CN112883350B (en)2024-12-17

Similar Documents

PublicationPublication DateTitle
CN111933129B (en)Audio processing method, language model training method and device and computer equipment
US10878808B1 (en)Speech processing dialog management
US11270698B2 (en)Proactive command framework
CN112071330B (en)Audio data processing method and device and computer readable storage medium
US11494434B2 (en)Systems and methods for managing voice queries using pronunciation information
CN113168832A (en) Alternate Response Generation
US12165636B1 (en)Natural language processing
CN109155132A (en)Speaker verification method and system
US12332937B2 (en)Systems and methods for managing voice queries using pronunciation information
US10504512B1 (en)Natural language speech processing application selection
CN116417003A (en)Voice interaction system, method, electronic device and storage medium
US12340797B1 (en)Natural language processing
KR102389995B1 (en)Method for generating spontaneous speech, and computer program recorded on record-medium for executing method therefor
CN114360511B (en)Voice recognition and model training method and device
CN114373443B (en) Speech synthesis method and device, computing device, storage medium and program product
US11410656B2 (en)Systems and methods for managing voice queries using pronunciation information
US11804225B1 (en)Dialog management system
CN115171731A (en)Emotion category determination method, device and equipment and readable storage medium
CN110853669A (en)Audio identification method, device and equipment
CN117688145A (en)Method and device for question-answer interaction and intelligent equipment
CN112883350B (en)Data processing method, device, electronic equipment and storage medium
US11798538B1 (en)Answer prediction in a speech processing system
CN118916443A (en)Information retrieval method and device and electronic equipment
CN118467780A (en)Film and television search recommendation method, system, equipment and medium based on large model
HK40047328A (en)Data processing method and apparatus, electronic device, and storage medium

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
REGReference to a national code

Ref country code:HK

Ref legal event code:DE

Ref document number:40047328

Country of ref document:HK

GR01Patent grant
GR01Patent grant
TG01Patent term adjustment
TG01Patent term adjustment

[8]ページ先頭

©2009-2025 Movatter.jp