CN114818982A


Info

Publication number: CN114818982A
Application number: CN202210587354.5A
Authority: CN
Inventors: 钱晨
Original assignee: Shenzhen Coocaa Network Technology Co Ltd
Current assignee: Shenzhen Coocaa Network Technology Co Ltd
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-07-29

Abstract


The invention discloses a method, an apparatus, and an electronic device for generating a user portrait. The method includes: in response to detecting a plurality of current viewing sequences, acquiring the to-be-detected device identifier and the current feature vector corresponding to each current viewing sequence; performing a clustering operation on the current feature vectors to obtain at least one current cluster; and determining the current user portrait corresponding to each current viewing sequence based on the current mapping list corresponding to the current viewing sequences and the at least one to-be-detected device identifier corresponding to each current cluster. The current mapping list contains at least two user portraits and a plurality of previous device identifiers corresponding to each user portrait, and is determined from the previous feature vectors corresponding to the plurality of previous viewing sequences detected last time. Embodiments of the invention solve the problem that existing viewing mapping lists are set according to human experience, and improve the prediction accuracy of user portraits.

Description


A method, apparatus, and electronic device for generating a user portrait

Technical Field

The present invention relates to the technical field of big data, and in particular to a method, apparatus, and electronic device for generating a user portrait.

Background

A user portrait usually refers to data labels of different dimensions extracted from a user's behavior on the Internet. User portraits help big data "move out" of the data warehouse: diversified services aimed at users, such as personalized recommendation, precision marketing, and advertising, are an important direction for putting big data into practice.

OTT (over-the-top) refers to Internet companies bypassing network operators to develop video and data services based on the open Internet. At present, the most common way to obtain user portraits in the OTT industry is to manually preset a viewing mapping list, which contains different viewing contents and the user portrait corresponding to each viewing content. When viewing content is acquired, the target user portrait corresponding to that content is determined by querying the viewing mapping list.

In the prior art, the viewing mapping list is strongly influenced by human experience. In addition, what people watch may shift over time, but the viewing mapping list is fixed and cannot be updated effectively, so the accuracy of the predicted user portraits is low.

Summary of the Invention

The present invention provides a method, apparatus, and electronic device for generating a user portrait, so as to solve the problem that existing viewing mapping lists are set according to human experience, and to improve the prediction accuracy of user portraits.

According to one aspect of the present invention, a method for generating a user portrait is provided, the method comprising:

in response to detecting a plurality of current viewing sequences, acquiring the to-be-detected device identifier and the current feature vector corresponding to each of the current viewing sequences;

performing a clustering operation on the plurality of current feature vectors to obtain at least one current cluster;

determining the current user portrait corresponding to each of the current viewing sequences based on the current mapping list corresponding to the plurality of current viewing sequences and the at least one to-be-detected device identifier corresponding to each of the current clusters;

wherein the current mapping list contains at least two user portraits and a plurality of previous device identifiers corresponding to each of the user portraits, and the current mapping list is determined based on the previous feature vectors corresponding to the plurality of previous viewing sequences detected last time.

According to another aspect of the present invention, an apparatus for generating a user portrait is provided, the apparatus comprising:

a current feature vector acquisition module, configured to acquire, in response to detecting a plurality of current viewing sequences, the to-be-detected device identifier and the current feature vector corresponding to each of the current viewing sequences;

a current cluster determination module, configured to perform a clustering operation on the plurality of current feature vectors to obtain at least one current cluster;

a current user portrait determination module, configured to determine the current user portrait corresponding to each of the current viewing sequences based on the current mapping list corresponding to the plurality of current viewing sequences and the at least one to-be-detected device identifier corresponding to each of the current clusters.

According to another aspect of the present invention, an electronic device is provided, the electronic device comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the method for generating a user portrait according to any embodiment of the present invention.

According to another aspect of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method for generating a user portrait according to any embodiment of the present invention.

In the technical solution of the embodiments of the present invention, the to-be-detected device identifier and the current feature vector corresponding to each detected current viewing sequence are acquired; a clustering operation is performed on the current feature vectors to obtain at least one current cluster; and the current user portrait corresponding to each current viewing sequence is determined based on the current mapping list corresponding to the current viewing sequences and the at least one to-be-detected device identifier corresponding to each current cluster. Because the current mapping list is determined from the previous feature vectors corresponding to the previous viewing sequences detected last time, the solution removes the dependence of existing viewing mapping lists on human experience, allows the current mapping list to track changes over time, and thereby improves the prediction accuracy of user portraits.

It should be understood that the content described in this section is not intended to identify key or critical features of the embodiments of the invention, nor to limit the scope of the invention. Other features of the present invention will become readily understood from the following description.

Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for generating a user portrait provided in Embodiment 1 of the present invention;

FIG. 2 is a schematic diagram of a specific example of a method for generating a user portrait provided by an embodiment of the present invention;

FIG. 3 is a flowchart of a method for generating a user portrait provided in Embodiment 2 of the present invention;

FIG. 4 is a flowchart of a method for generating a user portrait provided in Embodiment 3 of the present invention;

FIG. 5 is a schematic structural diagram of an apparatus for generating a user portrait provided in Embodiment 4 of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device provided in Embodiment 5 of the present invention.

Detailed Description

To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present invention.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. Terms used in this way are interchangeable where appropriate, so that the embodiments described herein can be practiced in orders other than those illustrated or described. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

Embodiment 1

FIG. 1 is a flowchart of a method for generating a user portrait provided in Embodiment 1 of the present invention. This embodiment is applicable to predicting user portraits from viewing sequences. The method may be performed by an apparatus for generating a user portrait, which may be implemented in hardware and/or software and may be configured in a terminal device. As shown in FIG. 1, the method includes:

S110. In response to detecting a plurality of current viewing sequences, acquire the to-be-detected device identifier and the current feature vector corresponding to each current viewing sequence.

Specifically, a current viewing sequence represents the viewing information data of at least one work watched between one power-on time and the following power-off time; the interval from power-on to power-off is the viewing duration of that sequence. For example, the viewing information data includes, but is not limited to, the title of a TV series, its genre, its release date, the names of its main actors, the name of its director, the length of a single episode, and its synopsis. If the works watched include the TV series AA, the corresponding viewing information data includes "AA", "romance & costume drama", "2022", "lead actor name A", "lead actress name B", "director name C", "45 minutes", "love and hatred in ancient times", and so on.

In one embodiment, optionally, the method further includes: determining the filtered current viewing sequences based on the viewing duration of each current viewing sequence and a preset duration threshold. Specifically, if the viewing duration of current viewing sequence A is greater than or equal to the preset duration threshold, sequence A is kept as a filtered current viewing sequence; if its viewing duration is less than the threshold, sequence A is deleted. For example, the preset duration threshold may be 10 minutes.

The advantage of this arrangement is as follows. A current viewing sequence with a short viewing duration is only weakly coupled to a user portrait. Because the embodiments of the present invention determine user portraits by clustering, including short sequences in the clustered data tends to degrade the overall clustering result and makes the predicted user portraits inaccurate.
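The duration filter described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `ViewingSequence` class, its field names, and the sample data are hypothetical, and only the 10-minute threshold and the "greater than or equal to" rule come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ViewingSequence:
    """Hypothetical container for one power-on to power-off session."""
    device_id: str            # to-be-detected device identifier
    duration_minutes: float   # viewing duration of the sequence
    works: list = field(default_factory=list)  # viewing information data


def filter_sequences(sequences, threshold_minutes=10.0):
    """Keep sequences whose duration is >= the preset duration threshold."""
    return [s for s in sequences if s.duration_minutes >= threshold_minutes]


# Illustrative data only.
sequences = [
    ViewingSequence("dev-A", 45.0, ["AA"]),
    ViewingSequence("dev-B", 3.5, ["BB"]),
    ViewingSequence("dev-C", 10.0, ["CC"]),
]
kept = filter_sequences(sequences)
kept_ids = [s.device_id for s in kept]  # dev-B is dropped, dev-C sits on the boundary
```

A sequence exactly at the threshold is kept, matching the "greater than or equal to" condition in the description.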

Specifically, the to-be-detected device identifier identifies the terminal device on which a current viewing sequence was collected, and one to-be-detected device identifier may correspond to one or more current viewing sequences. For example, the to-be-detected device identifier may be the MAC (Media Access Control) address of the terminal device. The encoded content of the identifier contains at least one of text characters, digits, letters, and special characters; it is not limited here and can be customized according to actual needs.

In one embodiment, optionally, acquiring the current feature vector corresponding to each current viewing sequence includes: for each current viewing sequence, inputting the viewing information data of each work in the sequence into a pre-trained model to obtain an initial feature vector for each work, and averaging the initial feature vectors to obtain the current feature vector of the sequence. The pre-trained model is capable of extracting semantic information; examples include, but are not limited to, BERT, ELMo, and GPT models.
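The averaging step can be sketched as below. The initial feature vectors would come from a pre-trained language model such as BERT; here they are replaced by small hand-written lists, which is purely an assumption for illustration, and `mean_vector` is a hypothetical helper name.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (one vector per work watched)."""
    if not vectors:
        raise ValueError("a viewing sequence needs at least one work")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all initial feature vectors must have the same length")
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(dim)]


# Stand-ins for the model outputs of two works in one viewing sequence.
initial_vectors = [
    [1.0, 0.0, 2.0],
    [3.0, 4.0, 0.0],
]
current_vector = mean_vector(initial_vectors)  # [2.0, 2.0, 1.0]
```

Averaging gives every sequence a fixed-length vector regardless of how many works it contains, which is what the clustering step needs.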

S120. Perform a clustering operation on the plurality of current feature vectors to obtain at least one current cluster.

For example, the clustering algorithm used includes, but is not limited to, k-means, mean shift, density-based clustering, k-medoids, or CLARANS. The clustering algorithm is not limited here.

Specifically, each current cluster contains at least one current feature vector, and each current feature vector corresponds to one current viewing sequence and one to-be-detected device identifier.
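As an illustration of step S120, the sketch below runs a small k-means and then collects the device identifiers belonging to each cluster. It is a simplified version under stated assumptions: the centroids are initialized from the first `k` vectors so the result is deterministic, the iteration count is fixed, and all names and data are hypothetical. The patent does not prescribe k-means or any particular initialization.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Return one cluster index per vector (simplified k-means)."""
    centroids = [list(v) for v in vectors[:k]]  # assumption: naive initialization
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each vector.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def devices_per_cluster(assignments, device_ids):
    """Map each cluster index to the set of device identifiers it contains."""
    clusters = {}
    for cluster_index, device_id in zip(assignments, device_ids):
        clusters.setdefault(cluster_index, set()).add(device_id)
    return clusters


# Illustrative feature vectors, one per current viewing sequence.
vectors = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
device_ids = ["dev-1", "dev-2", "dev-3", "dev-4"]
assignments = kmeans(vectors, k=2)
clusters = devices_per_cluster(assignments, device_ids)
```

The per-cluster device sets are the input that the later portrait assignment step compares against the mapping list.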

S130. Determine the current user portrait corresponding to each current viewing sequence based on the current mapping list corresponding to the current viewing sequences and the at least one to-be-detected device identifier corresponding to each current cluster.

For example, the current user portrait includes, but is not limited to, at least one of user age group, user gender, and user identity attribute, where the user identity attribute may be literary youth, expert or scholar, white-collar worker, uneducated person, and so on.

In this embodiment, the current mapping list contains at least two user portraits and a plurality of previous device identifiers corresponding to each user portrait, and is determined based on the previous feature vectors corresponding to the previous viewing sequences detected last time. For example, when the user portrait contains only the user age group, the current mapping list may contain 0-17, 18-24, 25-34, 35-44, 45-54, and 55 and above. When the user portrait contains both age group and gender, the current mapping list may contain females aged 0-17, males aged 0-17, females aged 18-24, males aged 18-24, and so on.

In one embodiment, optionally, the method further includes: acquiring at least two previous clusters corresponding to the previous viewing sequences, together with the user portrait and the previous device identifier set corresponding to each previous cluster, where each previous cluster is obtained by clustering the previous feature vectors of the previous viewing sequences, and each previous device identifier set contains the previous device identifiers of the previous viewing sequences in that cluster; and, for each previous cluster, taking any previous device identifier from the cluster's previous device identifier set and, if that identifier does not appear in any other previous device identifier set, adding the identifier together with the cluster's user portrait to the current mapping list.

Specifically, the previous viewing sequences are the viewing sequences detected last time, and the interval between the previous detection time and the current detection time equals a preset detection period. For example, the preset detection period may be one day or one week.

Specifically, if the previous viewing sequences are the first viewing sequences ever detected, the user portrait of each previous cluster may be set by an expert according to the previous viewing sequences in that cluster. If they are not the first viewing sequences detected, the user portrait of each previous cluster may be determined from the previous mapping list corresponding to those previous viewing sequences.

Specifically, each previous device identifier in the current mapping list uniquely identifies one user portrait; that is, the previous cluster of any other user portrait contains no previous device identifier belonging to this user portrait. The number of previous device identifiers sampled for each user portrait in the current mapping list may be determined according to the sizes of the previous clusters. For example, if previous device identifier set A of previous cluster A contains 10 previous device identifiers and previous device identifier set B of previous cluster B contains 5, the sampling number is less than 5.
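The construction of the current mapping list can be sketched as follows. The rule that an identifier is kept only if it appears in no other cluster's identifier set comes from the description; everything else is an assumption made for illustration, including the function name, the dictionary layout, the sample data, and the choice to "sample" by taking the first `sample_size` identifiers in sorted order (the patent does not say how the sample is drawn).

```python
def build_mapping_list(cluster_devices, cluster_portraits, sample_size):
    """Map each user portrait to device identifiers exclusive to its cluster.

    cluster_devices:   {cluster_id: set of previous device identifiers}
    cluster_portraits: {cluster_id: user portrait label}
    sample_size:       max identifiers kept per cluster (assumed policy)
    """
    mapping = {}
    for cluster_id, devices in cluster_devices.items():
        # Identifiers seen in any other previous cluster.
        others = set()
        for other_id, other_devices in cluster_devices.items():
            if other_id != cluster_id:
                others |= other_devices
        # Keep only identifiers that are exclusive to this cluster.
        exclusive = sorted(devices - others)
        sampled = exclusive[:sample_size]  # assumption: deterministic sampling
        mapping.setdefault(cluster_portraits[cluster_id], set()).update(sampled)
    return mapping


# Illustrative data: d4 appears in both clusters and is therefore excluded.
previous_clusters = {
    "cluster_A": {"d1", "d2", "d3", "d4"},
    "cluster_B": {"d4", "d5", "d6"},
}
portraits = {"cluster_A": "18-24", "cluster_B": "35-44"}
mapping_list = build_mapping_list(previous_clusters, portraits, sample_size=2)
```

Excluding shared identifiers is what lets each remaining identifier point to exactly one user portrait.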

In one embodiment, optionally, determining the current user portrait corresponding to each current viewing sequence based on the current mapping list and the at least one to-be-detected device identifier corresponding to each current cluster includes: for each current cluster, determining, based on the to-be-detected device identifiers of the cluster, the number of overlapping device identifiers for each of the at least two user portraits in the current mapping list; and taking the user portrait with the largest number of overlapping device identifiers as the current user portrait of every current viewing sequence in the cluster.

For example, suppose current cluster A corresponds to 10 to-be-detected device identifiers, numbered 1-10. In the current mapping list, the previous device identifiers of user portrait A are [1-4, 11-20] and those of user portrait B are [5-10, 21-31]. The number of overlapping device identifiers is then 4 for user portrait A and 6 for user portrait B, so the current user portrait of every current viewing sequence in current cluster A is user portrait B.
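The worked example above can be reproduced with set intersections. This is a minimal sketch: the function name and dictionary keys are hypothetical, and ties are resolved by dictionary order, which is an assumption because the description does not say how ties are handled.

```python
def assign_portrait(cluster_device_ids, mapping_list):
    """Pick the user portrait whose identifiers overlap the cluster the most."""
    overlaps = {
        portrait: len(cluster_device_ids & identifiers)
        for portrait, identifiers in mapping_list.items()
    }
    best = max(overlaps, key=overlaps.get)  # assumption: first maximum wins ties
    return best, overlaps


# Current cluster A holds device identifiers 1-10.
cluster_a = set(range(1, 11))

# Current mapping list from the example: A -> [1-4, 11-20], B -> [5-10, 21-31].
mapping_list = {
    "portrait_A": set(range(1, 5)) | set(range(11, 21)),
    "portrait_B": set(range(5, 11)) | set(range(21, 32)),
}

portrait, overlaps = assign_portrait(cluster_a, mapping_list)
# overlaps is {"portrait_A": 4, "portrait_B": 6}, so portrait is "portrait_B"
```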

FIG. 2 is a schematic diagram of a specific example of a method for generating a user portrait provided by an embodiment of the present invention, taking a preset detection period of 1 day as an example. Specifically, the first feature vectors corresponding to the first viewing sequences detected on day 1 are determined, and a clustering operation is performed on them to obtain two first clusters. An expert may set the user portrait of each first cluster according to the first viewing sequences in that cluster. The first device identifiers of each first cluster are then sampled to obtain mapping list 1. Next, the second feature vectors corresponding to the second viewing sequences detected on day 2 are determined, and a clustering operation is performed on them to obtain two second clusters. Based on mapping list 1 and the second device identifiers of each second cluster, the user portrait of the viewing sequences in each second cluster is determined. The second device identifiers of each second cluster are then sampled to obtain mapping list 2. By analogy, the Nth feature vectors corresponding to the Nth viewing sequences detected on day N are determined, a clustering operation is performed on them to obtain two Nth clusters, and the user portrait of the viewing sequences in each Nth cluster is determined based on mapping list N-1 and the Nth device identifiers of each Nth cluster.

The advantage of this arrangement is as follows. With an existing mapping list, the user portrait corresponding to each newly released film or television work must be added to the list, yet new works appear constantly and at a very fast pace. With clustering, a new work's feature vector can be used to find the cluster it most resembles, which removes the need to update the mapping list manually and thus safeguards the prediction accuracy of user portraits.

In the technical solution of this embodiment, the to-be-detected device identifier and the current feature vector corresponding to each detected current viewing sequence are acquired; a clustering operation is performed on the current feature vectors to obtain at least one current cluster; and the current user portrait corresponding to each current viewing sequence is determined based on the current mapping list corresponding to the current viewing sequences and the at least one to-be-detected device identifier corresponding to each current cluster. Because the current mapping list is determined from the previous feature vectors corresponding to the previous viewing sequences detected last time, the solution removes the dependence of existing viewing mapping lists on human experience, allows the current mapping list to track changes over time, and thereby improves the prediction accuracy of user portraits.

Terminal devices in the OTT industry are shared: one terminal device usually serves all members of a household. Splitting the viewing behavior recorded on one terminal device into the perspective of each household member, that is, distinguishing the viewing behavior on a single device and building a separate user portrait for each member, is of great significance for fine-grained personalized recommendation and precision marketing.

Embodiment 2

FIG. 3 is a flowchart of a method for generating a user portrait provided in Embodiment 2 of the present invention. Building on the foregoing embodiment, this embodiment further details how, after user portraits have been predicted, the different viewing-time habits of the members of the household that owns a single terminal device are determined. As shown in FIG. 3, the method includes:

S210. In response to detecting a plurality of current viewing sequences, acquire the to-be-detected device identifier and the current feature vector corresponding to each current viewing sequence.

S220. Perform a clustering operation on the plurality of current feature vectors to obtain at least one current cluster.

S230. Determine the current user portrait corresponding to each current viewing sequence based on the current mapping list corresponding to the current viewing sequences and the at least one to-be-detected device identifier corresponding to each current cluster.

S240、针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史开机时刻和历史用户画像。S240. For each device identification to be detected, obtain the historical power-on time and historical user portraits respectively corresponding to multiple historical viewing sequences corresponding to the identification of the device to be detected.

在本实施例中，历史观影序列包括待检测设备标识对应的多个当前观影序列和/或上一观影序列。其中，具体的，同一待检测设备标识通常会对应多个观影序列，基于预设统计周期，将历史检测到的与待检测设备标识对应的观影序列作为待检测设备标识对应的历史观影序列。示例性的，假设预设统计周期为10天，预设检测周期为1天，则历史观影序列包含第i天到第i+9天检测到的与待检测设备标识对应的观影序列。In this embodiment, the historical viewing sequences include the multiple current viewing sequences and/or previous viewing sequences corresponding to the device identifier to be detected. Specifically, the same device identifier to be detected usually corresponds to multiple viewing sequences; based on a preset statistical period, the viewing sequences previously detected for the device identifier to be detected are taken as its historical viewing sequences. For example, assuming that the preset statistical period is 10 days and the preset detection period is 1 day, the historical viewing sequences include the viewing sequences detected from day i to day i+9 that correspond to the device identifier to be detected.

举例而言，假设预设统计周期为2天，预设检测周期为1天，第i天检测到观影序列1和观影序列2，且其分别对应的待检测设备标识为设备标识A和设备标识B，第i+1天检测到观影序列3和观影序列4，且其分别对应的待检测设备标识为设备标识A和设备标识C，则针对设备标识A，与其对应的多个历史观影序列包括观影序列1和观影序列3。For example, assume that the preset statistical period is 2 days and the preset detection period is 1 day. Viewing sequence 1 and viewing sequence 2 are detected on day i, and their device identifiers to be detected are device identifier A and device identifier B respectively; viewing sequence 3 and viewing sequence 4 are detected on day i+1, and their device identifiers to be detected are device identifier A and device identifier C respectively. Then, for device identifier A, the corresponding historical viewing sequences include viewing sequence 1 and viewing sequence 3.
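The grouping in the example above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `group_history_by_device`, the `(day, sequence_id, device_id)` record layout, and the use of integer day indices are all assumptions made for the sketch.

```python
from collections import defaultdict

def group_history_by_device(detections, start_day, period_days):
    """Collect, per device identifier, the viewing sequences detected inside
    the statistical period [start_day, start_day + period_days).

    detections: iterable of (day, sequence_id, device_id) tuples (hypothetical layout).
    """
    history = defaultdict(list)
    for day, sequence_id, device_id in detections:
        if start_day <= day < start_day + period_days:
            history[device_id].append(sequence_id)
    return dict(history)

# Day i is represented as 0 and day i+1 as 1; the statistical period is 2 days.
detections = [
    (0, "sequence 1", "A"),
    (0, "sequence 2", "B"),
    (1, "sequence 3", "A"),
    (1, "sequence 4", "C"),
]
history = group_history_by_device(detections, start_day=0, period_days=2)
print(history["A"])  # the historical viewing sequences of device identifier A
```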

S250、基于各历史开机时刻以及各历史用户画像，确定时间画像映射列表。S250. Determine a time-portrait mapping list based on each historical boot time and each historical user portrait.

在本实施例中，时间画像映射列表包含至少两个预设时间段以及与各预设时间段分别对应的分类用户画像。其中，具体的，时间画像映射列表可用于表征待检测设备标识对应的终端设备所属家庭的家庭成员不同观影时间习惯的信息。In this embodiment, the time portrait mapping list includes at least two preset time periods and classified user portraits corresponding to each preset time period. Specifically, the time portrait mapping list may be used to represent information about the different viewing time habits of family members of the family to which the terminal device corresponding to the device to be detected belongs.

其中，具体的，每个预设时间段分别对应的时间间隔可以相同，也可以不同。在一个实施例中，预设时间段可以是一天内的24小时对应的时间段，如多个预设时间段分别为0-7点、7-12点、12-18点以及18-24点。在另一个实施例中，预设时间段可以是一周内的7天对应的时间段，如多个预设时间段分别为周一到周五、周六和周日。在另一个实施例中，预设时间段还可以是一周内的7天及一天内的24小时对应的时间段，如多个预设时间段分别为周一到周五的0-7点、周一到周五的7-12点、周一到周五的12-18点、周一到周五的18-24点、周六的0-7点、周六的7-24点，周日的0-24点。此处对各预设时间段的具体参数值不作限定，用户可根据实际需求自定义设置。Specifically, the time intervals of the preset time periods may be the same or different. In one embodiment, the preset time periods divide the 24 hours of a day, for example 0-7 o'clock, 7-12 o'clock, 12-18 o'clock and 18-24 o'clock. In another embodiment, the preset time periods divide the 7 days of a week, for example Monday to Friday, Saturday, and Sunday. In another embodiment, the preset time periods combine the 7 days of a week with the 24 hours of a day, for example 0-7 o'clock Monday to Friday, 7-12 o'clock Monday to Friday, 12-18 o'clock Monday to Friday, 18-24 o'clock Monday to Friday, 0-7 o'clock on Saturday, 7-24 o'clock on Saturday, and 0-24 o'clock on Sunday. The specific parameter values of the preset time periods are not limited here, and the user can customize them according to actual needs.

本实施例的以下内容均以1天内的24小时为例进行示例性说明。The following content of this embodiment is exemplified by taking 24 hours in a day as an example.

其中，示例性的，时间画像映射列表包含0-7点、7-12点、12-18点以及18-24点，且各预设时间段对应的分类用户画像分别为用户画像A、用户画像B、用户画像A和用户画像C。可以理解的是，不同预设时间段分别对应的用户画像可以相同，也可以不同。如当终端设备为电视时，老人一般会在早上7-12点和晚上18-24点均会观看电视，年轻人一般只在晚上18-24点观看电视。For example, the time portrait mapping list includes 0-7 o'clock, 7-12 o'clock, 12-18 o'clock and 18-24 o'clock, and the classified user portraits corresponding to these preset time periods are user portrait A, user portrait B, user portrait A and user portrait C respectively. It can be understood that the user portraits corresponding to different preset time periods may be the same or different. For example, when the terminal device is a TV, the elderly generally watch TV both from 7 to 12 o'clock in the morning and from 18 to 24 o'clock in the evening, while young people generally watch TV only from 18 to 24 o'clock in the evening.

其中，具体的，针对每个预设时间段，基于各历史开机时刻，确定与预设时间段对应的多个目标历史观影序列，确定至少一种用户画像分别对应的各目标历史观影序列的序列数量，并将序列数量最多的用户画像作为预设时间段对应的分类用户画像。Specifically, for each preset time period, the target historical viewing sequences falling in that preset time period are determined based on the historical power-on times; the number of target historical viewing sequences corresponding to each of the at least one user portrait is then determined, and the user portrait with the largest number of sequences is taken as the classified user portrait corresponding to the preset time period.

举例而言，假设待检测设备标识对应的多个历史观影序列包括历史观影序列1、历史观影序列2、历史观影序列3和历史观影序列4，且其分别对应的历史开机时刻分别为8:00、8:03、9:02和20:23，假设预设时间段为7-12点，与7-12点对应的多个目标历史观影序列包括历史观影序列1、历史观影序列2和历史观影序列3，假设历史观影序列1、历史观影序列2和历史观影序列3对应的用户画像分别为用户画像A、用户画像A和用户画像B，则用户画像A对应的序列数量为2个，用户画像B对应的序列数量为1个，序列数量最多的用户画像A作为7-12点对应的分类用户画像。For example, assume that the historical viewing sequences corresponding to the device identifier to be detected include historical viewing sequences 1, 2, 3 and 4, whose historical power-on times are 8:00, 8:03, 9:02 and 20:23 respectively. Assuming the preset time period is 7-12 o'clock, the target historical viewing sequences corresponding to 7-12 o'clock are historical viewing sequences 1, 2 and 3. Assuming that the user portraits corresponding to historical viewing sequences 1, 2 and 3 are user portrait A, user portrait A and user portrait B respectively, the number of sequences corresponding to user portrait A is 2 and the number corresponding to user portrait B is 1, so user portrait A, which has the largest number of sequences, is taken as the classified user portrait corresponding to 7-12 o'clock.
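The per-period majority vote described above can be sketched as follows. The sketch rests on several assumptions that are not in the source: the preset time periods are treated as half-open hour intervals `[start, end)`, the names `SLOTS`, `slot_of` and `build_time_portrait_map` are hypothetical, and historical viewing sequence 4 is given user portrait C purely for illustration, since the source does not state its portrait.

```python
from collections import Counter

# Preset time periods as half-open hour intervals [start, end) (an assumption).
SLOTS = [(0, 7), (7, 12), (12, 18), (18, 24)]

def slot_of(hour_value, slots=SLOTS):
    """Return the preset time period that contains the given fractional hour."""
    for start, end in slots:
        if start <= hour_value < end:
            return (start, end)
    raise ValueError(f"hour {hour_value} is outside every preset time period")

def build_time_portrait_map(records, slots=SLOTS):
    """records: iterable of ((hour, minute), portrait) pairs, one per historical
    viewing sequence. Returns {period: portrait with the most sequences}."""
    votes = {slot: Counter() for slot in slots}
    for (hour, minute), portrait in records:
        votes[slot_of(hour + minute / 60, slots)][portrait] += 1
    # Periods without any sequence are left out of the mapping.
    return {slot: c.most_common(1)[0][0] for slot, c in votes.items() if c}

RECORDS = [
    ((8, 0), "A"),    # historical viewing sequence 1
    ((8, 3), "A"),    # historical viewing sequence 2
    ((9, 2), "B"),    # historical viewing sequence 3
    ((20, 23), "C"),  # historical viewing sequence 4 (portrait assumed)
]
mapping = build_time_portrait_map(RECORDS)
print(mapping[(7, 12)])  # portrait A wins the 7-12 o'clock period, 2 sequences to 1
```

Ties are resolved here by whichever portrait was counted first; the source does not specify a tie-breaking rule.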

在上述实施例的基础上，可选的，方法还包括：响应于检测到目标待检测设备标识对应的开机指令，基于目标待检测设备标识对应的目标时间画像映射列表以及开机指令对应的当前开机时刻，确定与开机指令对应的目标分类用户画像；获取与目标分类用户画像对应的推荐数据，并将推荐数据发送给与目标待检测设备标识对应的终端设备，以使终端设备对推荐数据进行展示。On the basis of the above embodiment, optionally, the method further includes: in response to detecting a power-on instruction corresponding to a target device identifier to be detected, determining the target classified user portrait corresponding to the power-on instruction based on the target time portrait mapping list corresponding to the target device identifier to be detected and the current power-on time corresponding to the power-on instruction; and obtaining recommendation data corresponding to the target classified user portrait and sending the recommendation data to the terminal device corresponding to the target device identifier to be detected, so that the terminal device displays the recommendation data.

其中，具体的，本地存储空间中存储有多个待检测设备标识以及各待检测设备标识分别对应的时间画像映射列表。Specifically, a plurality of device identifiers to be detected and a time portrait mapping list corresponding to each device identifier to be detected are stored in the local storage space.

其中，示例性的，假设目标待检测设备标识对应的目标时间画像映射列表包含0-7点、7-12点、12-18点以及18-24点，且各预设时间段对应的分类用户画像分别为用户画像A、用户画像B、用户画像A和用户画像C，假设当前开机时刻为12:05，则目标分类用户画像为用户画像A，获取与用户画像A对应的推荐数据。For example, assume that the target time portrait mapping list corresponding to the target device identifier to be detected includes 0-7 o'clock, 7-12 o'clock, 12-18 o'clock and 18-24 o'clock, and that the classified user portraits corresponding to these preset time periods are user portrait A, user portrait B, user portrait A and user portrait C respectively. Assuming that the current power-on time is 12:05, which falls in the 12-18 o'clock period, the target classified user portrait is user portrait A, and the recommendation data corresponding to user portrait A is obtained.
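Looking up the classified user portrait at power-on can be sketched as a simple interval search. The names `TIME_PORTRAIT_MAP` and `classify_power_on` are hypothetical, the half-open interval convention is an assumption, and the queried times (9:30 and 20:00) are illustrative values rather than values from the source.

```python
# Target time portrait mapping list from the example: period -> classified portrait.
# Periods are treated as half-open hour intervals [start, end) (an assumption).
TIME_PORTRAIT_MAP = [
    ((0, 7), "A"),
    ((7, 12), "B"),
    ((12, 18), "A"),
    ((18, 24), "C"),
]

def classify_power_on(hour, minute, mapping=TIME_PORTRAIT_MAP):
    """Return the classified user portrait for a power-on time, or None if no
    preset time period contains it."""
    t = hour + minute / 60
    for (start, end), portrait in mapping:
        if start <= t < end:
            return portrait
    return None

print(classify_power_on(9, 30))   # falls in 7-12 o'clock
print(classify_power_on(20, 0))   # falls in 18-24 o'clock
```

The returned portrait would then be used as the key for fetching recommendation data; how that data is stored and retrieved is not specified in the source.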

其中，示例性的，推荐数据包括但不限于购物数据、广告数据、推荐影视数据等等。举例而言，当目标分类用户画像为儿童时，推荐数据可以为与儿童零食或儿童玩具相关的广告数据，当目标分类用户画像为老人时，推荐数据为与保健品相关的广告数据。Wherein, exemplarily, the recommendation data includes, but is not limited to, shopping data, advertisement data, recommended video data, and the like. For example, when the target classified user portrait is children, the recommendation data may be advertisement data related to children's snacks or children's toys, and when the target classified user portrait is the elderly, the recommended data may be advertisement data related to health care products.

本实施例的技术方案，通过针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史开机时间和历史用户画像，基于各历史开机时间以及各历史用户画像，确定时间画像映射列表，解决了无法对终端设备所属家庭的家庭成员的观影时间习惯进行精准区分的问题，可以针对某一终端设备所属家庭的每个家庭成员的观影行为进行精细化定位，实现了预测家庭画像的目的，为后续的推荐或营销服务提供了重要的数据参考。In the technical solution of this embodiment, for each device identifier to be detected, the historical power-on times and historical user portraits corresponding to the multiple historical viewing sequences of that device identifier are obtained, and the time portrait mapping list is determined based on these historical power-on times and historical user portraits. This solves the problem that the viewing time habits of the family members of the family to which a terminal device belongs cannot be accurately distinguished, allows the viewing behavior of each family member to be targeted at a fine granularity, achieves the purpose of predicting the family portrait, and provides an important data reference for subsequent recommendation or marketing services.

实施例三Embodiment 3

图4是本发明实施例三提供的一种用户画像的生成方法的流程图，本实施例对上述实施例在预测用户画像后，确定单一终端设备所属家庭的家庭结构的方法进行进一步细化。如图4所示，该方法包括：FIG. 4 is a flowchart of a method for generating a user portrait provided by Embodiment 3 of the present invention. This embodiment further refines the method, applied after the user portrait is predicted in the foregoing embodiments, for determining the family structure of the family to which a single terminal device belongs. As shown in FIG. 4, the method includes:

S310、响应于检测到多个当前观影序列，获取各当前观影序列分别对应的待检测设备标识和当前特征向量。S310. In response to detecting a plurality of current movie viewing sequences, acquire the identification of the device to be detected and the current feature vector corresponding to each current movie viewing sequence respectively.

S320、对多个当前特征向量执行聚类操作，得到至少一个当前聚类簇。S320. Perform a clustering operation on a plurality of current feature vectors to obtain at least one current cluster.

S330、基于多个当前观影序列对应的当前映射列表和各当前聚类簇分别对应的至少一个待检测设备标识，确定各当前观影序列分别对应的当前用户画像。S330. Determine the current user portraits corresponding to each current movie viewing sequence based on the current mapping lists corresponding to the multiple current movie viewing sequences and at least one to-be-detected device identifier corresponding to each current cluster.

S340、针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史用户画像。S340. For each device identification to be detected, obtain historical user portraits corresponding to multiple historical viewing sequences corresponding to the identification of the device to be detected.

在本实施例中，历史用户画像包含用户年龄段或者历史用户画像包含用户年龄段和用户性别。在一个实施例中，示例性的，历史用户画像包含0-17岁或18-24岁，在另一个实施例中，示例性的，历史用户画像包含女性0-17岁、男性0-17岁或女性18-24岁。In this embodiment, the historical user portrait includes the user age group, or the historical user portrait includes the user age group and the user gender. In one embodiment, for example, the historical user portrait is 0-17 years old or 18-24 years old; in another embodiment, for example, the historical user portrait is female 0-17 years old, male 0-17 years old, or female 18-24 years old.

本实施例的以下内容以历史用户画像包含用户年龄段进行示例性说明。The following content of this embodiment is exemplified by the fact that historical user portraits include user age groups.

S350、确定至少一种历史用户画像分别对应的历史观影序列的序列数量，并基于各序列数量，确定待检测设备标识对应的家庭结构。S350: Determine the number of sequences of historical movie viewing sequences corresponding to at least one historical user portrait, and determine the family structure corresponding to the identifier of the device to be detected based on the number of each sequence.

其中，示例性的，假设待检测设备标识对应的多个历史观影序列包括历史观影序列1、历史观影序列2、历史观影序列3和历史观影序列4，且其分别对应的历史用户画像分别为0-17岁、0-17岁、18-24岁和25-34岁，则0-17岁对应的序列数量为2个，18-24岁和25-34岁对应的序列数量分别为1个。For example, assume that the historical viewing sequences corresponding to the device identifier to be detected include historical viewing sequences 1, 2, 3 and 4, and that their historical user portraits are 0-17 years old, 0-17 years old, 18-24 years old and 25-34 years old respectively. Then the number of sequences corresponding to 0-17 years old is 2, and the numbers of sequences corresponding to 18-24 years old and 25-34 years old are 1 each.
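The count in this example amounts to tallying sequences per historical user portrait. A minimal sketch, in which the dictionary layout and the variable names are assumptions:

```python
from collections import Counter

# Historical user portrait (age group) of each historical viewing sequence.
historical_portraits = {
    "sequence 1": "0-17",
    "sequence 2": "0-17",
    "sequence 3": "18-24",
    "sequence 4": "25-34",
}

# Number of historical viewing sequences per historical user portrait.
sequence_counts = Counter(historical_portraits.values())
print(sequence_counts["0-17"], sequence_counts["18-24"], sequence_counts["25-34"])
```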

在一个实施例中，可选的，基于各序列数量，确定待检测设备标识对应的家庭结构，包括：基于预设选取参数和各序列数量，确定至少一个目标历史用户画像，并基于各目标历史用户画像，确定待检测设备标识对应的家庭结构；其中，预设选取参数包括预设选取数量或预设数量阈值；和/或，针对每个历史用户画像分别对应的序列数量，如果本序列数量对应的历史用户画像属于中间年龄段，则在本序列数量大于等于下一序列数量且大于上一序列数量时，将本序列数量对应的历史用户画像作为目标历史用户画像；如果本序列数量对应的历史用户画像属于两端年龄段，则在本序列数量大于等于下一序列数量或大于上一序列数量时，将本序列数量对应的历史用户画像作为目标历史用户画像；基于各目标历史用户画像，确定待检测设备标识对应的家庭结构。In one embodiment, optionally, determining the family structure corresponding to the device identifier to be detected based on the sequence numbers includes: determining at least one target historical user portrait based on a preset selection parameter and the sequence numbers, and determining the family structure corresponding to the device identifier to be detected based on the target historical user portraits, where the preset selection parameter includes a preset selection quantity or a preset quantity threshold; and/or, for the sequence number corresponding to each historical user portrait, if the historical user portrait corresponding to the current sequence number belongs to a middle age group, taking that historical user portrait as a target historical user portrait when the current sequence number is greater than or equal to the next sequence number and greater than the previous sequence number; if the historical user portrait corresponding to the current sequence number belongs to an end age group, taking that historical user portrait as a target historical user portrait when the current sequence number is greater than or equal to the next sequence number or greater than the previous sequence number; and determining the family structure corresponding to the device identifier to be detected based on the target historical user portraits.

在一个实施例中，具体的，当预设选取参数为预设选取数量时，对各序列数量进行降序排序，将排序结果靠前预设选取数量的历史用户画像作为目标历史用户画像。当预设选取参数为预设数量阈值时，将序列数量大于预设数量阈值的历史用户画像作为目标历史用户画像。In one embodiment, specifically, when the preset selection parameter is the preset selection quantity, the sequence numbers are sorted in descending order, and the historical user portraits ranked within the preset selection quantity at the top of the sorted result are taken as the target historical user portraits. When the preset selection parameter is the preset quantity threshold, the historical user portraits whose sequence number is greater than the preset quantity threshold are taken as the target historical user portraits.

举例而言，假设0-17岁、18-24岁、25-34岁、35-44岁、45岁-54岁以及55岁以上对应的序列数量分别为100个、2个、50个、7个、10个和1个，如果预设选取数量为3个，则至少一个目标历史用户画像包括0-17岁、25-34岁以及45岁-54岁。如果预设选取阈值为40个，则至少一个目标历史用户画像包括0-17岁和25-34岁。相应的，当至少一个目标历史用户画像包括0-17岁、25-34岁以及45岁-54岁时，家庭结构为三世同堂。当至少一个目标历史用户画像包括0-17岁和25-34岁时，家庭结构为两世同堂。For example, assume that the numbers of sequences corresponding to 0-17 years old, 18-24 years old, 25-34 years old, 35-44 years old, 45-54 years old and over 55 years old are 100, 2, 50, 7, 10 and 1 respectively. If the preset selection quantity is 3, the target historical user portraits are 0-17 years old, 25-34 years old and 45-54 years old. If the preset quantity threshold is 40, the target historical user portraits are 0-17 years old and 25-34 years old. Correspondingly, when the target historical user portraits are 0-17 years old, 25-34 years old and 45-54 years old, the family structure is three generations living together. When the target historical user portraits are 0-17 years old and 25-34 years old, the family structure is two generations living together.
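The two parameter-based selection rules can be sketched as follows, using the counts from the example. The function names `select_top_n` and `select_above_threshold` and the short age-group labels are assumptions made for the sketch.

```python
# Sequence numbers per age group, taken from the example above.
COUNTS = {"0-17": 100, "18-24": 2, "25-34": 50, "35-44": 7, "45-54": 10, "55+": 1}

def select_top_n(counts, n):
    """Preset selection quantity: sort in descending order and keep the top n."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    return set(ranked[:n])

def select_above_threshold(counts, threshold):
    """Preset quantity threshold: keep portraits whose count exceeds the threshold."""
    return {portrait for portrait, count in counts.items() if count > threshold}

print(sorted(select_top_n(COUNTS, 3)))             # three age groups selected
print(sorted(select_above_threshold(COUNTS, 40)))  # two age groups selected
```

Both rules depend on a hand-chosen parameter: with a selection quantity of 3, exactly three portraits are always returned, regardless of how the counts are distributed.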

在另一个实施例中，具体的，中间年龄段用于表征该历史用户画像既存在与其相邻的上一历史用户画像又存在与其相邻的下一年龄段，两端年龄段用于表征该历史用户画像仅存在与其相邻的上一历史用户画像或仅存在与其相邻的下一年龄段。示例性的，假设多个历史用户画像包括0-17岁、18-24岁、25-34岁、35-44岁、45岁-54岁以及55岁以上，则“0-17岁”和“55岁以上”均属于两端年龄段，18-24岁、25-34岁、35-44岁、45岁-54岁均属于中间年龄段。In another embodiment, specifically, a middle age group is a historical user portrait that has both an adjacent previous age group and an adjacent next age group, and an end age group is a historical user portrait that has only an adjacent previous age group or only an adjacent next age group. For example, assuming the historical user portraits are 0-17 years old, 18-24 years old, 25-34 years old, 35-44 years old, 45-54 years old and over 55 years old, then "0-17 years old" and "over 55 years old" are end age groups, while 18-24 years old, 25-34 years old, 35-44 years old and 45-54 years old are middle age groups.

其中，具体的，当历史用户画像属于中间年龄段时，之所以本序列数量大于上一序列数量而本序列数量大于等于下一序列数量，是因为观影序列通常是向后发生偏移的，即随着时间的迁移，同一用户偏向于观看较大的历史用户画像对应的观影序列。如用户A一开始观看儿童动画类的影视作品，随着时间迁移，用户A可能会开始观看青年动漫类的影视作品。Specifically, when the historical user portrait belongs to a middle age group, the current sequence number is required to be strictly greater than the previous sequence number but only greater than or equal to the next sequence number because viewing sequences usually drift toward older age groups: over time, the same user tends to watch content associated with an older historical user portrait. For example, user A may start out watching children's animation and, as time passes, begin watching animation aimed at young adults.

举例而言，假设0-17岁、18-24岁、25-34岁、35-44岁、45岁-54岁以及55岁以上对应的序列数量分别为100个、2个、50个、7个、10个和1个，则至少一个目标历史用户画像包括0-17岁、25-34岁以及45岁-54岁，相应的，家庭结构为三世同堂。For example, assume that the number of sequences corresponding to 0-17 years old, 18-24 years old, 25-34 years old, 35-44 years old, 45-54 years old and over 55 years old are 100, 2, 50, 7 respectively , 10, and 1, then at least one target historical user portrait includes 0-17 years old, 25-34 years old, and 45-54 years old. Correspondingly, the family structure is three generations.
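The adjacent-count comparison can be sketched as a local-peak search over the ordered age groups. The function name `select_by_adjacent_comparison` and the short age-group labels are assumptions; the comparison rules follow the description above (strictly greater than the previous count, greater than or equal to the next count, with end age groups checked only against the neighbor they have).

```python
# Age groups in ascending order, with the sequence numbers from the example.
AGE_GROUPS = ["0-17", "18-24", "25-34", "35-44", "45-54", "55+"]
COUNTS = {"0-17": 100, "18-24": 2, "25-34": 50, "35-44": 7, "45-54": 10, "55+": 1}

def select_by_adjacent_comparison(counts, order=AGE_GROUPS):
    """Keep an age group when its count is a local peak among its neighbors."""
    selected = []
    for i, group in enumerate(order):
        current = counts[group]
        has_prev = i > 0
        has_next = i < len(order) - 1
        beats_prev = has_prev and current > counts[order[i - 1]]
        holds_next = has_next and current >= counts[order[i + 1]]
        if has_prev and has_next:
            keep = beats_prev and holds_next  # middle age group: both conditions
        else:
            keep = beats_prev or holds_next   # end age group: the one neighbor it has
        if keep:
            selected.append(group)
    return selected

print(select_by_adjacent_comparison(COUNTS))
```

Unlike the two parameter-based rules, this selection needs no hand-tuned quantity or threshold; the number of selected portraits follows from the shape of the counts.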

在另一个实施例中，可选的，当采用上述三种方式分别得到一个家庭结构时，判断三个家庭结构是否相同，如果是，则将任一家庭结构作为待检测设备标识对应的家庭结构，如果否，则将待检测设备标识对应的家庭结构设置为待定或将基于相邻序列数量比较的方法确定的家庭结构作为待检测设备标识对应的家庭结构。In another embodiment, optionally, when each of the above three methods yields a family structure, it is determined whether the three family structures are the same. If they are, any one of them is taken as the family structure corresponding to the device identifier to be detected. If they are not, the family structure corresponding to the device identifier to be detected is set to pending, or the family structure determined by the adjacent-sequence-number comparison method is taken as the family structure corresponding to the device identifier to be detected.

由于基于预设数量阈值比较或序列数量排序的方法，需要根据人工经验提前设置预设选取数量和预设数量阈值的参数值，而预设选取数量和预设数量阈值的参数值会直接影响到确定的目标历史用户画像的数量，如预设选取数量为3个时，目标历史用户画像的数量必为三个。基于相邻序列数量比较的方法则有效避免了依赖于人工经验的问题，确定的家庭结构相比于以上两种方法确定的家庭结构更准确一点，所以在三个家庭结构不同时，可选的，将基于相邻序列数量比较的方法确定的家庭结构作为待检测设备标识对应的家庭结构。The methods based on comparison with a preset quantity threshold or on sorting the sequence numbers require the values of the preset selection quantity and the preset quantity threshold to be set in advance from manual experience, and these values directly determine how many target historical user portraits are selected; for example, when the preset selection quantity is 3, the number of target historical user portraits is necessarily three. The method based on comparing adjacent sequence numbers avoids this reliance on manual experience, and the family structure it determines is somewhat more accurate than that determined by the other two methods. Therefore, when the three family structures differ, the family structure determined by the adjacent-sequence-number comparison method is optionally taken as the family structure corresponding to the device identifier to be detected.
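The reconciliation of the three results can be sketched as below. The function name `reconcile_family_structure`, the `prefer_adjacent` flag and the `"pending"` sentinel are assumptions made for the sketch; the source only states that a disagreement leads either to a pending result or to the result of the adjacent-comparison method.

```python
def reconcile_family_structure(top_n_result, threshold_result, adjacent_result,
                               prefer_adjacent=True):
    """Combine the family structures produced by the three selection methods.

    If all three agree, that structure is returned. Otherwise the result is
    either the adjacent-comparison structure or the sentinel "pending",
    depending on prefer_adjacent (a hypothetical switch).
    """
    if top_n_result == threshold_result == adjacent_result:
        return adjacent_result
    return adjacent_result if prefer_adjacent else "pending"

agreed = reconcile_family_structure("three generations", "three generations",
                                    "three generations")
fallback = reconcile_family_structure("three generations", "two generations",
                                      "three generations")
undecided = reconcile_family_structure("three generations", "two generations",
                                       "three generations", prefer_adjacent=False)
print(agreed, fallback, undecided)
```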

在另一个实施例中，当历史用户画像包含用户年龄段和用户性别时，基于至少一个目标历史用户画像确定的家庭结构相比于基于至少一个目标用户年龄段确定的家庭结构更精细。示例性的，当历史用户画像包含用户年龄段时，假设至少一个目标历史用户画像包括0-17岁和25-34岁，则家庭结构为两世同堂。当历史用户画像包含用户年龄段和用户性别时，假设至少一个目标历史用户画像包括女性0-17岁、女性25-34岁和男性25-34岁，则家庭结构为三口之家，假设至少一个目标历史用户画像包括女性0-17岁和女性25-34岁，则家庭结构为女性单亲家庭。In another embodiment, when the historical user portrait includes the user age group and the user gender, the family structure determined based on the at least one target historical user portrait is more refined than the family structure determined based on the at least one target user age group. Exemplarily, when the historical user portraits include user age groups, assuming that at least one target historical user portrait includes 0-17 years old and 25-34 years old, the family structure is two generations living together. When historical user portraits include user age group and user gender, assuming that at least one target historical user portrait includes females 0-17 years old, females 25-34 years old and males 25-34 years old, the family structure is a family of three, assuming at least one The target historical user portraits include female 0-17 years old and female 25-34 years old, and the family structure is female single-parent family.

在上述实施例的基础上，可选的，响应于检测到目标待检测设备标识对应的开机指令，基于目标待检测设备标识对应的目标家庭结构，获取推荐数据；将推荐数据发送给与目标待检测设备标识对应的终端设备，以使终端设备对推荐数据进行展示。On the basis of the above embodiment, optionally, in response to detecting a power-on instruction corresponding to a target device identifier to be detected, recommendation data is obtained based on the target family structure corresponding to the target device identifier to be detected, and the recommendation data is sent to the terminal device corresponding to the target device identifier to be detected, so that the terminal device displays the recommendation data.

其中，示例性的，推荐数据包括但不限于购物数据、广告数据、推荐影视数据等等。举例而言，当目标家庭结构为三世同堂时，推荐数据可以为与保健品相关的广告数据，当目标家庭结构为单亲家庭时，推荐数据为与托管所相关的广告数据。For example, the recommendation data includes, but is not limited to, shopping data, advertisement data, recommended video data, and the like. For instance, when the target family structure is three generations living together, the recommendation data may be advertisement data related to health care products; when the target family structure is a single-parent family, the recommendation data may be advertisement data related to childcare centers.

本实施例的技术方案，通过针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史用户画像，其中，历史用户画像包含用户年龄段或历史用户画像包含用户年龄段和用户性别，确定至少一种历史用户画像分别对应的历史观影序列的序列数量，并基于各序列数量，确定待检测设备标识对应的家庭结构，解决了无法对终端设备所属家庭的家庭结构进行精准识别的问题，实现了预测家庭结构的目的，为后续的推荐或营销服务提供了重要的数据参考。In the technical solution of this embodiment, for each device identifier to be detected, the historical user portraits corresponding to the multiple historical viewing sequences of that device identifier are obtained, where the historical user portrait includes the user age group, or the user age group and the user gender. The number of historical viewing sequences corresponding to each of the at least one historical user portrait is determined, and the family structure corresponding to the device identifier to be detected is determined based on these sequence numbers. This solves the problem that the family structure of the family to which a terminal device belongs cannot be accurately identified, achieves the purpose of predicting the family structure, and provides an important data reference for subsequent recommendation or marketing services.

实施例四Embodiment 4

图5是本发明实施例四提供的一种用户画像的生成装置的结构示意图。如图5所示，该装置包括：当前特征向量获取模块410、当前聚类簇确定模块420和当前用户画像确定模块430。FIG. 5 is a schematic structural diagram of an apparatus for generating a user portrait according to Embodiment 4 of the present invention. As shown in FIG. 5, the apparatus includes: a current feature vector acquisition module 410, a current cluster determination module 420 and a current user portrait determination module 430.

其中，当前特征向量获取模块410，用于响应于检测到多个当前观影序列，获取各当前观影序列分别对应的待检测设备标识和当前特征向量；The current feature vector acquisition module 410 is configured to, in response to detecting a plurality of current viewing sequences, obtain the device identifier to be detected and the current feature vector corresponding to each current viewing sequence;

当前聚类簇确定模块420，用于对多个当前特征向量执行聚类操作，得到至少一个当前聚类簇；The current cluster determination module 420 is configured to perform a clustering operation on the plurality of current feature vectors to obtain at least one current cluster;

当前用户画像确定模块430，用于基于多个当前观影序列对应的当前映射列表和各当前聚类簇分别对应的至少一个待检测设备标识，确定各当前观影序列分别对应的当前用户画像；The current user portrait determination module 430 is configured to determine the current user portrait corresponding to each current viewing sequence based on the current mapping list corresponding to the multiple current viewing sequences and the at least one device identifier to be detected corresponding to each current cluster;

其中，当前映射列表中包含至少两种用户画像以及与各用户画像分别对应的多个上一设备标识，当前映射列表是基于上一次检测到的多个上一观影序列分别对应的上一特征向量确定的。The current mapping list contains at least two user portraits and multiple previous device identifiers corresponding to each user portrait, and the current mapping list is determined based on the previous feature vectors corresponding to the multiple previous viewing sequences detected last time.

本实施例的技术方案，通过获取检测到的多个当前观影序列分别对应的待检测设备标识和当前特征向量，对多个当前特征向量执行聚类操作，得到至少一个当前聚类簇，基于多个当前观影序列对应的当前映射列表和各当前聚类簇分别对应的至少一个待检测设备标识，确定各当前观影序列分别对应的当前用户画像，其中，当前映射列表是基于上一次检测到的多个上一观影序列分别对应的上一特征向量确定的，解决了现有观影映射列表依赖人为经验设置的问题，使得当前映射列表具备时间迁移性，进而提高了用户画像的预测准确率。In the technical solution of this embodiment, the device identifiers to be detected and the current feature vectors corresponding to the detected multiple current viewing sequences are obtained, a clustering operation is performed on the multiple current feature vectors to obtain at least one current cluster, and the current user portrait corresponding to each current viewing sequence is determined based on the current mapping list corresponding to the multiple current viewing sequences and the at least one device identifier to be detected corresponding to each current cluster, where the current mapping list is determined based on the previous feature vectors corresponding to the multiple previous viewing sequences detected last time. This solves the problem that existing viewing mapping lists rely on settings derived from human experience, lets the current mapping list follow changes over time, and thereby improves the prediction accuracy of the user portrait.

在上述实施例的基础上，可选的，该装置还包括：On the basis of the foregoing embodiment, optionally, the device further includes:

当前映射列表确定模块，用于获取多个上一观影序列对应的至少两个上一聚类簇、各上一聚类簇分别对应的用户画像以及上一设备标识集；其中，各上一聚类簇是基于各上一观影序列分别对应的上一特征向量聚类得到的，上一设备标识集包含上一聚类簇对应的多个上一观影序列的上一设备标识；The current mapping list determination module is configured to obtain at least two previous clusters corresponding to the multiple previous viewing sequences, and the user portrait and previous device identifier set corresponding to each previous cluster; each previous cluster is obtained by clustering the previous feature vectors corresponding to the previous viewing sequences, and the previous device identifier set contains the previous device identifiers of the multiple previous viewing sequences corresponding to the previous cluster;

针对每个上一聚类簇，获取上一聚类簇对应的上一设备标识集中的任一上一设备标识，如果上一设备标识与其他上一设备标识集不存在交集，则将上一设备标识与上一聚类簇对应的用户画像对应添加到当前映射列表中。For each previous cluster, any previous device identifier in the previous device identifier set corresponding to that cluster is obtained; if the previous device identifier does not appear in any other previous device identifier set, the previous device identifier is added to the current mapping list together with the user portrait corresponding to that previous cluster.
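One way to read the step above is that a device identifier enters the current mapping list only if it belongs to exactly one previous cluster. The sketch below follows that reading and applies the check to every device identifier in each set; the function name `build_current_mapping`, the `(portrait, device_ids)` layout and the example portraits and identifiers are assumptions.

```python
def build_current_mapping(previous_clusters):
    """previous_clusters: list of (user_portrait, set_of_previous_device_ids).

    Returns {user_portrait: device ids that appear in no other cluster's set}.
    """
    mapping = {}
    for i, (portrait, device_ids) in enumerate(previous_clusters):
        # Union of the device identifier sets of all other previous clusters.
        others = set().union(
            *(ids for j, (_, ids) in enumerate(previous_clusters) if j != i)
        )
        mapping.setdefault(portrait, set()).update(device_ids - others)
    return mapping

# Hypothetical previous clusters: device d3 was seen in both, so it is ambiguous.
previous_clusters = [
    ("child", {"d1", "d2", "d3"}),
    ("elderly", {"d3", "d4"}),
]
current_mapping = build_current_mapping(previous_clusters)
print(current_mapping)
```

Dropping ambiguous devices keeps only identifiers whose viewing history points to a single portrait, which is what makes them usable as anchors for labeling the next round of clusters.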

在上述实施例的基础上，可选的，当前用户画像确定模块430，具体用于：On the basis of the foregoing embodiment, optionally, the current user portrait determination module 430 is specifically configured to:

针对每个当前聚类簇，基于当前聚类簇对应的至少一个待检测设备标识，确定与当前映射列表中至少两个用户画像分别对应的设备标识重叠数量；For each current cluster, based on at least one device identification to be detected corresponding to the current cluster, determine the overlapping number of device identifications corresponding to at least two user portraits in the current mapping list;

将设备标识重叠数量最多的用户画像作为当前聚类簇对应的各当前观影序列的当前用户画像。The user portrait with the largest number of overlapping device identifications is used as the current user portrait of each current movie viewing sequence corresponding to the current cluster.
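The overlap-based labeling of a current cluster can be sketched as follows. The function name `assign_portrait` and the example portraits and device identifiers are assumptions; ties are resolved by dictionary order here, which the source does not specify.

```python
def assign_portrait(cluster_device_ids, current_mapping):
    """Return the user portrait whose mapped device identifiers overlap most
    with the device identifiers of the current cluster."""
    overlaps = {
        portrait: len(cluster_device_ids & mapped_ids)
        for portrait, mapped_ids in current_mapping.items()
    }
    return max(overlaps, key=overlaps.get)

# Hypothetical current mapping list and current cluster.
current_mapping = {"child": {"d1", "d2"}, "elderly": {"d4", "d5"}}
cluster_device_ids = {"d1", "d2", "d5", "d9"}

portrait = assign_portrait(cluster_device_ids, current_mapping)
print(portrait)  # overlap of 2 with "child" versus 1 with "elderly"
```

Every current viewing sequence in the cluster would then receive this portrait as its current user portrait.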

时间画像映射列表确定模块，用于针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史开机时间和历史用户画像；其中，历史观影序列包括待检测设备标识对应的多个当前观影序列和/或上一观影序列；The time portrait mapping list determination module is used to obtain, for each device identification to be detected, the historical boot times and historical user portraits respectively corresponding to multiple historical viewing sequences corresponding to the identification of the device to be detected; wherein, the historical viewing sequence includes the identification to be detected Multiple current viewing sequences and/or previous viewing sequences corresponding to the device identifier;

基于各历史开机时间以及各历史用户画像，确定时间画像映射列表；其中，时间画像映射列表包含至少两个预设时间段以及与各预设时间段分别对应的分类用户画像。Based on each historical boot time and each historical user portrait, a time portrait mapping list is determined; wherein, the time portrait mapping list includes at least two preset time periods and classified user portraits corresponding to each preset time period.

第一推荐数据发送模块，用于响应于检测到目标待检测设备标识对应的开机指令，基于目标待检测设备标识对应的目标时间画像映射列表以及开机指令对应的当前开机时刻，确定与开机指令对应的目标分类用户画像；The first recommendation data sending module is configured to, in response to detecting a power-on instruction corresponding to a target device identifier to be detected, determine the target classified user portrait corresponding to the power-on instruction based on the target time portrait mapping list corresponding to the target device identifier to be detected and the current power-on time corresponding to the power-on instruction;

获取与目标分类用户画像对应的推荐数据，并将推荐数据发送给与目标待检测设备标识对应的终端设备，以使终端设备对推荐数据进行展示。The recommendation data corresponding to the target classification user portrait is acquired, and the recommendation data is sent to the terminal device corresponding to the identification of the target device to be detected, so that the terminal device can display the recommendation data.

家庭结构确定模块，用于针对每个待检测设备标识，获取待检测设备标识对应的多个历史观影序列分别对应的历史用户画像；其中，历史观影序列包括待检测设备标识对应的多个当前观影序列和/或上一观影序列，历史用户画像包含用户年龄段或历史用户画像包含用户年龄段和用户性别；The family structure determination module is configured to obtain, for each device identifier to be detected, the historical user portraits corresponding to the multiple historical viewing sequences of that device identifier; the historical viewing sequences include the multiple current viewing sequences and/or previous viewing sequences corresponding to the device identifier to be detected, and the historical user portrait includes the user age group, or the user age group and the user gender;

确定至少一种历史用户画像分别对应的历史观影序列的序列数量，并基于各序列数量，确定待检测设备标识对应的家庭结构。The number of sequences of historical viewing sequences corresponding to at least one historical user portrait is determined, and based on the number of each sequence, the family structure corresponding to the identifier of the device to be detected is determined.
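As an illustration of the counting step, the sketch below tallies how many historical viewing sequences carry each historical user portrait. The `Portrait` type (an age group with an optional gender) and the choice to report the family structure as portraits ordered by count are assumptions made for the example; the text above does not fix a representation.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Portrait(NamedTuple):
    """Hypothetical portrait: an age group, optionally with a gender."""
    age_group: str
    gender: Optional[str] = None


def count_sequences(portraits):
    """Count historical viewing sequences per historical user portrait."""
    return Counter(portraits)


def family_structure_from_counts(counts):
    """Assumed representation: (portrait, count) pairs, most frequent first."""
    return counts.most_common()


portraits = [
    Portrait("18-35", "F"),
    Portrait("0-12"),
    Portrait("18-35", "F"),
    Portrait("36-55", "M"),
]
counts = count_sequences(portraits)
structure = family_structure_from_counts(counts)
```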

On the basis of the foregoing embodiment, optionally, the family structure determination module of the device is specifically configured to:

determine at least one target historical user portrait based on a preset selection parameter and the sequence counts, and determine the family structure corresponding to the device identifier to be detected based on the target historical user portraits, wherein the preset selection parameter includes a preset selection number or a preset count threshold; and/or,

for the sequence count corresponding to each historical user portrait: if the historical user portrait corresponding to the current sequence count belongs to a middle age group, take that historical user portrait as a target historical user portrait when the current sequence count is greater than or equal to the next sequence count and greater than the previous sequence count;

if the historical user portrait corresponding to the current sequence count belongs to an age group at either end, take that historical user portrait as a target historical user portrait when the current sequence count is greater than or equal to the next sequence count or greater than the previous sequence count;

and determine the family structure corresponding to the device identifier to be detected based on the target historical user portraits.
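Both selection strategies can be sketched as follows. Several points are assumptions rather than statements from the text: the age groups are ordered from youngest to oldest, "next" and "previous" refer to neighbours in that order, the youngest group is compared only with its next neighbour and the oldest only with its previous neighbour, the threshold comparison is inclusive, and groups with a zero count are never selected. The function names are hypothetical.

```python
def select_by_preset(counts, top_n=None, min_count=None):
    """Select target portraits by a preset selection number or count threshold.

    counts: dict mapping portrait -> sequence count.
    """
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    if min_count is not None:
        ranked = [(p, n) for p, n in ranked if n >= min_count]
    return [p for p, _ in ranked]


def select_peaks(age_groups, counts):
    """Select age groups whose sequence count is a local peak.

    age_groups: labels ordered youngest to oldest (assumed ordering).
    counts:     sequence counts aligned with age_groups.
    """
    targets = []
    last = len(age_groups) - 1
    for i, group in enumerate(age_groups):
        current = counts[i]
        if current == 0:
            continue  # assumption: an age group never observed is not a member
        ge_next = i < last and current >= counts[i + 1]
        gt_prev = i > 0 and current > counts[i - 1]
        if 0 < i < last:
            is_target = ge_next and gt_prev   # middle age group: both conditions
        else:
            is_target = ge_next or gt_prev    # end age group: either condition
        if is_target:
            targets.append(group)
    return targets


age_groups = ["0-12", "13-17", "18-35", "36-55", "56+"]
peaks = select_peaks(age_groups, [8, 2, 10, 10, 3])
top_two = select_by_preset({"a": 5, "b": 1, "c": 3}, top_n=2)
frequent = select_by_preset({"a": 5, "b": 1, "c": 3}, min_count=3)
```

The asymmetric comparison (greater than or equal to the next count, strictly greater than the previous count) means that a plateau of equal counts yields exactly one target, namely its first age group.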

The second recommendation data sending module is configured to: in response to detecting a boot instruction corresponding to a target device identifier to be detected, obtain recommendation data based on the target family structure corresponding to the target device identifier;

and send the recommendation data to the terminal device corresponding to the target device identifier to be detected, so that the terminal device displays the recommendation data.

The device for generating a user portrait provided by the embodiments of the present invention can execute the method for generating a user portrait provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.

Embodiment 5

FIG. 6 is a schematic structural diagram of an electronic device according to Embodiment 5 of the present invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the invention described and/or claimed herein.

As shown in FIG. 6, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 can perform various appropriate actions and processes according to a computer program stored in the ROM 12 or a computer program loaded from a storage unit 18 into the RAM 13. The RAM 13 can also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12, and the RAM 13 are connected to one another through a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.

Multiple components of the electronic device 10 are connected to the I/O interface 15, including: an input unit 16, such as a keyboard or a mouse; an output unit 17, such as various types of displays and speakers; a storage unit 18, such as a magnetic disk or an optical disk; and a communication unit 19, such as a network card, a modem, or a wireless communication transceiver. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The processor 11 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The processor 11 executes the methods and processes described above, such as the method for generating a user portrait.

In some embodiments, the method for generating a user portrait may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the method for generating a user portrait described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the method for generating a user portrait in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

A computer program for implementing the method for generating a user portrait of the present invention may be written in any combination of one or more programming languages. Such computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the computer programs, when executed by the processor, cause the functions/operations specified in the flowcharts and/or block diagrams to be carried out. A computer program may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.

Embodiment 6

Embodiment 6 of the present invention further provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a processor to execute a method for generating a user portrait, the method including:

in response to detecting multiple current viewing sequences, obtaining the device identifier to be detected and the current feature vector corresponding to each current viewing sequence;

performing a clustering operation on the multiple current feature vectors to obtain at least one current cluster;

and determining the current user portrait corresponding to each current viewing sequence, based on the current mapping list corresponding to the multiple current viewing sequences and the at least one device identifier to be detected corresponding to each current cluster.
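The final step of the method, assigning a user portrait to each current cluster, can be sketched as below. This passage does not spell out the matching rule, so the sketch assumes that a cluster takes the portrait whose previous device identifiers overlap most with the device identifiers in the cluster. The clustering itself is taken as given, and all function and variable names are hypothetical.

```python
def assign_cluster_portraits(cluster_device_ids, current_mapping):
    """Assign a user portrait to each current cluster.

    cluster_device_ids: cluster id -> set of device identifiers to be detected.
    current_mapping:    portrait -> set of previous device identifiers.
    Assumed rule: choose the portrait with the largest identifier overlap;
    a cluster with no overlap at all is left unassigned (None).
    """
    result = {}
    for cluster_id, devices in cluster_device_ids.items():
        best_portrait, best_overlap = None, 0
        for portrait, previous_devices in current_mapping.items():
            overlap = len(devices & previous_devices)
            if overlap > best_overlap:
                best_portrait, best_overlap = portrait, overlap
        result[cluster_id] = best_portrait
    return result


def portraits_for_sequences(sequence_clusters, cluster_portraits):
    """Propagate each cluster's portrait to the viewing sequences it contains."""
    return {seq: cluster_portraits[c] for seq, c in sequence_clusters.items()}


current_mapping = {"child": {"tv-1", "tv-2"}, "adult": {"tv-3", "tv-4", "tv-5"}}
cluster_device_ids = {0: {"tv-1", "tv-2", "tv-3"}, 1: {"tv-4", "tv-5"}, 2: {"tv-9"}}
cluster_portraits = assign_cluster_portraits(cluster_device_ids, current_mapping)
sequence_portraits = portraits_for_sequences({"seq-a": 0, "seq-b": 1}, cluster_portraits)
```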

In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by, or in connection with, an instruction execution system, apparatus, or device. The computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein may be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the electronic device. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (e.g., a data server), or a computing system that includes a middleware component (e.g., an application server), or a computing system that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), a blockchain network, and the Internet.

A computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the high management difficulty and weak business scalability of traditional physical hosts and VPS services.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution of the present invention can be achieved; no limitation is imposed herein.

The above specific embodiments do not limit the protection scope of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.