CN118799948A - Emotion recognition smart contract construction method based on cross fusion and confidence assessment - Google Patents

Emotion recognition smart contract construction method based on cross fusion and confidence assessment

Info

Publication number
CN118799948A
Authority
CN
China
Prior art keywords: sample, samples, loss, label, correct
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202411282724.XA
Other languages
Chinese (zh)
Other versions
CN118799948B (en)
Inventor
姚健
钱鹏江
薛文倩
樊成
张冠宇
方伟
王闯
蒋亦樟
刘洋
梁福生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Suzhou University
Jiangnan University
Original Assignee
Jilin University
Suzhou University
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University, Suzhou University, Jiangnan University
Priority to CN202411282724.XA
Publication of CN118799948A
Application granted
Publication of CN118799948B
Status: Active
Anticipated expiration

Abstract

Translated from Chinese

The present invention relates to the technical field of semi-supervised expression recognition and discloses a method for constructing an emotion recognition smart contract based on cross fusion and confidence assessment. The method comprises: obtaining facial images stored on a blockchain and dividing them into a labeled sample set and an unlabeled sample set; inputting each facial image into an initial image classification model to obtain a predicted label and a label confidence score; obtaining a label set loss from the cross-entropy loss of each labeled sample; dividing the unlabeled samples into correct and incorrect samples; obtaining a set unsupervised loss from the cross-entropy loss of each correct sample; obtaining a set contrastive loss from the contrastive learning loss of each incorrect sample; constructing a total model loss function from the above three losses and training the initial image classification model to obtain a trained image classification model; and inputting a facial image to be recognized, obtaining multiple confidence scores, and taking the emotion type with the highest score as the predicted label of the facial image to be recognized.

Description

Translated from Chinese

Emotion recognition smart contract construction method based on cross fusion and confidence assessment

Technical Field

The present invention relates to the technical field of semi-supervised expression recognition, and in particular to a method for constructing an emotion recognition smart contract based on cross fusion and confidence assessment.

Background Art

Wellness care is a comprehensive elderly-care service model for older adults. Beyond basic daily care, it emphasizes improving physical health through rehabilitation, preventive health practices, and similar means, while also attending to mental health and an enriched inner life, so as to improve overall physical and mental well-being. Owing to declining physiological function and changing social roles, older adults are particularly prone to emotional problems such as loneliness, depression, and anxiety. If these problems are not detected and addressed in time, they may develop into serious mental illnesses such as clinical depression or anxiety disorders. Emotion recognition makes it possible to monitor the emotional state of older adults in real time; once an abnormality is detected, intervention can begin immediately, effectively preventing the onset of mental illness.

Over the past few decades, emotion recognition has made significant progress and many innovative methods have been proposed, but most of them rely on supervised learning. Because labeling facial expressions requires expert analysts, and acquiring large-scale labeled data demands substantial human, material, and financial resources, supervised facial expression recognition has serious limitations in practice. Semi-supervised learning aims to train models with a small number of labeled samples while reducing the dependence on large-scale manual annotation. Consistency regularization and pseudo-labeling are the two basic techniques of modern semi-supervised learning. Consistency regularization trains the model by encouraging consistent predictions across perturbations of the input, reducing prediction error on unlabeled data; however, it may cause the model to overfit the unlabeled data and lose generalization ability. Pseudo-labeling performs supervised learning on the labeled data, uses the model to predict the unlabeled data, selects high-confidence predictions as pseudo-labels, and trains on the pseudo-labels together with the labeled data; once the model generates incorrect pseudo-labels, these errors can propagate and amplify during subsequent training, degrading model performance.

Moreover, blockchain technology, with its decentralization, high transparency, and tamper resistance, has been widely applied in many fields including healthcare, demonstrating strong application potential and value. When blockchain technology is applied in the wellness-care industry, how to record emotional data becomes a problem. Although a blockchain guarantees that stored data cannot be tampered with, existing systems lack an effective data-verification mechanism, such as analyzing expression data before it is added to the chain. Without this analysis step, unstable emotional data of older adults can be written directly into the blockchain, compromising emotion monitoring and preventing timely intervention.

In summary, existing recognition methods that rely on supervised learning have serious practical limitations, because labeling facial expressions requires expert knowledge and acquiring large-scale labeled data demands substantial human, material, and financial resources. Moreover, if the acquired labeled data is corrupted through improper storage, the trained model will not be accurate enough, leading to unreliable recognition results and the inability to detect abnormal emotions and intervene in time.

Summary of the Invention

To this end, the technical problem to be solved by the present invention is to overcome two shortcomings of the prior art: the inability to accurately identify the emotion type of facial expressions, and the insecure storage of acquired expression data, which degrades model-training accuracy.

To solve the above technical problems, the present invention provides a method for constructing an emotion recognition smart contract based on cross fusion and confidence assessment, comprising:

obtaining multiple facial images stored on a blockchain, dividing them into labeled samples or unlabeled samples according to whether they carry ground-truth labels, and constructing a labeled sample set and an unlabeled sample set;

inputting each facial image into an initial image classification model to obtain its predicted label and label confidence score;

for the labeled samples, computing the cross-entropy loss of each labeled sample from its label confidence score and ground-truth label, and summing the cross-entropy losses of all labeled samples in the labeled sample set to obtain the label set loss of the labeled sample set;

for each unlabeled sample, performing two weak augmentations to obtain the corresponding first weakly augmented sample and second weakly augmented sample; computing the sample's average probability from the label confidence scores of the two weakly augmented samples; assigning all unlabeled samples whose average probability is not less than a preset threshold to the correct sample set, and the remaining unlabeled samples to the incorrect sample set;

computing the average probability of each correct sample in the correct sample set and the label confidence score of its strongly augmented counterpart; computing the cross-entropy loss of each correct sample from these two quantities, and summing to obtain the set unsupervised loss of the correct sample set;

for each incorrect sample in the incorrect sample set, computing its contrastive learning loss from the similarity between the first and second weakly augmented samples; summing the contrastive learning losses of all incorrect samples to obtain the set contrastive loss of the incorrect sample set;

taking a weighted sum of the label set loss of the labeled sample set, the set unsupervised loss of the correct sample set, and the set contrastive loss of the incorrect sample set to construct the total model loss function (see the sketch after this list);

training the initial image classification model on the training set, updating the preset threshold and re-partitioning the correct and incorrect sample sets to recompute the total model loss, until the total model loss converges, yielding the trained image classification model;

collecting a facial image to be recognized in real time, inputting it into the trained image classification model to obtain multiple output confidence scores, and taking the emotion type represented by the highest confidence score as the predicted label of the facial image to be recognized, thereby completing the construction of the emotion recognition smart contract.
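As a minimal illustration of the two final steps, the following PyTorch sketch shows one way to form the weighted total loss and to pick the highest-confidence emotion at inference time. The weight names `lambda_u` and `lambda_c` are illustrative hyperparameters, not values taken from the patent.

```python
import torch

def total_model_loss(label_set_loss, unsup_loss, contrast_loss,
                     lambda_u=1.0, lambda_c=1.0):
    # Weighted sum of the three set losses described above.
    return label_set_loss + lambda_u * unsup_loss + lambda_c * contrast_loss

@torch.no_grad()
def predict_emotion(model, image):
    # image: (1, 3, H, W); the model outputs one confidence score per emotion.
    scores = model(image)                 # shape (1, num_classes)
    return scores.argmax(dim=1).item()    # emotion type with the highest score
```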

Preferably, inputting a facial image into the image classification model to obtain its predicted label and label confidence score comprises:

inputting the facial image into the image classification model and passing it through a feature extractor to obtain the corresponding facial feature map;

inputting the facial feature map into a patch conversion module, which divides it into multiple patches of equal size to form a patch feature map, where each patch contains multiple tokens;

inputting the patch feature map into a cross multi-head attention module, which distributes all tokens of the patch feature map across multiple channels, the number of channels being equal to the number of heads in the module;

on each channel, performing the KQV computation on the tokens that share the same head index and occupy the same position in different patches to obtain each token's attention weight; after weighted summation, passing the result through a layer normalization unit, a linear layer unit, an activation function unit, another linear layer unit, and another layer normalization unit connected in series along the forward direction, to obtain the weighted feature vector;

inputting the weighted feature vector into a classifier, which applies a global average pooling unit followed by a fully connected layer and outputs the confidence score of each emotion category for the facial image sample;

taking the emotion category with the highest confidence score as the predicted label of the facial image, with its confidence score as the label confidence score of the facial image; a code sketch of this pipeline follows the list.
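To make the pipeline above concrete, here is a self-contained PyTorch sketch of an ERM-style network under stated assumptions: the truncation point of the ResNet (after `layer3` of a ResNet-18), the head count, the patch size, and the hidden width of the feed-forward block are all illustrative choices, since the patent does not fix them.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class TruncatedResNet(nn.Module):
    """Pretrained ResNet-18 truncated after layer3 (one plausible truncation)."""
    def __init__(self):
        super().__init__()
        r = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool,
                                  r.layer1, r.layer2, r.layer3)

    def forward(self, x):            # (B, 3, 224, 224)
        return self.stem(x)          # (B, 256, 14, 14)

class CrossPatchAttention(nn.Module):
    """Tokens attend only to the same-position tokens in the other patches,
    then pass through LN -> Linear -> GELU -> Linear -> LN."""
    def __init__(self, dim=256, heads=4, patch=7):
        super().__init__()
        self.patch = patch
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.LayerNorm(dim),
                                 nn.Linear(dim, 2 * dim), nn.GELU(),
                                 nn.Linear(2 * dim, dim), nn.LayerNorm(dim))

    def forward(self, x):                                # (B, C, H, W)
        B, C, H, W = x.shape
        p = self.patch
        gh, gw = H // p, W // p                          # patch grid (2 x 2 here)
        t = (x.reshape(B, C, gh, p, gw, p)
               .permute(0, 3, 5, 2, 4, 1)                # (B, p, p, gh, gw, C)
               .reshape(B, p * p, gh * gw, C))           # positions x patches
        t = t.reshape(B * p * p, gh * gw, C)             # one sequence per position
        out, _ = self.attn(t, t, t)                      # KQV across the patches
        out = self.ffn(out)
        return out.reshape(B, p * p * gh * gw, C)        # all weighted tokens

class ERM(nn.Module):
    def __init__(self, num_classes=7):
        super().__init__()
        self.backbone = TruncatedResNet()
        self.cross_attn = CrossPatchAttention()
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        tokens = self.cross_attn(self.backbone(x))
        return self.fc(tokens.mean(dim=1))               # GAP over tokens -> scores

model = ERM()
scores = model(torch.randn(2, 3, 224, 224))              # (2, 7) confidence scores
```

Grouping tokens by their within-patch position means each attention sequence has length equal to the number of patches, which is what keeps the computation cheap relative to full self-attention over all tokens.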

Preferably, the feature extractor is a truncated ResNet.

Preferably, after the predicted label of the facial image to be recognized is obtained, the method further comprises: taking the predicted label of the facial image to be recognized as its ground-truth label, constructing a labeled sample from it and the facial image, and storing the sample on the blockchain.

Preferably, for the labeled samples, the cross-entropy loss of each labeled sample is computed from its label confidence score and ground-truth label, and the cross-entropy losses of all labeled samples in the labeled sample set are summed to obtain the label set loss of the labeled sample set, expressed as:

$$L_s=\frac{1}{B}\sum_{b=1}^{B}H\big(y_b,\,p(x_b)\big);$$

where $L_s$ denotes the label set loss of the labeled sample set; $B$ denotes the training batch size; $H(\cdot,\cdot)$ denotes the cross-entropy loss; the labeled sample set is $X=\{(x_b,y_b)\}_{b=1}^{N_l}$, where $y_b$ denotes the ground-truth label of the $b$-th labeled sample $x_b$ and $N_l$ denotes the total number of labeled samples; and $p(x_b)$ denotes the label confidence score of the $b$-th labeled sample.
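A minimal sketch of this label set loss, assuming the batch sum is normalized by the batch size $B$ as in the formula above (`F.cross_entropy` averages over the batch by default):

```python
import torch.nn.functional as F

def label_set_loss(model, images, labels):
    """Mean cross-entropy H(y_b, p(x_b)) over one batch of labeled samples."""
    logits = model(images)                  # (B, num_classes) confidence scores
    return F.cross_entropy(logits, labels)  # averages over the batch of size B
```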

Preferably, the preset threshold is expressed as:

$$T_t(c)=\begin{cases}\dfrac{1}{C}, & t=0,\\[4pt] \lambda\,T_{t-1}(c)+(1-\lambda)\,\dfrac{1}{B}\displaystyle\sum_{b=1}^{B}p_c(u_b), & t>0;\end{cases}$$

where $T_t(c)$ denotes the threshold of the $c$-th emotion category in the $t$-th training round; $C$ denotes the total number of emotion categories, $c=1,\dots,C$; $t$ denotes the training round, and when $t=0$ the starting value of the preset threshold is $1/C$; $\lambda$ is a preset hyperparameter; $B$ denotes the training batch size; and $p_c(u_b)$ denotes the label confidence score that the $b$-th unlabeled sample $u_b$ belongs to the $c$-th emotion category.
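A sketch of the per-class threshold update, assuming the exponential-moving-average reading reconstructed above; `lam` stands for the smoothing hyperparameter $\lambda$:

```python
import torch

C = 7                                    # total number of emotion categories
thresholds = torch.full((C,), 1.0 / C)   # starting value 1/C at round t = 0

def update_thresholds(prev_t, probs, lam=0.9):
    """prev_t: (C,) thresholds from round t-1; probs: (B, C) label confidence
    scores of the current batch of unlabeled samples."""
    return lam * prev_t + (1.0 - lam) * probs.mean(dim=0)
```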

Preferably, performing two weak augmentations on the unlabeled samples to obtain the corresponding first and second weakly augmented samples, computing each unlabeled sample's average probability from the label confidence scores of the two weakly augmented samples, assigning all unlabeled samples whose average probability is not less than the preset threshold to the correct sample set, and assigning the remaining unlabeled samples to the incorrect sample set, comprises:

performing two weak augmentations on each unlabeled sample $u_i$ in the unlabeled sample set $U=\{u_i\}_{i=1}^{N_u}$ to obtain the corresponding first weakly augmented sample $\alpha_1(u_i)$ and second weakly augmented sample $\alpha_2(u_i)$, where $\alpha_1(\cdot)$ and $\alpha_2(\cdot)$ denote the first and second weak augmentation operations respectively and $N_u$ denotes the total number of unlabeled samples in the unlabeled sample set;

computing the average probability of the unlabeled sample from the label confidence scores of its first and second weakly augmented samples, expressed as: $\bar{p}_i=\tfrac{1}{2}\big(p(\alpha_1(u_i))+p(\alpha_2(u_i))\big)$;

if the average probability $\bar{p}_i$ of the unlabeled sample is not less than the preset threshold $T$, assigning the unlabeled sample to the correct sample set $CR=\{u_i^{cr}\}_{i=1}^{N_{cr}}$, where $N_{cr}$ denotes the total number of correct samples in the correct sample set;

if the average probability $\bar{p}_i$ of the unlabeled sample is less than the preset threshold $T$, assigning the unlabeled sample to the incorrect sample set $CN$.
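The partition step can be sketched as follows; `weak_aug1` and `weak_aug2` stand for the two weak augmentation operations $\alpha_1$ and $\alpha_2$, and `thresholds` is the per-class vector from the previous section:

```python
import torch

@torch.no_grad()
def partition_unlabeled(model, weak_aug1, weak_aug2, batch, thresholds):
    """Split one unlabeled batch into correct (CR) and incorrect (CN) samples."""
    p1 = model(weak_aug1(batch)).softmax(dim=1)
    p2 = model(weak_aug2(batch)).softmax(dim=1)
    p_avg = (p1 + p2) / 2                  # average probability per sample
    conf, pseudo = p_avg.max(dim=1)        # top score and its emotion class
    in_cr = conf >= thresholds[pseudo]     # compare with that class's threshold
    return p_avg, pseudo, in_cr            # boolean mask selects CR vs CN
```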

Preferably, computing the average probability of each correct sample in the correct sample set and the label confidence score of its strongly augmented counterpart, computing the cross-entropy loss of each correct sample from the average probability and the strongly augmented sample's label confidence score, and summing to obtain the set unsupervised loss of the correct sample set, comprises:

applying strong augmentation to the correct sample $u_i^{cr}$ to obtain the strongly augmented correct sample $\mathcal{A}(u_i^{cr})$, and computing the label confidence score $p(\mathcal{A}(u_i^{cr}))$ of the strongly augmented correct sample;

computing the cross-entropy loss $H\big(\bar{p}_i,\,p(\mathcal{A}(u_i^{cr}))\big)$ of the correct sample from its average probability $\bar{p}_i$ and the label confidence score of the strongly augmented correct sample;

summing the cross-entropy losses of all correct samples to obtain the set unsupervised loss $L_u$ of the correct sample set, expressed as:

$$L_u=\frac{1}{N_{cr}}\sum_{i=1}^{N_{cr}}H\big(\bar{p}_i,\,p(\mathcal{A}(u_i^{cr}))\big);$$

where $N_{cr}$ denotes the total number of correct samples in the correct sample set and $H(\cdot,\cdot)$ denotes the cross-entropy loss.
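A sketch of this CR unsupervised loss; since the text uses the average probability itself as the pseudo-label, the sketch treats it as a soft target (`strong_aug` stands for the strong augmentation $\mathcal{A}$):

```python
import torch.nn.functional as F

def cr_unsupervised_loss(model, strong_aug, cr_batch, p_avg_cr):
    """Cross-entropy between the soft pseudo-label (the average probability)
    and the prediction on the strongly augmented correct samples."""
    log_q = F.log_softmax(model(strong_aug(cr_batch)), dim=1)   # (N_cr, C)
    return -(p_avg_cr * log_q).sum(dim=1).mean()                # mean over N_cr
```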

Preferably, for each incorrect sample in the incorrect sample set, computing the contrastive learning loss of the incorrect sample from the similarity between its first and second weakly augmented samples, and summing the contrastive learning losses of all incorrect samples to obtain the set contrastive loss of the incorrect sample set, comprises:

obtaining the facial expression features $f_1^i$ and $f_2^i$ corresponding to the first weakly augmented sample $\alpha_1(u_i^{cn})$ and the second weakly augmented sample $\alpha_2(u_i^{cn})$, and computing the sample similarity, expressed as: $s_i=\dfrac{f_1^i\cdot f_2^i}{\lVert f_1^i\rVert\,\lVert f_2^i\rVert}$;

computing the contrastive learning loss of the incorrect sample, expressed as: $\ell_i=-\log\dfrac{\exp(s_i/\tau)}{\sum_{j=1}^{N_{cn}}\exp(s_{ij}/\tau)}$, where $s_{ij}$ denotes the similarity between the first weakly augmented view of sample $i$ and the second weakly augmented view of sample $j$;

summing the contrastive learning losses of all incorrect samples to obtain the set contrastive loss of the incorrect sample set, expressed as:

$$L_c=\frac{1}{N_{cn}}\sum_{i=1}^{N_{cn}}\ell_i;$$

where $\lVert\cdot\rVert$ denotes the norm, $N_{cn}$ denotes the total number of incorrect samples in the incorrect sample set, $i=1,\dots,N_{cn}$, and $\tau$ denotes the preset ratio (temperature).
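A sketch of the CN contrastive loss as a standard temperature-scaled (NT-Xent-style) instantiation consistent with the cosine similarity and temperature $\tau$ above; treating the other samples in the batch as negatives is an assumption, since the patent only specifies the positive pair:

```python
import torch
import torch.nn.functional as F

def cn_contrastive_loss(f1, f2, tau=0.5):
    """f1, f2: (N_cn, D) facial-expression features of the two weak views;
    the two views of one sample are the positive pair, the rest negatives."""
    z1 = F.normalize(f1, dim=1)            # divide by the L2 norm
    z2 = F.normalize(f2, dim=1)
    sim = z1 @ z2.t() / tau                # (N_cn, N_cn) scaled cosine similarities
    targets = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(sim, targets)   # -log softmax of the positive pair
```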

Preferably, after the construction of the emotion recognition smart contract is completed, the method further comprises compiling the emotion recognition smart contract and deploying it to the blockchain.

Compared with the prior art, the above technical solution of the present invention has the following beneficial effects:

The emotion recognition smart contract construction method based on cross fusion and confidence assessment described in the present invention divides the acquired facial images into labeled and unlabeled samples. For the labeled sample set, the cross-entropy losses of its labeled samples are summed to obtain the label set loss. For the unlabeled sample set, the average probability of each unlabeled sample is compared with a preset threshold to divide the samples into correct and incorrect samples; the cross-entropy losses of the correct samples and the contrastive learning losses of the incorrect samples are summed separately to obtain the set unsupervised loss of the correct sample set and the set contrastive loss of the incorrect sample set. A weighted sum of these three losses constitutes the total model loss function used to train the image classification model. Because the preset threshold is adaptively updated with the number of training rounds, unlabeled data is used more effectively during training, model performance improves, and a significant performance gain is achieved on facial expression recognition, raising the prediction accuracy of the image classification model.

The present invention takes the predicted label of the facial image to be recognized as its ground-truth label, constructs a labeled sample from it and the image, stores the sample on the blockchain, and updates the labeled sample set. This significantly improves the timeliness and accuracy of facial image processing while ensuring tamper resistance and a high level of data security, providing the wellness-care industry with a more scientific and reliable means of emotion monitoring and management and thereby comprehensively improving service quality and the quality of life of older adults.

The present invention uses a truncated ResNet as the feature extractor and reduces the risk of overfitting by using pretrained weights. During feature extraction the truncated ResNet already accounts for the importance of local features, so global modeling does not need to examine details and local features too deeply. A patch conversion module divides the facial feature map into patches of equal size, each containing several tokens; each token only needs to consider its relation to the tokens at the same position in other patches. Focusing on local patches filters out irrelevant information more effectively, concentrates on features useful for expression recognition, and strengthens the ability to distinguish between expressions. A multi-head cross-attention mechanism extracts feature vectors from the patch feature map: each head computes attention scores independently, and the results are concatenated and linearly transformed to capture information from different subspaces, further improving prediction accuracy. The method thus improves computational efficiency while maintaining performance, raising the accuracy of intelligent image recognition in the wellness-care industry.

BRIEF DESCRIPTION OF THE DRAWINGS

To make the content of the present invention easier to understand clearly, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings, in which:

FIG. 1 is a flowchart of the steps of the method for constructing an emotion recognition smart contract based on cross fusion and confidence assessment provided by the present invention;

FIG. 2 is a schematic diagram of the structure of the image classification model used in the present invention;

FIG. 3 is a diagram of the theoretical architecture of the emotion recognition model proposed in this embodiment;

FIG. 4 is a schematic diagram of the overall architecture of the core MUDSS-FER algorithm proposed in this embodiment;

FIG. 5 shows the prediction confusion matrix of the SimMatch algorithm;

FIG. 6 shows the prediction confusion matrix of the MixMatch algorithm;

FIG. 7 shows the prediction confusion matrix of the FlexMatch algorithm;

FIG. 8 shows the prediction confusion matrix of the FixMatch algorithm;

FIG. 9 shows the prediction confusion matrix of the Ada-CM algorithm;

FIG. 10 shows the prediction confusion matrix of the MUDSS-FER algorithm provided in this embodiment;

FIG. 11 shows the ROC curves of different emotion categories under the SimMatch algorithm;

FIG. 12 shows the ROC curves of different emotion categories under the MixMatch algorithm;

FIG. 13 shows the ROC curves of different emotion categories under the FlexMatch algorithm;

FIG. 14 shows the ROC curves of different emotion categories under the FixMatch algorithm;

FIG. 15 shows the ROC curves of different emotion categories under the Ada-CM algorithm;

FIG. 16 shows the ROC curves of different emotion categories under the MUDSS-FER algorithm provided in this embodiment.

DETAILED DESCRIPTION

The present invention is further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement it; the embodiments given do not limit the present invention.

Referring to FIG. 1, which shows the flowchart of the method for constructing an emotion recognition smart contract based on cross fusion and confidence assessment of the present invention, the specific steps include:

S101: obtaining multiple facial images stored on a blockchain, dividing them into labeled or unlabeled samples according to whether they carry ground-truth labels, and constructing a labeled sample set and an unlabeled sample set;

S102: inputting each facial image into the initial image classification model to obtain its predicted label and label confidence score;

S103: for the labeled samples, computing the cross-entropy loss of each labeled sample from its label confidence score and ground-truth label, and summing the cross-entropy losses of all labeled samples in the labeled sample set to obtain the label set loss of the labeled sample set, expressed as:

$$L_s=\frac{1}{B}\sum_{b=1}^{B}H\big(y_b,\,p(x_b)\big);$$

where $L_s$ denotes the label set loss of the labeled sample set; $B$ denotes the training batch size; $H(\cdot,\cdot)$ denotes the cross-entropy loss; the labeled sample set is $X=\{(x_b,y_b)\}_{b=1}^{N_l}$, where $y_b$ denotes the ground-truth label of the $b$-th labeled sample $x_b$ and $N_l$ denotes the total number of labeled samples; and $p(x_b)$ denotes the label confidence score of the $b$-th labeled sample.

S104: for each unlabeled sample, performing two weak augmentations to obtain the corresponding first and second weakly augmented samples; computing the sample's average probability from the label confidence scores of the two weakly augmented samples; assigning all unlabeled samples whose average probability is not less than the preset threshold to the correct sample set, and the remaining unlabeled samples to the incorrect sample set;

S105: computing the average probability of each correct sample in the correct sample set and the label confidence score of its strongly augmented counterpart; computing the cross-entropy loss of each correct sample from these two quantities, and summing to obtain the set unsupervised loss of the correct sample set;

S106: for each incorrect sample in the incorrect sample set, computing its contrastive learning loss from the similarity between the first and second weakly augmented samples; summing the contrastive learning losses of all incorrect samples to obtain the set contrastive loss of the incorrect sample set;

S107: taking a weighted sum of the label set loss of the labeled sample set, the set unsupervised loss of the correct sample set, and the set contrastive loss of the incorrect sample set to construct the total model loss function;

S108: training the initial image classification model on the training set, updating the preset threshold and re-partitioning the correct and incorrect sample sets to recompute the total model loss, until the total model loss converges, yielding the trained image classification model;

S109: collecting a facial image to be recognized in real time, inputting it into the trained image classification model to obtain multiple output confidence scores, and taking the emotion type represented by the highest confidence score as the predicted label of the image, completing the construction of the emotion recognition smart contract.

Specifically, after the construction of the emotion recognition smart contract is completed, the contract is also compiled and then deployed to the blockchain.
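The patent does not name a specific chain or toolchain; as one possible realization, the following web3.py sketch deploys a pre-compiled contract to an Ethereum-style node. The node URL and file names are placeholders, and `abi.json` / `bytecode.txt` are assumed outputs of compiling the contract beforehand (e.g. with solc):

```python
import json
from web3 import Web3   # web3.py v6

with open("abi.json") as f:          # placeholder artifacts from compilation
    abi = json.load(f)
with open("bytecode.txt") as f:
    bytecode = f.read().strip()

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))   # placeholder node
Contract = w3.eth.contract(abi=abi, bytecode=bytecode)
tx_hash = Contract.constructor().transact({"from": w3.eth.accounts[0]})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("contract deployed at", receipt.contractAddress)
```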

Specifically, this embodiment computes the gradient of the total model loss with respect to the model parameters by backpropagation and updates the parameters with an optimization algorithm (e.g., SGD or Adam) according to the computed gradients, until the set number of iterations is reached or the total model loss converges, yielding the trained image classification model.
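The training procedure just described reduces to a short loop; this sketch assumes `total_loss_fn` combines the three losses sketched earlier (an illustrative helper, not named in the patent), with illustrative epoch count and learning rate:

```python
import torch

def train(model, labeled_loader, unlabeled_loader, total_loss_fn,
          epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)   # SGD also fits the text
    for _ in range(epochs):
        for (x_l, y_l), x_u in zip(labeled_loader, unlabeled_loader):
            loss = total_loss_fn(model, x_l, y_l, x_u)  # L_s + λu·L_u + λc·L_c
            opt.zero_grad()
            loss.backward()   # backpropagate the total-loss gradient
            opt.step()        # optimizer update of the model parameters
```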

Specifically, in this embodiment of the present invention, inputting all facial images into the image classification model to obtain the corresponding predicted labels and label confidence scores comprises:

inputting the facial image into the image classification model and passing it through the feature extractor to obtain the corresponding facial feature map;

inputting the facial feature map into the patch conversion module, which divides it into multiple patches of equal size to form a patch feature map, where each patch contains multiple tokens;

inputting the patch feature map into the cross multi-head attention module, which distributes all tokens of the patch feature map across multiple channels, the number of channels being equal to the number of heads in the module;

on each channel, performing the KQV computation on the tokens that share the same head index and occupy the same position in different patches to obtain each token's attention weight; after weighted summation, passing the result through a layer normalization unit, a linear layer unit, an activation function unit, another linear layer unit, and another layer normalization unit connected in series along the forward direction, to obtain the weighted feature vector;

inputting the weighted feature vector into the classifier, which applies a global average pooling unit followed by a fully connected layer and outputs the confidence score of each emotion category for the facial image sample;

taking the emotion category with the highest confidence score as the predicted label of the facial image, with its confidence score as the label confidence score of the facial image.

Whether the initial or the trained image classification model is used, the processing of an input facial image is the same. FIG. 2 shows the structure of the image classification model used in the present invention. The image classification model in this embodiment is the expression recognition model (ERM), which is built on a truncated pretrained ResNet and integrates the proposed patch conversion block, attention block, and classifier. Guided by prior research and domain knowledge, expression recognition typically requires sensitive observation of key facial regions such as the eyes and mouth, so this embodiment adopts a Vision Transformer as the key component for expression recognition. Since facial expression recognition does not require many feature extraction layers, a truncated ResNet is chosen as the feature extractor, and pretrained weights are used to reduce the risk of overfitting. On the one hand, from a visual perspective, different parts of the face move in coordination during expression changes; on the other hand, the ResNet already accounts for the importance of local features during extraction, so global modeling does not need to examine details and local features too deeply. To further optimize feature extraction, this embodiment adopts the patch conversion block: the feature map is divided into patches of equal size, each containing several tokens. In previous Vision Transformer models, each token had to consider its relation to all other tokens during self-attention; after the patch conversion block, each token only needs to consider the tokens at the same position in the other patches. The patch conversion block lets the ERM focus on local patches, filtering out irrelevant information more effectively, concentrating on features useful for expression recognition, strengthening the ability to distinguish between expressions, and improving computational efficiency while maintaining performance. Next, the feature map processed by the patch conversion block is globally modeled by the attention block, which computes an attention weight for each token; these weights indicate the regions the model attends to when classifying the image. Finally, the weighted features are fed into the classifier to obtain the model's expression prediction. This pipeline combines local features with global information, providing an effective feature representation and modeling approach for expression recognition.

This embodiment of the present invention uses a multi-head cross-attention mechanism to extract feature vectors. Unlike self-attention, which computes attention weights within a single input sequence, cross-attention computes attention between two different input sequences: one sequence supplies the query (Q), while the other supplies the key (K) and value (V). In a traditional attention mechanism, the model computes relations between inputs through a single attention head; the multi-head mechanism instead introduces multiple heads, each computing attention scores independently, and the results are concatenated and linearly transformed, capturing information from different subspaces and improving feature extraction accuracy.
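The query/key-value split is easy to see in code; this minimal sketch (with illustrative shapes) contrasts cross-attention with self-attention, where all three inputs would be the same sequence:

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
seq_a = torch.randn(1, 10, 64)   # sequence A supplies the Query
seq_b = torch.randn(1, 49, 64)   # sequence B supplies the Key and Value
out, weights = attn(seq_a, seq_b, seq_b)   # cross-attention: out is (1, 10, 64)
# Self-attention would instead be: attn(seq_a, seq_a, seq_a)
```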

In this embodiment of the present invention, after the predicted label of the facial image to be recognized is obtained, the method further comprises: taking the predicted label of the facial image to be recognized as its ground-truth label, constructing a labeled sample from it and the facial image, and storing the sample on the blockchain.

Specifically, in step S104, dividing the unlabeled sample set into the correct sample set and the incorrect sample set comprises:

S104-1: performing two weak augmentations on each unlabeled sample $u_i$ in the unlabeled sample set $U=\{u_i\}_{i=1}^{N_u}$ to obtain the corresponding first weakly augmented sample $\alpha_1(u_i)$ and second weakly augmented sample $\alpha_2(u_i)$, where $\alpha_1(\cdot)$ and $\alpha_2(\cdot)$ denote the first and second weak augmentation operations respectively and $N_u$ denotes the total number of unlabeled samples in the unlabeled sample set;

S104-2: computing the average probability of the unlabeled sample from the label confidence scores of its first and second weakly augmented samples, expressed as: $\bar{p}_i=\tfrac{1}{2}\big(p(\alpha_1(u_i))+p(\alpha_2(u_i))\big)$;

S104-3: if the average probability $\bar{p}_i$ of the unlabeled sample is not less than the preset threshold $T$, assigning the unlabeled sample to the correct sample set $CR=\{u_i^{cr}\}_{i=1}^{N_{cr}}$, where $N_{cr}$ denotes the total number of correct samples in the correct sample set;

S104-4: if the average probability $\bar{p}_i$ of the unlabeled sample is less than the preset threshold $T$, assigning the unlabeled sample to the incorrect sample set $CN$.

The preset threshold is expressed as:

$$T_t(c)=\begin{cases}\dfrac{1}{C}, & t=0,\\[4pt] \lambda\,T_{t-1}(c)+(1-\lambda)\,\dfrac{1}{B}\displaystyle\sum_{b=1}^{B}p_c(u_b), & t>0;\end{cases}$$

where $T_t(c)$ denotes the threshold of the $c$-th emotion category in the $t$-th training round; $C$ denotes the total number of emotion categories, $c=1,\dots,C$; $t$ denotes the training round, and when $t=0$ the starting value of the preset threshold is $1/C$; $\lambda$ is a preset hyperparameter that controls the smoothness of the similarity distribution; $B$ denotes the training batch size; and $p_c(u_b)$ denotes the label confidence score that the $b$-th unlabeled sample $u_b$ belongs to the $c$-th emotion category.

Specifically, in step S105, obtaining the set unsupervised loss of the correct sample set comprises:

S105-1: applying strong augmentation to the correct sample $u_i^{cr}$ to obtain the strongly augmented correct sample $\mathcal{A}(u_i^{cr})$, and computing its label confidence score $p(\mathcal{A}(u_i^{cr}))$;

S105-2: computing the cross-entropy loss $H\big(\bar{p}_i,\,p(\mathcal{A}(u_i^{cr}))\big)$ of the correct sample from its average probability $\bar{p}_i$ and the label confidence score of the strongly augmented correct sample;

S105-3: summing the cross-entropy losses of all correct samples to obtain the set unsupervised loss $L_u$ of the correct sample set, expressed as:

$$L_u=\frac{1}{N_{cr}}\sum_{i=1}^{N_{cr}}H\big(\bar{p}_i,\,p(\mathcal{A}(u_i^{cr}))\big);$$

where $N_{cr}$ denotes the total number of correct samples in the correct sample set and $H(\cdot,\cdot)$ denotes the cross-entropy loss.

Specifically, in step S106, obtaining the set contrastive loss of the incorrect sample set comprises:

S106-1: obtaining the facial expression features $f_1^i$ and $f_2^i$ corresponding to the first weakly augmented sample $\alpha_1(u_i^{cn})$ and the second weakly augmented sample $\alpha_2(u_i^{cn})$, and computing the sample similarity, expressed as: $s_i=\dfrac{f_1^i\cdot f_2^i}{\lVert f_1^i\rVert\,\lVert f_2^i\rVert}$;

S106-2: computing the contrastive learning loss of the incorrect sample, expressed as: $\ell_i=-\log\dfrac{\exp(s_i/\tau)}{\sum_{j=1}^{N_{cn}}\exp(s_{ij}/\tau)}$;

S106-3: summing the contrastive learning losses of all incorrect samples to obtain the set contrastive loss of the incorrect sample set, expressed as:

$$L_c=\frac{1}{N_{cn}}\sum_{i=1}^{N_{cn}}\ell_i;$$

where $\lVert\cdot\rVert$ denotes the norm, $N_{cn}$ denotes the total number of incorrect samples in the incorrect sample set, $i=1,\dots,N_{cn}$, and $\tau$ denotes the preset ratio (temperature).

The emotion recognition smart contract construction method based on cross fusion and confidence assessment described in the present invention divides the acquired facial images into labeled and unlabeled samples. For the labeled sample set, the cross-entropy losses of its labeled samples are summed to obtain the label set loss. For the unlabeled sample set, the average probability of each unlabeled sample is compared with a preset threshold to divide the samples into correct and incorrect samples; the cross-entropy losses of the correct samples and the contrastive learning losses of the incorrect samples are summed separately to obtain the set unsupervised loss of the correct sample set and the set contrastive loss of the incorrect sample set. A weighted sum of the label set loss, the set unsupervised loss, and the set contrastive loss constitutes the total model loss function used to train the image classification model. Because the preset threshold is adaptively updated with the number of training rounds, unlabeled data is used more effectively during training, model performance improves, and a significant performance gain is achieved on facial expression recognition, raising the prediction accuracy of the image classification model. The present invention further takes the predicted label of the facial image to be recognized as its ground-truth label, constructs a labeled sample from it and the image, stores the sample on the blockchain, and updates the labeled sample set. This significantly improves the timeliness and accuracy of facial image processing while ensuring tamper resistance and a high level of data security, providing the wellness-care industry with a more scientific and reliable means of emotion monitoring and management and thereby comprehensively improving service quality and the quality of life of older adults.

Based on the above embodiments, this embodiment proposes a new emotion recognition model for applying blockchain technology in the wellness-care industry; its theoretical architecture is shown in FIG. 3. First, existing facial expression data of older adults is loaded into the newly constructed blockchain system through the consensus mechanism. The data on the blockchain is then used to train the core MUDSS-FER method of this embodiment, and the trained ERM model is used to create a smart contract on the blockchain, which is published to the chain through the consensus mechanism. This constitutes the construction process of the blockchain.

When new expression data is about to be loaded into the blockchain, the trained ERM model examines it. If the model judges the data to be emotionally stable, it is loaded into the blockchain through the consensus mechanism; otherwise it is judged to be emotionally abnormal data, which is sent to the blockchain and simultaneously reported to professionals, so that the older adult who produced it can receive a medical diagnosis. As the above process shows, the core of the method lies in the construction of MUDSS-FER.

Referring to FIG. 4, which shows the overall architecture of the core MUDSS-FER algorithm proposed in this embodiment: first, the ERM traverses all samples to obtain each sample's confidence score. All samples fall into two parts, labeled and unlabeled. For labeled samples, MUDSS-FER computes the cross-entropy loss from the sample's ground-truth label and the model's predicted label. For unlabeled samples, MUDSS-FER adaptively modifies the threshold according to the samples' probability distribution. Meanwhile, depending on whether a sample's score reaches the threshold of its pseudo-label class, the samples are divided into CR and CN and handled differently: if a sample's score reaches the threshold, the sample is judged to be classified correctly by the model and a cross-entropy loss is computed; otherwise it is judged to be classified incorrectly and a contrastive learning loss is computed. In this way, MUDSS-FER fully exploits both labeled and unlabeled data, improving model performance and generalization.

After MUDSS-FER initializes the thresholds, it adjusts them automatically during training using the predictions of the labeled data over the C classes. As training proceeds, the model gradually becomes more confident and its classification ability strengthens, so adjusting the thresholds distinguishes more accurately whether a sample's classification result is trustworthy. Moreover, since different expressions vary in classification difficulty, computing the thresholds from the soft labels of the labeled data improves the model's ability to recognize samples. MUDSS-FER divides the unlabeled data into two parts: CR, whose confidence scores reach the threshold, and CN, whose scores do not. For samples in CR, the cross-entropy loss is computed between the strongly augmented samples and the pseudo-labels of the weakly augmented samples. For samples in CN, since differently processed versions of the same sample should receive the same pseudo-label, contrastive learning is applied.

Assume a labeled sample set $X=\{(x_b,y_b)\}_{b=1}^{N_l}$, where $N_l$ is the number of labeled samples. MUDSS-FER judges whether the pseudo-label of an unlabeled sample is trustworthy by adjusting class-specific thresholds. Most recent advances in semi-supervised learning use fixed thresholds; some studies adjust the threshold automatically, but the starting values differ. The starting value of the threshold affects training performance: too high a value leaves too few trustworthy unlabeled samples, so the starting value is set to $1/C$, where $C$ denotes the total number of emotion types. In this embodiment $C=7$, the types being happiness, sadness, surprise, fear, contempt, disgust, and anger.

The threshold of class $c$ in round $t$ of training is: $T_t(c)=\lambda\,T_{t-1}(c)+(1-\lambda)\,\frac{1}{B}\sum_{b=1}^{B}p_c(u_b)$;

and the loss of the labeled data is: $L_s=\frac{1}{B}\sum_{b=1}^{B}H\big(y_b,\,p(x_b)\big)$;

where $B$ is the training batch size, $p(x_b)$ is the output probability obtained after inputting sample $x_b$ into the model, and $H(\cdot,\cdot)$ is the cross-entropy loss.

For the unlabeled sample set $U$, each sample $u_i$ is first weakly augmented to generate $\alpha_1(u_i)$ and $\alpha_2(u_i)$, and the ERM model is used to extract facial features and generate probabilities. The average probability distribution of the weakly augmented samples is computed as: $\bar{p}_i=\tfrac{1}{2}\big(p(\alpha_1(u_i))+p(\alpha_2(u_i))\big)$.

Suppose $\max(\bar{p}_i)$ is the maximum value of this distribution, representing the probability that sample $u_i$ belongs to class $c$; it is compared with $T_t(c)$, the threshold of that class. Taking the threshold as the dividing point, unlabeled samples whose confidence score is above the threshold are dynamically assigned to the correct sample set CR, and unlabeled samples whose confidence score is below the threshold are assigned to the incorrect sample set CN.

The samples in CR are high-confidence samples whose classification results carry a degree of credibility, so the average probability $\bar{p}_i$ is taken as the sample's pseudo-label. The sample is strongly augmented to $\mathcal{A}(u_i^{cr})$; by consistency regularization its prediction should match the pseudo-label, so the unsupervised loss of CR is: $L_u=\frac{1}{N_{cr}}\sum_{i=1}^{N_{cr}}H\big(\bar{p}_i,\,p(\mathcal{A}(u_i^{cr}))\big)$, where $N_{cr}$ is the number of samples in CR.

The samples in CN have low confidence, so the model's predictions for them are not convincing and cannot guide learning through a cross-entropy loss. Contrastive learning holds that, in the representation space, similar samples should lie close together while dissimilar samples lie far apart. For a sample $u_i^{cn}$, the weak augmentations $\alpha_1(u_i^{cn})$ and $\alpha_2(u_i^{cn})$ yield facial expression features $f_1^i$ and $f_2^i$, whose similarity is: $s_i=\frac{f_1^i\cdot f_2^i}{\lVert f_1^i\rVert\,\lVert f_2^i\rVert}$. From this similarity measure, the contrastive loss of the sample is obtained as $\ell_i=-\log\frac{\exp(s_i/\tau)}{\sum_{j=1}^{N_{cn}}\exp(s_{ij}/\tau)}$, with $\tau$ the preset temperature. This further strengthens the model's ability to discriminate features without adding any new trainable parameters.

This embodiment of the present invention combines the MUDSS-FER method with smart contract technology, so that the unstable emotional data of older adults in the wellness-care industry is processed before being stored in the blockchain. This significantly improves the timeliness and accuracy of emotional data processing, ensures tamper resistance and high security of the data, and provides the wellness-care industry with more scientific and reliable emotion monitoring and management, comprehensively improving service quality and the quality of life of older adults. The key to the semi-supervised learning used here is that the threshold reflects the model's learning state, and the learning effect is estimated by the model's prediction confidence. Recent semi-supervised algorithms select data with high confidence scores and adjust one fixed threshold for all classes, clearly ignoring inter-class and intra-class differences among samples. This embodiment proposes the adaptive-threshold MUDSS-FER method and the ERM model for semi-supervised learning of facial expressions. First, the lightweight ERM model is used to extract features and classify facial expressions; combining the advantages of the Transformer, it effectively captures key image features with a low parameter count and computational cost, reducing resource consumption while maintaining performance. Second, an automatically updated threshold mechanism divides unlabeled samples into two classes, applying pseudo-labeling and contrastive learning respectively. This mechanism lets the model adaptively adjust the confidence of pseudo-labels, using unlabeled data more effectively for training. Combining the ERM model with the adaptive-threshold MUDSS-FER method makes full use of unlabeled data, improves model performance, and achieves significant performance gains on facial expression recognition tasks.

Based on the above embodiments, this embodiment verifies the effectiveness of the proposed emotion recognition smart contract construction method based on cross fusion and confidence assessment on three datasets (RAF-DB, FER2013 and FERPlus), with AffectNet additionally used for the confusion-matrix and ROC analyses below. RAF-DB is a large-scale dataset widely used in the facial expression field, containing about 30,000 facial images collected from real emotional-expression scenes and drawn from diverse groups of people. The images were annotated manually by roughly 315 university students and staff members, ensuring the authenticity and diversity of the data, so that models trained on it generalize better to real scenes. FER2013 is a classic facial expression recognition dataset created by researchers at the University of Toronto in Canada. It contains about 36,000 grayscale images of size 48x48 pixels, of which roughly 29,000 are used for training and the rest for testing. FER2013 is more challenging than RAF-DB because some of its labels may be inaccurate and some images do not even contain a human face. FERPlus is a dataset improved on the basis of FER2013, aiming to increase the accuracy and diversity of facial expression recognition: it relabels the FER2013 data and expands the label set from the original seven basic emotions to eight by adding contempt, while also augmenting the data to improve diversity and the model's generalization ability. These datasets are used in this embodiment to evaluate the performance of the proposed MUDSS-FER model and to compare it with other semi-supervised learning methods, demonstrating the superiority of the proposed method. The AffectNet dataset is a widely used facial expression dataset for emotion analysis and expression recognition; with more than one million facial images, it is one of the largest facial expression datasets to date.

Using the experimental environment specified in Table 1, the effectiveness of the proposed emotion recognition smart contract construction method based on cross fusion and confidence assessment is verified on the RAF-DB, FER2013 and FERPlus datasets. Specifically, in each dataset 4000 samples are randomly selected as labeled samples, and the remaining training data are treated as unlabeled samples.
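A minimal sketch of this labeled/unlabeled split; the dataset object, seed and helper name are illustrative assumptions:

```python
import torch

def split_labeled_unlabeled(dataset, num_labeled=4000, seed=0):
    """Randomly pick num_labeled indices as labeled; the rest are unlabeled."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(dataset), generator=g)
    return perm[:num_labeled].tolist(), perm[num_labeled:].tolist()
```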

Table 1 Experimental environment specifications

    Experimental environment    Specification
    Processor                   Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz
    Graphics card               Nvidia GeForce RTX 3090 GPU
    Development language        Python 3.8
    Development system          Ubuntu 18.04
    Development framework       PyTorch 1.10.0 + CUDA 11.1

To evaluate the experimental results of this embodiment, four commonly used evaluation metrics are adopted: Accuracy, Precision, Recall and F1-score. These are briefly introduced below.

① Accuracy is the percentage of correctly predicted samples among all samples, computed as:

Accuracy = (TP + TN) / (TP + TN + FP + FN);

where TP denotes samples predicted positive that are actually positive (correct predictions), FP samples predicted positive that are actually negative (wrong), FN samples predicted negative that are actually positive (wrong), and TN samples predicted negative that are actually negative (correct). On class-balanced datasets, accuracy is an important measure of model quality.

② Precision is the probability that a sample predicted to be positive is actually positive, expressed as:

Precision = TP / (TP + FP);

Precision is defined with respect to the prediction results: it measures how accurate the positive predictions are.

③ Recall is defined with respect to the original samples: it is the probability that an actually positive sample is predicted positive, measuring how completely the actual positives are recovered. It is expressed as:

Recall = TP / (TP + FN);

④ Neither precision nor recall alone evaluates model performance well, since the two pull against each other. To consider both at once, a threshold is chosen at which the two are balanced as high as possible, which motivates the F1-score, computed as:

F1-score = 2 × Precision × Recall / (Precision + Recall);
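For concreteness, a small helper computing all four metrics from raw confusion counts (binary case; for multi-class evaluation the same computation would be macro-averaged over classes):

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the four metrics introduced above from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1
```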

Table 2 compares the proposed emotion recognition smart contract construction method based on cross fusion and confidence assessment with a variety of existing algorithms, including several semi-supervised learning (SSL) algorithms and fully supervised algorithms. Table 2 clearly shows that the proposed MUDSS-FER method achieves the best performance, even surpassing the fully supervised algorithms. FixMatch, SimMatch, MixMatch and FlexMatch are representative semi-supervised algorithms, but on the expression recognition task they perform slightly worse than Ada-CM. This result is reasonable, because Ada-CM makes full use of unlabeled data during training and thereby improves generalization. The proposed MUDSS-FER method in turn outperforms Ada-CM, which demonstrates the effectiveness of adjusting the threshold initialization. Moreover, MUDSS-FER automatically adjusts the threshold using the probability averages of the labeled data, strengthening the model's expression classification ability. Compared with recent advanced supervised expression recognition algorithms trained with the same amount of labeled data, the proposed semi-supervised model performs better on multiple metrics.

Table 2 Comparison of algorithm metrics

To further analyze model performance, the prediction confusion matrices of several semi-supervised algorithms were evaluated; the results are shown in Figures 5 to 10, whose axes correspond to the emotion classes Happiness, Sadness, Surprise, Fear, Neutral, Disgust and Anger. Figure 5 shows the prediction confusion matrix of SimMatch, Figure 6 that of MixMatch, Figure 7 that of FlexMatch, Figure 8 that of FixMatch, Figure 9 that of Ada-CM, and Figure 10 that of the MUDSS-FER algorithm provided in this embodiment. As Figures 5 to 10 show, compared with the other algorithms MUDSS-FER improves the recognition accuracy on all classes of the AffectNet test set except Anger and Fear. Although its accuracy on Anger is lower than MixMatch's and its accuracy on Fear is lower than the other algorithms', MUDSS-FER achieves the better overall performance on AffectNet, because its accuracy on Neutral, Surprise and Sadness is clearly higher than that of the other algorithms.
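As an illustration of how such matrices can be produced, a minimal sketch using scikit-learn follows; the class ordering and the prediction arrays are assumptions:

```python
from sklearn.metrics import confusion_matrix

CLASSES = ['Happiness', 'Sadness', 'Surprise', 'Fear',
           'Neutral', 'Disgust', 'Anger']

def normalized_confusion(y_true, y_pred):
    """Row-normalized confusion matrix: entry (i, j) is the fraction of
    class-i samples that the model predicted as class j."""
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(CLASSES))))
    return cm / cm.sum(axis=1, keepdims=True)
```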

The confusion matrix provides detailed classification results, while the ROC curve shows the classifier's performance on each class. Figures 11 to 16 show the per-class ROC curves of the comparison algorithms and of MUDSS-FER: Figure 11 for SimMatch, Figure 12 for MixMatch, Figure 13 for FlexMatch, Figure 14 for FixMatch, Figure 15 for Ada-CM, and Figure 16 for the MUDSS-FER algorithm provided in this embodiment. Each emotion class (Happiness, Sadness, Surprise, Fear, Neutral, Disgust, Anger) corresponds to one curve in each plot. As Figures 11 to 16 show, the per-class ROC curves of MUDSS-FER are more tightly clustered than those of the other algorithms, indicating that its classification performance is relatively consistent across classes, with no significant deviation between curves; the model thus shows no obvious preference for, or weakness on, any class of samples. The macro-averaged ROC curve summarizes the classification performance over all classes; for MUDSS-FER it overlaps strongly with the individual per-class curves, reflecting good overall generalization.
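A minimal sketch of how the per-class and macro-averaged ROC curves can be computed with scikit-learn; the one-vs-rest scores are assumed to come from the model's softmax outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

def per_class_roc(y_true, y_score, n_classes=7):
    """Per-class ROC curves plus a macro-average.

    y_true: (N,) integer labels; y_score: (N, n_classes) softmax scores.
    """
    y_bin = label_binarize(y_true, classes=list(range(n_classes)))
    curves, aucs = {}, {}
    for c in range(n_classes):
        fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
        curves[c] = (fpr, tpr)
        aucs[c] = auc(fpr, tpr)
    # Macro-average: mean TPR over a common FPR grid.
    grid = np.linspace(0, 1, 100)
    mean_tpr = np.mean([np.interp(grid, *curves[c])
                        for c in range(n_classes)], axis=0)
    return curves, aucs, (grid, mean_tpr)
```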

Based on the above embodiments, an embodiment of the present invention provides an emotion recognition smart contract construction device based on cross fusion and confidence assessment, the device specifically comprising:

A sample set construction module 100, configured to obtain multiple facial images stored on a blockchain, divide them into labeled and unlabeled samples according to whether a ground-truth label exists, and construct a labeled sample set and an unlabeled sample set;

A confidence score calculation module 200, configured to input each facial image into the initial image classification model and obtain the corresponding predicted label and label confidence scores;

A labeled sample calculation module 300, configured to compute, for each labeled sample, a cross-entropy loss from its label confidence scores and ground-truth label, and to sum the cross-entropy losses of all labeled samples to obtain the label set loss of the labeled sample set;

An unlabeled sample calculation module 400, configured to: apply two weak augmentations to each unlabeled sample to obtain a first and a second weakly augmented sample; compute the sample's average probability from the label confidence scores of the two weakly augmented samples; assign every unlabeled sample whose average probability is not below a preset threshold to the correct sample set and the remaining unlabeled samples to the incorrect sample set; for each correct sample, compute the cross-entropy loss between its average probability and the label confidence scores of its strongly augmented counterpart, and sum these losses to obtain the set unsupervised loss of the correct sample set; and for each incorrect sample, compute a contrastive learning loss from the similarity between its first and second weakly augmented samples, and sum these losses to obtain the set contrastive loss of the incorrect sample set;
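A minimal PyTorch sketch of the threshold-based partition performed by this module; tensor and function names are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def partition_unlabeled(model, x_w1, x_w2, threshold):
    """Split an unlabeled batch into the correct (CR) and incorrect (CN)
    sets by thresholding the averaged top-1 probability of two weak views."""
    with torch.no_grad():
        p_bar = (F.softmax(model(x_w1), dim=1) +
                 F.softmax(model(x_w2), dim=1)) / 2
    conf = p_bar.max(dim=1).values
    cr_mask = conf >= threshold          # high confidence -> pseudo-labeling
    return cr_mask, ~cr_mask, p_bar      # CN gets contrastive learning
```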

A model training module 500, configured to construct a total model loss function as a weighted sum of the label set loss of the labeled sample set, the set unsupervised loss of the correct sample set and the set contrastive loss of the incorrect sample set, and to train the initial image classification model on the training set, updating the preset threshold and re-partitioning the correct and incorrect sample sets when computing the total loss, until the total loss function converges, thereby obtaining the trained image classification model;
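A minimal sketch of the weighted total loss; the weights lambda_u and lambda_c are assumptions, since the embodiment specifies only a weighted summation:

```python
def total_loss(loss_labeled, loss_cr, loss_cn, lambda_u=1.0, lambda_c=1.0):
    """Weighted sum of the three losses: labeled cross-entropy, CR
    unsupervised loss, and CN contrastive loss."""
    return loss_labeled + lambda_u * loss_cr + lambda_c * loss_cn
```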

A recognition module 600, configured to acquire a facial image to be recognized in real time, input it into the trained image classification model, obtain multiple output confidence scores, and take the emotion type corresponding to the highest confidence score as the predicted label of the facial image to be recognized.
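A minimal inference sketch for this module; the class ordering follows the seven emotions used in the figures and is an assumption, as is the input tensor layout:

```python
import torch
import torch.nn.functional as F

EMOTIONS = ['Happiness', 'Sadness', 'Surprise', 'Fear',
            'Neutral', 'Disgust', 'Anger']

@torch.no_grad()
def predict_emotion(model, image_tensor):
    """Return the emotion with the highest confidence score.

    Assumes model.eval() has been called and image_tensor has shape (C, H, W).
    """
    probs = F.softmax(model(image_tensor.unsqueeze(0)), dim=1).squeeze(0)
    return EMOTIONS[probs.argmax().item()], probs.max().item()
```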

The emotion recognition smart contract construction device based on cross fusion and confidence assessment of this embodiment is used to implement the aforementioned method, so its specific implementation follows the method embodiments described above. Specifically, the sample set construction module 100, the confidence score calculation module 200 and the labeled sample calculation module 300 implement steps S101, S102 and S103 of the method; the unlabeled sample calculation module 400 implements steps S104, S105 and S106; the model training module 500 implements steps S107 and S108; and the recognition module 600 implements step S109. For details, reference is made to the descriptions of the corresponding embodiments, which are not repeated here.

In the emotion recognition smart contract construction method based on cross fusion and confidence assessment of the present invention, the acquired facial images are divided into labeled and unlabeled samples. For the labeled sample set, the cross-entropy losses of the individual labeled samples are summed to obtain the label set loss. For the unlabeled sample set, the average probability of each unlabeled sample is compared with a preset threshold to divide the samples into correct and incorrect samples; the cross-entropy losses of the correct samples and the contrastive learning losses of the incorrect samples are then summed separately to obtain the set unsupervised loss of the correct sample set and the set contrastive loss of the incorrect sample set. The label set loss, the set unsupervised loss and the set contrastive loss are combined by weighted summation into the total model loss function used to train the image classification model. The preset threshold is adaptively updated over the course of training, so unlabeled data are exploited more effectively, model performance improves, and a significant gain in prediction accuracy is achieved on the facial expression recognition task. The present invention further takes the predicted label of a recognized facial image as its ground-truth label, stores the resulting labeled sample in the blockchain, and updates the labeled sample set accordingly. This significantly improves the real-time performance and accuracy of facial image processing while ensuring immutability and a high degree of data security, providing the health-care industry with a more scientific and reliable means of emotion monitoring and management, and thereby comprehensively improving service quality and the quality of life of the elderly. The invention uses a truncated ResNet as the feature extractor and reduces the risk of overfitting by using pre-trained weights; since the truncated ResNet already accounts for the importance of local features during feature extraction, the subsequent global modeling need not inspect details and local features too deeply. In addition, a block conversion module splits the facial feature map into patches of equal size, each containing several tokens; each token only needs to consider its relationship with the tokens at the same position in other patches. By focusing on local patches, the model filters out irrelevant information more effectively, concentrates on features useful for expression recognition, strengthens the discrimination between different expressions, and improves computational efficiency while maintaining performance, thereby increasing the accuracy of intelligent image recognition in the health-care industry.
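A hedged sketch of the patch splitting performed by the block conversion module; the patch size and tensor layout are assumptions, since the embodiment does not fix them here:

```python
import torch

def split_into_patches(feature_map, patch_size=7):
    """Split a (B, C, H, W) feature map into equal-size patches of
    patch_size x patch_size tokens, one channel vector per token."""
    b, c, h, w = feature_map.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # (B, C, H/p, p, W/p, p) -> (B, n_patches, tokens_per_patch, C)
    x = feature_map.reshape(b, c, h // patch_size, patch_size,
                            w // patch_size, patch_size)
    x = x.permute(0, 2, 4, 3, 5, 1)
    return x.reshape(b, (h // patch_size) * (w // patch_size),
                     patch_size * patch_size, c)
```

Grouping tokens this way lets the subsequent attention relate each token only to tokens at the same position in other patches, consistent with the description above.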

Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Obviously, the above embodiments are merely examples given for the sake of clear explanation and do not limit the possible implementations. Those of ordinary skill in the art can make other changes or modifications of different forms on the basis of the above description; it is neither necessary nor possible to enumerate all implementations here. Obvious changes or modifications derived therefrom remain within the scope of protection of the present invention.

Claims (10)
