Movatterモバイル変換


[0]ホーム

URL:


Skip to main content
Cornell University
We gratefully acknowledge support from the Simons Foundation,member institutions, and all contributors.Donate
arxiv logo>cs> arXiv:2404.15041
arXiv logo
Cornell University Logo

Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.15041 (cs)
[Submitted on 23 Apr 2024 (v1), last revised 1 Feb 2025 (this version, v3)]

Title:LEAF: Unveiling Two Sides of the Same Coin in Semi-supervised Facial Expression Recognition

View PDFHTML (experimental)
Abstract:Semi-supervised learning has emerged as a promising approach to tackle the challenge of label scarcity in facial expression recognition (FER) task. However, current state-of-the-art methods primarily focus on one side of the coin, i.e., generating high-quality pseudo-labels, while overlooking the other side: enhancing expression-relevant representations. In this paper, we unveil both sides of the coin by proposing a unified framework termed hierarchicaL dEcoupling And Fusing (LEAF) to coordinate expression-relevant representations and pseudo-labels for semi-supervised FER. LEAF introduces a hierarchical expression-aware aggregation strategy that operates at three levels: semantic, instance, and category. (1) At the semantic and instance levels, LEAF decouples representations into expression-agnostic and expression-relevant components, and adaptively fuses them using learnable gating weights. (2) At the category level, LEAF assigns ambiguous pseudo-labels by decoupling predictions into positive and negative parts, and employs a consistency loss to ensure agreement between two augmented views of the same image. Extensive experiments on benchmark datasets demonstrate that by unveiling and harmonizing both sides of the coin, LEAF outperforms state-of-the-art semi-supervised FER methods, effectively leveraging both labeled and unlabeled data. Moreover, the proposed expression-aware aggregation strategy can be seamlessly integrated into existing semi-supervised frameworks, leading to significant performance gains. Our code is available at \url{this https URL}.
Subjects:Computer Vision and Pattern Recognition (cs.CV)
Cite as:arXiv:2404.15041 [cs.CV]
 (orarXiv:2404.15041v3 [cs.CV] for this version)
 https://doi.org/10.48550/arXiv.2404.15041
arXiv-issued DOI via DataCite

Submission history

From: Fan Zhang [view email]
[v1] Tue, 23 Apr 2024 13:43:33 UTC (11,499 KB)
[v2] Fri, 26 Apr 2024 13:38:08 UTC (11,501 KB)
[v3] Sat, 1 Feb 2025 16:59:17 UTC (3,869 KB)
Full-text links:

Access Paper:

Current browse context:
cs.CV
Change to browse by:
export BibTeX citation

Bookmark

BibSonomy logoReddit logo

Bibliographic and Citation Tools

Bibliographic Explorer(What is the Explorer?)
Connected Papers(What is Connected Papers?)
scite Smart Citations(What are Smart Citations?)

Code, Data and Media Associated with this Article

CatalyzeX Code Finder for Papers(What is CatalyzeX?)
Hugging Face(What is Huggingface?)
Papers with Code(What is Papers with Code?)

Demos

Hugging Face Spaces(What is Spaces?)

Recommenders and Search Tools

Influence Flower(What are Influence Flowers?)
CORE Recommender(What is CORE?)

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community?Learn more about arXivLabs.

Which authors of this paper are endorsers? |Disable MathJax (What is MathJax?)

[8]ページ先頭

©2009-2025 Movatter.jp