Acta Press

AN ADVERSARIAL AND DEEP HASHING-BASED HIERARCHICAL SUPERVISED CROSS-MODAL IMAGE AND TEXT RETRIEVAL ALGORITHM, 77-86.

Ruidong Chen, Baohua Qiang, Mingliang Zhou, Shihao Zhang, Hong Zheng, and Chenghua Tang

Keywords

Cross-modal image and text retrieval, deep hash algorithm, hierarchical supervision, adversarial network

Abstract

With the rapid development of robotics and sensor technology, vast amounts of valuable multimodal data are collected. It is extremely critical for a variety of robots performing automated tasks to find relevant multimodal information quickly and efficiently in large amounts of data. In this paper, we propose an adversarial and deep hashing-based hierarchical supervised cross-modal image and text retrieval algorithm to perform semantic analysis and association modelling on image and text by making full use of the rich semantic information of the label hierarchy. First, the modal adversarial block and the modal differentiation network both perform adversarial learning to keep different modalities with the same semantics closest to each other in a common subspace. Second, the intra-label layer similarity loss and inter-label layer correlation loss are used to fully exploit the intrinsic similarity existing in each label layer and the correlation existing between label layers. Finally, an objective function for different semantic data is redesigned to keep data with different semantics away from each other in a common subspace, thus avoiding interference of retrieval by data of different semantics. The experimental results on two cross-modal retrieval datasets with hierarchically supervised information show that the proposed method substantially enhances retrieval performance and consistently outperforms other state-of-the-art methods.
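The idea of keeping same-semantics image/text pairs close and different-semantics data apart in a common subspace can be illustrated with a toy margin-based objective. This is a minimal NumPy sketch under assumed names (`cross_modal_margin_loss`, `margin`), not the paper's actual loss or network:

```python
import numpy as np

def cross_modal_margin_loss(img_emb, txt_emb, labels, margin=1.0):
    """Toy common-subspace objective: pull same-label image/text
    pairs together, push different-label pairs beyond `margin`.
    Illustrative only; not the paper's exact formulation."""
    # Pairwise squared Euclidean distances between image i and text j.
    diff = img_emb[:, None, :] - txt_emb[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # same[i, j] is True when image i and text j share a label.
    same = labels[:, None] == labels[None, :]
    pull = np.where(same, dist2, 0.0)  # attract semantic matches
    push = np.where(~same, np.maximum(0.0, margin - np.sqrt(dist2)) ** 2, 0.0)
    return (pull.sum() + push.sum()) / dist2.size

# Identical embeddings with matching labels incur zero loss,
# since all cross-label pairs already lie beyond the margin.
emb = np.eye(3)
print(cross_modal_margin_loss(emb, emb, np.array([0, 1, 2])))  # → 0.0
```

In the paper's setting the embeddings would come from deep image/text encoders trained adversarially, and the continuous subspace representations would then be quantised into binary hash codes for fast retrieval.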


LinkedInPrivacy & LegalSitemapCopyright © 2025 ACTA Press

