Multilingual semantic fusion network for text recognition in the wild
19 March 2023
Celi Lou,1,2 Minglei Tong,1,* Liang Xue,1 Sisil Kumarawadu3

1Shanghai Univ. of Electric Power (China)
2Univ. of Chinese Academy of Sciences (China)
3Univ. of Moratuwa (Sri Lanka)

*Address all correspondence to Minglei Tong, tongminglei@shiep.edu.cn
Funded by: National Natural Science Foundation of China (NSFC)
Abstract

Most current scene text recognition approaches train the language model on text datasets far sparser than those used in natural language processing, which results in inadequate training. We therefore propose a simple transformer encoder–decoder model, the multilingual semantic fusion network (MSFN), that leverages prior linguistic knowledge to learn robust language features. First, we label the text dataset with forward sequences, backward sequences, and subwords, which are extracted by tokenization with linguistic information. Then we introduce a multilingual model into the decoder, corresponding to the three channels of the labeled dataset. The outputs of the different channels are fused to obtain a more accurate final result. In experiments, MSFN achieves cutting-edge performance across six benchmark datasets, and extensive ablation studies demonstrate the effectiveness of the proposed method. Code is available at https://github.com/lclee0577/MLViT.
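
To make the three label channels in the abstract concrete, the following is a minimal sketch of how a single word could be turned into forward, backward, and subword targets. It is illustrative only and is not taken from the MLViT repository. The function names, the toy vocabulary, and the greedy longest-match splitter are all assumptions standing in for whatever tokenizer the authors actually use.

```python
from typing import Dict, List, Set

# Hypothetical toy subword vocabulary (an assumption for illustration;
# the tokenizer used in the paper is not described in the abstract).
TOY_VOCAB: Set[str] = {"park", "ing", "street", "exit"}


def greedy_subwords(word: str, vocab: Set[str]) -> List[str]:
    """Split `word` by greedy longest-match against `vocab`.

    Characters not covered by any vocabulary entry fall back to
    single-character pieces, so the loop always makes progress.
    """
    pieces: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest remaining candidate first, down to one character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces


def build_label_channels(word: str, vocab: Set[str]) -> Dict[str, List[str]]:
    """Return one target sequence per channel for a single word label."""
    return {
        "forward": list(word),         # left-to-right characters
        "backward": list(word[::-1]),  # right-to-left characters
        "subword": greedy_subwords(word, vocab),
    }


if __name__ == "__main__":
    for name, target in build_label_channels("parking", TOY_VOCAB).items():
        print(name, target)
```

Under this sketch, each training image would carry three target sequences, one per decoder channel. How the channel outputs are fused at inference time is not specified in the abstract, so it is left out here.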

© 2023 SPIE and IS&T
Celi Lou, Minglei Tong, Liang Xue, and Sisil Kumarawadu, "Multilingual semantic fusion network for text recognition in the wild," Journal of Electronic Imaging 32(2), 023015 (19 March 2023). https://doi.org/10.1117/1.JEI.32.2.023015
Received: 24 August 2022; Accepted: 20 February 2023; Published: 19 March 2023
KEYWORDS
Transformers

Semantics

Performance modeling

Data modeling

Education and training

Visual process modeling

Visualization

Feature extraction

Computer programming

Ablation
