CN103020044A - Machine-aided webpage translation method and system thereof - Google Patents

Machine-aided webpage translation method and system thereof

Publication number
CN103020044A
Authority
CN
China
Prior art keywords
translation
web page
webpage
term
page module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105056324A
Other languages
Chinese (zh)
Inventor
宗竞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Original Assignee
JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-12-03
Filing date
2012-12-03
Publication date
2013-04-03
Application filed by JIANGSU LEMAIDAO NETWORK TECHNOLOGY Co Ltd
Priority to CN2012105056324A
Publication of CN103020044A
Pending

Abstract

The invention discloses a machine-aided webpage translation system. The system comprises a webpage receiving module, a webpage reading module and a webpage translating module, wherein the webpage receiving module parses a webpage with a parser to obtain a document object model; the webpage reading module reads the document object model; and the webpage translating module translates the webpage, builds the translation memory, manages terminology, and performs bidirectional translation and typesetting. The system can effectively eliminate a translator's repeated work and thereby improve working efficiency.

Description

Machine-aided webpage translation method and system thereof
Technical field
The present invention relates to a machine-aided webpage translation method and a system thereof.
Background art
The translation accuracy of webpage translation systems has long hovered around 70%, and the readability of their output, their coverage of linguistic phenomena and, above all, their robustness on open-domain text all remain unsatisfactory. Society urgently needs to process real-world text on a large scale (especially the mass of text online), and webpage translation systems fall well short of that expectation. The idea of machine-aided translation (Computer Aided Translation, CAT for short) arose against this background. Unlike a fully automatic machine translation system, a machine-aided translation system is a human-machine interactive system. In this mode of translation the computer assists the translator: it not only supplies translations of vocabulary, terms and phrases, but also searches previously translated text for translations of identical or similar sentences, so that the translator avoids unnecessary repeated work and translates efficiently. The central idea of computer-aided translation (covering both translation-memory-based and example-based techniques) is to search a translation memory (a bilingual aligned corpus) and an example base for identical or similar sentences or phrases and to offer them as reference translations.
The translator thereby makes full use of existing translation resources and avoids repeated work as far as possible. This assistance mechanism is particularly suited to long texts with many recurring expressions, such as scientific monographs, scientific and technical literature, product descriptions, service manuals and United Nations documents. It helps the translator eliminate repeated translation work and concentrate on new content only.
Machine-aided translation software built on translation memory technology rests on a simple fact: in technical translation the volume of material is enormous while its scope is relatively narrow, concentrated in one or a few specialties. Fields such as politics, economics, military affairs, aerospace, computing and communications each have their own technical translation companies or departments. This inevitably leads to varying degrees of repetition in the translated material. According to statistics, the repetition rate ranges from 20% to 70% across industries and departments, which means that at least 20% of a translator's work is needless repetition. Translation memory technology starts from exactly this point: it sets out first to eliminate the translator's repeated work and thereby to improve efficiency.
Webpage translation means translating the text on a webpage displayed by a browser into the language the user requires, without changing the format of the page. At present, most common webpage translation techniques operate on webpages written in HyperText Markup Language (HTML). In principle, they first obtain the content of the page's source file (the HTML document), then locate the text that needs translating (the text between HTML tags) and translate it, then substitute the translation for the original text to generate a new webpage, and finally instruct the browser to display the newly generated page.
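The conventional flow just described (obtain the HTML source, translate only the text between tags, substitute it, emit a new page) can be sketched roughly as follows. This is an illustrative sketch and is not taken from the patent: the class `TextNodeTranslator`, the function `translate_html` and the `translate` callback are hypothetical names, and Python's standard `html.parser` module stands in for whatever parser a real system would use.

```python
from html.parser import HTMLParser


class TextNodeTranslator(HTMLParser):
    """Rebuilds an HTML document, translating only the text between tags."""

    def __init__(self, translate):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.translate = translate  # hypothetical callback: str -> str
        self.out = []
        self.skip_depth = 0         # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag, verbatim

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Leave code, styles and pure whitespace alone; translate the rest.
        if self.skip_depth or not data.strip():
            self.out.append(data)
        else:
            self.out.append(self.translate(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def translate_html(html, translate):
    """Return a new page whose markup is unchanged and whose text is translated."""
    parser = TextNodeTranslator(translate)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# str.upper is only a stand-in for a real translation function.
new_page = translate_html('<p class="a">Hello <b>world</b></p>', str.upper)
```

Because every tag is copied back verbatim, the page format is preserved and only the text nodes change, which is the property the paragraph above requires.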
Summary of the invention
To overcome the shortcomings described in the background above, the invention provides a machine-aided webpage translation system comprising a webpage receiving module, a webpage reading module and a webpage translating module, wherein the webpage translating module is implemented by the following steps:
Step 1, translation: when a new sentence is translated, the translation memory is searched, the sentence is compared and matched against the translation units in the memory, and the translation unit closest to the source text is selected and offered as a reference translation;
Step 2, automatic memory building: source texts and translations are automatically analyzed, matched and put into one-to-one correspondence sentence by sentence, after which a standard translation memory file is automatically generated, so that all of the user's existing material can be reused through this tool;
Step 3, term management: all terms are standardized by setting up one or more standard term lists once; when translating with the translation memory system, the corresponding term list is opened in the term management tool, which automatically identifies which words in the current sentence are defined terms and provides their standard translations;
Step 4, bidirectional translation among multiple languages;
Step 5, automatic typesetting: the translation automatically adopts the format of the source text.
A machine-aided webpage translation system adopting the above method comprises a webpage receiving module, a webpage reading module and a webpage translating module. The webpage receiving module parses the webpage with a parser to obtain a document object model; the webpage reading module reads the document object model; and the webpage translating module translates the webpage and performs memory building, term management, bidirectional translation and typesetting.
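As a rough illustration of the receiving and reading modules, the sketch below parses a page into a document object model and collects its text nodes for the translating module. It is an assumption-laden example rather than the patented implementation: Python's `xml.dom.minidom` stands in for the browser-style parser, so the input must be well-formed XHTML, and the function name `read_text_nodes` is hypothetical.

```python
from xml.dom import minidom


def read_text_nodes(dom_node):
    """Return the non-empty text nodes under dom_node, in document order."""
    found = []
    for child in dom_node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip():
                found.append(child)
        else:
            found.extend(read_text_nodes(child))
    return found


page = "<html><body><h1>Welcome</h1><p>Read the <b>manual</b> first.</p></body></html>"

# Receiving module: parse the page and keep the resulting document object model.
document = minidom.parseString(page)

# Reading module: pull the first-language text out of the text nodes.
nodes = read_text_nodes(document.documentElement)
texts = [node.data for node in nodes]

# Translating module (stand-in): write a translation back into a text node.
nodes[0].data = "欢迎"
translated_page = document.documentElement.toxml()
```

Writing the translation back into the same text node leaves the surrounding elements untouched, so the translated page keeps the structure of the original.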
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below evidently show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 shows the webpage translation flow according to the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
According to one embodiment of the invention, as shown in Fig. 1, the machine-aided webpage translation system comprises a webpage receiving module, a webpage reading module and a webpage translating module. The webpage receiving module parses the webpage with a parser to obtain a document object model; the webpage reading module reads the document object model; and the webpage translating module translates the webpage and performs memory building, term management, bidirectional translation and typesetting. After a webpage is received, it is parsed by the parser to obtain a document object model, which is stored in the receiving module. In this embodiment the parser is similar to the parser built into common browsers (such as Microsoft's MSXML). The reading module reads the first-language text in the text nodes of the document object model and outputs it to the translating module. The reading module reads the information in the document object model by means of a script or program written in a language such as JavaScript, VBScript or PHP. The webpage translation itself is implemented by the following steps: translation, automatic memory building, term management, multilingual bidirectional translation and automatic typesetting:
Step 1, translation: when a new sentence is translated, the translation memory is searched, the sentence is compared and matched against the translation units in the memory, and the translation unit closest to the source text is selected and offered as a reference translation;
Step 2, automatic memory building: source texts and translations are automatically analyzed, matched and put into one-to-one correspondence sentence by sentence, after which a standard translation memory file is automatically generated, so that all of the user's existing material can be reused through this tool;
Step 3, term management: all terms are standardized by setting up one or more standard term lists once; when translating with the translation memory system, the corresponding term list is opened in the term management tool, which automatically identifies which words in the current sentence are defined terms and provides their standard translations;
Step 4, bidirectional translation among multiple languages;
Step 5, automatic typesetting: the translation automatically adopts the format of the source text.
In detail:
A translation memory product automatically "remembers" every sentence the user translates. When a new sentence is translated, it searches the translation memory, compares and matches the sentence against the stored translation units, selects the unit closest to the source text, and offers it as a reference translation. The user can accept this translation or modify it, and the modified translation is automatically stored in the memory for later use. Because the vocabulary and phrasing of a professional field are relatively fixed, once the user has accumulated several memories of a certain size, repeated sentences are encountered more and more often and translation becomes easier and easier.
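The lookup-and-store cycle described above can be sketched as a small fuzzy-matching memory. The patent does not specify a similarity measure, so this example assumes the character-level ratio from Python's `difflib.SequenceMatcher` and an arbitrary threshold of 0.7; the class name `TranslationMemory` and its methods are hypothetical.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores source/target translation units and returns the closest match."""

    def __init__(self, threshold=0.7):
        self.units = {}             # source sentence -> stored translation
        self.threshold = threshold  # assumed cut-off for a usable match

    def store(self, source, target):
        """Remember a confirmed (or user-corrected) translation."""
        self.units[source] = target

    def lookup(self, sentence):
        """Return (matched source, reference translation, score) or None."""
        best, best_score = None, 0.0
        for source, target in self.units.items():
            score = SequenceMatcher(None, sentence, source).ratio()
            if score > best_score:
                best, best_score = (source, target), score
        if best is None or best_score < self.threshold:
            return None
        return best[0], best[1], best_score


tm = TranslationMemory()
tm.store("Press the power button.", "按下电源按钮。")

# A similar new sentence retrieves the stored unit as a reference translation.
suggestion = tm.lookup("Press the reset button.")

# The translator edits the suggestion; the result goes back into the memory.
tm.store("Press the reset button.", "按下复位按钮。")
```

A linear scan is enough for a sketch; a memory of realistic size would need an index, which is outside what the description covers.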
Translation memory products generally also support sharing a memory over a network. That is, when several people translate at the same time, they can share one translation memory over a local area network, and each online translator can draw on the others' work in real time.
For users who accumulated a large amount of translated material before adopting a translation memory product, the product can provide an automatic memory-building tool [Microsoft FoxPro]. This tool automatically analyzes and matches source texts and translations and puts them into one-to-one correspondence sentence by sentence. After the user has made some adjustments and proofread the result, the tool automatically generates a standard translation memory file. All of the user's existing material can be reused through this tool, so that a translation memory is built quickly and efficiently. These memories can then be further supplemented and refined through continued use.
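A minimal sketch of such a memory-building step is shown below. Several things here are assumptions rather than statements from the patent: sentences are split on end punctuation only, alignment is a naive one-to-one pairing that gives up when the sentence counts differ (the point at which the user's manual adjustment would come in), and the "standard translation memory file" is rendered as a TMX-style XML document. The function names and header values are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Runs of text followed by optional Western or Chinese end punctuation.
SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang


def split_sentences(text):
    """Naive splitter; it would mis-split decimals such as 3.5."""
    return [s.strip() for s in SENTENCE.findall(text) if s.strip()]


def align(source_text, target_text):
    """Pair sentences one-to-one, refusing to guess when counts differ."""
    src, tgt = split_sentences(source_text), split_sentences(target_text)
    if len(src) != len(tgt):
        raise ValueError("sentence counts differ; manual adjustment needed")
    return list(zip(src, tgt))


def to_tmx(pairs, src_lang, tgt_lang):
    """Serialize aligned pairs as a TMX-style document (header values assumed)."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-aligner", "creationtoolversion": "0.1",
        "segtype": "sentence", "o-tmf": "none", "adminlang": "en",
        "srclang": src_lang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        unit = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            variant = ET.SubElement(unit, "tuv", {XML_LANG: lang})
            ET.SubElement(variant, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


pairs = align("Open the cover. Remove the battery.", "打开盖子。取出电池。")
memory_file = to_tmx(pairs, "en", "zh")
```

Raising an error on a count mismatch, instead of guessing, mirrors the description: the tool proposes an alignment and the user adjusts and proofreads it before the file is generated.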
Translation memory products generally also provide another very important function: term management. In specialized technical fields almost every document contains a large number of technical terms, and the consistency of term translation has always been one of the main things checked in proofreading. This work is time-consuming and laborious, and omissions are hard to avoid. A translation memory product standardizes all terms through a term management tool (generally an electronic dictionary). The user only needs to set up one or more standard term lists once (each list containing the source terms and their translations). When translating with the translation memory system, the user opens the corresponding term list in the term management tool, which automatically identifies which words in the current sentence are defined terms and provides their standard translations.
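Term recognition against a standard term list might look like the following sketch. The matching rule is an assumption made for illustration: case-insensitive substring search, longest term first, with overlapping shorter terms suppressed. The function name `find_terms` and the sample glossary are hypothetical.

```python
def find_terms(sentence, glossary):
    """Return (term, standard translation) pairs found in sentence, left to right.

    Longer terms win over shorter ones that overlap them. Matching is plain
    substring search, so "term" would also match inside "determine"; a real
    tool would respect word boundaries in languages that have them.
    """
    lowered = sentence.lower()
    taken = [False] * len(sentence)   # characters already claimed by a term
    hits = []
    for term in sorted(glossary, key=len, reverse=True):
        needle = term.lower()
        start = lowered.find(needle)
        while start != -1:
            end = start + len(needle)
            if not any(taken[start:end]):
                taken[start:end] = [True] * (end - start)
                hits.append((start, term, glossary[term]))
            start = lowered.find(needle, start + 1)
    return [(term, translation) for _, term, translation in sorted(hits)]


# A one-off standard term list: source term -> approved translation.
glossary = {"translation memory": "翻译记忆", "memory": "内存", "term": "术语"}

found = find_terms("The translation memory stores each term.", glossary)
```

Here "memory" is not reported separately because it lies inside the longer term "translation memory", which keeps the suggested translations consistent with the list.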
Because translation memory works by comparing and matching source texts with translations, it has an inherent advantage: it supports bidirectional translation among multiple languages. Take the German translation memory software vendor TRADOS as an example: its product supports 55 languages on the basis of Unicode and covers Windows 95/98/NT in nearly all language versions. In other words, a single product can translate in both directions between any of these languages, which is unthinkable for machine translation.
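The reason a memory of aligned segments works in either direction can be shown with a small sketch: each unit stores the same sentence in several languages, and any stored language can serve as the source of a lookup. The class name `MultilingualMemory` and the exact-match lookup are simplifying assumptions, not details from the patent.

```python
class MultilingualMemory:
    """Aligned translation units that can be queried in any direction."""

    def __init__(self):
        self.units = []  # each unit maps a language code to its segment

    def add(self, unit):
        self.units.append(dict(unit))

    def lookup(self, text, source_lang, target_lang):
        """Exact-match lookup from source_lang to target_lang, or None."""
        for unit in self.units:
            if unit.get(source_lang) == text and target_lang in unit:
                return unit[target_lang]
        return None


memory = MultilingualMemory()
memory.add({"en": "Save the file.", "zh": "保存文件。", "de": "Speichern Sie die Datei."})

forward = memory.lookup("Save the file.", "en", "zh")   # English to Chinese
backward = memory.lookup("保存文件。", "zh", "en")        # Chinese to English
across = memory.lookup("保存文件。", "zh", "de")          # Chinese to German
```

No direction-specific data is stored; the same aligned unit answers all three queries.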
Repeated work is not the only thing people are reluctant to do. Typesetting electronic documents is another task that gives translators headaches. The localization industry in particular imposes strict formatting requirements on translations, which must match the format of the original document. In this respect translation memory products are again far ahead. Current translation memory products generally provide a range of format-handling tools and support popular document formats such as DOC, RTF, HTML, SGML and PPT. The translation automatically adopts the format of the source text, so the translator need not spend effort on typesetting and can concentrate on translating.
It should be noted that the above embodiments are merely exemplary descriptions of the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may be modified, or some of their technical features replaced by equivalents, without departing from the spirit of the invention or the scope of protection defined by its claims, and that all such modifications and replacements fall within the scope of protection of the invention.

Claims (2)

CN2012105056324A | 2012-12-03 | 2012-12-03 | Machine-aided webpage translation method and system thereof | Pending | CN103020044A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2012105056324A, CN103020044A (en) | 2012-12-03 | 2012-12-03 | Machine-aided webpage translation method and system thereof

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2012105056324A, CN103020044A (en) | 2012-12-03 | 2012-12-03 | Machine-aided webpage translation method and system thereof

Publications (1)

Publication Number | Publication Date
CN103020044A (en) | 2013-04-03

Family

ID=47968661

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2012105056324A (Pending), CN103020044A (en) | Machine-aided webpage translation method and system thereof | 2012-12-03 | 2012-12-03

Country Status (1)

Country | Link
CN (1) | CN103020044A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20030154071A1 (en) * | 2002-02-11 | 2003-08-14 | Shreve Gregory M. | Process for the document management and computer-assisted translation of documents utilizing document corpora constructed by intelligent agents
CN1687925A (en) * | 2005-05-10 | 2005-10-26 | 贺方升 | Method for realizing bilingual web page searching
CN101470705A (en) * | 2007-12-29 | 2009-07-01 | 英业达股份有限公司 | Dynamic webpage translation system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张国霞等: "浅议计算机辅助翻译软件", 《中国现代教育装备》*
柏晓静等: "面向中文学术专著的机器辅助翻译研究", 《中国翻译》*
许钧等: "《翻译学概论》", 31 October 2009, 译林出版社*

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103235775B (en) * | 2013-04-25 | 2016-06-29 | 中国科学院自动化研究所 | A kind of statistical machine translation method merging translation memory and phrase translation model
CN103235775A (en) * | 2013-04-25 | 2013-08-07 | 中国科学院自动化研究所 | Statistics machine translation method integrating translation memory and phrase translation model
CN103885942B (en) * | 2014-03-18 | 2017-09-05 | 成都优译信息技术股份有限公司 | A kind of rapid translation device and method
CN103885942A (en) * | 2014-03-18 | 2014-06-25 | 成都优译信息技术有限公司 | Rapid translation device and method
CN104331399A (en) * | 2014-07-25 | 2015-02-04 | 一朵云(北京)科技有限公司 | Dictionary tree translation method
CN104881406A (en) * | 2015-06-15 | 2015-09-02 | 携程计算机技术(上海)有限公司 | Web page translation method and system
CN104881406B (en) * | 2015-06-15 | 2018-05-04 | 上海携程商务有限公司 | Web page translation method and system
CN106557466A (en) * | 2015-09-25 | 2017-04-05 | 四川省科技交流中心 | Distributed across languages searching systems and its search method based on centralized translation
CN106557478A (en) * | 2015-09-25 | 2017-04-05 | 四川省科技交流中心 | Distributed across languages searching systems and its search method based on bridge language
CN106126508A (en) * | 2016-06-22 | 2016-11-16 | 上海者信息科技有限公司 | A kind of language material management method
CN106844354A (en) * | 2017-01-11 | 2017-06-13 | 中国科学院合肥物质科学研究院 | A kind of webpage takes word Chinese interpretation method and its device
CN107066454A (en) * | 2017-03-27 | 2017-08-18 | 成都优译信息技术股份有限公司 | Number and sequence number replacement method and system for machine translation
CN107329958A (en) * | 2017-06-08 | 2017-11-07 | 努比亚技术有限公司 | Language transfer method and device based on webpage
CN107329958B (en) * | 2017-06-08 | 2021-03-26 | 努比亚技术有限公司 | Language conversion method and device based on webpage
CN109783826A (en) * | 2019-01-15 | 2019-05-21 | 四川译讯信息科技有限公司 | A kind of document automatic translating method
CN109783826B (en) * | 2019-01-15 | 2023-11-21 | 四川译讯信息科技有限公司 | Automatic document translation method
CN111563387A (en) * | 2019-02-12 | 2020-08-21 | 阿里巴巴集团控股有限公司 | Sentence similarity determining method and device and sentence translation method and device
CN111563387B (en) * | 2019-02-12 | 2023-05-02 | 阿里巴巴集团控股有限公司 | Sentence similarity determining method and device, sentence translating method and device
CN110083845A (en) * | 2019-04-25 | 2019-08-02 | 数译(成都)信息技术有限公司 | Web page translation method and system
CN110083845B (en) * | 2019-04-25 | 2023-06-16 | 四川语言桥信息技术有限公司 | Web page translation method and system
CN110889296A (en) * | 2019-11-27 | 2020-03-17 | 福建亿榕信息技术有限公司 | A real-time translation method and device combined with crawler technology
CN113792558A (en) * | 2021-11-16 | 2021-12-14 | 北京百度网讯科技有限公司 | Self-learning translation method and device based on machine translation and post-translation editing

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C02 | Deemed withdrawal of patent application after publication (patent law 2001)
WD01 | Invention patent application deemed withdrawn after publication

Application publication date: 2013-04-03