Automatic annotation of XML encoded files.

e-ditiones/Annotator


This script performs segmentation, lemmatization, normalization and named-entity recognition (NER) on XML-TEI encoded files.

Getting started

TO DO

Normalization and NER are still a work in progress.

To install Annotator from the command line:

  • clone or download this repository, then move into its folder
git clone https://github.com/e-ditiones/Annotator.git
cd Annotator

How to use it

  1. Place the XML files to be processed in the in_XML folder.

  2. Run the script:

bash process.sh

  3. Results are written to the out folder:
    • XML: contains the annotated XML files;
    • TSV: contains the annotations in TSV format.
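The steps above can be wrapped in a small pre-flight check. This is only a sketch: the helper names count_xml_files and run_annotator are hypothetical, and it assumes that process.sh is launched from the repository root, that input files carry an .xml extension and sit directly in in_XML, and that results land in out/XML and out/TSV as described above.

```shell
#!/usr/bin/env bash
# Sketch: verify the input folder before launching the pipeline.
# Folder names follow the README; the helper names are hypothetical.

# Print the number of *.xml files directly inside the given folder.
count_xml_files() {
    find "$1" -maxdepth 1 -type f -name '*.xml' | wc -l | tr -d ' '
}

# Run process.sh only when the input folder holds at least one XML file.
run_annotator() {
    local input_dir="${1:-in_XML}"
    if [ ! -d "$input_dir" ]; then
        echo "Missing folder: $input_dir" >&2
        return 1
    fi
    if [ "$(count_xml_files "$input_dir")" -eq 0 ]; then
        echo "No XML files found in $input_dir" >&2
        return 1
    fi
    # Assumed output layout: out/XML and out/TSV.
    bash process.sh && ls out/XML out/TSV
}
```

Calling run_annotator with no argument checks in_XML and then runs the script; an empty or missing folder stops early with a message instead of producing an empty out folder.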

How it works

Lemmatization

For lemmatization, we use Pie-extended with the "freem" model.
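For readers who want to try the lemmatizer on its own, the sketch below uses the Pie-extended command-line interface (installable with pip install pie-extended), which provides download and tag subcommands. How process.sh actually invokes Pie-extended is not documented here, so this is an assumption rather than a description of the script. The function name lemmatize_with_freem and the PIE_CMD override are hypothetical, and the input is assumed to be a plain-text file, not a TEI document.

```shell
#!/usr/bin/env bash
# Sketch: fetch the "freem" model, then tag one or more plain-text files.
# PIE_CMD is a hypothetical override so the commands can be dry-run
# (e.g. PIE_CMD=echo); by default the real pie-extended CLI is called.
lemmatize_with_freem() {
    local pie="${PIE_CMD:-pie-extended}"
    "$pie" download freem && "$pie" tag freem "$@"
}
```

For example, PIE_CMD=echo lemmatize_with_freem sample.txt prints the two commands that would be run, without downloading anything.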

Credits

This repository is developed by Alexandre Bartz with the help of Simon Gabay, as part of the e-ditiones project.

Licences

Our work is licensed under a Creative Commons Attribution 4.0 International Licence.

Pie-extended is under the Mozilla Public License 2.0.

Cite this repository (to be changed)

Alexandre Bartz, Simon Gabay. 2020. Lemmatization and normalization of French modern manuscripts and printed documents. Retrieved from https://github.com/e-ditiones/Annotator.
