This repository was archived by the owner on Sep 15, 2022. It is now read-only.

AI4Bharat/IndicNLP-Transliteration (public archive)

Codebase for Indic-Transliteration using Seq2Seq RNN. For the latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit

transliteration.ai4bharat.org

License

Apache-2.0 license

60 stars, 13 forks

Folders and files (276 commits)

NoteBooks
algorithms
apps
data
docs
hypotheses
tasks
tools
utilities
.gitignore
LICENSE
README.md
requirements.txt

IndianNLP-Transliteration

Project Website | Demo UI | Python Library

The main goal of this project is to create open source input tools for content creation in under-represented languages in India.
It started in collaboration with Story Weaver, a non-profit working towards foundational literary education for children, supported by Google's AI for Social Good initiative.

Most languages in India do not have a digital presence due to an underdeveloped ecosystem. One of the major bottlenecks in content creation and language adoption is the difficulty of inputting text in several native Indian languages. The lack of stable input tools in underserved languages is a huge barrier to creating digital content and NLP datasets in these languages.

Supported Languages

Bengali - বাংলা
Gujarati - ગુજરાતી
Hindi - हिंदी
Kannada - ಕನ್ನಡ
Konkani Goan - कोंकणी
Maithili - मैथिली
Malayalam - മലയാളം
Marathi - मराठी
Panjabi Eastern - ਪੰਜਾਬੀ
Sindhi - سنڌي‎
Sinhala - සිංහල
Telugu - తెలుగు
Tamil - தமிழ்
Urdu - اُردُو

Repository Usage

For Attributions and Contributions lists, check here 🖖

Training Procedures

This repository was developed to facilitate easier experimentation with different network architectures and reformulated objectives with minimal effort. It is meant to be highly tinkerable rather than an off-the-shelf library.

A condensed standalone version of simple model training, inference and accuracy computation is available as a Jupyter notebook.
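As an illustration of the accuracy computation step, here is a minimal sketch of top-k exact-match accuracy for transliteration outputs. It is not the notebook's code: the function name `top_k_accuracy`, the dictionary layout, and the toy data are all assumptions made for this example.

```python
from typing import Dict, List


def top_k_accuracy(
    predictions: Dict[str, List[str]],
    references: Dict[str, List[str]],
    k: int = 1,
) -> float:
    """Fraction of input words whose top-k candidates contain an accepted reference."""
    if not references:
        return 0.0
    hits = 0
    for word, accepted in references.items():
        # Only the k highest-ranked candidates count towards a hit.
        candidates = predictions.get(word, [])[:k]
        if any(candidate in accepted for candidate in candidates):
            hits += 1
    return hits / len(references)


# Toy, made-up data purely for illustration (not taken from the project's datasets).
references = {"namaste": ["नमस्ते"], "bharat": ["भारत"]}
predictions = {
    "namaste": ["नमस्ते", "नमस्टे"],
    "bharat": ["भरत", "भारत"],
}

print(top_k_accuracy(predictions, references, k=1))  # 0.5
print(top_k_accuracy(predictions, references, k=2))  # 1.0
```

Reporting both top-1 and top-k is useful for input tools, since a user can pick the correct word from a short suggestion list even when the first candidate is wrong.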

Pythonic Library

A Pythonic transliteration library is available on the Python Package Index (PyPI) and also under GitHub releases.
For usage, see the apps readme.

NeuralNet Models

Transliteration models for the supported languages are made available as releases, in an easily deployable form.

All the NN models (along with metadata) of Xlit - Transliteration are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Datasets

Datasets created as part of the project for Maithili, Konkani and Hindi are made available as JSON files under downloads.
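A minimal sketch of turning such a JSON file into training pairs is shown below. The schema is not described here, so the layout used (a native-script word mapped to a list of romanized spellings) is an assumption, as are the helper name `iter_pairs` and the sample content; check the downloaded files before relying on it.

```python
import json
from typing import Iterator, Tuple

# Hypothetical file content; the real schema may differ.
sample_json = '{"भारत": ["bharat", "bhaarat"], "नमस्ते": ["namaste"]}'


def iter_pairs(raw: str) -> Iterator[Tuple[str, str]]:
    """Yield (romanized, native) pairs from the assumed word-to-spellings mapping."""
    data = json.loads(raw)
    for native, romanized_forms in data.items():
        for roman in romanized_forms:
            yield roman, native


pairs = list(iter_pairs(sample_json))
print(len(pairs))  # 3
```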

Xlit - Transliteration Datasets by Story Weaver & AI4Bharat are licensed under a Creative Commons Attribution 4.0 International License.

Kindly attribute if you use the dataset for your research or products.

Contact

If you have benefited from our datasets, models or services, or were motivated by our work, we would like to hear from you.

Email: opensource@ai4bharat.org

Releases (6)

Latest: Xlit - Version 0.5.0 (Nov 10, 2020)

Contributors (3)
