text-normalization
Here are 71 public repositories matching this topic...
🧹 Python package for text cleaning
Updated Jan 28, 2026 - Python
Chinese text normalization for speech processing
Updated Mar 18, 2023 - Python
Japanese text normalizer for mecab-neologd
Updated Dec 2, 2025 - Cython
Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks
Updated Mar 15, 2021 - Python
Myanmar Language Script Library
Updated Mar 3, 2023 - JavaScript
Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain
Updated Jul 2, 2019 - Jupyter Notebook
Code and model files for the paper: I. Lourentzou et al., "Adapting Sequence to Sequence models for Text Normalization in Social Media", ICWSM'19
Updated Jun 5, 2021 - Python
This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.
Updated May 7, 2024 - Python
Convert English text from written expressions into spoken forms
Updated Jun 22, 2022 - Python
Translation engine behind the Irish-language standardizer (Caighdeánaitheoir na Gaeilge) and the Scottish Gaelic/Manx→Irish translators
Updated Sep 14, 2024 - Perl
Soe Vinorm: An Effective Text Normalization Toolkit for converting Vietnamese text to its spoken form.
Updated Oct 17, 2025 - Python
A Python library for text normalization, specifically designed for Vietnamese and English text processing. This library provides comprehensive text normalization capabilities including handling of special characters, numbers, dates, and various text formats.
Updated Mar 30, 2025 - Python
Proper categorization of e-commerce products enhances the user experience and achieves better results with external search engines. The objective of the project is to classify a product into four given categories, based on its description available on an e-commerce platform.
Updated Mar 13, 2024 - Jupyter Notebook
JS / Python 3 / PHP library for working with UTF-8 polytonic Greek and Latin
Updated Sep 11, 2024 - JavaScript
📢 Tha (ថា) - A Khmer Text Normalization and Verbalization Toolkit
Updated Jul 26, 2024 - Python
PyTorch implementation for the Text Normalization Challenge
Updated Sep 25, 2018 - Jupyter Notebook
Useful String extensions to save you time in production.
Updated Dec 10, 2025 - Dart
Modern .NET 9 / C# 13 library to normalize text (emojis, currency, numbers, abbreviations, chat slang) for consistent and natural Text-to-Speech (TTS) synthesis, ideal for stream chat/donations.
Updated Oct 24, 2025 - C#
Repository for text normalization research.
Updated Aug 1, 2023