dialect-identification
Here are 28 public repositories matching this topic...
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
- Updated Oct 26, 2025 - Python
TunBERT is the first release of a pre-trained BERT model for the Tunisian dialect using a Tunisian Common-Crawl-based dataset. TunBERT was applied to three NLP downstream tasks: Sentiment Analysis (SA), Tunisian Dialect Identification (TDI) and Reading Comprehension Question-Answering (RCQA)
- Updated Feb 13, 2023 - Python
This repository contains the Arabic sarcasm dataset (ArSarcasm)
- Updated Feb 18, 2021
Dialect identification using Siamese network
- Updated Dec 12, 2017 - Jupyter Notebook
The first Dialectal Arabic Code-Switching (DACS) corpus from broadcast speech, annotated at the token level using both linguistic and acoustic cues. This dataset is a potential benchmark for DCS in spontaneous speech.
- Updated Apr 3, 2022
ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analysis, which is a part of WANLP 2021.
- Updated Jan 26, 2022
Language and Speech Technology for Central Kurdish Varieties (LREC-COLING 2024)
- Updated Nov 29, 2024 - Python
Classifier that identifies Greek text as Cypriot Greek or Standard Modern Greek
- Updated Jun 12, 2024 - Jupyter Notebook
VarDial19 shared task: Discriminating between Mainland and Taiwan Variation of Mandarin Chinese (DMT)
- Updated Apr 10, 2019 - Python
- Updated Apr 26, 2021 - Jupyter Notebook
A tool that predicts the English dialect of an SMS message using recurrent neural networks supplemented with data from Google Trends.
- Updated Dec 19, 2017 - Python
A program that performs statistical classification of Irish-language texts according to their dialect
- Updated May 22, 2020 - Perl
Arabic_Dialect_Identification_NLP-AIM-Task
- Updated Mar 16, 2022 - Jupyter Notebook
Uses AraBERT to classify different Arabic dialects. Ranked fourth in the WANLP 2020 workshop.
- Updated Feb 26, 2021 - Python
Twitter Dialect Datasets and Classifiers (GULF Arabic Corpus)
- Updated Jun 28, 2018 - Jupyter Notebook
An Arabic Tweet Dialect Classifier
- Updated Feb 8, 2022 - Jupyter Notebook
An atlas of Central Kurdish dialects, plus a simple game for detecting dialects
- Updated Dec 20, 2024 - HTML
Twitter Dialect Datasets and Classifiers (EG + GULF Arabic Corpus)
- Updated Jun 28, 2018 - Jupyter Notebook
Dialect Identification in Indic Languages
- Updated Apr 20, 2025 - Python
- Updated Dec 16, 2018 - Python