multilingual-nlp
Here are 98 public repositories matching this topic...
MTEB: Massive Text Embedding Benchmark
- Updated Dec 17, 2025 - Python
Crosslingual Generalization through Multitask Finetuning
- Updated Sep 22, 2024 - Jupyter Notebook
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. ⭐ support NLP!
- Updated May 18, 2024 - Python
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023
- Updated Apr 20, 2024 - Python
[EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
- Updated Aug 21, 2024 - Jupyter Notebook
This repo supports various cross-lingual transfer learning & multilingual NLP models.
- Updated Sep 13, 2023 - Python
Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
- Updated Mar 11, 2024 - Jupyter Notebook
This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs" published in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), July 9-14, 2023.
- Updated Mar 26, 2024 - Python
WorkRB: Work Research Benchmark
- Updated Dec 16, 2025 - Python
An open-source Python package that uses AI to predict Nigerian languages, including English, Pidgin, Yoruba, Hausa, and Igbo.
- Updated Nov 8, 2025 - Python
TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages
- Updated Feb 25, 2025 - Python
Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.
- Updated Jul 26, 2025 - Jupyter Notebook
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
- Updated Oct 3, 2024 - Python
Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024
- Updated Mar 25, 2025 - Jupyter Notebook
Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization [arXiv 2025]
- Updated Dec 10, 2025 - Python
🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
- Updated Apr 6, 2025 - Python
On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
- Updated Jan 23, 2025 - Python
Cross Lingual Language models for making search engines for Holy Quran and Sahih Hadiths
- Updated Apr 10, 2023 - Jupyter Notebook
This repository provides the official resources for the EMNLP 2025 paper "Grounding Multilingual Multimodal LLMs With Cultural Knowledge".
- Updated Oct 7, 2025 - Python
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
- Updated Nov 24, 2025 - Python