undertheseanlp/NLP-Vietnamese-progress

Repository to track the progress in Vietnamese Natural Language Processing, including the datasets and the current state-of-the-art for the most common Vietnamese NLP tasks.

Tracking Progress in Vietnamese NLP

This document aims to track the progress in Vietnamese Natural Language Processing and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.

It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there.

Sentence Boundary Disambiguation / Language Detection / Text Normalization / Spelling Correction
Word Segmentation / Part-of-Speech Tagging / Chunking / Parsing
Text Classification / Sentiment Analysis / Word Embeddings
Named Entity Recognition / Relationship Extraction / Event Extraction / Information Extraction / Keyword Extraction
Coreference Resolution / Slot Filling / Entity Linking
Semantics / Semantic Role Labeling / Paraphrase Identification / Natural Language Inference
Machine Translation / Automatic Summarization
Knowledge Representation and Reasoning
Dialog Systems and Chatbots / Language Generation / Question Answering
Automatic Speech Recognition / Text To Speech / Speech Classification / Speech
Optical Text Recognition / Image Captioning
Plagiarism Detection
Resources

Contributing

If you would like to add a new result, you can do so with a pull request (PR). In order to minimize noise and to make maintenance somewhat manageable, results reported in published papers will be preferred (indicate the venue of publication in your PR); an exception may be made for influential preprints. The result should include the name of the method, the citation, the score, and a link to the paper and should be added so that the table is sorted (with the best result on top).

If your pull request contains a new result, please make sure that "new result" appears somewhere in the title of the PR. This way, we can track which tasks are the most active and receive the most attention.

To make reproduction easier, we recommend adding a link to an implementation for each method if available. You can add a Code column (see below) to the table if it does not exist. In the Code column, indicate an official implementation with Official. If an unofficial implementation is available, use Link (see below). If no implementation is available, you can leave the cell empty.

Model	Score	Paper/Source	Code
			Official
			Link
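The sorting and Code-column rules above can be sketched in a few lines of Python. This is only an illustration, not tooling shipped with the repository: the `Result` class, `insert_sorted`, and `render` are hypothetical names, the pipe-delimited table layout is an assumption about how the results files are written, and the sketch assumes that a higher score is better for the metric in question.

```python
from dataclasses import dataclass


@dataclass
class Result:
    model: str
    score: float
    paper: str      # citation or link to the paper
    code: str = ""  # "Official", "Link", or empty when no implementation exists


def insert_sorted(results, new):
    """Return a new list with `new` placed so scores stay in descending order."""
    for i, existing in enumerate(results):
        if new.score > existing.score:
            return results[:i] + [new] + results[i:]
    return results + [new]


def render(results):
    """Render results as a pipe-delimited table (assumed layout)."""
    lines = [
        "| Model | Score | Paper/Source | Code |",
        "| --- | --- | --- | --- |",
    ]
    for r in results:
        lines.append(f"| {r.model} | {r.score} | {r.paper} | {r.code} |")
    return "\n".join(lines)


# Placeholder entries, not real results.
results = [
    Result("Model A", 95.0, "Paper A", "Official"),
    Result("Model B", 90.0, "Paper B"),
]
updated = insert_sorted(results, Result("Model C", 92.5, "Paper C", "Link"))
print(render(updated))
```

For metrics where lower is better (an error rate, for example), the comparison in `insert_sorted` would need to be reversed.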

To add a new dataset or task, follow the steps below. Any new dataset should have been used for evaluation in at least one published paper besides the one that introduced it.

Fork the repository.
If your task is completely new, create a new file and link to it in the table of contents above. If not, add your task or dataset to the respective section of the corresponding file (in alphabetical order).
Briefly describe the dataset/task and include relevant references.
Describe the evaluation setting and evaluation metric.
Show what an annotated example of the dataset/task looks like.
Add a download link if available.
Copy the table below and fill in at least two results (including the state-of-the-art) for your dataset/task (change Score to the metric of your dataset).
Submit your change as a pull request.

Model	Score	Paper/Source	Code
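As a rough illustration of the steps above, the following Python sketch assembles a section for a new dataset or task and refuses to build it with fewer than two results. The function name `new_task_section`, its parameters, and the tab-separated layout are all hypothetical choices for this example. The repository does not prescribe any such helper, and the sketch assumes that a higher score is better.

```python
def new_task_section(name, description, evaluation, example, metric,
                     results, download=None):
    """Assemble a plain-text section for a new dataset/task.

    `results` is a list of (model, score, paper, code) tuples.
    """
    if len(results) < 2:
        raise ValueError(
            "fill in at least two results, including the state-of-the-art"
        )
    # Best result on top; assumes a higher score is better for this metric.
    rows = sorted(results, key=lambda r: r[1], reverse=True)
    lines = [name, "", description, "", evaluation, "", example, ""]
    if download:
        lines += [download, ""]
    # The generic "Score" header is replaced by the dataset's own metric.
    lines.append("\t".join(["Model", metric, "Paper/Source", "Code"]))
    lines += ["\t".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)


# Placeholder values only; none of these are real datasets or results.
section = new_task_section(
    name="Toy Task",
    description="One-paragraph description with references.",
    evaluation="Evaluation setting and metric.",
    example="An annotated example.",
    metric="F1",
    results=[("B", 80.0, "Paper B", ""), ("A", 85.5, "Paper A", "Official")],
)
print(section)
```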
