christospi/glc-nllp-21

Code and data for the NLLP 2021 paper: `Multi-granular Legal Topic Classification on Greek Legislation`

arxiv.org/abs/2109.15298

Folders and files

RAPTARCHIS47k
gltc.data.parser
gltc.nlp.training
.gitignore
README.md

Multi-granular Legal Topic Classification on Greek Legislation

Code and data repo for the paper "Multi-granular Legal Topic Classification on Greek Legislation", presented at the NLLP 2021 workshop, co-located with EMNLP 2021.

The dataset is available at HuggingFace 🤗: https://huggingface.co/datasets/greek_legal_code
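
As a rough illustration of how the dataset could be pulled from the Hub, here is a minimal sketch using the `datasets` library. The dataset id is taken from the URL above. The three configuration names (`volume`, `chapter`, `subject`) and the helper name `load_greek_legal_code` are assumptions made for this example and are not stated in this README, so check the dataset card before relying on them.

```python
from typing import Any

# Assumed configuration names, one per label granularity.
# They are not stated in this README; verify them on the dataset card.
GRANULARITIES = ("volume", "chapter", "subject")


def load_greek_legal_code(granularity: str = "volume") -> Any:
    """Load the dataset from the Hugging Face Hub at one label granularity.

    `load_greek_legal_code` is a hypothetical helper written for this sketch.
    """
    if granularity not in GRANULARITIES:
        raise ValueError(
            f"unknown granularity {granularity!r}; expected one of {GRANULARITIES}"
        )
    # Imported lazily so the argument check works without the library installed.
    from datasets import load_dataset

    # Dataset id copied from the URL above; newer library versions may
    # expect a namespaced id instead.
    return load_dataset("greek_legal_code", granularity)
```

Calling `load_greek_legal_code("chapter")` needs network access and the `datasets` package installed. An unknown granularity raises `ValueError` before anything is downloaded.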

Abstract

In this work, we study the task of classifying legal texts written in the Greek language. We introduce and make publicly available a novel dataset based on Greek legislation, consisting of more than 47 thousand official, categorized Greek legislation resources. We experiment with this dataset and evaluate a battery of advanced methods and classifiers, ranging from traditional machine learning and RNN-based methods to state-of-the-art Transformer-based methods.

We show that recurrent architectures with domain-specific word embeddings offer improved overall performance while being competitive even to transformer-based models. Finally, we show that cutting-edge multilingual and monolingual transformer-based models brawl on the top of the classifiers’ ranking, making us question the necessity of training monolingual transfer learning models as a rule of thumb.

To the best of our knowledge, this is the first time the task of Greek legal text classification is considered in an open research project, while also Greek is a language with very limited NLP resources in general.

Code

Contact me via email for code access.

Word2Vec embeddings are available at: http://legislation.di.uoa.gr/publications/ner_word2vec

Note:
The NLP training scripts are based on the "Large-Scale Multi-Label Text Classification on EU Legislation" project. For the full code and project structure, follow the lmtc-eurlex57k project instructions at: https://github.com/iliaschalkidis/lmtc-eurlex57k

