
AI model designed to test the effectiveness in handling external ethical attacks.

LLaMA-3 Joker

Introduction

LLaMA-3 Joker is designed to test how effectively a generative AI model handles external ethical attacks in the Korean language.

In the past few years, there has been a significant push towards ensuring the safety and ethical behavior of LLMs. Researchers often rely on human evaluators for qualitative assessments: people spend all day typing offensive text into an input box and evaluating the model's responses. This is very time-consuming and, more importantly, mentally exhausting. We aim to solve this problem by replacing human attacks with Joker, an AI model that simulates external attacks, and by automating the collection of your model's responses. Instead of creatively crafting insults, people only need to evaluate the model's responses.
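The workflow described above can be sketched as a small loop: an attacker produces an adversarial prompt, the model under test answers it, and the pair is stored for a human to review later. This is only an illustrative sketch, not code from this repository. The names `Exchange`, `collect_exchanges`, `stub_attacker`, and `stub_target` are hypothetical, and the idea that the attacker takes a seed topic is an assumption; in practice the two callables would wrap Joker and your own model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Exchange:
    """One attack prompt and the response of the model under test."""
    attack: str
    response: str


def collect_exchanges(
    attacker: Callable[[str], str],
    target: Callable[[str], str],
    seeds: List[str],
) -> List[Exchange]:
    """Generate one attack per seed and record the target model's reply."""
    exchanges = []
    for seed in seeds:
        attack = attacker(seed)    # hypothetical: Joker would generate this text
        response = target(attack)  # the model being evaluated answers the attack
        exchanges.append(Exchange(attack=attack, response=response))
    return exchanges


# Placeholder callables so the sketch runs without any model weights.
def stub_attacker(seed: str) -> str:
    return f"[adversarial prompt about {seed}]"


def stub_target(prompt: str) -> str:
    return f"[response to {prompt}]"


exchanges = collect_exchanges(stub_attacker, stub_target, ["topic A", "topic B"])
for ex in exchanges:
    print(ex.attack, "->", ex.response)
```

Human evaluators would then read only the stored `response` fields, which is the part of the work that remains manual.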

LLaMA-3 Joker

We fine-tuned the checkpoint of Meta's publicly released LLaMA-3 model (8B) with the AI Hub's '텍스트 윤리검증 데이터' (Text Ethics Verification Data) dataset. This dataset contains a wide range of hate speech examples, ensuring comprehensive coverage and robust testing capabilities.

Demo

You can interact with the model on our demo page. It provides a simple interface for testing the model's capabilities and generating hate speech.

Download

To download the model weights and tokenizer, please visit the Hugging Face model hub and follow the instructions.

Due to the potential for misuse, Joker's checkpoint is accessible only to organizations, not to individuals. Please submit your request through Hugging Face's model access request system. Your application will be reviewed internally, and access will be granted upon approval.

Warning

Using this model for purposes that violate common sense and social norms is prohibited, and all responsibility for such misuse lies with the user.

Quick Start

You can follow the steps below to get up and running with LLaMA-3 Joker quickly.

Clone this repository in a conda environment with PyTorch and CUDA available.
Run:

$ pip install -e .

Visit the Hugging Face model hub and register to download the model.
Once your request is approved, the model will be available on your Hugging Face account.
Run example.py to interact with the model:

$ python3 example.py
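The contents of example.py are not shown here, so the following is only a rough sketch of how a gated checkpoint is typically loaded with the Hugging Face `transformers` library, not a copy of that script. The model ID is taken from the repository's model page address; the function name `generate_text`, the prompt handling, and the sampling settings are assumptions, and Joker may expect a specific prompt format that is not documented here. Access must already be approved and you must be logged in (for example with `huggingface-cli login`).

```python
# Assumption: model ID inferred from the model page huggingface.co/tunib/llama-3-joker
MODEL_ID = "tunib/llama-3-joker"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and sample a continuation for `prompt` (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use, so it needs network access, enough memory for an 8B model, and an approved access request.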

Citation

If you use this library in a project or in research, please cite our code:

@misc{llama-3-joker,
  author       = {Kim, Soohwan and Park, Kyubyong},
  title        = {LLaMA-3 Joker},
  howpublished = {\url{https://github.com/tunib-ai/joker}},
  year         = {2024}
}

Contact

For any inquiries or issues, please contact us at tunibridge@tunib.ai.

Acknowledgements

GPUs for training Joker were provided by Common Computer.

About

Model page: huggingface.co/tunib/llama-3-joker
