
Tags: Safetensors · Japanese · English · mixtral

Sarashina2-8x70B

This repository provides large language models trained by SB Intuitions.

Required Hardware

BF16 Inference:

  • 16x H100
  • 16x A100 80GB
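
Given the footprint above, a multi-GPU node can shard the BF16 weights automatically. The following is a minimal loading sketch, not an official recipe: it assumes the standard transformers/accelerate loading path, that the checkpoint is accessible under the repository id shown on this page, and that enough GPU memory is visible for device_map="auto" to place all layers.

```python
# Minimal sketch: shard the BF16 checkpoint across all visible GPUs.
# Assumes transformers + accelerate are installed and 16x 80GB GPUs are available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sbintuitions/sarashina2-8x70b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # BF16 inference, as listed above
    device_map="auto",           # let accelerate shard layers across GPUs
)

inputs = tokenizer("吾輩は猫である。", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```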

Model Description

We constructed this Sarashina2-8x70B model, which consists of over 450 billion parameters, by applying the sparse upcycling technique to our Sarashina2-70B model to efficiently build the Mixture-of-Experts model. We trained the Sarashina2-8x70B model using a mix of Japanese and English corpora from web data.
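
For intuition, sparse upcycling initializes every expert of a Mixture-of-Experts layer as a copy of the dense model's feed-forward block, and only the router is trained from scratch. The sketch below is a conceptual PyTorch illustration of that idea, not the actual Sarashina2 construction code; the module names are hypothetical, and top-2 routing over 8 experts is an assumption suggested by the mixtral tag and the 8x70B name.

```python
# Conceptual sketch of sparse upcycling: experts start as exact copies of
# the dense FFN; only the router weights are freshly initialized.
import copy
import torch
import torch.nn as nn

class UpcycledMoE(nn.Module):
    def __init__(self, dense_ffn: nn.Module, d_model: int,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        # Each expert is initialized as a deep copy of the dense FFN.
        self.experts = nn.ModuleList(
            copy.deepcopy(dense_ffn) for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)  # trained from scratch
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); route each token to its top-k experts.
        weights = self.router(x).softmax(dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e
                if mask.any():
                    out[mask] += top_w[mask, k, None] * expert(x[mask])
        return out

# Tiny demo with a hypothetical dense FFN.
dense = nn.Sequential(nn.Linear(16, 64), nn.SiLU(), nn.Linear(64, 16))
moe = UpcycledMoE(dense, d_model=16)
print(moe(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```

Because all experts begin identical to the pretrained FFN, the upcycled model starts near the dense model's quality and only needs continued training to specialize the experts, which is what makes the approach efficient.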

Tokenization

We use a sentencepiece tokenizer with a unigram language model and byte-fallback. We do not apply pre-tokenization with a Japanese tokenizer, so users can feed raw sentences directly into the tokenizer.
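
As a short illustration of feeding raw text straight to the tokenizer, the sketch below assumes the tokenizer loads through transformers' AutoTokenizer with its sentencepiece backend; the sample sentence is arbitrary.

```python
# Minimal sketch: no Japanese pre-tokenization (e.g. MeCab) is needed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sbintuitions/sarashina2-8x70b")

# Pass a raw Japanese sentence directly to the tokenizer.
ids = tokenizer("吾輩は猫である。名前はまだ無い。")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
```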

Ethical Considerations and Limitations

Sarashina2 has not yet been tuned to follow instructions. Therefore, it may generate meaningless sequences, inaccurate content, or biased/objectionable outputs. Before using Sarashina2, we ask developers to tune models based on human preferences and safety considerations.

License

Sarashina Model NonCommercial License Agreement

Model size: 465B params · Tensor type: BF16 (Safetensors)


