Modeling, training, eval, and inference code for OLMo
OLMo is a repository for training and using AI2's state-of-the-art open language models. It is designed by scientists, for scientists.
First, install PyTorch following the instructions specific to your operating system.
For training and fine-tuning, we recommend installing from source:
```bash
git clone https://github.com/allenai/OLMo.git
cd OLMo
pip install -e .[all]
```

You can also install from PyPI with:

```bash
pip install ai2-olmo
```
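As a quick sanity check (not part of the official instructions), you can verify that PyTorch and the `olmo` package import cleanly after installation:

```python
# Sanity check: PyTorch imports and the `olmo` package is importable after
# `pip install -e .[all]` or `pip install ai2-olmo`.
import torch
import olmo  # noqa: F401

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```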
OLMo pretraining follows a two-stage training procedure. In the first stage, we train on large amounts of mostly web-based data: OLMo-mix-1124. In the second stage, we train on a smaller amount of high-quality, targeted data: Dolmino-mix-1124.
You can find all the checkpoints, at minimum every 1000 training steps, in OLMo core and Hugging Face format:
| Variant | OLMo Format (Stage 1) | OLMo Format (Stage 2) | Hugging Face Format |
|---|---|---|---|
| OLMo-2 1B | OLMo-2 1B | OLMo-2 1B | Hugging Face for the 1B variant |
| OLMo-2 7B | OLMo-2 7B | OLMo-2 7B | Hugging Face for the 7B variant |
| OLMo-2 13B | OLMo-2 13B | OLMo-2 13B | Hugging Face for the 13B variant |
| OLMo-2 32B | OLMo-2 32B | OLMo-2 32B | Hugging Face for the 32B variant |
Note: The 32B variant was trained on our new trainer. To train or fine-tune OLMo-2 32B, visit OLMo-core.
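Intermediate checkpoints in Hugging Face format are published as revisions (branches) of the model repositories, so you can load one by name with `from_pretrained`. A minimal sketch, where the revision string is a placeholder (check the model repo's branch list for the real names):

```python
# Sketch: load an intermediate Hugging Face checkpoint by revision.
from transformers import AutoModelForCausalLM, AutoTokenizer

revision = "stage1-step140000-tokens294B"  # placeholder; use a real branch name
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B", revision=revision)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B", revision=revision)
```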
To reproduce any of the training processes described below, run this:
```bash
torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config}
```

For the training config, use any of the configs listed below.
If you want to override any of the settings in the training config without having to write a new config every time, you can do this:
```bash
torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config} \
  --setting1=value \
  --setting2=value \
  --setting3.subsetting1=value
```

The training configs below refer to training data that gets streamed in live over HTTP. To reproduce at large scale, we recommend downloading the files locally and changing the paths to point to your local file system.
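As a rough sketch of that mirroring step (assuming the config lists its training files as HTTP URLs under `data.paths`, as the published OLMo YAML configs do; the paths below are illustrative):

```python
# Hedged sketch: download the HTTP-hosted training files referenced by a config
# so the config can be pointed at local copies instead.
import os
import urllib.request

import yaml

config_path = "configs/OLMo2-7B-stage1.yaml"  # illustrative; use the config you are reproducing
local_dir = "/data/olmo-mix"                  # illustrative local destination

with open(config_path) as f:
    config = yaml.safe_load(f)

for url in config["data"]["paths"]:
    dest = os.path.join(local_dir, url.split("://", 1)[1])  # mirror the remote layout
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
# After downloading, replace the URLs under data.paths with the local paths.
```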
```bash
python scripts/train.py {path_to_train_config}
```

Example:

```bash
python scripts/train.py configs/tiny/OLMo-20M.yaml --save_overwrite
```
Note: You need to upgrade PyTorch to 2.5.x to run this.
Stage 1 is the biggest stage, where we train on 4T or 5T tokens of largely web-based data.
| | OLMo2 1B | OLMo2 7B | OLMo2 13B |
|---|---|---|---|
| Number of tokens | 4 Trillion | 4 Trillion | 5 Trillion |
| Checkpoint | stage1-step1907359-tokens4001B | stage1-step928646-tokens3896B | stage1-step596057-tokens5001B |
| Training config | OLMo2-1B-stage1.yaml | OLMo2-7B-stage1.yaml | OLMo2-13B-stage1.yaml |
| WandB | wandb.ai/OLMo2-1B | wandb.ai/OLMo2-7B | wandb.ai/OLMo2-13B |
You can find the .csv.gz files containing the training data here.
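If you just want to peek at those files, pandas can read a .csv.gz directly (a sketch; the filename is a placeholder and the exact columns depend on the file):

```python
# Sketch: inspect one of the .csv.gz training-data files with pandas
# (gzip decompression is handled automatically based on the extension).
import pandas as pd

df = pd.read_csv("stage1-data.csv.gz")  # placeholder filename
print(df.columns.tolist())
print(df.head())
```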
For the 1B model, we trained three times with different data order on 50B high-quality tokens, and used the last checkpoint of the seed-42 run as the final checkpoint.
| | Checkpoint | Training config | WandB |
|---|---|---|---|
| random seed 42069 | stage2-ingredient1-step23852-tokens51B | OLMo2-1B-stage2-seed42069.yaml | wandb.ai/OLMo2-1B |
| random seed 666 | stage2-ingredient2-step23852-tokens51B | OLMo2-1B-stage2-seed666.yaml | wandb.ai/OLMo2-1B |
| random seed 42 (main) | stage2-ingredient3-step23852-tokens51B | OLMo2-1B-stage2-seed42.yaml | wandb.ai/OLMo2-1B |
For the 7B model, we train three times with different data order on 50B high-quality tokens, and then average ("soup") the models.
| | Checkpoint | Training config | WandB |
|---|---|---|---|
| random seed 42 | stage2-ingredient1-step11931-tokens50B | OLMo2-7B-stage2-seed42.yaml | wandb.ai/OLMo2-7B |
| random seed 42069 | stage2-ingredient2-step11931-tokens50B | OLMo2-7B-stage2-seed42069.yaml | wandb.ai/OLMo2-7B |
| random seed 666 | stage2-ingredient3-step11931-tokens50B | OLMo2-7B-stage2-seed666.yaml | wandb.ai/OLMo2-7B |
| final souped model | main | no config, we just averaged the weights in Python | |
The training configs linked here are set up to download the latest checkpoint after stage 1, and start training from there.
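The "souping" step itself is plain parameter averaging. A rough sketch (not the exact script we used; the revision names are placeholders for the stage-2 ingredient checkpoints):

```python
# Sketch: average ("soup") the parameters of several checkpoints of the same architecture.
import torch
from transformers import AutoModelForCausalLM

# Placeholder revision names for the stage-2 ingredients.
revisions = ["stage2-ingredient1", "stage2-ingredient2", "stage2-ingredient3"]
models = [
    AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B", revision=r)
    for r in revisions
]

souped = models[0]
with torch.no_grad():
    for name, param in souped.named_parameters():
        stacked = torch.stack([dict(m.named_parameters())[name] for m in models])
        param.copy_(stacked.mean(dim=0))

souped.save_pretrained("olmo2-7b-souped")
```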
For the 13B model, we train three times with different data order on 100B high-quality tokens, and one more time on 300B high-quality tokens. Then we average ("soup") the models.
| | Checkpoint | Training config | WandB |
|---|---|---|---|
| random seed 1110, 100B | stage2-ingredient1-step11931-tokens100B | OLMo2-13B-stage2-seed1110-100B.yaml | wandb.ai/OLMo2-13B |
| random seed 2662, 100B | stage2-ingredient2-step11931-tokens100B | OLMo2-13B-stage2-seed2662-100B.yaml | wandb.ai/OLMo2-13B |
| random seed 6209, 100B | stage2-ingredient3-step11931-tokens100B | OLMo2-13B-stage2-seed6209-100B.yaml | wandb.ai/OLMo2-13B |
| random seed 2662, 300B | stage2-ingredient4-step11931-tokens300B | OLMo2-13B-stage2-seed2662-300B.yaml | wandb.ai/OLMo2-13B |
| final souped model | main | no config, we just averaged the weights in Python | |
The training configs linked here are set up to download the latest checkpoints after stage 1, and start training from there.
Note: You can find all the information about the 32B in the OLMo-core repository.
For instruction-tuned variants of these models, see the OLMo 2 Instruct models.
You can use our Hugging Face integration to run inference on the OLMo Transformers checkpoints:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0425-1B")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0425-1B")
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)
# optionally, move the inputs and model to CUDA
# inputs = {k: v.to('cuda') for k, v in inputs.items()}
# olmo = olmo.to('cuda')
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```
Alternatively, with the Hugging Face pipeline abstraction:
```python
from transformers import pipeline

olmo_pipe = pipeline("text-generation", model="allenai/OLMo-2-0425-1B")
print(olmo_pipe("Language modeling is"))
```
You can also load the model quantized for inference (requires bitsandbytes):

```python
import torch
from transformers import AutoModelForCausalLM

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-0425-1B",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires bitsandbytes
)
```
The quantized model is sensitive to input types and CUDA handling. To avoid potential issues, we recommend explicitly converting input IDs to CUDA using `inputs.input_ids.to('cuda')`.
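Putting that together, a minimal sketch of generation with the 8-bit model (reusing `olmo` and `tokenizer` from the snippets above):

```python
# Tokenize on CPU, then move the input IDs to CUDA explicitly before generating.
inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
input_ids = inputs.input_ids.to("cuda")
response = olmo.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```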
Additional tools for evaluating OLMo models are available at the OLMo Eval and olmes repositories.
An example script is provided for hosting an OLMo 2 model on Modal.com using the OpenAI API in `./scripts/olmo2_modal_openai.py`. To run that:
- Follow the instructions under Getting Started in the Modal.com Guide to install the Modal library and command line tools.
- Follow the instructions under Secrets in the Modal.com Guide to create a Modal secret named "example-secret-token" that defines a value for the variable MODAL_TOKEN for your server.
- Then run:

  ```bash
  modal deploy ./scripts/olmo2_modal_openai.py
  ```
You can check your endpoint using curl similar to the following:
```bash
curl -X POST \
  -H "Authorization: Bearer [the secret token from above]" \
  -H "Content-Type: application/json" \
  -d @body.json \
  https://[the web endpoint modal creates above]/v1/chat/completions
```
where `body.json` is of the form:
{ "model": "OLMo-2-1124-13B-Instruct", "messages": [ { "role": "user", "content": "Who was Alan Turing?" } ], "max_tokens": 100, "temperature": 0.9, "stream": true}@misc{olmo20242olmo2furious,title={2 OLMo 2 Furious},author={Team OLMo and Pete Walsh and Luca Soldaini and Dirk Groeneveld and Kyle Lo and Shane Arora and Akshita Bhagia and Yuling Gu and Shengyi Huang and Matt Jordan and Nathan Lambert and Dustin Schwenk and Oyvind Tafjord and Taira Anderson and David Atkinson and Faeze Brahman and Christopher Clark and Pradeep Dasigi and Nouha Dziri and Michal Guerquin and Hamish Ivison and Pang Wei Koh and Jiacheng Liu and Saumya Malik and William Merrill and Lester James V. Miranda and Jacob Morrison and Tyler Murray and Crystal Nam and Valentina Pyatkin and Aman Rangapur and Michael Schmitz and Sam Skjonsberg and David Wadden and Christopher Wilhelm and Michael Wilson and Luke Zettlemoyer and Ali Farhadi and Noah A. Smith and Hannaneh Hajishirzi},year={2024},eprint={2501.00656},archivePrefix={arXiv},primaryClass={cs.CL},url={https://arxiv.org/abs/2501.00656}, }