Undfined/continual pt #146

Draft
undfined wants to merge 148 commits into main from undfined/continual-pt
Changes from 1 commit
148 commits
2740d49
add round1 anneal configs
aettingJun 26, 2025
39da9f2
Pin to swafix and add smoketest config
undfinedJun 26, 2025
e765967
Trainer config updates
undfinedJun 26, 2025
14233fa
More fixes
undfinedJun 26, 2025
d753873
oops
undfinedJun 26, 2025
528bb19
More config tweaks
undfinedJun 26, 2025
610a9de
Imports
undfinedJun 26, 2025
2a3e8cb
Fix for WSD class bug
undfinedJun 26, 2025
068b776
Match ac_config from swafix
undfinedJun 26, 2025
613ab1a
Match sliding window changes
undfinedJun 27, 2025
4bbb982
More shenans
undfinedJun 27, 2025
65eb9e4
Typo
undfinedJun 27, 2025
a1ac0da
comment
undfinedJun 27, 2025
1353c5a
Can't load state with new dataset
undfinedJun 27, 2025
7c523ee
OOM
undfinedJun 27, 2025
b9df024
olmo3 settings and new paths
aettingJun 30, 2025
09f3a38
resources and web name
aettingJun 30, 2025
13ef591
new web paths
aettingJun 30, 2025
7b2afe8
Use improved scheduler branch
undfinedJul 1, 2025
1b35a4d
Merge branch 'undfined/swafix-core' into olmo3-anneals
aettingJul 1, 2025
335283b
update round1 anneal paths (missing two)
aettingJul 1, 2025
8d834bb
update example configs
aettingJul 1, 2025
657084e
add web paths
aettingJul 1, 2025
c167043
Merge branch 'undfined/swafix-core' into olmo3-anneals
aettingJul 1, 2025
de07e7a
consistency updates
aettingJul 1, 2025
71cf247
Use new dolmino math and update weights
undfinedJul 1, 2025
6104825
Allow repetitions in hqweb
undfinedJul 1, 2025
953ae4a
Not enough tokens for dolmino
undfinedJul 1, 2025
e614cae
Adjust reddit target
undfinedJul 2, 2025
70684e2
Try double rbz
undfinedJul 2, 2025
95859f9
oops
undfinedJul 2, 2025
ec6a6bc
Back to 8192 rbz
undfinedJul 2, 2025
106f175
try with float8
undfinedJul 2, 2025
035d19a
Newer torch
undfinedJul 2, 2025
36cc983
dp tweaks
undfinedJul 2, 2025
2520bdb
match pretrain
undfinedJul 2, 2025
aee96c2
More tweaks for large job
undfinedJul 2, 2025
8b085bd
baseline dolmino anneal config
aettingJul 2, 2025
d897712
paths bucket and format fix
aettingJul 2, 2025
95ef206
mj anneals rd1
Jul 3, 2025
4339fdd
Tweaks for mj anneals
undfinedJul 3, 2025
a0544c2
Rejiggered ratios for OMR rewrites
Jul 3, 2025
e3b8261
merge
Jul 3, 2025
a562e4c
restore trainer state from save folder
epwalshJul 3, 2025
8520a13
Merge pull request #127 from allenai/epwalsh/olmo3-anneals
aettingJul 3, 2025
dba258a
update example priority
aettingJul 3, 2025
2979e40
restore model_and_optim
aettingJul 3, 2025
a209758
add lr-test-config
aettingJul 3, 2025
ba57e1f
Added a bunch of nanoanneals
Jul 3, 2025
e595476
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 3, 2025
7983868
fixed typo
Jul 3, 2025
9fe0df4
added submodular dolmino math curves
Jul 3, 2025
70a1a94
path format consistency
aettingJul 4, 2025
a11ea98
fix name
aettingJul 4, 2025
47211e3
added gs->weka tool
Jul 7, 2025
14d46c9
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 7, 2025
148b539
idk this is some luca thing maybe\?
Jul 7, 2025
6cbe734
Added convert from config
Jul 7, 2025
c78dc1e
diff convert
Jul 7, 2025
acbd449
tyler wanted me to do this, idk
Jul 7, 2025
fd233ad
Merge branch 'main' into olmo3-anneals
Jul 7, 2025
de49cd6
Adds v2 hq fim stackedu microanneal
undfinedJul 7, 2025
cd707d9
convert with custom branching
Jul 7, 2025
14b95d3
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 7, 2025
90ac849
Adds v2++ hq fim stackedu microanneal
undfinedJul 7, 2025
1670251
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
undfinedJul 7, 2025
ba62835
add 7T anneals
aettingJul 8, 2025
b41ca95
step update
aettingJul 8, 2025
d06c704
fix run names
aettingJul 8, 2025
12befcd
add ae microanneals
aettingJul 9, 2025
6d42d1d
bump up rank microbatch size
aettingJul 9, 2025
e7e55ac
Added eval script for midtraining
Jul 9, 2025
40408cf
uncomment
Jul 9, 2025
5ed2c7b
merge
Jul 9, 2025
d849517
Update README.md
revbucketJul 9, 2025
9dc008a
add testrun
aettingJul 9, 2025
d5f4d88
Rename olmo2 anneals and add olmo3-fim-code configs
undfinedJul 9, 2025
b1dd917
Added 'missing eval' stuff
Jul 9, 2025
f0eeaa6
Too many workers counting tokens
undfinedJul 9, 2025
8eb33fc
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 9, 2025
c471589
merged davidhs backfill stuff
Jul 9, 2025
8645829
increment eval version
Jul 9, 2025
0657fb0
Wrong weight for hqweb
undfinedJul 10, 2025
29a3653
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
undfinedJul 10, 2025
38aeeea
add highthresh diverse qa config
aettingJul 10, 2025
6fd7250
Added mjnewmath-bestof
Jul 10, 2025
aba11c2
add wip anneal round 2 config
aettingJul 11, 2025
2144071
added kodkode mjicroanneals
Jul 11, 2025
3f32ad1
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 11, 2025
079cdf5
update reasoning and math ratios
aettingJul 11, 2025
8ae28c7
add round2 8T
aettingJul 11, 2025
371e551
8 nodes
aettingJul 11, 2025
ed104ab
update reasoning paths and run names
aettingJul 12, 2025
687bce5
updated code path
aettingJul 12, 2025
fc81e62
updated paths
soldniJul 12, 2025
830c5c1
adjusting ratios
soldniJul 12, 2025
35d7cde
merged main
Jul 14, 2025
d2cd5a9
add follow-up reasoning microanneals
aettingJul 14, 2025
e8dc411
Adds 10b anneal with 35/30/35 web/code/etc ratios
undfinedJul 14, 2025
984e751
add reddit lowthresh663 microanneal
aettingJul 14, 2025
5773e19
added megamath-web-pro-max anneals
Jul 15, 2025
7182650
more reddit lowthresh microanneals
aettingJul 15, 2025
647294d
fix nonmc name
aettingJul 16, 2025
bee78f9
cleaned up
Jul 16, 2025
0b90f6b
Merge branch 'olmo3-anneals' of github.com:allenai/olmo-cookbook into…
Jul 16, 2025
b108729
remove model_and_optim in load_path
aettingJul 16, 2025
c18c541
add round 3 macroanneal configs
aettingJul 18, 2025
bc2c081
add 12T configs
aettingJul 21, 2025
31e480b
lowthresh mcplusfull
aettingJul 21, 2025
bcf3459
added rewrite checks
Jul 21, 2025
aa985bf
convert from config hashes updated
Jul 21, 2025
d187de5
Added swallow anneals
Jul 21, 2025
e66c626
lowthresh add context v1
aettingJul 22, 2025
0825df8
fix path
aettingJul 22, 2025
68c6fee
add 200B round3 config
aettingJul 25, 2025
0401ec8
more web paths
aettingJul 25, 2025
6ef5068
adjust math ratios and code path
aettingJul 26, 2025
a9de822
adjust reasoning ratios
aettingJul 26, 2025
795e26d
adjust reasoning ratios
aettingJul 26, 2025
2cf4c9c
update name
aettingJul 28, 2025
b12faae
16 nodes
aettingJul 28, 2025
59d42fc
add omr fullthoughts baseline
aettingJul 28, 2025
fafdcc5
psgqa microanneal
aettingJul 30, 2025
fa8d330
psgqa microanneal name
aettingJul 30, 2025
da8433d
psgqa microanneal name
aettingJul 30, 2025
30d25b3
add no reasoning no instruct
aettingJul 31, 2025
fbf6ff9
add no reasoning no instruct
aettingJul 31, 2025
b50797f
add nodes
aettingJul 31, 2025
a5d0d3b
fix dolmino ratio
aettingJul 31, 2025
65f7ebf
add sub8k llamanemotron
aettingJul 31, 2025
cc1f690
Added check of swallowmatt stuff
Jul 31, 2025
d79785a
bumped nodes on fm4p
Jul 31, 2025
c71cd1a
Added megamatt test anneals
Jul 31, 2025
05249d8
changed names
Jul 31, 2025
78bb0e4
some more swallowmath diversity experiments
Aug 1, 2025
5193269
correct token counts
Aug 1, 2025
ba1f52d
Adds changes to support continual pretrain of olmo3
undfinedAug 1, 2025
c6f53f2
Fix config
undfinedAug 1, 2025
33a52e2
Try with diff path
undfinedAug 1, 2025
e24a6d6
Can't load state with new dataset
undfinedAug 1, 2025
8703207
OOM fix maybe
undfinedAug 1, 2025
620e3ef
Div by 0 not gud
undfinedAug 1, 2025
98352fc
Must set warmup
undfinedAug 1, 2025
a814101
Tweaks
undfinedAug 1, 2025
336347e
fix model validation
undfinedAug 1, 2025
b5e84a7
Try decay for a single token
undfinedAug 1, 2025
14a8dd4
set lr floor for decay
undfinedAug 1, 2025
b048115
duh
undfinedAug 1, 2025
psgqa microanneal
aetting committed Jul 30, 2025
commit fafdcc5bdcee67c2c30d4531812f4b45d17f5664
@@ -0,0 +1,33 @@
name: "webv19-redditv1-10B-olmo3-microanneal"
description: "OLMo3 7b 10B web v18 + reddit v1 microanneal"
budget: "ai2/oe-base"
workspace: "ai2/olmo-3-microanneals"
nodes: 4
gpus: 8
preemptible: true
max_tokens: 10_000_000_000
global_batch_size: 2097152
sequence_length: 8192
seed: 1337
model: "olmo2_7B_swafix"
tokenizer: "dolma2"
priority: high
cluster: ai2/augusta-google-1
rank_microbatch_size: 16384
scheduler_type: linear
warmup_steps: 0
activation_checkpointing: true
annealing:
  enabled: true
load_path: gs://ai2-llm/checkpoints/OLMo3-7B-swafix/step289000
load_state: false
dataset:
  sources:
  - name: web
    target_ratio: 0.58
    paths:
    - s3://ai2-llm/preprocessed/cc_all_dressed/all_dressed_v3_subsamples/midtrain_pools/6B/allenai/dolma2-tokenizer/*.npy
  - name: psgqa
    target_ratio: 0.42
    paths:
    - s3://ai2-llm/preprocessed/wiki_psgqa_rewrites/psgqa_rewrites_v1/allenai/dolma2-tokenizer/*.npy
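
Reviewer note: a rough sanity check of the numbers in this config, sketched in Python. It assumes that `global_batch_size` and `rank_microbatch_size` are counted in tokens, that the step count is `max_tokens` floor-divided by `global_batch_size`, and that each `target_ratio` is a fraction of `max_tokens`. None of that is stated in the diff itself, and the `summarize` helper is hypothetical, not part of the cookbook.

```python
import math

# Values copied from the config in this commit.
config = {
    "nodes": 4,
    "gpus": 8,
    "max_tokens": 10_000_000_000,
    "global_batch_size": 2_097_152,
    "sequence_length": 8192,
    "rank_microbatch_size": 16_384,
    "sources": {"web": 0.58, "psgqa": 0.42},
}


def summarize(cfg):
    """Derive batch and mixture quantities (hypothetical helper, see assumptions above)."""
    world_size = cfg["nodes"] * cfg["gpus"]

    # Assumption: batch sizes are expressed in tokens.
    if cfg["global_batch_size"] % cfg["sequence_length"] != 0:
        raise ValueError("global_batch_size is not a multiple of sequence_length")
    tokens_per_micro_step = world_size * cfg["rank_microbatch_size"]
    if cfg["global_batch_size"] % tokens_per_micro_step != 0:
        raise ValueError("global batch does not divide evenly across ranks")

    # The mixture ratios should sum to 1.
    ratio_sum = sum(cfg["sources"].values())
    if not math.isclose(ratio_sum, 1.0):
        raise ValueError(f"target ratios sum to {ratio_sum}, expected 1.0")

    return {
        "world_size": world_size,
        "sequences_per_batch": cfg["global_batch_size"] // cfg["sequence_length"],
        # Assumption: the step count is floored; the trainer may round differently.
        "steps": cfg["max_tokens"] // cfg["global_batch_size"],
        "grad_accum_steps": cfg["global_batch_size"] // tokens_per_micro_step,
        "token_targets": {
            name: round(ratio * cfg["max_tokens"])
            for name, ratio in cfg["sources"].items()
        },
    }


summary = summarize(config)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under those assumptions, 4 nodes of 8 GPUs give 32 ranks, each global batch holds 256 sequences of 8192 tokens, and each rank accumulates over 4 microbatches per step. The mixture works out to roughly 5.8B web tokens and 4.2B psgqa tokens.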
