- Notifications
You must be signed in to change notification settings - Fork453
[WIP] Remove interleaved thinking for MT thinking model#1135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Summary of ChangesHello@natolambert, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new message formatting template, Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on ourdocumentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on@gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign uphere. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with theGemini Code Assist IDE Extension. Footnotes
|
97f65c4 tofaab9deCompareThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Code Review
This pull request introduces a new chat template,olmo_thinker_no_think, designed to remove interleaved thinking and control when reasoning is emitted. My review focuses on improving the robustness and consistency of this new template. I've identified a potential data loss issue in the parsing logic for the<think> block and an inconsistency in the default system prompt compared to otherolmo templates. The suggested changes aim to make the template more reliable and align it with existing conventions.
| "{% if '</think>' in assistant_content %}" | ||
| "{% set think_split = assistant_content.split('</think>') %}" | ||
| "{% set reasoning_content = think_split[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}" | ||
| "{% set assistant_content = think_split[-1].lstrip('\\n') %}" | ||
| "{% endif %}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The current logic for parsing the<think> block is fragile and can lead to data loss. Specifically, any content before the<think> tag is discarded. For example, if the assistant content isPreamble <think>reasoning</think> Answer, the preamble will be lost.
I suggest a more robust parsing logic that correctly handles content before and after the<think> block, and also ensures both<think> and</think> tags are present before attempting to split.
| "{% if '</think>' in assistant_content %}" | |
| "{% setthink_split= assistant_content.split('</think>') %}" | |
| "{% setreasoning_content = think_split[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}" | |
| "{% setassistant_content =think_split[-1].lstrip('\\n') %}" | |
| "{% endif %}" | |
| "{% if '<think>' in assistant_content and '</think>' in assistant_content %}" | |
| "{% set_before_think, _after_think= assistant_content.split('<think>', 1) %}" | |
| "{% set_reasoning, _after_reasoning = _after_think.split('</think>', 1) %}" | |
| "{% setreasoning_content =_reasoning.strip() %}" | |
| "{%set assistant_content = (_before_think ~ _after_reasoning).lstrip('\\n') %}{%endif %}" |
| "olmo_thinker_no_think": ( | ||
| "{% set has_system = messages|selectattr('role', 'equalto', 'system')|list|length > 0 %}" | ||
| "{% if not has_system %}" | ||
| "{{ '<|im_start|>system\nYou are Olmo, a helpful AI assistant built by Ai2. Your date cutoff is December 2024, and your model weights are available at https://huggingface.co/allenai.<|im_end|>\n' }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The default system prompt forolmo_thinker_no_think is missing information about function-calling capabilities. Otherolmo templates include this, and the rest of this template handles functions. This inconsistency could be confusing. For consistency, I suggest updating the default system prompt to mention functions, similar to otherolmo templates.
| "{{ '<|im_start|>system\nYou are Olmo, a helpful AI assistant built by Ai2. Your date cutoff is December 2024, and your model weights are available at https://huggingface.co/allenai.<|im_end|>\n' }}" | |
| "{{ '<|im_start|>system\nYou are Olmo, a helpful AI assistant built by Ai2. Your date cutoff is December 2024, and your model weights are available at https://huggingface.co/allenai. You do not currently have access to any functions. <functions></functions><|im_end|>\n' }}" |
Uh oh!
There was an error while loading.Please reload this page.
Note
Updates
olmo_thinker_no_thinkto parse assistant messages, extract<think>content, and emit it as a dedicated reasoning block only after the last user turn before the assistant’s final content.open_instruct/dataset_transformation.py:olmo_thinker_no_think:last_user_indexacrossmessages.contentto extract reasoning between<think> ... </think>and the remaining assistant text.'<think>...\n</think>'block before assistant content; otherwise output assistant content only.functions,function_calls, and message boundaries/EOS.Written byCursor Bugbot for commitfaab9de. This will update automatically on new commits. Configurehere.