This repository was archived by the owner on Jul 4, 2025. It is now read-only.
Hostfix: remove not needed params from load_model #2209
Merged
The --pooling flag was removed because mean pooling is not needed in chat models. This fixes the regression.
Adds support for the ctx_len parameter by appending --ctx-size with its value. Also removes outdated parameter mappings from kParamsMap so that it reflects the current implementation.
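The ctx_len handling described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the helper name `AppendCtxSize` and the string-to-string parameter container are assumptions made for illustration. Only the `ctx_len` key and the `--ctx-size` flag come from the commit message.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): if "ctx_len" is present and
// non-empty, append "--ctx-size" followed by its value to the argument list.
inline void AppendCtxSize(const std::map<std::string, std::string>& params,
                          std::vector<std::string>& args) {
  auto it = params.find("ctx_len");
  if (it == params.end() || it->second.empty()) {
    return;  // nothing to forward
  }
  args.push_back("--ctx-size");
  args.push_back(it->second);
}
```

Passing the flag and its value as two separate arguments avoids embedding whitespace inside a single argument, which is the kind of whitespace issue a later commit mentions fixing.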
When the model path contains both "jan" and "nano" (case-insensitive), automatically adds speculative decoding parameters to adjust generation behavior. This improves flexibility by enabling environment-specific configurations without manual parameter tuning. Also includes necessary headers for string manipulation and fixes whitespace in ctx_len handling.
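The case-insensitive path check could be sketched as below. The function names `ToLower` and `IsJanNanoPath` are hypothetical, and the sketch covers only the detection step. The parameters that get appended afterwards are not reproduced here, because the commit message does not list them at this point.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return a lower-cased copy of the input string.
inline std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical check: true when the path contains both "jan" and "nano",
// ignoring case. This is a plain substring match, so a path such as
// "january/nano.gguf" would also match.
inline bool IsJanNanoPath(const std::string& model_path) {
  const std::string lower = ToLower(model_path);
  return lower.find("jan") != std::string::npos &&
         lower.find("nano") != std::string::npos;
}
```

The lambda takes `unsigned char` because `std::tolower` has undefined behavior for negative `char` values, which can occur in paths with non-ASCII bytes.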
The comment was redundant as the code's purpose is clear without it, improving readability.
This commit introduces new configuration parameters and their corresponding command-line flags for the local engine. The changes include:
- Adding "flash_attn" to ignored parameters
- Mapping UI parameters to CLI flags (e.g., cpu_threads → --threads)
- Expanding support for various model configuration options
These additions enhance the flexibility of the local engine by enabling fine-grained control over performance and behavior through both UI and CLI interfaces.
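A table-driven conversion along the lines listed above might look like this. It is a sketch under assumptions: the names `kIgnoredParamsSketch`, `kParamsMapSketch`, and `ConvertParamsToArgs` are illustrative, and the tables hold only the entries the commit message names ("flash_attn" as ignored, cpu_threads → --threads). Skipping unmapped keys is also an assumption; the actual code may treat them differently.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Illustrative tables; the real lists in the PR are longer.
const std::set<std::string> kIgnoredParamsSketch = {"flash_attn"};
const std::map<std::string, std::string> kParamsMapSketch = {
    {"cpu_threads", "--threads"},
};

// Hypothetical converter: turn UI parameters into CLI arguments.
inline std::vector<std::string> ConvertParamsToArgs(
    const std::map<std::string, std::string>& params) {
  std::vector<std::string> args;
  for (const auto& kv : params) {
    if (kIgnoredParamsSketch.count(kv.first) > 0) {
      continue;  // explicitly ignored parameter
    }
    auto it = kParamsMapSketch.find(kv.first);
    if (it == kParamsMapSketch.end()) {
      continue;  // assumption: unmapped keys are skipped in this sketch
    }
    args.push_back(it->second);  // CLI flag, e.g. "--threads"
    args.push_back(kv.second);   // its value
  }
  return args;
}
```

Keeping the ignore list and the mapping as data makes later changes, such as adding or removing a parameter, a one-line edit to a table with no new branching logic.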
The condition was updated to include 'qwen' in the check for triggering specific parameters ('--temp', '--top-p', etc.), aligning it with the existing 'jan' and 'nano' validation logic. This allows the same parameter configuration to apply to 'qwen' models as well as the original keywords. Removed deprecated parameters such as "dynatemp_exponent" and "ctx_len" handling logic, which were no longer needed. Added "flash_attn" back to the ignored parameters list. Cleaned up the parameter conversion logic by removing conditional blocks for specific model optimizations that are no longer required.
This reverts commit 7ae8a15.
louis-jan approved these changes Jun 12, 2025
Describe Your Changes
Fixes Issues
Self Checklist