Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Add barebones but working implementation of model preload (#209)#235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
mostlygeek merged 8 commits intomainfromadd-preload-hook
Aug 14, 2025

Conversation

@mostlygeek
Copy link
Owner

@mostlygeekmostlygeek commentedAug 9, 2025
edited by coderabbitaibot
Loading

  • add hooks.on_startup.preload (array)
  • implement in ProxyManager to load models
  • todo: tests, docs

Summary by CodeRabbit

  • New Features
    • Optional non-blocking startup hooks to preload models via configuration (hooks.on_startup.preload); preload runs in background.
  • Events
    • Emits model preload status events to report success/failure for monitoring and readiness.
  • Tests
    • Added tests for startup preload behavior and config parsing; updated test helpers.
  • Documentation
    • README and example config updated to document startup preload option.
  • Chores
    • Added .dev/ to .gitignore.

@coderabbitai
Copy link

coderabbitaibot commentedAug 9, 2025
edited
Loading

Walkthrough

Adds startup preload hooks to config and ProxyManager.New, introduces a DiscardWriter for dummy requests, and defines a ModelPreloadedEvent. Updates tests to cover new hooks and configuration parsing. Also ignores .dev/ in .gitignore.

Changes

Cohort / File(s)Summary of changes
Config hooks & sanitization
proxy/config.go
AddedHooksConfig andHookOnStartup types; extendedConfig withHooks; sanitizedHooks.OnStartup.Preload entries after YAML unmarshal (trim, skip empty, canonicalize viaRealModelName).
Proxy startup preload
proxy/proxymanager.go
Added asynchronous preload path inNew(): for each configured model, resolveProcessGroup, issue dummy GET viaProxyRequest withDiscardWriter, and emitModelPreloadedEvent with success/failure (non-blocking).
Events extension
proxy/events.go
IntroducedModelPreloadedEventID andModelPreloadedEvent{ModelName, Success} withType() method.
HTTP sink writer
proxy/discardWriter.go
NewDiscardWriter implementinghttp.ResponseWriter andhttp.Flusher; records headers/status, discards body, no-opFlush.
Tests for hooks/config
proxy/config_posix_test.go,proxy/helpers_test.go,proxy/proxymanager_test.go
Updated/added tests to parsehooks.on_startup.preload, use globalsimpleResponderPath, and verify preload events and model readiness.
Repo ignore
.gitignore
Added.dev/ to ignore list.
Docs / examples
README.md,config.example.yaml
Documented newhooks.on_startup.preload feature and updated Features notes and example config.

Sequence Diagram(s)

sequenceDiagram    participant App    participant ProxyManager    participant Preloader as Goroutine    participant ProcGroup as ProcessGroup    participant Proxy as ProxyRequest    participant Sink as DiscardWriter    participant Bus as EventBus    App->>ProxyManager: New(config)    ProxyManager-->>App: return (preload runs async)    ProxyManager->>Preloader: start if hooks.on_startup.preload not empty    loop for each model        Preloader->>ProxyManager: swapProcessGroup(model)        alt group found            Preloader->>Proxy: ProxyRequest GET "/"            Proxy->>Sink: write/discard response            Preloader->>Bus: emit ModelPreloadedEvent{model, true}        else error/not found            Preloader->>Bus: emit ModelPreloadedEvent{model, false}        end    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

enhancement,configuration

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from theIntegrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branchadd-preload-hook

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat withCodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag@coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag@coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on oursupport page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type@coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add@coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add@coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add@coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit ourStatus Page to check the current availability of CodeRabbit.
  • Visit ourDocumentation for detailed information on how to use CodeRabbit.
  • Join ourDiscord Community to get help, request features, and share feedback.
  • Follow us onX/Twitter for updates and announcements.

@mostlygeekmostlygeek marked this pull request as draftAugust 9, 2025 16:14
Copy link

@coderabbitaicoderabbitaibot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
proxy/config.go (1)

141-146:Consider making OnStartup a named type for future extensibility

Inline struct is fine now, but if hooks grow, a named type (e.g., type OnStartupHooks struct { Preload []string ... }) can improve clarity and reuse across config, tests, and docs.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and betweena186318 anded2293a.

📒 Files selected for processing (4)
  • proxy/config.go (3 hunks)
  • proxy/config_test.go (1 hunks)
  • proxy/discardWriter.go (1 hunks)
  • proxy/proxymanager.go (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
proxy/proxymanager.go (2)
proxy/discardWriter.go (1)
  • DiscardWriter (6-6)
proxy/proxymanager_test.go (6)
  • TestProxyManager_SwapMultiProcess (45-86)
  • TestProxyManager_Unload (381-406)
  • TestProxyManager_SwapMultiProcessParallelRequests (130-186)
  • TestProxyManager_SwapProcessCorrectly (21-43)
  • TestProxyManager_PersistentGroupsAreNotSwapped (90-126)
  • TestProxyManager_RunningEndpoint (409-473)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests
🔇 Additional comments (1)
proxy/config.go (1)

165-167:Config.Hooks field addition LGTM

Field wiring and YAML tags are consistent with the new HooksConfig type.

@mostlygeekmostlygeek marked this pull request as ready for reviewAugust 11, 2025 23:39
@mostlygeekmostlygeek marked this pull request as draftAugust 11, 2025 23:40
@mostlygeek
Copy link
OwnerAuthor

@coderabbitai review

coderabbitai[bot] reacted with eyes emoji

@coderabbitai
Copy link

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@mostlygeekmostlygeek marked this pull request as ready for reviewAugust 11, 2025 23:53
@mostlygeek
Copy link
OwnerAuthor

@coderrabbitai review please

Copy link

@coderabbitaicoderabbitaibot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
proxy/discardWriter.go (1)

5-28:ResponseWriter contract fixed; Flush support added

Header is now stable, Write returns len(p), status is tracked, and Flush is provided. This resolves prior issues flagged in earlier reviews about breaking http semantics and panic risks in flusher paths.

🧹 Nitpick comments (2)
proxy/helpers_test.go (1)

72-78:Use the already-initialized simpleResponderPath in YAML

Switching to the package-level simpleResponderPath makes the YAML generation stable across platforms. One nit: TestMain (Line 24) still recomputes the path via getSimpleResponderPath(). Consider reusing simpleResponderPath there to avoid drift between the path you validate and the one you embed in configs.

proxy/events.go (1)

53-60:Consider documenting Success semantics

The ModelPreloadedEvent type is clear. Add a brief comment clarifying what “Success” means (e.g., process swap ok vs. dummy request ok) to prevent ambiguous interpretations by subscribers.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and betweened2293a andaeaa4ec.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • proxy/config.go (3 hunks)
  • proxy/config_posix_test.go (2 hunks)
  • proxy/discardWriter.go (1 hunks)
  • proxy/events.go (2 hunks)
  • proxy/helpers_test.go (2 hunks)
  • proxy/proxymanager.go (2 hunks)
  • proxy/proxymanager_test.go (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (2)
  • proxy/proxymanager.go
  • proxy/config.go
🧰 Additional context used
🧬 Code Graph Analysis (2)
proxy/proxymanager_test.go (5)
proxy/config.go (1)
  • LoadConfigFromReader (198-361)
proxy/proxymanager.go (1)
  • New (47-130)
proxy/process.go (2)
  • StopWaitForInflightRequest (36-36)
  • StateReady (25-25)
event/default.go (1)
  • On (16-18)
proxy/events.go (1)
  • ModelPreloadedEvent (53-56)
proxy/helpers_test.go (1)
proxy/logMonitor.go (1)
  • NewLogMonitorWriter (41-49)
🔇 Additional comments (3)
proxy/helpers_test.go (1)

16-19:LGTM on centralizing test helpers

Moving nextTestPort, testLogger, and simpleResponderPath to package-level improves reuse across tests. No functional issues spotted.

proxy/events.go (1)

10-10:Event ID allocation looks consistent

ModelPreloadedEventID = 0x06 continues the existing sequence without collisions. Good addition.

proxy/config_posix_test.go (1)

103-106:Hooks parsing and expected struct assertion are solid

The YAML hooks.on_startup.preload is exercised and the expected Config.Hooks.OnStartup.Preload assertion reflects canonicalized values. This provides good coverage for the new config surface.

Also applies to: 169-173

@mostlygeekmostlygeek merged commit5dc6b3e intomainAug 14, 2025
1 check was pending
@mostlygeekmostlygeek deleted the add-preload-hook branchAugust 14, 2025 17:27
Copy link

@coderabbitaicoderabbitaibot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
proxy/proxymanager.go (1)

101-127:Preload event may report success on proxy failure; add cancellation, endpoint targeting, and error handling

Current flow emits Success: true even if ProxyRequest fails, doesn’t cancel on shutdown, and always probes “/”. Tighten this:

  • Emit Success: false if ProxyRequest fails.
  • Use pm.shutdownCtx so preload cancels cleanly.
  • Prefer each model’s CheckEndpoint (fallback “/”).
  • Handle http.NewRequest error and log proxy errors.
  • Copy the preload slice defensively in the goroutine.
 // run any startup hooks if len(config.Hooks.OnStartup.Preload) > 0 {-// do it in the background, don't block startup -- not sure if good idea yet+// do it in the background, don't block startup go func() {-discardWriter := &DiscardWriter{}-for _, realModelName := range config.Hooks.OnStartup.Preload {+// defensive copy in case config is ever mutated later+preload := append([]string(nil), config.Hooks.OnStartup.Preload...)+for _, realModelName := range preload {+select {+case <-pm.shutdownCtx.Done():+pm.proxyLogger.Infof("Preload cancelled due to shutdown")+return+default:+} proxyLogger.Infof("Preloading model: %s", realModelName) processGroup, _, err := pm.swapProcessGroup(realModelName)  if err != nil {-event.Emit(ModelPreloadedEvent{-ModelName: realModelName,-Success:   false,-})+event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false}) proxyLogger.Errorf("Failed to preload model %s: %v", realModelName, err) continue-} else {-req, _ := http.NewRequest("GET", "/", nil)-processGroup.ProxyRequest(realModelName, discardWriter, req)-event.Emit(ModelPreloadedEvent{-ModelName: realModelName,-Success:   true,-}) }+// Prefer model-specific check endpoint if configured+endpoint := strings.TrimSpace(pm.config.Models[realModelName].CheckEndpoint)+if endpoint == "" {+endpoint = "/"+}+req, reqErr := http.NewRequestWithContext(pm.shutdownCtx, "GET", endpoint, nil)+if reqErr != nil {+event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false})+proxyLogger.Errorf("Failed to build preload request for %s: %v", realModelName, reqErr)+continue+}+// Use a fresh DiscardWriter per request+if err := processGroup.ProxyRequest(realModelName, &DiscardWriter{}, req); err != nil {+event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: false})+proxyLogger.Errorf("Preload request failed for %s: %v", realModelName, err)+continue+}+event.Emit(ModelPreloadedEvent{ModelName: realModelName, Success: true}) } }() }
🧹 Nitpick comments (2)
config.example.yaml (1)

223-231:Fix typo and clarify alias support in startup hooks

  • Typo: “dictionar” → “dictionary”.
  • Minor grammar: “models sections” → “models section”.
  • Add a note that aliases are supported and resolved to real model IDs (this matches the config cleanup logic via RealModelName).

Apply this diff:

-  # - optional, default: empty dictionar+  # - optional, default: empty dictionary@@-        # - model names must match keys in the models sections+        # - model names must match keys in the models section+        # - aliases are supported; entries are resolved to real model IDs
proxy/proxymanager_test.go (1)

868-885:Count only successful preloads and assert both model IDs observed

The test currently waits for any two events, including failures. Tighten to count successes and track model names to avoid false positives.

-preloadChan := make(chan ModelPreloadedEvent, 2) // buffer for 2 expected events--unsub := event.On(func(e ModelPreloadedEvent) {-preloadChan <- e-})+successCh := make(chan string, 2) // buffer for 2 expected successes+unsub := event.On(func(e ModelPreloadedEvent) {+if e.Success {+successCh <- e.ModelName+}+})@@-for i := 0; i < 2; i++ {-select {-case <-preloadChan:-case <-time.After(5 * time.Second):-t.Fatal("timed out waiting for models to preload")-}-}+want := map[string]bool{"model1": false, "model2": false}+timeout := time.After(10 * time.Second)+for seen := 0; seen < 2; {+select {+case name := <-successCh:+if _, ok := want[name]; ok && !want[name] {+want[name] = true+seen+++}+case <-timeout:+t.Fatalf("timed out waiting for successful preloads; seen=%v", want)+}+}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and betweene552095 anddbfe92a.

📒 Files selected for processing (10)
  • .gitignore (1 hunks)
  • README.md (5 hunks)
  • config.example.yaml (2 hunks)
  • proxy/config.go (3 hunks)
  • proxy/config_posix_test.go (2 hunks)
  • proxy/discardWriter.go (1 hunks)
  • proxy/events.go (2 hunks)
  • proxy/helpers_test.go (2 hunks)
  • proxy/proxymanager.go (2 hunks)
  • proxy/proxymanager_test.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • .gitignore
  • proxy/helpers_test.go
  • proxy/events.go
  • proxy/config.go
  • proxy/config_posix_test.go
  • proxy/discardWriter.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-05-05T23:54:14.205Z
Learnt from: mostlygeekPR: mostlygeek/llama-swap#117File: proxy/config.go:0-0Timestamp: 2025-05-05T23:54:14.205ZLearning: In the llama-swap project, automatic port assignment does not include port availability checks. It's the user's responsibility to provide an appropriate startPort that doesn't conflict with other services.

Applied to files:

  • README.md
🧬 Code Graph Analysis (2)
proxy/proxymanager_test.go (5)
proxy/config.go (1)
  • LoadConfigFromReader (198-361)
proxy/events.go (1)
  • ModelPreloadedEvent (53-56)
event/default.go (1)
  • On (16-18)
proxy/proxymanager.go (1)
  • New (47-130)
proxy/process.go (1)
  • StateReady (25-25)
proxy/proxymanager.go (4)
proxy/discardWriter.go (1)
  • DiscardWriter (6-9)
event/default.go (1)
  • Emit (28-30)
proxy/events.go (1)
  • ModelPreloadedEvent (53-56)
llama-swap.go (1)
  • main (27-178)
🔇 Additional comments (4)
README.md (3)

34-36:Docs: Nice addition highlighting container lifecycle and startup preload

The revised bullets add clarity around container lifecycle (cmdStart/cmdStop) and the new startup preload feature. Good visibility for users.


46-49:Docs: Clearer setup intro

Explicitly calling out the minimal config path improves onboarding. LGTM.


141-149:Homebrew section reads well

The flow (tap → install → run) is clear. No issues spotted.

proxy/proxymanager.go (1)

18-21:Event emission import is correct and scoped appropriately

Using the event bus here aligns with the existing event system and keeps the preload workflow observable in tests and ops tooling.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@coderabbitaicoderabbitai[bot]coderabbitai[bot] left review comments

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

1 participant

@mostlygeek

[8]ページ先頭

©2009-2025 Movatter.jp