Feature Request: Improve Multi-Replica Behavior for OSS (Cold Standby Support) #18345

@bjornrobertsson

Problem Statement

Currently, when deploying Coder OSS with replicas=2 (or more) without an Enterprise license, the deployment creates multiple coderd instances that compete for database access, leading to:

  • Inconsistent behavior: ~50% success rate for connections (Terminal works sometimes, VS Code fails completely)
  • Poor user experience: No clear indication that multi-replica setup requires Enterprise license
  • Silent failures: No warning or error messages about unsupported configuration
  • Wasted resources: Multiple instances running when only one can be active

Current Behavior

  1. Multiple coderd instances start successfully
  2. All instances attempt to connect to PostgreSQL
  3. Traffic gets distributed across instances without proper coordination
  4. Results in race conditions and connection failures
  5. VS Code extensions fail completely, some terminal connections work intermittently

Proposed Solution

Implement a database-level locking mechanism for OSS deployments that would:

1. Primary Instance Locking

  • First coderd instance to connect successfully becomes the "primary"
  • Creates a lock record in PostgreSQL (e.g., instance_locks table with instance ID, timestamp, heartbeat)
  • Continuously updates heartbeat to maintain lock ownership

2. Standby Instance Behavior

  • Additional instances detect existing lock and enter "cold standby" mode
  • Standby instances:
    • Monitor primary instance heartbeat
    • Return HTTP 503 (Service Unavailable) for all requests with clear error message
    • Automatically promote to primary if original instance fails/heartbeat expires
    • Log clear status messages about standby state

3. Clear User Feedback

  • Startup logs: Clear indication of primary vs standby status
  • Health endpoints: Different responses for primary (200 OK) vs standby (503 Service Unavailable)
  • Admin UI warning: Banner indicating "Multiple replicas detected - Enterprise license required for load balancing"
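
A health endpoint and request gate along these lines might look as follows. This is a sketch under assumptions: healthStatus, healthHandler, and standbyGate are hypothetical names, the JSON body and error text are invented for illustration, and the atomic flag is assumed to be flipped by whatever code manages the lock.

```go
package main

import (
	"encoding/json"
	"net/http"
	"sync/atomic"
)

// healthStatus maps the instance state to an HTTP status code and role label.
func healthStatus(isPrimary bool) (int, string) {
	if isPrimary {
		return http.StatusOK, "primary"
	}
	return http.StatusServiceUnavailable, "standby"
}

// healthHandler answers 200 on the primary and 503 on a standby, so a load
// balancer or readiness probe only routes traffic to the primary.
func healthHandler(primary *atomic.Bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		code, role := healthStatus(primary.Load())
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_ = json.NewEncoder(w).Encode(map[string]string{"role": role})
	}
}

// standbyGate wraps the normal handler and rejects every request with a
// descriptive 503 while this instance is a standby.
func standbyGate(primary *atomic.Bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !primary.Load() {
			http.Error(w,
				"this replica is in cold standby; multi-replica load balancing requires an Enterprise license",
				http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Using atomic.Bool (Go 1.19 or later) lets the lock loop flip the role without taking a mutex on every request.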

Implementation Details

    -- Example lock table structure
    CREATE TABLE IF NOT EXISTS instance_locks (
        lock_name    VARCHAR(255) PRIMARY KEY,
        instance_id  UUID NOT NULL,
        acquired_at  TIMESTAMPTZ NOT NULL,
        heartbeat_at TIMESTAMPTZ NOT NULL,
        expires_at   TIMESTAMPTZ NOT NULL
    );

    // Pseudo-code for lock acquisition
    func (s *Server) acquirePrimaryLock(ctx context.Context) (bool, error) {
        // Try to acquire or refresh lock
        // Return true if this instance is primary, false if standby
    }

4. Configuration Options

Add environment variables:

  • CODER_OSS_STANDBY_MODE=auto (default: auto-detect and enter standby)
  • CODER_LOCK_TIMEOUT=30s (how long before lock expires)
  • CODER_HEARTBEAT_INTERVAL=10s (how often to update heartbeat)
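
Parsing these variables could look roughly like the sketch below. The type and function names are hypothetical, the values shown in the list are treated as defaults (an assumption), and the check that the heartbeat interval is shorter than the lock timeout is an added safeguard rather than part of the proposal.

```go
package main

import (
	"fmt"
	"time"
)

// standbyConfig holds the proposed settings (hypothetical type).
type standbyConfig struct {
	Mode              string
	LockTimeout       time.Duration
	HeartbeatInterval time.Duration
}

// loadStandbyConfig reads the settings through lookup, which has the same
// signature as os.LookupEnv so it can be swapped for a map in tests.
func loadStandbyConfig(lookup func(string) (string, bool)) (standbyConfig, error) {
	cfg := standbyConfig{
		Mode:              "auto",
		LockTimeout:       30 * time.Second,
		HeartbeatInterval: 10 * time.Second,
	}
	if v, ok := lookup("CODER_OSS_STANDBY_MODE"); ok {
		cfg.Mode = v
	}
	durations := map[string]*time.Duration{
		"CODER_LOCK_TIMEOUT":       &cfg.LockTimeout,
		"CODER_HEARTBEAT_INTERVAL": &cfg.HeartbeatInterval,
	}
	for name, dst := range durations {
		v, ok := lookup(name)
		if !ok {
			continue
		}
		d, err := time.ParseDuration(v)
		if err != nil {
			return cfg, fmt.Errorf("%s: %w", name, err)
		}
		*dst = d
	}
	// A heartbeat slower than the timeout would let the lock expire between
	// refreshes, causing spurious failovers.
	if cfg.HeartbeatInterval <= 0 || cfg.HeartbeatInterval >= cfg.LockTimeout {
		return cfg, fmt.Errorf("heartbeat interval %s must be positive and shorter than lock timeout %s",
			cfg.HeartbeatInterval, cfg.LockTimeout)
	}
	return cfg, nil
}
```

In a real binary the call would be loadStandbyConfig(os.LookupEnv).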

Benefits

  1. Graceful degradation: Multi-replica deployments work predictably without license
  2. High availability: Automatic failover when primary instance fails
  3. Clear feedback: Users understand what's happening and why
  4. Resource efficiency: Only one active instance processing requests
  5. Enterprise upsell: Clear path to licensed version for true load balancing

Alternative Considerations

  • License check with graceful shutdown: Detect multi-replica + no license and shut down extra instances
  • Load balancer integration: Provide health check endpoints that only return healthy for primary
  • Admin warnings: Dashboard notifications about suboptimal configuration

Related Issues/Context

This addresses the common Kubernetes deployment pattern where users naturally set replicas=2 for high availability, not realizing it requires Enterprise licensing. The current behavior creates a frustrating debugging experience.


Priority: Medium-High (affects common deployment scenarios)
Labels: enhancement, oss, database, high-availability

