Readiness probes for Kubernetes application

kazhuravlev/healthcheck


A production-ready health check library for Go applications that enables proper monitoring and graceful degradation in modern cloud environments, especially Kubernetes.

Why Health Checks Matter

Health checks are critical for building resilient, self-healing applications in distributed systems. They provide:

  1. Automatic Recovery: In Kubernetes, failed health checks trigger automatic pod restarts, ensuring your application recovers from transient failures without manual intervention.
  2. Load Balancer Integration: Health checks prevent traffic from being routed to unhealthy instances, maintaining service quality even during partial outages.
  3. Graceful Degradation: By monitoring dependencies (databases, caches, external APIs), your application can degrade gracefully when non-critical services fail.
  4. Operational Visibility: Health endpoints provide instant insight into system state, making debugging and incident response faster.
  5. Zero-Downtime Deployments: Readiness checks ensure new deployments only receive traffic when fully initialized.

Features

  • Multiple Check Types: Basic (sync), Manual, and Background (async) checks for different use cases
  • Kubernetes Native: Built-in /live and /ready endpoints following k8s conventions
  • JSON Status Reports: Detailed health status with history for debugging
  • Metrics Integration: Callbacks for Prometheus or other monitoring systems
  • Thread-Safe: Concurrent-safe operations with proper synchronization
  • Graceful Shutdown: Proper cleanup of background checks and shutdown signaling
  • Check History: Last 5 states stored for each check for debugging

Installation

go get -u github.com/kazhuravlev/healthcheck

Quick Start

    package main

    import (
        "context"
        "errors"
        "math/rand"
        "time"

        "github.com/kazhuravlev/healthcheck"
    )

    func main() {
        ctx := context.TODO()

        // 1. Create healthcheck instance
        hc, _ := healthcheck.New()

        // 2. Register a simple check
        hc.Register(ctx, healthcheck.NewBasic("redis", time.Second, func(ctx context.Context) error {
            if rand.Float64() > 0.5 {
                return errors.New("service is not available")
            }

            return nil
        }))

        // 3. Start HTTP server
        server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
        _ = server.Run(ctx)

        // 4. Check health at http://localhost:8080/ready
        select {}
    }

Types of Health Checks

1. Basic Checks (Synchronous)

Basic checks run on demand when the /ready endpoint is called. Use these for:

  • Fast operations (< 1 second)
  • Checks that need fresh data
  • Low-cost operations

    // Database connectivity check
    dbCheck := healthcheck.NewBasic("postgres", time.Second, func(ctx context.Context) error {
        return db.PingContext(ctx)
    })

2. Background Checks (Asynchronous)

Background checks run periodically in a separate goroutine (in background mode). Use these for:

  • Expensive operations (API calls, complex queries)
  • Checks subject to rate limits (when the check must run less often than Kubernetes requests /ready)
  • Operations that can use slightly stale data

    // External API health check - runs every 30 seconds
    apiCheck := healthcheck.NewBackground(
        "payment-api",
        nil,            // initial error state
        5*time.Second,  // initial delay
        30*time.Second, // check interval
        5*time.Second,  // timeout per check
        func(ctx context.Context) error {
            resp, err := client.Get("https://api.payment.com/health")
            if err != nil {
                return err
            }
            defer resp.Body.Close()

            if resp.StatusCode != 200 {
                return errors.New("unhealthy")
            }

            return nil
        },
    )

3. Manual Checks

Manual checks are controlled by your application logic. Use these for:

  • Initialization states (cache warming, data loading)
  • Circuit breaker patterns
  • Feature flags

    // Cache warming check
    cacheCheck := healthcheck.NewManual("cache-warmed")
    hc.Register(ctx, cacheCheck)

    // Set unhealthy during startup
    cacheCheck.SetErr(errors.New("cache warming in progress"))

    // After cache is warmed
    cacheCheck.SetErr(nil)

Best Practices

1. Choose the Right Check Type

| Scenario            | Check Type | Why                          |
|---------------------|------------|------------------------------|
| Database ping       | Basic      | Fast, needs fresh data       |
| File system check   | Basic      | Fast, local operation        |
| External API health | Background | Expensive, rate-limited      |
| Message queue depth | Background | Metrics query, can be stale  |
| Cache warmup status | Manual     | Application-controlled state |

2. Set Appropriate Timeouts

    // ❌ Bad: the timeout is too long and blocks readiness.
    // The check timeout should be shorter than the probe timeout configured in Kubernetes.
    healthcheck.NewBasic("db", 30*time.Second, checkFunc)

    // ✅ Good: short timeout
    healthcheck.NewBasic("db", 1*time.Second, checkFunc)

3. Use Status Codes Correctly

  • Liveness (/live): Should almost always return 200 OK

    • Only fail if the application is in an unrecoverable state
    • Kubernetes will restart the pod on failure
  • Readiness (/ready): Should fail when:

    • Critical dependencies are unavailable
    • Application is still initializing
    • Application is shutting down

4. Add Context to Errors

    func checkDatabase(ctx context.Context) error {
        if err := db.PingContext(ctx); err != nil {
            // Use fmt.Errorf to add context; the message will appear in the /ready report.
            return fmt.Errorf("postgres connection failed: %w", err)
        }

        return nil
    }

5. Graceful Shutdown

For applications that need to signal they are shutting down (preventing new traffic while completing existing requests), use the Shutdown() method:

    // Create healthcheck instance
    hc, _ := healthcheck.New()

    // Register your normal checks
    hc.Register(ctx, healthcheck.NewBasic("database", time.Second, checkDB))

    // Start HTTP server
    server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
    _ = server.Run(ctx)

    // In your graceful shutdown handler
    func gracefulShutdown(hc *healthcheck.Healthcheck) {
        // Mark application as shutting down - /ready will return 500
        hc.Shutdown()

        // Continue with your normal shutdown process
        // - Stop accepting new requests
        // - Complete existing requests
        // - Close database connections, etc.
    }

What happens after Shutdown():

  • /ready endpoint immediately returns HTTP 500 with status "down"
  • A special __shutting_down__ check is added to the response
  • Kubernetes will stop routing new traffic to this pod
  • /live endpoint continues to return 200 OK (pod should not be restarted)

Use this pattern for:

  • Zero-downtime deployments
  • Graceful pod termination in Kubernetes
  • Maintenance mode activation
  • When you need to drain traffic before shutdown
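
The behaviour described above (readiness flips to 500 while liveness stays at 200) can be modelled with a single atomic flag. The `drainFlag` type and `get` helper are hypothetical stand-ins written for this sketch; they do not reproduce the library's `Shutdown()` implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// drainFlag is a hypothetical stand-in for the "shutting down" state.
type drainFlag struct {
	shuttingDown atomic.Bool
}

// Shutdown marks the application as draining; it is safe to call concurrently.
func (d *drainFlag) Shutdown() { d.shuttingDown.Store(true) }

// Handler serves /live (always 200) and /ready (500 once draining).
func (d *drainFlag) Handler() http.Handler {
	mux := http.NewServeMux()

	mux.HandleFunc("/live", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		if d.shuttingDown.Load() {
			http.Error(w, "shutting down", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	return mux
}

// get issues an in-memory GET and returns the response status code.
func get(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	d := &drainFlag{}
	h := d.Handler()

	fmt.Println(get(h, "/ready"), get(h, "/live")) // 200 200
	d.Shutdown()
	fmt.Println(get(h, "/ready"), get(h, "/live")) // 500 200
}
```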

6. Monitor Checks

    hc, _ := healthcheck.New(
        healthcheck.WithCheckStatusHook(func(name string, status healthcheck.Status) {
            // hcMetric can be a Prometheus metric - it is up to your infrastructure
            hcMetric.WithLabelValues(name, string(status)).Set(1)
        }),
    )

Complete Example

    package main

    import (
        "context"
        "database/sql"
        "fmt"
        "log"
        "time"

        "github.com/kazhuravlev/healthcheck"
        _ "github.com/lib/pq"
    )

    func main() {
        ctx := context.Background()

        // Initialize dependencies
        db, err := sql.Open("postgres", "postgres://localhost/myapp")
        if err != nil {
            log.Fatal(err)
        }

        // Create healthcheck
        hc, _ := healthcheck.New()

        // 1. Database check - synchronous, critical
        hc.Register(ctx, healthcheck.NewBasic("postgres", time.Second, func(ctx context.Context) error {
            return db.PingContext(ctx)
        }))

        // 2. Cache warmup - manual control
        cacheReady := healthcheck.NewManual("cache")
        hc.Register(ctx, cacheReady)
        cacheReady.SetErr(fmt.Errorf("warming up"))

        // 3. External API - background check
        hc.Register(ctx, healthcheck.NewBackground(
            "payment-provider",
            nil,
            10*time.Second, // initial delay
            30*time.Second, // check interval
            5*time.Second,  // timeout
            checkPaymentProvider,
        ))

        // Start health check server
        server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
        if err := server.Run(ctx); err != nil {
            log.Fatal(err)
        }

        // Simulate cache warmup completion
        go func() {
            time.Sleep(5 * time.Second)
            cacheReady.SetErr(nil)
            log.Println("Cache warmed up")
        }()

        // Graceful shutdown example
        go func() {
            time.Sleep(30 * time.Second)
            log.Println("Initiating graceful shutdown...")
            hc.Shutdown()
            // /ready will now return 500, stopping new traffic
            log.Println("Application marked as shutting down")
        }()

        log.Println("Health checks available at:")
        log.Println("  - http://localhost:8080/live")
        log.Println("  - http://localhost:8080/ready")

        select {}
    }

    func checkPaymentProvider(ctx context.Context) error {
        // Implementation of payment provider check
        return nil
    }

Integration with Kubernetes

    apiVersion: v1
    kind: Pod
    spec:
      containers:
        - name: app
          livenessProbe:
            httpGet:
              path: /live
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 2

Response Format

The /ready endpoint returns detailed JSON with check history:

Healthy application:

    {
      "status": "up",
      "checks": [
        {
          "name": "postgres",
          "state": {
            "status": "up",
            "error": "",
            "timestamp": "2024-01-15T10:30:00Z"
          },
          "history": [
            {
              "status": "up",
              "error": "",
              "timestamp": "2024-01-15T10:29:55Z"
            }
          ]
        }
      ]
    }

Application shutting down:

    {
      "status": "down",
      "checks": [
        {
          "name": "postgres",
          "state": {
            "status": "up",
            "error": "",
            "timestamp": "2024-01-15T10:30:00Z"
          }
        },
        {
          "name": "__shutting_down__",
          "state": {
            "status": "down",
            "error": "The application in shutting down process",
            "timestamp": "2024-01-15T10:30:05Z"
          },
          "history": null
        }
      ]
    }
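
A client that consumes this report, for example a smoke test or a dashboard, might decode it as sketched below. The struct and function names (`report`, `checkReport`, `checkState`, `failingChecks`) are hypothetical; only the JSON field names are taken from the sample responses above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// checkState mirrors the "state" and "history" entries in the sample response.
type checkState struct {
	Status    string    `json:"status"`
	Error     string    `json:"error"`
	Timestamp time.Time `json:"timestamp"`
}

// checkReport mirrors one element of the "checks" array.
type checkReport struct {
	Name    string       `json:"name"`
	State   checkState   `json:"state"`
	History []checkState `json:"history"`
}

// report mirrors the top-level /ready response.
type report struct {
	Status string        `json:"status"`
	Checks []checkReport `json:"checks"`
}

// failingChecks returns the names of all checks whose status is not "up".
func failingChecks(raw []byte) ([]string, error) {
	var r report
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}

	var names []string
	for _, c := range r.Checks {
		if c.State.Status != "up" {
			names = append(names, c.Name)
		}
	}
	return names, nil
}

func main() {
	raw := []byte(`{"status":"down","checks":[
		{"name":"postgres","state":{"status":"up","error":"","timestamp":"2024-01-15T10:30:00Z"}},
		{"name":"__shutting_down__","state":{"status":"down","error":"The application in shutting down process","timestamp":"2024-01-15T10:30:05Z"},"history":null}
	]}`)

	names, err := failingChecks(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // [__shutting_down__]
}
```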
