A monitoring tool to gather infrastructure network information

telekom/sparrow

The sparrow is an infrastructure monitoring tool. The binary includes several checks (e.g. health check) that will be executed periodically.

About this component

The sparrow performs several checks to monitor the health of the infrastructure and network from its point of view. The following checks are available:

  1. Health check - health: The sparrow is able to perform an HTTP-based (HTTP/1.1) health check to the provided endpoints. The sparrow will expose its own health check endpoint as well.

  2. Latency check - latency: The sparrow is able to communicate with other sparrow instances to calculate the time a request takes to the target and back. The check is HTTP (HTTP/1.1) based as well.

  3. DNS check - dns: The sparrow is able to perform DNS resolution checks to monitor domain name system performance and reliability. The check has the ability to target specific domains or IPs for monitoring.

  4. Traceroute check - traceroute: The sparrow is able to perform traceroute checks to monitor the network path to a target. The check has the ability to target specific domains or IPs for monitoring.

Each check is designed to provide comprehensive insights into the various aspects of network and service health, ensuring robust monitoring and quick detection of potential issues.

Installation

The sparrow is provided as a small binary and a container image.

Please refer to the release notes to get the latest version.

Binary

The binary is available for several distributions. To install the binary, use a provided bundle or build it from source. Set RELEASE_VERSION to the desired release version:

export RELEASE_VERSION=0.5.0

Download the binary:

curl https://github.com/telekom/sparrow/releases/download/v${RELEASE_VERSION}/sparrow_${RELEASE_VERSION}_linux_amd64.tar.gz -Lo sparrow.tar.gz
curl https://github.com/telekom/sparrow/releases/download/v${RELEASE_VERSION}/sparrow_${RELEASE_VERSION}_checksums.txt -Lo checksums.txt
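The download step fetches a checksums file, but nothing above says how to use it. A minimal sketch of a verification step follows. It assumes the file lists SHA-256 digests in the common "<digest>  <filename>" layout, which this README does not confirm. The function names are illustrative only. Because the archive is saved locally as sparrow.tar.gz, the original release file name has to be passed separately for the lookup.

```python
import hashlib
from pathlib import Path
from typing import Optional


def expected_digest(checksums_text: str, filename: str) -> Optional[str]:
    """Return the digest listed for `filename`, or None if it is not listed."""
    for line in checksums_text.splitlines():
        parts = line.split()
        # Assumed line format: "<hex digest>  <filename>"
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(archive: Path, checksums: Path, release_name: str) -> bool:
    """True if the archive's SHA-256 matches the entry for `release_name`."""
    expected = expected_digest(checksums.read_text(), release_name)
    return expected is not None and expected == sha256_of(archive)
```

For the commands above, a call could look like verify(Path("sparrow.tar.gz"), Path("checksums.txt"), "sparrow_0.5.0_linux_amd64.tar.gz"). Extract the archive only if it returns True.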

Extract the binary:

tar -xf sparrow.tar.gz

Container Image

The sparrow container images for dedicated releases can be found in the GitHub registry.

Helm

Sparrow can be installed via Helm Chart. The chart is available in the GitHub registry:

helm -n sparrow upgrade -i sparrow oci://ghcr.io/telekom/charts/sparrow --create-namespace

The default settings are suitable for a local configuration. With the default Helm values, the sparrow loader uses a checks' configuration provided in a ConfigMap (the file loader is used). Define the checksConfig section to set the ConfigMap.

Use the following values to load the checks' configuration at runtime through the http loader:

startupConfig:
  ...
  loader:
    type: http
    interval: 30s
    http:
      url: https://url-to-checks-config.de/api/config%2Eyaml

checksConfig: { }

To provide the sparrow container with the token, manually create a secret containing the SPARROW_LOADER_HTTP_TOKEN environment variable. Use envFromSecrets in the values.yaml to give the sparrow container access to this secret. Avoid adding sensitive data like the token used by the http loader (loader.http.token) directly in the values section.

The same applies to the target manager token. Put SPARROW_TARGETMANAGER_GITLAB_TOKEN in a secret and bind it with envFromSecrets in the values.yaml.

For all available value options see the Chart README.

Additionally, check out the sparrow configuration variants.

Usage

Use sparrow run to execute the instance using the binary. A sparrowName (a valid DNS name) must be passed, otherwise the sparrow will not start:

sparrow run --sparrowName sparrow.telekom.de

Image

Run a sparrow container by using e.g. docker run ghcr.io/telekom/sparrow.

Pass the available configuration arguments to the container e.g. docker run ghcr.io/telekom/sparrow --help.

Start the instance using a mounted startup configuration file e.g. docker run -v /config:/config ghcr.io/telekom/sparrow --config /config/config.yaml.

Configuration

The configuration is divided into two parts: the startup configuration and the checks' configuration. The startup configuration is a technical configuration to configure the sparrow instance itself.

Startup

The available configuration options can be found in the CLI flag documentation.

The sparrow is able to get the startup configuration from different sources as follows.

Priority of configuration (high to low):

  1. CLI flags
  2. Environment variables
  3. Defined configuration file
  4. Default configuration file
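The priority order above can be pictured as layered lookups, where the first source that defines a key wins. The sketch below uses Python's ChainMap to show that behaviour. It is not how the sparrow resolves its configuration internally, and all keys and values are made up for the illustration.

```python
from collections import ChainMap


def resolve(cli: dict, env: dict, config_file: dict, defaults: dict) -> ChainMap:
    """Layer the sources so that lookups hit the highest-priority source first."""
    return ChainMap(cli, env, config_file, defaults)


# Hypothetical values, chosen only to show which source wins.
settings = resolve(
    cli={"loader.interval": "10s"},
    env={"loader.interval": "20s", "loader.http.token": "from-env"},
    config_file={"loader.type": "file", "loader.http.token": "from-file"},
    defaults={"loader.type": "http", "api.address": ":8080"},
)

print(settings["loader.interval"])    # the CLI flag beats the environment variable
print(settings["loader.http.token"])  # the environment variable beats the config file
print(settings["loader.type"])        # the defined config file beats the default file
print(settings["api.address"])        # only the default file supplies this value
```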

Every value in the config file can be set through environment variables.

You can set a token for the http loader:

export SPARROW_LOADER_HTTP_TOKEN="xxxxxx"

Or for any other config attribute:

export SPARROW_ANY_OTHER_OPTION="Some value"

Write out the path to the attribute in upper case, prefixed with SPARROW and delimited by _.
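The naming rule can be sketched as a small helper. The function name env_name is hypothetical. The two expected results are the variable names that appear in this README.

```python
def env_name(path: str, prefix: str = "SPARROW") -> str:
    """Turn a dotted config path into the matching environment variable name."""
    return "_".join([prefix, *path.split(".")]).upper()


print(env_name("loader.http.token"))           # SPARROW_LOADER_HTTP_TOKEN
print(env_name("targetManager.gitlab.token"))  # SPARROW_TARGETMANAGER_GITLAB_TOKEN
```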

Example Startup Configuration

# DNS sparrow is exposed on
name: sparrow.example.com

# Selects and configures a loader to continuously fetch the checks' configuration at runtime
loader:
  # Defines which loader to use. Options: "file | http"
  type: http
  # The interval in which sparrow tries to fetch a new configuration
  # If this isn't set or set to 0, the loader will only retrieve the configuration once
  interval: 30s
  # Config specific to the http loader
  http:
    # The URL where the config is located
    url: https://myconfig.example.com/config.yaml
    # This token is passed in the Authorization header when refreshing the config
    token: xxxxxxx
    # A timeout for the config refresh
    timeout: 30s
    retry:
      # How long to wait in between retries
      delay: 10s
      # How many times to retry
      count: 3
  # Config specific to the file loader
  # The file loader is not intended for production use
  file:
    # Location of the file in the local filesystem
    path: ./config.yaml

# Configures the API
api:
  # Which address to expose Sparrow's REST API on
  address: :8080
  # Configures tls for the http server
  # including prometheus metrics etc
  tls:
    # whether to enable tls, default is false
    enabled: true
    # path to your x509 certificate
    certPath: mycert.pem
    # path to your certificate key
    keyPath: mykey.key

# Configures the target manager.
targetManager:
  # whether to enable the target manager. (default: false)
  enabled: true
  # Defines which target manager to use.
  type: gitlab
  # The interval for the target reconciliation process
  checkInterval: 1m
  # How often the instance should register itself as a global target
  # A duration of 0 means no registration
  registrationInterval: 1m
  # How often the instance should update its registration as a global target
  # A duration of 0 means no update
  updateInterval: 120m
  # The amount of time a target can be unhealthy
  # before it is removed from the global target list
  # A duration of 0 means no removal
  unhealthyThreshold: 360m
  # Scheme defines with which scheme sparrow should register itself
  scheme: http
  # Configuration options for the GitLab target manager
  gitlab:
    # The URL of your GitLab host
    baseUrl: https://gitlab.com
    # Your GitLab API token
    # You can also set this value through the SPARROW_TARGETMANAGER_GITLAB_TOKEN environment variable
    token: glpat-xxxxxxxx
    # The ID of your GitLab project. This is where Sparrow will register itself
    # and grab the list of other Sparrows from
    projectId: 18923
    # The branch to use for the state file
    # If not set, it tries to resolve the default branch otherwise it uses the 'main' branch
    branch: main

# Configures the telemetry exporter.
telemetry:
  # Whether to enable telemetry. (default: false)
  enabled: true
  # The telemetry exporter to use.
  # Options:
  # grpc: Exports telemetry using OTLP via gRPC.
  # http: Exports telemetry using OTLP via HTTP.
  # stdout: Prints telemetry to stdout.
  # noop | "": Disables telemetry.
  exporter: grpc
  # The address to export telemetry to.
  url: localhost:4317
  # The token to use for authentication.
  # If the exporter does not require a token, this can be left empty.
  token: ""
  # Configures tls for the telemetry exporter
  tls:
    # Enable or disable TLS
    enabled: true
    # The path to the tls certificate to use.
    # Only required if your otel endpoint uses custom TLS certificates
    certPath: ""

Loader

The loader component of the sparrow dynamically loads the checks' configuration during runtime.

You select which loader is used by setting the loaderType parameter (loader.type in the startup configuration file).

Available loaders:

  • http (default): Retrieves the checks' configuration from a remote endpoint during runtime. Additional configuration parameters are set in the loader.http section.

  • file: Loads the checks' configuration from a local file during runtime. Additional configuration parameters are set in the loader.file section.

If you want to retrieve the checks' configuration only once, you can set loader.interval to 0. The target manager is currently not functional in combination with this configuration.

Logging Configuration

You can configure the logging behavior of the sparrow instance by setting the following environment variables:

  • LOG_LEVEL: Adjusts the minimum log level. Available options: DEBUG, INFO, WARNING, ERROR.
  • LOG_FORMAT: Sets the format of the log messages. Available options: JSON, TEXT.

Checks

In addition to the technical startup configuration, the sparrow loads the checks' configuration dynamically during runtime. The loader fetches this configuration and configures the checks accordingly.

For detailed information on available loader configuration options, please refer to this documentation.

Example format of a configuration file for the checks:

health:
  targets: [ ]

Target Manager

The sparrow can optionally manage targets for checks and register itself as a target on a (remote) backend through the TargetManager interface. This feature is optional; if the startup configuration does not include the targetManager, it will not be used. When configured, it offers various settings, detailed below, which can be set in the startup YAML configuration file as shown in the example configuration.

| Option | Description |
| --- | --- |
| targetManager.enabled | Whether to enable the target manager. Defaults to false |
| targetManager.type | Type of the target manager. Options: gitlab |
| targetManager.scheme | Whether the target registers itself with http or https. Can be http or https. This needs to be set to https when api.tls.enabled == true |
| targetManager.checkInterval | Interval for checking new targets. |
| targetManager.unhealthyThreshold | Threshold for marking a target as unhealthy. 0 means no cleanup. |
| targetManager.registrationInterval | Interval for registering the current sparrow at the target backend. 0 means no registration. |
| targetManager.updateInterval | Interval for updating the registration of the current sparrow. 0 means no update. |
| targetManager.gitlab.baseUrl | Base URL of the GitLab instance. |
| targetManager.gitlab.token | Token for authenticating with the GitLab instance. |
| targetManager.gitlab.projectId | Project ID for the GitLab project used as a remote state backend. |
| targetManager.gitlab.branch | Branch to use for the state file. If not set, it tries to resolve the default branch, otherwise it uses the main branch. |
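The scheme rule in the table (https is required once api.tls.enabled is true) is easy to miss when editing a startup configuration by hand. A minimal pre-flight check could look like the sketch below. The function name and the idea of validating a parsed configuration dictionary are assumptions for illustration, not part of the sparrow.

```python
def check_scheme(startup: dict) -> list:
    """Return a list of problems with targetManager.scheme; empty means OK."""
    problems = []
    target_manager = startup.get("targetManager", {})
    if not target_manager.get("enabled", False):
        return problems  # the target manager is off, nothing to validate

    scheme = target_manager.get("scheme")
    if scheme not in ("http", "https"):
        problems.append(f"targetManager.scheme must be http or https, got {scheme!r}")

    tls_enabled = startup.get("api", {}).get("tls", {}).get("enabled", False)
    if tls_enabled and scheme != "https":
        problems.append("api.tls.enabled is true, so targetManager.scheme must be https")
    return problems


config = {
    "api": {"tls": {"enabled": True}},
    "targetManager": {"enabled": True, "scheme": "http"},
}
print(check_scheme(config))
```

Applied to the example startup configuration in this README, which enables TLS but registers with scheme http, this check reports one problem.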

Currently, only one target manager exists: the GitLab target manager. It uses a GitLab project as the remote state backend. The various sparrow instances can register themselves as targets in the project. The sparrow instances will also check the project for new targets and add them to the local state. The registration is done by committing a "state" file in the main branch of the repository, which is named after the DNS name of the sparrow. The state file contains the following information:

{
  "url": "<SCHEME>://<SPARROW_DNS_NAME>",
  "lastSeen": "2021-09-30T12:00:00Z"
}
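To make the role of lastSeen concrete, the sketch below parses a state file and compares its age against an unhealthyThreshold. It assumes that a target counts as unhealthy once lastSeen is older than the threshold. That reading follows from the option descriptions above but is not spelled out in this README. The function names and the sample URL are illustrative.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_state(raw: str) -> tuple:
    """Return (url, last_seen) from the JSON content of a state file."""
    state = json.loads(raw)
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    last_seen = datetime.fromisoformat(state["lastSeen"].replace("Z", "+00:00"))
    return state["url"], last_seen


def is_stale(last_seen: datetime, threshold: timedelta, now: datetime = None) -> bool:
    """Assumed rule: stale when the last registration is older than the threshold."""
    now = now or datetime.now(timezone.utc)
    return now - last_seen > threshold


url, seen = parse_state(
    '{"url": "https://sparrow.example.com", "lastSeen": "2021-09-30T12:00:00Z"}'
)
check_time = datetime(2021, 9, 30, 19, 0, tzinfo=timezone.utc)  # seven hours later
print(url, is_stale(seen, timedelta(minutes=360), now=check_time))
```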

Check: Health

Available configuration options:

| Field | Type | Description |
| --- | --- | --- |
| interval | duration | Interval to perform the health check. |
| timeout | duration | Timeout for the health check. |
| retry.count | integer | Number of retries for the health check. |
| retry.delay | duration | Initial delay between retries for the health check. |
| targets | list of strings | List of targets to send health probe. Needs to be a valid URL. Can be another sparrow instance. Automatically updated when a targetManager is configured. |

Example configuration

health:
  interval: 10s
  timeout: 30s
  retry:
    count: 3
    delay: 1s
  targets:
    - https://example.com/
    - https://google.com/
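The retry.delay option is described as an initial delay, which suggests that the wait grows between attempts. The exact strategy is not stated in this README. The sketch below therefore assumes a simple doubling backoff purely for illustration. The helper with_retries is hypothetical and not the sparrow's implementation.

```python
import time


def with_retries(operation, count=3, delay=1.0, sleep=time.sleep):
    """Call operation(); on failure retry up to `count` times.

    Assumption: the wait doubles after every failed attempt, starting at `delay`.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= count:
                raise  # retries exhausted, surface the last error
            sleep(delay * (2 ** attempt))
            attempt += 1
```

With count: 3 and delay: 1s as in the example configuration, a probe that keeps failing would be attempted four times in total under this assumption, with waits of 1s, 2s and 4s in between.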

Health Metrics

  • sparrow_health_up
    • Type: Gauge
    • Description: Health of targets
    • Labelled with target
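A consumer that does not run Prometheus can still read this gauge from the text exposition. The sketch below parses a hand-written sample. The sample lines are invented, and it is assumed that a value of 1 means healthy, that 0 means unhealthy, and that target is the only label on the metric.

```python
import re

# Hand-written sample in the Prometheus text format; not captured from a real instance.
SAMPLE = """\
# HELP sparrow_health_up Health of targets
# TYPE sparrow_health_up gauge
sparrow_health_up{target="https://example.com/"} 1
sparrow_health_up{target="https://google.com/"} 0
"""

LINE = re.compile(r'^sparrow_health_up\{target="([^"]*)"\}\s+(\S+)$')


def health_by_target(exposition: str) -> dict:
    """Map each target to True (assumed healthy, value 1) or False."""
    result = {}
    for line in exposition.splitlines():
        match = LINE.match(line.strip())
        if match:
            result[match.group(1)] = float(match.group(2)) == 1.0
    return result


print(health_by_target(SAMPLE))
```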

Check: Latency

Available configuration options:

| Field | Type | Description |
| --- | --- | --- |
| interval | duration | Interval to perform the latency check. |
| timeout | duration | Timeout for the latency check. |
| retry.count | integer | Number of retries for the latency check. |
| retry.delay | duration | Initial delay between retries for the latency check. |
| targets | list of strings | List of targets to send latency probe. Needs to be a valid URL. Can be another sparrow instance. Automatically updated when a targetManager is configured. |

Example configuration

latency:
  interval: 10s
  timeout: 30s
  retry:
    count: 3
    delay: 1s
  targets:
    - https://example.com/
    - https://google.com/
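As a rough picture of what such a check measures, the sketch below times a single HTTP request to a target and back. It illustrates the idea only and is not the sparrow's implementation. The function name is hypothetical, and the fetch and clock parameters exist so the function can be exercised without network access.

```python
import time
import urllib.request


def measure_latency(url, timeout=30.0, fetch=urllib.request.urlopen,
                    clock=time.perf_counter):
    """Return (seconds, status) for one request to `url` and back."""
    start = clock()
    with fetch(url, timeout=timeout) as response:
        response.read()  # include the time needed to receive the body
        status = response.status
    return clock() - start, status
```

Calling measure_latency("https://example.com/") performs a real request. Note that urllib raises an exception for 4xx and 5xx responses, so a production-grade check would need to handle that case as well.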

Latency Metrics

  • sparrow_latency_duration_seconds
    • Type: Gauge
    • Description: Latency with status information of targets. This metric is DEPRECATED. Use sparrow_latency_seconds.
    • Labelled with target and status
  • sparrow_latency_seconds
    • Type: Gauge
    • Description: Latency information of targets
    • Labelled with target
  • sparrow_latency_count
    • Type: Counter
    • Description: Count of latency checks done
    • Labelled with target
  • sparrow_latency_duration
    • Type: Histogram
    • Description: Latency of targets in seconds
    • Labelled with target

Check: DNS

Caution

Breaking Change: Starting from version v0.6.0, the API returns lowercase keys instead of capitalized keys. Ensure that your code handles this change to avoid issues.

Available configuration options:

| Field | Type | Description |
| --- | --- | --- |
| interval | duration | Interval to perform the DNS check. |
| timeout | duration | Timeout for the DNS check. |
| retry.count | integer | Number of retries for the DNS check. |
| retry.delay | duration | Initial delay between retries for the DNS check. |
| targets | list of strings | List of targets to lookup. Needs to be a valid domain or IP. Can be another sparrow instance. Automatically updated when a targetManager is configured. |

Example configuration

dns:
  interval: 10s
  timeout: 30s
  retry:
    count: 3
    delay: 1s
  targets:
    - www.example.com
    - www.google.com

DNS Metrics

  • sparrow_dns_status
    • Type: Gauge
    • Description: Lookup status of targets
    • Labelled with target
  • sparrow_dns_check_count
    • Type: Counter
    • Description: Count of DNS checks done
    • Labelled with target
  • sparrow_dns_duration_seconds
    • Type: Gauge
    • Description: Duration of DNS resolution attempts
    • Labelled with target
  • sparrow_dns_duration
    • Type: Histogram
    • Description: Histogram of response times for DNS checks
    • Labelled with target

Check: Traceroute

| Field | Type | Description |
| --- | --- | --- |
| interval | duration | Interval to perform the traceroute check. |
| timeout | duration | Timeout for every hop. |
| retry.count | integer | Number of retries for the traceroute check. |
| retry.delay | duration | Initial delay between retries for the traceroute check. |
| maxHops | integer | Maximum number of hops to try before giving up. |
| targets | list of objects | List of targets to traceroute to. |
| targets[].addr | string | The address of the target to traceroute to. Can be an IP address or DNS name. |
| targets[].port | uint16 | The port of the target to traceroute to. Default is 80. |

Example configuration

traceroute:
  interval: 5s
  timeout: 3s
  retry:
    count: 3
    delay: 1s
  maxHops: 30
  targets:
    - addr: 8.8.8.8
      port: 53
    - addr: www.google.com
      port: 80
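Unlike the other checks, traceroute targets are objects with an addr and an optional port. The sketch below models that shape with the documented default of 80 and the uint16 range. The class and function names are hypothetical and only meant for validating a configuration before it is handed to the sparrow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracerouteTarget:
    addr: str       # IP address or DNS name
    port: int = 80  # documented default

    def __post_init__(self):
        if not self.addr:
            raise ValueError("addr must not be empty")
        if not 0 <= self.port <= 65535:
            raise ValueError(f"port {self.port} does not fit into a uint16")


def parse_targets(raw: list) -> list:
    """Build targets from the already parsed `targets` list of the check config."""
    return [TracerouteTarget(**entry) for entry in raw]


targets = parse_targets([
    {"addr": "8.8.8.8", "port": 53},
    {"addr": "www.google.com"},  # port omitted, falls back to 80
])
print(targets)
```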

Optional Capabilities

Sparrow does not need any extra permissions to run this check. However, some data, like the IP address of the hop that dropped a packet, will not be available. To enable this functionality, there are two options:

  • Run sparrow as root:

    sudo sparrow run --config config.yaml
  • Allow sparrow to create raw sockets, by assigning the CAP_NET_RAW capability to the sparrow binary:

    sudo setcap 'cap_net_raw=ep' sparrow

Traceroute Prometheus Metrics

  • sparrow_traceroute_check_duration_ms{target="google.com"} 43150
    • Type: Gauge
    • Description: How long the last traceroute took for this target in total
  • sparrow_traceroute_minimum_hops{target="google.com"} 14
    • Type: Gauge
    • Description: The minimum number of hops required to reach a target

Traceroute API Metrics

The traceroute check exposes additional data through its REST API that isn't available in Prometheus. This data gives a more detailed breakdown of the trace and can be found at /v1/metrics/traceroute. It is meant to be a JSON representation of traditional traceroute output:

$ traceroute -T -q 1 100.1.2.2
 1  200.2.0.1 (200.2.0.1)  2 ms
 2  11.0.0.34 (11.0.0.34)  5 ms
 ...

This is roughly equivalent to:

{
  "data": {
    "100.1.2.2": {
      "MinHops": 1,
      "Hops": {
        "1": [
          {
            "Latency": 2,
            "Addr": {
              "IP": "200.2.0.1",
              "Port": 80,
              "Zone": ""
            },
            "Name": "",
            "Ttl": 1,
            "Reached": false
          }
        ],
        "2": [
          {
            "Latency": 5,
            "Addr": {
              "IP": "11.0.0.34",
              "Port": 80,
              "Zone": ""
            },
            "Name": "",
            "Ttl": 2,
            "Reached": false
          }
        ]
        ...
      }
    }
  },
  "timestamp": "2024-07-26T15:49:39.60760766+02:00"
}
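To show how the two representations relate, the sketch below turns a response of this shape back into traceroute-style lines. It uses the key names exactly as they appear in the sample, so responses from versions with the lowercase-key change would need adjusted names. Treating Latency as milliseconds is inferred from the sample pairing (2 ms, Latency 2) and is an assumption.

```python
import json

# Trimmed copy of the sample response, without the elided hops.
RESPONSE = json.loads("""
{
  "data": {
    "100.1.2.2": {
      "MinHops": 1,
      "Hops": {
        "1": [{"Latency": 2, "Addr": {"IP": "200.2.0.1", "Port": 80, "Zone": ""},
               "Name": "", "Ttl": 1, "Reached": false}],
        "2": [{"Latency": 5, "Addr": {"IP": "11.0.0.34", "Port": 80, "Zone": ""},
               "Name": "", "Ttl": 2, "Reached": false}]
      }
    }
  },
  "timestamp": "2024-07-26T15:49:39.60760766+02:00"
}
""")


def render(response: dict) -> list:
    """Return traceroute-like text lines for every target in the response."""
    lines = []
    for target, trace in response["data"].items():
        lines.append(f"traceroute to {target}")
        for ttl in sorted(trace["Hops"], key=int):  # hop keys are numeric strings
            for hop in trace["Hops"][ttl]:
                ip = hop["Addr"]["IP"]
                name = hop["Name"] or ip  # fall back to the IP when no name is set
                lines.append(f"{int(ttl):2d}  {name} ({ip})  {hop['Latency']} ms")
    return lines


print("\n".join(render(RESPONSE)))
```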

API

Caution

Breaking Change: Starting from version v0.6.0, the API returns lowercase keys instead of capitalized keys. Ensure that your code handles this change to avoid issues.

The sparrow exposes an API for accessing the results of various checks. Each check registers its own endpoint at /v1/metrics/{check-name}. The API's definition is available at /openapi.
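Clients that have to talk to sparrow versions on both sides of the key-casing change can normalize responses before reading them. The sketch below lowercases every key recursively, so the same lookups work whether the API returned capitalized or lowercase keys. The helper name is hypothetical, and the exact casing used by newer versions is not shown in this README.

```python
def lower_keys(value):
    """Recursively lowercase all dictionary keys in a decoded JSON value."""
    if isinstance(value, dict):
        return {str(key).lower(): lower_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [lower_keys(item) for item in value]
    return value


old_style = {"MinHops": 1, "Hops": {"1": [{"Latency": 2, "Addr": {"IP": "200.2.0.1"}}]}}
normalized = lower_keys(old_style)
print(normalized["hops"]["1"][0]["addr"]["ip"])
```

Target names that appear as keys are lowercased as well, which is harmless for DNS names and IP addresses.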

Metrics, Telemetry & Dashboards

The sparrow provides a /metrics endpoint to expose application metrics. In addition to runtime information, the sparrow provides specific metrics for each check. Refer to the Checks section for more detailed information.

Prometheus Integration

The sparrow metrics API is designed to be compatible with Prometheus. To integrate sparrow with Prometheus, add the following scrape configuration to your Prometheus configuration file:

scrape_configs:
  - job_name: 'sparrow'
    static_configs:
      - targets: ['<sparrow_instance_address>:8080']

Replace <sparrow_instance_address> with the actual address of your sparrow instance.

Traces

The sparrow supports exporting telemetry data using the OpenTelemetry Protocol (OTLP). This allows users to choose their preferred telemetry provider and collector. The following configuration options are available for setting up telemetry:

| Field | Type | Description |
| --- | --- | --- |
| enabled | bool | Whether to enable telemetry. Default: false |
| exporter | string | The telemetry exporter to use. Options: grpc, http, stdout, noop |
| url | string | The address to export telemetry to. |
| token | string | The token to use for authentication. |
| tls.enabled | bool | Enable or disable TLS. |
| tls.certPath | string | The path to the TLS certificate to use. Only required if custom TLS is used. |

For example, to export telemetry data using OTLP via gRPC, you can add the following configuration to your startup configuration:

telemetry:
  # Whether to enable telemetry. (default: false)
  enabled: true
  # The telemetry exporter to use.
  # Options:
  # grpc: Exports telemetry using OTLP via gRPC.
  # http: Exports telemetry using OTLP via HTTP.
  # stdout: Prints telemetry to stdout.
  # noop | "": Disables telemetry.
  exporter: grpc
  # The address to export telemetry to.
  url: collector.example.com:4317
  # The token to use for authentication.
  # If the exporter does not require a token, this can be left empty.
  token: ""
  tls:
    # Enable or disable TLS
    enabled: true
    # The path to the tls certificate to use.
    # Only required if your otel endpoint uses custom TLS certificates
    certPath: ""

Since OTLP is a standard protocol, you can choose any collector that supports it. The stdout exporter can be used for debugging purposes to print telemetry data to the console, while the noop exporter disables telemetry. If an external collector is used, a bearer token for authentication and a TLS certificate path for secure communication can be provided.

Grafana Dashboards

A sample Grafana dashboard to visualize the metrics collected by the checks is available in the examples directory of the repository. How to import dashboards into Grafana is documented here.

Code of Conduct

This project has adopted the Contributor Covenant in version 2.1 as our code of conduct. Please see the details in our CODE_OF_CONDUCT.md. All contributors must abide by the code of conduct.

By participating in this project, you agree to abide by its Code of Conduct at all times.

Working Language

We decided to apply English as the primary project language.

Consequently, all content will be made available primarily in English. We also ask all interested people to use English as the preferred language when creating issues, in their code (comments, documentation, etc.) and when sending requests to us. The application itself and all end-user facing content will be made available in other languages as needed.

Support and Feedback

The following channels are available for discussions, feedback, and support requests:

| Type | Channel |
| --- | --- |
| Issues | Issues |

How to Contribute

Contributions and feedback are encouraged and always welcome. For more information about how to contribute, the project structure, as well as additional contribution information, see our Contribution Guidelines. By participating in this project, you agree to abide by its Code of Conduct at all times.

Licensing

This project follows the REUSE standard for software licensing. Each file contains copyright and license information, and license texts can be found in the ./LICENSES folder. For more information visit https://reuse.software/. You can find a guide for developers at Reuse Template Docs.

