lychee

⚡ A fast, async, stream-based link checker written in Rust.
Finds broken hyperlinks and mail addresses inside Markdown, HTML, reStructuredText, or any other text file or website!

Available as a command-line utility, a library and a GitHub Action.


Development

After installing Rust, use Cargo for building and testing. On Linux the OpenSSL package is required to compile reqwest, a dependency of lychee. For Nix we provide a flake so you can use nix develop and nix build.

Installation

Arch Linux

pacman -S lychee

OpenSUSE Tumbleweed

zypper in lychee

macOS

Via Homebrew:

brew install lychee

Via MacPorts:

sudo port install lychee

Docker

docker pull lycheeverse/lychee

NixOS

nix-env -iA nixos.lychee

Nixpkgs

FreeBSD

pkg install lychee

Scoop

scoop install lychee

Termux

pkg install lychee

Alpine Linux

# available for Alpine Edge in testing repositories
apk add lychee

Chocolatey (Windows)

choco install lychee

Pre-built binaries

We provide binaries for Linux, macOS, and Windows for every release.
You can download them from the releases page.

Cargo

Build dependencies

On APT/dpkg-based Linux distros (e.g. Debian, Ubuntu, Linux Mint and Kali Linux) the following commands will install all required build dependencies, including the Rust toolchain and cargo:

curl -sSf 'https://sh.rustup.rs' | sh
apt install gcc pkg-config libc6-dev libssl-dev

Compile and install lychee

cargo install lychee

Feature flags

Lychee supports several feature flags:

  • native-tls enables the platform-native TLS crate native-tls.
  • vendored-openssl compiles and statically links a copy of OpenSSL. See the corresponding feature of the openssl crate.
  • rustls-tls enables the alternative TLS crate rustls.
  • email-check enables checking email addresses using the check-if-email-exists crate. This feature requires the native-tls feature.
  • check_example_domains allows checking example domains such as example.com. This feature is useful for testing.

By default, native-tls and email-check are enabled.

Features

This comparison is made on a best-effort basis. Please create a PR to fix outdated information.

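| | lychee | awesome_bot | muffet | broken-link-checker | linkinator | linkchecker | markdown-link-check | fink |
|---|---|---|---|---|---|---|---|---|
| Language | Rust | Ruby | Go | JS | TypeScript | Python | JS | PHP |
| Async/Parallel | yes | yes | yes | yes | yes | yes | yes | yes |
| JSON output | yes | no | yes | yes | yes | maybe¹ | yes | yes |
| Static binary | yes | no | yes | no | no | no | no | no |
| Markdown files | yes | yes | no | no | no | yes | yes | no |
| HTML files | yes | no | no | yes | yes | no | yes | no |
| Text files | yes | no | no | no | no | no | no | no |
| Website support | yes | no | yes | yes | yes | yes | no | yes |
| Chunked encodings | yes | maybe | maybe | maybe | maybe | no | yes | yes |
| GZIP compression | yes | maybe | maybe | yes | maybe | yes | maybe | no |
| Basic Auth | yes | no | no | yes | no | yes | no | no |
| Custom user agent | yes | no | no | yes | no | yes | no | no |
| Relative URLs | yes | yes | no | yes | yes | yes | yes | yes |
| Anchors/Fragments | yes | no | no | no | no | yes | yes | no |
| Skip relative URLs | yes | no | no | maybe | no | no | no | no |
| Include patterns | yes | yes | no | yes | no | no | no | no |
| Exclude patterns | yes | no | yes | yes | yes | yes | yes | yes |
| Handle redirects | yes | yes | yes | yes | yes | yes | yes | yes |
| Ignore insecure SSL | yes | yes | yes | no | no | yes | no | yes |
| File globbing | yes | yes | no | no | yes | no | yes | no |
| Limit scheme | yes | no | no | yes | no | yes | no | no |
| Custom headers | yes | no | yes | no | no | no | yes | yes |
| Summary | yes | yes | yes | maybe | yes | yes | no | yes |
| HEAD requests | yes | yes | no | yes | yes | yes | no | no |
| Colored output | yes | maybe | yes | maybe | yes | yes | no | yes |
| Filter status code | yes | yes | no | no | no | no | yes | no |
| Custom timeout | yes | yes | yes | no | yes | yes | no | yes |
| E-mail links | yes | no | no | no | no | yes | no | no |
| Progress bar | yes | yes | no | no | no | yes | yes | yes |
| Retry and backoff | yes | no | no | no | yes | no | yes | no |
| Skip private domains | yes | no | no | no | no | no | no | no |
| Use as library | yes | yes | no | yes | yes | no | yes | no |
| Quiet mode | yes | no | no | no | yes | yes | yes | yes |
| Config file | yes | no | no | no | yes | yes | yes | no |
| Cookies | yes | no | yes | no | no | yes | no | yes |
| Recursion | no | no | yes | yes | yes | yes | yes | no |
| Amazing lychee logo | yes | no | no | no | no | no | no | no |

¹ Other machine-readable formats like CSV are supported.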

Commandline usage

Recursively check all links in supported files inside the current directory

lychee .

You can also specify various types of inputs:

# check links in specific local file(s):
lychee README.md
lychee test.html info.txt

# check links on a website:
lychee https://endler.dev

# check links in directory but block network requests
lychee --offline path/to/directory

# check links in a remote file:
lychee https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md

# check links in local files via shell glob:
lychee ~/projects/*/README.md

# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"

# ignore case when globbing and check result for each link:
lychee --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"

# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -

lychee parses other file formats as plaintext and extracts links using linkify. This generally works well if there are no format or encoding specifics, but in case you need dedicated support for a new file format, please consider creating an issue.

Docker Usage

Here's how to mount a local directory into the container and check some input with lychee.

  • The --init parameter is passed so that lychee can be stopped from the terminal.
  • We also pass -it to start an interactive terminal, which is required to show the progress bar.
  • The --rm parameter removes the container from the host once the run has finished (self-cleanup).
  • The -w /input parameter sets /input as the default workspace.
  • The -v $(pwd):/input parameter mounts the current local directory into the container so that lychee can access it.

By default a Debian-based Docker image is used. If you want to run an Alpine-based image, use the latest-alpine tag. For example, lycheeverse/lychee:latest-alpine

Linux/macOS shell command

docker run --init -it --rm -w /input -v $(pwd):/input lycheeverse/lychee README.md

Windows PowerShell command

docker run --init -it --rm -w /input -v ${PWD}:/input lycheeverse/lychee README.md

GitHub Token

To avoid getting rate-limited while checking GitHub links, you can optionally set an environment variable with your GitHub token like so GITHUB_TOKEN=xxxx, or use the --github-token CLI option. It can also be set in the config file. Here is an example config file.

The token can be generated on your GitHub account settings page. A personal access token with no extra permissions is enough to be able to check public repo links.

For more scalable organization-wide scenarios you can consider a GitHub App. It has a higher rate limit than personal access tokens but requires additional configuration steps on your GitHub workflow. Please follow the GitHub App Setup example.

Commandline Parameters

There is an extensive list of command line parameters to customize the behavior. See below for a full list.

A fast, async link checker

Finds broken URLs and mail addresses inside Markdown, HTML, `reStructuredText`, websites and more!

Usage: lychee [OPTIONS] <inputs>...

Arguments:
  <inputs>...
          The inputs (where to get links to check from). These can be: files (e.g. `README.md`), file globs (e.g. `"~/git/*/README.md"`), remote URLs (e.g. `https://example.com/README.md`) or standard input (`-`). NOTE: Use `--` to separate inputs from options that allow multiple arguments

Options:
  -c, --config <CONFIG_FILE>
          Configuration file to use
          [default: lychee.toml]

  -v, --verbose...
          Set verbosity level; more output per occurrence (e.g. `-v` or `-vv`)

  -q, --quiet...
          Less output per occurrence (e.g. `-q` or `-qq`)

  -n, --no-progress
          Do not show progress bar.
          This is recommended for non-interactive shells (e.g. for continuous integration)

      --extensions <EXTENSIONS>
          Test the specified file extensions for URIs when checking files locally.
          Multiple extensions can be separated by commas. Note that if you want to check filetypes,
          which have multiple extensions, e.g. HTML files with both .html and .htm extensions, you need to
          specify both extensions explicitly.
          [default: md,mkd,mdx,mdown,mdwn,mkdn,mkdown,markdown,html,htm,txt]

      --cache
          Use request cache stored on disk at `.lycheecache`

      --max-cache-age <MAX_CACHE_AGE>
          Discard all cached requests older than this duration
          [default: 1d]

      --cache-exclude-status <CACHE_EXCLUDE_STATUS>
          A list of status codes that will be ignored from the cache
          The following accept range syntax is supported: [start]..[=]end|code. Some valid
          examples are:
          - 429
          - 500..=599
          - 500..
          Use "lychee --cache-exclude-status '429, 500..502' <inputs>..." to provide a comma-separated
          list of excluded status codes. This example will not cache results with a status code of 429, 500,
          501 and 502.
          [default: ]

      --dump
          Don't perform any link checking. Instead, dump all the links extracted from inputs that would be checked

      --dump-inputs
          Don't perform any link extraction and checking. Instead, dump all input sources from which links would be collected

      --archive <ARCHIVE>
          Specify the use of a specific web archive. Can be used in combination with `--suggest`
          [possible values: wayback]

      --suggest
          Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`

  -m, --max-redirects <MAX_REDIRECTS>
          Maximum number of allowed redirects
          [default: 5]

      --max-retries <MAX_RETRIES>
          Maximum number of retries per request
          [default: 3]

      --max-concurrency <MAX_CONCURRENCY>
          Maximum number of concurrent network requests
          [default: 128]

  -T, --threads <THREADS>
          Number of threads to utilize. Defaults to number of cores available to the system

  -u, --user-agent <USER_AGENT>
          User agent
          [default: lychee/x.y.z]

  -i, --insecure
          Proceed for server connections considered insecure (invalid TLS)

  -s, --scheme <SCHEME>
          Only test links with the given schemes (e.g. https). Omit to check links with any other scheme. At the moment, we support http, https, file, and mailto

      --offline
          Only check local files and block network requests

      --include <INCLUDE>
          URLs to check (supports regex). Has preference over all excludes

      --exclude <EXCLUDE>
          Exclude URLs and mail addresses from checking (supports regex)

      --exclude-file <EXCLUDE_FILE>
          Deprecated; use `--exclude-path` instead

      --exclude-path <EXCLUDE_PATH>
          Exclude file path from getting checked

  -E, --exclude-all-private
          Exclude all private IPs from checking.
          Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`

      --exclude-private
          Exclude private IP address ranges from checking

      --exclude-link-local
          Exclude link-local IP address range from checking

      --exclude-loopback
          Exclude loopback IP address range and localhost from checking

      --exclude-mail
          Exclude all mail addresses from checking (deprecated; excluded by default)

      --include-mail
          Also check email addresses

      --remap <REMAP>
          Remap URI matching pattern to different URI

      --fallback-extensions <FALLBACK_EXTENSIONS>
          Test the specified file extensions for URIs when checking files locally.
          Multiple extensions can be separated by commas. Extensions will be checked in
          order of appearance.
          Example: --fallback-extensions html,htm,php,asp,aspx,jsp,cgi

      --header <HEADER>
          Custom request header

  -a, --accept <ACCEPT>
          A List of accepted status codes for valid links
          The following accept range syntax is supported: [start]..[=]end|code. Some valid
          examples are:
          - 200..=204
          - 200..204
          - ..=204
          - ..204
          - 200
          Use "lychee --accept '200..=204, 429, 500' <inputs>..." to provide a comma-
          separated list of accepted status codes. This example will accept 200, 201,
          202, 203, 204, 429, and 500 as valid status codes.
          [default: 100..=103,200..=299]

      --include-fragments
          Enable the checking of fragments in links

  -t, --timeout <TIMEOUT>
          Website timeout in seconds from connect to response finished
          [default: 20]

  -r, --retry-wait-time <RETRY_WAIT_TIME>
          Minimum wait time in seconds between retries of failed requests
          [default: 1]

  -X, --method <METHOD>
          Request method
          [default: get]

      --base <BASE>
          Deprecated; use `--base-url` instead

  -b, --base-url <BASE_URL>
          Base URL used to resolve relative URLs during link checking Example: <https://example.com>

      --root-dir <ROOT_DIR>
          Root path to use when checking absolute local links, must be an absolute path

      --basic-auth <BASIC_AUTH>
          Basic authentication support. E.g. `http://example.com username:password`

      --github-token <GITHUB_TOKEN>
          GitHub API token to use when checking github.com links, to avoid rate limiting
          [env: GITHUB_TOKEN]

      --skip-missing
          Skip missing input files (default is to error if they don't exist)

      --no-ignore
          Do not skip files that would otherwise be ignored by '.gitignore', '.ignore', or the global ignore file

      --hidden
          Do not skip hidden directories and files

      --include-verbatim
          Find links in verbatim sections like `pre`- and `code` blocks

      --glob-ignore-case
          Ignore case when expanding filesystem path glob inputs

  -o, --output <OUTPUT>
          Output file of status report

      --mode <MODE>
          Set the output display mode. Determines how results are presented in the terminal
          [default: color]
          [possible values: plain, color, emoji, task]

  -f, --format <FORMAT>
          Output format of final status report
          [default: compact]
          [possible values: compact, detailed, json, markdown, raw]

      --require-https
          When HTTPS is available, treat HTTP links as errors

      --cookie-jar <COOKIE_JAR>
          Tell lychee to read cookies from the given file. Cookies will be stored in the cookie jar and sent with requests. New cookies will be stored in the cookie jar and existing cookies will be updated

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
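The range syntax accepted by `--accept` (`[start]..[=]end|code`) can be illustrated with a small standalone parser. This is only a sketch, not lychee's own implementation: the function names are hypothetical, the ranges follow Rust's conventions (`..` excludes the end, `..=` includes it), and the bounds used for open-ended ranges (100 and 599) are an assumption.

```rust
use std::collections::BTreeSet;

/// Bounds used for open-ended ranges such as `..204` or `500..`.
/// These values are an assumption of this sketch, not taken from lychee.
const MIN_CODE: u16 = 100;
const MAX_CODE: u16 = 599;

/// Parse one selector such as `429`, `200..204`, `200..=204`, `..=204` or `500..`.
fn parse_selector(selector: &str) -> Option<BTreeSet<u16>> {
    let s = selector.trim();
    let Some((start, end)) = s.split_once("..") else {
        // A plain status code without a range operator.
        let code: u16 = s.parse().ok()?;
        return Some(BTreeSet::from([code]));
    };
    let start: u16 = if start.is_empty() {
        MIN_CODE
    } else {
        start.trim().parse().ok()?
    };
    // `..=end` includes the end, `..end` excludes it, and `start..` is open-ended.
    let end_inclusive: u16 = if let Some(end) = end.strip_prefix('=') {
        end.trim().parse().ok()?
    } else if end.is_empty() {
        MAX_CODE
    } else {
        end.trim().parse::<u16>().ok()?.checked_sub(1)?
    };
    Some((start..=end_inclusive).collect())
}

/// Parse a comma-separated list of selectors into one set of status codes.
fn parse_accept(list: &str) -> Option<BTreeSet<u16>> {
    let mut codes = BTreeSet::new();
    for selector in list.split(',') {
        codes.extend(parse_selector(selector)?);
    }
    Some(codes)
}

fn main() {
    let accepted = parse_accept("200..=204, 429, 500").expect("valid selector list");
    // Prints {200, 201, 202, 203, 204, 429, 500}
    println!("{accepted:?}");
}
```

The example input matches the one in the help text above and yields the same seven codes.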

Exit codes

  • 0 for success (all links checked successfully or excluded/skipped as configured)
  • 1 for missing inputs and any unexpected runtime failures or config errors
  • 2 for link check failures (if any non-excluded link failed the check)
  • 3 for errors in the config file
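A script or build step that wraps lychee can branch on these exit codes. The following is a minimal sketch that assumes a `lychee` binary is available on `PATH`; the helper `describe_exit_code` and the `README.md` input are illustrative choices, not part of lychee.

```rust
use std::process::Command;

/// Map a lychee exit code to the meaning listed above.
fn describe_exit_code(code: i32) -> &'static str {
    match code {
        0 => "success",
        1 => "missing inputs, unexpected runtime failure or config error",
        2 => "link check failures",
        3 => "errors in the config file",
        _ => "unknown exit code",
    }
}

fn main() {
    // Assumes `lychee` is installed and on PATH; README.md is only an example input.
    match Command::new("lychee").args(["--no-progress", "README.md"]).status() {
        Ok(status) => match status.code() {
            Some(code) => println!("lychee exited with {code}: {}", describe_exit_code(code)),
            None => println!("lychee was terminated by a signal"),
        },
        Err(err) => eprintln!("could not start lychee: {err}"),
    }
}
```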

Ignoring links

You can exclude links from getting checked by specifying regex patterns with --exclude (e.g. --exclude example\.(com|org)).

Here are some examples:

# Exclude LinkedIn URLs (note that we match on the full URL, including the scheme, to avoid false-positives)
lychee --exclude '^https://www\.linkedin\.com'

# Exclude LinkedIn and Archive.org URLs
lychee --exclude '^https://www\.linkedin\.com' --exclude '^https://web\.archive\.org/web/'

# Exclude all links to PDF files
lychee --exclude '\.pdf$' .

# Exclude links to specific domains
lychee --exclude '(facebook|twitter|linkedin)\.com' .

# Exclude links with certain URL parameters
lychee --exclude '\?utm_source=' .

# Exclude all mailto links
lychee --exclude '^mailto:' .

For excluding files/directories from being scanned use lychee.toml and exclude_path.

exclude_path = ["some/path", "*/dev/*"]

If a file named .lycheeignore exists in the current working directory, its contents are excluded as well. The file allows you to list multiple regular expressions for exclusion (one pattern per line).

For more advanced usage and detailed explanations, check out our comprehensive guide on excluding links.
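To make the relationship between a `.lycheeignore` file and `--exclude` concrete, the sketch below turns a one-pattern-per-line text into the equivalent list of `--exclude` arguments. It does not reflect how lychee reads the file internally; the function name `exclude_args` is hypothetical, and skipping blank lines is an assumption.

```rust
/// Turn `.lycheeignore`-style contents (one regex per line) into `--exclude` arguments.
/// Skipping blank lines is an assumption of this sketch.
fn exclude_args(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .flat_map(|pattern| ["--exclude".to_string(), pattern.to_string()])
        .collect()
}

fn main() {
    // Two patterns taken from the examples above.
    let contents = "^https://www\\.linkedin\\.com\n\\.pdf$\n";
    let args = exclude_args(contents);
    // Show the command line that the patterns correspond to.
    println!("lychee {} .", args.join(" "));
}
```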

Caching

If the --cache flag is set, lychee will cache responses in a file called .lycheecache in the current directory. If the file exists and the flag is set, then the cache will be loaded on startup. This can greatly speed up future runs. Note that by default lychee will not store any data on disk.
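The effect of `--max-cache-age` on a loaded cache can be pictured as filtering out entries older than the allowed age. The sketch below is a simplified in-memory model and says nothing about the actual `.lycheecache` file format; the `CachedResult` type and `discard_stale` function are hypothetical.

```rust
use std::collections::HashMap;

/// A cached check result: the HTTP status and when it was recorded (Unix seconds).
#[derive(Debug, Clone, PartialEq)]
struct CachedResult {
    status: u16,
    recorded_at: u64,
}

/// Keep only entries that are at most `max_age_secs` old at time `now`.
fn discard_stale(
    cache: HashMap<String, CachedResult>,
    now: u64,
    max_age_secs: u64,
) -> HashMap<String, CachedResult> {
    cache
        .into_iter()
        .filter(|(_, entry)| now.saturating_sub(entry.recorded_at) <= max_age_secs)
        .collect()
}

fn main() {
    let one_day = 24 * 60 * 60; // mirrors the documented `--max-cache-age` default of 1d
    let now = 1_700_000_000; // an arbitrary example timestamp
    let mut cache = HashMap::new();
    cache.insert(
        "https://example.com/fresh".to_string(),
        CachedResult { status: 200, recorded_at: now - 60 },
    );
    cache.insert(
        "https://example.com/stale".to_string(),
        CachedResult { status: 200, recorded_at: now - 2 * one_day },
    );

    // Only the fresh entry survives, so the stale URL would be requested again.
    let cache = discard_stale(cache, now, one_day);
    for (url, entry) in &cache {
        println!("{url}: cached status {}", entry.status);
    }
}
```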

Library usage

You can use lychee as a library for your own projects! Here is a "hello world" example:

use lychee_lib::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let response = lychee_lib::check("https://github.com/lycheeverse/lychee").await?;
    println!("{response}");
    Ok(())
}

This is equivalent to the following snippet, in which we build our own client:

use lychee_lib::{ClientBuilder, Result, Status};

#[tokio::main]
async fn main() -> Result<()> {
    let client = ClientBuilder::default().client()?;
    let response = client.check("https://github.com/lycheeverse/lychee").await?;
    assert!(response.status().is_success());
    Ok(())
}

The client builder is very customizable:

let client = lychee_lib::ClientBuilder::builder()
    .includes(includes)
    .excludes(excludes)
    .max_redirects(cfg.max_redirects)
    .user_agent(cfg.user_agent)
    .allow_insecure(cfg.insecure)
    .custom_headers(headers)
    .method(method)
    .timeout(timeout)
    .github_token(cfg.github_token)
    .scheme(cfg.scheme)
    .accepted(accepted)
    .build()
    .client()?;

All options that you set will be used for all link checks. See the builder documentation for all options. For more information, check out the examples folder.

GitHub Action Usage

A GitHub Action that uses lychee is available as a separate repository: lycheeverse/lychee-action, which includes usage instructions.

Pre-commit Usage

Lychee can also be used as a pre-commit hook.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/lycheeverse/lychee.git
    rev: v0.15.1
    hooks:
      - id: lychee
        # Optionally include additional CLI arguments
        args: ["--no-progress", "--exclude", "file://"]

Rather than running on staged files only, Lychee can be run against an entire repository.

- id: lychee
  args: ["--no-progress", "."]
  pass_filenames: false

Contributing to lychee

We'd be thankful for any contribution.
We try to keep the issue tracker up-to-date so you can quickly find a task to work on.

Try one of these links to get started:

For more detailed instructions, head over to CONTRIBUTING.md.

Troubleshooting and Workarounds

We collect a list of common workarounds for various websites in our troubleshooting guide.

Users

If you are using lychee for your project, please add it here.

Credits

The first prototype of lychee was built in episode 10 of Hello Rust. Thanks to all GitHub and Patreon sponsors for supporting the development since the beginning. Also, thanks to all the great contributors who have since made this project more mature.

License

lychee is licensed under either of

  • Apache-2.0 (LICENSE-APACHE)
  • MIT (LICENSE-MIT)

at your option.


