twiny/spidy

Domain names collector - Crawl websites and collect domain names along with their availability status.


A tool that crawls websites to find domain names and checks their availability.
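The README does not spell out how availability is determined; a cheap first pass is a DNS lookup, with WHOIS/RDAP as the authoritative check. The Go sketch below is illustrative only — `likelyAvailable` is a hypothetical helper, not part of spidy's API:

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// likelyAvailable is a rough availability check, not spidy's actual method:
// a domain whose NS lookup returns NXDOMAIN is probably unregistered.
// WHOIS/RDAP is the authoritative source; this is only a cheap first-pass filter.
func likelyAvailable(domain string) (bool, error) {
	_, err := net.LookupNS(domain)
	if err == nil {
		return false, nil // name servers exist -> registered
	}
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) && dnsErr.IsNotFound {
		return true, nil // NXDOMAIN -> likely available
	}
	return false, err // temporary/network error: status unknown
}

func main() {
	for _, d := range []string{"github.com", "this-name-is-surely-free-12345.org"} {
		ok, err := likelyAvailable(d)
		fmt.Println(d, ok, err)
	}
}
```

Parked or reserved names can still pass this filter, so a registrar-backed lookup should confirm the result before it is trusted.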

Install

```bash
git clone https://github.com/twiny/spidy.git
cd ./spidy

# build
go build -o bin/spidy -v cmd/spidy/main.go

# run
./bin/spidy -c config/config.yaml -u https://github.com
```

Usage

```text
NAME:
   Spidy - Domain name scraper

USAGE:
   spidy [global options] command [command options] [arguments...]

VERSION:
   2.0.0

COMMANDS:
   help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --config path, -c path  path to config file
   --help, -h              show help (default: false)
   --urls urls, -u urls    urls of page to scrape (accepts multiple inputs)
   --version, -v           print the version (default: false)
```
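The --urls flag accepts multiple inputs; in CLI frameworks with this help layout that normally means repeating the flag, e.g. `./bin/spidy -c config/config.yaml -u https://github.com -u https://example.com` (an assumed invocation; the second URL is a placeholder).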

Configuration

```yaml
# main crawler config
crawler:
  max_depth: 10 # max depth of pages to visit per website.
  # filter: [] # regexp filter
  rate_limit: "1/5s" # 1 request per 5 sec
  max_body_size: "20MB" # max page body size
  user_agents: # array of user-agents
    - "Spidy/2.1; +https://github.com/twiny/spidy"
  # proxies: [] # array of proxy. http(s), SOCKS5

# Logs
log:
  rotate: 7 # log rotation
  path: "./log" # log directory

# Store
store:
  ttl: "24h" # keep cache for 24h
  path: "./store" # store directory

# Results
result:
  path: ./result # result directory

parralle: 3 # number of concurrent workers
timeout: "5m" # request timeout
tlds: ["biz", "cc", "com", "edu", "info", "net", "org", "tv"] # array of domain extensions to check.
```
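The rate_limit value packs a request count and a time window into one string ("1/5s" reads as 1 request per 5 seconds). As a sketch of how such a value can be parsed in Go — the format is assumed from the config comment, and this is not spidy's internal parser:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseRateLimit splits an assumed "<requests>/<window>" string such as
// "1/5s" into a request count and a time.Duration window.
func parseRateLimit(s string) (int, time.Duration, error) {
	parts := strings.SplitN(s, "/", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("rate limit %q: want <n>/<duration>", s)
	}
	n, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("rate limit %q: %w", s, err)
	}
	window, err := time.ParseDuration(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("rate limit %q: %w", s, err)
	}
	return n, window, nil
}

func main() {
	n, window, err := parseRateLimit("1/5s")
	if err != nil {
		panic(err)
	}
	// Dividing the window by the count gives the interval a limiter
	// would wait between requests to stay under the limit.
	fmt.Printf("%d request(s) per %s -> tick every %s\n", n, window, window/time.Duration(n))
}
```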

TODO

  • Add support for more writers.
  • Add terminal logging.
  • Add test cases.

Issues

NOTE: This package is provided "as is" with no guarantee. Use it at your own risk and always test it yourself before using it in a production environment. If you find any issues, please create a new issue.
