A tool that crawls websites to find domain names and checks their availability.
```sh
git clone https://github.com/twiny/spidy.git
cd ./spidy

# build
go build -o bin/spidy -v cmd/spidy/main.go

# run
./bin/spidy -c config/config.yaml -u https://github.com
```
```
NAME:
   Spidy - Domain name scraper

USAGE:
   spidy [global options] command [command options] [arguments...]

VERSION:
   2.0.0

COMMANDS:
   help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --config path, -c path  path to config file
   --help, -h              show help (default: false)
   --urls urls, -u urls    urls of page to scrape (accepts multiple inputs)
   --version, -v           print the version (default: false)
```
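Since `--urls` accepts multiple inputs, the flag can most likely be repeated to seed the crawler with several start pages (the second URL below is a placeholder, and the repeated-flag form assumes the usual CLI convention for multi-value flags):

```sh
./bin/spidy -c config/config.yaml -u https://github.com -u https://example.com
```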
```yaml
# main crawler config
crawler:
  max_depth: 10 # max depth of pages to visit per website.
  # filter: [] # regexp filter
  rate_limit: "1/5s" # 1 request per 5 sec
  max_body_size: "20MB" # max page body size
  user_agents: # array of user-agents
    - "Spidy/2.1; +https://github.com/twiny/spidy"
  # proxies: [] # array of proxy. http(s), SOCKS5

# Logs
log:
  rotate: 7 # log rotation
  path: "./log" # log directory

# Store
store:
  ttl: "24h" # keep cache for 24h
  path: "./store" # store directory

# Results
result:
  path: ./result # result directory

parralle: 3 # number of concurrent workers
timeout: "5m" # request timeout
tlds: ["biz", "cc", "com", "edu", "info", "net", "org", "tv"] # array of domain extension to check.
```
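For illustration only, the sketch below shows how a config file with this shape could be loaded in Go using `gopkg.in/yaml.v3`. The `Config` struct and its field names are assumptions made for this example, not spidy's internal types.

```go
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Config mirrors the YAML layout above. Struct and field names are
// illustrative assumptions, not spidy's actual types.
type Config struct {
	Crawler struct {
		MaxDepth    int      `yaml:"max_depth"`
		RateLimit   string   `yaml:"rate_limit"`
		MaxBodySize string   `yaml:"max_body_size"`
		UserAgents  []string `yaml:"user_agents"`
	} `yaml:"crawler"`
	Log struct {
		Rotate int    `yaml:"rotate"`
		Path   string `yaml:"path"`
	} `yaml:"log"`
	Store struct {
		TTL  string `yaml:"ttl"`
		Path string `yaml:"path"`
	} `yaml:"store"`
	Result struct {
		Path string `yaml:"path"`
	} `yaml:"result"`
	Workers int      `yaml:"parralle"` // key name kept exactly as it appears in the config file
	Timeout string   `yaml:"timeout"`
	TLDs    []string `yaml:"tlds"`
}

func main() {
	raw, err := os.ReadFile("config/config.yaml")
	if err != nil {
		panic(err)
	}

	var cfg Config
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		panic(err)
	}

	fmt.Printf("max depth %d, %d workers, TLDs %v\n", cfg.Crawler.MaxDepth, cfg.Workers, cfg.TLDs)
}
```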
NOTE: This package is provided "as is" with no guarantee. Use it at your own risk and always test it yourself before using it in a production environment. If you find any issues, please create a new issue.