lawzava/scrape
CLI utility to scrape emails from websites
- Asynchronous scraping
- Recursive link follow
- External link follow
- Cloudflare email obfuscation decoding
- Client side rendered pages support through headless `chromium` load awaits
- Simple, grepable output
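To illustrate the Cloudflare email obfuscation decoding feature, here is a minimal sketch of the commonly documented scheme, not necessarily the code this tool uses. It assumes the obfuscated value is the hex string found in a page's `data-cfemail` attribute, where the first byte is an XOR key and the remaining bytes are the address XORed with that key. The function name `decodeCFEmail` and the sample input are hypothetical.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// decodeCFEmail decodes a Cloudflare-obfuscated email address.
// ASSUMPTION: encoded is a hex string whose first byte is the XOR key
// and whose remaining bytes are the email bytes XORed with that key.
func decodeCFEmail(encoded string) (string, error) {
	raw, err := hex.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	if len(raw) < 2 {
		return "", errors.New("encoded value too short")
	}

	key := raw[0]
	out := make([]byte, len(raw)-1)
	for i, b := range raw[1:] {
		out[i] = b ^ key // undo the XOR applied by the obfuscator
	}
	return string(out), nil
}

func main() {
	// Hypothetical sample: key 0x10 followed by "a@b.co" XORed with 0x10.
	email, err := decodeCFEmail("107150723e737f")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(email) // prints "a@b.co"
}
```

Because the key travels with the payload, decoding needs no secret; the scheme only hides addresses from scrapers that do not implement this step.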
macOS:
brew tap lawzava/scrape https://github.com/lawzava/scrape
brew install scrape
Linux:
sudo snap install scrape
Sample call:
scrape -w https://lawzava.com
Depends on `chromium` or `google-chrome` being available in path if `--js` is used.
Flags:

          --async              Scrape website pages asynchronously (default true)
          --debug              Print debug logs
      -d, --depth int          Max depth to follow when scraping recursively (default 3)
          --follow-external    Follow external 3rd party links within website
      -h, --help               help for scrape
          --js                 Enables EnableJavascript execution await
          --output string      Output type to use (default 'plain', supported: 'csv', 'json') (default "plain")
          --output-with-url    Adds URL to output with each email
          --recursively        Scrape website recursively (default true)
          --timeout int        If > 0, specify a timeout (seconds) for js execution await
      -w, --website string     Website to scrape (default "https://lawzava.com")
If you are looking for the `scraper` package: this repository is intended for CLI use only, so the package was moved to lawzava/emailscraper. The `scrape` utility will be maintained as a CLI implementation of the `emailscraper` package.