website-crawler

Here are 31 public repositories matching this topic...

Languages: All 31, Python 16, Java 3, Go 2, PHP 2, Shell 2, C# 1, JavaScript 1, Jupyter Notebook 1, Kotlin 1, TypeScript 1

X-SLAYER/Website-Cloner

Star 316

Downloads a website from the Internet to a local directory, recursively building all directories and fetching HTML, images, and other files from the server to your computer.

css html front-end clone js images website-crawler website-clone website-cloner front-end-clone

Updated Jun 1, 2023
Visual Basic .NET

MLArtist/WebScraper

Star 81

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to get past anti-bot measures and avoid being blocked.

crawler scraper user-agent scraping beautiful-soup robots-txt beautifulsoup scrapper website-scraper scrapping-python website-crawler beautifulsoup4 crawling-python iprotation

Updated Jun 10, 2025
Python
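
The two simplest ideas in the description above, randomized intervals and user-agent rotation, can be sketched with the standard library alone. This is an illustrative sketch, not code from MLArtist/WebScraper: the names `USER_AGENTS`, `random_delay`, and `build_requests` and the example user-agent strings are assumptions, and proxy rotation is left out.

```python
import itertools
import random
import urllib.request

# Hypothetical user-agent strings, used only for illustration.
USER_AGENTS = [
    "ExampleCrawler/1.0 (+https://example.com/bot)",
    "ExampleCrawler/1.1 (+https://example.com/bot)",
]


def random_delay(min_seconds=1.0, max_seconds=5.0, rng=random):
    """Return a pause length drawn uniformly from [min_seconds, max_seconds]."""
    return rng.uniform(min_seconds, max_seconds)


def build_requests(urls, user_agents=USER_AGENTS):
    """Yield one Request per URL, cycling through the user-agent list."""
    agents = itertools.cycle(user_agents)
    for url in urls:
        yield urllib.request.Request(url, headers={"User-Agent": next(agents)})
```

A crawl loop would then call `time.sleep(random_delay())` between `urllib.request.urlopen(request)` calls, ideally after checking the site's robots.txt.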

Sneakpeek is a framework that helps you develop scrapers quickly and conveniently. It is best suited to scrapers with specific, complex scraping logic that needs to run on a recurring basis.

python crawler scraper vue scraping crawling python3 scrapers scraper-engine crawlers crawling-framework website-crawler scraping-framework crawler-python scraper-api crawling-engine

Updated Aug 19, 2023
Python

vlmaier/marvel-snap-scrapr

Star 24

Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.

game crawler scraper marvel website-scraper website-crawler marvel-characters crawler-python marvel-snap

Updated Jul 1, 2024
Python

sammwyy/SpearCopy

Star 20

A universal and local phishing toolkit for audit purposes

python web-crawler phishing audit pentesting pentest webscraping pentest-tool website-crawler website-clone phishing-kit phishing-page phishing-script phishing-tool web-clone

Updated Nov 21, 2024
Python

chandrasekharan98/Multisite-Python-Crawler

Star 18

An almost generic web crawler built using Scrapy and Python 3.7 to recursively crawl entire websites.

python scrapy-spider python3 scrapy scrapy-crawler scrapy-demo website-crawler crawling-sites recursive-crawling

Updated Mar 1, 2022
Python

martech-engineer/WebKnoGraph

Star 11

WebKnoGraph is an open research project that uses data processing, vector embeddings, and graph algorithms to optimize internal linking at scale. Built for both academic and industry use, it offers THE FIRST FULLY transparent, AI-driven framework for improving SEO and site navigation through reproducible methods.

marketing-automation world-wide-web network-analysis link-prediction python-development search-engine-optimization website-crawler sentence-embeddings hits-algorithm graph-neural-networks page-rank web-networks real-datasets internal-linking martech-backend synthethic-data-generation

Updated Jul 10, 2025
Jupyter Notebook

oxylabs/web-scraping-php

Star 9

A tutorial and code samples of web scraping with PHP

php web-scraping url-scraper screen-scraping website-crawler email-scraper wikipedia-scraper email-scraper-with-proxy

Updated Jun 26, 2025
PHP

JohnScooby/DuckDuckGo-Scraper

Star 8

A Simple Script To Scrape DuckDuckGo Search Results Using Python And Selenium WebDriver.

python scraper scraping selenium duckduckgo url-scraper google-dorks dork duckduckgo-search website-crawler bing-search dork-scanner dorking dorkscanner bing-dorking dorking-tool

Updated Nov 1, 2022
Python

zebbern/ReconX

Star 4

🕷️ | ReconX is a Live-Website Crawler made to gather critical information with an option to take a picture of each site crawled!

python search-engine security website crawler information-retrieval osint hacking pentest information-security opsec information-gathering python-crawler website-scraper security-tools website-crawler livedata website-security osint-tool

Updated Feb 20, 2025
Python

Mediashare/crawler

Star 3

💫 Crawl URLs from a webpage and provide a DomCrawler with Scraper Library

crawler scraper crawl website-crawler

Updated Nov 12, 2024
PHP

pratik-paranjape/tarantula-python-crawler

Star 3

This is a project that demonstrates how standard Python libraries such as os, urllib, and HTMLParser can be used to build a minimalist crawler that walks the pages of a website and gathers hyperlinks (URLs).

python python3 website-crawler
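
The approach described for tarantula-python-crawler, gathering hyperlinks with only the standard library, might look roughly like the sketch below. It is not taken from that repository: `LinkCollector`, `extract_links`, and `same_site` are hypothetical names, and the sketch parses an HTML string rather than fetching pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every hyperlink found in html, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def same_site(url, base_url):
    """True when url has the same host as base_url."""
    return urlparse(url).netloc == urlparse(base_url).netloc
```

A crawler built on this would fetch each page (for example with `urllib.request.urlopen`), pass the decoded body to `extract_links`, and queue only the links for which `same_site` is true.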