web-crawling
Here are 319 public repositories matching this topic...
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Updated Jul 18, 2025 - TypeScript
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
Updated Jul 18, 2025 - Python
The All in One Framework to Build Undefeatable Scrapers
Updated Jun 11, 2025 - Python
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
Updated Jul 10, 2025 - JavaScript
PulsarRPA: An AI-Enabled, Super-Fast, Thread-Safe Browser Automation Solution! 💖
Updated Jul 13, 2025 - Kotlin
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
Updated Feb 24, 2025 - Python
A simple web scraper to extract Product Data and Pricing from Amazon
Updated Jun 13, 2023 - Python
Library for Rapid (Web) Crawler and Scraper Development
Updated Jun 10, 2025 - PHP
This is a Twitter Scraper which uses Selenium for scraping tweets. It is capable of scraping tweets from home, user profile, hashtag, query or search, and advanced searches.
Updated Apr 12, 2025 - Jupyter Notebook
Omnisci3nt: See What They’ve Tried to Hide. Extract deep intelligence from any domain. From subdomains to SSL certs, archived secrets to exposed ports, Omnisci3nt gives you the full picture in seconds.
Updated Apr 15, 2025 - Python
Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Updated Feb 12, 2017 - Jupyter Notebook
A simple but powerful web crawler library for .NET
Updated Dec 15, 2023 - C#
⚡ Ayakashi.io - The next generation web scraping framework
Updated Jun 29, 2023 - TypeScript
A tool for scraping emails, social media accounts, and much more information from websites using Google Search Results.
Updated Mar 19, 2024 - Ruby
Scrapy Training companion code
Updated Jan 30, 2019 - Python
A web crawling framework written in Kotlin
Updated Jun 29, 2021 - Kotlin
💵 💰 🇧🇷 Information on the official daily rates for inflation, Selic, savings (Poupança), the dollar, the PTAX dollar, the euro, and the PTAX euro, taken from the Banco Central do Brasil website
Updated Nov 30, 2021 - Python
Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.
Updated Apr 4, 2020 - Python
Parser and database for indexing the terpene profiles of different cannabis strains from online databases
Updated Apr 28, 2023 - Python
A web crawling programming language
Updated Aug 21, 2024 - Rust