scraping

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.

python resume bot agent chrome scraper automation job scraping selenium jobs artificial-intelligence automate jobseeker gpt jobsearch human-resources chatgpt opeai application-resume

Updated Nov 16, 2025
Python

gocolly/colly

Star 25.1k

Elegant Scraper and Crawler Framework for Golang

go golang crawler scraper framework spider scraping crawling

Updated Feb 17, 2026
Go

ScrapeGraphAI/Scrapegraph-ai

Star 22.7k

Python scraper based on AI

markdown crawler web-crawler scraping web-scraper web-scraping data-extraction webscraping web-data-extraction web-search ai-search rag web-data scraping-python web-crawlers llm ai-crawler large-language-model ai-scraping firecrawl-alternative

Updated Feb 16, 2026
Python

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Feb 20, 2026
TypeScript

soxoj/maigret

Star 19k

🕵️‍♂️ Collect a dossier on a person by username from thousands of sites

python cli open-source osint social-network scraping sherlock python3 cybersecurity identification infosec pentesting blueteam investigation reconnaissance redteam osint-framework socmint osint-python namechecker

Updated Feb 19, 2026
Python

psf/requests-html

Star 13.9k

Pythonic HTML Parsing for Humans™

python html http scraping requests kennethreitz beautifulsoup lxml css-selectors pyquery

Updated Apr 16, 2024
Python

ultrafunkamsterdam/undetected-chromedriver

Star 12.4k

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)

testing chrome automation webdriver browser captcha scraping selenium navigator python3 cloudflare chromedriver anti-bot bot-detection cloudflare-bypass distil anti-detection

Updated Jul 5, 2025
Python

code4craft/webmagic

Star 11.7k

A scalable web crawler framework for Java.

java crawler framework scraping

Updated Dec 20, 2025
Java

D4Vinci/Scrapling

Star 9.1k

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

python crawler data automation ai mcp scraping crawling web-scraper web-scraping selectors xpath data-extraction stealth webscraping crawling-python playwright web-scraping-python ai-scraping mcp-server

Updated Feb 18, 2026
Python

apify/crawlee-python

Star 8.1k

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

python crawler scraper automation web-crawler headless scraping crawling pip web-scraping beautifulsoup web-crawling hacktoberfest headless-chrome apify playwright