scraper

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.

python resume bot agent chrome scraper automation job scraping selenium jobs artificial-intelligence automate jobseeker gpt jobsearch human-resources chatgpt opeai application-resume

Updated Nov 16, 2025
Python

gocolly/colly

Star 25.1k

Elegant Scraper and Crawler Framework for Golang

go golang crawler scraper framework spider scraping crawling

Updated Feb 17, 2026
Go

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers, in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Feb 20, 2026
TypeScript

Evil0ctal/Douyin_TikTok_Download_API

Star 16.3k

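🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls and online batch parsing and downloading.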

python api crawler scraper spider async web-scraping douyin tiktok fastapi tiktok-scraper tiktok-api douyin-api pywebio tiktok-signature no-watermark online-parsing douyin-tiktok-api douyin-tiktok-download douyin-scraper

Updated Oct 12, 2025
Python

getmaxun/maxun

Star 15.1k

✨ The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes ✨

api crawler scraper automation crawling web-scraper self-hosted web-scraping data-extraction webscraping agents browser-automation no-code web-search rpa robotic-process-automation nocode playwright

Updated Feb 20, 2026
TypeScript

codelucas/newspaper

Star 15k

newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:

python crawler scraper news crawling news-aggregator

Updated Dec 6, 2025
HTML

pwxcoo/chinese-xinhua

Star 11.5k

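📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms (chengyu), words, and Chinese characters.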

json data scraper json-data python3 chinese chinese-nlp chinese-characters chinese-simplified chinese-traditional json-dataset chinese-language

Updated Dec 26, 2023
Python

guyueyingmu/avbook

Star 9.9k

AV film management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

crawler scraper laravel database spider magnet-link guzzlehttp magnet adult javbus javlibrary avmoo adult-video

Updated Jun 1, 2024
PHP

apify/crawlee-python

Star 8.1k

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP, in both headful and headless mode, with proxy rotation.

python crawler scraper automation web-crawler headless scraping crawling pip web-scraping beautifulsoup web-crawling hacktoberfest headless-chrome apify playwright