sarperavci/CloudflareBypassForScraping (Public)


A Cloudflare verification bypass script for web scraping

License

MIT license

1.8k stars, 296 forks

Folders and files

.github/workflows
cf_bypasser
tests
.dockerignore
.gitignore
CloudflareBypasser.py
Dockerfile
LICENSE
README.md
docker-compose.yml
docker_startup.sh
old_server.py
old_server_requirements.txt
pyproject.toml
requirements.txt
server.py
server_requirements.txt


Cloudflare Bypass for Scraping

⭐ Thank you for 1,800+ stars! Introducing Version 2.0 with enhanced request mirroring, improved caching, and better reliability for bypassing Cloudflare protection.

Bypass Cloudflare protection with ease. Supports cookie generation and request mirroring for any HTTP method.

Sponsors

Scrapeless

If you are looking for a solution focused on browser automation and anti-detection mechanisms, I recommend Scrapeless Browser.
It is a cloud-based, Chromium-powered headless browser cluster that enables developers to run large-scale concurrent browser instances and handle complex interactions on protected pages. Perfect for AI infrastructure, web automation, data scraping, page rendering, and automated testing.

The Scrapeless Browser provides a secure, isolated browser environment that allows you to interact with web applications while minimizing potential risks to your system.

If you're looking for powerful and scalable browser automation and real-time data acquisition capabilities, Scrapeless offers a high-performance, scalable, and cost-efficient cloud browser infrastructure as well as a global enterprise-grade proxy network, addressing the core needs of automated execution and stable IP access.

Scrapeless Browser – Enterprise Cloud Browser Infrastructure

Out-of-the-Box Ready: Natively compatible with Puppeteer and Playwright, supporting CDP connections. Migrate your projects with just one line of code.
Bulk Isolated Environment Creation: Each profile corresponds to an exclusive browser environment, enabling persistent login and identity isolation.
Unlimited Concurrent Scaling: A single task supports second-level launch of 50 to 1000+ browser instances. Auto-scaling is available with no server resource limits.
Real-time Signaling (MFA): Supports event-driven handling of asynchronous workflows, including SMS/Email/TOTP verification, ensuring stable sessions and uninterrupted automation.
Edge Node Service (ENS): Multiple nodes worldwide, offering 2–3× faster launch speed and higher stability than other cloud browsers.
Flexible Fingerprint Customization: Generate random fingerprints or customize fingerprint parameters as needed.
Visual Debugging: Perform interactive debugging and real-time monitoring of proxy traffic through Live View, and quickly pinpoint issues and optimize actions by replaying sessions page by page with Session Recordings.
Enterprise Customization: Undertake customization of enterprise-level automation projects and AI Agent customization.

👉 Learn more: Scrapeless Scraping Browser Playground | Scrapeless Browser | Documentation

Scrapeless Proxy Network – Unblockable, Large-Scale Data Extraction

90+ million residential IPs worldwide, covering 195+ countries, starting at $1.80/GB, pay-per-GB with no traffic expiration.
Flexible Proxy Types: Choose residential, IPv6, static ISP, or datacenter proxies based on workload requirements.
Enterprise-Grade Reliability: 99.9% uptime with ultra-low latency (<0.5s).
Advanced Targeting: City-level geolocation targeting with automatic IP rotation.
High-Performance Scraping: Ideal for AI training data collection, web automation, and large-scale real-time extraction tasks.
👉 Learn more: Scrapeless Proxies | Documentation
👉 Get it Now!

ThorData

ThorData Web Scraper provides unblockable proxy infrastructure and scraping solutions for reliable, real-time web data extraction at scale. Perfect for AI training data collection, web automation, and large-scale scraping operations that require high performance and stability.
Key Advantages of ThorData:

Massive proxy network: Access to 60M+ ethically sourced residential, mobile, ISP, and datacenter IPs across 190+ countries.
Enterprise-grade reliability: 99.9% uptime with ultra-low latency (<0.5s response time) for uninterrupted data collection.
Flexible proxy types: Choose from residential, mobile (4G/5G), static ISP, or datacenter proxies based on your needs.
Cost-effective pricing: Starting from $1.80/GB for residential proxies with no traffic expiration and pay-as-you-go model.
Advanced targeting: City-level geolocation targeting with automatic IP rotation and unlimited bandwidth options.
Ready-to-use APIs: 120+ scraper APIs and comprehensive datasets purpose-built for AI and data science workflows.

ThorData is SOC2, GDPR, and CCPA compliant, trusted by 4,000+ enterprises for secure web data extraction.
👉 Learn more: ThorData Web Scraper | Get Started

IPOasis

IPOasis is a trusted provider of high-quality proxy services, with over 90 million nodes distributed globally across more than 200 countries.

Our proxies are fresh, clean, fast, and have a high success rate.

We support both HTTP and SOCKS5 protocols, with session control and unlimited concurrency.

Our products are ideal for a variety of use cases including data monitoring, survey research, web scraping, SEO/ASO optimization, app simulation, gaming, business measurement, marketing, and more.

May IPOasis, this unique online 'oasis,' empower every user seeking high-quality residential proxies. 🩵

🚀 Quick Start

Docker (Recommended)

Using Docker Compose

git clone https://github.com/sarperavci/CloudflareBypassForScraping.git
cd CloudflareBypassForScraping
docker compose pull && docker compose up -d

Using Docker directly

# Pull and run the latest image
docker run -p 8000:8000 ghcr.io/sarperavci/cloudflarebypassforscraping:latest

Manual Installation

pip install -r requirements.txt
python server.py

Usage

Request Mirroring (Any HTTP Method)

Request mirroring is a new technique that lets you forward any HTTP request through the Cloudflare bypass server. This lets you handle both clearance cookie generation and SSL/TLS fingerprinting challenges seamlessly.

Simply change your API base URL to point to the local server and add the x-hostname header with the target hostname. You can add other headers or a request body as needed.

# GET request
curl "http://localhost:8000/api/data" -H "x-hostname: example-site-protected-with-cf.com"

# POST request
curl -X POST "http://localhost:8000/api/submit" \
  -H "x-hostname: cf-protected-website.com" \
  -H "Content-Type: application/json" \
  -d '{"key": "value"}'

The initial request generates and caches Cloudflare cookies; subsequent requests use the cached cookies automatically.
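From Python, the same mirroring pattern can be sketched with only the standard library. This is a minimal sketch and not part of the project: the helper name `build_mirrored_request` is hypothetical, and the `http://localhost:8000` base URL is assumed from the curl examples above. The request is only constructed here; sending it requires a running bypass server.

```python
import json
import urllib.request

# Assumed local server address, taken from the curl examples above.
BYPASS_SERVER = "http://localhost:8000"


def build_mirrored_request(path, hostname, method="GET", json_body=None,
                           base_url=BYPASS_SERVER):
    """Build (but do not send) a request routed through the bypass server.

    Hypothetical helper: the path is kept as-is, and the real target host
    is passed in the x-hostname header.
    """
    headers = {"x-hostname": hostname}
    data = None
    if json_body is not None:
        # Encode the body as JSON, mirroring the POST curl example.
        data = json.dumps(json_body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


get_req = build_mirrored_request("/api/data", "example-site-protected-with-cf.com")
post_req = build_mirrored_request(
    "/api/submit", "cf-protected-website.com", method="POST", json_body={"key": "value"}
)
# To actually send a request: urllib.request.urlopen(post_req)
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before any traffic reaches the server.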

Miscellaneous Headers

x-hostname: Target hostname (required)
x-proxy: Proxy URL (optional)
x-bypass-cache: Force fresh cookies (optional)

These three headers let you control the bypassing behavior per request. You can set them as needed.

curl "http://localhost:8000/api/data" \
  -H "x-hostname: protected-site.com" \
  -H "x-proxy: http://user:pass@proxyserver:port" \
  -H "x-bypass-cache: true"
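When calling the server from Python, the three control headers can be assembled in one place. The sketch below is illustrative only: `bypass_headers` is a hypothetical helper, and the `"true"` value for `x-bypass-cache` is assumed from the curl example above rather than from the server's documentation.

```python
def bypass_headers(hostname, proxy=None, bypass_cache=False):
    """Return the per-request control headers as a plain dict.

    Hypothetical helper: only x-hostname is required; the optional
    headers are included only when requested.
    """
    headers = {"x-hostname": hostname}
    if proxy:
        headers["x-proxy"] = proxy
    if bypass_cache:
        # Value assumed from the curl example above.
        headers["x-bypass-cache"] = "true"
    return headers


minimal = bypass_headers("protected-site.com")
full = bypass_headers(
    "protected-site.com",
    proxy="http://user:pass@proxyserver:port",  # placeholder proxy URL from the example
    bypass_cache=True,
)
```

The resulting dict can be passed as the `headers` argument of whichever HTTP client you use.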

Basic Cookie Extraction

The /cookies endpoint lets you get Cloudflare cookies for a specific URL without mirroring a request. A random Firefox version on a random OS is used as the user agent.

$ curl "http://localhost:8000/cookies?url=https://nopecha.com/demo/cloudflare"

{
  "cookies": {
    "cf_clearance": "SJHuYhHrTZpXDUe8iMuzEUpJxocmOW8ougQVS0.aK5g-1723665177-1.0.1.1-5_NOoP19LQZw4TQ4BLwJmtrXBoX8JbKF5ZqsAOxRNOnW2rmDUwv4hQ7BztnsOfB9DQ06xR5hR_hsg3n8xteUCw"
  },
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:145.0) Gecko/20100101 Firefox/145.0"
}
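A response of this shape can be turned into headers for your own follow-up requests. The sketch below is an illustration, not project code: `session_headers` is a hypothetical helper, the sample payload uses a shortened placeholder token, and it assumes the response always contains the `cookies` and `user_agent` keys shown above. Clearance cookies are generally only accepted alongside the user agent that obtained them, so the two are returned together.

```python
import json


def session_headers(response_text):
    """Convert a /cookies JSON response into Cookie and User-Agent headers.

    Hypothetical helper; assumes the "cookies" and "user_agent" keys
    shown in the sample response.
    """
    payload = json.loads(response_text)
    cookie_header = "; ".join(
        f"{name}={value}" for name, value in payload["cookies"].items()
    )
    return {"Cookie": cookie_header, "User-Agent": payload["user_agent"]}


# Sample payload with a placeholder token instead of a real cf_clearance value.
sample = json.dumps({
    "cookies": {"cf_clearance": "example-token"},
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:145.0) Gecko/20100101 Firefox/145.0",
})
headers = session_headers(sample)
```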

HTML Content Extraction

The /html endpoint returns the full HTML content of a page after bypassing Cloudflare protection. The HTML is returned directly (not as JSON).

$ curl "http://localhost:8000/html?url=https://nopecha.com/demo/cloudflare"

This returns the raw HTML content with additional headers containing bypass information:

x-cf-bypasser-cookies: Number of cookies generated
x-cf-bypasser-user-agent: User agent used for bypass
x-cf-bypasser-final-url: Final URL after redirects
x-processing-time-ms: Time taken to process the request
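These headers can be collected into a small structure on the client side. The following is a minimal sketch under stated assumptions: `BypassInfo` and `parse_bypass_info` are hypothetical names, the cookie count and processing time are assumed to arrive as numeric strings, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BypassInfo:
    cookie_count: int
    user_agent: str
    final_url: str
    processing_ms: float


def parse_bypass_info(headers):
    """Extract the bypass headers from a response-header mapping.

    Hypothetical helper; header names are matched case-insensitively and
    the numeric fields are assumed to be numeric strings.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return BypassInfo(
        cookie_count=int(lowered["x-cf-bypasser-cookies"]),
        user_agent=lowered["x-cf-bypasser-user-agent"],
        final_url=lowered["x-cf-bypasser-final-url"],
        processing_ms=float(lowered["x-processing-time-ms"]),
    )


# Made-up header values for illustration only.
info = parse_bypass_info({
    "X-CF-Bypasser-Cookies": "1",
    "X-CF-Bypasser-User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:145.0) Gecko/20100101 Firefox/145.0",
    "X-CF-Bypasser-Final-URL": "https://nopecha.com/demo/cloudflare",
    "X-Processing-Time-MS": "5321",
})
```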

Build from Source

# Build the image
docker build -t cloudflare-bypass .

# Run the container
docker run -p 8000:8000 cloudflare-bypass

Backward Compatibility

Existing integrations continue to work unchanged:

# Legacy endpoint still works
curl "http://localhost:8000/cookies?url=https://example.com"

# Old bypass server, kept as an alternative method
pip install -r old_server_requirements.txt
python old_server.py

Example Projects

Contributing

Contributions welcome! Submit PRs against the main codebase.
