Learn how to scrape websites with Python, Selenium, Requests HTML, Celery, FastAPI, & NoSQL with Cassandra via AstraDB.
codingforentrepreneurs/Scrape-Websites-with-Python-FastAPI-Celery-NoSQL
Learn how to scrape websites with Python, Selenium, Requests HTML, Celery, FastAPI, & NoSQL.
Here's what each tool is used for:
- Python 3.9 (download) - programming the logic.
- AstraDB (sign up) - a highly performant and scalable database service by DataStax. AstraDB is a Cassandra NoSQL database. Cassandra is used by Netflix, Discord, Apple, and many others to handle astounding amounts of data.
- Selenium (docs) - an automated web browsing experience that:
  - Runs all web-browser actions through code
  - Loads JavaScript-heavy websites
  - Can perform standard user interactions like clicks, form submits, logins, etc.
- Requests HTML (docs) - we're going to use this to parse an HTML document extracted from Selenium.
- Celery (docs) - Celery provides worker processes that will allow us to schedule when we need to scrape websites. We'll be using Redis as our task queue.
- FastAPI (docs) - a web application framework to display and monitor web scraping results from anywhere.
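As a rough illustration of how Selenium and Requests HTML fit together, here is a minimal sketch: Selenium loads the page in headless Chrome, and Requests HTML parses the resulting HTML string. The function names `fetch_rendered_html` and `extract_title` are hypothetical and not taken from this repo, and the sketch assumes `selenium`, `requests-html`, and a working Chromedriver are installed.

```python
from typing import Optional


def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chrome and return its HTML source."""
    # Imported lazily so the parsing helper below can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run without a visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source  # HTML of the page as currently loaded
    finally:
        driver.quit()  # always release the browser process


def extract_title(html_str: str) -> Optional[str]:
    """Parse an HTML string with Requests HTML and return the <title> text."""
    from requests_html import HTML

    doc = HTML(html=html_str)
    element = doc.find("title", first=True)  # CSS selector lookup
    return element.text if element is not None else None
```

With both libraries installed, `extract_title(fetch_rendered_html("https://example.com"))` would fetch the page and return its title. Any other CSS selector can be passed to `find` in the same way.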
This series is broken up into 4 parts:
- Scraping: how to scrape and parse data from nearly any website with Selenium & Requests HTML.
- Data models: how to store and validate data with `cassandra-driver`, `pydantic`, and AstraDB.
- Worker & Scheduling: how to schedule periodic tasks (i.e. scraping) integrated with Redis & AstraDB.
- Presentation: how to combine the above steps in a robust web application service.
Below is a preflight checklist to ensure your system is fully set up to work with this course. All guides and setup can be found in the setup directory of this repo.
- [ ] Install Selenium & Chromedriver (setup guide)
- [ ] Install Redis (setup guide)
- [ ] Create a virtual environment & install dependencies
- [ ] Set up an account with DataStax
- [ ] Create your first AstraDB and get API credentials
- [ ] Use `cassandra-driver` to verify your connection to AstraDB
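For the last checklist item, a connection check with `cassandra-driver` might look roughly like the sketch below. The environment variable names (`ASTRA_DB_CLIENT_ID`, `ASTRA_DB_CLIENT_SECRET`, `ASTRA_DB_BUNDLE_PATH`) and both helper functions are assumptions made for illustration; the repo's setup guides may use different names. It also assumes you have downloaded the secure connect bundle for your database.

```python
import os
from typing import Mapping, Optional

# Hypothetical setting names; adjust to match your own configuration.
REQUIRED_KEYS = (
    "ASTRA_DB_CLIENT_ID",
    "ASTRA_DB_CLIENT_SECRET",
    "ASTRA_DB_BUNDLE_PATH",
)


def read_astra_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the AstraDB settings, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def verify_connection(settings: dict) -> str:
    """Connect to AstraDB and return the Cassandra release version."""
    # Imported lazily so the settings helper works without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    auth_provider = PlainTextAuthProvider(
        settings["ASTRA_DB_CLIENT_ID"],
        settings["ASTRA_DB_CLIENT_SECRET"],
    )
    cluster = Cluster(
        cloud={"secure_connect_bundle": settings["ASTRA_DB_BUNDLE_PATH"]},
        auth_provider=auth_provider,
    )
    session = cluster.connect()
    try:
        row = session.execute("SELECT release_version FROM system.local").one()
        return row.release_version
    finally:
        cluster.shutdown()  # close all connections to the cluster
```

If `verify_connection(read_astra_settings())` returns a version string, the credentials and bundle are working.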