Dynamic website scraper and email notifier.
AleksaMCode/university-notices-email-notifier
Scraper for notices on the Faculty of Electrical Engineering Banja Luka website. This project scrapes notices from the website and, after ETL processing, sends the data as a JSON file to the appointed email address through Yahoo SMTP, using the smtplib library.
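As a rough illustration of the last two steps, the sketch below cleans a few scraped records, serializes them to JSON, and attaches the result to an email message. The field names (`title`, `date`, `url`), the helper names `transform` and `build_message`, and the addresses are assumptions made for this example and are not taken from the project's code.

```python
import json
from email.message import EmailMessage


def transform(raw_notices):
    """Strip whitespace, drop records without a title, and remove duplicates."""
    seen = set()
    cleaned = []
    for notice in raw_notices:
        record = {key: str(value).strip() for key, value in notice.items()}
        if not record.get("title"):
            continue
        fingerprint = (record["title"], record.get("date", ""))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(record)
    return cleaned


def build_message(notices, sender, recipient):
    """Build an email that carries the notices as a JSON attachment."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(notices)} new notice(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The latest notices are attached as a JSON file.")
    payload = json.dumps(notices, ensure_ascii=False, indent=2).encode("utf-8")
    msg.add_attachment(
        payload, maintype="application", subtype="json", filename="notices.json"
    )
    return msg


# Hypothetical scraped records: one duplicate and one without a title.
raw = [
    {"title": "  Exam schedule ", "date": "2022-06-01", "url": "https://example.com/1"},
    {"title": "Exam schedule", "date": "2022-06-01", "url": "https://example.com/1"},
    {"title": "", "date": "2022-06-02", "url": "https://example.com/2"},
]
message = build_message(transform(raw), "sender@example.com", "recipient@example.com")
```

Only the message is built here; nothing is sent over the network.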
I've always wanted to build a web scraper, and I recently found some free time to complete this project. Because the website is dynamic, scraping was done with the Selenium API in addition to the Beautiful Soup library. The project is written in such a way that it can be run on both Windows and Linux.
Note:
- For any of this to work, Python 3 must be installed on your machine.
- Be cautious when changing config.ini because it's tightly coupled with the Python code.
- The code is tested on both Windows 10 and the latest Linux Mint distribution.
In this section, I will go over how to set up this project on Linux. However, the majority of the steps also apply on Windows. First, open a terminal and navigate to the desired directory, then clone this repository using the git clone command.
$ git clone https://github.com/AleksaMCode/university-notices-email-notifier.git
Next, move into the project directory, create a virtualenv, and then install all the needed packages from the requirements.txt file.
$ cd university-notices-email-notifier
$ virtualenv -p python3 venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt
Note:
You can find all of these commands in the init.sh file, which is located inside the resources/scripts directory.
Before using this project, you need to adjust a couple of parameters stored in the config.ini file. First, add the email address (user_email field) at which you wish to receive the email notifications. If you wish to use Yahoo SMTP, you only need to update the email and password fields with your own credentials. Detailed instructions on how to set up Yahoo SMTP with your account are given below. If for some reason you want to use another email provider, you will also need to update the provider-specific fields, such as port and smtp (the SMTP server). All of this information is stored in the SMTP section of the config file.
university-notices-email-notifier/config.ini
[SMTP]
smtp = smtp.mail.yahoo.com
email =
port = 587
password =
user_email =
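For orientation, a section like this can be read with Python's standard configparser module. The snippet below is a minimal sketch and not the project's actual loading code: the helper name `load_smtp_settings`, the returned dictionary keys, and the sample values are assumptions, and the sample is parsed from a string so the example stays self-contained.

```python
import configparser

# Sample contents mirroring the [SMTP] section, with placeholder values.
SAMPLE = """
[SMTP]
smtp = smtp.mail.yahoo.com
email = sender@yahoo.com
port = 587
password = app-password-here
user_email = recipient@example.com
"""


def load_smtp_settings(parser):
    """Return the SMTP settings as a plain dict, with the port as an int."""
    section = parser["SMTP"]
    return {
        "host": section["smtp"],
        "port": section.getint("port"),
        "email": section["email"],
        "password": section["password"],
        "user_email": section["user_email"],
    }


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)  # a real run would call parser.read("config.ini")
settings = load_smtp_settings(parser)
```

Because the keys are looked up by name, renaming a field in config.ini without updating the code raises a `KeyError`, which is one way the coupling mentioned in the note above can show up.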
Below you have a table of all the essential details you need:
SMTP server | Port | Requires SSL | Requires TLS | Authentication | Username | Password |
---|---|---|---|---|---|---|
smtp.mail.yahoo.com | 587 | ✅ | ✅ | ✅ | Your Yahoo email address | Your Yahoo Mail App Password, which isn't the same as your account password |
Restrictions:
- You can send a maximum of 500 emails per day.
- Some sources claim you can send a maximum of 100 emails per hour.
In order to use the Yahoo SMTP server, you need to create a dedicated App Password. First, go to your account settings area and click on Account Security, then click on the Generate app password link under the Other ways to sign in section. When the popup is shown, enter your app name, which can be anything. Next, click the Generate password button. You should then see the 16-character app password. Save it for later use, as Yahoo will not show it to you again.
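With the app password in hand, sending over port 587 with TLS could look roughly like the sketch below. It uses the standard smtplib calls (`starttls`, `login`, `send_message`); the function name `send_via_starttls` and the `smtp_factory` parameter are assumptions introduced here so the connection class can be swapped out, and this is not necessarily how notifier.py does it.

```python
import smtplib
import ssl
from email.message import EmailMessage


def send_via_starttls(msg, host, port, username, app_password,
                      smtp_factory=smtplib.SMTP):
    """Open a connection, upgrade it to TLS, log in, and send one message."""
    context = ssl.create_default_context()
    with smtp_factory(host, port) as server:
        server.starttls(context=context)  # upgrade the plain connection to TLS
        server.login(username, app_password)  # the app password, not the account password
        server.send_message(msg)


def example_message(sender, recipient):
    """Build a small placeholder message for trying out the function above."""
    msg = EmailMessage()
    msg["Subject"] = "Test notification"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("This is a test.")
    return msg
```

A call would then look like `send_via_starttls(example_message(email, user_email), "smtp.mail.yahoo.com", 587, email, password)`, with the values taken from the SMTP section of config.ini.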
The first thing you need to create is a bat file that connects python.exe and the notifier.py script. Open the directory in which you wish to create the bat file, open PowerShell there, and type the following command:
New-Item scraper.bat -Value "@echo off`r`n""C:\Users\Username\AppData\Local\Programs\Python\Python310\python.exe"" ""C:\Users\Username\university-notices-email-notifier\notifier.py"""
Note:
You will need to adjust the command above:
- Set the first path to where your python.exe is stored.
- Set the second path to where the notifier.py script is stored.
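If quoting the paths by hand in PowerShell gets fiddly, the same two-line file can be produced from Python. This is only an alternative sketch: the helper names `launcher_text` and `write_launcher` are hypothetical and are not part of the repository.

```python
import sys


def launcher_text(python_exe, script_path):
    """Return the contents of a bat file that runs script_path with python_exe."""
    # Both paths are quoted so that directories containing spaces still work.
    return f'@echo off\r\n"{python_exe}" "{script_path}"\r\n'


def write_launcher(bat_path, script_path, python_exe=sys.executable):
    """Write the launcher; newline="" keeps the CRLF line endings unchanged."""
    with open(bat_path, "w", newline="") as handle:
        handle.write(launcher_text(python_exe, script_path))
```

For example, `write_launcher("scraper.bat", r"C:\Users\Username\university-notices-email-notifier\notifier.py")` uses the interpreter that is currently running, which avoids typing the python.exe path manually.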
In order to schedule the scraper using Windows Task Scheduler, you will need to:
- Open the Windows Control Panel, click on Administrative Tools, and double-click on Task Scheduler.
- Choose the option `Create Task...`.
- Type a name for this task (the description is optional) in the General tab and then click on the Triggers tab.
- Press New... and then, in the newly opened New Trigger window, choose to start the task 'One time' starting from 12:00:00 am.
- In Advanced settings, tick 'Repeat task every' and enter your desired frequency.
- From the 'for a duration of' drop-down menu, choose 'Indefinitely' and press OK.
- Go to the Actions tab and click the New... button. There, browse to scraper.bat, which is located inside the resources/scripts directory.
- Press OK twice.
First, open crontab with the command crontab -e. Once you are in the cron editor, add the cron job command. For example, if you want to run this scraper every 30 minutes, enter:
0,30 * * * * /usr/bin/python /home/script/university-notices-email-notifier/notifier.py
Save your changes and exit the editor. For more details on how to specify the frequency, visit this link.
Note:
Don't forget to exit Vim using :wq. :)
- Replace the JSON file attachment with an HTML-formatted email body.
- Implement year specific command for notifications.
- Implement year range command for notifications.
- Move sensitive information, like password, from config file to environment variables.
- Implement toast notifications.
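As a possible starting point for the environment-variable item above, the password lookup could prefer an environment variable and fall back to the value from config.ini. This is a sketch only: the variable name `NOTIFIER_SMTP_PASSWORD` and the helper `resolve_password` are hypothetical and do not exist in the project.

```python
import os


def resolve_password(config_password, env=None, var_name="NOTIFIER_SMTP_PASSWORD"):
    """Return the password from the environment if set, else the config value."""
    if env is None:
        env = os.environ
    # An unset or empty variable falls back to the value read from config.ini.
    return env.get(var_name) or config_password
```

With this in place, the password field in config.ini could be left empty on machines where the variable is exported.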