Dynamic website scraper and email notifier.

AleksaMCode/university-notices-email-notifier

University notices email notifier

Scraper for notices on the Faculty of Electrical Engineering Banja Luka website. This project scrapes notices from the website and, after ETL processing, sends the data as a JSON file to the designated email address through Yahoo SMTP, using the smtplib library.

Introduction

I've always wanted to build a web scraper, and I recently found some free time to complete this project. Because the website is dynamic, scraping is done with the Selenium API in addition to the Beautiful Soup library. The project is written in such a way that it can be run on both Windows and Linux.

Note:

  • For any of this to work, Python 3 must be installed on your machine.
  • Be cautious when changing config.ini because it's tightly coupled with the Python code.
  • The code has been tested on both Windows 10 and the latest Linux Mint distribution.
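
The Selenium plus Beautiful Soup combination mentioned in the introduction can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the headless Chrome setup, and the CSS selectors (div.notice, .title) are assumptions made for the example, and the real website's markup will differ.

```python
def fetch_rendered_html(url):
    """Load the page in headless Chrome so JavaScript-generated content is present."""
    # Imported here so the parsing helper below works without a browser driver.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


def parse_notices(html):
    """Extract notices from rendered HTML (selectors are hypothetical)."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    notices = []
    for item in soup.select("div.notice"):
        title = item.select_one(".title")
        link = item.select_one("a")
        notices.append({
            "title": title.get_text(strip=True) if title else "",
            "url": link.get("href") if link else None,
        })
    return notices
```

Selenium is only responsible for producing the fully rendered page source; Beautiful Soup then works on that string, which keeps the parsing step easy to test with saved HTML.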

Initial Setup

In this section, I will go over how to set up this project on Linux. However, the majority of the steps also apply to Windows. First, open the command line and navigate to the desired directory, then clone this repository using the git clone command.

$ git clone https://github.com/AleksaMCode/university-notices-email-notifier.git

Next, move into the project directory, create a virtualenv, and then install all the needed packages from the requirements.txt file.

$ cd university-notices-email-notifier
$ virtualenv -p python3 venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt

Note:

You can find all of these commands in the init.sh file, which is located inside the resources/scripts directory.

Config file setup

Before using this project, you need to adjust a couple of parameters stored in the config.ini file. First, add the email address (user_email field) at which you wish to receive the email notifications. If you wish to use Yahoo SMTP, you only need to update the email and password fields with your own credentials. Below you can find detailed instructions on how to set up Yahoo SMTP with your account. If you want to use another email provider, you will also need to update the provider-specific fields, such as port and SMTP server (the smtp field). All of this information is stored in the SMTP section of the config file.

[SMTP]
smtp = smtp.mail.yahoo.com
email =
port = 587
password =
user_email =
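
For orientation, here is a minimal sketch of how a section like this can be read with Python's standard configparser module. The load_smtp_settings helper, the dictionary keys, and the sample values are illustrative assumptions; the project's own code may read the file differently.

```python
import configparser

# Sample content mirroring the [SMTP] section above (values are placeholders).
SAMPLE_CONFIG = """
[SMTP]
smtp = smtp.mail.yahoo.com
email = sender@yahoo.com
port = 587
password = app-password-here
user_email = recipient@example.com
"""


def load_smtp_settings(text):
    """Parse the [SMTP] section and return its fields as a dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["SMTP"]
    return {
        "server": section["smtp"],
        "port": section.getint("port"),  # converted from text to int
        "sender": section["email"],
        "password": section["password"],
        "recipient": section["user_email"],
    }


settings = load_smtp_settings(SAMPLE_CONFIG)
print(settings["server"], settings["port"])
```

Because the code looks fields up by name, renaming a key in config.ini without updating the code raises a KeyError, which is why the file should be edited carefully.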

Yahoo SMTP

Below you have a table of all the essential details you need:

SMTP server | Port | Requires SSL | Requires TLS | Authentication | Username | Password
smtp.mail.yahoo.com | 587 | | | | Your Yahoo email address | Your Yahoo Mail App Password, which isn't the same as your account password

Restrictions:

  • You can send a maximum of 500 emails per day.
  • Some sources claim you can send a maximum of 100 emails per hour.

To use the Yahoo SMTP server, you need to create a dedicated App Password. First, go to your account settings area and click on Account Security, then click on the Generate app password link under the Other ways to sign in section. After the popup is shown, enter your app name, which can be anything. Next, click the Generate password button. You should then see the 16-character app password. Save it for later use, as Yahoo will not show it to you again.
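
Once the App Password exists, sending a JSON attachment over port 587 with smtplib looks roughly like the sketch below. It is an assumption-laden illustration rather than the project's implementation: the function names, subject line, and attachment filename are made up for the example, and STARTTLS is used because port 587 is the submission port listed above.

```python
import json
import smtplib
from email.message import EmailMessage


def build_notice_email(sender, recipient, notices):
    """Create an email with the notices attached as a JSON file."""
    msg = EmailMessage()
    msg["Subject"] = "New university notices"  # hypothetical subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{len(notices)} new notice(s) attached as JSON.")
    payload = json.dumps(notices, ensure_ascii=False, indent=2).encode("utf-8")
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="json",
        filename="notices.json",  # hypothetical filename
    )
    return msg


def send_via_smtp(msg, server, port, username, app_password):
    """Send the message, upgrading the connection to TLS before logging in."""
    with smtplib.SMTP(server, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(username, app_password)  # the App Password, not the account password
        smtp.send_message(msg)
```

A call such as send_via_smtp(msg, "smtp.mail.yahoo.com", 587, email, password) would use the values from the SMTP section of config.ini.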

Scheduling scraping

Windows - Task Scheduler

The first thing you need to create is a bat file that runs the notifier.py script with python.exe. Open the directory in which you wish to create the bat file, open PowerShell there, and type the following command:

New-Item scraper.bat -Value "@echo off`r`n""C:\Users\Username\AppData\Local\Programs\Python\Python310\python.exe"" ""C:\Users\Username\university-notices-email-notifier\notifier.py"""

Note:
You will need to adjust the command above:

  • Set the first path to where your python.exe is stored.
  • Set the second path to where the notifier.py script is stored.

To schedule the scraper using Windows Task Scheduler, you will need to:

  • Open the Windows Control Panel, click on Administrative Tools, and double-click on Task Scheduler.
  • Choose the option `Create Task...`.
  • Type a name for this task (description is optional) in the General tab and then click on the Triggers tab.
  • Press New... and then, in the newly opened New Trigger window, choose to start the task 'One time' starting from 12:00:00 AM.
  • In Advanced settings, tick 'Repeat task every' and enter your desired frequency.
  • From the 'for a duration of' drop-down menu, choose 'Indefinitely' and press OK.
  • Click on the Actions tab and press the New... button. There you will need to browse and find scraper.bat, which is located inside the resources/scripts directory.
  • Press OK twice.

Linux - Cron job

First, open crontab with the command crontab -e. Once you are in the cron editor, add the cron job entry. For example, if you want to run this scraper every 30 minutes, enter:

0,30 * * * * /usr/bin/python /home/script/university-notices-email-notifier/notifier.py

Save your changes and exit the editor. For more details on how to specify frequency, visit this link.

Note:
Don't forget to exit Vim using :wq. :)
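
To double-check what a minute field of 0,30 means before saving the crontab, the run times can be listed with a few lines of Python. This is only an illustrative sketch: next_runs is a hypothetical helper that is not part of the project, and it models just the minute field, with the other four fields left as *.

```python
from datetime import datetime, timedelta


def next_runs(start, minutes=(0, 30), count=4):
    """Return the next `count` times whose minute is listed in the cron minute field."""
    runs = []
    # Cron fires at second 0 of a matching minute, so begin at the next whole minute.
    candidate = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while len(runs) < count:
        if candidate.minute in minutes:
            runs.append(candidate)
        candidate += timedelta(minutes=1)
    return runs


for run in next_runs(datetime(2024, 1, 1, 10, 15), count=3):
    print(run.strftime("%H:%M"))
```

Starting from 10:15, the helper lists 10:30, 11:00, and 11:30, that is, one run on every hour and half hour.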

To-Do List

  • Replace the JSON file attachment with an HTML-formatted email.
  • Implement a year-specific command for notifications.
  • Implement a year-range command for notifications.
  • Move sensitive information, such as the password, from the config file to environment variables.
  • Implement toast notifications.
