khuyentran1401/Extract-text-from-article
This project extracts the text from an article using the Python library Newspaper3k, then uses NLTK (Natural Language Toolkit) to preprocess the text and find the most common words in the article.
- Newspaper3k: a tool for scraping articles
- NLTK: a tool for processing text
- Scrape articles with newspaper3k
    from newspaper import Article
    url = 'https://mystudentvoices.com/it-took-me-2-years-to-get-1000-followers-life-lessons-ive-learned-throughout-the-journey-9bc44f2959f0'
    article = Article(url)
    article.download()
    article.parse()  # parse the downloaded HTML so fields such as publish_date are populated
- Find the publish date
article.publish_date
- Extract image
- Find the author
- Find the keywords
- Find the summary
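
The scraping steps listed so far can be sketched roughly as follows. This is a minimal sketch, not the repository's exact code: the helper names `fetch_article` and `describe_article` are hypothetical, and it assumes newspaper3k is installed and that NLTK's `punkt` tokenizer data is available, which `article.nlp()` relies on. In newspaper3k, `keywords` and `summary` are only filled in after `nlp()` has been called.

```python
def fetch_article(url):
    """Download, parse, and analyze an article (hypothetical helper)."""
    # Imported here so the rest of the sketch works without newspaper3k installed.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetch the raw HTML
    article.parse()     # extract text, authors, publish date, top image
    article.nlp()       # compute keywords and summary (needs NLTK 'punkt' data)
    return article


def describe_article(article):
    """Collect the fields mentioned in the list above into a plain dict."""
    return {
        "publish_date": article.publish_date,
        "top_image": article.top_image,
        "authors": article.authors,
        "keywords": article.keywords,
        "summary": article.summary,
    }
```

Calling `describe_article(fetch_article(url))` with the URL used earlier would return these fields for that article. Some of them, such as `publish_date`, may be `None` when the page does not expose the information.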
- Preprocessing with NLTK
- Tokenize text
- Lowercase and remove stopwords
- Visualize the frequency of words with Matplotlib
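
To illustrate the preprocessing steps without any downloads, here is a dependency-free sketch of the same pipeline: tokenize, lowercase, remove stopwords, and count the most common words. It swaps in a regular expression for NLTK's tokenizer and a tiny hand-written stopword set for NLTK's English stopword corpus, so it is a stand-in for the repository's NLTK code rather than a copy of it. The names `STOPWORDS`, `tokenize`, `preprocess`, and `most_common_words` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set; NLTK's English list is much larger.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "it", "is", "i", "me", "my"}


def tokenize(text):
    """Split text into word tokens (rough regex stand-in for an NLTK tokenizer)."""
    return re.findall(r"[A-Za-z']+", text)


def preprocess(text):
    """Lowercase every token and drop the ones found in STOPWORDS."""
    return [tok.lower() for tok in tokenize(text) if tok.lower() not in STOPWORDS]


def most_common_words(text, n=10):
    """Return the n most frequent remaining words as (word, count) pairs."""
    return Counter(preprocess(text)).most_common(n)


sample = "It took me two years to get followers, and the followers taught me lessons."
top_words = most_common_words(sample, n=3)
```

The resulting `(word, count)` pairs can be unzipped into two sequences and passed to Matplotlib's `plt.bar` to draw the frequency chart. With NLTK available, `nltk.FreqDist` offers comparable counting plus a built-in `plot` method.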

Find the Medium article for this repository here.