If you work with APIs or scrape data from websites, you have probably run into stray text, numbers, and keywords mixed in with the data you actually want. Cleaning scraped data to recover that content can be complicated and frustrating.

In this article, we explore a Python library called clean-text, which helps you clean scraped data with a single function call instead of long, hand-written cleanup code. Let's begin.

Installation

Use the following command:

pip install clean-text

Note: The clean-text package requires Python 3.7 or greater.
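Before running the examples, it can help to confirm that the interpreter meets the version requirement and that the package is importable. The following is a minimal sketch using only the standard library; the helper names `python_version_ok` and `cleantext_available` are made up for this illustration, and the module name `cleantext` is taken from the import used later in this article.

```python
import importlib.util
import sys


def python_version_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def cleantext_available():
    """Return True if a module named `cleantext` can be found for import."""
    return importlib.util.find_spec("cleantext") is not None


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    print("cleantext importable:", cleantext_available())
```

If the second line prints False, rerun the pip command above in the same environment that runs your script.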

Syntax

cleantext.clean(text, {operations})

Different cleantext operations:

The clean function accepts a range of keyword arguments that specify how the raw input text should be cleaned, and it returns the cleaned text as a string. The example under Code Implementation lists the arguments you can use.
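To give a feel for what a few of these arguments do (lower, to_ascii, no_urls and no_emails, as named in this article's example), here is a rough standard-library sketch. It is not clean-text's implementation: `simple_clean`, its default replacement tokens, and the two regular expressions are assumptions made for illustration, and the real library handles many more cases.

```python
import re
import unicodedata

# Deliberately simple patterns for illustration only; real-world URL and
# email matching needs far more care than this.
URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def simple_clean(text, to_ascii=True, lower=True, no_urls=False, no_emails=False,
                 replace_with_url="<URL>", replace_with_email="<EMAIL>"):
    """Hypothetical stand-in mimicking a few clean() options; not clean-text's code."""
    if to_ascii:
        # Decompose accented characters, then drop anything that is not ASCII.
        text = unicodedata.normalize("NFKD", text)
        text = text.encode("ascii", "ignore").decode("ascii")
    if lower:
        text = text.lower()
    if no_urls:
        text = URL_RE.sub(replace_with_url, text)
    if no_emails:
        text = EMAIL_RE.sub(replace_with_email, text)
    # Collapse runs of whitespace (including line breaks) into single spaces.
    return " ".join(text.split())


raw = "Yóù àré rïght!   Visit https://example.com or mail Info@Example.com"
print(simple_clean(raw, no_urls=True, no_emails=True))
# prints: you are right! visit <URL> or mail <EMAIL>
```

Each flag switches one transformation on or off, and each replace_with_* value is the token substituted for whatever was matched. That is the same general shape as the keyword arguments passed to clean.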

Code Implementation:

Python3
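# import library
from cleantext import clean

# input string
text = """
    A bunch of \\u2018new\\u2019 references,
    including [Moana]. »Yóù àré rïght <3!«
    """

print(clean(text=text,
            fix_unicode=True,                # fix various unicode errors
            to_ascii=True,                   # transliterate to closest ASCII representation
            lower=True,                      # lowercase text
            no_line_breaks=False,            # fully strip line breaks instead of only normalizing them
            no_urls=False,                   # replace all URLs with a special token
            no_emails=False,                 # replace all email addresses with a special token
            no_phone_numbers=False,          # replace all phone numbers with a special token
            no_numbers=False,                # replace all numbers with a special token
            no_digits=False,                 # replace all digits with a special token
            no_currency_symbols=False,       # replace all currency symbols with a special token
            no_punct=False,                  # remove punctuation
            replace_with_punct="",           # replacement used for punctuation when no_punct=True
            replace_with_url="This is a URL",
            replace_with_email="Email",
            replace_with_phone_number="",
            replace_with_number="123",
            replace_with_digit="0",
            replace_with_currency_symbol="$",
            lang="en"                        # set to 'de' for German special handling
            ))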

Output:

 
