Prerequisite: Beautiful Soup Installation
Beautiful Soup is a Python library for web scraping, the automated extraction of data from web pages. In Beautiful Soup, a tag may have any number of attributes. For example, the tag <b class="active"> has an attribute "class" whose value is "active". We can access a tag's attributes by treating the tag like a dictionary, and the complete set of attributes is available as a dictionary through .attrs.
Syntax:
tag.attrs
Implementation:
Example 1: Program to extract the attributes using the attrs approach.
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2 class="hello"> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get all attributes of the tag as a dictionary
attribute = tag.attrs

# Print the output
print(attribute)
Output:
{'class': ['hello']}
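Because attrs is an ordinary dictionary, it can be looped over like one. The following is a minimal sketch, assuming a hypothetical anchor tag that is not part of the examples in this article, and using Python's built-in "html.parser" so that lxml does not have to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup chosen for illustration only
html = '<a href="/home" id="nav-home" class="link">Home</a>'

# "html.parser" ships with Python, so no extra parser package is needed
soup = BeautifulSoup(html, "html.parser")
tag = soup.a

# attrs maps each attribute name to its value
for name, value in tag.attrs.items():
    print(name, "=", value)
```

Note that "class" comes back as a list while "href" and "id" come back as plain strings.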
Example 2: Program to extract an attribute using the dictionary approach.
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2 class="hello"> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the value of the class attribute
attribute = tag['class']

# Print the output
print(attribute)
Output:
['hello']
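Dictionary-style access raises a KeyError when the requested attribute is not present on the tag. The sketch below shows one possible way to handle that case with the tag's get() and has_attr() methods; the variable names are illustrative, and "html.parser" is used only so the snippet runs without lxml.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<h2 class="hello"> Heading 1 </h2>', "html.parser")
tag = soup.h2

# Square-bracket access raises KeyError when the attribute is absent
try:
    tag["id"]
    missing_raises = False
except KeyError:
    missing_raises = True

# get() returns None (or a supplied default) instead of raising
id_value = tag.get("id")
id_or_default = tag.get("id", "no-id")

# has_attr() checks for presence without reading the value
has_class = tag.has_attr("class")

print(missing_raises, id_value, id_or_default, has_class)
# True None no-id True
```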
Example 3: Program to extract multiple attribute values using the dictionary approach.
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2 class="first second third"> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")

# Get the whole h2 tag
tag = soup.h2

# Get the values of the class attribute
attribute = tag['class']

# Print the output
print(attribute)
Output:
['first', 'second', 'third']
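The list above appears because Beautiful Soup treats "class" as a multi-valued attribute and splits it on whitespace. Attributes such as "id" are not split. The sketch below, which assumes a hypothetical h2 tag with an added "id" attribute, shows the difference, along with joining the list back into a single string and using get_attribute_list() to get a list in either case.

```python
from bs4 import BeautifulSoup

# Hypothetical tag: the id attribute is added here for illustration
html = '<h2 id="main title" class="first second third"> Heading 1 </h2>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.h2

# class is multi-valued, so join the list to recover the original string
class_string = " ".join(tag["class"])

# id is single-valued, so the space is kept and a plain string is returned
id_value = tag["id"]

# get_attribute_list() returns a list whether or not the attribute is multi-valued
id_list = tag.get_attribute_list("id")

print(class_string)  # first second third
print(id_value)      # main title
print(id_list)       # ['main title']
```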