Commit 803a609

Updated get_video_info() to work with new YouTube HTML formatting
1 parent 542cc36, commit 803a609

2 files changed: +10 -15 lines changed

web-scraping/youtube-extractor/README.md

Lines changed: 6 additions & 11 deletions
@@ -8,22 +8,17 @@ To run this:
 **Output:**
 ```
 Title: Me at the zoo
-Views: 106602383
-Published at: 23/04/2005
+Views: 172639597
+Published at: 2005-04-23
 Video Duration: 0:18
 Video tags: me at the zoo, jawed karim, first youtube video
-Likes: 3825489
-Dislikes: 111818
+Likes: 8188077
+Dislikes: 191986
 
-Description: The first video on YouTube. Maybe it's time to go back to the zoo?
-
-NEW VIDEO LIVE! https://www.youtube.com/watch?v=dQw4w...
-
-
-== Ok, new video as soon as 10M subscriberz! ==
+Description: The first video on YouTube. While you wait for Part 2, listen to this great song: https://www.youtube.com/watch?v=zj82_v2R6ts
 
 
 Channel Name: jawed
 Channel URL: https://www.youtube.com/channel/UC4QobU6STFB0P71PMvOGN5A
-Channel Subscribers: 1.03M
+Channel Subscribers: 1.98M subscribers
 ```
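
With this commit, the date in the sample output changes from `23/04/2005` to `2005-04-23`. For callers that still expect the old day/month/year form, the conversion might look like the minimal sketch below. `to_legacy_date` is a hypothetical helper, not part of the repository, and it assumes the scraped value is a plain `YYYY-MM-DD` string as shown in the README output.

```python
from datetime import date


def to_legacy_date(iso_date: str) -> str:
    """Convert 'YYYY-MM-DD' to 'DD/MM/YYYY' (hypothetical helper, for illustration only)."""
    # date.fromisoformat raises ValueError if the input is not a valid ISO date,
    # which is preferable to silently passing a malformed string through.
    return date.fromisoformat(iso_date).strftime("%d/%m/%Y")


print(to_legacy_date("2005-04-23"))
```

If the page ever supplies a full timestamp rather than a bare date, this helper would need adjusting; the sketch only covers the format visible in the output above.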

web-scraping/youtube-extractor/extract_video_info.py

Lines changed: 4 additions & 4 deletions
@@ -16,13 +16,13 @@ def get_video_info(url):
     # initialize the result
     result = {}
     # video title
-    result["title"] = soup.find("h1").text.strip()
+    result["title"] = soup.find("meta", itemprop="name")['content']
     # video views (converted to integer)
-    result["views"] = int(''.join([c for c in soup.find("span", attrs={"class": "view-count"}).text if c.isdigit()]))
+    result["views"] = soup.find("meta", itemprop="interactionCount")['content']
     # video description
-    result["description"] = soup.find("yt-formatted-string", {"class": "content"}).text
+    result["description"] = soup.find("meta", itemprop="description")['content']
     # date published
-    result["date_published"] = soup.find("div", {"id": "date"}).text[1:]
+    result["date_published"] = soup.find("meta", itemprop="datePublished")['content']
     # get the duration of the video
     result["duration"] = soup.find("span", {"class": "ytp-time-duration"}).text
     # get the video tags
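
The new lookups index `['content']` directly on the result of `soup.find(...)`, which returns `None` when the tag is missing. The views value is also left as a string now, although the comment still says it is converted to an integer. A more defensive variant is sketched below, assuming BeautifulSoup 4 is installed. `meta_content` and `SAMPLE_HTML` are hypothetical names used only for illustration, and the sample markup is a stand-in rather than real YouTube output.

```python
from bs4 import BeautifulSoup


def meta_content(soup, itemprop, default=None):
    """Return the content of <meta itemprop=...>, or `default` if it is absent.

    Hypothetical helper; not part of the repository.
    """
    tag = soup.find("meta", itemprop=itemprop)
    if tag is None or not tag.has_attr("content"):
        return default
    return tag["content"]


# Stand-in markup (assumption): just enough microdata to exercise the helper.
SAMPLE_HTML = """
<html><body>
<meta itemprop="name" content="Me at the zoo">
<meta itemprop="interactionCount" content="172639597">
<meta itemprop="datePublished" content="2005-04-23">
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

views_raw = meta_content(soup, "interactionCount")
info = {
    "title": meta_content(soup, "name"),
    # Convert to int only when the value is purely numeric; otherwise keep None.
    "views": int(views_raw) if views_raw and views_raw.isdigit() else None,
    # No description tag in the sample, so this falls back to None.
    "description": meta_content(soup, "description"),
    "date_published": meta_content(soup, "datePublished"),
}
print(info)
```

Returning a default instead of raising keeps one missing tag from aborting the whole extraction; whether that trade-off suits the script is a design choice the commit itself does not address.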
