Description
If a URL contains non-ASCII (Unicode) characters, Python raises an encoding error while crawling it.
debug info:
INFO:root:Crawling#1:https://gvo.wiki/html/NPC掉落書籍.html
DEBUG:root:https://gvo.wiki/html/NPC掉落書籍.html ==> 'ascii' codec can't encode characters in position 13-16: ordinal not in range(128)
Solution:
- edit crawler.py
Add the following code at the top:
    import string
    from urllib.parse import quote
Then search for
    current_url = self.urls_to_crawl.pop()
and add a line below it, so that it reads:
    current_url = self.urls_to_crawl.pop()
    current_url = quote(current_url, safe=string.printable)
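To check the encoding step on its own, outside crawler.py, here is a minimal sketch. The helper name `encode_url` is hypothetical and not part of the crawler; it only wraps the `quote` call from the solution. Because `string.printable` contains every printable ASCII character, including `%`, `/` and `:`, only the non-ASCII characters should be percent-encoded, and a URL that is already encoded should pass through unchanged.

```python
import string
from urllib.parse import quote, unquote


def encode_url(url: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII characters only.

    Every printable ASCII character is marked as safe, so the scheme,
    slashes and any existing %XX escapes are left untouched.
    """
    return quote(url, safe=string.printable)


raw_url = "https://gvo.wiki/html/NPC掉落書籍.html"
encoded_url = encode_url(raw_url)

# The encoded URL is pure ASCII, so it can be encoded with the 'ascii' codec.
print(encoded_url)
print(encoded_url.isascii())

# Decoding gives back the original URL, so no information is lost.
print(unquote(encoded_url) == raw_url)
```

One caveat with this choice of `safe`: spaces and other ASCII whitespace are also in `string.printable`, so they are not encoded. If the crawled URLs can contain spaces, a narrower `safe` set may be the better choice.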