Open
Description
Documentation
In the Python documentation for urllib.robotparser, the example references a site (musi-cal.com) that no longer serves the content the example was written against. The example code currently reads:
>>> import urllib.robotparser
>>> rp = urllib.robotparser.RobotFileParser()
>>> rp.set_url("http://www.musi-cal.com/robots.txt")
>>> rp.read()
>>> rrate = rp.request_rate("*")
>>> rrate.requests
3
>>> rrate.seconds
20
>>> rp.crawl_delay("*")
6
>>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco")
False
>>> rp.can_fetch("*", "http://www.musi-cal.com/")
True
Additionally, the current robots.txt file at http://www.musi-cal.com/robots.txt contains:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Because of this, both can_fetch() calls now return True, which doesn't match the expected output shown in the example.
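The mismatch can be reproduced without a network request. The following is a minimal sketch, assuming the robots.txt content is exactly the three lines quoted above; it feeds them to `RobotFileParser.parse()` instead of calling `read()`. The variable names are illustrative only. As far as I can tell, it also suggests that `request_rate("*")` and `crawl_delay("*")` return `None` for this file, since it has no `Request-rate` or `Crawl-delay` entries.

```python
import urllib.robotparser

# Rules as quoted in this issue (assumed to be the complete file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

rp = urllib.robotparser.RobotFileParser()
# parse() takes an iterable of lines, so nothing is fetched here.
rp.parse(ROBOTS_TXT.splitlines())

# The two can_fetch() calls from the documentation example.
search_allowed = rp.can_fetch(
    "*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco"
)
root_allowed = rp.can_fetch("*", "http://www.musi-cal.com/")

# No Request-rate or Crawl-delay lines are present in the rules above.
rrate = rp.request_rate("*")
delay = rp.crawl_delay("*")

print(search_allowed, root_allowed, rrate, delay)
```

If `rrate` is indeed `None`, the `rrate.requests` line in the documented example would raise `AttributeError` rather than print `3`.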
Proposed fix:
Update the example in urllib.robotparser.rst to replace the outdated musi-cal.com URL with a valid URL (e.g. https://www.python.org).
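Whichever URL is chosen, the outputs shown in the example will depend on what that site serves at the time. For comparison, here is a rough sketch of the same API calls run against fixed input through `parse()`. The robots.txt content and the `www.example.org` URLs below are hypothetical, made up only to reproduce the values the current example prints; they are not taken from any real site and are not part of the proposal above.

```python
import urllib.robotparser

# Hypothetical rules, invented for illustration so the results are stable.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 6
Request-rate: 3/20
Disallow: /cgi-bin/
"""

rp = urllib.robotparser.RobotFileParser()
# parse() works on an iterable of lines and performs no network access.
rp.parse(SAMPLE_ROBOTS_TXT.splitlines())

rrate = rp.request_rate("*")    # named tuple with requests and seconds
delay = rp.crawl_delay("*")

# Hypothetical URLs; only the path is matched against the rules.
blocked = rp.can_fetch(
    "*", "https://www.example.org/cgi-bin/search?city=San+Francisco"
)
allowed = rp.can_fetch("*", "https://www.example.org/")

print(rrate.requests, rrate.seconds, delay, blocked, allowed)
```

A live URL keeps the example closer to real usage with `set_url()` and `read()`, while fixed input like this keeps the displayed results from drifting; which trade-off suits the docs is a judgment call for the maintainers.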
I would be happy to work on this issue and put together a PR for the update.