DEV Community
petercour

Get HTTP response codes with Python

You can get HTTP response codes with Python. This is useful for finding broken URLs or redirects.

The web is dynamic. It's ever changing. Today the popular browsers are Chrome and Firefox. It used to be Netscape and Internet Explorer.

Because it's ever changing, you want to test a website for broken links (404), redirects and other errors. You can do this in Python.

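To speed up testing, you want to use threading. Threads give you "parallel" execution. Parallel is in quotes because in CPython the threads don't really run Python code at the same time; they take turns. While one thread is waiting on the network, though, the others can run, so the requests overlap.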

So what does that look like in code?

```python
#!/usr/bin/python3
import time
import urllib.error
import urllib.request
from threading import Thread


class GetUrlThread(Thread):
    def __init__(self, url):
        super().__init__()
        self.url = url

    def run(self):
        try:
            resp = urllib.request.urlopen(self.url)
            print(self.url, resp.getcode())
        except urllib.error.HTTPError as err:
            # urlopen raises for 4xx/5xx responses; report the code anyway
            print(self.url, err.code)


def get_responses():
    urls = ['https://dev.to', 'https://www.ebay.com', 'https://www.github.com']
    start = time.time()
    threads = []
    # Start one thread per URL so the requests overlap
    for url in urls:
        t = GetUrlThread(url)
        threads.append(t)
        t.start()
    # Wait for every thread to finish before reporting the elapsed time
    for t in threads:
        t.join()
    print("Elapsed time: %s" % (time.time() - start))


get_responses()
```

Run it to test every URL in urls:

```
https://www.github.com 200
https://dev.to 200
https://www.ebay.com 200
Elapsed time: 0.496950626373291
```

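A response of 200 means everything is okay. If a page is not found, the server returns 404. More generally, 2xx codes mean success, 3xx codes are redirects, and 4xx and 5xx codes mean there's a problem. Keep in mind that urllib.request.urlopen follows redirects automatically, so you see the status of the final page, and that it raises urllib.error.HTTPError for 4xx and 5xx responses, with the status available on the exception's code attribute.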
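The status classes above can be sketched in code. This is a rough example rather than the post's own method: the helper names `describe_status`, `fetch_status` and `check_urls` are made up for illustration, and the thread pool from `concurrent.futures` is swapped in for the hand-written `Thread` subclass.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def describe_status(code):
    """Map a status code to a coarse category based on its first digit."""
    if code is None:
        return "no response"
    categories = {1: "informational", 2: "success", 3: "redirect",
                  4: "client error", 5: "server error"}
    return categories.get(code // 100, "unknown")


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is on the exception.
        return err.code
    except OSError:
        # URLError subclasses OSError: DNS failures, refused connections, timeouts.
        return None


def check_urls(urls, fetch=fetch_status, max_workers=8):
    """Run fetch for every URL on a thread pool and return {url: status}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))
```

Calling `check_urls(['https://dev.to', 'https://www.github.com'])` would return a dictionary of status codes, and `describe_status` turns each one into a label. Returning the codes instead of printing them from each thread keeps the output in a predictable order.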

Top comments (2)

rhymes

Since you only need to check response codes, you might want to issue only a HEAD request. That way you avoid downloading the page just to discard the content.

If you want to use the standard library you can do something like:

```python
request = urllib.request.Request(url, method="HEAD")
response = urllib.request.urlopen(request)
```

Unfortunately this will follow redirects, so you end up issuing multiple requests.

I'm sure there's a way to ignore redirects but you'd have to check the documentation.
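One way to do that with the standard library, as a minimal sketch: subclass `HTTPRedirectHandler` and return `None` from `redirect_request`. The redirect is then left unhandled, and urllib raises the 3xx response as an `HTTPError`. The names `NoRedirectHandler` and `head_status` are made up for this example.

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not follow"; urllib then falls back to
        # its default error handling and raises HTTPError for the 3xx.
        return None


def head_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        # Covers unfollowed redirects as well as 4xx/5xx responses.
        return err.code, err.headers.get("Location")
```

With this, a URL that redirects should come back as something like a 301 or 302 together with its target, after a single request. Network failures such as DNS errors are not caught here and would still raise.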

petercour

Thanks!
