- Notifications
You must be signed in to change notification settings - Fork0
Ruby Gem: Format, Verify & Follow URL redirects with detailed reports.
License
4rlm/url_verifier
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Following url redirects can sometimes take a few minutes and often creates various exceptions. UrlVerifier is built with exceptional error handling, reformatting, and optional time limits you can set; default is set to 60 sec limit, but typically only takes 5-10 seconds per url. UrlVerifier has been developed and improved upon for several years in an enterprise level app and is now available as an open source gem. It is perfect for high-volume, yet smooth, uninterrupted url formatting and verification.
Example:
{ url: 'blackwellford.com/staff', verified_url: 'https://www.blackwellford.com', url_path: '/staff', response_code:'200', url_redirected: true, url_sts: 'Valid'}
Add this line to your application's Gemfile:
gem'url_verifier'
And then execute:
$ bundle
Or install it yourself as:
$ gem install url_verifier
verifier = UrlVerifier::RunCurler.newverified_url_hash = verifier.verify_url('example.com')
verifier = UrlVerifier::RunCurler.newverified_url_hashes = verifier.verify_urls(array_of_urls)
array_of_urls = %w[blackwellford.com/staff www.mccrea.subaru.com/inventory www.sofake.sofake https://www.century1chevy.com https://www.mccreasubaru.com]args = { timeout_limit: 30 }verifier = UrlVerifier::RunCurler.new(args)verified_url_hashes = verifier.verify_urls(array_of_urls)
Note::timeout_limit
default is 60 seconds per url, but most urls take under 10 sec each. You can override the default by passing in your desired limit. If 60 seconds is fine, no need to pass any args. You could simply instantiate like this:
verifier = UrlVerifier::RunCurler.newverified_url_hashes = verifier.verify_urls(array_of_urls)
Returns Data in Hash Format with detailed report:
Notice the URLs in the input array above were NOT uniformly formatted. UrlVerifier leverages theutf8_sanitizer gem
andcrm_formatter gem
to pre-format the URLs before verifying, so the results will be uniformly formatted, like in the hashes below:url_f
.
:verified_url
is the final verified URL.:url_redirected
indicates that the verified URL is different than:url_f
.
Ifurl_sts: 'Invalid'
,:wx_date
will be timestamped, which helps keep track of when it became invalid, incase you are running periodic database URL verifications and want to include these details in your reports.
:response_code
in the 200's is ideal. If it has recently been forwarded it will be in the 300's, and 400's indicates an issue with the URL domain or server. Some 400's could be run later when they resolve their issues, so don't always give up on them.
Here is a reference guide:List of HTTP status codes
verified_url_hashes = [ { :web_status=>"formatted", :url=>"blackwellford.com/staff", :url_f=>"http://www.blackwellford.com", :url_path=>"/staff", :web_neg=>nil, :verified_url=>"https://www.blackwellford.com", :url_redirected=>true, :response_code=>"200", :url_sts=>"Valid", :url_date=>2018-07-02 09:16:19 -0500, :wx_date=>nil, :timeout=>0 }, { :web_status=>"formatted", :url=>"www.mccrea.subaru.com/inventory", :url_f=>"http://www.mccrea.subaru.com", :url_path=>nil, :web_neg=>nil, :verified_url=>"https://www.mccreasubaru.com", :url_redirected=>true, :response_code=>"200", :url_sts=>"Valid", :url_date=>2018-07-02 09:16:38 -0500, :wx_date=>nil, :timeout=>0 }, { :web_status=>"invalid", :url=>"www.sofake.sofake", :url_f=>nil, :url_path=>nil, :web_neg=>"error: ext.invalid [sofake]", :verified_url=>nil, :url_redirected=>false, :response_code=>nil, :url_sts=>"Invalid", :url_date=>2018-07-02 09:16:58 -0500, :wx_date=>2018-07-02 09:16:58 -0500, :timeout=>nil }, { :web_status=>"unchanged", :url=>"https://www.century1chevy.com", :url_f=>"https://www.century1chevy.com", :url_path=>nil, :web_neg=>nil, :verified_url=>"http://www.centurychevy.com", :url_redirected=>true, :response_code=>"405", :url_sts=>"Valid", :url_date=>2018-07-02 09:16:58 -0500, :wx_date=>nil, :timeout=>0 }, { :web_status=>"unchanged", :url=>"https://www.mccreasubaru.com", :url_f=>"https://www.mccreasubaru.com", :url_path=>nil, :web_neg=>nil, :verified_url=>"https://www.mccreasubaru.com", :url_redirected=>false, :response_code=>"200", :url_sts=>"Valid", :url_date=>2018-07-02 09:16:59 -0500, :wx_date=>nil, :timeout=>0 }]
Adam J Booth -4rlm
After checking out the repo, verifybin/setup
to install dependencies. Then, verifyrake spec
to verify the tests. You can also verifybin/console
for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, verifybundle exec rake install
. To release a new version, update the version number inversion.rb
, and then verifybundle exec rake release
, which will create a git tag for the version, push git commits and tags, and push the.gem
file torubygems.org.
Bug reports and pull requests are welcome on GitHub athttps://github.com/4rlm/url_verifier. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to theContributor Covenant code of conduct.
The gem is available as open source under the terms of theMIT License.
Everyone interacting in the UrlVerifier project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow thecode of conduct.