Web Scraping API code examples for Python, PHP and Node.js

Smartproxy/Web-Scraping-API

Introduction

With our Web Scraping API, you can scrape various websites en masse.

Authentication

Once you have an active Web Scraping API subscription, you can try sending a request right from the dashboard: open the Web Scraping API > Authentication method tab, enter your username and password, and click Generate. An example curl request is generated right below the user:pass you entered.

Note that this is only an example with preset values to get you on the right track for forming your own request. You cannot change the request values in the dashboard itself; that has to be done in your code.
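The user:pass pair in the generated curl request is the usual shape of HTTP Basic authentication. As a minimal sketch, assuming the API accepts a standard Basic `Authorization` header, the header can be built like this. The helper name `basic_auth_header` and the credentials are placeholders, not part of the repository.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from a user:pass pair.

    Assumption: the API accepts standard Basic authentication.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials; replace them with the ones from your dashboard.
headers = basic_auth_header("user", "pass")
print(headers)
```

Most HTTP clients can build this header for you; constructing it by hand only makes explicit what is sent.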

Scraping

You can use universal as your target and supply any URL you want; the API returns the HTML of the targeted URL.

API Link: https://scraper-api.smartproxy.com/v2/scrape

POST /scrape

Target: universal (not parseable)

Required parameters: url (ip.smartproxy.com in this example)

| Parameter | Type | Description |
| --- | --- | --- |
| url | url | Target URL |
| target | string | Scraping target: universal |
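As a rough illustration of the request described above, the sketch below builds a POST to the scrape endpoint with the two documented parameters. It assumes the parameters are sent as a JSON body and that authentication is HTTP Basic; `build_scrape_request` and the credentials are hypothetical. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request

API_URL = "https://scraper-api.smartproxy.com/v2/scrape"


def build_scrape_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the universal target.

    Assumptions: JSON request body and HTTP Basic authentication.
    """
    payload = {"target": "universal", "url": url}
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder credentials; replace them with your own.
request = build_scrape_request("https://ip.smartproxy.com", "user", "pass")
print(request.get_method(), request.full_url)

# Sending requires an active subscription:
# with urllib.request.urlopen(request, timeout=60) as response:
#     body = json.loads(response.read())
```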

Examples

| Programming Language | Example location | Download |
| --- | --- | --- |
| Python | python/ipsmartproxy.py | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/python/ipsmartproxy.py > ipsmartproxy.py |
| PHP | php/ipsmartproxy.php | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/php/ipsmartproxy.php > ipsmartproxy.php |
| Node.js | nodejs/ipsmartproxy.js | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/nodejs/ipsmartproxy.js > ipsmartproxy.js |

Response

{"results": [    {"content":"Your Ip is: 213.87.163.6","status_code":200,"url":"https://ip.smartproxy.com/","task_id":"6971034977135771649","created_at":"2022-09-01 09:24:14","updated_at":"2022-09-01 09:24:17"    }  ]}
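The response wraps each scraped page in a `results` list. A minimal sketch of reading it, using the sample response above as input, might look like this. The helper name `extract_contents` is hypothetical.

```python
import json

# The sample response shown above, as a JSON string.
raw_response = (
    '{"results": [{"content": "Your Ip is: 213.87.163.6", "status_code": 200, '
    '"url": "https://ip.smartproxy.com/", "task_id": "6971034977135771649", '
    '"created_at": "2022-09-01 09:24:14", "updated_at": "2022-09-01 09:24:17"}]}'
)


def extract_contents(response_text: str) -> list:
    """Return the content of every result whose status_code is 200."""
    data = json.loads(response_text)
    return [
        result["content"]
        for result in data.get("results", [])
        if result.get("status_code") == 200
    ]


contents = extract_contents(raw_response)
print(contents)
```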

Headless

Not seeing the results you wanted?

Try enabling JavaScript rendering using the headless parameter (see Parameters).

This parameter renders JavaScript on the target website, making more data available for scraping.

Facebook

Facebook Page

Target: universal (not parseable)

Required parameters: url

| Parameter | Type | Description |
| --- | --- | --- |
| url | url | Target URL |
| target | string | Scraping target: universal |

Response

{"results": [    {"content":"<html> Facebook page content</html>","status_code":200,"url":"https://www.facebook.com/ladygaga","task_id":"6972452679540839425","created_at":"2022-09-05 07:17:40","updated_at":"2022-09-05 07:17:45"    }  ]}
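Because the universal target is not parseable, `content` holds raw HTML and any extraction has to happen on your side. The sketch below pulls the page title out of such a string with Python's standard `html.parser` module. The `TitleParser` class, the `extract_title` helper, and the sample HTML are illustrative and not part of the repository.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the stripped <title> text, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


# Stand-in for the "content" field of a real response.
sample_content = "<html><head><title>Example Page</title></head><body></body></html>"
print(extract_title(sample_content))
```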

Examples

| Programming Language | Example location | Download |
| --- | --- | --- |
| Python | python/facebookpage.py | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/python/facebookpage.py > facebookpage.py |
| PHP | php/facebookpage.php | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/php/facebookpage.php > facebookpage.php |
| Node.js | nodejs/facebookpage.js | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/nodejs/facebookpage.js > facebookpage.js |

Facebook Post

Target: universal (not parseable)

Required parameters: url

| Parameter | Type | Description |
| --- | --- | --- |
| url | url | Target URL |
| target | string | Scraping target: universal |
| headless | string | JavaScript rendering: html |

Response

{"results": [    {"content":"<html> Facebook page content</html>","status_code":200,"url":"https://www.facebook.com/zuck/posts/pfbid0HeY54v4LMcv2EMxDz5RvnWaR6swsGFWikzUbrsEFtvxu9n4GCx7zA2YTM69XdiYnl","task_id":"6972484278999372801","created_at":"2022-09-05 09:23:14","updated_at":"2022-09-05 09:23:32"    }  ]}

Examples

| Programming Language | Example location | Download |
| --- | --- | --- |
| Python | python/facebookpost.py | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/python/facebookpost.py > facebookpost.py |
| PHP | php/facebookpost.php | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/php/facebookpost.php > facebookpost.php |
| Node.js | nodejs/facebookpost.js | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/nodejs/facebookpost.js > facebookpost.js |

Facebook Group

Target: universal (not parseable)

Required parameters: url

| Parameter | Type | Description |
| --- | --- | --- |
| url | url | Target URL |
| target | string | Scraping target: universal |

Response

{"results": [    {"content":"<html> Facebook page content</html>","status_code":200,"url":"https://www.facebook.com/groups/1394454774138066","task_id":"6972486765374350337","created_at":"2022-09-05 09:33:07","updated_at":"2022-09-05 09:33:33"    }  ]}

Examples

| Programming Language | Example location | Download |
| --- | --- | --- |
| Python | python/facebookgroup.py | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/python/facebookgroup.py > facebookgroup.py |
| PHP | php/facebookgroup.php | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/php/facebookgroup.php > facebookgroup.php |
| Node.js | nodejs/facebookgroup.js | curl https://raw.githubusercontent.com/Smartproxy/Web-Scraping-API/main/nodejs/facebookgroup.js > facebookgroup.js |

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| target | string | Data source (universal). |
| url | url | Direct URL (link). |
| locale | string | Changes the web interface language. Examples: en-US, en-GB. |
| geo | string | The geographical location that the result depends on. Full country names are required. |
| device_type | string | Device type and browser. Supported: desktop, desktop_chrome, desktop_firefox, mobile, mobile_android, mobile_ios. |
| headless | string | Enables JavaScript rendering. Supported: html, png. |
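To show how these parameters fit together, here is a small sketch that assembles a request payload and rejects values outside the supported lists. The function name `build_payload` and the client-side validation are illustrative assumptions; the API itself may validate differently.

```python
# Supported values as listed in the parameters table.
SUPPORTED_DEVICE_TYPES = {
    "desktop", "desktop_chrome", "desktop_firefox",
    "mobile", "mobile_android", "mobile_ios",
}
SUPPORTED_HEADLESS = {"html", "png"}


def build_payload(url, locale=None, geo=None, device_type=None, headless=None):
    """Assemble a payload for the universal target, omitting unset options."""
    if device_type is not None and device_type not in SUPPORTED_DEVICE_TYPES:
        raise ValueError(f"Unsupported device_type: {device_type!r}")
    if headless is not None and headless not in SUPPORTED_HEADLESS:
        raise ValueError(f"Unsupported headless value: {headless!r}")

    payload = {"target": "universal", "url": url}
    optional = {
        "locale": locale,
        "geo": geo,
        "device_type": device_type,
        "headless": headless,
    }
    # Only include the optional parameters that were actually provided.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_payload(
    "https://www.facebook.com/ladygaga",
    locale="en-US",
    geo="United States",  # full country name, as the table requires
    device_type="desktop_chrome",
    headless="html",
)
print(payload)
```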

Response Codes

HTTP Response Codes

| Response | Description | Solution |
| --- | --- | --- |
| 200 - Success | Server has replied and given requested response. | Celebrate! |
| 204 - No content | Job not completed yet. | Wait a few seconds before trying again. |
| 400 - Multiple error messages | Bad structure of the request. | Re-check your request to make sure it is in the correct format. |
| 401 - Invalid / not provided authorization header (client not found) | Incorrect login credentials or missing authorization. | Re-check your provided credentials for authorization. |
| 403 - Forbidden | Your account does not have access to this resource. | Make sure the target is supported by us. |
| 404 - Not found | Your target was not found. | Re-check your targeted URL. |
| 429 - Too many requests | Exceeded rate limit for your subscription. | Make sure you still have at least one request left. Wait a couple minutes and try again. If you are encountering the error often, chat with us to see if your rate limit can be increased. |
| 500 - Internal error | Service unavailable, possibly due to some issues we are encountering. | Wait a couple minutes and send another request. Contact us for more information. |
| 524 - Timeout | Service unavailable, possibly due to some issues we are encountering. | Wait a couple minutes and send another request. Contact us for more information. |
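The solutions in the table suggest a simple retry policy: wait briefly on 204, wait longer on 429, 500, and 524, and stop on the remaining errors. The sketch below is one way to express that. The delay values (5 and 120 seconds), the function names, and the `send` callable are assumptions made for illustration.

```python
import time


def retry_delay_seconds(status_code: int):
    """Return how long to wait before retrying, or None if retrying will not help.

    Delay values are assumptions based on "a few seconds" and "a couple minutes".
    """
    if status_code == 204:
        return 5.0
    if status_code in (429, 500, 524):
        return 120.0
    return None  # 400, 401, 403, 404: fix the request instead of retrying


def fetch_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns HTTP 200 or the attempts run out.

    send is any callable returning a (status_code, body) tuple.
    """
    for attempt in range(max_attempts):
        status_code, body = send()
        if status_code == 200:
            return body
        delay = retry_delay_seconds(status_code)
        if delay is None:
            raise RuntimeError(f"Request failed with HTTP {status_code}")
        if attempt < max_attempts - 1:
            sleep(delay)
    raise TimeoutError(f"No successful response after {max_attempts} attempts")


# Demonstration with canned responses and a fake sleep that only records delays.
canned = iter([(204, None), (429, None), (200, "<html>ok</html>")])
recorded_delays = []
result = fetch_with_retries(lambda: next(canned), sleep=recorded_delays.append)
print(result, recorded_delays)
```

Passing `sleep` as a parameter keeps the policy testable without actually waiting.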

License

All code is released under the MIT License.

