ScrapingAnt API client for Python.


ScrapingAnt/scrapingant-client-python

scrapingant-client is the official library for accessing the ScrapingAnt API from your Python applications. It provides useful features like parameter encoding to improve the ScrapingAnt usage experience. Requires Python 3.6+.

Quick Start

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')
    # Scrape the example.com site
    result = client.general_request('https://example.com')
    print(result.content)

Install

pip install scrapingant-client

If you need async support:

pip install scrapingant-client[async]

API token

To get an API token, register at ScrapingAnt Service.

API Reference

All public classes, methods and their parameters can be inspected in this API reference.

ScrapingAntClient(token)

Main class of this library.

| Param | Type |
|-------|------|
| token | string |

Common arguments

  • ScrapingAntClient.general_request
  • ScrapingAntClient.general_request_async
  • ScrapingAntClient.markdown_request
  • ScrapingAntClient.markdown_request_async

https://docs.scrapingant.com/request-response-format#available-parameters

| Param | Type | Default |
|-------|------|---------|
| url | string | |
| method | string | GET |
| cookies | List[Cookie] | None |
| headers | List[Dict[str, str]] | None |
| js_snippet | string | None |
| proxy_type | ProxyType | datacenter |
| proxy_country | str | None |
| wait_for_selector | str | None |
| browser | boolean | True |
| return_page_source | boolean | False |
| data | same as requests param 'data' | None |
| json | same as requests param 'json' | None |

IMPORTANT NOTE: js_snippet will be encoded to Base64 automatically by the ScrapingAnt client library.
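To make the note above concrete, here is a minimal sketch of what Base64-encoding a JavaScript snippet looks like with the Python standard library. The helper name `encode_js_snippet` is hypothetical and is not part of scrapingant-client; the library's internal implementation may differ. In normal use you pass the plain snippet to `js_snippet` and do not encode it yourself.

```python
import base64


def encode_js_snippet(js_snippet: str) -> str:
    """Return the Base64 text form of a JavaScript snippet (illustrative helper)."""
    return base64.b64encode(js_snippet.encode("utf-8")).decode("ascii")


snippet = "document.title = 'Hello';"
encoded = encode_js_snippet(snippet)

# Decoding restores the original snippet unchanged.
assert base64.b64decode(encoded).decode("utf-8") == snippet
print(encoded)
```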


Cookie

Class defining a cookie. Currently it supports only name and value.

| Param | Type |
|-------|------|
| name | string |
| value | string |

Response

Class defining the response from the API.

| Param | Type |
|-------|------|
| content | string |
| cookies | List[Cookie] |
| status_code | int |
| text | string |

Exceptions

ScrapingantClientException is the base exception class, used for all errors.

| Exception | Reason |
|-----------|--------|
| ScrapingantInvalidTokenException | The API token is wrong or you have exceeded the API calls request limit |
| ScrapingantInvalidInputException | Invalid value provided. Please look into the error message for more info |
| ScrapingantInternalException | Something went wrong with the server-side code. Try again later or contact ScrapingAnt support |
| ScrapingantSiteNotReachableException | The requested URL is not reachable. Please check it locally |
| ScrapingantDetectedException | The anti-bot detection system has detected the request. Please retry or change the request settings |
| ScrapingantTimeoutException | Got a timeout while communicating with ScrapingAnt servers. Check your network connection. Please try later or contact support |

Examples

Sending custom cookies

    from scrapingant_client import ScrapingAntClient
    from scrapingant_client import Cookie

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    result = client.general_request(
        'https://httpbin.org/cookies',
        cookies=[
            Cookie(name='cookieName1', value='cookieVal1'),
            Cookie(name='cookieName2', value='cookieVal2'),
        ]
    )
    print(result.content)
    # Response cookies is a list of Cookie objects
    # They can be used in next requests
    response_cookies = result.cookies

Executing custom JS snippet

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    customJsSnippet = """
    var str = 'Hello, world!';
    var htmlElement = document.getElementsByTagName('html')[0];
    htmlElement.innerHTML = str;
    """
    result = client.general_request(
        'https://example.com',
        js_snippet=customJsSnippet,
    )
    print(result.content)

Exception handling and retries

    from scrapingant_client import ScrapingAntClient, ScrapingantClientException, ScrapingantInvalidInputException

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    RETRIES_COUNT = 3


    def parse_html(html: str):
        ...  # Implement your data extraction here


    parsed_data = None
    for retry_number in range(RETRIES_COUNT):
        try:
            scrapingant_response = client.general_request(
                'https://example.com',
            )
        except ScrapingantInvalidInputException as e:
            print(f'Got invalid input exception: {repr(e)}')
            break  # We are not retrying if request params are not valid
        except ScrapingantClientException as e:
            print(f'Got ScrapingAnt exception {repr(e)}')
        except Exception as e:
            print(f'Got unexpected exception {repr(e)}')  # please report this kind of exception by creating a new issue
        else:
            try:
                parsed_data = parse_html(scrapingant_response.content)
                break  # Data is parsed successfully, so we don't need to retry
            except Exception as e:
                print(f'Got exception while parsing data {repr(e)}')

    if parsed_data is None:
        print(f'Failed to retrieve and parse data after {RETRIES_COUNT} tries')
        # Can sleep and retry later, or stop the script execution, and research the reason
    else:
        print(f'Successfully parsed data: {parsed_data}')
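The comment "Can sleep and retry later" in the example above can be turned into a small reusable helper. The following is a sketch only, not part of scrapingant-client: `retry_with_backoff` and all of its parameters are hypothetical names, and exponential backoff is one possible waiting strategy among several.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    no_retry: Tuple[Type[BaseException], ...] = (),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action() up to `retries` times, doubling the delay after each failure."""
    for attempt in range(retries):
        try:
            return action()
        except no_retry:
            raise  # e.g. invalid input: retrying cannot help
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")
```

With the client from the example above, a call could look like `retry_with_backoff(lambda: client.general_request('https://example.com'), no_retry=(ScrapingantInvalidInputException,))`, which skips retries for invalid request parameters just as the original loop does.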

Sending custom headers

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    result = client.general_request(
        'https://httpbin.org/headers',
        headers={
            'test-header': 'test-value'
        }
    )
    print(result.content)

    # Http basic auth example
    result = client.general_request(
        'https://jigsaw.w3.org/HTTP/Basic/',
        headers={'Authorization': 'Basic Z3Vlc3Q6Z3Vlc3Q='}
    )
    print(result.content)
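The `Authorization` value in the basic auth example above is the Base64 encoding of `guest:guest`. A minimal sketch of building such a header from a username and password follows; `basic_auth_header` is a hypothetical helper, not something the library provides.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from plain credentials."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("guest", "guest"))
# → {'Authorization': 'Basic Z3Vlc3Q6Z3Vlc3Q='}
```

The resulting dictionary can be passed as the `headers` argument in the same way as the literal header shown above.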

Simple async example

    import asyncio

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')


    async def main():
        # Scrape the example.com site
        result = await client.general_request_async('https://example.com')
        print(result.content)


    asyncio.run(main())
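The async method becomes more useful when several pages are requested at once. The sketch below assumes a hypothetical helper, `scrape_many`, that takes any coroutine function as `fetch` (for example `client.general_request_async`) and limits how many requests run at the same time. The concurrency limit of 3 is an arbitrary illustration, not a documented ScrapingAnt limit.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def scrape_many(
    fetch: Callable[[str], Awaitable[Any]],
    urls: Sequence[str],
    max_concurrency: int = 3,
) -> List[Any]:
    """Fetch all URLs concurrently, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> Any:
        async with semaphore:
            return await fetch(url)

    # return_exceptions=True keeps one failed URL from cancelling the rest;
    # failures appear in the result list as exception objects.
    return await asyncio.gather(
        *(fetch_one(url) for url in urls),
        return_exceptions=True,
    )
```

Inside an async function, `results = await scrape_many(client.general_request_async, ['https://example.com', 'https://httpbin.org/html'])` returns results in the same order as the input URLs. Check each item with `isinstance(item, Exception)` before reading `item.content`.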

Sending POST request

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    # Sending POST request with json data
    result = client.general_request(
        url="https://httpbin.org/post",
        method="POST",
        json={"test": "test"},
    )
    print(result.content)

    # Sending POST request with bytes data
    result = client.general_request(
        url="https://httpbin.org/post",
        method="POST",
        data=b'test_bytes',
    )
    print(result.content)

Receiving markdown

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    # Request the page content converted to Markdown
    result = client.markdown_request(
        url="https://example.com",
    )
    print(result.markdown)
