ScrapingAnt API client for Python.
ScrapingAnt/scrapingant-client-python
`scrapingant-client` is the official library to access the ScrapingAnt API from your Python applications. It provides useful features, such as parameter encoding, to improve the ScrapingAnt usage experience. Requires Python 3.6+.

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')
    # Scrape the example.com site
    result = client.general_request('https://example.com')
    print(result.content)

To install the library:

    pip install scrapingant-client

If you need async support:

    pip install scrapingant-client[async]

To get an API token, you'll need to register at ScrapingAnt Service.
All public classes, methods and their parameters can be inspected in this API reference.
`ScrapingAntClient` is the main class of this library.
Param | Type |
---|---|
token | string |
- ScrapingAntClient.general_request
- ScrapingAntClient.general_request_async
- ScrapingAntClient.markdown_request
- ScrapingAntClient.markdown_request_async
These methods share the common arguments listed below. See https://docs.scrapingant.com/request-response-format#available-parameters for details.
Param | Type | Default |
---|---|---|
url | string | |
method | string | GET |
cookies | List[Cookie] | None |
headers | Dict[str, str] | None |
js_snippet | string | None |
proxy_type | ProxyType | datacenter |
proxy_country | str | None |
wait_for_selector | str | None |
browser | boolean | True |
return_page_source | boolean | False |
data | same as requests param 'data' | None |
json | same as requests param 'json' | None |
IMPORTANT NOTE: `js_snippet` will be encoded to Base64 automatically by the ScrapingAnt client library.
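To illustrate what Base64 encoding of a snippet looks like, here is a minimal sketch using Python's standard `base64` module. It shows the general technique only; the exact steps inside the client library are not shown in this README, so the variable names and flow here are assumptions. You do not need to do this yourself when passing `js_snippet`.

```python
import base64

# Illustrative snippet; any JavaScript string works the same way.
js_snippet = "document.title = 'Hello, world!';"

# Encode the UTF-8 bytes of the snippet into Base64 text.
encoded = base64.b64encode(js_snippet.encode("utf-8")).decode("ascii")

# Decoding restores the original snippet exactly.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == js_snippet)
```
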
`Cookie` is the class defining a cookie. Currently it supports only `name` and `value`.
Param | Type |
---|---|
name | string |
value | string |
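As a rough mental model of a name/value cookie, the sketch below defines a hypothetical `SketchCookie` stand-in (not the library's `Cookie` class) and a hypothetical helper `to_cookie_header` that joins cookies into the usual `name=value; name=value` header form. How the library actually serializes cookies is not described in this README, so treat this as an assumption.

```python
from typing import List, NamedTuple


class SketchCookie(NamedTuple):
    """Hypothetical stand-in for a cookie with only a name and a value."""
    name: str
    value: str


def to_cookie_header(cookies: List[SketchCookie]) -> str:
    """Join cookies into a 'name=value; name=value' Cookie header string."""
    return "; ".join(f"{c.name}={c.value}" for c in cookies)


header = to_cookie_header([
    SketchCookie(name="cookieName1", value="cookieVal1"),
    SketchCookie(name="cookieName2", value="cookieVal2"),
])
print(header)
```
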
Class defining the response from the API.
Param | Type |
---|---|
content | string |
cookies | List[Cookie] |
status_code | int |
text | string |
`ScrapingantClientException` is the base exception class, used for all errors.
Exception | Reason |
---|---|
ScrapingantInvalidTokenException | The API token is wrong or you have exceeded the API request limit |
ScrapingantInvalidInputException | Invalid value provided. Check the error message for more info |
ScrapingantInternalException | Something went wrong with the server-side code. Try again later or contact ScrapingAnt support |
ScrapingantSiteNotReachableException | The requested URL is not reachable. Check that it is reachable locally |
ScrapingantDetectedException | The anti-bot detection system has detected the request. Retry or change the request settings |
ScrapingantTimeoutException | Timed out while communicating with ScrapingAnt servers. Check your network connection, then try again later or contact support |
Sending custom cookies:

    from scrapingant_client import ScrapingAntClient
    from scrapingant_client import Cookie

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    result = client.general_request(
        'https://httpbin.org/cookies',
        cookies=[
            Cookie(name='cookieName1', value='cookieVal1'),
            Cookie(name='cookieName2', value='cookieVal2'),
        ]
    )
    print(result.content)
    # Response cookies is a list of Cookie objects
    # They can be used in subsequent requests
    response_cookies = result.cookies

Executing a custom JS snippet:

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    customJsSnippet = """
    var str = 'Hello, world!';
    var htmlElement = document.getElementsByTagName('html')[0];
    htmlElement.innerHTML = str;
    """
    result = client.general_request(
        'https://example.com',
        js_snippet=customJsSnippet,
    )
    print(result.content)

Exception handling and retries:

    from scrapingant_client import ScrapingAntClient, ScrapingantClientException, ScrapingantInvalidInputException

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    RETRIES_COUNT = 3


    def parse_html(html: str):
        ...  # Implement your data extraction here


    parsed_data = None
    for retry_number in range(RETRIES_COUNT):
        try:
            scrapingant_response = client.general_request(
                'https://example.com',
            )
        except ScrapingantInvalidInputException as e:
            print(f'Got invalid input exception: {repr(e)}')
            break  # We are not retrying if request params are not valid
        except ScrapingantClientException as e:
            print(f'Got ScrapingAnt exception {repr(e)}')
        except Exception as e:
            print(f'Got unexpected exception {repr(e)}')  # please report this kind of exception by creating a new issue
        else:
            try:
                parsed_data = parse_html(scrapingant_response.content)
                break  # Data is parsed successfully, so we don't need to retry
            except Exception as e:
                print(f'Got exception while parsing data {repr(e)}')

    if parsed_data is None:
        print(f'Failed to retrieve and parse data after {RETRIES_COUNT} tries')
        # Can sleep and retry later, or stop the script execution, and research the reason
    else:
        print(f'Successfully parsed data: {parsed_data}')

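The retry loop above mentions sleeping before retrying. One common way to do that is exponential backoff, sketched below with the standard library only. `RetryableError`, `retry_with_backoff`, and `flaky` are hypothetical names used for illustration; in real use the callable would wrap `client.general_request(...)` and the `except` clause would catch `ScrapingantClientException` instead.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for an error that is worth retrying."""


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call func up to `retries` times, doubling the delay after each failure."""
    for attempt in range(retries):
        try:
            return func()
        except RetryableError as e:
            print(f"Attempt {attempt + 1} failed: {e!r}")
            if attempt < retries - 1:
                # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    return None  # All attempts failed


# Usage: a fake request that fails twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RetryableError("temporary failure")
    return "<html>ok</html>"


delays: List[float] = []  # Record delays instead of actually sleeping
result = retry_with_backoff(flaky, retries=3, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the sketch fast to run and easy to test; the default is the real `time.sleep`.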
Sending custom headers:

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    result = client.general_request(
        'https://httpbin.org/headers',
        headers={
            'test-header': 'test-value'
        }
    )
    print(result.content)

    # HTTP basic auth example
    result = client.general_request(
        'https://jigsaw.w3.org/HTTP/Basic/',
        headers={'Authorization': 'Basic Z3Vlc3Q6Z3Vlc3Q='}
    )
    print(result.content)

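The `Authorization` value in the basic auth example is the Base64 encoding of `guest:guest`. If you need to build such a header for other credentials, a minimal sketch with the standard library could look like this. The helper name `basic_auth_header` is hypothetical and not part of the client library.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a username and password."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("guest", "guest")
print(headers)
```

The resulting dictionary can be passed as the `headers` argument in the same way as the hard-coded value above.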
Simple async example:

    import asyncio

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')


    async def main():
        # Scrape the example.com site
        result = await client.general_request_async('https://example.com')
        print(result.content)


    asyncio.run(main())

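The async method becomes more useful when several pages are requested concurrently. The sketch below shows the general `asyncio.gather` pattern with a hypothetical `fetch_stub` coroutine standing in for `await client.general_request_async(url)`, so it runs without a token or network access. Whether your plan's concurrency limits allow this is something to check separately.

```python
import asyncio
from typing import List


async def fetch_stub(url: str) -> str:
    """Hypothetical placeholder for `await client.general_request_async(url)`."""
    await asyncio.sleep(0.01)  # Simulate network latency
    return f"content of {url}"


async def scrape_all(urls: List[str]) -> List[str]:
    # gather runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(fetch_stub(url) for url in urls))


results = asyncio.run(
    scrape_all(["https://example.com", "https://httpbin.org/headers"])
)
print(results)
```
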
Sending a POST request:

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    # Sending POST request with json data
    result = client.general_request(
        url="https://httpbin.org/post",
        method="POST",
        json={"test": "test"},
    )
    print(result.content)

    # Sending POST request with bytes data
    result = client.general_request(
        url="https://httpbin.org/post",
        method="POST",
        data=b'test_bytes',
    )
    print(result.content)

Receiving markdown:

    from scrapingant_client import ScrapingAntClient

    client = ScrapingAntClient(token='<YOUR-SCRAPINGANT-API-TOKEN>')

    # Requesting the example.com page as markdown
    result = client.markdown_request(
        url="https://example.com",
    )
    print(result.markdown)
