🥄 A package for building specific Proxy Pool for different Sites.
Jiramew/spoon
Spoon is a library for building a distributed proxy pool for each site you assign.
It runs only on Python 3.
Simply run: `pip install spoonproxy`
or clone the repo and add it to your `PYTHONPATH`.
Please make sure Redis is running. The default configuration is `host: localhost, port: 6379`. You can also modify the Redis connection.
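Before starting a pipe, it can help to confirm that something is listening on the Redis port. The following is a minimal sketch using only the standard library. `redis_is_reachable` is a hypothetical helper, not part of Spoon, and it only checks that a TCP connection can be opened, not that the server behind it is actually Redis.

```python
import socket


def redis_is_reachable(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it does not speak the Redis protocol, it only
    verifies that some server accepts connections on that address.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if not redis_is_reachable():
        print("Nothing is listening on localhost:6379; start Redis first.")
```

If you changed the Redis connection, pass the same host and port that you give to `RedisConfig`.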
As in `example.py` in `spoon_server/example`, you can assign many different proxy providers.
```python
from spoon_server.proxy.fetcher import Fetcher
from spoon_server.main.proxy_pipe import ProxyPipe
from spoon_server.proxy.kuai_provider import KuaiProvider
from spoon_server.proxy.xici_provider import XiciProvider
from spoon_server.database.redis_config import RedisConfig
from spoon_server.main.checker import CheckerBaidu


def main_run():
    redis = RedisConfig("127.0.0.1", 21009)
    p1 = ProxyPipe(url_prefix="https://www.baidu.com",
                   fetcher=Fetcher(use_default=False),
                   database=redis,
                   checker=CheckerBaidu()).set_fetcher([KuaiProvider()]).add_fetcher([XiciProvider()])
    p1.start()


if __name__ == '__main__':
    main_run()
```
Also, with a different checker, you can validate the result more precisely.
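```python
import re


class CheckerBaidu(Checker):
    def checker_func(self, html=None):
        # Responses may arrive as raw bytes; decode before matching.
        if isinstance(html, bytes):
            html = html.decode('utf-8')
        # The pattern is the Baidu home page title ("Baidu it, and you will know").
        if re.search(r".*百度一下,你就知道.*", html):
            return True
        else:
            return False
```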
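The same pattern can be adapted to other sites. The sketch below is an illustration only: `CheckerKeyword` is a hypothetical class, and the local `Checker` is a stand-in for Spoon's base class so that the example runs on its own. It assumes that a checker only needs to provide `checker_func(html)` returning a boolean, as `CheckerBaidu` does; the real base class may require more.

```python
import re


class Checker:
    """Stand-in for Spoon's Checker base class (assumption: in real use,
    import it from spoon_server.main.checker instead of defining it here)."""

    def checker_func(self, html=None):
        raise NotImplementedError


class CheckerKeyword(Checker):
    """Hypothetical checker: a page is valid if it matches a given regex."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)

    def checker_func(self, html=None):
        if html is None:
            return False
        # Decode bytes leniently so a bad byte does not raise an exception.
        if isinstance(html, bytes):
            html = html.decode('utf-8', errors='replace')
        return self.pattern.search(html) is not None


checker = CheckerKeyword(r"Example Domain")
print(checker.checker_func(b"<title>Example Domain</title>"))
```

Matching on text that only the genuine page contains helps reject proxies that return an error page or an injected page with a successful status code.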
Also, as the code in `spoon_server/example/example_multi.py` shows, by using multiprocessing you can run many queues for fetching and validating proxies.
You can also assign different providers to different URLs.
The default proxy providers are shown below; you can also write your own providers.
A simple Django web API demo is included. You can use any web server and write your own API.
Gently run `python manager.py runserver **.**.**.**:*****`
The simple APIs include:
name | description |
---|---|
http://127.0.0.1:21010/api/v1/get_keys | Get all keys from redis |
http://127.0.0.1:21010/api/v1/fetchone_from?target=www.google.com&filter=65 | Get one useful proxy. target: the specific URL; filter: number of successful revalidations |
http://127.0.0.1:21010/api/v1/fetchall_from?target=www.google.com&filter=65 | Get all useful proxies. |
http://127.0.0.1:21010/api/v1/fetch_hundred_recent?target=www.baidu.com&filter=5 | Get recently joined full-score proxies. target: the specific URL; filter: time in seconds |
http://127.0.0.1:21010/api/v1/fetch_stale?num=100 | Get recent proxies that have not been checked. num: the number of proxies you want |
http://127.0.0.1:21010/api/v1/fetch_recent?target=www.baidu.com | Get recent proxies that were successfully validated. target: the specific URL |
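
The endpoints in the table above can be called from any HTTP client. Here is a minimal sketch using only the standard library. `build_api_url` and `call_api` are hypothetical helpers, not part of Spoon; the base address is the one used in the table, and the response is returned as raw text because its format is not described here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from the table above; change it to match your runserver address.
API_BASE = "http://127.0.0.1:21010/api/v1"


def build_api_url(endpoint, **params):
    """Build a URL such as .../fetchone_from?target=...&filter=..."""
    query = urlencode(params)
    if query:
        return f"{API_BASE}/{endpoint}?{query}"
    return f"{API_BASE}/{endpoint}"


def call_api(endpoint, timeout=5.0, **params):
    """Call the demo API and return the raw response body as text."""
    with urlopen(build_api_url(endpoint, **params), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(build_api_url("fetchone_from", target="www.google.com", filter=65))
```

With the demo server running, `call_api("fetchone_from", target="www.google.com", filter=65)` would request one proxy for that target.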