scrapyd-go
A drop-in replacement for scrapyd that is easier to scale and distribute across any number of commodity machines with no hassle. Each scrapyd-go instance is a stateless microservice. All instances must be connected to the same redis server; redis is used as a centralized registry for all instances, so each instance sees what the others see.
Why
scrapyd isn't bad, but it is very stateful, which makes it hard to deploy in a distributed environment like k8s. I also wanted to add more features, so I started this project as a drop-in replacement for scrapyd, written with a modern, scalable stack: go for the RESTful server and redis as the centralized registry.
TODOs
- schedule.json
- cancel.json
- addversion.json
- listprojects.json
- listversions.json
- listspiders.json
- delproject.json
- delversion.json
- listjobs.json
- daemonstatus.json
- logs/{jobid} (new: real-time output of the job log)
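Since the endpoints above follow scrapyd's API, a client written for scrapyd should in principle work unchanged. The following is a minimal sketch of scheduling a job, assuming scrapyd-go accepts the same `project` and `spider` form fields and returns the same `status`/`jobid` JSON body as upstream scrapyd. The base URL, project name, spider name, and all Go identifiers are placeholders chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// scheduleResponse mirrors the JSON body upstream scrapyd returns from
// schedule.json. Assumption: scrapyd-go answers with the same shape.
type scheduleResponse struct {
	Status  string `json:"status"`
	JobID   string `json:"jobid"`
	Message string `json:"message"`
}

// scheduleForm builds the form body for a schedule.json request.
func scheduleForm(project, spider string) url.Values {
	return url.Values{"project": {project}, "spider": {spider}}
}

// parseSchedule decodes a schedule.json response and returns the job id,
// or an error when the status is anything other than "ok".
func parseSchedule(body []byte) (string, error) {
	var r scheduleResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", err
	}
	if r.Status != "ok" {
		return "", fmt.Errorf("schedule failed: status=%q message=%q", r.Status, r.Message)
	}
	return r.JobID, nil
}

// schedule posts the form to baseURL/schedule.json and returns the job id.
func schedule(baseURL, project, spider string) (string, error) {
	resp, err := http.PostForm(baseURL+"/schedule.json", scheduleForm(project, spider))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return parseSchedule(body)
}

func main() {
	// Placeholder values: point these at a running instance and a deployed project.
	jobID, err := schedule("http://localhost:6800", "myproject", "myspider")
	if err != nil {
		fmt.Println("could not schedule job:", err)
		return
	}
	fmt.Println("scheduled job:", jobID)
}
```

Keeping the response parsing separate from the HTTP call makes it easy to check against a recorded response without a running server.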
Configurations
scrapyd-go is configured entirely through command-line flags:
-dir string
    the directory to use for local caching (default ".scrapyd-go")
-listen string
    the address to bind to (default ":6800")
-max2keep int
    the maximum number of jobs/logs to keep in memory (default 1000000)
-poll int
    time in milliseconds between each poll operation from the queue(s) (default 10)
-python string
    the python binary to use (default "python3")
-redis string
    the redis server address (default "redis://:somepass@localhost:6379/1")
-sync int
    time in seconds between each sync operation (default 15)
-workers int
    the maximum number of workers (default: the number of CPU cores)
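For readers unfamiliar with Go's flag handling, here is a sketch of how the flags listed above could be declared with the standard `flag` package. This is not taken from the scrapyd-go source: the `config` struct and `parseFlags` function are hypothetical, and only the flag names, defaults, and descriptions come from the list above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"runtime"
)

// config is a hypothetical container for the documented flags.
type config struct {
	Dir      string
	Listen   string
	Max2Keep int
	Poll     int
	Python   string
	Redis    string
	Sync     int
	Workers  int
}

// parseFlags parses args using the documented flag names and defaults.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("scrapyd-go", flag.ContinueOnError)
	fs.StringVar(&c.Dir, "dir", ".scrapyd-go", "the directory to use for local caching")
	fs.StringVar(&c.Listen, "listen", ":6800", "the address to bind to")
	fs.IntVar(&c.Max2Keep, "max2keep", 1000000, "the maximum jobs/logs to keep in memory")
	fs.IntVar(&c.Poll, "poll", 10, "time in milliseconds between each poll operation")
	fs.StringVar(&c.Python, "python", "python3", "the python binary to use")
	fs.StringVar(&c.Redis, "redis", "redis://:somepass@localhost:6379/1", "the redis server address")
	fs.IntVar(&c.Sync, "sync", 15, "time in seconds between each sync operation")
	// The documented default is the CPU core count, resolved at startup.
	fs.IntVar(&c.Workers, "workers", runtime.NumCPU(), "the maximum workers count")
	err := fs.Parse(args)
	return c, err
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Any flag that is not passed keeps its default, so in a typical deployment only `-redis` needs to be set explicitly.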
Installation
- binary: go to the releases page and download the release for your OS
- docker:
$ docker pull alash3al/scrapyd-go
- source:
$ go get github.com/alash3al/scrapyd-go
Running
- binary:
$ ./scrapyd_bin_file -redis redis://localhost:6379/1
- docker:
$ docker run --link SomeRedisServerContainer -p 6800:6800 alash3al/scrapyd-go -redis redis://SomeRedisServerContainer:6379/1
- source:
$ scrapyd-go -redis redis://localhost:6379/1
Contributing
- Fork the repo
- Create a feature branch
- Push your changes
- Create a pull request
License
Apache License v2.0
Author
- Mohamed Al Ashaal
- Software Engineer
- m7medalash3al@gmail.com