command-line web crawler implementation in golang


johnstcn/gocrawl


Gocrawl is a small Go library for scraping data from websites. Some command-line utilities are provided as well.

gocrawl

Gocrawl is a simple command-line utility to crawl websites. It uses gopkg.in/xmlpath.v2 to perform HTML parsing and XPath evaluation.

See the examples directory for an example job specification.

Example usage and output:

$ gocrawl -input example/job.json
{
    "day": {
        "error": "",
        "values": [
            "Saturday"
        ]
    },
    "invalid_regexp": {
        "error": "error parsing regexp: missing closing ]: `[a-z+`",
        "values": [
            "Saturday, 2 June 2018"
        ]
    },
    "invalid_xpath": {
        "error": "compiling xml path \"//*[@id=\\\\\\\"ctdat\\\"]\":8: expected a literal string",
        "values": []
    },
    "month": {
        "error": "",
        "values": [
            "June"
        ]
    },
    "no_matching_xpath": {
        "error": "no match for xpath //*[@id=\"cttdat\"]",
        "values": []
    },
    "time": {
        "error": "",
        "values": [
            "13:18:59"
        ]
    }
}
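The output above is a JSON object keyed by field name, where each entry carries an "error" string (empty on success) and a "values" list. A minimal sketch of consuming that output from Go follows. The names FieldResult, decodeOutput and failedFields are hypothetical and are not taken from the gocrawl source; the sample data is a shortened copy of the output shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// FieldResult mirrors one entry of the output shown above.
// Hypothetical type name; gocrawl's own types may differ.
type FieldResult struct {
	Error  string   `json:"error"`
	Values []string `json:"values"`
}

// decodeOutput parses the JSON output into a map keyed by field name.
func decodeOutput(data []byte) (map[string]FieldResult, error) {
	var out map[string]FieldResult
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// failedFields returns the sorted names of fields whose error is non-empty.
func failedFields(out map[string]FieldResult) []string {
	var names []string
	for name, r := range out {
		if r.Error != "" {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Shortened sample based on the output shown above.
	sample := []byte(`{
		"day": {"error": "", "values": ["Saturday"]},
		"no_matching_xpath": {"error": "no match for xpath //*[@id=\"cttdat\"]", "values": []}
	}`)

	out, err := decodeOutput(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("day:", out["day"].Values)
	fmt.Println("failed:", failedFields(out))
}
```

Checking the "error" string per field, rather than the process exit status, matches the way the output reports failures for individual fields alongside successful ones.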

gocrawld

Gocrawld is a daemon version of gocrawl. It accepts a POST request containing a job specification identical to that of gocrawl and returns the result of executing the crawl job, encoded as JSON.

Example usage:

$ gocrawld -host localhost -port 12345 &
<pid>
$ curl -XPOST localhost:12345 --data @example/job.json
<job output will be the same as above except less pretty>

Example docker usage:

docker run --rm --net=host --detach johnstcn/gocrawld
