Web scraper for NodeJS
rchipka/node-osmosis
HTML/XML parser and web scraper for NodeJS.
- Uses native libxml C bindings
- Clean promise-like interface
- Supports CSS 3.0 and XPath 1.0 selector hybrids
- No large dependencies like jQuery, cheerio, or jsdom
- Compose deep and complex data structures
HTML parser features
- Fast parsing
- Very fast searching
- Small memory footprint
HTML DOM features
- Load and search AJAX content
- DOM interaction and events
- Execute embedded and remote scripts
- Execute code in the DOM
HTTP request features
- Logs URLs, redirects, and errors
- Cookie jar and custom cookies/headers/user agent
- Login/form submission, session cookies, and basic auth
- Single proxy or multiple proxies, with handling of proxy failure
- Retries and redirect limits
    var osmosis = require('osmosis');

    osmosis
    .get('www.craigslist.org/about/sites')
    .find('h1 + div a')
    .set('location')
    .follow('@href')
    .find('header + div + div li > a')
    .set('category')
    .follow('@href')
    .paginate('.totallink + a.button.next:first')
    .find('p > a')
    .follow('@href')
    .set({
        'title':       'section > h2',
        'description': '#postingbody',
        'subcategory': 'div.breadbox > span[4]',
        'date':        'time@datetime',
        'latitude':    '#map@data-latitude',
        'longitude':   '#map@data-longitude',
        'images':      ['img@src']
    })
    .data(function(listing) {
        // do something with listing data
    })
    .log(console.log)
    .error(console.log)
    .debug(console.log)
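
Several selectors in the example above end in `@name` (for instance `time@datetime` or `#map@data-latitude`), which reads an attribute of the matched element rather than its text. The sketch below only illustrates how such a string breaks into a selector part and an attribute part. `splitSelector` is a hypothetical helper written for this README, not part of the Osmosis API, and it makes no claim about how Osmosis parses selectors internally.

```javascript
// Hypothetical helper (NOT part of Osmosis): splits a "selector@attribute"
// string into its two parts. A trailing "@name" is treated as an attribute;
// an "@" inside an XPath predicate such as a[@href] is left untouched.
function splitSelector(spec) {
  var match = /^(.*?)@([\w-]+)$/.exec(spec);
  if (!match) {
    // No trailing attribute: the whole string is the selector.
    return { selector: spec, attribute: null };
  }
  return { selector: match[1].trim(), attribute: match[2] };
}

// Selector strings taken from the example above.
['section > h2', 'time@datetime', '#map@data-latitude', '@href']
  .forEach(function (spec) {
    console.log(spec, '=>', JSON.stringify(splitSelector(spec)));
  });
```

An empty selector part, as in `@href`, corresponds to reading the attribute from the current element, which is how the example uses it with `.follow('@href')`.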
For documentation and examples check out https://rchipka.github.io/node-osmosis/global.html
- libxmljs-dom - DOM wrapper for libxmljs C bindings
- needle - Lightweight HTTP wrapper
Please consider a donation if you depend on web scraping and Osmosis makes your job a bit easier. Your contribution allows me to spend more time making this the best web scraper for Node.

