🔍 Tiny, full-text search engine for static websites built with Rust and Wasm
tinysearch is a lightweight, fast, full-text search engine. It is designed for static websites.
tinysearch is written in Rust, and then compiled to WebAssembly to run in a browser.
It can be used together with static site generators such as Jekyll, Hugo, Zola, Cobalt, or Pelican.
The test index file of my blog with around 40 posts creates a WASM payload of 99 kB (49 kB gzipped, 40 kB brotli).
That is smaller than the demo image above; so yes.
tinysearch is a Rust/WASM port of the Python code from the article "Writing a full-text search engine using Bloom filters". It can be seen as an alternative to lunr.js and elasticlunr, which are too heavy for smaller websites and load a lot of JavaScript.
Under the hood it uses a Xor Filter, a data structure for fast approximation of set membership that is smaller than Bloom and cuckoo filters. Each blog post gets converted into a filter that will then be serialized to a binary blob using bincode. Please note that the underlying technologies are subject to change.
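The one-filter-per-post idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a small Bloom filter as a stand-in for the xor filter, and the names `BloomFilter`, `tokenize`, `build_index`, and `search` are hypothetical and not part of tinysearch.

```python
import hashlib
import re


class BloomFilter:
    """Tiny Bloom filter, used here as a stand-in for a xor filter."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, word):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def __contains__(self, word):
        # May report false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(word))


def tokenize(text):
    """Lowercase the text and split it into whole words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """Build one filter per post from its title and optional body."""
    index = []
    for post in posts:
        bloom = BloomFilter()
        for word in tokenize(post["title"] + " " + post.get("body", "")):
            bloom.add(word)
        index.append((post["title"], bloom))
    return index


def search(index, query):
    """Return titles of posts whose filter probably contains every query word."""
    words = tokenize(query)
    return [
        title
        for title, bloom in index
        if words and all(word in bloom for word in words)
    ]
```

Because each filter only answers "is this exact word probably in the post?", a query has to consist of whole words; a prefix such as `wasm` would not reliably match `wasmtime`.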
- Only finds entire words. As a consequence there are no search suggestions (yet). This is a necessary tradeoff for reducing memory usage. A trie data structure was about 10x bigger than the xor filters. New research on compact data structures for prefix searches might lift this limitation in the future.
- Since we bundle all search indices for all articles into one static binary, we recommend using it only for small- to medium-size websites. Expect around 2 kB uncompressed per article (~1 kB compressed).
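As a rough back-of-the-envelope check, the per-article figures above can be turned into a size estimate. The function name `estimate_index_size_kb` is hypothetical, and the numbers are only the rule of thumb stated above; they ignore the fixed size of the search engine itself and will vary with article length.

```python
def estimate_index_size_kb(num_articles, kb_per_article=2.0, compressed_ratio=0.5):
    """Estimate index size in kB from the ~2 kB (raw) / ~1 kB (compressed) rule of thumb."""
    uncompressed = num_articles * kb_per_article
    compressed = uncompressed * compressed_ratio
    return uncompressed, compressed


# A site with 500 articles would add roughly 1000 kB raw, 500 kB compressed.
raw_kb, compressed_kb = estimate_index_size_kb(500)
```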
wasm-pack is required to build the WASM module. Install it with
cargo install wasm-pack
To optimize the JavaScript output, you'll also need terser:
npm install terser -g
If you want to make the WebAssembly as small as possible, we recommend installing binaryen as well. On macOS you can install it with homebrew:
brew install binaryen
Alternatively, you can download the binary from the release page or use your OS package manager.
After that, you can install tinysearch itself:
cargo install tinysearch
A JSON file, which contains the content to index, is required as an input. Please take a look at the example file.
ℹ️ The `body` field in the JSON document is optional and can be skipped to just index post titles.
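If your site generator has no template for this, the input file can also be produced with a short script. The sketch below is an assumption-laden example: the `title` and `body` fields are mentioned above, while the `url` field and the helper names `build_index_entries` and `write_index` are assumptions for illustration. Check the example file for the exact schema.

```python
import json


def build_index_entries(posts, include_body=True):
    """Convert post dicts into index entries; `url` is an assumed field name."""
    entries = []
    for post in posts:
        entry = {"title": post["title"], "url": post["url"]}
        if include_body:
            # Leaving out the body indexes post titles only.
            entry["body"] = post.get("body", "")
        entries.append(entry)
    return entries


def write_index(posts, path="index.json", include_body=True):
    """Write the entries as a JSON array to `path`."""
    entries = build_index_entries(posts, include_body)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh, ensure_ascii=False, indent=2)
```

Calling `write_index(posts, include_body=False)` would produce a smaller file that only supports searching titles.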
Once you have created the index file, you can run
tinysearch fixtures/index.json
This will create a WASM module and the JavaScript glue code to integrate it into your website. You can open `demo.html` from any webserver to see the result.
For example, Python has a built-in webserver that can be used for a quick test:
python3 -m http.server
then browse to http://0.0.0.0:8000/demo.html to run the demo.
You can also take a look at the code examples for different static site generators here.
For advanced usage options, run
tinysearch --help
Please check what's required to host WebAssembly in production; you will need to explicitly set gzip MIME types.
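For local experiments, one way to see the effect of the content type is a small Python server that always serves `.wasm` files as `application/wasm`. This is a minimal sketch for testing only, not a production setup: the names `WasmHandler` and `serve` are hypothetical, and it does not handle compression headers for pre-compressed files.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class WasmHandler(SimpleHTTPRequestHandler):
    # Extend the default extension map so .wasm files are always sent
    # with the content type browsers expect for WebAssembly modules.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".wasm": "application/wasm",
    }


def serve(directory=".", port=8000):
    """Serve `directory` on localhost until interrupted."""
    handler = partial(WasmHandler, directory=directory)
    with ThreadingHTTPServer(("127.0.0.1", port), handler) as httpd:
        httpd.serve_forever()
```

Calling `serve("wasm_output")` would serve that directory at http://127.0.0.1:8000/; a production web server needs the equivalent content-type configuration.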
If you don't have a full Rust setup available, you can also use our nightly-built Docker images.
Here is how to quickly try tinysearch with Docker:
    # Download a sample blog index from endler.dev
    curl -O https://raw.githubusercontent.com/tinysearch/tinysearch/master/fixtures/index.json
    # Create the WASM output
    docker run -v $PWD:/app tinysearch/cli --engine-version path=\"/engine\" --path /app/wasm_output /app/index.json
By default, the most recent stable Alpine Rust image is used. To get nightly, run
docker build --build-arg RUST_IMAGE=rustlang/rust:nightly-alpine -t tinysearch/cli:nightly .
- `WASM_REPO`: Overwrite the wasm-pack repository
- `WASM_BRANCH`: Overwrite the repository branch to use
- `TINY_REPO`: Overwrite repository of tinysearch
- `TINY_BRANCH`: Overwrite tinysearch branch
To integrate tinysearch in continuous deployment pipelines, a GitHub Action is available.
    - name: Build tinysearch
      uses: leonhfr/tinysearch-action@v1
      with:
        index: public/index.json
        output_dir: public/wasm
        output_types: |
          wasm
The following websites use tinysearch:
Are you using tinysearch, too? Add your site here!
- Matthias Endler (@mre)
- Jorge-Luis Betancourt (@jorgelbg)
- Mad Mike (@fluential)
tinysearch is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.