lfreist/x-search
External string searching library (x-search) written in C++ (C++20)
- libboost-program-options1.74-dev (only for examples)
- liblz4-dev
- libzstd-dev
- cmake
- g++ or clang
For installation, we refer to the corresponding Wiki entry: Installation
As a brief example of how to use x-search, we will create a small (very basic) grep-like executable:
```cpp
// my_grep.cpp
#include <xsearch/xsearch.h>
#include <iostream>

int main(int argc, char** argv) {
  auto searcher = xs::extern_search<xs::lines>(argv[1], argv[2], false, 1);
  for (auto const& line : *searcher->getResult()) {
    std::cout << line << '\n';
  }
}
```
Now, just build it and link against xsearch as described here
Done! We have created a grep-like command line search tool. Let's check if it can be as fast as GNU grep...
```bash
# GNU grep:
$ time grep Sherlock opensubtitles.en.txt > /tmp/grep.result
real 0m3.379s
user 0m2.525s
sys 0m0.843s

# Our implementation using x-search
$ time my_grep Sherlock opensubtitles.en.txt > /tmp/my_grep.result
real 0m1.154s
user 0m0.716s
sys 0m0.469s
```
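To put the two timings side by side, the ratio of the wall-clock (`real`) times can be computed as below. This is only a small arithmetic sketch; the helper name `speedup` is made up for illustration and is not part of x-search.

```cpp
#include <cassert>

// Hypothetical helper (not part of x-search): how many times faster the
// candidate ran compared to the baseline, given both run times in seconds.
double speedup(double baseline_seconds, double candidate_seconds) {
  assert(candidate_seconds > 0.0 && "candidate time must be positive");
  return baseline_seconds / candidate_seconds;
}

// Timings taken from the run shown above.
constexpr double kGrepRealSeconds = 3.379;    // GNU grep, real
constexpr double kMyGrepRealSeconds = 1.154;  // my_grep, real
```

With these numbers, `speedup(kGrepRealSeconds, kMyGrepRealSeconds)` is roughly 2.9 for this single run; the same helper applied to the `user` times (2.525 s and 0.716 s) gives roughly 3.5.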
`x-search` provides a simple one-function API call to search on external files.
```cpp
#include <xsearch/xsearch.h>

// count number of matches:
auto res = xs::extern_search<xs::count>(pattern, file_path, meta_file_path, num_threads, max_num_readers);
```
Besides `xs::count`, `xs::extern_search` is specialized for the following template arguments:
- `xs::count_lines`: count lines containing a match
- `xs::match_byte_offsets`: a vector of the byte offsets of all matches
- `xs::line_byte_offsets`: a vector of the byte offsets of matching lines
- `xs::line_indices`: a vector of the line indices of matching lines
- `xs::lines`: a vector of lines (as std::string) containing the match
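To make the difference between these result kinds concrete, here is a naive in-memory reference scan written with the standard library only. It does not use x-search and is not how x-search works internally. The names `ReferenceResults` and `reference_search` are hypothetical, and several details are assumptions made for this sketch: line indices are 0-based, a line's byte offset is the offset of its first byte, lines end with `'\n'`, and overlapping matches are not counted twice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical container, one field per result kind listed above.
struct ReferenceResults {
  std::size_t count = 0;                        // cf. xs::count
  std::size_t count_lines = 0;                  // cf. xs::count_lines
  std::vector<std::size_t> match_byte_offsets;  // cf. xs::match_byte_offsets
  std::vector<std::size_t> line_byte_offsets;   // cf. xs::line_byte_offsets
  std::vector<std::size_t> line_indices;        // cf. xs::line_indices
  std::vector<std::string> lines;               // cf. xs::lines
};

// Scans `text` line by line and records every non-overlapping match.
ReferenceResults reference_search(std::string_view text, std::string_view pattern) {
  assert(!pattern.empty() && "an empty pattern would match everywhere");
  ReferenceResults out;
  std::size_t line_start = 0;
  std::size_t line_index = 0;
  while (line_start < text.size()) {
    std::size_t line_end = text.find('\n', line_start);
    if (line_end == std::string_view::npos) line_end = text.size();
    const std::string_view line = text.substr(line_start, line_end - line_start);

    bool line_matched = false;
    for (std::size_t pos = line.find(pattern); pos != std::string_view::npos;
         pos = line.find(pattern, pos + pattern.size())) {
      out.match_byte_offsets.push_back(line_start + pos);  // offset in the whole text
      ++out.count;
      line_matched = true;
    }
    if (line_matched) {
      ++out.count_lines;
      out.line_byte_offsets.push_back(line_start);
      out.line_indices.push_back(line_index);
      out.lines.emplace_back(line);
    }
    line_start = line_end + 1;  // skip the newline
    ++line_index;
  }
  return out;
}
```

For the text `"Sherlock Holmes\nDr. Watson\nSherlock and Sherlock\n"` and the pattern `Sherlock`, this sketch finds 3 matches on 2 lines, at byte offsets 0, 27 and 40; the matching lines start at byte offsets 0 and 27 and have indices 0 and 2.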
After calling `xs::extern_search`, the returned shared_ptr of the Searcher instance can be...
- ... joined (`res->join()`): the main thread sleeps until the search process finishes
- ... used to access already created results using the iterator:
  ```cpp
  for (auto r : *res->getResult()) { ... }
  ```
- ... ignored: the threads started for the search are joined on destruction.
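The three options above follow a common pattern for objects that own a background thread. The toy class below sketches that pattern with `std::thread` only. It is a stand-in written for illustration, not the x-search `Searcher`: the names `ToySearcher`, `snapshot` and `search_and_wait` are hypothetical, and `snapshot` returns a copy of the results found so far instead of the iterable result object that `getResult()` provides.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in: searches `input` for `pattern` on a background thread.
class ToySearcher {
 public:
  ToySearcher(std::vector<std::string> input, std::string pattern)
      : worker_([this, input = std::move(input), pattern = std::move(pattern)] {
          for (const std::string& line : input) {
            if (line.find(pattern) != std::string::npos) {
              std::lock_guard<std::mutex> lock(mutex_);
              results_.push_back(line);
            }
          }
        }) {}

  // "ignored": destroying the object waits for the worker thread.
  ~ToySearcher() { join(); }

  ToySearcher(const ToySearcher&) = delete;
  ToySearcher& operator=(const ToySearcher&) = delete;

  // "joined": block the caller until the search has finished.
  void join() {
    if (worker_.joinable()) worker_.join();
  }

  // "access already created results": copy of what has been found so far.
  std::vector<std::string> snapshot() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return results_;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::string> results_;
  std::thread worker_;  // declared last so it starts after the members above exist
};

// Convenience wrapper: start a search, wait for it, return all matching lines.
std::vector<std::string> search_and_wait(std::vector<std::string> input, std::string pattern) {
  ToySearcher searcher(std::move(input), std::move(pattern));
  searcher.join();
  return searcher.snapshot();
}
```

Calling `snapshot()` before `join()` may return only part of the final result, which mirrors the idea of reading results while the search is still running. Letting a `ToySearcher` go out of scope without calling `join()` is also safe here, because the destructor joins the worker.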