Getting started with C++
The open-source SVS library supports all documented features except our proprietary vector compression (LVQ and LeanVec), which are exclusive to Intel hardware and available via our shared library and PyPI package. This tutorial shows how to use the shared library to enable our proprietary vector compression and unlock significant performance and memory gains!
We also provide an example using the open-source SVS library only.
Building
Building the SVS example should be relatively straightforward. We test on Ubuntu 22.04 LTS, but any Linux distribution should work.
Prerequisites
A C++20 capable compiler:
GCC >= 11.0
Clang >= 18.0 (note that the shared library requires a higher Clang version than the open-source library)
CMake build
To build and run the SVS example using our shared library, run the following commands:
    git clone https://github.com/intel/ScalableVectorSearch
    cd ScalableVectorSearch/examples/cpp/shared
    mkdir build && cd build
    cmake ..
    make -j
    ./example_vamana_with_compression
See CMakeLists.txt for details on how the shared library is used, and remember to update the link in CMakeLists.txt to download the latest shared library release.
Step by step example using vector compression
Here is a step-by-step explanation of the example that showcases the most important features of SVS. We will use the random dataset included in SVS for testing in data/test_dataset.
Compress the data
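To boost performance and reduce memory usage, we first compress the data with our vector compression techniques: LeanVec reduces the dimensionality to leanvec_dim dimensions, and LVQ quantizes the result using 4 bits for the primary level and 8 bits for the secondary level. See Vector compression and Choosing the Right Compression for details.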
    const size_t num_threads = 4;
    size_t padding = 32;
    size_t leanvec_dim = 64;
    auto threadpool = svs::threads::as_threadpool(num_threads);
    auto loaded =
        svs::VectorDataLoader<float>(std::filesystem::path(SVS_DATA_DIR) / "data_f32.svs")
            .load();
    auto data = svs::leanvec::LeanDataset<
        svs::leanvec::UsingLVQ<4>,
        svs::leanvec::UsingLVQ<8>,
        svs::Dynamic,
        svs::Dynamic>::
        reduce(
            loaded,
            std::nullopt,
            threadpool,
            padding,
            svs::lib::MaybeStatic<svs::Dynamic>(leanvec_dim)
        );
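For a rough intuition of why quantization saves memory, here is a minimal sketch of plain per-vector min-max scalar quantization to 8 bits. It is not LVQ or LeanVec and not how SVS implements compression (those techniques are proprietary); the names `QuantizedVector`, `quantize`, and `dequantize` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical container: one offset/step pair per vector plus 8-bit codes.
struct QuantizedVector {
    float lo;                        // minimum value of the original vector
    float step;                      // width of one quantization bucket
    std::vector<std::uint8_t> codes; // one byte per dimension instead of four
};

// Map each float to the nearest of 256 evenly spaced levels in [min, max].
QuantizedVector quantize(const std::vector<float>& v) {
    assert(!v.empty());
    const auto [min_it, max_it] = std::minmax_element(v.begin(), v.end());
    const float lo = *min_it;
    const float range = *max_it - lo;
    // A constant vector has zero range; any positive step reproduces it exactly.
    const float step = range > 0.0f ? range / 255.0f : 1.0f;
    QuantizedVector q{lo, step, {}};
    q.codes.reserve(v.size());
    for (float x : v) {
        const long code = std::lround((x - lo) / step);
        q.codes.push_back(static_cast<std::uint8_t>(std::clamp(code, 0L, 255L)));
    }
    return q;
}

// Reconstruct an approximation of the original vector from its codes.
std::vector<float> dequantize(const QuantizedVector& q) {
    std::vector<float> out;
    out.reserve(q.codes.size());
    for (std::uint8_t c : q.codes) {
        out.push_back(q.lo + q.step * static_cast<float>(c));
    }
    return out;
}
```

In this sketch each component costs one byte instead of four, and the reconstruction error per component is bounded by roughly half of `step`.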
Building the index
To search effectively, first build a graph-based index linking related data vectors. We'll keep the default hyperparameters; exact values can be tuned later based on the dataset. For dynamic indexing (adding and removing points over time), see this example.
    auto parameters = svs::index::vamana::VamanaBuildParameters{};
    svs::Vamana index = svs::Vamana::build<float>(
        parameters, data, svs::distance::DistanceL2(), num_threads
    );
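To illustrate what "a graph linking related data vectors" means, the following sketch builds a naive k-nearest-neighbor graph by brute force. This is an assumption-laden simplification and not the Vamana construction used by SVS, which builds its graph differently and is controlled by the build parameters. `Points`, `Graph`, `squared_l2`, and `build_knn_graph` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Points = std::vector<std::vector<float>>;      // one row per data vector
using Graph = std::vector<std::vector<std::size_t>>; // adjacency lists by id

// Squared Euclidean distance between two vectors of equal length.
float squared_l2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Connect every point to its `degree` closest other points (O(n^2) distances).
Graph build_knn_graph(const Points& points, std::size_t degree) {
    Graph graph(points.size());
    for (std::size_t i = 0; i < points.size(); ++i) {
        std::vector<std::pair<float, std::size_t>> dists;
        for (std::size_t j = 0; j < points.size(); ++j) {
            if (j != i) {
                dists.emplace_back(squared_l2(points[i], points[j]), j);
            }
        }
        const std::size_t keep = std::min(degree, dists.size());
        const auto middle = dists.begin() + static_cast<std::ptrdiff_t>(keep);
        // Pairs compare by distance first, then by id, so ties are deterministic.
        std::partial_sort(dists.begin(), middle, dists.end());
        for (std::size_t k = 0; k < keep; ++k) {
            graph[i].push_back(dists[k].second);
        }
    }
    return graph;
}
```

The quadratic cost of this brute-force approach is one reason practical indices rely on dedicated construction algorithms instead.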
Searching the index
The graph is built; we can now query it. Load the queries from disk and set search_window_size: larger values improve accuracy but reduce speed (see How to Set the Search Window Size).
    const size_t search_window_size = 50;
    const size_t n_neighbors = 10;
    index.set_search_window_size(search_window_size);
    auto queries =
        svs::load_data<float>(std::filesystem::path(SVS_DATA_DIR) / "queries_f32.fvecs");
    auto results = index.search(queries, n_neighbors);
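The accuracy/speed trade-off of the search window can be seen in a simplified greedy best-first graph search that keeps only the `window` closest candidates. This is a sketch of the general technique under that assumption, not the SVS search routine; `greedy_search`, `Points`, `Graph`, and `squared_l2` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Points = std::vector<std::vector<float>>;      // one row per data vector
using Graph = std::vector<std::vector<std::size_t>>; // adjacency lists by id

float squared_l2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Returns the ids of up to `window` closest points found, nearest first.
std::vector<std::size_t> greedy_search(
    const Graph& graph,
    const Points& points,
    const std::vector<float>& query,
    std::size_t entry,
    std::size_t window
) {
    assert(window > 0 && entry < points.size());
    struct Candidate {
        float dist;
        std::size_t id;
        bool expanded;
    };
    std::vector<Candidate> frontier{{squared_l2(points[entry], query), entry, false}};
    std::vector<bool> visited(points.size(), false);
    visited[entry] = true;
    while (true) {
        // Pick the closest candidate whose neighbors have not been explored yet.
        auto next = std::find_if(frontier.begin(), frontier.end(), [](const Candidate& c) {
            return !c.expanded;
        });
        if (next == frontier.end()) {
            break; // every candidate in the window is expanded: search converged
        }
        next->expanded = true;
        const std::size_t current = next->id; // copy before the vector is modified
        for (std::size_t nb : graph[current]) {
            if (!visited[nb]) {
                visited[nb] = true;
                frontier.push_back({squared_l2(points[nb], query), nb, false});
            }
        }
        std::sort(frontier.begin(), frontier.end(), [](const Candidate& a, const Candidate& b) {
            return a.dist < b.dist;
        });
        // Keep only the `window` best candidates; a small window discards more.
        if (frontier.size() > window) {
            frontier.erase(
                frontier.begin() + static_cast<std::ptrdiff_t>(window), frontier.end()
            );
        }
    }
    std::vector<std::size_t> ids;
    for (const Candidate& c : frontier) {
        ids.push_back(c.id);
    }
    return ids;
}
```

With a window of 1 the search can get stuck on a locally good but wrong candidate, while a larger window keeps alternative paths alive at the cost of more distance computations.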
After searching, we compare the results with the ground-truth and print the obtained recall.
    auto groundtruth = svs::load_data<int>(
        std::filesystem::path(SVS_DATA_DIR) / "groundtruth_euclidean.ivecs"
    );
    double recall = svs::k_recall_at_n(groundtruth, results, n_neighbors, n_neighbors);
    fmt::print("Recall@{} = {:.4f}\n", n_neighbors, recall);
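To make the printed number concrete: k-recall@n is commonly defined as the fraction of the true k nearest neighbors that appear among the first n returned results, averaged over all queries. The sketch below computes it under that assumed definition on plain nested vectors; `k_recall_at_n_sketch` and `IdMatrix` are hypothetical names, not the `svs::k_recall_at_n` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using IdMatrix = std::vector<std::vector<std::uint32_t>>; // one row of ids per query

// Fraction of the first k ground-truth ids found in the first n result ids.
double k_recall_at_n_sketch(
    const IdMatrix& groundtruth, const IdMatrix& results, std::size_t k, std::size_t n
) {
    assert(groundtruth.size() == results.size());
    assert(!groundtruth.empty() && k > 0);
    std::size_t hits = 0;
    for (std::size_t q = 0; q < groundtruth.size(); ++q) {
        const auto& truth = groundtruth[q];
        const auto& found = results[q];
        assert(truth.size() >= k && found.size() >= n);
        const auto found_end = found.begin() + static_cast<std::ptrdiff_t>(n);
        for (std::size_t i = 0; i < k; ++i) {
            if (std::find(found.begin(), found_end, truth[i]) != found_end) {
                ++hits;
            }
        }
    }
    return static_cast<double>(hits) / static_cast<double>(k * groundtruth.size());
}
```

For example, if a query has true neighbors {1, 2, 3} and the search returns {3, 1, 9}, two of the three true neighbors were found, which contributes 2/3 for that query.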
Saving and loading the index
If you are satisfied with the performance of the generated index, you can save it to disk to avoid rebuilding it in the future.
    index.save("config", "graph", "data");
    index = svs::Vamana::assemble<float>(
        "config",
        svs::GraphLoader("graph"),
        svs::lib::load_from_disk<svs::leanvec::LeanDataset<
            svs::leanvec::UsingLVQ<4>,
            svs::leanvec::UsingLVQ<8>,
            svs::Dynamic,
            svs::Dynamic>>("data", padding),
        svs::distance::DistanceL2(),
        num_threads
    );
Note
The save index function currently uses three folders for saving. All three are needed to be able to reload the index.
One folder for the graph.
One folder for the data.
One folder for metadata.
This is subject to change in the future.
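Because all three folders are required, it can be convenient to verify that they exist before attempting a reload. The following is a minimal sketch using only the C++ standard library; `saved_index_complete` is a hypothetical helper and not part of SVS, and it checks only that the directories exist, not that their contents are valid.

```cpp
#include <filesystem>
#include <initializer_list>
#include <system_error>

// Hypothetical helper: true only if all three saved-index directories exist.
bool saved_index_complete(
    const std::filesystem::path& config_dir,
    const std::filesystem::path& graph_dir,
    const std::filesystem::path& data_dir
) {
    namespace fs = std::filesystem;
    for (const fs::path* dir : {&config_dir, &graph_dir, &data_dir}) {
        std::error_code ec; // non-throwing overload: errors count as "missing"
        if (!fs::is_directory(*dir, ec)) {
            return false;
        }
    }
    return true;
}
```

With the directory names from the example above, a call such as `saved_index_complete("config", "graph", "data")` could guard the reload and fall back to rebuilding the index when it returns false.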
Entire example
    /*
     * Copyright 2025 Intel Corporation
     *
     * Licensed under the Apache License, Version 2.0 (the "License");
     * you may not use this file except in compliance with the License.
     * You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */

    // SVS
    #include "svs/core/recall.h"
    #include "svs/extensions/flat/leanvec.h"
    #include "svs/extensions/flat/lvq.h"
    #include "svs/extensions/vamana/leanvec.h"
    #include "svs/extensions/vamana/lvq.h"
    #include "svs/orchestrators/dynamic_vamana.h"
    #include "svs/orchestrators/exhaustive.h"
    #include "svs/orchestrators/vamana.h"

    int main() {
        // STEP 1: Compress Data with LeanVec, reducing dimensionality to leanvec_dim dimensions
        // and using 4 and 8 bits for primary and secondary levels respectively.
        //! [Compress data]
        const size_t num_threads = 4;
        size_t padding = 32;
        size_t leanvec_dim = 64;
        auto threadpool = svs::threads::as_threadpool(num_threads);
        auto loaded =
            svs::VectorDataLoader<float>(std::filesystem::path(SVS_DATA_DIR) / "data_f32.svs")
                .load();
        auto data = svs::leanvec::LeanDataset<
            svs::leanvec::UsingLVQ<4>,
            svs::leanvec::UsingLVQ<8>,
            svs::Dynamic,
            svs::Dynamic>::
            reduce(
                loaded,
                std::nullopt,
                threadpool,
                padding,
                svs::lib::MaybeStatic<svs::Dynamic>(leanvec_dim)
            );
        //! [Compress data]

        // STEP 2: Build Vamana Index
        //! [Index Build]
        auto parameters = svs::index::vamana::VamanaBuildParameters{};
        svs::Vamana index = svs::Vamana::build<float>(
            parameters, data, svs::distance::DistanceL2(), num_threads
        );
        //! [Index Build]

        // STEP 3: Search the Index
        //! [Perform Queries]
        const size_t search_window_size = 50;
        const size_t n_neighbors = 10;
        index.set_search_window_size(search_window_size);
        auto queries =
            svs::load_data<float>(std::filesystem::path(SVS_DATA_DIR) / "queries_f32.fvecs");
        auto results = index.search(queries, n_neighbors);
        //! [Perform Queries]

        //! [Recall]
        auto groundtruth = svs::load_data<int>(
            std::filesystem::path(SVS_DATA_DIR) / "groundtruth_euclidean.ivecs"
        );
        double recall = svs::k_recall_at_n(groundtruth, results, n_neighbors, n_neighbors);
        fmt::print("Recall@{} = {:.4f}\n", n_neighbors, recall);
        //! [Recall]

        // STEP 4: Saving and reloading the index
        //! [Saving Loading]
        index.save("config", "graph", "data");
        index = svs::Vamana::assemble<float>(
            "config",
            svs::GraphLoader("graph"),
            svs::lib::load_from_disk<svs::leanvec::LeanDataset<
                svs::leanvec::UsingLVQ<4>,
                svs::leanvec::UsingLVQ<8>,
                svs::Dynamic,
                svs::Dynamic>>("data", padding),
            svs::distance::DistanceL2(),
            num_threads
        );
        //! [Saving Loading]

        index.set_search_window_size(search_window_size);
        // Search again so that the recall below reflects the reloaded index
        // rather than the results obtained before saving.
        results = index.search(queries, n_neighbors);
        recall = svs::k_recall_at_n(groundtruth, results, n_neighbors, n_neighbors);
        fmt::print("Recall@{} after saving and reloading = {:.4f}\n", n_neighbors, recall);

        return 0;
    }
Using open-source SVS only
Building and installing
Prerequisites
A C++20 capable compiler:
GCC >= 11.0
Clang >= 13.0
To build SVS and the included examples, use the following:
    git clone https://github.com/intel/ScalableVectorSearch
    cd ScalableVectorSearch
    mkdir build && cd build
    cmake .. -DSVS_BUILD_EXAMPLES=YES
    cmake --build . -j$(nproc)
Run this command to confirm that SVS is installed correctly. It should print some types, like float32.
    examples/cpp/types
Run this command to execute the example:
    examples/cpp/vamana ../data/test_dataset/data_f32.fvecs ../data/test_dataset/queries_f32.fvecs ../data/test_dataset/groundtruth_euclidean.ivecs
Entire example
    /*
     * Copyright 2023 Intel Corporation
     *
     * Licensed under the Apache License, Version 2.0 (the "License");
     * you may not use this file except in compliance with the License.
     * You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */

    //! [Example All]

    //! [Includes]
    // SVS Dependencies
    #include "svs/orchestrators/vamana.h"       // bulk of the dependencies required.
    #include "svs/core/recall.h"                // Convenient k-recall@n computation.
    #include "svs/extensions/vamana/scalar.h"   // SQ vamana extensions.
    #include "svs/quantization/scalar/scalar.h" // SQ implementation.

    // Alternative main definition
    #include "svsmain.h"

    // stl
    #include <map>
    #include <string>
    #include <string_view>
    #include <vector>
    //! [Includes]

    //! [Helper Utilities]
    double run_recall(
        svs::Vamana& index,
        const svs::data::SimpleData<float>& queries,
        const svs::data::SimpleData<uint32_t>& groundtruth,
        size_t search_window_size,
        size_t num_neighbors,
        std::string_view message = ""
    ) {
        index.set_search_window_size(search_window_size);
        auto results = index.search(queries, num_neighbors);
        double recall = svs::k_recall_at_n(groundtruth, results, num_neighbors, num_neighbors);
        if (!message.empty()) {
            fmt::print("[{}] ", message);
        }
        fmt::print("Windowsize = {}, Recall = {}\n", search_window_size, recall);
        return recall;
    }

    const bool DEBUG = false;
    void check(double expected, double got, double eps = 0.005) {
        double diff = std::abs(expected - got);
        if constexpr (DEBUG) {
            fmt::print("Expected {}. Got {}\n", expected, got);
        } else {
            if (diff > eps) {
                throw ANNEXCEPTION("Expected ", expected, ". Got ", got, '!');
            }
        }
    }
    //! [Helper Utilities]

    // Alternative main definition
    int svs_main(std::vector<std::string> args) {
        //! [Argument Extraction]
        const size_t nargs = args.size();
        if (nargs != 4) {
            throw ANNEXCEPTION("Expected 3 arguments. Instead, got ", nargs, '!');
        }
        const std::string& data_vecs = args.at(1);
        const std::string& query_vecs = args.at(2);
        const std::string& groundtruth_vecs = args.at(3);
        //! [Argument Extraction]

        // Building the index

        //! [Build Parameters]
        auto parameters = svs::index::vamana::VamanaBuildParameters{
            1.2,  // alpha
            64,   // graph max degree
            128,  // search window size
            1024, // max candidate pool size
            60,   // prune to degree
            true, // full search history
        };
        //! [Build Parameters]

        //! [Index Build]
        size_t num_threads = 4;
        svs::Vamana index = svs::Vamana::build<float>(
            parameters, svs::VectorDataLoader<float>(data_vecs), svs::DistanceL2(), num_threads
        );
        //! [Index Build]

        // Searching the index

        //! [Load Aux]
        // Load the queries and ground truth.
        auto queries = svs::load_data<float>(query_vecs);
        auto groundtruth = svs::load_data<uint32_t>(groundtruth_vecs);
        //! [Load Aux]

        //! [Perform Queries]
        index.set_search_window_size(30);
        svs::QueryResult<size_t> results = index.search(queries, 10);
        double recall = svs::k_recall_at_n(groundtruth, results);
        check(0.8215, recall);
        //! [Perform Queries]

        //! [Search Window Size]
        auto expected_recall =
            std::map<size_t, double>({{10, 0.5509}, {20, 0.7281}, {30, 0.8215}, {40, 0.8788}});
        for (auto windowsize : {10, 20, 30, 40}) {
            recall = run_recall(index, queries, groundtruth, windowsize, 10, "Sweep");
            check(expected_recall.at(windowsize), recall);
        }
        //! [Search Window Size]

        // Saving the index

        //! [Saving]
        index.save("example_config", "example_graph", "example_data");
        //! [Saving]

        // Reloading a saved index

        //! [Loading]
        // We can reload an index from a previously saved set of files.
        index = svs::Vamana::assemble<float>(
            "example_config",
            svs::GraphLoader("example_graph"),
            svs::VectorDataLoader<float>("example_data"),
            svs::DistanceType::L2,
            4 // num_threads
        );

        recall = run_recall(index, queries, groundtruth, 30, 10, "Reload");
        check(0.8215, recall);
        //! [Loading]

        //! [Only Loading]
        // We can reload an index from a previously saved set of files.
        index = svs::Vamana::assemble<float>(
            "example_config",
            svs::GraphLoader("example_graph"),
            svs::VectorDataLoader<float>("example_data"),
            svs::DistanceType::L2,
            4 // num_threads
        );
        //! [Only Loading]

        //! [Set a new thread pool with n-threads]
        index.set_threadpool(svs::threads::DefaultThreadPool(4));
        //! [Set a new thread pool with n-threads]

        //! [Compressed Loader]
        // Quantization
        namespace scalar = svs::quantization::scalar;

        // Wrap the compressor object in a lazy functor.
        // This will defer loading and compression of the SQ dataset until the threadpool
        // used in the index has been created.
        auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
            auto data = svs::VectorDataLoader<float, 128>("example_data").load();
            return scalar::SQDataset<std::int8_t, 128>::compress(data, threadpool);
        });
        index = svs::Vamana::assemble<float>(
            "example_config",
            svs::GraphLoader("example_graph"),
            compressor,
            svs::DistanceType::L2,
            4
        );
        recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed load");
        check(0.8190, recall);
        //! [Compressed Loader]

        //! [Build Index Compressed]
        // Compressed building
        index =
            svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
        recall = run_recall(index, queries, groundtruth, 30, 10, "Compressed Build");
        check(0.8212, recall);
        //! [Build Index Compressed]

        return 0;
    }

    // Special main providing some helpful utilities.
    SVS_DEFINE_MAIN();
    //! [Example All]