Common Python API
Memory Allocators
- class svs.DRAM
Small class for an allocator capable of using huge pages. Prioritizes page use in the order: 1 GiB, 2 MiB, 4 KiB. See Huge Pages for more information on what huge pages are and how to allocate them on your system.
Enums
- class svs.DistanceType
Select which distance function to use.
Members:
L2 : Euclidean Distance (minimize)
MIP : Maximum Inner Product (maximize)
Cosine : Cosine similarity (maximize)
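As a rough illustration of the "minimize" and "maximize" annotations above, the following pure-Python sketch ranks two candidate vectors against a query under each metric. The helper names are hypothetical and are not part of svs. Whether svs uses the squared or unsquared Euclidean distance internally is not stated here; the ranking is the same either way.

```python
import math

def l2_distance(a, b):
    # Euclidean distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # Inner product: larger means more similar.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Inner product of the normalized vectors: larger means more similar.
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

query = [1.0, 0.0]
candidates = {"near": [0.9, 0.1], "far": [-1.0, 0.0]}

# L2 picks the minimum; MIP and Cosine pick the maximum.
best_l2 = min(candidates, key=lambda k: l2_distance(query, candidates[k]))
best_mip = max(candidates, key=lambda k: inner_product(query, candidates[k]))
best_cos = max(candidates, key=lambda k: cosine_similarity(query, candidates[k]))
```

All three metrics select the same candidate in this example, but each one requires the right optimization direction to do so.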
- class svs.DataType
Datatype Selector
Members:
uint8 : 8-bit unsigned integer.
uint16 : 16-bit unsigned integer.
uint32 : 32-bit unsigned integer.
uint64 : 64-bit unsigned integer.
int8 : 8-bit signed integer.
int16 : 16-bit signed integer.
int32 : 32-bit signed integer.
int64 : 64-bit signed integer.
float16 : 16-bit IEEE floating point.
float32 : 32-bit IEEE floating point.
float64 : 64-bit IEEE floating point.
Helper Functions
- svs.read_vecs(filename)
Read a file in the bvecs/fvecs/ivecs format and return a NumPy array with the results.
The data type of the returned array is determined by the file extension with the following mapping:
bvecs: 8-bit unsigned integers.
fvecs: 32-bit floating point numbers.
ivecs: 32-bit signed integers.
- Parameters:
filename (str) – The file to read.
- Returns:
NumPy array with the results.
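The on-disk layout is not spelled out here. The sketch below assumes the conventional layout of the vecs family: each vector is stored as a little-endian 32-bit integer giving its dimension, followed by that many elements. It reads an fvecs file with the standard library only, to show what svs.read_vecs has to parse. The function read_fvecs_rows is hypothetical and is not how svs implements reading.

```python
import os
import struct
import tempfile

def read_fvecs_rows(path):
    """Read an fvecs file into a list of rows (assumed layout: int32 dim + dim float32)."""
    rows = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) != 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            rows.append(list(struct.unpack(f"<{dim}f", payload)))
    return rows

# Round-trip demo: write two vectors in the assumed layout, then read them back.
vectors = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.fvecs")
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack("<i", len(v)))
            f.write(struct.pack(f"<{len(v)}f", *v))
    loaded = read_fvecs_rows(path)
```

In practice svs.read_vecs(filename) returns a NumPy array directly; the sketch only makes the per-vector header explicit.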
- svs.write_vecs(array, filename, skip_check=False)
- Parameters:
array (array) – The raw array to save.
filename (str) – The file where the results will be saved.
skip_check (bool) – By default, this function checks whether the file extension of the vecs file is appropriate for the given array (see list below). Passing skip_check=True overrides this check and forces creation of the file.
- Result:
The array is saved to the requested file.
File extension to array element type:
fvecs: np.float32
hvecs: np.float16
ivecs: np.uint32
bvecs: np.uint8
Warning
The user must specify the file extension corresponding to the desired file format in the filename argument of svs.write_vecs().
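The extension check described above can be approximated before calling svs.write_vecs. The following is a minimal sketch, assuming the extension-to-element-type list given above is complete. The names EXTENSION_TO_DTYPE and extension_matches are hypothetical helpers, not part of svs, and the check svs actually performs may differ in detail.

```python
from pathlib import Path

# Mirrors the documented "file extension to array element type" list.
EXTENSION_TO_DTYPE = {
    ".fvecs": "float32",
    ".hvecs": "float16",
    ".ivecs": "uint32",
    ".bvecs": "uint8",
}

def extension_matches(filename, dtype_name):
    """Return True if the file extension is the documented one for dtype_name."""
    expected = EXTENSION_TO_DTYPE.get(Path(filename).suffix)
    return expected == dtype_name

# Example: a float32 array belongs in an .fvecs file, not an .hvecs file.
ok = extension_matches("data.fvecs", "float32")
mismatch = extension_matches("data.hvecs", "float32")
```

With a NumPy array, the second argument would typically be array.dtype.name.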
- svs.read_svs(filename, dtype=<class 'numpy.float32'>)
Read the svs native data file as a NumPy array. Note: As of now, no type checking is performed. Make sure the requested type actually matches the contents of the file.
- Parameters:
filename (str) – The file to read.
dtype – The data type of the encoded vectors in the file.
- Result:
A NumPy matrix with the results.
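To see why the requested dtype must match the file contents, the sketch below reinterprets the same bytes under two element types. It uses only the standard library and does not read the svs native format. It simply shows that a mismatched type yields well-formed but meaningless numbers rather than an error.

```python
import struct

# Two float32 values as they would sit in memory or on disk (little-endian).
raw = struct.pack("<2f", 1.0, 2.0)

# Interpreting the bytes with the correct element type recovers the values.
as_float32 = struct.unpack("<2f", raw)

# Interpreting the same bytes as uint32 succeeds silently with unrelated numbers.
as_uint32 = struct.unpack("<2I", raw)
```

Here as_uint32 holds the raw IEEE 754 bit patterns of 1.0 and 2.0, which is the kind of silent corruption the note above warns about.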
- svs.convert_fvecs_to_float16(source_file: str, destination_file: str) → None
Convert the fvecs file on disk with 32-bit floating point entries to a fvecs file with 16-bit floating point entries.
- Parameters:
source_file – The source file path to convert.
destination_file – The destination file to generate.
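Converting to 16-bit floating point halves the storage per element at the cost of precision. The sketch below illustrates the rounding on single values using the standard library's half-precision format code. The helper to_float16_and_back is hypothetical and says nothing about how svs performs the conversion.

```python
import struct

def to_float16_and_back(x):
    """Round a Python float to IEEE half precision and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

exact = to_float16_and_back(1.0)    # representable in float16, so unchanged
rounded = to_float16_and_back(0.1)  # not representable, so slightly altered
error = abs(rounded - 0.1)
```

Half precision keeps roughly three significant decimal digits and its largest finite value is 65504, so it is worth checking that the data tolerates this before converting.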
- svs.generate_test_dataset(nvectors, nqueries, ndims, directory, data_seed=None, query_seed=None, num_threads=1, num_neighbors=100, distance=<DistanceType.L2: 0>)
Generate a sample dataset consisting of the base data, queries, and groundtruth, all in the standard *vecs form.
- Parameters:
nvectors (int) – The number of base vectors in the generated dataset.
nqueries (int) – The number of query vectors in the generated dataset.
ndims (int) – The number of dimensions per vector in the dataset.
directory (str) – The directory in which to generate the dataset.
data_seed (optional) – The seed to use for random number generation in the dataset.
query_seed (optional) – The seed to use for random number generation for the queries.
num_threads (optional) – Number of threads to use to generate the groundtruth.
num_neighbors (int) – The number of neighbors to compute for the groundtruth.
distance (optional) – The distance metric to use for groundtruth generation.
Creates directory if it does not already exist. The following files are generated:
$(directory)/data.fvecs: The dataset encoded using float32 as fvecs.
$(directory)/queries.fvecs: The queries encoded using float32 as fvecs.
$(directory)/groundtruth.ivecs: The computed num_neighbors nearest neighbors of the queries in the dataset with respect to the provided distance.
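A possible way to combine svs.generate_test_dataset with svs.read_vecs is sketched below. The helpers dataset_paths and build_and_load are hypothetical. The sizes and the integer seeds are arbitrary illustrative values, and integer seeds are an assumption since the accepted seed type is not documented here. The svs import is deferred into the function so the path helper can be used on its own.

```python
import os

def dataset_paths(directory):
    """Return the three file paths that generate_test_dataset is documented to create."""
    return {
        "data": os.path.join(directory, "data.fvecs"),
        "queries": os.path.join(directory, "queries.fvecs"),
        "groundtruth": os.path.join(directory, "groundtruth.ivecs"),
    }

def build_and_load(directory):
    """Generate a small dataset and load each file back as a NumPy array."""
    import svs  # deferred import: only needed when this function is called

    # Positional arguments follow the documented order:
    # nvectors, nqueries, ndims, directory.
    svs.generate_test_dataset(
        1000, 10, 32, directory,
        data_seed=1234,    # assumed to accept an integer
        query_seed=5678,   # assumed to accept an integer
        num_neighbors=10,
    )
    return {name: svs.read_vecs(path) for name, path in dataset_paths(directory).items()}

paths = dataset_paths("example_dataset")
```

Calling build_and_load("example_dataset") would then return the base vectors, the queries, and the groundtruth neighbor indices keyed by name.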
- svs.convert_vecs_to_svs(vecs_file: str, svs_file: str, dtype: svs::python.DataType = <DataType.float32: 10>) → None
Convert the vecs file (containing the specified element types) to the svs native format.
- Parameters:
vecs_file – The source [f/h/i/b]vecs file.
svs_file – The destination native file.
dtype – The svs.DataType of the vecs file. Supported types: (float32, float16, uint32, and uint8).
File extension type map:
fvecs = svs.DataType.float32
hvecs = svs.DataType.float16
ivecs = svs.DataType.uint32
bvecs = svs.DataType.uint8