Loaders

Uncompressed File Loaders

class svs.VectorDataLoader

Handle representing an uncompressed vector data file.

__init__(self: svs::python.VectorDataLoader, path: str, data_type: svs::python.DataType | None = None, dims: typing.SupportsInt | None = None) -> None

Construct a new svs.VectorDataLoader.

Parameters:
  • path (str) –

    The path to the file to load. This can either be:

    • The path to the directory where a previous vector dataset was saved (preferred).

    • The direct path to the vector data file itself. In this case, the file type is inferred automatically where possible. Recognized extensions: “.[b/i/f]vecs”, “.bin”, and “.svs”.

  • data_type (svs.DataType) – The native type of the elements in the dataset.

  • dims (int) – The expected dimensionality of the dataset. While this argument is generally optional, providing it may yield runtime speedups.
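The two accepted forms of path can be told apart with a few lines of standard-library code. The sketch below is illustrative only: classify_path is a hypothetical helper that mirrors the rules listed above, and it makes no claim about how SVS performs the inference internally.

```python
from pathlib import Path

# Extensions listed in the parameter description above.
RECOGNIZED_EXTENSIONS = {".bvecs", ".ivecs", ".fvecs", ".bin", ".svs"}


def classify_path(path: str) -> str:
    """Hypothetical helper: return "directory" or "file:<ext>".

    Raises ValueError for a file whose extension is not in the documented list.
    """
    p = Path(path)
    if p.is_dir():
        # Preferred form: a directory written by a previous save.
        return "directory"
    ext = p.suffix
    if ext in RECOGNIZED_EXTENSIONS:
        return f"file:{ext}"
    raise ValueError(f"unrecognized vector file extension: {ext!r}")
```

A caller could use such a check to fail early, before handing the path to the loader.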

property data_type

Access the assigned data type.

Type:

Read/Write (svs.DataType)

property dims

Access the expected dimensionality.

Type:

Read/Write (int)

property filepath

Access the underlying file path.

Type:

Read/Write (str)
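Because the three properties are read/write, a loader can be created from a path alone and configured afterwards. The following is a minimal sketch, assuming that svs.DataType has a float32 member (not shown on this page). The svs module is passed in as an argument only so that the function can be tried with a stand-in object when the package is not installed; configure_loader is a hypothetical name.

```python
def configure_loader(svs, path, dims=None):
    """Create a VectorDataLoader and set its properties after construction.

    `svs` is the imported svs module (or a stand-in with the same attributes).
    """
    loader = svs.VectorDataLoader(path)
    # Assumption: DataType exposes a `float32` member.
    loader.data_type = svs.DataType.float32
    if dims is not None:
        # Optional; providing it may yield runtime speedups.
        loader.dims = dims
    return loader
```

With the package installed, this would be called as configure_loader(svs, "path/to/dataset", dims=128) after import svs.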

LVQ Loader

The LVQ loader provides lazy compression of uncompressed data and reloading of previously saved LVQ data.
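To give a feel for what the primary and residual bit counts control, here is a simplified sketch of two-level scalar quantization in plain Python. It illustrates the general idea only: each vector is quantized on its own value range, and the remaining error is optionally quantized again. All function names are hypothetical, and nothing here is meant to match the encoding or memory layout SVS actually uses.

```python
def quantize(values, bits, lower, upper):
    """Uniformly quantize each value to `bits` bits on [lower, upper]."""
    levels = (1 << bits) - 1
    step = (upper - lower) / levels if upper > lower else 1.0
    codes = [min(levels, max(0, round((v - lower) / step))) for v in values]
    return codes, step


def dequantize(codes, step, lower):
    """Map integer codes back to approximate values."""
    return [lower + c * step for c in codes]


def two_level_reconstruct(vector, primary_bits, residual_bits=0):
    """Encode then decode `vector`, returning its approximation."""
    lower, upper = min(vector), max(vector)
    codes, step = quantize(vector, primary_bits, lower, upper)
    approx = dequantize(codes, step, lower)
    if residual_bits == 0:
        return approx
    # The primary error lies within half a step, so that is the range
    # covered by the second quantization level.
    residual = [v - a for v, a in zip(vector, approx)]
    r_codes, r_step = quantize(residual, residual_bits, -step / 2, step / 2)
    correction = dequantize(r_codes, r_step, -step / 2)
    return [a + c for a, c in zip(approx, correction)]
```

In this sketch, adding residual bits shrinks the reconstruction error at the cost of extra storage per vector.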

class svs.LVQLoader

Generic LVQ Loader

__init__(*args,**kwargs)

Overloaded function.

  1. __init__(self: svs::python.LVQLoader, datafile: svs::python.VectorDataLoader, primary: typing.SupportsInt, residual: typing.SupportsInt = 0, padding: typing.SupportsInt = 0, strategy: svs::python.LVQStrategy = <LVQStrategy.Auto: 0>) -> None

Construct a loader that will lazily compress the results of the data loader. Requires an appropriate back-end to be compiled for all combinations of primary and residual bits.

Parameters:
  • datafile (svs.VectorDataLoader) – The uncompressed dataset to compress in memory.

  • primary (int) – The number of bits to use for compression in the primary dataset.

  • residual (int) – The number of bits to use for compression in the residual dataset. Default: 0.

  • padding (int) – The alignment (in bytes) for the beginning of each compressed vector. Values of 32 or 64 may offer the best performance at the cost of a lower compression ratio. A value of 0 implies no special alignment.

  • strategy (svs.LVQStrategy) – The packing strategy to use for the compressed codes. See the associated documentation for that enum.
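The trade-off described for padding comes down to rounding each vector's size up to a multiple of the alignment. The arithmetic can be sketched as follows, under the simplifying assumption that a compressed vector consists only of its packed primary codes (any per-vector metadata is ignored). padded_vector_bytes is a hypothetical helper, not part of svs.

```python
def padded_vector_bytes(dims, primary_bits, padding=0):
    """Approximate bytes occupied by one compressed vector.

    Assumption: only the packed codes are counted.
    """
    raw = (dims * primary_bits + 7) // 8  # round bits up to whole bytes
    if padding == 0:
        return raw  # no special alignment
    # Round up to the next multiple of `padding`.
    return -(-raw // padding) * padding
```

For example, 100 dimensions at 4 bits occupy 50 bytes unpadded and 64 bytes with a padding of 32, which is where the lower compression ratio comes from.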

  2. __init__(self: svs::python.LVQLoader, directory: str, padding: typing.SupportsInt = 0, strategy: svs::python.LVQStrategy = <LVQStrategy.Auto: 0>) -> None

Reload a compressed dataset from a previously saved dataset. Requires an appropriate back-end to be compiled for all combinations of primary and residual bits.

Parameters:
  • directory (str) – The directory where the dataset was previously saved.

  • primary (int) – The number of bits to use for compression in the primary dataset.

  • residual (int) – The number of bits to use for compression in the residual dataset. Default: 0.

  • dims (int) – The number of dimensions in the dataset. May provide a performance boost if a specialization has been compiled. Default: Dynamic (any dimension).

  • padding (int) – The alignment (in bytes) for the beginning of each compressed vector. Values of 32 or 64 may offer the best performance at the cost of a lower compression ratio. A value of 0 implies no special alignment. Default: 0.

  • strategy (svs.LVQStrategy) – The packing strategy to use for the compressed codes. See the associated documentation for that enum.

  3. __init__(self: svs::python.LVQLoader, legacy: svs::python.LVQ4) -> None

  4. __init__(self: svs::python.LVQLoader, legacy: svs::python.LVQ8) -> None

  5. __init__(self: svs::python.LVQLoader, legacy: svs::python.LVQ4x4) -> None

  6. __init__(self: svs::python.LVQLoader, legacy: svs::python.LVQ4x8) -> None

  7. __init__(self: svs::python.LVQLoader, legacy: svs::python.LVQ8x8) -> None

property dims

The number of dimensions.

property primary_bits

The number of bits used for the primary encoding.

reload_from(self: svs::python.LVQLoader, directory: str) -> svs::python.LVQLoader

Create a copy of the argument loader configured to reload a previously saved LVQ dataset from the given directory.

property residual_bits

The number of bits used for the residual encoding.

property strategy

The packing strategy to use.
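The first overload and reload_from can be combined as in the sketch below. It uses the keyword names from the signatures above and assumes that svs.DataType has a float32 member. The svs module is passed in as an argument only so that the function can be tried with a stand-in object; make_lvq_loaders and the chosen bit widths are illustrative.

```python
def make_lvq_loaders(svs, data_path, saved_dir):
    """Build a lazily compressing LVQ loader and a reload variant of it.

    `svs` is the imported svs module (or a stand-in with the same attributes).
    """
    # Assumption: DataType exposes a `float32` member.
    data = svs.VectorDataLoader(data_path, svs.DataType.float32)
    # 4 primary bits and 8 residual bits, with each vector aligned to 32 bytes.
    compress = svs.LVQLoader(
        data,
        primary=4,
        residual=8,
        padding=32,
        strategy=svs.LVQStrategy.Auto,
    )
    # A copy configured to reload from a directory saved earlier.
    reload = compress.reload_from(saved_dir)
    return compress, reload
```

Whether a given primary/residual combination is usable depends on which back-ends were compiled, as noted above.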

Strategy Selection

The strategy argument of the LVQ loader provides a way of overriding the default selection of the packing strategy used by an LVQ backend.

Note that overriding the default strategy requires the corresponding backend to be compiled in the svs shared library component.

class svs.LVQStrategy

Select the packing mode for LVQ

Members:

Auto : Let SVS decide the best strategy.

Sequential : Use the Sequential packing strategy.

Turbo : Use the best Turbo packing strategy for this architecture.
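Selecting a strategy by name might look like the sketch below, which looks up the enum member listed above and forwards it to the first LVQLoader overload. lvq_loader_with_strategy is a hypothetical helper, and the svs module is passed in as an argument only so that the function can be tried with a stand-in object.

```python
STRATEGY_NAMES = ("Auto", "Sequential", "Turbo")


def lvq_loader_with_strategy(svs, datafile, primary, strategy_name="Auto"):
    """Build an LVQLoader using the named packing strategy.

    `svs` is the imported svs module (or a stand-in with the same attributes).
    """
    if strategy_name not in STRATEGY_NAMES:
        raise ValueError(f"unknown LVQ strategy: {strategy_name!r}")
    strategy = getattr(svs.LVQStrategy, strategy_name)
    return svs.LVQLoader(datafile, primary=primary, strategy=strategy)
```

Anything other than Auto only works if the matching backend was compiled into the shared library.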

LeanVec Loader

The LeanVec loader provides a way to use dimensionality reduction to improve performance on high-dimensional datasets.

Internally, a LeanVec dataset consists of the dimensionality-reduced primary dataset (over which the bulk of the index search is conducted) and a full-dimensional secondary dataset used to rerank and refine candidates returned from the initial search.

svs allows selection of the storage format using the svs.LeanVecKind enum, enabling float16 and lvq compression for either of the primary and secondary datasets.
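The primary/secondary split can be illustrated with a small brute-force sketch in plain Python. It assumes inner-product similarity and projection matrices stored as a list of rows (one row per reduced dimension). Both choices are conventions of this example, and the real index searches a graph rather than scanning every vector.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(matrix, vector):
    """Reduce `vector` with `matrix`, given as a list of rows."""
    return [dot(row, vector) for row in matrix]


def two_stage_search(data, query, data_matrix, query_matrix, n_candidates, k):
    """Return indices of the top-k vectors after a reduced search and rerank."""
    # Primary dataset: reduced-dimensional copies of every vector.
    primary = [project(data_matrix, v) for v in data]
    q_reduced = project(query_matrix, query)
    # Stage 1: cheap search over the reduced vectors.
    candidates = sorted(
        range(len(data)), key=lambda i: dot(primary[i], q_reduced), reverse=True
    )[:n_candidates]
    # Stage 2: rerank the candidates using the full-dimensional vectors.
    return sorted(candidates, key=lambda i: dot(data[i], query), reverse=True)[:k]
```

The rerank step is what recovers accuracy lost in the reduced space: a candidate that ranks second in stage 1 can still come out first once the full vectors are compared.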

class svs.LeanVecLoader

Generic LeanVec Loader

__init__(*args,**kwargs)

Overloaded function.

  1. __init__(self: svs::python.LeanVecLoader, datafile: svs::python.VectorDataLoader, leanvec_dims: typing.SupportsInt, primary_kind: svs::python.LeanVecKind = <LeanVecKind.lvq8: 2>, secondary_kind: svs::python.LeanVecKind = <LeanVecKind.lvq8: 2>, data_matrix: typing.Annotated[numpy.typing.ArrayLike, numpy.float32] | None = None, query_matrix: typing.Annotated[numpy.typing.ArrayLike, numpy.float32] | None = None, alignment: typing.SupportsInt = 32) -> None

Construct a loader that will lazily reduce the dimensionality of the data loader. Requires an appropriate back-end to be compiled for all combinations of primary and secondary types.

Parameters:
  • datafile (svs.VectorDataLoader) – The uncompressed original dataset.

  • leanvec_dims (int) – The reduced dimensionality of the primary dataset.

  • primary_kind (LeanVecKind) – Type of dataset used for the primary dataset (Default: lvq8).

  • secondary_kind (LeanVecKind) – Type of dataset used for the secondary dataset (Default: lvq8).

  • data_matrix (Optional[numpy.ndarray[numpy.float32]]) – Matrix for data transformation [see note 1] (Default: None).

  • query_matrix (Optional[numpy.ndarray[numpy.float32]]) – Matrix for query transformation [see note 1] (Default: None).

  • alignment (int) – Alignment/padding used in LVQ data types (Default: 32).

Note 1: The arguments data_matrix and query_matrix are optional and have the following requirements for valid combinations:

  1. Neither matrix provided: Transform dataset and queries using a default PCA-based transformation.

  2. Only data_matrix provided: The provided matrix is used to transform both the queries and the original dataset.

  3. Both arguments are provided: Use the respective matrices for transformation.
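The three combinations in Note 1 can be expressed as a small validation function. This is a sketch: resolve_matrices is a hypothetical helper, and rejecting the case where only query_matrix is given is an inference from the list above rather than documented behavior.

```python
def resolve_matrices(data_matrix=None, query_matrix=None):
    """Return (mode, data_transform, query_transform) per Note 1."""
    if data_matrix is None and query_matrix is None:
        # Case 1: fall back to the default PCA-based transformation.
        return ("pca", None, None)
    if data_matrix is not None and query_matrix is None:
        # Case 2: one matrix transforms both the data and the queries.
        return ("shared", data_matrix, data_matrix)
    if data_matrix is not None and query_matrix is not None:
        # Case 3: each side uses its own matrix.
        return ("separate", data_matrix, query_matrix)
    # Assumption: query_matrix alone is not one of the listed combinations.
    raise ValueError("query_matrix was provided without data_matrix")
```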

  2. __init__(self: svs::python.LeanVecLoader, directory: str, alignment: typing.SupportsInt = 32) -> None

Reload a LeanVec dataset from a previously saved dataset. Requires an appropriate back-end to be compiled for all combinations of primary and secondary types.

Parameters:
  • directory (str) – The directory where the dataset was previously saved.

  • leanvec_dims (int) – The reduced dimensionality. Default: Dynamic (any dimension).

  • dims (int) – The number of dimensions in the original dataset. Default: Dynamic (any dimension).

  • primary (LeanVecKind) – Type of dataset used for the primary dataset. Default: svs.LeanVecKind.lvq8.

  • secondary (LeanVecKind) – Type of dataset used for the secondary dataset. Default: svs.LeanVecKind.lvq8.

  • alignment (int) – Alignment/padding used in LVQ data types. Default: 32.

property alignment

The alignment to use for LVQ encoded data.

property dims

The full-dimensionality.

property leanvec_dims

The reduced dimensionality.

property primary_kind

The encoding of the reduced dimensional dataset.

reload_from(self: svs::python.LeanVecLoader, directory: str) -> svs::python.LeanVecLoader

Create a copy of the argument loader configured to reload a previously saved LeanVec dataset from the given directory.

property secondary_kind

The encoding of the full-dimensional dataset.
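A LeanVec loader can be built and later pointed at a saved directory in the same way as the LVQ loader. The sketch below uses the keyword names from the first overload's signature and assumes that svs.DataType has a float32 member. make_leanvec_loaders, the default of 64 reduced dimensions, and the chosen kinds are illustrative. The svs module is passed in as an argument only so that the function can be tried with a stand-in object.

```python
def make_leanvec_loaders(svs, data_path, saved_dir, leanvec_dims=64):
    """Build a lazily reducing LeanVec loader and a reload variant of it.

    `svs` is the imported svs module (or a stand-in with the same attributes).
    """
    # Assumption: DataType exposes a `float32` member.
    data = svs.VectorDataLoader(data_path, svs.DataType.float32)
    reduce = svs.LeanVecLoader(
        data,
        leanvec_dims,
        primary_kind=svs.LeanVecKind.lvq4,    # reduced dataset used for search
        secondary_kind=svs.LeanVecKind.lvq8,  # full dataset used for reranking
        alignment=32,
    )
    # A copy configured to reload from a directory saved earlier.
    return reduce, reduce.reload_from(saved_dir)
```

Leaving data_matrix and query_matrix unset selects the default PCA-based transformation described in Note 1.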

class svs.LeanVecKind

LeanVec primary and secondary types

Members:

float32 : Uncompressed float32

float16 : Uncompressed float16

lvq8 : Compressed with LVQ 8 bits

lvq4 : Compressed with LVQ 4 bits
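The choice of kinds mostly affects the memory footprint. A rough per-vector estimate can be sketched as follows, assuming each kind costs exactly its nominal number of bits per element and ignoring alignment and any per-vector metadata. estimate_bytes_per_vector is a hypothetical helper, not part of svs.

```python
# Nominal bits per element for each LeanVecKind member listed above.
BITS_PER_ELEMENT = {"float32": 32, "float16": 16, "lvq8": 8, "lvq4": 4}


def estimate_bytes_per_vector(dims, leanvec_dims,
                              primary_kind="lvq8", secondary_kind="lvq8"):
    """Rough storage estimate for one vector (primary + secondary copies)."""
    primary = (leanvec_dims * BITS_PER_ELEMENT[primary_kind] + 7) // 8
    secondary = (dims * BITS_PER_ELEMENT[secondary_kind] + 7) // 8
    return primary + secondary
```

For a 768-dimensional dataset reduced to 160 dimensions, the defaults give 160 + 768 = 928 bytes per vector under these assumptions.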