Variables and Secrets
Most embedding configuration options are saved in the table's metadata. However, this isn't always appropriate. For example, API keys should never be stored in the metadata. Additionally, other configuration options might be best set at runtime, such as the device configuration that controls whether to use GPU or CPU for inference. If you hardcoded this to GPU, you wouldn't be able to run the code on a server without one.
To handle these cases, you can set variables on the embedding registry and reference them in the embedding configuration. These variables will be available during the runtime of your program, but not saved in the table's metadata. When the table is loaded from a different process, the variables must be set again.
To set a variable, use the set_var() / setVar() method on the embedding registry. To reference a variable, use the syntax $env:VARIABLE_NAME. If there is a default value, you can use the syntax $env:VARIABLE_NAME:DEFAULT_VALUE.
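As a rough illustration of the reference syntax, the following sketch models how a reference with an optional default could be resolved against the variables set at runtime. It is not LanceDB's implementation: resolve_reference and the variables dictionary are hypothetical names used only for this example, and the prefix is taken from the syntax as written in this guide.

```python
PREFIX = "$env:"  # reference prefix as written in this guide


def resolve_reference(value, variables):
    """Resolve a "$env:NAME" or "$env:NAME:DEFAULT" reference.

    Values that are not references are returned unchanged.
    Illustrative model only, not the library's code.
    """
    if not isinstance(value, str) or not value.startswith(PREFIX):
        return value
    # Everything after the first ":" following the name is the default.
    name, sep, default = value[len(PREFIX):].partition(":")
    if name in variables:
        return variables[name]
    if sep:
        return default
    raise KeyError(f"variable {name!r} is not set and has no default")


# Variables set at runtime; they are never written to the table's metadata.
variables = {"api_key": "example-key"}

resolved_key = resolve_reference("$env:api_key", variables)
resolved_device = resolve_reference("$env:device:cpu", variables)
```

Here resolved_key comes from the runtime variable, while resolved_device falls back to the default because no device variable was set. A reference with neither a value nor a default raises an error in this model.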
Using variables to set secrets
Sensitive configuration, such as API keys, must either be set as environment variables or using variables on the embedding registry. If you pass in a hardcoded value, LanceDB will raise an error. Instead, if you want to set an API key via configuration, reference a variable.
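The rule for sensitive options can be pictured with a small model. This sketch is an assumption about the behaviour described here, not LanceDB's code: SENSITIVE_KEYS, check_config, and the error message are hypothetical. It rejects a literal value for a sensitive option and accepts a variable reference, so only the reference would ever be persisted.

```python
SENSITIVE_KEYS = {"api_key"}  # hypothetical set of sensitive option names
PREFIX = "$env:"              # reference prefix as written in this guide


def check_config(config):
    """Raise ValueError if a sensitive option holds a literal value."""
    for key, value in config.items():
        if key in SENSITIVE_KEYS and not str(value).startswith(PREFIX):
            raise ValueError(
                f"{key!r} must reference a variable, e.g. {PREFIX}{key}"
            )
    return config


# Accepted: the configuration holds only a reference, so no secret is stored.
safe = check_config({"name": "text-embedding", "api_key": "$env:api_key"})

# Rejected: the secret itself would end up in the saved configuration.
try:
    check_config({"api_key": "hardcoded-secret"})
    rejected = False
except ValueError:
    rejected = True
```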
Using variables to set the device parameter
Many embedding functions that run locally have a device parameter that controls whether to use GPU or CPU for inference. Because not all computers have a GPU, it's helpful to be able to set the device parameter at runtime, rather than have it hardcoded in the embedding configuration. To make it work even if the variable isn't set, you could provide a default value of cpu in the embedding configuration.
Some embedding libraries even have a method to detect which devices are available, which could be used to dynamically set the device at runtime. For example, in Python you can check if a CUDA GPU is available using torch.cuda.is_available().
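A minimal sketch of that detection, assuming PyTorch may or may not be installed: detect_device is a hypothetical helper that falls back to cpu when torch is missing or when no CUDA GPU is found.

```python
def detect_device():
    """Return "cuda" if PyTorch reports a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so GPU inference is not an option here.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = detect_device()
print(f"Using device: {device}")
```

The resulting string is the value you would register as the device variable with set_var() (assuming it takes a variable name and a value) before creating the embedding function. The same code then runs unchanged on machines with and without a GPU.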