Learn how to use inference-time strategies like dynamic in-context learning (DICL) and best-of-N sampling to optimize LLM performance.

## Best-of-N Sampling

To use best-of-N sampling in TensorZero, you need to configure a variant with the `experimental_best_of_n` type. Here's a simple example configuration:

```toml
[functions.draft_email.variants.promptA]
type = "chat_completion"
model = "gpt-4o-mini"
user_template = "functions/draft_email/promptA/user.minijinja"

[functions.draft_email.variants.promptB]
type = "chat_completion"
model = "gpt-4o-mini"
user_template = "functions/draft_email/promptB/user.minijinja"

[functions.draft_email.variants.best_of_n]
type = "experimental_best_of_n"
candidates = ["promptA", "promptA", "promptB"]

[functions.draft_email.variants.best_of_n.evaluator]
model = "gpt-4o-mini"
user_template = "functions/draft_email/best_of_n/user.minijinja"

[functions.draft_email.experimentation]
type = "uniform"
candidate_variants = ["best_of_n"]  # so we don't sample `promptA` or `promptB` directly
```

This configuration defines a `best_of_n` variant that uses two different variants (`promptA` and `promptB`) to generate candidates. It generates two candidates using `promptA` and one candidate using `promptB`. The `evaluator` block specifies the model and instructions for selecting the best response.

You can find the full list of configuration options for the `experimental_best_of_n` variant type in the Configuration Reference.

## Chain-of-Thought (CoT)

The `experimental_chain_of_thought` variant type is only available for non-streaming requests to JSON functions. For chat functions, we recommend using reasoning models instead (e.g. OpenAI o3, DeepSeek R1).

To use CoT in TensorZero, you need to configure a variant with the `experimental_chain_of_thought` type. It uses the same configuration as a `chat_completion` variant. Under the hood, TensorZero will prepend an additional field to the desired output schema to include the chain-of-thought reasoning and remove it from the final output. The reasoning is stored in the database for downstream observability and optimization.
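The schema rewriting described above can be sketched in a few lines of Python. This is an illustrative approximation, not TensorZero's actual implementation: the field name `thinking` and both helper functions are hypothetical.

```python
import json

def add_reasoning_field(output_schema: dict) -> dict:
    """Prepend a reasoning field to a JSON Schema, mirroring the
    chain-of-thought rewrite described above (field name is illustrative)."""
    return {
        "type": "object",
        "properties": {
            "thinking": {"type": "string"},  # the model writes its reasoning here
            **output_schema.get("properties", {}),
        },
        "required": ["thinking", *output_schema.get("required", [])],
    }

def strip_reasoning(raw_output: str) -> tuple[str, dict]:
    """Separate the reasoning from the final output: the reasoning would be
    stored for observability, and the rest returned to the caller."""
    parsed = json.loads(raw_output)
    reasoning = parsed.pop("thinking")
    return reasoning, parsed

schema = {"type": "object", "properties": {"email": {"type": "string"}}, "required": ["email"]}
print(list(add_reasoning_field(schema)["properties"]))  # → ['thinking', 'email']
reasoning, final = strip_reasoning('{"thinking": "Be concise.", "email": "Hi!"}')
print(final)  # → {'email': 'Hi!'}
```

The caller only ever sees the original schema's fields; the reasoning is generated, captured, and removed transparently.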
## Dynamic In-Context Learning (DICL)

To use DICL in TensorZero, you need to configure a variant with the `experimental_dynamic_in_context_learning` type. Here's a simple example configuration:

```toml
[functions.draft_email.variants.dicl]
type = "experimental_dynamic_in_context_learning"
model = "gpt-4o-mini"
embedding_model = "text-embedding-3-small"
system_instructions = "functions/draft_email/dicl/system.txt"
k = 5
max_distance = 0.5  # Optional: filter examples by cosine distance

[embedding_models.text-embedding-3-small]
routing = ["openai"]

[embedding_models.text-embedding-3-small.providers.openai]
type = "openai"
model_name = "text-embedding-3-small"
```

This configuration defines a `dicl` variant that uses the `experimental_dynamic_in_context_learning` type. The `embedding_model` field specifies the model used to embed inputs for similarity search; we also need to define this model in the `embedding_models` section. The `k` parameter determines the number of similar examples to retrieve and incorporate into the prompt. The `max_distance` parameter filters examples based on their cosine distance from the input, ensuring only highly relevant examples are included.

DICL also requires relevant examples to be stored in the `DynamicInContextLearningExample` table in your ClickHouse database. These examples will be used by the DICL variant to enhance the context of your prompts at inference time. The process of adding these examples to the database is crucial for DICL to function properly. We provide a sample recipe that simplifies this process: Dynamic In-Context Learning with OpenAI. This recipe supports selecting examples based on boolean metrics, float metrics, and demonstrations. It helps you populate the `DynamicInContextLearningExample` table with high-quality, relevant examples from your historical data.

For more information on the `DynamicInContextLearningExample` table and its role in the TensorZero data model, see Data Model. For a comprehensive list of configuration options for the `experimental_dynamic_in_context_learning` variant type, see Configuration Reference.
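The retrieval step behind `k` and `max_distance` can be sketched as follows. This is a minimal illustration with toy two-dimensional embeddings, not TensorZero's actual retrieval code; in practice the embeddings come from the configured `embedding_model` and the examples from ClickHouse.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def retrieve_examples(query_embedding, examples, k, max_distance=None):
    """Return up to k stored examples closest to the query embedding,
    optionally discarding any farther away than max_distance."""
    scored = sorted(
        (cosine_distance(query_embedding, emb), ex) for emb, ex in examples
    )
    if max_distance is not None:
        scored = [(d, ex) for d, ex in scored if d <= max_distance]
    return [ex for _, ex in scored[:k]]

# Toy example: the first two stored examples are similar to the query,
# the third is nearly orthogonal and gets filtered out by max_distance.
examples = [
    ([1.0, 0.0], "formal email example"),
    ([0.9, 0.1], "friendly email example"),
    ([0.0, 1.0], "unrelated example"),
]
print(retrieve_examples([1.0, 0.05], examples, k=5, max_distance=0.5))
# → ['formal email example', 'friendly email example']
```

The retrieved examples would then be prepended to the prompt as in-context demonstrations before the inference call.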
## Mixture-of-N Sampling

To use mixture-of-N sampling in TensorZero, you need to configure a variant with the `experimental_mixture_of_n` type. Here's a simple example configuration:

```toml
[functions.draft_email.variants.promptA]
type = "chat_completion"
model = "gpt-4o-mini"
user_template = "functions/draft_email/promptA/user.minijinja"

[functions.draft_email.variants.promptB]
type = "chat_completion"
model = "gpt-4o-mini"
user_template = "functions/draft_email/promptB/user.minijinja"

[functions.draft_email.variants.mixture_of_n]
type = "experimental_mixture_of_n"
candidates = ["promptA", "promptA", "promptB"]

[functions.draft_email.variants.mixture_of_n.fuser]
model = "gpt-4o-mini"
user_template = "functions/draft_email/mixture_of_n/user.minijinja"

[functions.draft_email.experimentation]
type = "uniform"
candidate_variants = ["mixture_of_n"]  # so we don't sample `promptA` or `promptB` directly
```

This configuration defines a `mixture_of_n` variant that uses two different variants (`promptA` and `promptB`) to generate candidates. It generates two candidates using `promptA` and one candidate using `promptB`. The `fuser` block specifies the model and instructions for combining the candidates into a single response.

You can find the full list of configuration options for the `experimental_mixture_of_n` variant type in the Configuration Reference.
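The candidate/fuser flow described above can be sketched with stub functions standing in for the real model calls. All names here are illustrative, and a real fuser would prompt an LLM to combine the drafts rather than concatenate them.

```python
def prompt_a(topic: str) -> str:  # stand-in for the promptA variant's model call
    return f"Draft A about {topic}"

def prompt_b(topic: str) -> str:  # stand-in for the promptB variant's model call
    return f"Draft B about {topic}"

def fuse(candidates: list[str]) -> str:  # stand-in for the fuser model call
    # A real fuser asks an LLM to synthesize one response; we just join here.
    return " | ".join(candidates)

def mixture_of_n(topic: str, candidate_variants: list[str]) -> str:
    """Run each configured candidate variant (duplicates allowed, as in
    candidates = ["promptA", "promptA", "promptB"]) and fuse the results."""
    variants = {"promptA": prompt_a, "promptB": prompt_b}
    candidates = [variants[name](topic) for name in candidate_variants]
    return fuse(candidates)

print(mixture_of_n("pricing", ["promptA", "promptA", "promptB"]))
# → Draft A about pricing | Draft A about pricing | Draft B about pricing
```

Best-of-N follows the same candidate-generation step, but its evaluator selects one candidate instead of fusing them all.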