Olumide Shittu for Orquesta


Integrate Orquesta with Cohere using Python SDK

Orquesta provides your product teams with no-code collaboration tooling to experiment, operate, and monitor LLMs and remote configurations within your SaaS. Using Orquesta, you can easily perform prompt engineering, prompt management, experiment in production, push new versions directly to production, and roll back instantly.

Cohere, on the other hand, offers language processing to any system via an API: it trains large language models and puts them behind a very simple interface.

Source: Cohere.

This article guides you through integrating your SaaS with Orquesta and Cohere using our Python SDK. By the end of the article, you'll know how to set up a prompt in Orquesta, perform prompt engineering, request a prompt variant using our SDK code generator, map the Orquesta response with Cohere, send a payload to Cohere, and report the response back to Orquesta for observability and monitoring.

Prerequisites

To follow along with this tutorial, you will need the following:

  • Jupyter Notebook (or any IDE of your choice).

  • Orquesta Python SDK.

Integration

Follow these steps to integrate the Python SDK with Cohere.

Step 1 - Install SDK and create a client instance

```shell
pip install orquesta-sdk
pip install cohere
```

To create a client instance, you need your Orquesta API key, which can be found in your workspace at https://my.orquesta.dev/<workspace-name>/settings/developers

Copy it and add the following code to your notebook to initialize the Orquesta client.

```python
import time

import cohere
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.helpers import orquesta_cohere_parameters_mapper
from orquesta_sdk.prompts import OrquestaPromptMetrics

# Initialize Orquesta client
api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)
```

Explanation:

  • Import the time module to measure how long the completion request takes.

  • Import cohere so we can use the Cohere API.

  • The OrquestaClient and OrquestaClientOptions classes, defined in the orquesta_sdk module, are imported.

  • The Orquesta SDK has helper functions that map and interface between Orquesta and specific LLM providers. For this integration, we use the orquesta_cohere_parameters_mapper helper.

  • To log all interactions with Cohere, we use the OrquestaPromptMetrics class.

  • We create an instance of OrquestaClientOptions and configure it with the api_key and the ttl (Time to Live), in seconds, for the local cache; by default, it is 3600 seconds (1 hour).

Finally, an instance of the OrquestaClient class is created and initialized with the previously configured options object. This client instance can now interact with the Orquesta service, using the provided API key for authentication.
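Rather than hardcoding the API key as in the snippet above, you may prefer to read it from the environment. This is a minimal sketch of that pattern; the variable name ORQUESTA_API_KEY is our own convention here, not something the SDK requires:

```python
import os

# Read the Orquesta API key from an environment variable, falling back to
# the placeholder used in this tutorial (the env var name is illustrative).
api_key = os.environ.get("ORQUESTA_API_KEY", "ORQUESTA-API-KEY")
```

You would then pass this `api_key` into `OrquestaClientOptions` exactly as before.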

Step 2 - Enable Cohere models in Model Garden

Head over to Orquesta's Model Garden and enable the Cohere models you want to use.

Enable Cohere models in Model Garden

Step 3 - Set up a completion prompt and variants

The next step is to set up your completion prompt; ensure the type is Completion, not Chat, when using Cohere.

To create a prompt, click Add Prompt, provide a prompt key and an optional Domain, and select Completion.

Set up a completion prompt and variants

Once that is set up, create your first completion: give your prompt a name, add all the necessary information, and click Save.

Set up a completion prompt and variants

Step 4 - Request a variant from Orquesta using the SDK

Our flexible configuration matrix lets you define multiple prompt variants based on custom context, so you can work with different prompts and hyperparameters per environment, country, locale, or user segment. The Code Snippet Generator makes it easy to request a prompt variant.

Code Snippet Generator

Once you open the Code Snippet Generator, copy the code snippet and paste it into your editor.

Code Snippet Generator

```python
# Query the prompt from Orquesta
prompt = client.prompts.query(
    key="data_completion",
    context={"environments": ["test"]},
    variables={},
)
```

Step 5 - Map the Orquesta response to Cohere using a Helper

As established at the beginning of this tutorial, integrating these two technologies relies on a helper provided by Orquesta: orquesta_cohere_parameters_mapper.

```python
# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')

co = cohere.Client('COHERE-API-KEY')  # Insert your Cohere API key

completion = co.generate(
    **orquesta_cohere_parameters_mapper(prompt.value),
    model=prompt.value.get("model"),
    prompt=prompt.value.get('prompt'),
)

# End time of the completion request
end_time = time.time()
print(f'End time: {end_time}')

# Calculate the difference (latency) in milliseconds
latency = (end_time - start_time) * 1000
print(f'Latency is: {latency}')
```

Latency

Explanation:

  • We record the start time using the time module.

  • An instance of the Cohere client is created.

  • Using the generate() endpoint, we generate realistic text conditioned on a given input.

  • The generate() endpoint also accepts other body parameters, such as the prompt (a required string), the model, num_generations, max_tokens, temperature, etc. For simplicity, we only pass model and prompt.

  • We record the end time and calculate the latency.
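If you do want to set some of those extra body parameters yourself, one way is to merge an overrides dict on top of the values mapped from Orquesta before unpacking into generate(). The parameter names below are Cohere generate() options mentioned above, but the mapped values and overrides are illustrative, not real API output:

```python
# Values as they might come back from orquesta_cohere_parameters_mapper
# (illustrative stand-ins for this sketch).
mapped_params = {"temperature": 0.7, "max_tokens": 50}

# Extra Cohere generate() body parameters we want to control ourselves.
overrides = {"max_tokens": 120, "num_generations": 1}

# In a dict merge, later keys win, so overrides take precedence.
request_body = {**mapped_params, **overrides}
# co.generate(model=..., prompt=..., **request_body)
```

Keeping the mapped values first means Orquesta's prompt configuration stays the default, and you only override what you explicitly choose to.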

Step 6 - Report analytics back to Orquesta

After each query, Orquesta generates a log with a Trace ID. Using the add_metrics() method, you can attach additional information, such as the llm_response, metadata, latency, and economics.

```python
# Tokenize responses
prompt_tokenization = co.tokenize(prompt.value.get('prompt'))
completion_tokenization = co.tokenize(completion.generations[0].text)

prompt_tokens = len(prompt_tokenization.tokens)
completion_tokens = len(completion_tokenization.tokens)
total_tokens = prompt_tokens + completion_tokens

# Report the metrics back to Orquesta
metrics = OrquestaPromptMetrics(
    economics={
        "total_tokens": total_tokens,
        "completion_tokens": completion_tokens,
        "prompt_tokens": prompt_tokens,
    },
    llm_response=completion.generations[0].text,
    latency=latency,
    metadata={
        "finish_reason": completion.generations[0].finish_reason,
    },
)

prompt.add_metrics(metrics=metrics)
```

Conclusion

With these easy steps, you have successfully integrated Orquesta with Cohere. This is just the tip of the iceberg: as of the time of writing, Orquesta only supports the generate() endpoint, but in the future you will be able to use other endpoints, such as embed, classify, summarize, detect-language, etc.

Orquesta supports other SDKs, such as Angular, Node.js, React, and TypeScript. Refer to our documentation for more information.

Full Code Example

```python
import time

import cohere
from orquesta_sdk import OrquestaClient, OrquestaClientOptions
from orquesta_sdk.helpers import orquesta_cohere_parameters_mapper
from orquesta_sdk.prompts import OrquestaPromptMetrics

# Initialize Orquesta client
api_key = "ORQUESTA-API-KEY"
options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)

co = cohere.Client('COHERE-API-KEY')  # Insert your Cohere API key

# Query the prompt from Orquesta
prompt = client.prompts.query(
    key="data_completion",
    context={"environments": ["test"]},
    variables={},
    metadata={"user_id": 45515},
)

# Start time of the completion request
start_time = time.time()
print(f'Start time: {start_time}')

completion = co.generate(
    **orquesta_cohere_parameters_mapper(prompt.value),
    model=prompt.value.get("model"),
    prompt=prompt.value.get('prompt'),
)

# End time of the completion request
end_time = time.time()
print(f'End time: {end_time}')

# Calculate the difference (latency) in milliseconds
latency = (end_time - start_time) * 1000
print(f'Latency is: {latency}')

# Tokenize responses
prompt_tokenization = co.tokenize(prompt.value.get('prompt'))
completion_tokenization = co.tokenize(completion.generations[0].text)

prompt_tokens = len(prompt_tokenization.tokens)
completion_tokens = len(completion_tokenization.tokens)
total_tokens = prompt_tokens + completion_tokens

# Report the metrics back to Orquesta
metrics = OrquestaPromptMetrics(
    economics={
        "total_tokens": total_tokens,
        "completion_tokens": completion_tokens,
        "prompt_tokens": prompt_tokens,
    },
    llm_response=completion.generations[0].text,
    latency=latency,
    metadata={
        "finish_reason": completion.generations[0].finish_reason,
    },
)

prompt.add_metrics(metrics=metrics)
```
