GreenNodeEmbeddings
GreenNode is a global AI solutions provider and an NVIDIA Preferred Partner, delivering full-stack AI capabilities, from infrastructure to application, for enterprises across the US, MENA, and APAC regions. Operating on world-class infrastructure (LEED Gold, TIA‑942, Uptime Tier III), GreenNode empowers enterprises, startups, and researchers with a comprehensive suite of AI services.
This notebook provides a guide to getting started with GreenNodeEmbeddings. It enables you to perform semantic document search using various built-in connectors or your own custom data sources by generating high-quality vector representations of text.
Overview
Integration details
| Provider | Package |
|---|---|
| GreenNode | langchain-greennode |
Setup
To access GreenNode embedding models, you'll need to create a GreenNode account, get an API key, and install the langchain-greennode integration package.
Credentials
GreenNode requires an API key for authentication, which can be provided either as the api_key parameter during initialization or set as the environment variable GREENNODE_API_KEY. You can obtain an API key by registering for an account on GreenNode Serverless AI.
import getpass
import os

if not os.getenv("GREENNODE_API_KEY"):
    os.environ["GREENNODE_API_KEY"] = getpass.getpass("Enter your GreenNode API key: ")
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
Installation
The LangChain GreenNode integration lives in thelangchain-greennode
package:
%pip install -qU langchain-greennode
Note: you may need to restart the kernel to use updated packages.
Instantiation
The GreenNodeEmbeddings class can be instantiated with optional parameters for the API key and model name:
from langchain_greennode import GreenNodeEmbeddings

# Initialize the embeddings model
embeddings = GreenNodeEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-m3"  # The default embedding model
)
Indexing and Retrieval
Embedding models play a key role in retrieval-augmented generation (RAG) workflows by enabling both the indexing of content and its efficient retrieval. Below, see how to index and retrieve data using the embeddings object we initialized above. In this example, we will index and retrieve a sample document in the InMemoryVectorStore.
# Create a vector store with a sample text
from langchain_core.vectorstores import InMemoryVectorStore

text = "LangChain is the framework for building context-aware reasoning applications"

vectorstore = InMemoryVectorStore.from_texts(
    [text],
    embedding=embeddings,
)
# Use the vector store as a retriever
retriever = vectorstore.as_retriever()

# Retrieve the most similar text
retrieved_documents = retriever.invoke("What is LangChain?")

# Show the retrieved document's content
retrieved_documents[0].page_content
'LangChain is the framework for building context-aware reasoning applications'
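You can also query the vector store directly, without wrapping it in a retriever. As a minimal sketch, the snippet below uses InMemoryVectorStore's similarity_search_with_score to return documents together with a similarity score; the exact score values depend on the embeddings the model returns:

# Query the vector store directly; k limits the number of results
results = vectorstore.similarity_search_with_score("What is LangChain?", k=1)
for doc, score in results:
    print(f"score={score:.4f}  {doc.page_content}")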
Direct Usage
The GreenNodeEmbeddings class can be used independently to generate text embeddings without the need for a vector store. This is useful for tasks such as similarity scoring, clustering, or custom processing pipelines; a brief clustering sketch follows the Document Similarity Example below.
Embed single texts
You can embed single texts or documents with embed_query:
single_vector = embeddings.embed_query(text)
print(str(single_vector)[:100])  # Show the first 100 characters of the vector
[-0.01104736328125, -0.0281982421875, 0.0035858154296875, -0.0311279296875, -0.0106201171875, -0.039
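The result is a plain Python list of floats. For BAAI/bge-m3 the dimensionality is 1024 (as the async example below also shows), which you can confirm directly:

# Check the embedding's dimensionality
print(len(single_vector))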
Embed multiple texts
You can embed multiple texts with embed_documents:
text2 = (
    "LangGraph is a library for building stateful, multi-actor applications with LLMs"
)

two_vectors = embeddings.embed_documents([text, text2])
for vector in two_vectors:
    print(str(vector)[:100])  # Show the first 100 characters of the vector
[-0.01104736328125, -0.0281982421875, 0.0035858154296875, -0.0311279296875, -0.0106201171875, -0.039
[-0.07177734375, -0.00017452239990234375, -0.002044677734375, -0.0299072265625, -0.0184326171875, -0
Async Support
GreenNodeEmbeddings supports async operations:
import asyncio


async def generate_embeddings_async():
    # Embed a single query
    query_result = await embeddings.aembed_query("What is the capital of France?")
    print(f"Async query embedding dimension: {len(query_result)}")

    # Embed multiple documents
    docs = [
        "Paris is the capital of France",
        "Berlin is the capital of Germany",
        "Rome is the capital of Italy",
    ]
    docs_result = await embeddings.aembed_documents(docs)
    print(f"Async document embeddings count: {len(docs_result)}")


await generate_embeddings_async()
Async query embedding dimension: 1024
Async document embeddings count: 3
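Top-level await works here because notebooks already run an event loop; in a plain Python script you would drive the coroutine with asyncio.run instead:

# In a script (no running event loop), run the coroutine explicitly
asyncio.run(generate_embeddings_async())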
Document Similarity Example
import numpy as np
from scipy.spatial.distance import cosine

# Create some documents
documents = [
    "Machine learning algorithms build mathematical models based on sample data",
    "Deep learning uses neural networks with many layers",
    "Climate change is a major global environmental challenge",
    "Neural networks are inspired by the human brain's structure",
]

# Embed the documents
embeddings_list = embeddings.embed_documents(documents)


# Function to calculate cosine similarity between two embeddings
def calculate_similarity(embedding1, embedding2):
    return 1 - cosine(embedding1, embedding2)


# Print similarity matrix
print("Document Similarity Matrix:")
for i, emb_i in enumerate(embeddings_list):
    similarities = []
    for j, emb_j in enumerate(embeddings_list):
        similarity = calculate_similarity(emb_i, emb_j)
        similarities.append(f"{similarity:.4f}")
    print(f"Document {i + 1}: {similarities}")
Document Similarity Matrix:
Document 1: ['1.0000', '0.6005', '0.3542', '0.5788']
Document 2: ['0.6005', '1.0000', '0.4154', '0.6170']
Document 3: ['0.3542', '0.4154', '1.0000', '0.3528']
Document 4: ['0.5788', '0.6170', '0.3528', '1.0000']
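The same embeddings can also feed the clustering use case mentioned under Direct Usage. Here is a minimal sketch using SciPy's kmeans2 to group the four documents into two clusters; the grouping depends on the embeddings the model actually returns, so treat it as illustrative rather than a guaranteed result:

import numpy as np
from scipy.cluster.vq import kmeans2

# Stack the document embeddings into an (n_docs, dim) array
matrix = np.array(embeddings_list)

# k-means with k-means++ initialization; returns centroids and a label per document
_, labels = kmeans2(matrix, 2, minit="++", seed=42)
for label, doc in zip(labels, documents):
    print(f"cluster {label}: {doc}")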
API Reference
For more details about the GreenNode Serverless AI API, visit the GreenNode Serverless AI Documentation.
Related
- Embedding model conceptual guide
- Embedding model how-to guides