Jun 6, 2023
diff --git a/pgml-dashboard/Cargo.lock b/pgml-dashboard/Cargo.lock
diff --git a/pgml-dashboard/src/api/docs.rs b/pgml-dashboard/src/api/docs.rs
@@ -80,6 +80,8 @@ async fn blog_handler<'a>(path: PathBuf, cluster: Cluster) -> Result<ResponseOk,
         cluster,
         &path,
         vec![
             NavLink::new("Introducing PostgresML Python SDK: Build End-to-End Vector Search Applications without OpenAI and Pinecone")
                 .href("/blog/introducing-postgresml-python-sdk-build-end-to-end-vector-search-applications-without-openai-and-pinecone"),
             NavLink::new("PostgresML raises $4.7M to launch serverless AI application databases based on Postgres")
                 .href("/blog/postgresml-raises-4.7M-to-launch-serverless-ai-application-databases-based-on-postgres"),
             NavLink::new("PG Stat Sysinfo, a Postgres Extension for Querying System Statistics")
diff --git a/...-sdk-build-end-to-end-vector-search-applications-without-openai-and-pinecone.md b/...-sdk-build-end-to-end-vector-search-applications-without-openai-and-pinecone.md
 ---
 author: Santi Adavani
 description: The PostgresML Python SDK is designed to facilitate the development of end-to-end vector search applications without OpenAI and Pinecone. With this SDK, you can manage the database tables related to documents, text chunks, text splitters, large language models (LLMs), and embeddings. You can then index LLM embeddings using PgVector for fast and accurate queries.
 image: https://postgresml.org/dashboard/static/images/blog/sdk_code.png
 image_alt: "Introducing PostgresML Python SDK: Build End-to-End Vector Search Applications without OpenAI and Pinecone"
 ---
 # Introducing PostgresML Python SDK: Build End-to-End Vector Search Applications without OpenAI and Pinecone
 <div class="d-flex align-items-center mb-4">
  <img width="54px" height="54px" src="/dashboard/static/images/team/santi.jpg" style="border-radius:50%;" alt="Author" />
  <div class="ps-3 d-flex justify-content-center flex-column">
 <p class="m-0">Santi Adavani</p>
 <p class="m-0">June 01, 2023</p>
  </div>
 </div>

 We are excited to introduce a Python SDK for PostgresML that streamlines the development of scalable vector search applications on PostgreSQL databases. Traditionally, building a vector search application requires spinning up an application database, connecting to external OpenAI or HuggingFace REST API services to generate embeddings, and integrating with a vector database like Pinecone for indexing and search. This approach increases infrastructure footprint, maintenance effort, and query latency.

 With the PostgresML Python SDK, developers now have a unified solution: a single application database that handles document management, embedding generation, indexing, and search. This eliminates the need for multiple infrastructure components, simplifies maintenance, and reduces query latency. The SDK offers a set of tools for managing database tables related to documents, text chunks, text splitters, LLMs, and embeddings, so you can integrate advanced search functionality into your application.

 <img src="/dashboard/static/images/blog/sdk_code.png" alt="Sample code to build a vector search application using Python SDK">

 ## Key Features

 ### Automated Database Management
 The Python SDK automates the management of the database tables that a vector search application needs, so you don't have to set up and maintain those data structures yourself. You can focus on building search functionality while the SDK handles the underlying database management.

 ### Embedding Generation from Open Source Models
 The Python SDK gives you access to a large collection of open source models. These models have been trained on extensive datasets and capture the semantic meaning of text. With just a few lines of code, you can generate embeddings using these models and use them for analysis and search in your application.

 ### Flexible and Scalable Vector Search
 The Python SDK integrates with PgVector, a PostgreSQL extension designed for efficient vector-based indexing and querying. With PgVector, you can run similarity searches, rank results by relevance, and retrieve meaningful information from your database. The SDK is designed so that your vector search application can scale as the amount of data grows.

 ## How the Python SDK Works

 The Python SDK abstracts away the complexities of database management and indexing that come with building vector search applications. Here's an overview of how it works:

 ### Document and Text Chunk Management
 The SDK offers a user-friendly interface for upserting documents and generating text chunks. You can add and configure various text splitters to generate text chunks of different sizes and overlaps, and for different file formats, such as Python and Markdown.

 ### Open Source Model Integration
 With the SDK, you can seamlessly incorporate a wide range of open source models from HuggingFace into your application. These models capture the semantic meaning of text and enable powerful analysis and search capabilities. Generating high-quality embeddings from these models is a breeze with the Python SDK.

 ### Embedding Indexing
 The Python SDK uses the PgVector extension to index the embeddings generated by the open source models. Indexing speeds up search and allows for fast and accurate retrieval of relevant results, even with large volumes of data.

 ### Querying and Search
 Once the embeddings are indexed, the SDK provides intuitive methods for executing vector-based searches on the documents and text chunks stored in the PostgreSQL database. You can easily execute queries and retrieve search results with precise and relevant information.
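
Conceptually, a vector search scores every stored chunk against the query embedding and returns the best matches. The sketch below does this by brute force in plain Python with made-up three-dimensional vectors; the names `cosine_similarity` and `top_k` and the toy data are assumptions for illustration. In practice the search runs inside PostgreSQL with PgVector, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: dict[str, list[float]], k: int) -> list[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda text: cosine_similarity(query, chunks[text]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings invented for this example.
chunk_embeddings = {
    "postgres indexing": [0.9, 0.1, 0.0],
    "vector search": [0.7, 0.7, 0.0],
    "cooking pasta": [0.0, 0.1, 0.9],
}
query_embedding = [1.0, 0.2, 0.0]

print(top_k(query_embedding, chunk_embeddings, k=2))
# ['postgres indexing', 'vector search']
```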

 ## Use Cases

 The Python SDK's embedding capabilities find applications in various scenarios, including:

 ### Search
 By comparing embeddings of query strings and documents, you can retrieve search results ranked by their relevance or similarity to the query. This allows users to find the most relevant information quickly and effectively.

 ### Clustering
 With embeddings, you can group text strings by similarity. Measuring the similarity between embeddings lets you identify clusters of text strings that share common characteristics, which is useful for data analysis.
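
One simple way to picture this is a greedy pass that puts each string into the first cluster whose representative is similar enough. This is only a sketch with invented two-dimensional vectors; the function names, the threshold value, and the greedy strategy are assumptions for the example, and production systems typically use an algorithm such as k-means.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def greedy_clusters(embeddings: dict[str, list[float]], threshold: float) -> list[list[str]]:
    """Assign each item to the first cluster whose first member is similar enough."""
    clusters: list[list[str]] = []
    for name, vector in embeddings.items():
        for members in clusters:
            representative = embeddings[members[0]]
            if cosine_similarity(vector, representative) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([name])
    return clusters


# Toy embeddings invented for this example.
word_embeddings = {
    "cat": [1.0, 0.1],
    "kitten": [0.9, 0.2],
    "car": [0.1, 1.0],
    "truck": [0.2, 0.9],
}

clusters = greedy_clusters(word_embeddings, threshold=0.9)
print(clusters)  # [['cat', 'kitten'], ['car', 'truck']]
```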

 ### Recommendations
 Embeddings play a crucial role in recommendation systems. By identifying items with related text strings based on their embeddings, you can deliver personalized recommendations to users, enhancing user experience and engagement.
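
A minimal sketch of this idea: average the embeddings of the items a user liked into a profile vector, then rank the remaining items by similarity to that profile. The catalog, the vectors, and the `recommend` function are all hypothetical and exist only to show the mechanics.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def recommend(liked: list[str], catalog: dict[str, list[float]], k: int) -> list[str]:
    """Rank items the user has not liked yet by similarity to the liked items."""
    dims = len(next(iter(catalog.values())))
    # The profile is the element-wise mean of the liked items' embeddings.
    profile = [sum(catalog[name][i] for name in liked) / len(liked) for i in range(dims)]
    candidates = [name for name in catalog if name not in liked]
    candidates.sort(key=lambda name: cosine_similarity(profile, catalog[name]), reverse=True)
    return candidates[:k]


# Toy embeddings invented for this example.
catalog = {
    "intro to sql": [0.9, 0.1],
    "postgres tuning": [0.8, 0.3],
    "query planning": [0.85, 0.2],
    "bread baking": [0.1, 0.9],
}

print(recommend(["intro to sql"], catalog, k=2))
# ['query planning', 'postgres tuning']
```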

 ### Anomaly Detection
 Anomaly detection involves identifying outliers in data. By quantifying the similarity between text strings using embeddings, you can flag strings that have little relatedness to the rest of the data.
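
As a sketch of that idea, the code below computes each string's mean similarity to all the others and flags those that fall under a cutoff. The vectors, the function names, and the cutoff of 0.5 are assumptions chosen for the example; a real threshold depends on the model and the data.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mean_similarity_to_others(name: str, embeddings: dict[str, list[float]]) -> float:
    """Average cosine similarity between one item and every other item."""
    others = [vector for other, vector in embeddings.items() if other != name]
    return sum(cosine_similarity(embeddings[name], v) for v in others) / len(others)


def find_anomalies(embeddings: dict[str, list[float]], threshold: float) -> list[str]:
    """Return the items whose mean similarity to the rest is below threshold."""
    return [
        name
        for name in embeddings
        if mean_similarity_to_others(name, embeddings) < threshold
    ]


# Toy embeddings invented for this example.
message_embeddings = {
    "login failed": [0.9, 0.1],
    "login succeeded": [0.8, 0.2],
    "password reset": [0.85, 0.15],
    "banana bread recipe": [0.05, 0.95],
}

print(find_anomalies(message_embeddings, threshold=0.5))
# ['banana bread recipe']
```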

 ### Classification
 Embeddings are valuable in classification tasks, where each text string is assigned the label it is most similar to. By comparing the embeddings of text strings and labels, you can classify new text strings into predefined categories.
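
In its simplest form this is a nearest-label lookup: embed each label, embed the text, and pick the label with the highest similarity. The sketch below uses invented two-dimensional vectors, and the `classify` function is a hypothetical helper, not an SDK call.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def classify(text_embedding: list[float], label_embeddings: dict[str, list[float]]) -> str:
    """Return the label whose embedding is most similar to the text embedding."""
    return max(
        label_embeddings,
        key=lambda label: cosine_similarity(text_embedding, label_embeddings[label]),
    )


# Toy label embeddings invented for this example.
label_embeddings = {
    "positive": [0.9, 0.1],
    "negative": [0.1, 0.9],
}

print(classify([0.2, 0.8], label_embeddings))  # negative
print(classify([0.7, 0.3], label_embeddings))  # positive
```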

 ## Get Started with the Python SDK

 To get started with the Python SDK for scalable vector search on PostgreSQL, visit our [GitHub repository](https://github.com/postgresml/postgresml/tree/master/pgml-sdks/python/pgml). You'll find documentation, code examples, and installation instructions to help you integrate the SDK into your projects.

 We're excited to see how the Python SDK transforms your vector search applications, enabling fast, accurate, and scalable search. Should you have any questions or need assistance, please do not hesitate to reach out to us on [Discord](https://discord.gg/DmyJP3qJ7U) or send an [email](mailto:team@postgresml.org).

 Happy coding and happy searching!

diff --git a/pgml-dashboard/static/images/blog/sdk_code.png b/pgml-dashboard/static/images/blog/sdk_code.png