Revert changes made to SDK docs in Revert "GITBOOK-120: Update to SDK…#1344

Merged
SilasMarvin merged 3 commits into master from silas-revert-client-sdk-doc-changes
Mar 1, 2024
7 changes: 3 additions & 4 deletions pgml-cms/docs/SUMMARY.md
@@ -38,12 +38,11 @@
* [Overview](introduction/apis/client-sdks/getting-started.md)
* [Collections](introduction/apis/client-sdks/collections.md)
* [Pipelines](introduction/apis/client-sdks/pipelines.md)
* [Search](introduction/apis/client-sdks/search.md)
* [Vector Search](introduction/apis/client-sdks/search.md)
* [Document Search](introduction/apis/client-sdks/document-search.md)
* [Tutorials](introduction/apis/client-sdks/tutorials/README.md)
* [Semantic Search](introduction/apis/client-sdks/tutorials/semantic-search.md)
* [Semantic Search using Instructor model](introduction/apis/client-sdks/tutorials/semantic-search-using-instructor-model.md)
* [Extractive Question Answering](introduction/apis/client-sdks/tutorials/extractive-question-answering.md)
* [Summarizing Question Answering](introduction/apis/client-sdks/tutorials/summarizing-question-answering.md)
* [Semantic Search Using Instructor Model](introduction/apis/client-sdks/tutorials/semantic-search-1.md)

## Product

172 changes: 70 additions & 102 deletions pgml-cms/docs/introduction/apis/client-sdks/collections.md
@@ -1,16 +1,16 @@
---
description: >-
Organizational building blocks of the SDK. Manage all documents and related chunks, embeddings, tsvectors, and pipelines.
description: Organizational building blocks of the SDK. Manage all documents and related chunks, embeddings, tsvectors, and pipelines.
---

# Collections

Collections are the organizational building blocks of the SDK. They manage all documents and related chunks, embeddings, tsvectors, and pipelines.

## Creating Collections

By default, collections will read and write to the database specified by `DATABASE_URL` environment variable.
By default, collections will read and write to the database specified by `PGML_DATABASE_URL` environment variable.

### **Default `DATABASE_URL`**
### **Default `PGML_DATABASE_URL`**

{% tabs %}
{% tab title="JavaScript" %}
@@ -26,9 +26,9 @@ collection = Collection("test_collection")
{% endtab %}
{% endtabs %}

### **Custom DATABASE\_URL**
### Custom `PGML_DATABASE_URL`

Create a Collection that reads from a different database than that set by the environment variable `DATABASE_URL`.
Create a Collection that reads from a different database than that set by the environment variable `PGML_DATABASE_URL`.
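
The lookup order described above can be pictured with a small sketch. It is plain Python and does not call the SDK: `resolve_database_url` is a hypothetical helper, and the rule that an explicitly passed URL takes precedence over `PGML_DATABASE_URL` is an assumption inferred from the text, not taken from the SDK source.

```python
import os


def resolve_database_url(custom_url=None, env=None):
    """Hypothetical helper: choose the connection string a collection would use.

    Assumption: an explicitly passed URL wins; otherwise fall back to the
    PGML_DATABASE_URL environment variable.
    """
    env = os.environ if env is None else env
    if custom_url is not None:
        return custom_url
    if "PGML_DATABASE_URL" not in env:
        raise RuntimeError("PGML_DATABASE_URL is not set and no custom URL was given")
    return env["PGML_DATABASE_URL"]
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real process environment.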

{% tabs %}
{% tab title="Javascript" %}
@@ -46,21 +46,23 @@ collection = Collection("test_collection", CUSTOM_DATABASE_URL)

## Upserting Documents

Documents are dictionaries with two required keys: `id` and `text`. All other keys/value pairs are stored as metadata for the document.
Documents are dictionaries with one required key: `id`. All other keys/value pairs are stored and can be chunked, embedded, broken into tsvectors, and searched over as specified by a `Pipeline`.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const documents = [
{
id: "Document One",
id: "document_one",
title: "Document One",
text: "document one contents...",
random_key: "this will be metadata for the document",
random_key: "here is some random data",
},
{
id: "Document Two",
id: "document_two",
title: "Document Two",
text: "document two contents...",
random_key: "this will be metadata for the document",
random_key: "here is some random data",
},
];
await collection.upsert_documents(documents);
@@ -71,35 +73,40 @@ await collection.upsert_documents(documents);
```python
documents = [
{
"id": "Document 1",
"id": "document_one",
"title": "Document One",
"text": "Here are the contents of Document 1",
"random_key": "this will be metadata for the document"
"random_key": "here is some random data",
},
{
"id": "Document 2",
"id": "document_two",
"title": "Document Two",
"text": "Here are the contents of Document 2",
"random_key": "this will be metadata for the document"
}
"random_key": "here is some random data",
},
]
collection = Collection("test_collection")
await collection.upsert_documents(documents)
```
{% endtab %}
{% endtabs %}

Document metadata can be replaced by upserting the document without the `text` key.
Documents can be replaced by upserting documents with the same `id`.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const documents = [
{
id: "Document One",
random_key: "this will be NEW metadata for the document",
id: "document_one",
title: "Document One New Title",
text: "Here is some new text for document one",
random_key: "here is some new random data",
},
{
id: "Document Two",
random_key: "this will be NEW metadata for the document",
id: "document_two",
title: "Document Two New Title",
text: "Here is some new text for document two",
random_key: "here is some new random data",
},
];
await collection.upsert_documents(documents);
@@ -110,39 +117,42 @@ await collection.upsert_documents(documents);
```python
documents = [
{
"id": "Document 1",
"random_key": "this will be NEW metadata for the document"
"id": "document_one",
"title": "Document One",
"text": "Here is some new text for document one",
"random_key": "here is some random data",
},
{
"id": "Document 2",
"random_key": "this will be NEW metadata for the document"
}
"id": "document_two",
"title": "Document Two",
"text": "Here is some new text for document two",
"random_key": "here is some random data",
},
]
collection = Collection("test_collection")
await collection.upsert_documents(documents)
```
{% endtab %}
{% endtabs %}

Document metadata can be merged with new metadata by upserting the document without the `text` key and specifying the merge option.
Documents can be merged by setting the `merge` option. On conflict, new document keys will override old document keys.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const documents = [
{
id: "Document One",
text: "document one contents...",
id: "document_one",
new_key: "this will be a new key in document one",
random_key: "this will replace old random_key"
},
{
id: "Document Two",
text: "document two contents...",
id: "document_two",
new_key: "this will be a new key in document two",
random_key: "this will replace old random_key"
},
];
await collection.upsert_documents(documents, {
metdata: {
merge: true
}
merge: true
});
```
{% endtab %}
@@ -151,20 +161,17 @@ await collection.upsert_documents(documents, {
```python
documents = [
{
"id": "Document 1",
"random_key": "this will be NEW merged metadata for the document"
"id": "document_one",
"new_key": "this will be a new key in document one",
"random_key": "this will replace old random_key",
},
{
"id": "Document 2",
"random_key": "this will be NEW merged metadata for the document"
}
"id": "document_two",
"new_key": "this will be a new key in document two",
"random_key": "this will replace old random_key",
},
]
collection = Collection("test_collection")
await collection.upsert_documents(documents, {
"metadata": {
"merge": True
}
})
await collection.upsert_documents(documents, {"merge": True})
```
{% endtab %}
{% endtabs %}
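
The conflict rule for `merge` ("new document keys will override old document keys") behaves like a dictionary merge. The sketch below is plain Python rather than SDK code: `merge_document` is a hypothetical helper, and it merges only top-level keys, since the text does not say whether nested objects are merged recursively.

```python
def merge_document(old, new):
    """Shallow merge: keys in `new` override `old`; untouched old keys survive."""
    return {**old, **new}


stored = {
    "id": "document_one",
    "title": "Document One",
    "random_key": "here is some random data",
}
incoming = {
    "id": "document_one",
    "new_key": "this will be a new key in document one",
    "random_key": "this will replace old random_key",
}
merged = merge_document(stored, incoming)
```

Here `merged` keeps `title` from the stored document, gains `new_key`, and takes `random_key` from the incoming document.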
@@ -176,14 +183,12 @@ Documents can be retrieved using the `get_documents` method on the collection ob
{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = Collection("test_collection")
const documents = await collection.get_documents({limit: 100 })
```
{% endtab %}

{% tab title="Python" %}
```python
collection = Collection("test_collection")
documents = await collection.get_documents({ "limit": 100 })
```
{% endtab %}
@@ -198,14 +203,12 @@ The SDK supports limit-offset pagination and keyset pagination.
{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = pgml.newCollection("test_collection")
const documents = await collection.get_documents({ limit: 100, offset: 10 })
```
{% endtab %}

{% tab title="Python" %}
```python
collection = Collection("test_collection")
documents = await collection.get_documents({ "limit": 100, "offset": 10 })
```
{% endtab %}
@@ -216,41 +219,31 @@ documents = await collection.get_documents({ "limit": 100, "offset": 10 })
{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = Collection("test_collection")
const documents = await collection.get_documents({ limit: 100, last_row_id: 10 })
```
{% endtab %}

{% tab title="Python" %}
```python
collection = Collection("test_collection")
documents = await collection.get_documents({ "limit": 100, "last_row_id": 10 })
```
{% endtab %}
{% endtabs %}

The `last_row_id` can be taken from the `row_id` field in the returned document's dictionary.
The `last_row_id` can be taken from the `row_id` field in the returned document's dictionary. Keyset pagination does not currently work when specifying the `order_by` key.
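
To make the difference between the two pagination styles concrete, here is a sketch over an in-memory list. It is plain Python, not the SDK: the function names are hypothetical, and it assumes `row_id` values are positive and that rows come back in ascending `row_id` order. That ordering assumption is also a plausible reason a custom `order_by` does not combine with keyset pagination, though the source does not state the cause.

```python
rows = [{"row_id": i, "id": f"document_{i}"} for i in range(1, 8)]


def page_by_offset(rows, limit, offset):
    """Limit-offset: skip `offset` rows, then take `limit` rows."""
    return rows[offset:offset + limit]


def page_by_keyset(rows, limit, last_row_id=0):
    """Keyset: take `limit` rows whose row_id is greater than the last one seen."""
    return [r for r in rows if r["row_id"] > last_row_id][:limit]


def all_keyset_pages(rows, limit):
    """Walk every page, feeding the final row_id of each page into the next call."""
    pages, last = [], 0
    while True:
        page = page_by_keyset(rows, limit, last)
        if not page:
            return pages
        pages.append([r["row_id"] for r in page])
        last = page[-1]["row_id"]
```

With seven rows and a limit of three, `all_keyset_pages(rows, 3)` visits row ids 1 to 3, then 4 to 6, then 7.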

### Filtering Documents

Metadata and full text filtering are supported just like they are in vector recall.
Documents can be filtered by passing in the `filter` key.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = pgml.newCollection("test_collection")
const documents = await collection.get_documents({
limit: 100,
offset: 10,
limit: 10,
filter: {
metadata: {
id: {
$eq: 1
}
},
full_text_search: {
configuration: "english",
text: "Some full text query"
id: {
$eq: "document_one"
}
}
})
@@ -259,34 +252,25 @@ const documents = await collection.get_documents({

{% tab title="Python" %}
```python
collection = Collection("test_collection")
documents = await collection.get_documents({
"limit": 100,
"offset": 10,
"filter": {
"metadata": {
"id": {
"$eq": 1
}
documents = await collection.get_documents(
{
"limit": 100,
"filter": {
"id": {"$eq": "document_one"},
},
"full_text_search": {
"configuration": "english",
"text": "Some full text query"
}
}
})
)
```
{% endtab %}
{% endtabs %}
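
The shape of the `filter` value, `{key: {"$eq": value}}`, can be read as a per-key equality test. The sketch below models that reading in plain Python. `matches` is a hypothetical helper, not part of the SDK, and only `$eq` is modelled because it is the only operator that appears in this diff.

```python
def matches(document, filter_spec):
    """Return True when every `{key: {"$eq": value}}` clause holds for `document`."""
    for key, condition in filter_spec.items():
        for operator, expected in condition.items():
            if operator != "$eq":
                raise NotImplementedError(f"operator {operator!r} is not modelled here")
            if document.get(key) != expected:
                return False
    return True


documents = [
    {"id": "document_one", "title": "Document One"},
    {"id": "document_two", "title": "Document Two"},
]
selected = [d for d in documents if matches(d, {"id": {"$eq": "document_one"}})]
```

Only the first document passes the filter, so `selected` contains a single entry.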

### Sorting Documents

Documents can be sorted on any metadata key. Note that this does not currently work well with Keyset based pagination. If paginating and sorting, use Limit-Offset based pagination.
Documents can be sorted on any key. Note that this does not currently work well with Keyset based pagination. If paginating and sorting, use Limit-Offset based pagination.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = pgml.newCollection("test_collection")
const documents = await collection.get_documents({
limit: 100,
offset: 10,
@@ -299,7 +283,6 @@ const documents = await collection.get_documents({

{% tab title="Python" %}
```python
collection = Collection("test_collection")
documents = await collection.get_documents({
"limit": 100,
"offset": 10,
@@ -315,39 +298,24 @@ documents = await collection.get_documents({

Documents can be deleted with the `delete_documents` method on the collection object.

Metadata and full text filtering are supported just like they are in vector recall.

{% tabs %}
{% tab title="JavaScript" %}
```javascript
const collection = pgml.newCollection("test_collection")
const documents = await collection.delete_documents({
metadata: {
id: {
$eq: 1
}
},
full_text_search: {
configuration: "english",
text: "Some full text query"
}
})
```
{% endtab %}

{% tab title="Python" %}
```python
documents = await collection.delete_documents({
"metadata": {
"id": {
"$eq": 1
}
},
"full_text_search": {
"configuration": "english",
"text": "Some full text query"
documents = await collection.delete_documents(
{
"id": {"$eq": 1},
}
})
)
```
{% endtab %}
{% endtabs %}