Commit e22134f

Moloejoe authored and gitbook-bot committed
GITBOOK-72: Fix broken Pipelines and Search and add a bit more info
1 parent 14a5976 commit e22134f

7 files changed (+153, -122 lines changed)

‎pgml-docs/docs/guides/deploying-postgresml/self-hosting/README.md‎

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Finally, you can install PostgresML:
 sudo apt install -y postgresml-14
 ```

-Ubuntu 22.04 ships with PostgreSQL 14, but if you have a different version installed on your system, just change `14` in the package name to your Postgres version. We currently support all versions supported by the community: Postgres 12 through 16.
+Ubuntu 22.04 ships with PostgreSQL 14, but if you have a different version installed on your system, just change `14` in the package name to your Postgres version. We currently support all versions supported by the community: Postgres 12 through 15.

 ### Validate your installation

‎pgml-docs/docs/guides/deploying-postgresml/self-hosting/building-from-source.md‎

Lines changed: 2 additions & 2 deletions
@@ -40,12 +40,12 @@ For a typical deployment in production, you would need to compile and install th

 #### Install pgrx

-`pgrx` is open source and available from crates.io. We are currently using the `0.11.0` version. It's important that your `pgrx` version matches what we're using, since there are some hard dependencies between our code and `pgrx`.
+`pgrx` is open source and available from crates.io. We are currently using the `0.10.0` version. It's important that your `pgrx` version matches what we're using, since there are some hard dependencies between our code and `pgrx`.

 To install `pgrx`, simply run:

 ```
-cargo install cargo-pgrx --version "0.11.0"
+cargo install cargo-pgrx --version "0.10.0"
 ```

 Before using `pgrx`, it needs to be initialized against the installed version of PostgreSQL. In this example, we'll be using the Ubuntu 22.04 default PostgreSQL 14 installation:

‎pgml-docs/docs/guides/developer-docs/installation.md‎

Lines changed: 0 additions & 1 deletion
@@ -64,7 +64,6 @@ To install the necessary Python packages into a virtual environment, use the `vi
 virtualenv pgml-venv && \
 source pgml-venv/bin/activate && \
 pip install -r requirements.txt && \
-pip install -r requirements-autogptq.txt && \
 pip install -r requirements-xformers.txt --no-dependencies
 ```

‎pgml-docs/docs/guides/getting-started/sign-up.md‎

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
 1. Go to [https://postgresml.org/signup](https://postgresml.org/signup)
 2. Sign up using your email or using Google or Github authentication
 3. Login using your account
+4. [data-pre-processing.md](../machine-learning/supervised-learning/data-pre-processing.md "mention")



‎pgml-docs/docs/guides/machine-learning/supervised-learning/data-pre-processing.md‎

Lines changed: 4 additions & 4 deletions
@@ -25,9 +25,9 @@ In this example:

 There are 3 steps to preprocessing data:

-* [Encoding](#categorical-encodings) categorical values into quantitative values
-* [Imputing](#imputing-missing-values) NULL values to some quantitative value
-* [Scaling](#scaling-values) quantitative values across all variables to similar ranges
+* [Encoding](../../../../../pgml-dashboard/content/docs/guides/training/preprocessing.md#categorical-encodings) categorical values into quantitative values
+* [Imputing](../../../../../pgml-dashboard/content/docs/guides/training/preprocessing.md#imputing-missing-values) NULL values to some quantitative value
+* [Scaling](../../../../../pgml-dashboard/content/docs/guides/training/preprocessing.md#scaling-values) quantitative values across all variables to similar ranges

 These preprocessing steps may be specified on a per-column basis to the [train()](../../../../../docs/guides/training/overview) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.
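
The imputing and scaling steps named in this hunk can be illustrated outside the database. The sketch below is plain Python, not PostgresML code. The function names `impute_mean` and `min_max_scale` are hypothetical, and mean imputation and min-max scaling are assumed here purely as example strategies; the diff does not say which strategies PostgresML provides.

```python
from typing import Optional, Sequence


def impute_mean(column: Sequence[Optional[float]]) -> list[float]:
    """Replace None (NULL) entries with the mean of the non-null values."""
    present = [v for v in column if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no non-null values")
    mean = sum(present) / len(present)
    return [mean if v is None else float(v) for v in column]


def min_max_scale(column: Sequence[float]) -> list[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no range information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical column with one missing value.
ages = [20.0, None, 40.0, 30.0]
imputed = impute_mean(ages)       # the None is replaced by the mean of 20, 40, 30
scaled = min_max_scale(imputed)   # all values now lie between 0 and 1
```

Imputing before scaling matters in this sketch: `min` and `max` cannot be computed over a column that still contains `None`.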

@@ -71,7 +71,7 @@ Encoding categorical variables is an O(N log(M)) where N is the number of rows,
 | **name** | **description** |
 | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
 | `none` | **Default** - Casts the variable to a 32-bit floating point representation compatible with numerics. This is the default for non-`TEXT` values. |
-| `target` | Encodes the variable as the average value of the target label for all members of the category. This is the default for `TEXT` variables. |
+| `target` | Encodes the variable as the mean value of the target label for all members of the category. This is the default for `TEXT` variables. |
 | `one_hot` | Encodes the variable as multiple independent boolean columns. |
 | `ordinal` | Encodes the variable as integer values provided by their position in the input array. NULLS are always 0. |
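
To make the three non-default encodings in the table concrete, here is a minimal plain-Python sketch of each one. It is not the PostgresML implementation. The function names are hypothetical, and the 1-based positions in `ordinal_encode` are an assumption made so that 0 stays reserved for NULL, as the table describes.

```python
from collections import defaultdict
from typing import Optional, Sequence


def target_encode(categories: Sequence[str], targets: Sequence[float]) -> list[float]:
    """Replace each category with the mean target value of its members."""
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories]


def one_hot_encode(categories: Sequence[str], vocabulary: Sequence[str]) -> list[list[bool]]:
    """Produce one independent boolean column per vocabulary entry."""
    return [[cat == v for v in vocabulary] for cat in categories]


def ordinal_encode(categories: Sequence[Optional[str]], vocabulary: Sequence[str]) -> list[int]:
    """Map each value to its position in the vocabulary; None (NULL) maps to 0."""
    position = {v: i + 1 for i, v in enumerate(vocabulary)}  # assumed 1-based
    return [0 if cat is None else position[cat] for cat in categories]


# Hypothetical data: a TEXT column and a numeric target label.
colors = ["red", "blue", "red", "blue"]
prices = [10.0, 20.0, 30.0, 40.0]

by_target = target_encode(colors, prices)
by_one_hot = one_hot_encode(colors, ["red", "blue"])
by_ordinal = ordinal_encode(["blue", None, "red"], ["red", "blue"])
```

Target encoding keeps a single column per variable, while one-hot encoding adds a column for every distinct category.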

‎pgml-docs/docs/guides/sdks/pipelines.md‎

Lines changed: 66 additions & 32 deletions
@@ -17,7 +17,7 @@ model = Model()

 {% tab title="JavaScript" %}
 ```javascript
-model = pgml.newModel()
+const model = pgml.newModel()
 ```
 {% endtab %}
 {% endtabs %}
@@ -36,9 +36,10 @@ model = Model(

 {% tab title="JavaScript" %}
 ```javascript
-model = pgml.newModel(
-  name="hkunlp/instructor-base",
-  parameters={instruction: "Represent the Wikipedia document for retrieval:"}
+const model = pgml.newModel(
+  "hkunlp/instructor-base",
+  "pgml",
+  { instruction: "Represent the Wikipedia document for retrieval:" }
 )
 ```
 {% endtab %}
@@ -55,7 +56,7 @@ model = Model(name="text-embedding-ada-002", source="openai")

 {% tab title="JavaScript" %}
 ```javascript
-model = pgml.newModel(name="text-embedding-ada-002", source="openai")
+const model = pgml.newModel("text-embedding-ada-002", "openai")
 ```
 {% endtab %}
 {% endtabs %}
@@ -75,7 +76,7 @@ splitter = Splitter()

 {% tab title="JavaScript" %}
 ```javascript
-splitter = pgml.newSplitter()
+const splitter = pgml.newSplitter()
 ```
 {% endtab %}
 {% endtabs %}
@@ -95,8 +96,8 @@ splitter = Splitter(
 {% tab title="JavaScript" %}
 ```javascript
 splitter = pgml.newSplitter(
-  name="recursive_character",
-  parameters={chunk_size: 1500, chunk_overlap: 40}
+  "recursive_character",
+  {chunk_size: 1500, chunk_overlap: 40}
 )
 ```
 {% endtab %}
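
The `chunk_size` and `chunk_overlap` parameters passed to the splitter above control how long each chunk is and how many characters consecutive chunks share. The following is a simplified, fixed-window sketch of that idea in plain Python. It is not the `recursive_character` algorithm and not SDK code; `chunk_text` is a hypothetical helper, and only the two parameter names come from the diff.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Tiny values make the overlap visible; the diff uses 1500 and 40.
chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
```

Each chunk after the first repeats the last `chunk_overlap` characters of the previous one, so text that straddles a boundary still appears intact in at least one chunk when it is shorter than the overlap.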
@@ -120,9 +121,9 @@ await collection.add_pipeline(pipeline)

 {% tab title="JavaScript" %}
 ```javascript
-model = pgml.newModel()
-splitter = pgml.newSplitter()
-pipeline = pgml.newPipeline("test_pipeline", model, splitter)
+const model = pgml.newModel()
+const splitter = pgml.newSplitter()
+const pipeline = pgml.newPipeline("test_pipeline", model, splitter)
 await collection.add_pipeline(pipeline)
 ```
 {% endtab %}
@@ -151,17 +152,51 @@ await collection.add_pipeline(pipeline)

 {% tab title="JavaScript" %}
 ```javascript
-model = pgml.newModel()
-splitter = pgml.newSplitter()
-pipeline = pgml.newPipeline("test_pipeline", model, splitter, {
-  "full_text_search": {
-    active: True,
-    configuration: "english"
+const model = pgml.newModel()
+const splitter = pgml.newSplitter()
+const pipeline = pgml.newPipeline("test_pipeline", model, splitter, {
+  full_text_search: {
+    active: true,
+    configuration: "english"
+  }
+})
+await collection.add_pipeline(pipeline)
+```
+{% endtab %}
+{% endtabs %}
+
+### Customizing the HNSW Index
+
+By default the SDK uses HNSW indexes to efficiently perform vector recall. The default HNSW index sets `m` to 16 and `ef_construction` to 64. These defaults can be customized when the Pipeline is created.
+
+{% tabs %}
+{% tab title="Python" %}
+```python
+model = Model()
+splitter = Splitter()
+pipeline = Pipeline("test_pipeline", model, splitter, {
+  "hnsw": {
+    "m": 16,
+    "ef_construction": 64
   }
 })
 await collection.add_pipeline(pipeline)
 ```
 {% endtab %}
+
+{% tab title="JavaScript" %}
+```javascript
+const model = pgml.newModel()
+const splitter = pgml.newSplitter()
+const pipeline = pgml.newPipeline("test_pipeline", model, splitter, {
+  hnsw: {
+    m: 16,
+    ef_construction: 64
+  }
+})
+await collection.add_pipeline(pipeline)
+```
+{% endtab %}
 {% endtabs %}
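
When only one HNSW setting needs to change, it can help to start from the documented defaults and overlay the override. The helper below is a hypothetical convenience written for this note, not part of the SDK; only the keys `m` and `ef_construction` and their default values (16 and 64) come from the diff. The resulting dictionary has the same shape as the fourth argument to `Pipeline` shown above.

```python
from typing import Optional

# Defaults as stated in the documentation text of this commit.
DEFAULT_HNSW = {"m": 16, "ef_construction": 64}


def hnsw_options(overrides: Optional[dict] = None) -> dict:
    """Return the default HNSW settings with any overrides applied."""
    options = dict(DEFAULT_HNSW)
    for key, value in (overrides or {}).items():
        if key not in DEFAULT_HNSW:
            raise KeyError(f"unknown HNSW option: {key}")
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
        options[key] = value
    return options


# Raise m while keeping the default ef_construction.
pipeline_options = {"hnsw": hnsw_options({"m": 32})}
```

As a general property of HNSW indexes, larger `m` and `ef_construction` values tend to improve recall at the cost of build time and memory, so it is worth measuring before moving far from the defaults.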

 ## Searching with Pipelines
@@ -179,19 +214,17 @@ results = await collection.query().vector_recall("Why is PostgresML the best?",

 {% tab title="JavaScript" %}
 ```javascript
-pipeline = pgml.newPipeline("test_pipeline")
-collection = pgml.newCollection("test_collection")
-results = await collection.query().vector_recall("Why is PostgresML the best?", pipeline).fetch_all()
+const pipeline = pgml.newPipeline("test_pipeline")
+const collection = pgml.newCollection("test_collection")
+const results = await collection.query().vector_recall("Why is PostgresML the best?", pipeline).fetch_all()
 ```
 {% endtab %}
 {% endtabs %}
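
Conceptually, vector recall ranks stored chunks by how close their embeddings are to the embedding of the query. The sketch below shows an exact, brute-force version of that ranking in plain Python, which is the result an HNSW index approximates more efficiently. It is not SDK code: the function names, the toy two-dimensional embeddings, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_recall(query: Sequence[float],
                       chunks: dict[str, Sequence[float]],
                       limit: int = 3) -> list[tuple[float, str]]:
    """Score every chunk against the query and return the best `limit` matches."""
    scored = [(cosine_similarity(query, emb), text) for text, emb in chunks.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Toy embeddings; real ones come from the pipeline's model.
stored = {
    "PostgresML runs models inside Postgres": [1.0, 0.0],
    "An unrelated sentence": [0.0, 1.0],
    "A partly related sentence": [1.0, 1.0],
}
results = brute_force_recall([1.0, 0.0], stored, limit=2)
```

The brute-force scan touches every stored embedding, which is why an approximate index becomes attractive as a collection grows.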

-
+## **Disable a Pipeline**

 Pipelines can be disabled or removed to prevent them from running automatically when documents are upserted.

-## **Disable a Pipeline**
-
 {% tabs %}
 {% tab title="Python" %}
 ```python
@@ -203,8 +236,8 @@ await collection.disable_pipeline(pipeline)

 {% tab title="JavaScript" %}
 ```javascript
-pipeline = pgml.newPipeline("test_pipeline")
-collection = pgml.newCollection("test_collection")
+const pipeline = pgml.newPipeline("test_pipeline")
+const collection = pgml.newCollection("test_collection")
 await collection.disable_pipeline(pipeline)
 ```
 {% endtab %}
@@ -214,6 +247,8 @@ Disabling a Pipeline prevents it from running automatically, but leaves all chun

 ## **Enable a Pipeline**

+Disabled pipelines can be re-enabled.
+
 {% tabs %}
 {% tab title="Python" %}
 ```python
@@ -225,8 +260,8 @@ await collection.enable_pipeline(pipeline)

 {% tab title="JavaScript" %}
 ```javascript
-pipeline = pgml.newPipeline("test_pipeline")
-collection = pgml.newCollection("test_collection")
+const pipeline = pgml.newPipeline("test_pipeline")
+const collection = pgml.newCollection("test_collection")
 await collection.enable_pipeline(pipeline)
 ```
 {% endtab %}
@@ -246,11 +281,10 @@ await collection.remove_pipeline(pipeline)
 {% endtab %}

 {% tab title="JavaScript" %}
-```javascript
-pipeline = pgml.newPipeline("test_pipeline")
-collection = pgml.newCollection("test_collection")
-await collection.remove_pipeline(pipeline)
-```
+<pre class="language-javascript"><code class="lang-javascript">const pipeline = pgml.newPipeline("test_pipeline")
+<strong>const collection = pgml.newCollection("test_collection")
+</strong>await collection.remove_pipeline(pipeline)
+</code></pre>
 {% endtab %}
 {% endtabs %}
