- Notifications
You must be signed in to change notification settings - Fork328
Add pgml-rds-proxy#1412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Merged
Uh oh!
There was an error while loading.Please reload this page.
Merged
Add pgml-rds-proxy#1412
Changes fromall commits
Commits
Show all changes
12 commits Select commitHold shift + click to select a range
c06cb76
Remote CLI command
levkkdfd5d29
Generate dblink and pgcat config
levkked7e8e8
blog
levkk47b18e7
RDS proxy
levkkeb212ba
no config generation
levkkcbb9f08
Build image
levkkc0a7dd3
remove post
levkk2096722
spell
levkk8f0036a
version
levkkf154127
bad comment
levkk5ef7b5d
save
levkk3bfe951
space
levkkFile filter
Filter by extension
Conversations
Failed to load comments.
Loading
Uh oh!
There was an error while loading.Please reload this page.
Jump to
Jump to file
Failed to load files.
Loading
Uh oh!
There was an error while loading.Please reload this page.
Diff view
Diff view
There are no files selected for viewing
24 changes: 24 additions & 0 deletions.github/workflows/pgml-rds-proxy.yaml
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
name: Build and release pgml-rds-proxy Docker image | ||
on: | ||
workflow_dispatch: | ||
jobs: | ||
publish-proxy-docker-image: | ||
strategy: | ||
matrix: | ||
os: ["buildjet-4vcpu-ubuntu-2204"] | ||
runs-on: ${{ matrix.os }} | ||
defaults: | ||
run: | ||
working-directory: packages/pgml-rds-proxy | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Login to GitHub Container Registry | ||
uses: docker/login-action@v1 | ||
with: | ||
registry: ghcr.io | ||
username: ${{ github.actor }} | ||
password: ${{ secrets.GITHUB_TOKEN }} | ||
- name: Build and push Docker image | ||
run: | | ||
bash build-docker-image.sh |
10 changes: 10 additions & 0 deletionspackages/pgml-rds-proxy/Dockerfile
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
FROM ubuntu:22.04 | ||
ENV PGCAT_VERSION=2.0.0-alpha18 | ||
RUN apt update && \ | ||
apt install -y curl postgresql-client-common postgresql-client-14 && \ | ||
apt clean | ||
WORKDIR /pgml-rds-proxy | ||
COPY --chown=root:root download-pgcat.sh download-pgcat.sh | ||
COPY --chown=root:root run.sh run.sh | ||
RUN bash download-pgcat.sh | ||
ENTRYPOINT ["bash", "run.sh"] |
79 changes: 79 additions & 0 deletionspackages/pgml-rds-proxy/README.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
# pgml-rds-proxy | ||
A pgcat-based PostgreSQL proxy that allows to use PostgresML functions on managed PostgreSQL databases that may not have Internet access, like AWS RDS. | ||
## Getting started | ||
A Docker image is provided and is the easiest way to get started. To run the image, you can simply: | ||
```bash | ||
docker run \ | ||
-e DATABASE_URL=postgres://pg:ml@sql.cloud.postgresml.org:38042/pgml \ | ||
-p 6432:6432 \ | ||
ghcr.io/postgresml/pgml-rds-proxy:latest | ||
``` | ||
**Note:** Replace the `DATABASE_URL` above with the `DATABASE_URL` of your own PostgresML database. | ||
If you're running this on EC2, make sure the instance is placed inside the same VPC as your RDS database and that the RDS database is allowed to make outbound connections to the EC2 instance. | ||
The example above starts the proxy process on port 6432, so for your security group configuration, make sure the database can make outbound connections to the EC2 instance using TCP on port 6432. | ||
### Configure FDW | ||
We'll be using the Foreign Data Wrapper extension to connect from your RDS database to PostgresML, forwarding the connection through the proxy. If you're running the proxy on EC2, take note of the private IP | ||
or DNS entry of the instance. | ||
Before proceeding, make sure you have the following extensions installed into your RDS database: | ||
```postgresql | ||
CREATE EXTENSION IF NOT EXISTS dblink; | ||
CREATE EXTENSION IF NOT EXISTS postgres_fdw; | ||
``` | ||
Both of these require superuser, so make sure you're running these commands with a user that has the `rds_superuser` role. | ||
To create a foreign data wrapper connection, take your PostgresML host and port and replace the host with the private IP or DNS entry of the instance. | ||
```postgresql | ||
CREATE SERVER postgresml | ||
FOREIGN DATA WRAPPER postgres_fdw | ||
OPTIONS ( | ||
host '127.0.0.1', | ||
port '6432', | ||
dbname 'pgml' | ||
); | ||
``` | ||
Replace the value for `host` with the private IP or DNS entry of the EC2 instance running the proxy. Replace the `dbname` with the name of the database from your PostgresML database `DATABASE_URL`. | ||
#### User mapping | ||
PostgresML and the proxy requires authentication. For each user that will use the connection, create a user mapping, like so: | ||
```postgresql | ||
CREATE USER MAPPING | ||
FOR CURRENT_USER | ||
SERVER postgresml | ||
OPTIONS ( | ||
user 'pg', | ||
password 'ml' | ||
); | ||
``` | ||
Replace the values for `user` and `password` with the values from your PostgresML database `DATABASE_URL`. This example contains values that will only work with our demo server and aren't suitable for production. `CURRENT_USER` is a special PostgreSQL variable that's replaced by the name of the user running the command. If you want to create this mapping for other users, replace it with the name of the user/role. | ||
### Test the connection | ||
To test the connection, you can use `dblink`: | ||
``` | ||
SELECT | ||
* | ||
FROM | ||
dblink( | ||
'postgresml', | ||
'SELECT * FROM pgml.embed(''intfloat/e5-small'', ''embed this text'') AS embedding' | ||
) AS t1(embedding real[386]); | ||
``` | ||
If everything is configured correctly, you should see an array of 386 floating points, your first embedding generated using PostgresML on AWS RDS. Both dblink and the proxy makes efficient use of connections, so queries will be executed as fast as the network connection allows. |
15 changes: 15 additions & 0 deletionspackages/pgml-rds-proxy/build-docker-image.sh
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
#!/bin/bash | ||
# | ||
# | ||
# | ||
set -ex | ||
docker run --privileged --rm tonistiigi/binfmt --install all | ||
docker buildx create --use --name mybuilder || true | ||
docker buildx build \ | ||
--platform linux/amd64,linux/arm64 \ | ||
--tag ghcr.io/postgresml/pgml-rds-proxy:latest \ | ||
--progress plain \ | ||
--no-cache \ | ||
--push \ | ||
. |
30 changes: 30 additions & 0 deletionspackages/pgml-rds-proxy/download-pgcat.sh
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
#!/bin/bash | ||
# | ||
# Download the right version of pgcat for the architecture. | ||
# | ||
# Author: PostgresML <team@postgresml.org> | ||
# License: MIT | ||
# | ||
architecture=$(arch) | ||
name=$(uname) | ||
url="https://static.postgresml.org/packages/pgcat" | ||
version="$PGCAT_VERSION" | ||
bin_name="pgcat2-$version.bin" | ||
if [[ -z "$version" ]]; then | ||
echo "PGCAT_VERSION environment variable is not set" | ||
exit 1 | ||
fi | ||
if [[ "$architecture" == "aarch64" && "$name" == "Linux" ]]; then | ||
url="${url}/arm64/$bin_name" | ||
elif [[ "$architecture" == "x86_64" && "$name" == "Linux" ]]; then | ||
url="${url}/amd64/$bin_name" | ||
else | ||
echo "Unsupported platform: ${name} ${architecture}" | ||
exit 1 | ||
fi | ||
echo "Downloading pgcat from $url" | ||
curl -L -o /usr/local/bin/pgcat ${url} | ||
chmod +x /usr/local/bin/pgcat |
11 changes: 11 additions & 0 deletionspackages/pgml-rds-proxy/run.sh
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
#!/bin/bash | ||
# | ||
# Configure pgcat from a DATABASE_URL environment variable and run it as PID 1. | ||
# This will regenerate the configuration file every time so modifications to it won't be saved. | ||
# | ||
# If you want to modify the configuration file, generate it first and then run pgcat with `--config <path to file>` instead. | ||
# | ||
# Author: PostgresML <team@postgresml.org> | ||
# License: MIT | ||
# | ||
exec /usr/local/bin/pgcat --database-url ${DATABASE_URL} |
3 changes: 2 additions & 1 deletionpgml-sdks/pgml/Cargo.lock
Some generated files are not rendered by default. Learn more abouthow customized files appear on GitHub.
Oops, something went wrong.
Uh oh!
There was an error while loading.Please reload this page.
1 change: 1 addition & 0 deletionspgml-sdks/pgml/Cargo.toml
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -45,6 +45,7 @@ ctrlc = "3" | ||
inquire = "0.6" | ||
parking_lot = "0.12.1" | ||
once_cell = "1.19.0" | ||
url = "2.5.0" | ||
[features] | ||
default = [] | ||
56 changes: 55 additions & 1 deletionpgml-sdks/pgml/src/cli.rs
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
31 changes: 31 additions & 0 deletionspgml-sdks/pgml/src/sql/remote.sql
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
CREATE EXTENSION IF NOT EXISTS postgres_fdw; | ||
CREATE EXTENSION IF NOT EXISTS dblink; | ||
CREATE SERVER "{db_name}" | ||
FOREIGN DATA WRAPPER postgres_fdw | ||
OPTIONS ( | ||
host '{host}', | ||
port '{port}', | ||
dbname '{database_name}' | ||
); | ||
CREATE USER MAPPING | ||
FOR CURRENT_USER | ||
SERVER "{db_name}" | ||
OPTIONS ( | ||
user '{user}', | ||
password '{password}' | ||
); | ||
SELECT * FROM dblink( | ||
'{db_name}', | ||
'SELECT pgml.embed(''intfloat/e5-small'', ''test postgresml embedding'') AS embedding' | ||
) AS t(embedding real[386]); | ||
CREATE FUNCTION pgml_embed_e5_small(text) RETURNS real[386] AS $$ | ||
SELECT * FROM dblink( | ||
'{db_name}', | ||
'SELECT pgml.embed(''intfloat/e5-small'', ''' || $1 || ''') AS embedding' | ||
) AS t(embedding real[386]); | ||
$$ LANGUAGE SQL; |
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.