Readme update in progress #586
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -30,21 +30,228 @@ | ||
</a> | ||
</p> | ||
## Table of contents | ||
- [Introduction](#introduction) | ||
- [Installation](#installation) | ||
- [Getting started](#getting-started) | ||
- [Natural Language Processing](#nlp-tasks) | ||
- [Regression](#regression) | ||
- [Classification](#classification) | ||
## Introduction | ||
PostgresML is a PostgreSQL extension that lets you train models and run inference on text and tabular data using SQL queries. With PostgresML, you can integrate machine learning models directly into your PostgreSQL database and apply state-of-the-art algorithms to text and tabular data efficiently. | ||
### Text data | ||
- Perform natural language processing (NLP) tasks like sentiment analysis, question answering, translation, summarization and text generation | ||
- Access thousands of state-of-the-art language models like GPT-2, GPT-J and GPT-Neo from the :hugging_face: Hugging Face model hub | ||
- Fine-tune large language models (LLMs) on your own text data for different tasks | ||
**Translation** | ||
<table> | ||
<tr> | ||
<td>SQL Query</td> | ||
<td>Result </td> | ||
</tr> | ||
<tr> | ||
<td> | ||
```sql | ||
SELECT pgml.transform( | ||
'translation_en_to_fr', | ||
inputs => ARRAY[ | ||
'Welcome to the future!', | ||
'Where have you been all this time?' | ||
] | ||
) AS french; | ||
``` | ||
</td> | ||
<td> | ||
```sql | ||
french | ||
------------------------------------------------------------ | ||
[ | ||
{"translation_text": "Bienvenue à l'avenir!"}, | ||
{"translation_text": "Où êtes-vous allé tout ce temps?"} | ||
] | ||
``` | ||
</td> | ||
</tr> | ||
</table> | ||
**Sentiment Analysis** | ||
<table> | ||
<tr> | ||
<td>SQL Query</td> | ||
<td>Result </td> | ||
</tr> | ||
<tr> | ||
<td> | ||
```sql | ||
SELECT pgml.transform( | ||
'{"model": "roberta-large-mnli"}'::JSONB, | ||
inputs => ARRAY | ||
[ | ||
'I love how amazingly simple ML has become!', | ||
'I hate doing mundane and thankless tasks. ☹️' | ||
] | ||
) AS positivity; | ||
``` | ||
</td> | ||
<td> | ||
```sql | ||
positivity | ||
------------------------------------------------------ | ||
[ | ||
{"label": "NEUTRAL", "score": 0.8143417835235596}, | ||
{"label": "NEUTRAL", "score": 0.7637073993682861} | ||
] | ||
``` | ||
</td> | ||
</tr> | ||
</table> | ||
### Tabular data | ||
- [47+ classification and regression algorithms](https://postgresml.org/docs/guides/training/algorithm_selection) | ||
- [8-40x faster inference than HTTP-based model serving](https://postgresml.org/blog/postgresml-is-8x-faster-than-python-http-microservices) | ||
- [Millions of transactions per second](https://postgresml.org/blog/scaling-postgresml-to-one-million-requests-per-second) | ||
- [Horizontal scalability](https://github.com/postgresml/pgcat) | ||
**Training a classification model** | ||
<table> | ||
<tr> | ||
<td> Training </td> | ||
<td> Inference </td> | ||
</tr> | ||
<tr> | ||
<td> | ||
```sql | ||
-- Named arguments must come after all positional arguments in PostgreSQL. | ||
SELECT * FROM pgml.train( | ||
    'Handwritten Digit Image Classifier', | ||
    'classification', | ||
    'pgml.digits', | ||
    'target', | ||
    algorithm => 'xgboost' | ||
); | ||
``` | ||
</td> | ||
<td> | ||
```sql | ||
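-- Use the name of a project you have trained, and pass | ||
-- one value per feature column the model was trained on. | ||
SELECT pgml.predict( | ||
    'My Classification Project', | ||
    ARRAY[0.1, 2.0, 5.0] | ||
) AS prediction; | ||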
``` | ||
</td> | ||
</tr> | ||
</table> | ||
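The training and inference calls above can be combined into one end-to-end run. The following is a minimal sketch, not a definitive recipe. It assumes the sample table is created with `pgml.load_dataset('digits')` and that `pgml.digits` stores its feature vector in a column named `image` next to `target`; both names come from the PostgresML sample dataset rather than from this README, so treat them as assumptions.

```sql
-- Assumption: loading the sample dataset creates the pgml.digits table.
SELECT * FROM pgml.load_dataset('digits');

-- Train an XGBoost classifier under the project name used above.
-- Named arguments are placed last, as PostgreSQL requires.
SELECT * FROM pgml.train(
    'Handwritten Digit Image Classifier',
    'classification',
    'pgml.digits',
    'target',
    algorithm => 'xgboost'
);

-- Compare predictions against the known labels for a few rows.
-- Assumption: the feature vector lives in a column named "image".
SELECT
    target,
    pgml.predict('Handwritten Digit Image Classifier', image) AS prediction
FROM pgml.digits
LIMIT 10;
```

Predicting on rows taken from the training table keeps the feature vector the same length as the one the model was trained on, which a hand-written array may not be.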
## Installation | ||
A PostgresML installation consists of three parts: a PostgreSQL database, a Postgres extension for machine learning, and a dashboard app. The extension provides all the machine learning functionality and can be used on its own from any SQL IDE. The dashboard app provides an easy-to-use interface for writing SQL notebooks and for running and tracking ML experiments and models. | ||
### Docker | ||
Step 1: Clone this repository | ||
```bash | ||
git clone git@github.com:postgresml/postgresml.git | ||
``` | ||
Step 2: Start the dockerized services. PostgresML runs on port 5433 to avoid a conflict with any Postgres instance you already have running. Docker installation instructions are available [here](https://docs.docker.com/desktop/). | ||
```bash | ||
cd postgresml | ||
docker-compose up | ||
``` | ||
Step 3: Connect to the Postgres database with PostgresML enabled, using a SQL IDE or [`psql`](https://www.postgresql.org/docs/current/app-psql.html) | ||
```bash | ||
psql postgres://postgres@localhost:5433/pgml_development | ||
``` | ||
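Once connected, a quick query can confirm that the extension is available. This is a sketch under one assumption: that the installed extension exposes a `pgml.version()` function. If your build does not, the second statement relies only on the standard PostgreSQL catalog.

```sql
-- Assumption: pgml.version() is provided by the installed extension.
SELECT pgml.version();

-- Standard PostgreSQL catalog lookup; returns one row if the extension is installed.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pgml';
```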
### Free trial | ||
If you want to try PostgresML without the hassle of Docker, sign up for a free account [here](https://postgresml.org/signup). We provide 5 GiB of disk space on a shared tenant. | ||
## Getting Started | ||
### IDE support | ||
- DBeaver | ||
- DataGrip | ||
- Tableau | ||
- Power BI | ||
- Jupyter | ||
- VS Code | ||
## NLP Tasks | ||
- Text Classification | ||
- Token Classification | ||
- Table Question Answering | ||
- Question Answering | ||
- Zero-Shot Classification | ||
- Translation | ||
- Summarization | ||
- Conversational | ||
- Text Generation | ||
- Text2Text Generation | ||
- Fill-Mask | ||
- Sentence Similarity | ||
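Each task in the list above is selected through the first argument of `pgml.transform`, in the same way as the translation example in the introduction. As a hedged sketch, the call below requests text classification by task name and leaves model selection to the library default. The `task` argument name and the `'text-classification'` identifier follow Hugging Face pipeline naming and are assumptions, not something this README states.

```sql
-- Assumption: the first parameter of pgml.transform is named "task" and
-- accepts a Hugging Face pipeline task name.
SELECT pgml.transform(
    task   => 'text-classification',
    inputs => ARRAY[
        'I love how amazingly simple ML has become!',
        'I hate doing mundane and thankless tasks.'
    ]
) AS positivity;
```

To pin a specific model instead of the default, pass a JSONB value with a `"model"` key as the first argument, as the sentiment analysis example in the introduction does.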
## Regression | ||
## Classification | ||
## Applications | ||
### Text | ||
- AI writing partner | ||
- Chatbot for customer support | ||
- Social media post analysis | ||
- Fintech | ||
- Healthcare | ||
- Insurance | ||
### Tabular data | ||
- Fraud detection | ||
- Recommendation | ||
## Benefits | ||
- Access to Hugging Face models - a little more about open source language models | ||
- Ease of fine-tuning and why | ||
- Rust-based extension and its benefits | ||
- Problems with HTTP serving and how PostgresML enables microsecond latency | ||
- PgCat for horizontal scaling | ||
## Concepts | ||
- Database | ||
- Extension | ||
- ML on text data | ||
  - Transform operation | ||
  - Fine-tune operation | ||
- ML on tabular data | ||
  - Train operation | ||
  - Deploy operation | ||
  - Predict operation | ||
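Of the operations listed above, deploy is the only one not shown elsewhere in this README. It decides which trained model a project serves for `pgml.predict`. A minimal sketch, assuming a `pgml.deploy` function with a `strategy` argument and a `'best_score'` strategy name (both taken from the PostgresML documentation rather than this README), applied to the project trained in the introduction:

```sql
-- Assumption: promote the best-scoring model of the project to serve predictions.
SELECT * FROM pgml.deploy(
    'Handwritten Digit Image Classifier',
    strategy => 'best_score'
);
```

Later `pgml.predict` calls for that project would then use the deployed model.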
## Deployment | ||
- Docker images | ||
  - CPU | ||
  - GPU | ||
- Data persistence on local/EC2/EKS | ||
- Deployment on AWS using Docker images | ||
## What's in the box | ||
See the documentation for a complete **[list of functionality](https://postgresml.org/)**. | ||
@@ -73,35 +280,6 @@ Since your data never leaves the database, you retain the speed, reliability and | ||
### Open source | ||
We're building on the shoulders of giants. These machine learning libraries and Postgres have received extensive academic and industry use, and we'll continue their tradition to build with the community. Licensed under MIT. | ||
## Frequently Asked Questions (FAQs) | ||