Commit 2f33c43

Added protobuf for finbert support and text-classification readme in progress
1 parent cb9b2d4 commit 2f33c43

4 files changed: +109 -24 lines changed

‎README.md

Lines changed: 107 additions & 22 deletions
@@ -49,7 +49,7 @@ PostgresML is a PostgreSQL extension that enables you to perform ML training and
 
 **Translation**
 
-*SQL Query*
+*SQL query*
 
 ```sql
 SELECT pgml.transform(
@@ -62,7 +62,7 @@ SELECT pgml.transform(
 ```
 *Result*
 
-```bash
+```json
 french
 ------------------------------------------------------------
 
@@ -75,27 +75,24 @@ SELECT pgml.transform(
 
 
 **Sentiment Analysis**
-*SQL Query*
+*SQL query*
 
 ```sql
 SELECT pgml.transform(
-
-    '{"model": "roberta-large-mnli"}'::JSONB,
-    inputs => ARRAY
-    [
+    task   => 'text-classification',
+    inputs => ARRAY[
         'I love how amazingly simple ML has become!',
         'I hate doing mundane and thankless tasks. ☹️'
     ]
-
 ) AS positivity;
 ```
 *Result*
-```bash
+```json
 positivity
 ------------------------------------------------------
 [
-    {"label":"NEUTRAL","score": 0.8143417835235596},
-    {"label":"NEUTRAL","score": 0.7637073993682861}
+    {"label":"POSITIVE","score":0.9995759129524232},
+    {"label":"NEGATIVE","score":0.9903519749641418}
 ]
 ```
 
@@ -144,7 +141,7 @@ cd postgresml
 docker-compose up
 ```
 
-Step 3: Connect to PostgresDB with PostgresML enabled using a SQL IDE or [`psql`](https://www.postgresql.org/docs/current/app-psql.html)
+Step 3: Connect to PostgresDB with PostgresML enabled using a SQL IDE or <a href="https://www.postgresql.org/docs/current/app-psql.html" target="_blank">psql</a>
 ```bash
 postgres://postgres@localhost:5433/pgml_development
 ```
@@ -165,18 +162,106 @@ If you want to check out the functionality without the hassle of Docker please g
 
 ### Option 2
 - Use any of these popular tools to connect to PostgresML and write SQL queries
-  - [Apache Superset](https://superset.apache.org/)
-  - [DBeaver](https://dbeaver.io/)
-  - [Data Grip](https://www.jetbrains.com/datagrip/)
-  - [Postico 2](https://eggerapps.at/postico2/)
-  - [Popsql](https://popsql.com/)
-  - [Tableau](https://www.tableau.com/)
-  - [Power BI](https://powerbi.microsoft.com/en-us/)
-  - [Jupyter](https://jupyter.org/)
-  - [VSCode](https://code.visualstudio.com/)
+  - <a href="https://superset.apache.org/" target="_blank">Apache Superset</a>
+  - <a href="https://dbeaver.io/" target="_blank">DBeaver</a>
+  - <a href="https://www.jetbrains.com/datagrip/" target="_blank">Data Grip</a>
+  - <a href="https://eggerapps.at/postico2/" target="_blank">Postico 2</a>
+  - <a href="https://popsql.com/" target="_blank">Popsql</a>
+  - <a href="https://www.tableau.com/" target="_blank">Tableau</a>
+  - <a href="https://powerbi.microsoft.com/en-us/" target="_blank">PowerBI</a>
+  - <a href="https://jupyter.org/" target="_blank">Jupyter</a>
+  - <a href="https://code.visualstudio.com/" target="_blank">VSCode</a>
 
 ## NLP Tasks
-- Text Classification
+PostgresML integrates 🤗 Hugging Face Transformers to bring state-of-the-art NLP models into the data layer. There are tens of thousands of pre-trained models with pipelines to turn raw text in your database into useful results. Many state-of-the-art deep learning architectures have been published and made available on the Hugging Face <a href="https://huggingface.co/models" target="_blank">model hub</a>.
+
+You can call different NLP tasks and customize them using the following SQL query.
+
+```sql
+SELECT pgml.transform(
+    task   => TEXT OR JSONB,     -- Pipeline initializer arguments
+    inputs => TEXT[] OR BYTEA[], -- inputs for inference
+    args   => JSONB              -- (optional) arguments to the pipeline.
+)
+```
+### Text Classification
+
+Text classification involves assigning a label or category to a given text. Common use cases include sentiment analysis, natural language inference, and the assessment of grammatical correctness.
+![text classification](pgml-docs/docs/images/text-classification.png)
+
+*Basic SQL query*
+```sql
+SELECT pgml.transform(
+    task   => 'text-classification',
+    inputs => ARRAY[
+        'I love how amazingly simple ML has become!',
+        'I hate doing mundane and thankless tasks. ☹️'
+    ]
+) AS positivity;
+```
+*Result*
+```json
+positivity
+------------------------------------------------------
+[
+    {"label":"POSITIVE","score":0.9995759129524232},
+    {"label":"NEGATIVE","score":0.9903519749641418}
+]
+```
+
+A fine-tuned checkpoint of DistilBERT-base-uncased, tuned on the Stanford Sentiment Treebank (SST-2), is used as the default <a href="https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english" target="_blank">model</a> for text classification.
+
+*SQL query using specific model*
+
+To use one of the over 19,000 models available on Hugging Face, include the name of the desired model and its associated task as a JSONB object in the SQL query. For example, if you want to use a RoBERTa <a href="https://huggingface.co/models?pipeline_tag=text-classification" target="_blank">model</a> that was trained on around 40,000 English tweets and has POS (positive), NEG (negative), and NEU (neutral) labels for its classes, include this information in the JSONB object when making your query.
+
+```sql
+SELECT pgml.transform(
+    inputs => ARRAY[
+        'I love how amazingly simple ML has become!',
+        'I hate doing mundane and thankless tasks. ☹️'
+    ],
+    task => '{"task": "text-classification",
+              "model": "finiteautomata/bertweet-base-sentiment-analysis"
+             }'::JSONB
+) AS positivity;
+```
+*Result*
+```json
+positivity
+-----------------------------------------------
+[
+    {"label":"POS","score":0.992932200431826},
+    {"label":"NEG","score":0.975599765777588}
+]
+```
+
+*SQL query using models from specific industry*
+
+By selecting a model that has been specifically designed for a particular industry, you can achieve more accurate and relevant text classification. An example of such a model is <a href="https://huggingface.co/ProsusAI/finbert" target="_blank">FinBERT</a>, a pre-trained NLP model that has been optimized for analyzing sentiment in financial text. FinBERT was created by training the BERT language model on a large financial corpus and fine-tuning it to classify financial sentiment. When using FinBERT, the model provides softmax outputs for three labels: positive, negative, and neutral.
+
+```sql
+SELECT pgml.transform(
+    inputs => ARRAY[
+        'Stocks rallied and the British pound gained.',
+        'Stocks making the biggest moves midday: Nvidia, Palantir and more'
+    ],
+    task => '{"task": "text-classification",
+              "model": "ProsusAI/finbert"
+             }'::JSONB
+) AS market_sentiment;
+```
+
+*Result*
+```json
+
+market_sentiment
+------------------------------------------------------
+[
+    {"label":"positive","score":0.8983612656593323},
+    {"label":"neutral","score":0.8062630891799927}
+]
+```
 - Token Classification
 - Table Question Answering
 - Question Answering
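
Reviewer note: the README text in this hunk says `task` may be either a plain task name (TEXT) or a JSONB object that also names a model, and that results come back as a JSON array of label/score objects. A minimal Python sketch of how a client might prepare both call forms and read the result follows. The helper names `build_transform_query` and `pair_results`, the `%s` placeholder style (as used by psycopg-like drivers), the explicit casts, and the assumption that results are returned in input order are all illustrative and not taken from the commit; the sketch only builds strings and has not been run against a database.

```python
import json


def build_transform_query(task, inputs, model=None):
    """Return (sql, params) for a pgml.transform call.

    Hypothetical helper. Without `model`, the task is passed as TEXT;
    with `model`, task and model are packed into a JSONB object, as in
    the README examples.
    """
    if model is None:
        task_sql = "%s::TEXT"
        task_param = task
    else:
        task_sql = "%s::JSONB"
        task_param = json.dumps({"task": task, "model": model})
    sql = (
        "SELECT pgml.transform("
        f"task => {task_sql}, "
        "inputs => %s::TEXT[]"
        ") AS result;"
    )
    return sql, (task_param, list(inputs))


def pair_results(inputs, result_json):
    """Pair each input text with its label and score.

    Assumes the result array is in the same order as the inputs.
    """
    results = json.loads(result_json)
    return [(text, r["label"], r["score"]) for text, r in zip(inputs, results)]
```

With a real driver, the returned `sql` and `params` would be handed to something like `cursor.execute(sql, params)`; the casts are there because `pgml.transform` accepts more than one argument type for `task` and `inputs`, so untyped parameters could be ambiguous.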

‎docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ services:
       context: ./pgml-extension/
       dockerfile: Dockerfile.local
     ports:
-      - "5433:5432"
+      - "6453:5432"
     command:
       - sleep
       - infinity
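
Reviewer note: a `ports` entry in the `HOST:CONTAINER` short form publishes the container's port on the host port, so after this change a client on the host would reach PostgreSQL on 6453 rather than 5433. The README connection string in this same commit still shows 5433; the commit does not say whether that is intentional. A small Python sketch of deriving a connection URL from the mapping, assuming the user `postgres` and database `pgml_development` from the README's connection string (the function names are hypothetical, and only the simple two-part mapping form is handled):

```python
def host_port(mapping):
    """Return the host-side port of a 'HOST:CONTAINER' port mapping."""
    parts = mapping.split(":")
    if len(parts) != 2:
        raise ValueError(f"expected 'HOST:CONTAINER', got {mapping!r}")
    return int(parts[0])


def connection_url(mapping, user="postgres", host="localhost",
                   database="pgml_development"):
    """Build a postgres:// URL that targets the published host port."""
    return f"postgres://{user}@{host}:{host_port(mapping)}/{database}"


print(connection_url("6453:5432"))
# prints: postgres://postgres@localhost:6453/pgml_development
```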
Binary file (494 KB) not shown.

‎pgml-extension/Dockerfile.local

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ RUN cat /etc/apt/sources.list
 RUN apt-get update && apt-get install -y postgresql-pgml-14
 
 # Cache this, quicker
-RUN pip3 install xgboost scikit-learn diptest torch lightgbm transformers datasets sentencepiece sentence_transformers sacremoses sacrebleu rouge
+RUN pip3 install xgboost scikit-learn diptest torch lightgbm transformers datasets sentencepiece sentence_transformers sacremoses sacrebleu rouge protobuf
 
 COPY --chown=postgres:postgres . /app
 WORKDIR /app
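
Reviewer note: the commit message says `protobuf` was added for FinBERT support. One way to confirm that an image built from this Dockerfile actually has the packages importable is a quick check like the sketch below. The function name `missing_modules` is hypothetical, and the module list is only a sample; note that some pip package names differ from their import names (`protobuf` is imported as `google.protobuf`, `scikit-learn` as `sklearn`).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. 'google') is absent
            # or a module has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    wanted = ["google.protobuf", "transformers", "torch", "sklearn", "sentencepiece"]
    print("missing:", missing_modules(wanted))
```

Inside the container this could be run with `python3` after the `pip3 install` layer; an empty list means every sampled module was found.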
