`pgml-cms/docs/benchmarks/million-requests-per-second.md` (3 additions, 13 deletions)
# Million Requests per Second
The question "Does it Scale?" has become somewhat of a meme in software engineering. There is a good reason for it, though: most businesses plan for success. If your app, online store, or SaaS becomes popular, you want to be sure that the system powering it can serve all your new customers.
At PostgresML, we are very concerned with scale. Our engineering background took us through scaling PostgreSQL to 100 TB+, so we're certain that it scales, but could we scale machine learning alongside it?
## Architecture Overview

If you're familiar with how one runs PostgreSQL at scale, you can skip straight to the [results](broken-reference/).
Part of our thesis, and the reason why we chose Postgres as our host for machine learning, is that scaling machine learning inference is very similar to scaling read queries in a typical database cluster.
Inference speed varies based on the model complexity (e.g. `n_estimators` for XGBoost) and the size of the dataset (how many features the model uses), which is analogous to query complexity and table size in the database world; as we'll demonstrate further on, scaling the latter is mostly a solved problem.
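As a rough illustration of that analogy, a load balancer can treat in-database inference queries exactly like read queries and spread them across replicas. This is a minimal sketch of the idea, not PgCat's actual implementation; the replica names are hypothetical stand-ins:

```python
import itertools

# Hypothetical pool of read replicas; each can serve both plain read
# queries and in-database inference calls.
replicas = ["replica-1", "replica-2", "replica-3"]
rotation = itertools.cycle(replicas)

def route(query: str) -> str:
    # Round-robin: the next query goes to the next replica in rotation,
    # whether it is an ordinary read or an inference call.
    return next(rotation)

# Inference queries are routed exactly like read queries.
targets = [route("SELECT pgml.predict('model', features) FROM data") for _ in range(6)]
print(targets)
```

A real pooler like PgCat layers health checks, failover, and connection management on top of this basic routing idea.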
`pgml-cms/docs/benchmarks/mindsdb-vs-postgresml.md` (4 additions, 7 deletions)
The architectural implementations of these projects are significantly different. PostgresML takes a data-centric approach, with Postgres as the provider of both storage _and_ compute. To provide horizontal scalability for inference, the PostgresML team has also created [PgCat](https://github.com/postgresml/pgcat) to distribute workloads across many Postgres databases. On the other hand, MindsDB takes a service-oriented approach that connects to various databases over the network.
| On Premise | ✅ | ✅ |
| Web UI | ✅ | ✅ |
The difference in architecture leads to different tradeoffs and challenges. There are already hundreds of ways to get data into and out of a Postgres database from just about every other service, language, and platform, which makes PostgresML highly compatible with other application workflows. On the other hand, the MindsDB Python service accepts connections from specifically supported clients like `psql` and provides a pseudo-SQL interface to its functionality. The service parses incoming MindsDB commands that look similar to SQL (but are not) for tasks like configuring database connections or doing actual machine learning. These commands typically contain what looks like a sub-select, which will actually fetch data over the wire from the configured databases for machine learning training and inference.
There is a general trend: the larger and slower the model, the more work is spent inside libtorch and the less the performance of the rest matters, but for interactive models and use cases there is a significant difference. We've tried to cover the most generous use case we could between these two. If we were to compare XGBoost or other classical algorithms, which can have sub-millisecond prediction times in PostgresML, the 20 ms Python service overhead MindsDB incurs just to parse the incoming query would make it hundreds of times slower.
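As a back-of-the-envelope check on that claim, assume a hypothetical 0.1 ms in-database XGBoost prediction (consistent with "sub-millisecond" above) against the fixed 20 ms query-parsing overhead cited for the Python service:

```python
# Hypothetical latencies in milliseconds. The 20 ms figure is the service
# overhead cited above; the 0.1 ms figure is an assumed sub-millisecond
# XGBoost prediction time, for illustration only.
in_database_ms = 0.1
service_overhead_ms = 20.0

slowdown = (service_overhead_ms + in_database_ms) / in_database_ms
print(f"{slowdown:.0f}x")  # roughly 200x, i.e. hundreds of times slower
```

With a faster model or a larger fixed overhead the ratio only grows, which is why the gap matters most for interactive, low-latency use cases.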