Is the pgml.digits dataset storing images in psql? #488
I see that one of the examples you present in the docs uses a digits dataset: `SELECT pgml.train('My First PostgresML Project', task => 'regression', relation_name => 'pgml.digits', y_column_name => 'target', algorithm => 'xgboost');` Now I am guessing this is the classical MNIST dataset. Btw, super interesting project you have going here! Keep it going 🚀
Replies: 1 comment
In terms of image format, ML algos require images to be either 2D arrays for black and white or 3D arrays for color. The MNIST data is stored as an 8x8 pixel black and white image with 16 shades of gray, i.e. a Postgres `SMALLINT[][]`.

https://github.com/postgresml/postgresml/blob/master/pgml-extension/src/orm/dataset.rs#L371

You definitely *can* store images and other binary data in a database, but I think the question is *should* you? A CDN fronting something like an S3 bucket is a better way to store and serve image content for a web application than serving it directly out of a database. Here are a few reasons you should consider vertically sharding your binary data (image, audio, large text...) into a different storage and distribution mechanism other than your primary database.
These same reasons may not hold up for an ML application with a dedicated ML database.
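To make the array format above a bit more concrete, here is a minimal Python sketch of how a small grayscale image could be rendered as a Postgres two-dimensional array literal and flattened into a feature vector. The helper names `to_pg_array_literal` and `flatten` are hypothetical, and the intensity ceiling of 16 is an assumption; none of this is taken from the PostgresML loader linked above.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel intensities


def to_pg_array_literal(image: Image, max_value: int = 16) -> str:
    """Render a 2D image as a Postgres array literal, e.g. '{{0,1},{2,3}}'.

    max_value is an assumed intensity ceiling, not something the thread specifies.
    """
    if not image or not image[0]:
        raise ValueError("image must have at least one row and one column")
    width = len(image[0])
    rows = []
    for row in image:
        # Postgres multidimensional arrays must be rectangular.
        if len(row) != width:
            raise ValueError("all rows must have the same length")
        for pixel in row:
            if not 0 <= pixel <= max_value:
                raise ValueError(f"pixel {pixel} is outside 0..{max_value}")
        rows.append("{" + ",".join(str(p) for p in row) + "}")
    return "{" + ",".join(rows) + "}"


def flatten(image: Image) -> List[int]:
    """Flatten a 2D image row by row into a 1D feature vector."""
    return [pixel for row in image for pixel in row]


if __name__ == "__main__":
    tiny = [[0, 16], [8, 4]]  # a 2x2 stand-in for an 8x8 digit
    print(to_pg_array_literal(tiny))
    print(flatten(tiny))
```

A literal built this way could be cast to `SMALLINT[][]` in an `INSERT` statement; an 8x8 image flattens to 64 values.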