postgresml/postgresml (Public)


Timeseries data #260

giscus[bot] (bot) announced in Announcements


Aug 24, 2022

· 4 comments


Prem-AU
Aug 24, 2022 (with giscus)

Hello developers, does PostgresML handle time series data?


montanalow
Aug 24, 2022
Maintainer

In a word, yes. If your dataset scale is:

- *Sub millions*: everything will be sub-second, including training and especially inference (which can be sub-millisecond), even without best practices like indexes.
- *Millions*: vanilla Postgres will be able to train models in seconds at this scale on $100 laptop hardware. You probably don't need to do anything special other than having indexes for inference. As you approach billions, you may want to look into query parallelism <https://www.postgresql.org/docs/current/parallel-query.html>. You'll also want to get familiar with EXPLAIN to analyze why queries take longer than you expect.
- *Billions*: consider partitioning <https://www.postgresql.org/docs/current/ddl-partitioning.html> your tables to enable more parallelism, or using the Timescale extension <https://github.com/timescale/timescaledb>, which optimizes for timeseries explicitly, to keep performance blazing fast.
- *Trillions*: you may want to generate summary or rollup tables for individual statistics, which will reduce the overall training time.
- *Quadrillions*: there are probably going to be special considerations at this scale, but generally they can be solved by sharding across multiple physical machines. Consider using the Citus extension <https://github.com/citusdata/citus> to make this easier to manage.
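To make the index and rollup suggestions above concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere; it is not PostgresML or Postgres code. The table and column names (`readings`, `readings_hourly`, `sensor_id`, `ts`, `value`) are hypothetical. In Postgres you would likely bucket timestamps with `date_trunc('hour', ts)` instead of `strftime`, and might keep the summary in a materialized view.

```python
import sqlite3


def build_hourly_rollup(conn: sqlite3.Connection) -> None:
    """Rebuild an hourly summary table from the raw `readings` table.

    Training on the summary scans far fewer rows than training on raw data.
    """
    conn.executescript(
        """
        DROP TABLE IF EXISTS readings_hourly;
        CREATE TABLE readings_hourly AS
        SELECT sensor_id,
               strftime('%Y-%m-%d %H:00:00', ts) AS hour,  -- SQLite-specific bucketing
               COUNT(*)   AS n,
               AVG(value) AS avg_value,
               MIN(value) AS min_value,
               MAX(value) AS max_value
        FROM readings
        GROUP BY sensor_id, strftime('%Y-%m-%d %H:00:00', ts);
        """
    )


# Hypothetical raw data: one row per sensor reading.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, ts TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, "2022-08-24 10:05:00", 1.0),
        (1, "2022-08-24 10:35:00", 3.0),
        (1, "2022-08-24 11:10:00", 5.0),
        (2, "2022-08-24 10:20:00", 10.0),
    ],
)

# An index on (sensor_id, ts) keeps per-sensor time-range lookups cheap.
conn.execute("CREATE INDEX readings_sensor_ts ON readings (sensor_id, ts)")

build_hourly_rollup(conn)
rollup = conn.execute(
    "SELECT sensor_id, hour, n, avg_value FROM readings_hourly "
    "ORDER BY sensor_id, hour"
).fetchall()
for row in rollup:
    print(row)
```

The four raw rows collapse into three hourly rows here; at real scale the reduction is what shortens training time.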


Prem-AU
Aug 25, 2022

I would like to forecast time series data. How does PostgresML handle time series? Is multivariate time series prediction possible with PostgresML?


Regards,
Premkumar Thirumalaisamy


montanalow
Aug 25, 2022
Maintainer

There are many ways to perform multivariate time series prediction with PostgresML. The following algorithms are supported:

https://postgresml.org/user_guides/training/algorithm_selection/

For example, here is a blog post detailing how you might use XGBoost to formulate the problem. Although it uses XGBoost from Python, the steps can be adapted to PostgresML:

https://cprosenjit.medium.com/multivariate-time-series-forecasting-using-xgboost-1728762a9eeb
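One common way to formulate a forecasting problem for a tabular model such as XGBoost is to turn each time step into a row whose features are the previous values (lags) of every series. The sketch below shows that framing in plain Python; it is an illustration under that assumption, not the blog post's code and not a PostgresML API. The function name `make_lagged_rows` and the sample series are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def make_lagged_rows(
    series: Dict[str, Sequence[float]],
    target: str,
    n_lags: int,
) -> Tuple[List[str], List[List[float]], List[float]]:
    """Frame aligned series as a supervised table of lag features.

    Row t holds the previous `n_lags` values of every series as features
    and the value of `target` at time t as the label.
    """
    names = sorted(series)
    length = len(series[target])
    if any(len(series[name]) != length for name in names):
        raise ValueError("all series must have the same length")

    columns = [f"{name}_lag{k}" for name in names for k in range(1, n_lags + 1)]
    features: List[List[float]] = []
    labels: List[float] = []
    # The first n_lags time steps have no complete history, so skip them.
    for t in range(n_lags, length):
        features.append(
            [series[name][t - k] for name in names for k in range(1, n_lags + 1)]
        )
        labels.append(series[target][t])
    return columns, features, labels


# Hypothetical multivariate data: predict `load` from past load and temperature.
data = {
    "temperature": [20.0, 21.0, 22.0, 23.0, 24.0],
    "load": [100.0, 110.0, 120.0, 130.0, 140.0],
}
columns, X, y = make_lagged_rows(data, target="load", n_lags=2)
for row, label in zip(X, y):
    print(dict(zip(columns, row)), "->", label)
```

In Postgres, the window function `lag()` can build the same columns in a view or table, which can then be used as training data for one of the regression algorithms listed above.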

There is also support for fine-tuning deep learning models that have been published to HuggingFace. For example:

https://huggingface.co/spaces/keras-io/timeseries-classification-from-scratch

I could give more specific answers if you described your objective, dataset, and domain in more depth.
