A TensorFlow Dataset Factory for time-series data.
This Python package helps you create TensorFlow datasets for time-series data.
This package is available on PyPI. You can install it and all of its dependencies using pip:

    pip install tensorflow_time_series_dataset

Suppose you have a dataset in the following form:

    import numpy as np
    import pandas as pd

    # make things deterministic
    np.random.seed(1)

    columns = ['x1', 'x2', 'x3']
    periods = 48 * 14
    test_df = pd.DataFrame(
        index=pd.date_range(start='1/1/1992', periods=periods, freq='30min'),
        data=np.stack(
            [
                np.random.normal(0, 0.5, periods),
                np.random.normal(1, 0.5, periods),
                np.random.normal(2, 0.5, periods),
            ],
            axis=1,
        ),
        columns=columns,
    )
    test_df.head()

                               x1        x2        x3
    1992-01-01 00:00:00  0.812173  1.205133  1.578044
    1992-01-01 00:30:00 -0.305878  1.429935  1.413295
    1992-01-01 01:00:00 -0.264086  0.550658  1.602187
    1992-01-01 01:30:00 -0.536484  1.159828  1.644974
    1992-01-01 02:00:00  0.432704  1.159077  2.005718

The factory class WindowedTimeSeriesDatasetFactory is used to create a TensorFlow dataset from pandas dataframes, or other data sources as we will see later. We will use it now to create a dataset with 48 historic time-steps as the input to predict a single time-step in the future.

    from tensorflow_time_series_dataset.factory import WindowedTimeSeriesDatasetFactory as Factory

    factory_kwargs = dict(
        history_size=48,
        prediction_size=1,
        history_columns=['x1', 'x2', 'x3'],
        prediction_columns=['x3'],
        batch_size=4,
        drop_remainder=True,
    )
    factory = Factory(**factory_kwargs)
    ds1 = factory(test_df)
    ds1

This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=(TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 1, 1), dtype=tf.float32, name=None))>
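
To make the shapes above easier to read, here is a minimal pure-Python sketch of the windowing idea. It is not the package's implementation: the helper names `sliding_windows` and `batch` are hypothetical, the toy `rows` list only stands in for `test_df`, and the step between windows (`shift=1`) is an assumption, since the example above does not state it.

```python
def sliding_windows(rows, history_size, prediction_size, shift=1):
    """Yield (history, prediction) pairs cut from consecutive rows.

    `shift` (the step between window starts) is an assumed parameter.
    """
    window_size = history_size + prediction_size
    for start in range(0, len(rows) - window_size + 1, shift):
        window = rows[start:start + window_size]
        yield window[:history_size], window[history_size:]


def batch(items, batch_size, drop_remainder=True):
    """Group items into lists of `batch_size`, optionally dropping a short tail."""
    items = list(items)
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


# Toy stand-in for test_df: 48 * 14 rows with three columns (x1, x2, x3).
rows = [(float(i), float(i) + 1.0, float(i) + 2.0) for i in range(48 * 14)]

# Keep all three columns in the history, but only x3 in the prediction.
pairs = [
    (history, [(row[2],) for row in prediction])
    for history, prediction in sliding_windows(rows, history_size=48, prediction_size=1)
]

first = batch(pairs, batch_size=4)[0]
x_shape = (len(first), len(first[0][0]), len(first[0][0][0]))
y_shape = (len(first), len(first[0][1]), len(first[0][1][0]))
print(x_shape, y_shape)  # (4, 48, 3) (4, 1, 1)
```

The first axis is the batch, the second the number of time-steps, and the third the number of selected columns, which matches the element_spec shown above.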
We can plot the result with the utility function plot_patch:

    from tensorflow_time_series_dataset.utils.visualisation import plot_patch

    githubusercontent = "https://raw.githubusercontent.com/MArpogaus/tensorflow_time_series_dataset/master/"

    fig = plot_patch(
        ds1,
        figsize=(8, 4),
        **factory_kwargs
    )

    fname = '.images/example1.svg'
    fig.savefig(fname)

    f"[[{githubusercontent}{fname}]]"

Let's now increase the prediction size to 6 half-hour time-steps.

    factory_kwargs.update(dict(
        prediction_size=6
    ))
    factory = Factory(**factory_kwargs)
    ds2 = factory(test_df)
    ds2

This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=(TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 6, 1), dtype=tf.float32, name=None))>
Again, let's plot the results to see what changed:

    fig = plot_patch(
        ds2,
        figsize=(8, 4),
        **factory_kwargs
    )

    fname = '.images/example2.svg'
    fig.savefig(fname)

    f"[[{githubusercontent}{fname}]]"

Preprocessors can be used to transform the data before it is fed into the model. A preprocessor can be any Python callable. In this case we will use a class called CyclicalFeatureEncoder to encode one-dimensional cyclical features, such as the time or weekday, as two-dimensional coordinates using a sine and cosine transformation, as suggested in this blogpost.

    import itertools

    from tensorflow_time_series_dataset.preprocessors import CyclicalFeatureEncoder

    encs = {
        "weekday": dict(cycl_max=6),
        "dayofyear": dict(cycl_max=366, cycl_min=1),
        "month": dict(cycl_max=12, cycl_min=1),
        "time": dict(
            cycl_max=24 * 60 - 1,
            cycl_getter=lambda df, k: df.index.hour * 60 + df.index.minute,
        ),
    }
    factory_kwargs.update(dict(
        meta_columns=list(itertools.chain(*[[c + '_sin', c + '_cos'] for c in encs.keys()]))
    ))
    factory = Factory(**factory_kwargs)
    for name, kwargs in encs.items():
        factory.add_preprocessor(CyclicalFeatureEncoder(name, **kwargs))
    ds3 = factory(test_df)
    ds3

This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=((TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 1, 8), dtype=tf.float32, name=None)), TensorSpec(shape=(4, 6, 1), dtype=tf.float32, name=None))>
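
The idea behind the sine and cosine transformation can be sketched in a few lines of plain Python. This is an illustration only: `encode_cyclical` is a hypothetical helper, and the normalisation by `cycl_max - cycl_min + 1` is an assumption about how an inclusive range is mapped onto the circle; CyclicalFeatureEncoder may scale its inputs differently.

```python
import math


def encode_cyclical(value, cycl_max, cycl_min=0):
    """Map a cyclical value onto the unit circle as (sin, cos).

    Assumes the feature takes values in the inclusive range
    [cycl_min, cycl_max], so one full cycle has cycl_max - cycl_min + 1 steps.
    """
    period = cycl_max - cycl_min + 1
    angle = 2.0 * math.pi * (value - cycl_min) / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Minutes since midnight, using the same cycl_max as the "time" entry above.
midnight = encode_cyclical(0, cycl_max=24 * 60 - 1)
late_evening = encode_cyclical(23 * 60 + 30, cycl_max=24 * 60 - 1)
noon = encode_cyclical(12 * 60, cycl_max=24 * 60 - 1)

# 23:30 ends up close to 00:00, although the raw values 1410 and 0 are far apart.
print(distance(late_evening, midnight) < distance(noon, midnight))  # True
```

Each encoded feature contributes one sine and one cosine column, which is why the four encoders above account for the eight meta columns in the element_spec.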
Again, let's plot the results to see what changed:

    fig = plot_patch(
        ds3,
        figsize=(8, 4),
        **factory_kwargs
    )

    fname = '.images/example3.svg'
    fig.savefig(fname)

    f"[[{githubusercontent}{fname}]]"

Any contributions are greatly appreciated! If you have a question, an issue, or would like to contribute, please read our contributing guidelines.
Distributed under the Apache License 2.0.
Marcel Arpogaus - marcel.arpogaus@gmail.com
Project Link: https://github.com/MArpogaus/tensorflow_time_series_dataset
Parts of this work have been funded by the Federal Ministry for the Environment, Nature Conservation and Nuclear Safety due to a decision of the German Federal Parliament (AI4Grids: 67KI2012A).