Kismuz/btgym
Scalable, event-driven, deep-learning-friendly backtesting library
...Minimizing the mean square error on future experience. - Richard S. Sutton
Scalable, event-driven, RL-friendly backtesting library. Built on top of Backtrader with the OpenAI Gym environment API.
Backtrader is an open-source algorithmic trading library:
- GitHub: http://github.com/mementum/backtrader
- Documentation and community: http://www.backtrader.com/

OpenAI Gym is..., well, everyone knows Gym:
- GitHub: http://github.com/openai/gym
- Documentation and community: https://gym.openai.com/
The general purpose of this project is to provide a gym-integrated framework for running reinforcement learning experiments in [close to] real-world algorithmic trading environments.
DISCLAIMER: Code presented here is research/development grade. It can be unstable, buggy, poorly performing and is subject to change. Note that this package is neither an out-of-the-box moneymaker nor does it provide ready-to-converge RL solutions. Think of it as a framework for setting up experiments with complex non-stationary stochastic environments. As a research project, BTGym at its current stage can hardly deliver an easy end-user experience, in the sense that setting up meaningful experiments will require some practical programming experience as well as general knowledge of reinforcement learning theory.
- Installation
- Quickstart
- Description
- Documentation and community
- Known bugs and limitations
- Roadmap
- Update news
It is highly recommended to run BTGym in a designated virtual environment.

Clone or copy the btgym repository to local disk, cd to it and run `pip install -e .` to install the package and all dependencies:

```shell
git clone https://github.com/Kismuz/btgym.git
cd btgym
pip install -e .
```
To update to the latest version:

```shell
cd btgym
git pull
pip install --upgrade -e .
```
BTGym requires Matplotlib version 2.0.2; downgrade your installation if you have version 2.1:

```shell
pip install matplotlib==2.0.2
```
The LSOF utility should be installed on your OS, which may not be the default for some Linux distributions; see: https://en.wikipedia.org/wiki/Lsof
Making a gym environment with all parameters set to defaults is as simple as:

```python
from btgym import BTgymEnv

MyEnvironment = BTgymEnv(filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',)
```
Adding more controls may look like:

```python
from gym import spaces
from btgym import BTgymEnv

MyEnvironment = BTgymEnv(
    filename='../examples/data/DAT_ASCII_EURUSD_M1_2016.csv',
    episode_duration={'days': 2, 'hours': 23, 'minutes': 55},
    drawdown_call=50,
    state_shape=dict(raw=spaces.Box(low=0, high=1, shape=(30, 4))),
    port=5555,
    verbose=1,
)
```
See more options at Documentation: Quickstart >> and how-to's in the Examples directory >>.
Discrete actions setup: consider a setup with one riskless asset acting as broker account cash and K (by default, one) risky assets. For every risky asset there exists a track of historic price records referred to as a `data-line`. Apart from assets' data lines there [optionally] exists a number of exogenous data lines holding some information and statistics, e.g. economic indexes, encoded news, macroeconomic indicators, weather forecasts etc., which are considered relevant to decision-making.

It is supposed for this setup that:
- there are no interest rates for any asset;
- broker actions are fixed-size market orders (`buy`, `sell`, `close`); short selling is permitted;
- transaction costs are modelled via broker commission;
- 'market liquidity' and 'capital impact' assumptions are met;
- time indexes match for all data lines provided.
The problem is modelled as a discrete-time finite-horizon partially observable Markov decision process for equity/currency trading:

- for every asset traded, the agent action space is discrete (`0: hold [do nothing], 1: buy, 2: sell, 3: close [position]`);
- the environment is episodic: maximum episode duration and episode termination conditions are set;
- for every timestep of the episode the agent is given an environment state observation as a tensor of the last `m` time-embedded preprocessed values for every data line included, and emits actions according to some stochastic policy;
- the agent's goal is to maximize expected cumulative capital by learning an optimal policy.
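The interaction loop implied by this setup follows the standard gym API. Below is a minimal sketch using a toy stand-in environment; `ToyMarketEnv` and its reward logic are illustrative assumptions, not part of btgym:

```python
import random

HOLD, BUY, SELL, CLOSE = 0, 1, 2, 3  # discrete action encoding from the description

class ToyMarketEnv:
    """Tiny gym-like stub: random price moves over a fixed-length episode."""
    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.position = 0  # -1 short, 0 flat, +1 long
        self.price = 1.0
        return self.price

    def step(self, action):
        move = self.rng.uniform(-0.01, 0.01)
        self.price += move
        if action == BUY:
            self.position = 1
        elif action == SELL:
            self.position = -1
        elif action == CLOSE:
            self.position = 0
        reward = self.position * move  # P&L of the held position
        self.t += 1
        done = self.t >= self.horizon
        return self.price, reward, done, {}

def run_episode(env, policy):
    """Roll out one episode with a gym-style env and a policy(state) -> action."""
    state = env.reset()
    done, total = False, 0.0
    while not done:
        state, reward, done, _ = env.step(policy(state))
        total += reward
    return total

# A do-nothing policy accrues zero reward since no position is ever opened.
assert run_episode(ToyMarketEnv(), policy=lambda s: HOLD) == 0.0
```

A real BTgymEnv instance exposes the same `reset`/`step` interface, so the same loop applies.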
Continuous actions setup [BETA]: this setup closely relates to the continuous portfolio optimisation problem definition; it differs from the setup above in:

- base broker actions are real numbers: `a[i] in [0,1], 0<=i<=K, SUM{a[i]} = 1` for `K` risky assets added; each action is a market target order to adjust the portfolio to get share `a[i]*100%` for the `i`-th asset;
- the entire single-step broker action is a dictionary of the form: `{cash_name: a[0], asset_name_1: a[1], ..., asset_name_K: a[K]}`;
- short selling is not permitted.

For RL it implies having a continuous action space as a `K+1`-dim vector.
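Forming such an action amounts to picking a point on the `K+1` simplex and keying it by instrument name. A minimal sketch (the key names and the Dirichlet(1)-style sampling scheme are assumptions for illustration, not btgym's actual routine):

```python
import random

def sample_portfolio_action(asset_names, rng=random.Random(42)):
    """Sample K+1 non-negative weights summing to 1 (normalized exponentials,
    equivalent to Dirichlet(1,...,1)) and map them onto cash + assets."""
    keys = ['cash'] + list(asset_names)
    raw = [rng.expovariate(1.0) for _ in keys]
    total = sum(raw)
    return {k: v / total for k, v in zip(keys, raw)}

action = sample_portfolio_action(['EURUSD', 'GBPUSD'])
assert abs(sum(action.values()) - 1.0) < 1e-9  # valid point on the simplex
```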
Notice: the data shaping approach is under development, expect some changes. [7.01.18]

- random sampling: the historic price change dataset is divided into training, cross-validation and testing subsets. Since agent actions do not influence the market, it is possible to randomly sample a continuous subset of training data for every episode. [Seems to be] the most data-efficient method. Cross-validation and testing are performed later as usual on the most "recent" data;
- sequential sampling: the full dataset is fed sequentially as if the agent were performing real-time trading, episode by episode. Most reality-like, least data-efficient, a natural non-stationarity remedy;
- sliding time-window sampling: a mixture of the above; an episode is sampled randomly from a comparatively short time period, sliding from the furthest to the most recent training data. Should be less prone to overfitting than random sampling.
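Random sampling of a continuous subset, as described above, can be sketched as picking one contiguous window from the training range; this is a hypothetical helper, not btgym's actual sampler:

```python
import random

def sample_episode(prices, episode_len, rng=random.Random(0)):
    """Return one contiguous slice of `episode_len` records from `prices`."""
    start = rng.randrange(0, len(prices) - episode_len + 1)
    return prices[start:start + episode_len]

data = list(range(100))          # stand-in for 100 price records
episode = sample_episode(data, 10)
assert len(episode) == 10
assert episode == list(range(episode[0], episode[0] + 10))  # contiguous window
```

The sliding time-window variant would simply restrict `start` to a moving sub-range instead of the whole training set.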
- Read Docs and API Reference.
- Browse the Development Wiki.
- Review opened and closed Issues.
- Go to the BTGym Slack channel. If you are new - use this invite link to join.
- requires Matplotlib version 2.0.2;
- matplotlib backend warning: appears when importing pyplot and using the `%matplotlib inline` magic before btgym import. It's recommended to import backtrader and btgym first to ensure proper backend choice;
- not tested with Python < 3.5;
- doesn't seem to work correctly under Windows; partially done
- by default, is configured to accept Forex 1 min. data from www.HistData.com;
- only random data sampling is implemented; no built-in dataset splitting to training/cv/testing subsets; done
- only one equity/currency pair can be traded; done
- no 'skip-frames' implementation within environment; done
- no plotting features, except if using the pycharm integration observer. Not sure if it is suited for intraday strategies. [partially] done
- making a new environment kills all processes using the specified network port. Watch out for your jupyter kernels. fixed
- refine logic for parameters applying priority (engine vs strategy vs kwargs vs defaults);
- API reference;
- examples;
- frame-skipping feature;
- dataset tr/cv/t approach;
- state rendering;
- proper rendering for entire episode;
- tensorboard integration;
- multiple agents asynchronous operation feature (e.g. for A3C):
- dedicated data server;
- multi-modal observation space shape;
- A3C implementation for BTgym;
- UNREAL implementation for BTgym;
- PPO implementation for BTgym;
- RL^2 / MAML / DARLA adaptations - IN PROGRESS;
- learning from demonstrations; - partially done
- risk-sensitive agents implementation;
- sequential and sliding time-window sampling;
- multiple instruments trading;
- docker image; - CPU version, Signalprime contribution;
- TF serving model serialisation functionality;
10.01.2019:
- docker CPU version is now available, contributed by Signalprime (https://github.com/signalprime), see btgym/docker/README.md for details;
9.02.2019:
- Introduction to analytic data model notebook added to the model_based_stat_arb examples folder.
25.01.2019: updates:
- lstm_policy class now requires both `internal` and `external` observation sub-spaces to be present and allows both to be one-level nested sub-spaces themselves (was only true for `external`); all declared sub-spaces get encoded by separate convolution encoders;
- policy deterministic action option is implemented for discrete action spaces and can be utilised by `syncro_runner`; by default it is enabled for test episodes;
- data_feed classes now accept `pd.dataframes` as historic data source via the `dataframe` kwarg (was: `.csv` files only);
18.01.2019: updates:
- data model classes are under active development to power the model-based framework:
  - common statistics incremental estimator classes have been added (mean, variance, covariance, linear regression etc.);
  - incremental Singular Spectrum Analysis class implemented;
  - for a pair of asset prices, a two-factor state-space model is proposed;
- new data_feed iterator classes have been added to provide the training framework with synthetic data generated by the model mentioned above;
- strategy_gen_6 data handling and pre-processing has been redesigned:
  - market data SSA decomposition;
  - data model state as additional input to policy;
  - variance-based normalisation for broker statistics;
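An incremental mean/variance estimator of the kind mentioned above can be sketched with Welford's online algorithm; this is a generic illustration under that assumption, not btgym's actual class:

```python
class IncrementalStats:
    """Online mean and (population) variance via Welford's algorithm:
    one pass, O(1) memory, numerically stable."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

s = IncrementalStats()
for x in [1.0, 2.0, 3.0, 4.0]:
    s.update(x)
assert s.mean == 2.5
assert abs(s.variance - 1.25) < 1e-12
```

The same update pattern extends to covariance and incremental linear regression by tracking cross-deviation sums.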
11.12.2018: updates and fixes:
- training Launcher class got convenience features to save and reload model parameters, see https://github.com/Kismuz/btgym/blob/master/examples/unreal_stacked_lstm_strat_4_11.ipynb for details;
- combined model-based/model-free approach package in early development stage is added to `btgym.reserach`;
17.11.2018: updates and fixes:
- minor fixes to base data provider class episode sampling
- update to btgym.datafeed.synthetic subpackage: new stochastic processes generators added etc.
- new btgym.research.startegy_gen_5 subpackage: efficient parameter-free signal preprocessing implemented, other minor improvements;
30.10.2018: updates and fixes:
- fixed numpy random state issue causing replication of seeds among workers on POSIX os;
- new synthetic datafeed generators - added simple Ornstein-Uhlenbeck process data generating classes; see `btgym/datafeed/synthetic/ou.py` and `btgym/research/ou_params_space_eval` for details;
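For reference, a minimal Ornstein-Uhlenbeck path generator via Euler-Maruyama discretisation; the parameters and discretisation scheme here are illustrative, not those of `btgym/datafeed/synthetic/ou.py`:

```python
import math
import random

def ou_path(n, mu=0.0, theta=1.0, sigma=0.1, dt=0.01, x0=1.0, seed=0):
    """Simulate n steps of the mean-reverting SDE
    dx = theta * (mu - x) * dt + sigma * dW
    using the Euler-Maruyama scheme."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(n):
        x += theta * (mu - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

prices = ou_path(1000)
assert len(prices) == 1001  # initial value plus n simulated steps
```

Mean-reverting series like this are a common sanity-check datafeed: an agent should learn to fade deviations from `mu`.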
14.10.2018: update:
- base reward function redesign -> noticeable algorithms performance gain;
20.07.2018: major update to package:
- enhancements to agent architecture:
  - causal convolution state encoder with attention for LSTM agent;
  - dropout regularization added for conv. and LSTM layers;
- base strategy update: new convention for naming `get_state` methods, see `BaseStrategy` class for details;
- multiple datafeeds and assets trading implemented in two flavors:
  - discrete action space via the MultiDiscreteEnv class;
  - continuous action space via PortfolioEnv, which is closely related to the continuous portfolio optimisation problem setup;
- description and docs:
  - MultiDataFeed: https://kismuz.github.io/btgym/btgym.datafeed.html#btgym.datafeed.multi.BTgymMultiData
  - ActionSpace: https://kismuz.github.io/btgym/btgym.html#btgym.spaces.ActionDictSpace
  - MultiDiscreteEnv: https://kismuz.github.io/btgym/btgym.envs.html#btgym.envs.multidiscrete.MultiDiscreteEnv
  - PortfolioEnv: https://kismuz.github.io/btgym/btgym.envs.html#btgym.envs.portfolio.PortfolioEnv
- examples:
- Notes on multi-asset setup:
  - adding these features forced a substantial package redesign; expect bugs, some backward incompatibility, broken examples etc. - please report;
  - current algorithms and agent architectures are ok with multiple data lines but seem not to cope well with the multi-asset setup; it is especially evident in the case of continuous actions, where agents completely fail to converge on train data;
  - current reward function design seems inappropriate; needs reshaping;
  - continuous space is in `beta` and still needs some improvement, esp. for broker order execution logic as well as the action sampling routine for continuous A3C (which is a Dirichlet process by now);
  - multi-discrete space is more consistent but severely limited in the number of portfolio assets (but not data lines) due to the exponential rise of action space cardinality; the option is to use as many data lines as desired while limiting the portfolio to 1 - 4 assets;
  - no Guided Policy available for the multi-asset setup yet - in progress;
  - all but `episode` rendering modes are temporarily disabled;
  - the whole thing is shamelessly resource-hungry;
17.02.18: First results on applying guided policy search ideas (GPS) to the btgym setup can be seen here.
- tensorboard summaries are updated with additional renderings: actions distribution, value function and LSTM_state; presented in the same notebook.
6.02.18: Common update to all a3c agent architectures:
- all dense layers are now Noisy-Net ones, see the Noisy Networks for Exploration paper by Fortunato et al.;
- note that entropy regularization is still here, kept at ~0.01 to ensure proper exploration;
- policy output distribution is 'centered' using the layer normalisation technique;
- all of the above results in about 2x training speedup in terms of train iterations;
20.01.18: Project Wiki pages added;
12.01.18: Minor fixes to logging, enabled BTgymDataset train/test data split. AAC framework train/test cycle enabled via `episode_train_test_cycle` kwarg.

7.01.18: Update:
- Major data pipe redesign. `Domain -> Trial -> Episode` sampling routine implemented. For motivation and formal definitions refer to Section 1. Data of this DRAFT, the API Documentation and the Intro example. Changes should be backward compatible. In brief, it is the necessary framework for upcoming meta-learning algorithms.
- logging changes: now relying on the python `logbook` module. Should eliminate errors under Windows.
- Stacked_LSTM_Policy agent implemented. Based on NAV_A3C from the DeepMind paper with some minor mods. A basic usage Example is here. Still in research code area and needs further tuning; yet faster than the simple LSTM agent, able to converge on a 6-month 1m dataset.
5.12.17: Inner btgym comm. fixes >> speedup ~5%.
02.12.17: Basic `sliding time-window train/test` framework implemented via the BTgymSequentialTrial() class. UPD: replaced by the BTgymSequentialDataDomain class.

29.11.17: Basic meta-learning RL^2 functionality implemented.
- See Trial_Iterator Class and RL^2 policy for description.
- Effectiveness is not tested yet, examples are to follow.
24.11.17: A3C/UNREAL finally adapted to work with BTGym environments.
- Examples with synthetic simple data (sine wave) and historic financial data added, see examples directory;
- Results on potential-based reward shaping functions in `/research/DevStartegy_4_6`;
- Work on Sequential/random Trials Data iterators (kind of sliding time-window) in progress, starting to approach the toughest part: the non-stationarity battle is ahead.
14.11.17: BaseAAC framework refactoring; added per-worker batch-training option and LSTM time_flatten option; Atari examples updated; see Documentation for details.
30.10.17: Major update, some backward incompatibility:
- BTGym now can be thought of as a two-part package: one is the environment itself and the other is RL algorithms tuned for solving algo-trading tasks. Some basic work on shaping the latter is done. Three advantage actor-critic style algorithms are implemented: A3C itself, its UNREAL extension and PPO. Core logic of these seems to be implemented correctly but further extensive BTGym-tuning is ahead. For now one can check atari tests.
- Finally, basic documentation and API reference is now available.
27.09.17: A3C test_4.2 added:
- some progress on estimator architecture search, state and reward shaping;

22.09.17: A3C test_4 added:
- passing train convergence test on small (1 month) dataset of EURUSD 1-minute bar data;

20.09.17: A3C optimised sine-wave test added here.
- This notebook presents some basic ideas on state presentation, reward shaping, model architecture and hyperparameters choice. With those tweaks the sine-wave sanity test converges faster and with greater stability.
31.08.17: Basic implementation of A3C algorithm is done and moved inside the BTgym package.
- algorithm logic consistency tests are passed;
- still work in early stage, experiments with obs. state features and policy estimator architecture ahead;
- check out the `examples/a3c` directory.
23.08.17: `filename` arg in environment/dataset specification can now be a list of csv files.
- handy for bigger dataset creation;
- data from all files are concatenated and sampled uniformly;
- no record duplication and format consistency checks performed.
21.08.17: UPDATE: BTgym is now using a multi-modal observation space.
- space used is a simple extension of gym: `DictSpace(gym.Space)` - a dictionary (not nested yet) of core gym spaces;
- defined in `btgym/spaces.py`;
- `raw_state` is the default Box space of OHLC prices. Subclass BTgymStrategy and override the `get_state()` method to compute all parts of the env. observation;
- rendering can now be performed for every entry in the observation dictionary as long as it is a Box of rank <= 3 and the same key is passed in the render_modes kwarg of the environment. 'Agent' mode renamed to 'state'. See updated examples.
07.08.17: BTgym is now optimized for asynchronous operation with multiple environment instances.
- dedicated data_server is used for dataset management;
- improved overall internal network connection stability and error handling;
- see example `async_btgym_workers.ipynb` in the `examples` directory.
15.07.17: UPDATE, BACKWARD INCOMPATIBILITY: now state observation can be a tensor of any rank.
- Consequently, the dim. ordering convention has changed to ensure compatibility with existing tf models: time embedding is the first dimension from now on, e.g. a state with shape (30, 20, 4) is 30 steps time-embedded with 20 features and 4 'channels'. For the sake of 2d visualisation only one 'channel' can be rendered, chosen by setting the env. kwarg `render_agent_channel=0`;
- examples are updated;
- better now than later.
11.07.17: Rendering battle continues: improved stability while low in memory; added environment kwarg `render_enabled=True`; when set to `False` - all renderings are disabled. Can help with performance.

5.07.17: Tensorboard monitoring wrapper added; pyplot memory leak fixed.
30.06.17: EXAMPLES updated with 'Setting up: full throttle' how-to.
29.06.17: UPGRADE: be sure to run `pip install --upgrade -e .`
- major rendering rebuild: updated with modes: `human`, `agent`, `episode`; render process is now performed by the server and returned to the environment as an `rgb numpy array`. Pictures can be shown either via matplotlib or as pillow.Image (preferred).
- 'Rendering HowTo' added, 'Basic Settings' example updated.
- internal changes: env. state divided into `raw_state` - price data, and `state` - featurized representation. `get_raw_state()` method added to strategy.
- new package requirements: `matplotlib` and `pillow`.
25.06.17: Basic rendering implemented.

23.06.17: alpha 0.0.4: added skip-frame feature, redefined parameters inheritance logic, refined overall stability;

17.06.17: first working alpha v0.0.2.