Stock Forecasting Application

naserih/stockastic


Stock Data Science With Python

Introduction

This is a starter project that applies data science tools to stock data to look for meaningful trends.

Requirements: Starting with Python

  • You will need git installed on your machine.
  • You will need Python 3 with a working version of pip3 to start this project.
  • You can check your versions of Python and pip using any of the following commands: python --version, python3 --version, pip --version, pip3 --version
  • Clone the current repository: git clone https://github.com/hn617/stockastic.git
  • cd stockastic
  • Make a new git branch: git checkout -b lesson1
  • pip3 install -r requirements
  • Make sure all the requirements are installed into your Python 3 environment.
  • Learn about TSX stock tickers, daily stock prices (open, close), and volume.

Lesson 1: Read data from a file into a pandas DataFrame

In this lesson, we are going to build a Python script that reads TSX historical stock prices (2019-2020) and sorts the stock tickers by their average volume.

Read data from CSV file ...
  1. The data directory contains daily stock values for TSX stocks for 2019-2020. Each file is named after its stock ticker. Open a couple of the CSV files and check the data structure. We are going to create a ticker dictionary containing the file path and stock details:

      ticker_dic = {
          '<TICKER_0>': {
              'filepath': '<full_path_to_ticker_0_file>',
              'mean_volume': xx,
              'order_volume': xx,
          },
          '<TICKER_1>': {
              'filepath': '<full_path_to_ticker_1_file>',
              'mean_volume': xx,
              'order_volume': xx,
          },
      }

     Later we will add more data to the ticker dictionary.

  2. Use Python to list all the CSV files (stock tickers) in ./data/TSX/20190222:

      import os

      mypath = "./data/TSX/20190222"
      onlyfiles = [f for f in os.listdir(mypath) if f.endswith(".csv")]

     Then create a dictionary with the ticker name as the key and the full path to the CSV file as the value. You can do something like this:

      ticker_dic = {}
      for filename in onlyfiles:
          # filename[:-4] strips the ".csv" extension, leaving the ticker
          ticker_dic[filename[:-4]] = {'filepath': os.path.join(mypath, filename)}

  3. Write a function to read the CSV file for a given ticker into a pandas DataFrame:

      import pandas as pd

      df = pd.read_csv("full_path_to_csv_file", header=0, sep=",", thousands=',',
                       index_col=None, parse_dates=['Date'])
      # Drop tickers whose file contains no rows
      if len(df['Volume']) == 0:
          del ticker_dic[ticker]

  4. Write a function to return the mean of the stock Volume values for an input ticker. Hint: df.mean(axis=0)

      def get_mean_volume(ticker):
          mean_volume = ...  # finds mean volume
          return mean_volume

  5. Modify the function to add the mean_volume to the ticker_dic:

      ticker_dic[ticker]['mean_volume'] = mean_volume

  6. Sort the tickers by their mean_volume and add each ticker's order to the ticker_dic:

      sorted_by_volume = sorted(ticker_dic, key=lambda k: ticker_dic[k]['mean_volume'], reverse=True)
      # check to make sure it is working
      print(sorted_by_volume)
      for order_volume, ticker in enumerate(sorted_by_volume):
          ticker_dic[ticker]['order_volume'] = order_volume
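The steps of this lesson can be combined into one small script. The following is only a sketch under stated assumptions: the helper names build_ticker_dic, read_ticker, and add_volume_order are made up for illustration, and the two sample tickers (AAA.TO, BBB.TO) are hypothetical files written to a temporary directory so the example runs without the real data folder.

```python
import os
import tempfile

import pandas as pd


def build_ticker_dic(data_dir):
    """Map each ticker (CSV file name without extension) to its file path."""
    ticker_dic = {}
    for filename in sorted(os.listdir(data_dir)):
        if filename.endswith(".csv"):
            ticker = filename[:-4]
            ticker_dic[ticker] = {"filepath": os.path.join(data_dir, filename)}
    return ticker_dic


def read_ticker(ticker_dic, ticker):
    """Read one ticker's CSV file into a DataFrame."""
    return pd.read_csv(ticker_dic[ticker]["filepath"], thousands=",", parse_dates=["Date"])


def add_volume_order(ticker_dic):
    """Store mean_volume and order_volume (0 = highest volume) for each ticker."""
    for ticker in list(ticker_dic):  # copy the keys because entries may be deleted
        df = read_ticker(ticker_dic, ticker)
        if df.empty:
            del ticker_dic[ticker]
            continue
        ticker_dic[ticker]["mean_volume"] = float(df["Volume"].mean())
    ranked = sorted(ticker_dic, key=lambda t: ticker_dic[t]["mean_volume"], reverse=True)
    for order, ticker in enumerate(ranked):
        ticker_dic[ticker]["order_volume"] = order
    return ranked


# Hypothetical sample data, written to a temporary directory for the demo.
demo_dir = tempfile.mkdtemp()
samples = {
    "AAA.TO": "Date,Open,Adj. Close,Volume\n2019-02-21,10,11,100\n2019-02-22,11,12,300\n",
    "BBB.TO": "Date,Open,Adj. Close,Volume\n2019-02-21,5,5,1000\n2019-02-22,5,6,3000\n",
}
for name, text in samples.items():
    with open(os.path.join(demo_dir, name + ".csv"), "w") as handle:
        handle.write(text)

ticker_dic = build_ticker_dic(demo_dir)
sorted_by_volume = add_volume_order(ticker_dic)
print(sorted_by_volume)
```

To try it on the real data, pass the path of the data folder (for example ./data/TSX/20190222) to build_ticker_dic instead of demo_dir.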

Lesson 2: Visualize Data in matplotlib

In this lesson we will add stock open and close arrays to the ticker_dic and plot stock values for some of the high-volume tickers.

Visualize Data in matplotlib
  1. Similar to the previous lesson, add median_volume and order_median_volume to the ticker dictionary.

  2. Create a pandas DataFrame with each ticker's order_median_volume, order_mean_volume, median_volume, and mean_volume:

      df = pd.DataFrame(list(tickers_dic.values()))

  3. Plot the stock mean_volume and median_volume vs. order_volume:

      df.plot(x='order_median_volume', y='median_volume')
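As a rough illustration of these steps, the sketch below ranks tickers by both mean and median volume and collects the results in a DataFrame. The volumes are invented, the tickers are hypothetical, and the helper names add_order and plot_volume_by_rank are assumptions made for this example. Plotting is kept in a separate function because it needs matplotlib to be installed.

```python
import pandas as pd

# Hypothetical per-ticker daily volumes standing in for the CSV data.
volumes = {
    "AAA.TO": [100, 300, 200],
    "BBB.TO": [1000, 3000, 8000],
    "CCC.TO": [50, 60, 1000],
}

tickers_dic = {}
for ticker, values in volumes.items():
    series = pd.Series(values)
    tickers_dic[ticker] = {
        "ticker": ticker,
        "mean_volume": float(series.mean()),
        "median_volume": float(series.median()),
    }


def add_order(tickers_dic, value_key, order_key):
    """Rank tickers by value_key, largest first, and store the rank under order_key."""
    ranked = sorted(tickers_dic, key=lambda t: tickers_dic[t][value_key], reverse=True)
    for order, ticker in enumerate(ranked):
        tickers_dic[ticker][order_key] = order


add_order(tickers_dic, "mean_volume", "order_mean_volume")
add_order(tickers_dic, "median_volume", "order_median_volume")

# One row per ticker, one column per stored statistic.
df = pd.DataFrame(list(tickers_dic.values())).sort_values("order_median_volume")
print(df)


def plot_volume_by_rank(df):
    """Plot median volume against its rank (requires matplotlib)."""
    return df.plot(x="order_median_volume", y="median_volume", marker="o")
```

In this made-up data, CCC.TO has one unusually busy day, so it ranks second by mean volume but last by median volume. That difference is the reason to look at both statistics.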

Lesson 3: Compare Open and Close values

In this lesson, we will work with stock Open and Close values. We will investigate the correlation between Close and Open values of the stock.

Compare stock Open to its previous Close
  1. For a given ticker in tickers_dic, calculate the ratio between Adj. Close and Open for each row and store it as a new column C/O.
  2. Calculate the ratio between Open and the previous day's Adj. Close for each row and store it as a new column O/C.
  3. Plot O/C vs. C/O:

      ticker = 'AC.TO'
      df = tickers_dic[ticker]['df']
      # Same-day ratio: adjusted close over open
      df['C/O'] = df['Adj. Close'] / df['Open']
      # Overnight ratio: open over the previous row's adjusted close
      df['O/C'] = df['Open'] / df['Adj. Close'].shift(1)
      df.plot(x='C/O', y='O/C')
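To see what shift(1) does here, the following sketch computes both ratios on a tiny, invented price table. It assumes the rows are sorted oldest first, so the previous row is the previous trading day.

```python
import pandas as pd

# Hypothetical prices for one ticker; real data would come from its CSV file.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2019-02-19", "2019-02-20", "2019-02-21", "2019-02-22"]),
        "Open": [10.0, 11.5, 12.0, 9.5],
        "Adj. Close": [11.0, 12.0, 9.0, 9.0],
    }
)

# Intraday move: today's close relative to today's open.
df["C/O"] = df["Adj. Close"] / df["Open"]

# Overnight move: today's open relative to the previous day's close.
# shift(1) moves each close down one row, so the first row has no previous close (NaN).
df["O/C"] = df["Open"] / df["Adj. Close"].shift(1)

print(df[["Date", "C/O", "O/C"]])

# Pearson correlation between the two ratios; the row with NaN is ignored.
print(df["C/O"].corr(df["O/C"]))
```

Because each point is one trading day rather than a step along a curve, a scatter plot such as df.plot.scatter(x="C/O", y="O/C") may be easier to read than a line plot.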

Lesson 4: Compare two stock values

In this lesson we want to build a function that lets us compare the stock Adj. Close values of two tickers.

Comparing two stocks

We want to define a function that takes tickers_dic, fixed_ticker, moving_ticker, interval, time_shift, today_date, and forecast_days, and plots the fixed_ticker values against the moving_ticker values.

  • fixed_ticker: the base ticker that we want to forecast.
  • moving_ticker: a ticker that we want to compare to fixed_ticker and use for forecasting.
  • interval: the time interval over which we compare the two stocks, for example 60 business days (~three months).
  • time_shift: the shift between the fixed_ticker and the moving_ticker.
  • today_date: the date on which forecasting starts. In reality this will be today's date, but for training we will change this date and test the forecasting.
  • forecast_days: the number of days from today_date that we want to forecast.

Note: If today_date is the last available date for fixed_ticker, then time_shift should be greater than forecast_days. For example, if we want to forecast the next 10 business days (forecast_days = 10), then time_shift should be greater than 10.
  1. Define a function with the requested parameters:

      def compare_stocks(tickers_dic, fixed_ticker, moving_ticker, interval, time_shift, today_date, forecast_days):
          ...

  2. Read the DataFrames for both the fixed and moving tickers.
  3. Convert the Date column to datetime.date objects. Similarly, convert the today_date string to a datetime.date object.
  4. Normalize the 'Adj. Close' values of both stocks using the z-score (standardization). Store the normalization parameters so that the normalized values can be converted back to the original values.
  5. Filter the fixed_ticker DataFrame to read interval rows starting from today_date.
  6. Filter the moving_ticker DataFrame to read interval+time_shift rows starting from today_date.
  7. Plot the normalized 'Adj. Close' values for both the fixed_ticker and the shifted moving_ticker.
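One possible reading of the steps above is sketched below. It is not the project's implementation and rests on several assumptions: today_date is a "YYYY-MM-DD" string, "starting from today_date" is taken to mean the most recent rows dated on or before today_date, each tickers_dic entry holds its DataFrame under the 'df' key as in Lesson 3, and both tickers have enough history to fill their windows. The helpers zscore and normalized_window and the tickers FIX.TO and MOV.TO are hypothetical. The function returns the aligned values instead of plotting them, so it can run without matplotlib.

```python
import datetime

import pandas as pd


def zscore(series):
    """Standardize a series; also return (mean, std) so values can be converted back."""
    mean, std = series.mean(), series.std()
    return (series - mean) / std, (mean, std)


def normalized_window(df, today, n_rows):
    """Return the last n_rows rows dated on or before `today`, with a z-score column."""
    df = df.copy()
    df["Date"] = pd.to_datetime(df["Date"]).dt.date
    df["z"], params = zscore(df["Adj. Close"])
    window = df[df["Date"] <= today].tail(n_rows).reset_index(drop=True)
    return window, params


def compare_stocks(tickers_dic, fixed_ticker, moving_ticker, interval,
                   time_shift, today_date, forecast_days):
    today = datetime.datetime.strptime(today_date, "%Y-%m-%d").date()
    fixed_df = tickers_dic[fixed_ticker]["df"]
    last_date = pd.to_datetime(fixed_df["Date"]).max().date()
    # Rule from the note: at the end of the data, the shift must exceed the horizon.
    if today >= last_date and time_shift <= forecast_days:
        raise ValueError("time_shift must be greater than forecast_days")
    fixed, fixed_params = normalized_window(fixed_df, today, interval)
    moving, moving_params = normalized_window(
        tickers_dic[moving_ticker]["df"], today, interval + time_shift)
    # Row i of the fixed window is paired with the moving row time_shift rows earlier.
    aligned = pd.DataFrame({
        "Date": fixed["Date"],
        "fixed_z": fixed["z"],
        "moving_z": moving["z"].iloc[:len(fixed)].to_numpy(),
    })
    return aligned, fixed_params, moving_params


# Hypothetical data: 30 business days for two made-up tickers.
dates = pd.bdate_range("2019-01-01", periods=30)
tickers_dic = {
    "FIX.TO": {"df": pd.DataFrame({"Date": dates, "Adj. Close": [100.0 + i for i in range(30)]})},
    "MOV.TO": {"df": pd.DataFrame({"Date": dates, "Adj. Close": [50.0 + 2 * i for i in range(30)]})},
}
aligned, fixed_params, moving_params = compare_stocks(
    tickers_dic, "FIX.TO", "MOV.TO", interval=10, time_shift=5,
    today_date="2019-02-01", forecast_days=3)
print(aligned)
```

With matplotlib installed, aligned.plot(x="Date", y=["fixed_z", "moving_z"]) draws the two normalized series on one axis. Multiplying a z-score by the stored std and adding the stored mean recovers the original price.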

Lesson 5: Fit data with linear models

Fit data with linear models

Lesson 6: Download data from API

Download data from API
