1
\$\begingroup\$

Hi folks: I posted a SO (https://stackoverflow.com/questions/57541246/optimizing-vba-function-loop) and was told to ask here.

I am starting to think that rather than relay on excel as my data repository, I should create a separate class that holds a variant array that I can then query must faster??

Anyway... here's my question. I hope someone can help. I agree that searching arrays may be faster but I need this data available across all functions of this sheet.

I am in need to optimizing some VBA which currently works functionally.

Given columns of sequential Dates (column B) and Times (column C), andGiven a time window (T1 and T2), return a range of rows in which dates and times fall within T1 and T2. For example, I want MIN and MAX price between those two times.

The goal is to build Open/High/Low/Close charts for Excel candlestick charts and the data source has over 260,000 rows of data.

I currently have the following code that

Dim priceRange As RangestartRowNum = GetFirstRow(StartTime)     << THIS TAKE 10 SECONDSendRowNum = GetLastRow(endTime)         << THIS TAKE 10 SECONDSSet priceRange = Range(Cells(startRowNum, 4), Cells(endRowNum, 4))targetRange.Offset(0, 2).Value = Application.WorksheetFunction.Max(priceRange) targetRange.Offset(0, 3).Value = Application.WorksheetFunction.Min(priceRange)

To find the first row...

Function GetFirstRow(T As Date) As Long'Starts at FirstRow and returns the first row where the time is greater than T1.Dim currentRow As LongDim CompareTime As DateDim CompareDate As DatecurrentRow = 4 'Start at row4 due to headers.Do While (IsDate(Cells(currentRow, 2)))    CompareDate = Cells(currentRow, 2)    CompareTime = Cells(currentRow, 3)    marketTime = CompareDate + CompareTime  If (marketTime >= T) Then Exit Do  currentRow = currentRow + 1LoopGetFirstRow = currentRowEnd Function

GetLastRow is very similar.

My issue is that the GetFirstRow function has to process 49,000 (yes, forty nine thousand) rows, and it takes about 10 seconds.... so it takes "minutes" to complete this run.

Can someone help me optimize this?

Note I Need the date since market data starts the night before. If this is what is slowing me down, I can filter that as I import the data?

askedAug 18, 2019 at 4:04
Ed Landau's user avatar
\$\endgroup\$
3
  • 2
    \$\begingroup\$You should use SQLto query data. Ususally the data is stored in a database, but you can missuse excel for that. Depending on your needs, you can use excel-vba ot ms-Access or ssms toquery the range. Just reference the range in the From-Clause of query.\$\endgroup\$CommentedAug 18, 2019 at 5:41
  • \$\begingroup\$Thanks. Yes of course SQL is fast at queries... but not at creating candlestick charts :). I'm stuck with Excel... looking for a way to optimize. I'm looking to arrays but still can't figure out how to create a public array shared by all functional in a worhseet.\$\endgroup\$CommentedAug 18, 2019 at 15:47
  • 1
    \$\begingroup\$But you can lookupGetFirstRowandGetLastRowwith sql.\$\endgroup\$CommentedAug 18, 2019 at 15:53

1 Answer1

2
\$\begingroup\$

You are doing twice the work by having a function to get the starting row and a second function get the last row. Passing the starting row into theGetLastRow() function would be more efficient.

I prefer to have a single function return the range object. Using theWorkshetFunction.Match() is far more efficient then iterating over the cells.

Results

Immediate Window Results

getDateRange:Function

'Enumerations to clarify column data contentPublic Enum DataColumns    dcStocks = 1    dcDates    dcTimes    dcValuesEnd Enum' https://docs.microsoft.com/en-us/office/vba/api/excel.worksheetfunction.matchFunction getDateRange(ByVal StartDateTime As Date, ByVal EndDateTime As Date) As Range    Const LargestValueGreaterThanOrEqualTo As Long = -1    Const FirstExactMatch As Long = 0    Const LagestValueLessThanOrEqualTo As Long = 1    Dim Target As Range    With ThisWorkbook.Worksheets(1)        Set Target = .Range("A4:Z4", .Cells(.Rows.Count, dcDates).End(xlUp))    End With    Dim dates    Dim RangeStart As Long, RangeEnd As Long    Dim SearchValue As Double    SearchValue = StartDateTime - 1    On Error Resume Next    RangeStart = WorksheetFunction.Match(SearchValue, Target.Columns(dcDates), LagestValueLessThanOrEqualTo)    On Error GoTo 0    If RangeStart = 0 Then Exit Function    Dim r As Long    Dim StartFlag As Boolean    Dim DateTime As Date    With Target        For r = RangeStart To .Rows.Count            DateTime = .Cells(r, dcDates).Value + .Cells(r, dcTimes).Value            If DateTime >= StartDateTime And Not StartFlag Then                RangeStart = r                StartFlag = True            End If            If DateTime > EndDateTime Then                RangeEnd = r - 1                Exit For            End If        Next        If r > .Rows.Count Then RangeEnd = .Rows.Count        Set getDateRange = .Rows(RangeStart & ":" & RangeEnd)    End WithEnd Function

Worksheet Test Preparation

Sub Prep()    Const RowCount As Long = 260000    'https://codereview.stackexchange.com/questions/226360/vba-loop-optimization    Dim codes, dates, stocks, times, Values    Dim d As Date, t As Date    codes = Array("ACB", "AYI", "A2B", "ABP", "ABL", "AEG", "ABT", "AJC", "AKG", "AX8", "AX1", "ACS", "ACQ", "ACF", "ACR", "ACW", "AIV")    ReDim stocks(1 To RowCount, 1 To 1)    ReDim dates(1 To RowCount, 1 To 1)    ReDim times(1 To RowCount, 1 To 1)    ReDim Values(1 To RowCount, 1 To 1)    Dim r As Long, r2 As Long    d = #1/1/2010#    For r = 1 To RowCount - 48        d = d + 1        For r2 = 0 To 47            t = TimeSerial(0, r2 * 30, 0)            stocks(r + r2, 1) = codes(WorksheetFunction.RandBetween(0, UBound(codes)))            dates(r + r2, 1) = d            times(r + r2, 1) = t            Values(r + r2, 1) = Int((Rnd * 100) + 1) + Rnd        Next        r = r + r2 - 1    Next    Range("A4").Resize(RowCount) = stocks    Range("B4").Resize(RowCount) = dates    Range("C4").Resize(RowCount) = times    Range("D4").Resize(RowCount) = ValuesEnd Sub

Test

Sub Main()    Dim Results(5) As String * 25    Const TestCount As Long = 10    Dim n As Long    Results(0) = "Date Range"    Results(1) = "StartDateTime"    Results(2) = "EndDateTime"    Results(3) = "MinPrice"    Results(4) = "MaxPrice"    Results(5) = "Time"    Debug.Print Results(0), Results(1), Results(2), Results(3), Results(4), Results(5)    For n = 1 To TestCount        Test    NextEnd SubSub Test()    Dim Results(5) As String * 25    Dim t As Double: t = Timer    Dim Target As Range    Dim d As Date, StartDateTime As Date, EndDateTime As Date    StartDateTime = WorksheetFunction.RandBetween(#1/2/2010#, #8/30/2024#)    EndDateTime = StartDateTime + TimeSerial(WorksheetFunction.RandBetween(1, 24) - 1, WorksheetFunction.RandBetween(1, 2) * 60, 0) + WorksheetFunction.RandBetween(1, 60) - 1    Set Target = getDateRange(StartDateTime, EndDateTime)    Dim MinPrice As Double, MaxPrice As Double    MinPrice = WorksheetFunction.Min(Target.Columns(4))    MaxPrice = WorksheetFunction.Min(Target.Columns(4))    Results(0) = Target.Address    Results(1) = StartDateTime    Results(2) = EndDateTime    Results(3) = MinPrice    Results(4) = MaxPrice    Results(5) = Round(Timer - t, 2)    Debug.Print Results(0), Results(1), Results(2), Results(3), Results(4), Results(5)    Target.SelectEnd Sub

CandleStick Chart

The dataset in the image shows that you need to know the Open, High, Low, and Close for each day to create the Chart. Considering there are over 200 K rows, I presume that you will also need to filter by stock. If this is true then I would take a different approach.

CandleStick Chart Image

I would have a dictionary that stores a sub-dictionary for each stock that stores a dictionary for each day that stores an arraylist to store the values.

Create Array From Data Structure and Write it To New Sheet

Dim CandleStickDataReDim CandleStickData(1 To RowCount, 1 To 6)r = 0For Each StockKey In StockMap    Set DateMap = StockMap(StockKey)    For Each DateKey In DateMap        Set ValueList = DateMap(DateKey)        r = r + 1        rowData = ValueList.ToArray        CandleStickData(r, 1) = StockKey        CandleStickData(r, 2) = DateKey        CandleStickData(r, 3) = rowData(0)        CandleStickData(r, 4) = WorksheetFunction.Max(rowData)        CandleStickData(r, 5) = WorksheetFunction.Min(rowData)        CandleStickData(r, 6) = rowData(UBound(rowData))    NextNextWorksheets.AddRange("A1:F1").Value = Array("Stock", "Date", "Open", "High", "Low", "Close")Range("A2").Resize(RowCount, 6).Value = CandleStickDataDebug.Print Round(Timer - t)

I did a quick mock up and it took 21 seconds to load 259,967 rows of data into the dictionaries and ArrayList and just 11 seconds to build a new Array and write it to a worksheet. After the data has been processed, it would be a simply matter of getting the date range and updating the chart table. Changing the stocks or chart data should take no more than 1 tenth of a second.

enter image description here

answeredAug 19, 2019 at 6:52
TinMan's user avatar
\$\endgroup\$
4
  • \$\begingroup\$Wow that is awesome. 'Never thought about returning a range. THANK YOU. FYI: 200,000 rows are all for one stock so will need to call getDateRange() for each increment (like 15 minutes). I also LOVE the idea of passing in the starting row of the previous iteration so reduce the amount of data to go thru as we progress thru the chart. This is awesome and I need to look deeper into you suggestions. THANK YOU!\$\endgroup\$CommentedAug 20, 2019 at 13:17
  • \$\begingroup\$Creating a Candlestick table from the data-set would take about 40 secs. Having 1 record per Candlestick would increase the performance by +15x and the storage capacity by 15x (assuming 15 records per Candlestick). Updating the table weekly would take far less time.\$\endgroup\$CommentedAug 20, 2019 at 17:54
  • \$\begingroup\$I recommend using a scrollbar as opposed to the mouse wheel. You would have to use WinApi calls to hook the mouse wheel.\$\endgroup\$CommentedAug 20, 2019 at 17:55
  • \$\begingroup\$TinMan: Would you be willing to consult for an hour? 'Willing to pay :) If so, my temporary throw-away email, valid for the next few days is...TempAug20Ed@gmail\$\endgroup\$CommentedAug 21, 2019 at 2:43

You mustlog in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.