Get the Data of Fund Performance directly into Excel utilizing VBA and Google

Let’s leave the fascinating world of algebra and Matlab for a while and let’s see what exciting we can do from the Excel level. Microsoft Excel has this ‘thanks God’ programming toolkit known as VBA (Visual Basic for Application) that adds a soul of coding into a mortal body of spreadsheets. If you are an advanced programmer and you are asked or forced to deal with solving problems in Excel/VBA, no doubt you will find programming in VBA as a return to the kindergarten. Saying shortly, VBA is all about fun and relax at work.

Great. As I’m working over a special section of QuantAtRisk.com devoted to From Zero to Hero bootcamp for those of You who wish to brush up or master own skill in VBA and Matlab programming in the application to the quantitative finance and risk management, today I want you to jump with me into the open ocean of VBA application for a specific task that I found very helpful at some stage of my work with the data. Therefore, let’s define the problem first.

The problem is about our quick ability to download some data from the webpage of our choice (e.g. regarding a recent fund or investment option performance) directly into Excel, and next, to look for and extract exactly what we would like to save for our further quantitative analysis.

Since VBA codes for solving this problem are easily spread around the Web, it wouldn’t be challenging to dedicate another article treating the same problem in a new frame. And it’s not my style. Because I love to challenge myself in my private life, I even more love to challenge the problem I wish to solve. I don’t need to look too far as the problems I find around challenge me equally.

1. Case Study

We hunt for the data that are displayed by the data provider within a web-frame, and all VBA codes aimed at getting access to that frame simply fail. Let’s consider the example of InvestSMART.com.au webpage. It’s an Australian portal feeding for some funds its data from the Morningstar.com.au data provider. We are looking for the latest investment option performance of, say, OnePath OA PS – OP Australian Shares EF of the APIR code MMF0022AU:

Screen Shot 2013-10-21 at 11.15.55 PM

and we want to download into Excel the Total Return numbers for 1 Month, 3 Month, and 1 Year performance. In case when the standard ways of getting this information do not work, for example, utilizing the classical variables of doc.body.innerText or doc.body.outerText as in the epic approach:

Dim iFrm As HTMLIFrame
Dim doc As MSHTML.HTMLDocument
 For iterator = 0 To appIE.Document.all.Length - 1
  If TypeName(appIE.Document.all(iterator)) = "HTMLIFrame" Then
   Set iFrm = appIE.Document.all(iterator)
   Set doc = iFrm.Document
   MsgBox doc.body.outerText
  End If
Next

we need to get another way round this problem. One of them is the application of Google Search for the fund performance based on the investment option code.

2. Limit Google Search down to the first listing, download the webpage directly into Excel

That title says everything. We can use Google.com to perform the search for the same investment option among all webpages of InvestSMART as indexed by Google up to date. This solution seems to be more elegant and getting straight to the point. To make this post worth substantially more, imagine, that your task is to repeat the data acquisition process for a long list of options you have in your Excel worksheet:

Screen Shot 2013-10-21 at 11.48.54 PM

We not only look for the way of fetching the required performance numbers but also to automate the whole process to work in the background (or overnight). Given the active Excel worksheet constructed as displayed above, we write VBA code (a macro) making use of Google Search engine. We read in the APIR codes one by one from column B and allow VBA to search the Google by the keyword constructed, in general, as: “InvestSMART”+APIRCode:

'Get Fund Performance directly into Excel utilizing VBA and Google
'
'(c) 2013 QuantAtRisk.com, by Pawel Lachowicz

Sub CheckAPIRCodes()

WB = ActiveWorkbook.Name
WS = ActiveSheet.Name

'Declaration of Variables
Dim i, j, k, r As Long
Dim cellstr, cellstr2, cellstr3, cellstr4, str1, str2, extractedHTML As String
Dim NameB, NameS, MyStr, GoogleSearchPath, SearchString As String
Dim d1, d2, d3, d4, d5, d6, APIRCode As Variant
Dim Nstr, Mstr, c1, c2, c3, c4, c5, c6 As Long
Dim SearchingError As Boolean
Dim iStart, iEnd As Integer
Dim ie As InternetExplorer
Dim RegEx As RegExp, RegMatch As MatchCollection
Dim iedoc, pDisp As Object

SearchingError = False
Set ie = New InternetExplorer
Set RegEx = New RegExp


'Find number of all rows with codes in WS (including header)
Dim Nrow As Long
Nrow = ActiveSheet.Range("B" & Rows.Count).End(xlUp).Row


'--Main Loop over the list of all APIR codes as specified in column B

For r = 1 To Nrow
    Application.Wait (Now + TimeValue("0:00:05"))
    SearchingError = False
    'Read APIR Code from the list (row number = r)
    Workbooks(WB).Worksheets(WS).Activate
    APIRCode = Workbooks(WB).Worksheets(WS).Cells(r, 2).Value
    
    '--GOOGLE Search for APIR Code in InvestSMART
    Set ie = New InternetExplorer
    Set RegEx = New RegExp
    'Search in Google for 'SearchString' keyword
    GoogleSearchPath = "https://www.google.com.au/search?q="
    'The keyword
    SearchString = "InvestSMART+" & APIRCode
    ie.Navigate GoogleSearchPath & SearchString
    'Loop until IE page is full loaded
    Do Until ie.ReadyState = READYSTATE_COMPLETE
    Loop
    'Set regular expression
    MyStr = ""
    MyStr = ie.Document.body.innerText
    Set RegMatch = Nothing
    Set RegMatch = RegEx.Execute(MyStr)
    If RegMatch.Count > 0 Then
        ie.Navigate RegMatch(0)
        Do Until ie.ReadyState = READYSTATE_COMPLETE
        Loop
            'show Internet Explorer
            ie.Visible = True
            Set iedoc = ie.Document
            'Extract the first webpage from the Google search; if failed,
            'handle the error
            extractedHTML = ""
            On Error GoTo ErrHandler:
            extractedHTML = iedoc.getElementById("search").innerHTML
ErrHandler:
            ie.Quit
            Mstr = 0
            Mstr = Len(extractedHTML)
            Resume GoOn:
GoOn:
            If (Mstr = 0) Then
                SearchingError = True
            Else
                SearchingError = False
            End If
            If (SearchingError = False) Then
                'find the first href as this will be the first link, 
                'add 1 to encompass the quote
                iStart = InStr(1, extractedHTML, "href=", vbTextCompare) 
                         + Len("href=") + 1
                'locate the next quote as this will be the end of the href
                iEnd = InStr(iStart, extractedHTML, Chr(34), vbTextCompare)
                'extract text
                extractedHTML = Mid(extractedHTML, iStart, iEnd - iStart)
                Nstr = Len(extractedHTML)
                iStart = 0
                iStart = InStr(1, extractedHTML, "&", vbTextCompare)
                If (iStart > 0) Then
                    str1 = Left(extractedHTML, iStart)
                    str2 = Right(extractedHTML, Nstr - iStart - 4)
                    extractedHTML = str1 & str2
                End If
            End If
            str1 = Left(extractedHTML, 29)
            If (str1 <> "https://www.investsmart.com.au") Then
                SearchingError = True
            End If
            Else
                 'have the best sex ever
        End If

In line #29 the code counts the number of elements listed in column B. This solution works only for the lists having no blank cells between the elements.

The practice with the code also reveals that Google protects itself from the robotic-like requests sent from our VBA level if the time of sending a search request to Google is spaced by a few seconds only. Therefore, the smart step to avoid this obstacle is to add a time delay of about 5 seconds (see line #35). The code works perfectly under Windows environment as it uses Internet Explorer, so please don’t be misled by a screenshot taken from Apple’s MacOS. To the best of my knowledge, there are problems in running this code in Apple due to the lack of right socket replacing IE.

The last step is natural. If Google found and limited its first search entry to the webpage of InvestSMART and requested APIR code, we fetch and download the page directly into Excel. The following solution (among many many available on-line) I found to work the best:

'---- If InvestSMART with APIR code has been found by Google, 
'extract performance numbers
    If (SearchingError = False) Then
        'Download extractedHTML into Excel (new workbook)
        Application.Workbooks.Open (extractedHTML)
        NameB = ActiveWorkbook.Name
        NameS = ActiveSheet.Name

where extractedHTML variable stores the exact web address of the page with the data we are interested in. Check this code separately how does it work with different domains. A little bit of reconnaissance here will help you to discover how this process is powerful for Excel. In short, the extractedHTML webpage is opened in Excel as a new worksheet!

Given that unique experience, we are left with a basic data manipulation based on the data extracted from the Internet:

       For i = 1 To 50
            Workbooks(NameB).Worksheets(NameS).Activate
            cellstr = Workbooks(NameB).Worksheets(NameS).Cells(i, 1).Value
            cellstr2 = Left(cellstr, 16)
            If (cellstr2 = "Fund Performance") Then
                cellstr4 = cellstr
                For j = 1 To 20
                    Workbooks(NameB).Worksheets(NameS).Activate
                    cellstr = Workbooks(NameB).Worksheets(NameS).Cells(i+1,j).Value
                    If (cellstr = "1 Month") Then c1 = j
                    If (cellstr = "3 Month") Then c2 = j
                    If (cellstr = "1 Year") Then c3 = j
                Next j
                For j = i + 1 To i + 5
                    cellstr = Workbooks(NameB).Worksheets(NameS).Cells(j, 1).Value
                    cellstr2 = Left(cellstr, 12)
                    If (cellstr2 = "Total Return") Then
                        k = j
                        'read data from InvestSMART
                        '1M
                        d1 = Workbooks(NameB).Worksheets(NameS).Cells(k, c1).Value
                        '3M (Q)
                        d2 = Workbooks(NameB).Worksheets(NameS).Cells(k, c2).Value
                        '1Y
                        d3 = Workbooks(NameB).Worksheets(NameS).Cells(k, c3).Value
                        'save in a working sheet
                        Workbooks(WB).Worksheets(WS).Activate
                        Workbooks(WB).Worksheets(WS).Cells(r, 3).Value = d1
                        Workbooks(WB).Worksheets(WS).Cells(r, 4).Value = d2
                        Workbooks(WB).Worksheets(WS).Cells(r, 5).Value = d3
                        Workbooks(WB).Worksheets(WS).Cells(r, 6).Value = cellstr4
                        Workbooks(WB).Save
                    End If
                Next j
            End If
        Next i
        'Workbooks(NameB).Close
    End If
    'ie.Quit
    Set RegEx = Nothing
    Set ie = Nothing
Next r

Workbooks(WB).Worksheets(WS).Activate
Range("A1").Select

End Sub

what accomplishes our mutual VBA-Google efforts into the final result of a “Master and his dog” game of throwing and fetching the ball:

Screen Shot 2013-10-22 at 12.25.44 AM

1 comment
  1. For the VBA code to work you need to select the References, which you have not described. Best would be an example excel file.

Leave a Reply

Your email address will not be published. Required fields are marked *