Access Answers: Web Service Wannabes

Doug Steele

This month, Doug Steele explores a way to get information from the Internet.        


There's a Web page that has information that changes over time. I want to be able to read that page, and get the most up-to-date information from it. How can I do this?

Realistically, this sounds like a job for Web Services. However, sometimes the provider doesn't provide a Web Service. Depending on the Web page, it may be possible to simulate a Web Service using a technique often referred to as "screen scraping."

To do this, I'm going to take advantage of the XMLHTTP object that should already exist on your machine, even if you don't know about it. As long as you've installed Internet Explorer 4.0 or later, you should have the file msxml.dll on your machine. If it's there, you're in business.
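If you want to confirm that the object is available before relying on it, a quick late-bound check like the following will do. (This check is my own sketch, not part of the download for this column.)

Function XMLHTTPAvailable() As Boolean
On Error Resume Next
Dim objTest As Object
  'If msxml.dll isn't installed, CreateObject fails
  'and objTest remains Nothing
  Set objTest = CreateObject("Microsoft.XMLHTTP")
  XMLHTTPAvailable = Not (objTest Is Nothing)
  Set objTest = Nothing
End Function

If the function returns False, msxml.dll probably isn't installed or registered on that machine.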

What I'm going to describe isn't the prettiest solution, but I find that this method works. This method does require that you have some knowledge of HTML, since what you're going to be retrieving from the Web site is the actual HTML that makes up the page, not the text that appears in your browser. As well, you'll see that this technique isn't really that amenable to sites that change their format frequently.

Also, I'm going to completely ignore how you initiate communication with the Internet if you don't have a dedicated connection (I have cable, so I have no way of testing any code for dialing up). If it's any consolation, I can at least point you to code that you can use to determine whether or not you have a live Internet connection. Microsoft VB MVP Randy Birch has some code on his VBNet site, although one of his solutions requires some tweaking to be applied in Access (the code uses a control array, which Access doesn't support). Another bit of code you can check is Michael Kaplan's connect.bas module (I found that using 0&, a zero followed by an ampersand, instead of StrPtr(App.Title) in the call to Michael's InternetOpen function worked for Access).

Having said all that, the following function returns the HTML associated with a given URL as a string:

Function GetFromWebpage(URL As String) As String
On Error GoTo Err_GetFromWebpage

Dim objWeb As Object
Dim strXML As String

  Set objWeb = CreateObject("Microsoft.XMLHTTP")
  objWeb.Open "GET", URL, False
  objWeb.Send
  strXML = objWeb.responseText
  GetFromWebpage = strXML

End_GetFromWebpage:
  Set objWeb = Nothing
  Exit Function

Err_GetFromWebpage:
  MsgBox Err.Description & " (" & Err.Number & ")"
  Resume End_GetFromWebpage

End Function

The routine begins by instantiating the XMLHTTP object, gives the object a URL to retrieve, and calls its Send method. When the Send method is done, the code looks at the responseText property of the object. The problem with this code is that there's a lot of stuff returned that you probably don't care about: You get all of the HTML required to render the page in your browser, not just the data you're seeking. If there's a specific piece of information on the page that you want, you now have to parse the page (using functions like InStr, Left, Mid, and Right) to get only that part of the HTML that's of interest to you.
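That parsing always follows the same pattern: find a starting marker with InStr, find the ending marker after it, and pull out what's in between with Mid$. You may find it convenient to wrap that pattern in a small helper like the one below (my own sketch, not part of the download); the stock-quote function later in this article applies the same idea step by step:

Function TextBetween(Source As String, _
                     StartTag As String, _
                     EndTag As String) As String
Dim lngStart As Long
Dim lngEnd As Long
  'Find the start marker, case-insensitively
  lngStart = InStr(1, Source, StartTag, vbTextCompare)
  If lngStart > 0 Then
    lngStart = lngStart + Len(StartTag)
    'Find the end marker after the start marker
    lngEnd = InStr(lngStart, Source, EndTag, vbTextCompare)
    If lngEnd >= lngStart Then
      TextBetween = Mid$(Source, lngStart, lngEnd - lngStart)
    End If
  End If
End Function

For example, TextBetween("<b>25.62</b>", "<b>", "</b>") returns "25.62". If either marker is missing, the function returns a zero-length string.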

I'll present a simple example. Let's say you want to be able to get stock quotes. There are lots of Web sites that will give you stock quotes; for no particular reason, I picked one of them. Looking at that site, you can go directly to a given stock's metrics if you know the symbol for that stock. Picking everyone's favorite, you can look up Microsoft's stock price using its symbol, MSFT. (Warning: Here's where writing these articles weeks or months in advance becomes a problem! When I wrote this at the beginning of August, the site appeared as shown in Figure 1. However, there was a warning on the page that they were in the process of testing a new interface. Hopefully they haven't implemented it before you read this!)

When I look at the HTML associated with that table, I see something like this:

<table cellspacing=1 cellpadding=4 border=0 width=100%>
<tr bgcolor=dcdcdc><th nowrap><font face=arial
size=-1>Symbol</font></th><th nowrap colspan=2><font
face=arial size=-1>Last Trade</font></th><th nowrap
colspan=2><font face=arial size=-1>Change</font></th>
<th nowrap><font face=arial size=-1>Volume</font></th>

<tr align=right bgcolor=white>
<td nowrap align=left><font face=arial size=-1>
<a href="/q?s=MSFT&d=t">MSFT</a></font></td>
<td nowrap align=center><font face=arial size=-1>
12:22pm</font></td><td nowrap><font face=arial
size=-1><b>25.62</b></font></td><td nowrap><font
face=arial size=-1><font color=ff0020>-0.09</font>
</font></td><td nowrap><font face=arial size=-1><font
color=ff0020>-0.35%</font></font></td><td nowrap>
<font face=arial size=-1>17,721,532</font></td>

and so on

The pattern for this page is consistent:

• The stock symbol appears after the HTML tags <font face=arial size=-1><a href="/q?s=.

• Last Trade information appears after the following <font face=arial size=-1> tag.

• The actual stock price follows the <font face=arial size=-1> following the Last Trade Information (and is enclosed in <b></b> tags).

It isn't that difficult to write a function that retrieves the HTML for the entire page, and then searches for those specific bits of text in the HTML. Since the code I showed before retrieves the entire HTML, all I need to do is write code to parse all of the ugly parts and find the important stuff buried in it:

Function GetStockQuote(CompanySymbol As String) _
                              As String
On Error GoTo Err_GetStockQuote

Dim lngDateEnd As Long
Dim lngDateLen As Long
Dim lngDateStart As Long
Dim lngPointer As Long
Dim lngQuoteEnd As Long
Dim lngQuoteLen As Long
Dim lngQuoteStart As Long
Dim strDate As String
Dim strHTML As String
Dim strQuote As String
Dim strReturn As String
Dim strSearchFor As String
Dim strURL As String

  'The quote site's base URL was elided in the
  'original; supply it here
  strURL = "" & CompanySymbol

  strHTML = GetFromWebpage(strURL)
  strReturn = "Sorry: Couldn't get the quote for " & _
              CompanySymbol
  strSearchFor = "<font face=arial size=-1>" & _
                 "<a href=""/q?s=" & _
                 CompanySymbol & "&d=t"
  lngPointer = InStr(1, strHTML, _
                 strSearchFor, vbTextCompare)
  If lngPointer > 0 Then
    strSearchFor = "<font face=arial size=-1>"
    lngPointer = lngPointer + Len(strSearchFor)
    lngDateStart = InStr(lngPointer, strHTML, _
                   strSearchFor, vbTextCompare)
    If lngDateStart > 0 Then
      lngDateStart = lngDateStart + Len(strSearchFor)
      lngDateEnd = InStr(lngDateStart, strHTML, _
                   "</font>", vbTextCompare)
      If lngDateEnd > 0 Then
        lngDateLen = lngDateEnd - lngDateStart
        strDate = Mid$(strHTML, lngDateStart, lngDateLen)
        lngQuoteStart = InStr(lngDateEnd, strHTML, _
                        strSearchFor, vbTextCompare)
        If lngQuoteStart > 0 Then
          lngQuoteStart = lngQuoteStart + _
                          Len(strSearchFor)
          lngQuoteEnd = InStr(lngQuoteStart, strHTML, _
                        "</font>", vbTextCompare)
          If lngQuoteEnd > 0 Then
            lngQuoteLen = lngQuoteEnd - lngQuoteStart
            strQuote = Mid$(strHTML, lngQuoteStart, _
                            lngQuoteLen)
            strReturn = UCase$(CompanySymbol) & _
                       " last traded for " & _
                       RemoveBold(Trim$(strQuote)) & _
                       " at " & Trim$(strDate)
          End If
        End If
      End If
    End If
  End If

End_GetStockQuote:
  GetStockQuote = strReturn
  Exit Function

Err_GetStockQuote:
  MsgBox Err.Description & " (" & Err.Number & ")"
  Resume End_GetStockQuote

End Function

The RemoveBold function I use when concatenating everything for output is just this very simple code that replaces all occurrences of either <b> or </b> with a zero-length string (vbNullString):

Function RemoveBold(StringToEdit As String) As String
  RemoveBold = Replace( _
     Replace(StringToEdit, "<b>", vbNullString), _
     "</b>", vbNullString)
End Function
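If you find yourself stripping more than one kind of tag, the same nested-Replace approach generalizes. Here's a sketch of my own (not part of the download) that removes whatever tag pair you pass in:

Function RemoveTagPair(StringToEdit As String, _
                       TagName As String) As String
  'Strip both the opening and closing forms of the tag,
  'ignoring case in case the page uses <B> instead of <b>
  RemoveTagPair = Replace( _
     Replace(StringToEdit, "<" & TagName & ">", _
             vbNullString, Compare:=vbTextCompare), _
     "</" & TagName & ">", vbNullString, _
     Compare:=vbTextCompare)
End Function

Calling RemoveTagPair(strQuote, "b") gives the same result as RemoveBold(strQuote).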

As you can see, my sample search function isn't particularly pretty code, but it does what it's supposed to. The problem, of course, is that this code will fail if the Web page's format changes (or if there are errors in the HTML generated to produce the page). And a reasonable understanding of HTML is required to pinpoint the exact location in the file of the information you want.


Your download file is called AA1103.ZIP.


See Also

Giving Your Access Form Controls Web-style Behaviors