Page 1 of 4 1234 LastLast
Results 1 to 15 of 51
  1. #1
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62

    Scraping a website

    I'm trying to write VBA code that will take a tracking ID from an Access form, look it up on the FedEx website and then return the delivery date. I have this code so far:




    Code:
    Private Sub Test()
    Code:
        'Need reference to Microsoft Forms 2.0 (FM20.dll)
        Const PREFIX_FOR_FEDEX_LINK As String = "https://www.bing.com/packagetrackingv2?packNum="
        Const SUFFIX_FOR_FEDEX_LINK As String = "&carrier=Fedex&FORM=PCKTR1"
        Dim IEbrowser As Object
        Dim MyData As MSForms.DataObject
        Set MyData = New MSForms.DataObject
        Set IEbrowser = CreateObject("InternetExplorer.Application")
        With IEbrowser
            .Visible = True
            .Navigate PREFIX_FOR_FEDEX_LINK & "581190049992" & SUFFIX_FOR_FEDEX_LINK
            'Wait for page to finish loading.
            Do While .Busy Or .READYSTATE <> READYSTATE.READYSTATE_COMPLETE: DoEvents: Loop
            'SelectAll (Ctrl+A)
            .ExecWB 17, 2
            'Copy selection (Ctrl+C)
            .ExecWB 12, 2
            MyData.GetFromClipboard
            Me!txt2 = Left(MyData.GetText(), 100)
            .Quit
        End With
    End Sub
    




    But this returns everything. How can I get it to just return a single piece of information? The code I have is from a few examples on the Internet but I couldn't find anything that speaks directly to my issue. Thanks for your help.

  2. #2
    Edgar is online now Competent Performer
    Windows 8 Access 2016
    Join Date
    Dec 2022
    Posts
    274
    I get this:
    Code:
    <html>
      <head></head>
      <body class="vsc-initialized">
        <div class="b_rich">
          <div class="b_vPanel">
            <div>
              <div class="b_focusTextSmall">Delivered: viernes, 13 de mayo de 2022 12:59</div>
            </div>
            <div>
              <div class="progressBarParent">
                <div class="pr-tr-currentState">Package current status Delivered </div>
                <div class="p-line-tr">
                  <div class="pr-tr-img pt-p-third pr-tr_last">
                    <img aria-hidden="true" alt="Delivery truck" class="rms_img">
                  </div>
                  <div class="pt-p-first pt-line_finished"></div>
                  <div class="pt-p-second pt-line_finished"></div>
                </div>
                <div class="p-line-text">
                  <span aria-hidden="true" class="p_tr_Processed rpt_p_done">Processed</span>
                  <span aria-hidden="true" class="p_tr_intransit rpt_p_done">In transit</span>
                  <span aria-hidden="true" class="p_tr_delivered  rpt_p_done">Delivered</span>
                </div>
              </div>
            </div>
            <div>
              <table class="rpt_se">
                <tbody>
                  <tr>
                    <th id="6ca5ceb4c89847c2bd8a82960b4b55d1" scope="col" class="pt_header">DATE</th>
                    <th id="39345da1865546e2b53040f7432e7805" scope="col" class="pt_header pt_header_time">TIME</th>
                    <th id="26a84ff21bc94b3a9f7caebddb667931" scope="col" class="pt_header pt_header_Loc">LOCATION</th>
                    <th id="ed190c2bac754bbba9935f8393c78c0c" scope="col" class="pt_header">STATUS</th>
                  </tr>
                  <tr>
                    <td class="  pt_Cell" headers="6ca5ceb4c89847c2bd8a82960b4b55d1" width="12%">13 may.</td>
                    <td class="b_rTxt pt_Cell" headers="39345da1865546e2b53040f7432e7805" width="10%">12:59</td>
                    <td class="  pt_location_cell pt_Cell" headers="26a84ff21bc94b3a9f7caebddb667931" width="33%">Brockton, MA, United States</td>
                    <td class="  rpt_se_rm pt_Cell" headers="ed190c2bac754bbba9935f8393c78c0c" width="45%">Delivered</td>
                  </tr>
                  <tr class="toggledItem">
                    <td class="  pt_Cell" headers="6ca5ceb4c89847c2bd8a82960b4b55d1" width="12%">13 may.</td>
                    <td class="b_rTxt pt_Cell" headers="39345da1865546e2b53040f7432e7805" width="10%">7:25</td>
                    <td class="  pt_location_cell pt_Cell" headers="26a84ff21bc94b3a9f7caebddb667931" width="33%">Brockton, MA, United States</td>
                    <td class="  rpt_se_rm pt_Cell" headers="ed190c2bac754bbba9935f8393c78c0c" width="45%">On FedEx vehicle for delivery</td>
                  </tr>
                  <tr class="toggledItem">
                    <td class="  pt_Cell" headers="6ca5ceb4c89847c2bd8a82960b4b55d1" width="12%">13 may.</td>
                    <td class="b_rTxt pt_Cell" headers="39345da1865546e2b53040f7432e7805" width="10%">7:18</td>
                    <td class="  pt_location_cell pt_Cell" headers="26a84ff21bc94b3a9f7caebddb667931" width="33%">Brockton, MA, United States</td>
                    <td class="  rpt_se_rm pt_Cell" headers="ed190c2bac754bbba9935f8393c78c0c" width="45%">At local FedEx facility</td>
                  </tr>
                  <tr class="toggledItem">
                    <td class="  pt_Cell" headers="6ca5ceb4c89847c2bd8a82960b4b55d1" width="12%">13 may.</td>
                    <td class="b_rTxt pt_Cell" headers="39345da1865546e2b53040f7432e7805" width="10%">4:17</td>
                    <td class="  pt_location_cell pt_Cell" headers="26a84ff21bc94b3a9f7caebddb667931" width="33%">Willington, CT, United States</td>
                    <td class="  rpt_se_rm pt_Cell" headers="ed190c2bac754bbba9935f8393c78c0c" width="45%">Departed FedEx location</td>
                  </tr>
                  <tr class="toggledItem">
                    <td class="  pt_Cell" headers="6ca5ceb4c89847c2bd8a82960b4b55d1" width="12%">13 may.</td>
                    <td class="b_rTxt pt_Cell" headers="39345da1865546e2b53040f7432e7805" width="10%">2:56</td>
                    <td class="  pt_location_cell pt_Cell" headers="26a84ff21bc94b3a9f7caebddb667931" width="33%">Willington, CT, United States</td>
                    <td class="  rpt_se_rm pt_Cell" headers="ed190c2bac754bbba9935f8393c78c0c" width="45%">Arrived at FedEx location</td>
                  </tr>
                </tbody>
              </table>
            </div>
          </div>
        </div>
      </body>
    </html>
    What part of this do you need?
    Please click on the ⭐ below if this post helped you.


  3. #3
    June7's Avatar
    June7 is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,929
    View the source code of the retrieved page and will see it is organized with div tags. The specific info you want is:

    <div class="b_focusTextSmall">Delivered: Fri, May 13, 12:59 PM</div>

    Maybe the element can be directly referenced, otherwise best I think you can do is to parse data from the entire string you retrieve. You seem to accomplish that with the Left() function.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  4. #4
    Edgar is online now Competent Performer
    Windows 8 Access 2016
    Join Date
    Dec 2022
    Posts
    274
    Then this?
    Code:
    Sub fsdfk()
        Dim req As Object
        Dim res As Object
        Set req = CreateObject("msxml2.xmlhttp")
        req.Open "get", "https://www.bing.com/packagetrackingv2?packNum=581190049992&carrier=Fedex&FORM=PCKTR1", False
        req.send
        Set res = CreateObject("htmlfile")
        res.body.innerHTML = req.responseText
        
        Debug.Print res.getElementsByClassName("b_focusTextSmall")(0).innerText
    End Sub
    Please click on the ⭐ below if this post helped you.


  5. #5
    June7's Avatar
    June7 is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,929
    I get "does not support" error on the Print line.

    It works with this change:

    Dim res As HTMLDocument
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  6. #6
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    Quote Originally Posted by June7 View Post
    I get "does not support" error on the Print line.

    It works with this change:

    Dim res As HTMLDocument

    Thank you so much for your help with this, June7.
    I haven't tested it yet but I will tomorrow.
    Do you know how I can get the year in the delivery date? It shows up in Edgar's version in Spanish but not in the English version for some reason.

  7. #7
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    Thank you so much for your help with this, Edgar.
    I haven't tested it yet but I will tomorrow.
    Do you know how I can get the year in the delivery date? It shows up in your Spanish version but not in the English version for some reason.

  8. #8
    June7's Avatar
    June7 is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,929
    Can't get what isn't there. The English version does not provide year.

    Somebody messed up when they programmed the page.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  9. #9
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    Maybe I can pull it from the Spanish version somehow.

  10. #10
    Edgar is online now Competent Performer
    Windows 8 Access 2016
    Join Date
    Dec 2022
    Posts
    274
    Maybe you can send some headers to make that website think you need spanish locale
    https://developer.mozilla.org/en-US/...ccept-Language
    Please click on the ⭐ below if this post helped you.


  11. #11
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    Quote Originally Posted by Edgar View Post
    Maybe you can send some headers to make that website think you need spanish locale
    https://developer.mozilla.org/en-US/...ccept-Language

    I'm going to try going to the FedEx website using this link to see if I can get the year:

    "https://www.fedex.com/apps/fedextrack/?action=track&trackingnumber=581190049992&cntry_co de=us&locale=en_US"

  12. #12
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    Can you please tell me how I extract the delivery date information from this website? I don't know how to extract data from HTML code.

    "https://www.fedex.com/apps/fedextrack/?action=track&trackingnumber=581190049992&cntry_co de=us&locale=en_US"

  13. #13
    June7's Avatar
    June7 is online now VIP
    Windows 10 Access 2010 32bit
    Join Date
    May 2011
    Location
    The Great Land
    Posts
    52,929
    Viewing the source of that page and the data you want is not there. Instead, there are calls to server side scripts that must be pulling data for display. I don't see how the data you want can be pulled from this page.
    How to attach file: http://www.accessforums.net/showthread.php?t=70301 To provide db: copy, remove confidential data, run compact & repair, zip w/Windows Compression.

  14. #14
    Euler is offline Advanced Beginner
    Windows 7 32bit Access 2010 32bit
    Join Date
    Feb 2015
    Posts
    62
    When I right click on the website page and select "Inspect" I get a window showing the HTML code, I think. When I right click on the Elements window and select "Copy" then "Copy outerHTML" and paste it in Word and do a search I see this:

    direction--column fdx-u-display--flex"><!----><span class="deliveryDateText">Friday</span><!----><span class="deliveryDateTextBetween">5/13/22 at 12:59 PM</span><!----><!----><!----><!----><!----><!----><!----><!----><!----></div><!----><!----><!----></div><!----></trk-shared-shipment-status-delivery-date><trk-

    Is there some way I can capture the data from this?

  15. #15
    Edgar is online now Competent Performer
    Windows 8 Access 2016
    Join Date
    Dec 2022
    Posts
    274
    I couldn't send a header to make this url return a different date format:
    https://www.bing.com/packagetracking...ex&FORM=PCKTR1
    However, using my VPN, I was able to see that the date format changes according to the IP. So using a VPN will let you return a good date.

    The second website sends a series of http requests while loading the data, so our http request procedure will only bring the initial site before it loads stuff. For this kind of workflow, a browser automation is better. However, I tested that it fails quite often because of it being internet explorer (or maybe I sent too many requests too fast). You could however use Selenium or Puppeteer for that. Or use another method. Having the VPN wouldn't be a bad idea, bing can give you a good date if you use an IP outside the US. It would save you a lot of time.

    In order to get the text with class "deliveryDateTextBetween", you would use:
    Document.getElementsByClassName("deliveryDateTextB etween")(0).innerText in VBA
    Document.getElementsByClassName("deliveryDateTextB etween")[0].innerText in JS

    Don't give up with VBA, though. There are other approaches to take. Maybe you can open the IE window, then grab it using shell and get its document element to inspect it.
    Please click on the ⭐ below if this post helped you.


Page 1 of 4 1234 LastLast
Please reply to this thread with any new information or opinions.

Similar Threads

  1. Query from Website
    By ajmac in forum Forms
    Replies: 2
    Last Post: 07-30-2020, 07:24 AM
  2. Replies: 0
    Last Post: 12-06-2016, 11:28 AM
  3. Access on Website
    By Kevo in forum Access
    Replies: 6
    Last Post: 07-08-2015, 11:23 AM
  4. Web Scraping with MS Access 2010
    By Fabricio Sanches in forum Import/Export Data
    Replies: 1
    Last Post: 02-26-2012, 04:33 PM
  5. Scraping text from web into Access
    By abbeyainscal in forum Import/Export Data
    Replies: 5
    Last Post: 04-10-2010, 01:43 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Other Forums: Microsoft Office Forums