Web scraping is the act of extracting data, information, or images from a website using an automated method. Think of it as copy and paste on full automatic.

We either write or use an app to go to the websites we want it to and make a copy of the specific things we want from those websites. It’s much more precise than downloading an entire website.

Table of Contents

    Like any tool, web scraping can be used for good or evil. Some of the better reasons for scraping websites would be ranking it in a search engine based on its content, price comparison shopping, or monitoring stock market information. You might even use it as a research tool of sorts.

    Use Excel as a Tool to Copy Data from the Web image

    How Can I Scrape Websites with Excel?

    Believe it or not, Excel has had the ability to extract data from websites for a long time, at least since Excel 2003. It’s just that web scraping is something most people don’t think of, let alone think of using a spreadsheet program to do the job. But it’s surprisingly easy, and powerful. Let’s learn how it’s done by making a collection of Microsoft Office keyboard shortcuts.

    Find the Sites You Want to Scrape

    The first thing we’re going to do is find the specific web pages from which we want to get information. Let’s go to the source and search at https://support.office.com/. We’re going to use the search term “frequently used shortcuts”. We can make it more specific by using the name of the specific app, like Outlook, Excel, Word, and so on. It may be a good idea to bookmark the results page so we can easily get back there.

    Find the Sites You Want to Scrape image

    Click on the search result, “Keyboard shortcuts in Excel for Windows”. Once on that page, find the list of Excel versions and click on Newer Versions. Now we’re working with the latest and greatest.

    We could go back to our search results page and open the results for all the other Office apps in their own tabs and bookmark them. It’s a good idea, even for this exercise. This is where most people would stop in collecting Office shortcuts, but not us. We’re going to put them in Excel so we can do whatever we want with them, whenever we want.

    Open Excel and Scrape

    Open Excel and start a new workbook. Save the workbook as Office Shortcuts. If you have OneDrive, save it there so the AutoSave feature will work.

    Once the workbook is saved, click on the Data tab.

    Open Excel and Scrape image

    In the ribbon of the Data tab, click on From Web.

    Open Excel and Scrape image 2

    The From Web wizard window will open. This is where we put the web address or URL of the website from which we want to scrape data. Switch to your web browser and copy the URL.

    Open Excel and Scrape image 3

    Paste the URL in to the URL field of the From Web wizard. We could choose to use this in Basic or Advanced mode. The Advanced mode gives us a lot more options on how to access the data from the website. For this exercise, we only need Basic mode. Click OK.

    Open Excel and Scrape image 4

    Excel will now attempt to connect to the website. This may take a few seconds. We’ll see a progress window, if it does.

    Open Excel and Scrape image 5

    The Navigator window will open, and we’ll see a list of tables from the website on the left. When we select one, we’ll see a table preview on the right. Let’s select the Frequently used shortcuts table.

    Open Excel and Scrape image 6

    We can click on the Web View tab to see the actual website, if we need to look around for the table we want. When we find it, we can click on it and it will be selected for import.

    Open Excel and Scrape image 7

    Now, we click on the Load button at the bottom of this window. There are other options we could choose, which are more complex and beyond the scope of doing our first scrape. Just be aware that they are there. Excel’s web scraping capabilities are very powerful.

    Open Excel and Scrape image 8

    The web table will be loaded in Excel after a few seconds. We’ll see the data on the left, where the number 1 is in the picture below. Number 2 highlights the Query used to get the data from the website. When we have multiple queries in a work book, this is where we select the one we need to use.

    Open Excel and Scrape image 9

    Notice that the data comes into the spreadsheet as an Excel table. It’s already set up for us to be able to filter or sort the data.

    We can repeat this process for all the other web pages that have the Office shortcuts that we want for Outlook, Word, Access, PowerPoint, and any other Office app.

    Keeping Scraped Data Current in Excel

    As a bonus for you, we’re going to learn how to keep our scraped data fresh in Excel. This is a great way to illustrate just how powerful Excel is for data scraping. Even with this, we’re only doing the most basic scraping that Excel can do.

    For this example, let’s use a stock information web page like https://www.cnbc.com/stocks/.

    Keeping Scraped Data Current in Excel image

    Go through what we did before and copy and paste the new URL from the address bar.

    Keeping Scraped Data Current in Excel image 2

    You’ll get to the Navigator window and see the tables available. Let’s select the Major U.S. Stock Indices.

    Keeping Scraped Data Current in Excel image 3

    Once the data is scraped we’ll see the following spreadsheet.

    Keeping Scraped Data Current in Excel image 4

    On the right, we see the query for Major U.S. Stock Indexes. Select that so it is highlighted. Make sure we’re in the Table Tools tab and in the Design area. Then click on the down arrow under Refresh. Then click on Connection Properties.

    Keeping Scraped Data Current in Excel image 5

    In the Query Properties window, under the Usage tab, we can control how this information refreshes. We can set a specific time period to refresh, or to refresh when we open the workbook the next time, or to refresh in the background, or any combination of these. Once we choose what we need, click on OK to close the window and continue.

    Keeping Scraped Data Current in Excel image 6

    That’s it! Now you can track stock prices, sports scores, or any other data that changes frequently from an Excel spreadsheet. If you’re good with Excel equations and functions, you can do almost anything you want with the data.

    Maybe try to identify stock trends, run a fantasy sports pool at work, or maybe just keep track of the weather. Who knows? Your imagination and the data available on the Internet, are the only limits.

    Leave a Reply

    Your email address will not be published. Required fields are marked *