Scrape data from a table

Tables are common on websites that cover finance, sports, and similar topics. This tutorial shows you how to scrape table data.

If you already know how to extract a list of data (see Extract a list), scraping a table works in much the same way. Treat each table row as one element of the list, and each cell in that row as a sub-element of that element.
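To make the row-and-cell model concrete, here is a minimal Python sketch that parses a small HTML table into a list of rows, where each row is itself a list of cell values. It only illustrates the data model; it is not how Octoparse works internally. The sample table, the `TableRows` class, and the `parse_table` helper are all made up for this example.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect each <tr> as a list of cell strings (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # list of rows; each row is a list of cell texts
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample data, used only for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Wins</th></tr>
  <tr><td>Lions</td><td>10</td></tr>
  <tr><td>Bears</td><td>7</td></tr>
</table>
"""


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = parse_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

Each printed line is one list element (a row), and each string inside it is a sub-element (a cell), which is the same structure you build in an Octoparse task.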

The steps below show how to collect table data with Octoparse.

The main steps are shown in the menu on the right. [Download the sample task file here.]


1. Use the Auto-detect function to set up the workflow

Octoparse can auto-detect a table and capture all of its columns. With this feature, you only need to:

  • Copy the URL into Octoparse and click Start to create a new task

  • Click on Auto-detect web page data in the Tips panel to create a workflow

  • Check whether the data preview shows the data you want. If not, click Switch auto detect results until the right data appears.

  • Untick the Add a page scroll option if the website does not require it

  • Click Create workflow

The created workflow will look like this:

Tip: Check out Lesson 1: Start with Auto-detect for details about auto-detect.


2. Set up the workflow manually

What if auto-detect fails or doesn't capture the complete table? In that case, you need to set up the task manually. Here are the steps:

  • Select the first cell in the first row of the table, and then click the Expand the selection button until the whole first row is selected

Tip: You can click Turn OFF Auto-detection or Cancel Auto-detection to stop auto-detect if it starts automatically.

  • Choose Select all child elements on the Tips panel.

All the child elements in the first row are now selected, and Octoparse highlights the other similar elements it finds in red.

  • Choose Select all similar groups from the Tips panel.

All the child elements in the table are selected and highlighted in green.

  • Click Element data on the Tips panel.

  • Edit data fields if needed (optional)

You now have all the data fields set up for the task. You can refine the data fields in the Data Preview section.

  • Double-click the field name to rename the data fields

  • Click the More button next to the field's name for more actions
