Table data is common on websites that cover finance, sports, and similar topics. This tutorial shows you how to scrape it.

If you have already learned how to extract a list of data (see Extract a list), scraping table data works in much the same way. Treat each row of the table as one element of the list; each table cell is then a sub-element of that row.
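To make the row-and-cell idea concrete, here is a minimal Python sketch that reads a small table into a list of rows, where each row is a list of cell texts. It only illustrates the data model and is not how Octoparse works internally; the sample markup and the `TableRows` class name are invented for this example.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect each <tr> as a list of its cell texts (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one entry per table row (a list element)
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell; whitespace between tags is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample markup, not taken from the case URL.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Price</th><th>Change</th></tr>
  <tr><td>Example Corp</td><td>10.50</td><td>+0.25</td></tr>
  <tr><td>Sample Inc</td><td>20.00</td><td>-1.10</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)  # each row is a list element; each cell is a sub-element
```

Each printed list corresponds to one table row, which is the same structure you build in Octoparse when you select a row and then its sub-elements.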

So how do you collect table data with Octoparse? Follow the steps in this tutorial.

Case URL: https://money.cnn.com/data/hotstocks/index.html

  1. Use the Auto-detect function to set up the workflow

  2. Set up the workflow manually


1. Use the Auto-detect function to set up the workflow

Octoparse can auto-detect a table and capture all of its columns. With this feature, you only need to:

  • Copy the URL into Octoparse and click Start to create a new task

  • Click on Auto-detect web page data in the Tips panel to create a workflow

  • Check that all table cells have been captured, then click Create workflow

TIP: Check out Lesson 1: Start with Auto-detect for details about auto-detect.


2. Set up the workflow manually

What if auto-detect fails or does not collect the complete table data? In that case, you need to set up the task manually. Here are the steps:

  • Select the first cell in the first row of the table, then click the Expand the selection area button until the whole first row is selected

TIP: You can click Turn OFF Auto-detect or Cancel Auto-detect to stop auto-detect if it starts automatically.

  • Choose Select all sub-elements in the Tips panel.

All the sub-elements in the first row are now selected, and Octoparse highlights the other similar elements it finds in red.

  • Choose Select all from the Tips panel.

All the sub-elements in the table are now selected and highlighted in green.

  • Click Extract data in the Tips panel.

  • Edit data fields if needed (optional)

You now have all the data fields set up for the task. You can refine the data fields in the "Data Preview" section.

  • Double-click a field name to rename that data field

  • Click the More button next to the field's name for more actions: delete, copy, clean data, etc.
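If you prefer to tidy the data after exporting it, the same kind of renaming and cleaning can be done in a short script. The Python sketch below is only an illustration that runs outside Octoparse: the CSV content, the default column names (`Field1`, `Field2`, `Field3`), and the `rename_and_clean` helper are all assumptions made for this example, not part of the tool.

```python
import csv
import io

# Invented export: these column names and values are assumptions for illustration.
EXPORTED_CSV = """Field1,Field2,Field3
Example Corp, 10.50 ,+0.25
Sample Inc, 20.00 ,-1.10
"""

# Hypothetical mapping from default field names to descriptive ones.
RENAMES = {"Field1": "company", "Field2": "price", "Field3": "change"}


def rename_and_clean(csv_text, renames):
    """Rename columns and strip surrounding whitespace from every value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for record in reader:
        cleaned.append(
            {renames.get(name, name): value.strip() for name, value in record.items()}
        )
    return cleaned


rows = rename_and_clean(EXPORTED_CSV, RENAMES)
for row in rows:
    print(row)
```

Columns that are not listed in the mapping keep their original names, so you can rename only the fields you care about.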
