You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Houzz is a platform that connects homeowners with home professionals and offers tools, resources, and vendors for home renovation and design.

You can go to "Task Templates" on the main screen of Octoparse and start with the ready-to-use Houzz templates to save time. With a template, there is no need to configure the scraping task yourself. For further details, see: Task Templates

This tutorial shows you how to collect professional details, such as title, rating, location, and description, from Houzz.com with Octoparse.

SNAG-houzz0002.jpg

To follow through, you may want to use the URL below:

https://www.houzz.com/professionals/architects-and-building-designers/

Here are the main steps in this tutorial: [Download task file here]

  1. Create a Go to Web Page - to open the target website

  2. Auto-detect the webpage - to create a workflow

  3. Click to open the detailed webpage - to load extra information about a Professional

  4. Extract data on the detailed webpage - to collect the Professional’s description

  5. Run the task - to get target data


1. Create a Go to Web Page - to open the target website

  • Enter the page URL on the home screen and click Start to create a new task

SNAG-houzz0009.jpg

2. Auto-detect the webpage - to create a workflow

  • Choose Auto-detect webpage and wait for the detection to complete

SNAG-houzz0008.jpg

  • Check the data fields in Data preview, then delete or rename fields as needed

SNAG-houzz0006.jpg

NOTE: Do not delete the URL field, as it is needed to open the detail page.

  • Uncheck Add a page scroll

  • Click Create workflow

SNAG-houzz0007.jpg

3. Click to open the detailed webpage - to load extra information about a Professional

  • Choose Click on link(s) to scrape the linked page(s) in the Tips panel

SNAG-houzz0004.jpg

  • Select the data field as Title_URL

  • Click Confirm

SNAG-houzz0003.jpg

4. Extract data on the detailed webpage - to collect the Professional’s description

  • Click Read More in the About us section

  • Choose Click button

SNAG-houzz0002.jpg

NOTE: The position of the Read More button may differ from one detail page to another. To click it reliably on every page, modify the matching XPath of the Read More click step as follows.

  • Choose Click Item

  • Input the Matching XPath as: //div[@data-compid="profile-about"]/button

xpath.jpg

  • Select the text needed

  • Click Extract the text of the element

SNAG-houzz0001.jpg
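
To see why the matching XPath in this step targets the right button, here is a minimal Python sketch that evaluates the same expression against a small HTML fragment. The fragment is a hypothetical, simplified stand-in for a Houzz detail page (the real markup is larger and may differ), and the sketch uses Python's standard-library ElementTree, which supports only a subset of XPath, rather than the engine Octoparse itself uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a detail page. Only the
# data-compid="profile-about" attribute comes from the tutorial's XPath;
# every other element here is an assumption made for illustration.
SAMPLE_PAGE = """
<html><body>
  <div data-compid="profile-header"><button>Send Message</button></div>
  <div data-compid="profile-about">
    <p>Short description</p>
    <button>Read More</button>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE_PAGE)

# ElementTree needs a relative path, so the tutorial's leading "//"
# is written as ".//" here. The attribute predicate is unchanged.
matches = root.findall(".//div[@data-compid='profile-about']/button")

read_more_labels = [button.text for button in matches]
print(read_more_labels)
```

Because the expression is tied to the data-compid attribute of the About section rather than to the button's position on the page, the button in the other section of the fragment is not matched.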

5. Run the task - to get target data

  • Click Save on the upper right to save your task

  • Click Run next to it and wait for a Run Task window to pop up

  • Select Run on your device to run the task on your local device

  • Wait for the task to complete

Here is sample output from a local run:

data_preview.jpg
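
If you export the results to a CSV file, a short script can tidy the text before further use. The following is only a sketch: the column names (Title, Rating, Location, Description) mirror the fields named in this tutorial but are assumptions about how your export is labeled, the sample row is an invented placeholder rather than real Houzz data, and clean_rows is a hypothetical helper name.

```python
import csv
import io

# Placeholder export content. Column names and the row are invented for
# illustration; replace this with your own exported file.
SAMPLE_EXPORT = (
    'Title,Rating,Location,Description\n'
    'Example Studio A,4.8,"City, ST","  We design   homes.  "\n'
)

def clean_rows(handle):
    """Read CSV rows and collapse runs of whitespace in every field."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({key: " ".join(value.split()) for key, value in row.items()})
    return rows

rows = clean_rows(io.StringIO(SAMPLE_EXPORT))
print(rows[0]["Description"])
```

To run it on a real export, pass an opened file instead of the in-memory sample, for example open("houzz.csv", newline="", encoding="utf-8"), where the file name is again an assumption.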