Scrape professional details from Houzz
Houzz is a platform that connects homeowners with home professionals and offers tools, resources, and vendors for home renovation and design.
To save time, you can open "Task Templates" on the main screen of Octoparse and start from a ready-to-use Houzz template, which requires no task configuration. For further details, see: Task Templates
This tutorial will show you how to collect professional details, such as title, rating, location, and description on Houzz.com with Octoparse.
To follow along, use the URL below:
https://www.houzz.com/professionals/architects-and-building-designers/
Here are the main steps in this tutorial:
- Create a Go to Web Page - to open the target website
- Auto-detect the webpage - to create a workflow
- Click to open the detailed webpage - to load extra information about a Professional
- Extract data on the detailed webpage - to collect the Professional’s description
- Run the task - to get your desired data
1. Create a Go to Web Page - to open the target website
- Enter the page URL on the home screen and click Start to create a new task
2. Auto-detect the webpage - to create a workflow
- Choose Auto-detect webpage and wait for the detection to complete
- Check the data fields in Data preview and delete unwanted fields or rename them if needed
Tip: Remember to keep the URL of each listing item, as we need to use it to open the detail page.
- Uncheck Add a page scroll
- Click Create workflow
3. Click to open the detailed webpage - to load extra information about a Professional
- Choose Click on link(s) to scrape the linked page(s) in the Tips panel
- Select Title_URL as the data field
- Click Confirm
4. Extract data on the detailed webpage - to collect the Professional’s description
- Click Read More in the About us section
- Choose Click button
- Choose Click Item
- Input the Matching XPath as: //div[@data-compid="profile-about"]/button
- Select the text needed
- Click Extract the text of the element
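If the Read More click does not fire, it can help to check what the Matching XPath selects. Below is a minimal sketch that runs the same expression against a small, hypothetical piece of markup using Python's standard library. The sample markup and the function name find_read_more_buttons are assumptions for illustration and are not taken from Houzz or Octoparse. The standard library supports only a subset of XPath, so the leading "//" is written as ".//", and the sample has to be well-formed XML. A real page would likely need a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a professional's detail page.
# The real page structure may differ.
SAMPLE = """
<html>
  <body>
    <div data-compid="profile-about">
      <p>We design modern homes.</p>
      <button>Read More</button>
    </div>
    <div data-compid="profile-reviews">
      <button>Show all reviews</button>
    </div>
  </body>
</html>
"""


def find_read_more_buttons(markup):
    """Return the <button> elements that are direct children of the About div."""
    root = ET.fromstring(markup)
    # Equivalent of //div[@data-compid="profile-about"]/button in the
    # XPath subset that ElementTree understands.
    return root.findall(".//div[@data-compid='profile-about']/button")


buttons = find_read_more_buttons(SAMPLE)
print([b.text for b in buttons])
```

Because the expression is tied to the data-compid attribute, only the button inside the About section matches, and the button in the other section is ignored.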
5. Run the task - to get your desired data
- Click Save on the upper right to save your task
- Click Run next to it and wait for a Run Task window to pop up
- Select Run on your device to run the task on your local device
- Wait for the task to complete
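After the run, you may want to confirm that the description was captured for every professional, since a failed Read More click can leave that field empty. The sketch below assumes the data has been exported as CSV. The column names (Title, Rating, Location, Description) and the sample rows are hypothetical, and the real names depend on how the fields were renamed in Data preview.

```python
import csv
import io

# Hypothetical export content used only for illustration.
# With a real export, read the file's text instead of this string.
SAMPLE_CSV = """Title,Rating,Location,Description
Studio A,4.9,"Austin, TX",Modern residential design.
Studio B,,"Denver, CO",
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def titles_missing(rows, field):
    """Return the Title of every row whose given field is empty or blank."""
    return [row["Title"] for row in rows if not row[field].strip()]


rows = load_rows(SAMPLE_CSV)
print(titles_missing(rows, "Description"))
```

Any title reported here is a listing worth re-checking in the workflow, for example by testing the Read More step on that detail page.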
Here is a sample output from a local run:
If you have further issues with the task or have a suggestion that would make this a better resource for you, we’d love to hear about it. Submit a request here.
Author: Cassie
Editor: Yina