This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.
In this tutorial, we are going to show you how to scrape product details from Wayfair, an American e-commerce company that sells furniture and home goods.
For this example, we will use the URL below to scrape data such as the product title, description, and price from each product detail page.
Here are the main steps in this tutorial:
- "Go To Web Page" - open the targeted web page
- Create a pagination loop - scrape search results from all pages
- Create a "Loop Item" - scrape all the items on each page
- Extract data - select data for extraction
- Run the task - to get the desired data
Note: Download the demo task file at the bottom of this page
1. "Go To Web Page" - open the targeted web page
- Enter the target URL into the search box at the center of the home screen
- Click Start to create a new task in Advanced Mode
2. Create a pagination loop - scrape search results from all pages
- Scroll down to the bottom of the page and click the "Next" button
- Click "Loop click single URL" on the "Tips" panel
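For readers who want to see the idea behind a pagination loop in code, here is a minimal Python sketch. It is not how Octoparse is implemented; it only illustrates "keep following Next until there is no Next". The `PAGES` dictionary, its keys, and the `paginate` helper are hypothetical stand-ins for real web pages.

```python
from typing import Callable, Iterator, Optional

# Hypothetical in-memory "site": each page lists items and names the next page.
PAGES = {
    "page-1": {"items": ["sofa", "lamp"], "next": "page-2"},
    "page-2": {"items": ["rug"], "next": "page-3"},
    "page-3": {"items": ["desk"], "next": None},
}


def paginate(start: str, load: Callable[[str], dict], max_pages: int = 100) -> Iterator[dict]:
    """Yield pages one by one, following the "next" pointer until it runs out."""
    current: Optional[str] = start
    seen = set()  # guards against a "Next" button that loops back to an earlier page
    while current is not None and current not in seen and len(seen) < max_pages:
        seen.add(current)
        page = load(current)
        yield page
        current = page.get("next")


# Collect the items from every page, in page order.
all_items = [item for page in paginate("page-1", PAGES.__getitem__) for item in page["items"]]
print(all_items)
```

The `seen` set and the `max_pages` cap are safety choices for this sketch, so that a page linking back to itself cannot keep the loop running forever.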
3. Create a "Loop Item" - scrape all the items on each page
- Click on any product title on the page; the frame of that product will turn green
- Click "Select all" on the "Tips" panel and the frames of all products will turn green
- Click "Loop click each element"
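Conceptually, "Select all" followed by "Loop click each element" gathers every element that looks like the one you clicked and then visits each in turn. The Python sketch below shows that idea using only the standard library. The HTML snippet, the `product-title` class name, and the `ProductLinkCollector` class are assumptions made for illustration and do not reflect Wayfair's actual markup.

```python
from html.parser import HTMLParser


class ProductLinkCollector(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []  # hrefs of matching links, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if self.css_class in classes and href:
            self.links.append(href)


# Made-up listing page: two product links and one unrelated link.
LISTING_HTML = """
<div class="grid">
  <a class="product-title" href="/p/sofa">Sofa</a>
  <a class="product-title" href="/p/lamp">Lamp</a>
  <a class="nav" href="/help">Help</a>
</div>
"""

collector = ProductLinkCollector("product-title")
collector.feed(LISTING_HTML)
product_links = collector.links
print(product_links)
```

Each entry in `product_links` corresponds to one pass through the loop: open the link, extract the data, then go back to the list.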
4. Extract data - select data for extraction
After you click "Loop click each element", Octoparse will open the details page of the first product.
- Click each piece of data you need on the page; the selected elements will turn green
- Select "Extract text" inside "Action Tips"
- Rename a field by double-clicking it
- Delete unwanted data by clicking
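As a rough code analogy, "Extract text" reads the visible text of each selected element and stores it under a field name that you can rename. The sketch below is one way to express that in Python, under stated assumptions: the class names (`title`, `price`, `description`), the renamed field names, and the sample HTML are all invented for this example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class FieldExtractor(HTMLParser):
    """Capture the text of the first element matching each class in field_map."""

    def __init__(self, field_map):
        super().__init__()
        self.field_map = field_map  # CSS class -> output field name
        self.record = {}
        self._field = None          # field currently being captured, if any
        self._depth = 0             # nesting depth inside the captured element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._field is not None:
            self._depth += 1        # nested tag inside the element being captured
            return
        for css_class in (dict(attrs).get("class") or "").split():
            field = self.field_map.get(css_class)
            if field and field not in self.record:
                self._field, self._depth, self._parts = field, 1, []
                return

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # Join the text pieces and collapse runs of whitespace.
            self.record[self._field] = " ".join("".join(self._parts).split())
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._parts.append(data)


# Made-up product detail page.
DETAIL_HTML = """
<h1 class="title">Oak  Desk</h1>
<span class="price">$<b>129</b>.99</span>
<p class="description">Solid oak top,
   seats one.</p>
"""

extractor = FieldExtractor({"title": "Product_Title", "price": "Price", "description": "Description"})
extractor.feed(DETAIL_HTML)
record = extractor.record
print(record)
```

Leaving a class out of `field_map` plays the same role as deleting an unwanted field: that element is simply not captured.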
Your finished workflow should look like the one shown below:
5. Run the task - to get the desired data
- Click the Save button first to save all the settings you have made
- Then click Run to run your task either locally or in the cloud
- Select Run on your device and click Run Now to run the task on your local device
- Wait for the task to complete
Below is sample data from a local run. Excel, CSV, HTML, and JSON formats are available for export.
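If you later want to reshape the exported data yourself, CSV and JSON are the easiest of these formats to work with in code. The Python sketch below writes the same rows to both formats using only the standard library. The rows and field names are made-up placeholders, not real scraped results, and the `to_csv` and `to_json` helpers are hypothetical.

```python
import csv
import io
import json

# Placeholder rows standing in for scraped records.
rows = [
    {"title": "Oak Desk", "price": "$129.99"},
    {"title": "Floor Lamp", "price": "$45.00"},
]


def to_csv(records):
    """Serialize a non-empty list of dicts to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize a list of dicts to indented JSON text."""
    return json.dumps(records, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
print(csv_text)
print(json_text)
```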
Article in Spanish: Scraping productos detalles de Wayfair
You can also read web scraping articles on the official website