Amazon is one of the most popular e-commerce websites in the world, and many users scrape it to collect product information. This tutorial shows you how to scrape product details from Amazon with Octoparse.

You can also go to "Task Templates" on the main screen of the Octoparse scraping tool and start directly with the ready-to-use Amazon templates to save time. Octoparse provides several Amazon templates designed for different countries, such as Germany, France, the US, Spain, and India. With a template, there is no need to configure a scraping task yourself. For further details, see: Task Templates


If you would like to build the task from scratch, continue reading the tutorial below or watch the video.

To follow along, you can use this URL:

https://www.amazon.com/s?rh=i%3Aelectronics%2Cn%3A172541%2Cp_n_feature_four_browse-bin%3A12097501011&ie=UTF8&lo=electronics
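The example URL is percent-encoded, which makes its filters hard to read. As a rough illustration (not part of the Octoparse workflow), the short Python sketch below decodes the query string with the standard library. The helper name decode_search_url is hypothetical, and reading the "rh" value as a comma-separated list of key:value pairs is an assumption based only on how the decoded string looks; what each filter means is defined by Amazon and is not covered here.

```python
from urllib.parse import urlsplit, parse_qs

URL = ("https://www.amazon.com/s?rh=i%3Aelectronics%2Cn%3A172541%2C"
       "p_n_feature_four_browse-bin%3A12097501011&ie=UTF8&lo=electronics")


def decode_search_url(url):
    """Return the query parameters of a URL, percent-decoded (hypothetical helper)."""
    query = urlsplit(url).query
    # parse_qs returns a list of values per key; each key occurs once in this URL.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = decode_search_url(URL)

# Assumption: the decoded "rh" value is a comma-separated list of key:value pairs.
filters = dict(part.split(":", 1) for part in params["rh"].split(","))

print(params["rh"])
print(filters)
```

Pasting the URL into Octoparse unchanged is all the tutorial requires; decoding it is only useful if you want to see or adjust which category the search page targets.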

Here are the main steps in this tutorial: [Download task file here]

  1. Go to Web Page - open the targeted web page

  2. Auto-detect the web page - create the workflow

  3. Click into each product link to scrape more information

  4. Extract Data - extract data on the detail pages

  5. Set up AJAX timeout for "Click to Paginate"

  6. Start extraction - run the task and get data


1. Go to Web Page - open the targeted web page

  • Enter the URL on the home page and click Start


2. Auto-detect the web page - create the workflow

  • Click Auto-detect web page data and wait for the detection to complete

  • Delete unwanted fields or rename fields if needed in the Data preview

  • Uncheck the Add a page scroll

  • Click Create workflow


A Pagination action and a Loop Item are generated automatically in the workflow.
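One way to picture the generated workflow is as two nested loops: an outer loop that moves from page to page, and an inner loop that visits each item on the current page. The Python sketch below is only a conceptual model under that assumption. It does not use or describe Octoparse's internals, and run_workflow and the sample data are hypothetical.

```python
def run_workflow(pages):
    """Conceptual model: a pagination loop wrapped around a loop over items."""
    rows = []
    for page_number, items in enumerate(pages, start=1):  # Pagination
        for item in items:                                 # Loop Item
            # Extract Data: one output row per item on the page.
            rows.append({"page": page_number, "title": item})
    return rows


# Hypothetical listing pages, each holding a few product titles.
sample_pages = [["Product A", "Product B"], ["Product C"]]
rows = run_workflow(sample_pages)

for row in rows:
    print(row)
```

Every action placed inside the Loop Item runs once per product, which is why the data preview shows one row per product.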


If all the data you need can be scraped from the listing page, you can stop here and jump to Set up AJAX timeout for "Click to Paginate". If you want to open each product detail page to get more information, follow the steps below.


3. Click into each product link to scrape more information

  • Choose Click on link(s) to scrape the linked page(s) on the Tips panel

  • Select Click on an extracted data field and choose the field you want to click on from the drop-down menu (you can confirm that it is the correct link in the Data preview)

  • Click Confirm


Octoparse will automatically go to the first product page.


4. Extract Data - extract data on the detail pages

  • Select information on the web page

  • Choose Extract text of the selected element

  • Repeat the above steps to extract all the data you need
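For readers curious what "Extract text of the selected element" amounts to, the sketch below shows the general idea in Python using only the standard library: find one element and return the text inside it, including text in nested tags. It is a simplified illustration, not Octoparse's implementation. The sample HTML, the element ids "title" and "price", and the extract_text helper are all hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementText(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1                      # nested tag inside the target
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1                       # entered the target element

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, element_id):
    """Return the whitespace-normalized text of one element (hypothetical helper)."""
    parser = ElementText(element_id)
    parser.feed(html)
    return " ".join("".join(parser.chunks).split())


# Hypothetical detail-page fragment, not real Amazon markup.
SAMPLE_HTML = """
<div id="detail">
  <h1 id="title"> Example <b>Wireless</b> Headphones </h1>
  <span id="price">$19.99</span>
</div>
"""

title = extract_text(SAMPLE_HTML, "title")
price = extract_text(SAMPLE_HTML, "price")
print(title, "|", price)
```

In Octoparse you select the element by clicking it on the page, so no ids or code are needed; the sketch only makes the result of that action concrete.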


5. Set up AJAX timeout for "Click to Paginate"

  • Open the Action Settings of Click to Paginate

  • Tick Load with AJAX and select 10s as the AJAX timeout
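The idea behind a timeout can be sketched in a few lines of Python: wait for something to happen, but give up after a fixed limit so the workflow is never stuck. The example below is a generic illustration of that pattern with the same 10-second limit. It is not a description of how Octoparse handles AJAX loading internally, and wait_for and content_loaded are hypothetical names.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.1):
    """Poll condition() until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True          # content arrived in time
        if time.monotonic() >= deadline:
            return False         # give up and let the caller move on
        time.sleep(interval)


# Simulated page whose new content is "ready" on the third check.
polls = {"count": 0}


def content_loaded():
    polls["count"] += 1
    return polls["count"] >= 3


loaded = wait_for(content_loaded, timeout=10.0, interval=0.01)

# A condition that never becomes true runs into the (shortened) timeout.
timed_out = not wait_for(lambda: False, timeout=0.05, interval=0.01)

print(loaded, timed_out)
```

If the next page loads slowly on your connection and rows are missing, a longer timeout is the first setting to try; a shorter one makes the task faster when pages load quickly.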


6. Start extraction - run the task and get data

  • Click Save

  • Click Run on the upper left side

  • Select "Run on your device" to run the task on your computer, or select "Run task in the Cloud" to run the task in the Cloud (for premium users only)


Here is the sample output.

56156156.png