Scraping online shops such as eBay and Amazon has become an important source of data. It lets you compare best-selling products across different shops by price, features, and product description.
You can also go to "Task Template" on the main screen of the Octoparse scraping tool and start with the ready-to-use eBay Templates directly to save time. With this feature, there is no need to configure scraping tasks. For further details, see: Task Templates
If you would like to know how to build the task from scratch, continue with the tutorial below. It shows how to retrieve product data from eBay with Octoparse 7.X, a user-friendly web-scraping tool, to support your data mining on websites.
We will scrape data such as the name, condition, price, and more info from the product details page with Octoparse.
To follow through, you may want to use this URL in the tutorial:
- "Go To Web Page" - open the target web page
- Create a pagination loop - scrape all the results from multiple pages
- Create a "Loop Item" - scrape all the items on each page
- Extract data - select the data for extraction
- Customize data field - clean the data by deleting extra strings (Optional)
- Save and start extraction - run the task and get data
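If it helps to picture the workflow outlined above, it is roughly two nested loops: an outer loop over result pages and an inner loop over the items on each page, with the extraction step inside. The sketch below is only an illustration in Python; it does not use Octoparse or eBay, and `FAKE_PAGES` and `run_workflow` are hypothetical names with made-up sample data.

```python
from typing import Dict, List

# Hypothetical stand-in for the website: each inner list is one page of results.
FAKE_PAGES: List[List[Dict[str, str]]] = [
    [
        {"name": "Item A", "condition": "New", "price": "$10.00"},
        {"name": "Item B", "condition": "Used", "price": "$7.50"},
    ],
    [
        {"name": "Item C", "condition": "New", "price": "$12.00"},
    ],
]


def run_workflow(pages: List[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Walk every page, then every item on the page, and collect the fields."""
    rows: List[Dict[str, str]] = []
    for page in pages:          # pagination loop: one pass per result page
        for item in page:       # loop item: one pass per listed product
            rows.append(        # extract data: keep only the chosen fields
                {
                    "name": item["name"],
                    "condition": item["condition"],
                    "price": item["price"],
                }
            )
    return rows


if __name__ == "__main__":
    for row in run_workflow(FAKE_PAGES):
        print(row)
```

In Octoparse the same structure is built by clicking rather than coding, which is what the steps below walk through.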
1. "Go To Web Page" - open the target web page
- Click "+ Task" to start a new task with Advanced Mode
Advanced Mode is a highly flexible and powerful web scraping mode. For people who want to scrape from websites with complex structures, like samsclub.com, we strongly recommend Advanced Mode to start your data extraction project.
- Paste the URL into the "Website" box and click "Save URL" to move on
- Turn on "Workflow" mode to check and edit your workflow conveniently
2. Create a pagination loop - scrape all data from multiple pages
- Scroll down and click the ">" button on the web page
- Click "Loop click the selected element" on the "Action Tips" panel
3. Create a "Loop Item" - scrape all the items on each page
- Click "Go To Web Page" to go back to the first page, and then click the "Pagination" box
- Click the title of the first listed item
- Click "" on the "Action Tips" panel, and then click "Select all"
- Click "Loop click each URL"
4. Extract data - select the data for extraction
- Click the information you need on the page
- Select "Extract text of the selected element" on the "Action Tips" panel
- Edit the "Field name"
- Click "OK" to save
1. If the item you selected doesn't contain all the information you need, you can select another item in the "Loop Item" to fill in the data field. In this case, products on eBay present their prices differently: some show a "Current Bid", while others show a "Price", so we select the third option in the "Loop Item" to fill in the extracted data field.
2. Prices in online shops may change from time to time, so you may want to record the time of the data extraction. Click "Add predefined fields" at the bottom of the data field, and you will see the "Add Current Time" option.
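The two notes above can be sketched in a few lines of Python for anyone post-processing the data outside Octoparse. This is a minimal sketch under assumptions, not how Octoparse works internally: `pick_price` and `add_current_time` are hypothetical helpers, and the dictionary keys simply mirror the "Price" and "Current Bid" labels mentioned above.

```python
from datetime import datetime
from typing import Dict, Optional


def pick_price(listing: Dict[str, str]) -> Optional[str]:
    """Return the first non-empty value among the known price labels."""
    for label in ("Price", "Current Bid"):
        value = listing.get(label, "").strip()
        if value:
            return value
    return None  # neither label was present on this listing


def add_current_time(
    row: Dict[str, str], now: Optional[datetime] = None
) -> Dict[str, str]:
    """Return a copy of the row with the extraction time attached."""
    stamped = dict(row)
    moment = now if now is not None else datetime.now()
    stamped["current_time"] = moment.isoformat(timespec="seconds")
    return stamped


if __name__ == "__main__":
    auction = {"Current Bid": "$5.00"}
    row = {"price": pick_price(auction) or ""}
    print(add_current_time(row))
```

The `now` parameter is optional and is there only so the timestamp can be fixed when testing.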
5. Customize data field - clean the data by deleting extra strings (Optional)
Now you may notice that the title of each product begins with "Details about", so you may want to delete it to make the data tidy.
- Select the data
- Click "Customize data field"
- Choose "Refine extracted data"
- Click "Add step", and choose "Replace"
- Copy and paste "Details about " into the "Replace" field, leave the "With" field empty, and then click "Evaluate" (be sure to include the trailing space when you copy)
- Click "OK"
- Click "Save"
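If you prefer to tidy the titles after export instead, the same clean-up can be sketched in Python. This is an illustrative equivalent, not Octoparse's own implementation; `strip_prefix` is a hypothetical helper, and the sample titles are made up.

```python
PREFIX = "Details about "  # note the trailing space, as in the step above


def strip_prefix(title: str, prefix: str = PREFIX) -> str:
    """Remove the prefix only when the title actually starts with it."""
    if title.startswith(prefix):
        return title[len(prefix):]
    return title


if __name__ == "__main__":
    titles = ["Details about Sample Phone 64GB", "Sample Laptop 13 inch"]
    print([strip_prefix(t) for t in titles])
```

One design choice to be aware of: this sketch removes the phrase only at the start of the title, whereas a plain find-and-replace would typically remove it wherever it occurs.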
6. Save and start extraction - run the task and get data
- Click "Save"
- Click "Start Extraction" on the upper left side
- Select "Local Extraction" to run the task on your computer, or select "Cloud Extraction" to run the task in the Cloud (for premium users only)
For a premium user, Cloud Extraction is highly recommended.