Scrape different variants of a product
If you scrape e-commerce data, especially product data, you may run into the following need.
Some products come with several options, and you want to collect the price, SKU, and other details for each variant. Take this hair dye product as an example: you may need to scrape its price for each color.
Solution:
To show how to do this with Octoparse, we will use this product page as an example: https://www.walmart.com/ip/SoftSheen-Carson-Dark-and-Lovely-Fade-Resist-Rich-Conditioning-Color/10314047
For this product, the color, price, images, page URL, and product ID all change when you switch options.
Here are the steps to follow in this kind of situation.
1. Enter product URL(s) to start a new task
You can enter a list of URLs if you have several products to monitor. For this demonstration, we enter just one product URL.
2. Create a loop item to loop through each color option
- Click the 1st color option on the list, and then choose "Select all" on the "Tips" panel
- Then, choose "Loop click each element"
- Octoparse detects AJAX on this web page. You can adjust the wait time to suit your Internet speed so the page content has time to load (learn more in Handling AJAX)
- Double-click "Click Item" inside the "Loop Item" to uncheck "Open in a new tab".
- (Optional) Double-click "Loop Item" to change the "Loop Mode" from "Fixed List" to "Variable List". Then, enter the Element XPath: //DIV[@class="variants__list"]/LABEL/DIV[2]. This matters when you scrape several products with different numbers of color options, so that the loop picks up however many options each page has.
Learn more about XPath here: What is XPath and how to use it in Octoparse.
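To see what an XPath of this shape selects, here is a minimal Python sketch. It is not part of Octoparse. The markup is hypothetical and heavily simplified (the real page may be structured differently), and the color names are placeholders. Python's ElementTree is case-sensitive and supports only a subset of XPath, so the expression is written in lowercase to match the sample markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: each <label> is one color option,
# and its second <div> holds the option name. Names are placeholders.
SAMPLE = """
<html><body>
  <div class="variants__list">
    <label><div>swatch</div><div>Color A</div></label>
    <label><div>swatch</div><div>Color B</div></label>
    <label><div>swatch</div><div>Color C</div></label>
  </div>
</body></html>
"""


def variant_names(markup: str) -> list:
    """Return the text of the second <div> inside every option <label>."""
    root = ET.fromstring(markup.strip())
    # Lowercase equivalent of //DIV[@class="variants__list"]/LABEL/DIV[2]
    nodes = root.findall(".//div[@class='variants__list']/label/div[2]")
    return [node.text for node in nodes]


print(variant_names(SAMPLE))
```

Because the expression matches every label under the list, the number of results follows the number of options in the markup, which is the behavior a "Variable List" loop relies on.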
3. Extract data you need on the page
You can click elements on the page to extract the data you need and rename the data fields if needed.
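Conceptually, the workflow built in steps 2 and 3 is a loop: click an option, let the page update, then read the fields. The sketch below only illustrates that order of operations. `FakePage`, `Variant`, and `scrape_variants` are hypothetical names made up for this example; they are not Octoparse APIs, and the field values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    color: str
    price: str
    product_id: str


class FakePage:
    """Hypothetical stand-in for a product page whose fields change per option."""

    def __init__(self, variants):
        self._variants = list(variants)
        self._selected = 0

    def option_count(self):
        return len(self._variants)

    def click_option(self, index):
        # On a real page, this click would trigger an AJAX update.
        self._selected = index

    def read_fields(self):
        v = self._variants[self._selected]
        return {"color": v.color, "price": v.price, "product_id": v.product_id}


def scrape_variants(page):
    """Collect one row per option, in the order the options appear."""
    rows = []
    for index in range(page.option_count()):  # corresponds to "Loop Item"
        page.click_option(index)              # corresponds to "Click Item"
        rows.append(page.read_fields())       # corresponds to the extraction step
    return rows


page = FakePage([
    Variant("Color A", "price-A", "id-A"),
    Variant("Color B", "price-B", "id-B"),
])
for row in scrape_variants(page):
    print(row)
```

The result has one row per variant, which is the shape to expect in the exported data: the same columns repeated for each color option.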
Here is the data output sample.
If you still have trouble with this topic, submit a ticket to our support team! We're here to help.
Tutorial in Spanish: Scrapear diferentes variantes de un producto
You can also read more web scraping tutorials on the official website
Author: Vanny
Editor: Yina