This tutorial applies to the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust.
Shopify is a popular e-commerce platform for online stores. This tutorial shows you how to use Octoparse to scrape product information from websites built with Shopify.
To follow along, use this example URL:
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Create a Go to Web Page step - to open the target website
Enter the target URL on the home page of Octoparse and click Start
2. Auto-detect the webpage - to create a workflow
Click Auto-detect web page data and wait for it to complete
Untick Add a page scroll
Click Create workflow
Note: If pop-ups appear when the page opens, turn on Browse mode in the upper right corner of the screen to close them. Remember to turn it off again afterwards.
Go to Data Preview to check whether the current data output meets your needs
Delete unnecessary data fields directly by clicking the More button and choosing Delete field
Modify the data field names by double-clicking the headers
3. Set up scroll - to better load the data on the webpage
Click Go to Web Page > Options
Tick Scroll down the page after it is loaded
Set Scroll Mode to "for one screen"
Set Wait to "2s" before the next scroll
Set Scroll times to "100"
Click Apply to save the settings
Repeat the steps above for the Click to Paginate step
4. Change the XPath for the Loop Item and data fields - to make sure Octoparse scrapes the correct data
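To get a feel for what these scroll settings cost in run time, here is a rough back-of-the-envelope sketch in Python. It assumes every configured scroll actually runs and ignores page load and extraction time; the function name is hypothetical and is not part of Octoparse.

```python
def max_scroll_wait_seconds(scroll_times: int, wait_seconds: float) -> float:
    """Upper bound on time spent waiting between scrolls on one page.

    Assumption: all `scroll_times` scrolls run, each followed by the
    configured wait. Page load and extraction time are not included.
    """
    return scroll_times * wait_seconds


# Values from the steps above: 100 scrolls with a 2 s wait before each next scroll.
total = max_scroll_wait_seconds(100, 2.0)
print(f"Up to {total:.0f} s (about {total / 60:.1f} min) of scroll waiting per page")
```

Because the same settings are applied to the Click to Paginate step, this upper bound applies to every page the task visits, so lowering Scroll times may be worth trying on stores with short product lists.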
Click Loop Item
Make sure Loop Mode is Variable List
Enter the XPath //a[@id="js-paginate-next"]/../../preceding-sibling::div[1]/div[last()]/div
Click Apply
Click the More button on the price data field
Choose Customize Xpath
Enter the XPath //div[@class="css-1dbjc4n r-1awozwy r-6koalj r-18u37iz"]/div[1]
Click Apply
Modify the XPath for the image field to //div[@class="css-1dbjc4n r-k6uee3 r-1aamwfn r-1mlwlqe r-1udh08x r-417010"]//img[@class="css-9pa8cd"] in the same way
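To see what the XPath expressions in this step select, the following Python sketch walks them through on a small, made-up page fragment. The HTML structure, the `loop_items` helper, and the sample prices are assumptions for illustration only; the real Shopify page will differ. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and has no `preceding-sibling` axis, so the Loop Item expression is reproduced step by step in plain Python, while the price expression is evaluated directly, here relative to each loop item.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page fragment (not the real Shopify markup).
SAMPLE = """
<div id="page">
  <div class="collection">
    <div class="toolbar">Sort by</div>
    <div class="grid">
      <div class="card">
        <div class="css-1dbjc4n r-1awozwy r-6koalj r-18u37iz">
          <div>$19.99</div>
          <div>$24.99</div>
        </div>
      </div>
      <div class="card">
        <div class="css-1dbjc4n r-1awozwy r-6koalj r-18u37iz">
          <div>$5.00</div>
        </div>
      </div>
    </div>
  </div>
  <div class="pagination">
    <span><a id="js-paginate-next">Next</a></span>
  </div>
</div>
""".strip()

# Price XPath from the tutorial, made relative with a leading "." for ElementTree.
# @class="..." is an exact string comparison: every class name, in this order.
PRICE_XPATH = './/div[@class="css-1dbjc4n r-1awozwy r-6koalj r-18u37iz"]/div[1]'


def loop_items(root):
    """Emulate //a[@id="js-paginate-next"]/../../preceding-sibling::div[1]/div[last()]/div."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    link = root.find('.//a[@id="js-paginate-next"]')   # //a[@id="js-paginate-next"]
    grandparent = parent[parent[link]]                 # /../..
    siblings = list(parent[grandparent])
    before = siblings[: siblings.index(grandparent)]
    nearest_div = [s for s in before if s.tag == "div"][-1]  # preceding-sibling::div[1]
    last_child_div = nearest_div.findall("div")[-1]          # /div[last()]
    return last_child_div.findall("div")                     # /div


root = ET.fromstring(SAMPLE)
items = loop_items(root)
prices = [item.find(PRICE_XPATH).text for item in items]
print(len(items), prices)
```

The Loop Item expression anchors on the "next page" link and navigates from there to the product list, which avoids depending on the generated class names. The price and image expressions do depend on those class strings matching exactly, so they may need updating if the site changes its markup.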
5. Run the task - to get your target data
Click Save and click Run on the upper right side
Select Run on your device to run the task on your computer, or select Run in the Cloud to run the task in the Cloud (for premium users only)
Here's the sample data output for your reference: