You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!
Canadian Tire is a Canadian retail company that operates in the automotive, hardware, sports, leisure and housewares sectors.
In this tutorial, we will show you how to collect product information on canadiantire.com with Octoparse.
To follow along, you may want to use this URL:
Note: The task file (.otd) is attached at the bottom of this tutorial.
Here are the main steps in this tutorial:
- Create go to web page - to open the target page
- Auto-detect the web page - to create workflow
- Set scroll page for Go to Web Page - to scroll down the page
- Add pagination - to load more info
- Set XPath for loop item - to exclude data loaded earlier
- Run the task - to get the desired data
1. Create go to web page - to open the target page
- Enter the URL on the home page and click Start
2. Auto-detect the web page - to create workflow
- Click on Auto-detect web page data and wait for the detection to complete
(This may take a while because the website uses infinite loading.)
- Delete unwanted fields or modify field names in the Data Preview
- Untick Add a page scroll and click Create workflow
The workflow should look like the one below:
3. Set scroll page for Go to Web Page - to scroll down the page
- Click Go to Web Page
- Click Options
- Set Scroll for one screen
- Wait 2s
- Scroll 50 times
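For readers who think in code, the scroll settings above amount to a simple loop: scroll one screen, wait, repeat. The sketch below is only a conceptual illustration, not how Octoparse implements scrolling. `scroll_page` and its `scroll_once` callback are hypothetical names, and the callback stands in for whatever moves the browser down by one screen.

```python
import time
from typing import Callable


def scroll_page(
    scroll_once: Callable[[], None],
    times: int = 50,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Scroll one screen at a time, pausing after each scroll.

    Returns the total time spent waiting, in seconds.
    """
    total_wait = 0.0
    for _ in range(times):
        scroll_once()        # hypothetical: scroll the page by one screen
        sleep(wait_seconds)  # give lazily loaded items time to appear
        total_wait += wait_seconds
    return total_wait


if __name__ == "__main__":
    scrolls = []
    # A no-op sleep lets the demo finish instantly.
    waited = scroll_page(lambda: scrolls.append(1), sleep=lambda s: None)
    print(len(scrolls), waited)  # prints: 50 100.0
```

With the settings from this step (50 scrolls, 2 s wait), the waits alone add up to 100 seconds, so the first page load is expected to take a while.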
4. Add pagination - to load more info
- Click the Show more results button
- Click Loop click single URL in the Tips panel
- Click Click to paginate
- Choose Options
- Tick Scroll down the page after it is loaded
- Scroll for one screen
- Scroll 20 times
5. Set XPath for loop item - to exclude data loaded earlier
On this page, 16 new items are loaded each time the Show more results button is clicked. To avoid scraping duplicate data, set up a loop XPath that locates only the last 24 items.
- Click Loop Item
- Enter the XPath below in Matching XPath:
- Click Apply
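To illustrate the idea behind this step, here is a minimal sketch of the "last N items" technique. It assumes the standard XPath predicate `position() > last() - N`; the base expression `//li[@class='product']` is a hypothetical placeholder, not the actual XPath for canadiantire.com. The helper names and the toy numbers in the simulation are also assumptions made for illustration.

```python
from typing import List


def last_n_xpath(base_xpath: str, n: int) -> str:
    """Wrap a base XPath so it matches only the last n nodes it selects."""
    return f"({base_xpath})[position() > last() - {n}]"


def scrape_with_window(initial: int, batch: int, clicks: int, window: int) -> List[int]:
    """Simulate a loop that reads the last `window` items after each load.

    Items are numbered 1..total in page order. Returns every item read,
    in the order read, so duplicates and gaps are visible.
    """
    scraped: List[int] = []
    total = initial
    for click in range(clicks + 1):
        if click > 0:
            total += batch  # each "show more" click appends one batch
        page = list(range(1, total + 1))
        scraped.extend(page[-window:])  # same effect as the last-N predicate
    return scraped


if __name__ == "__main__":
    # Hypothetical base XPath; replace with the one that matches the product items.
    print(last_n_xpath("//li[@class='product']", 24))
    # Toy numbers, not measured from the site: 3 items per load, 2 clicks.
    print(scrape_with_window(initial=3, batch=3, clicks=2, window=3))
    print(scrape_with_window(initial=3, batch=3, clicks=2, window=4))
```

In this simulation, a window equal to the batch size reads each item exactly once, a larger window reads some items twice, and a smaller one would skip items. It is therefore worth confirming on the live page how many items each click actually adds.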
6. Run the task - to get the desired data
- Click the Save button first to save all the settings you have made
- Then click Run to run your task either locally or in the cloud
- Select Run on your device and click Run Now to run the task on your local device
- Wait for the task to complete
Below is sample data from a local run. Excel, CSV, HTML, and JSON formats are available for export.
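Since the loop XPath exists to prevent duplicates, a quick check on the exported CSV can confirm it worked. The sketch below uses only Python's standard library. The column name `Product_URL` and the sample rows are made up for illustration, so substitute whichever field uniquely identifies a product in your export.

```python
import csv
import io
from collections import Counter
from typing import Dict


def find_duplicate_rows(csv_text: str, key_field: str) -> Dict[str, int]:
    """Return key values that appear more than once in the CSV, with counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[key_field] for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up sample data; a real export would be read from a file instead.
    sample = "Product_URL,Title\n/a,Tire A\n/b,Tire B\n/a,Tire A\n"
    print(find_duplicate_rows(sample, "Product_URL"))  # prints: {'/a': 2}
```

For a real export, pass the file contents instead, for example `open(path, newline="", encoding="utf-8").read()`. An empty result means no key value was repeated.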
If you have further issues with the task or any suggestions, we’d love to hear about them. Submit a request here.