In this tutorial, we will show you how to collect product information on canadiantire.com with Octoparse.
For Canadian Tire, you can also use our ready-made "Task Template" on the main screen of the Octoparse scraping tool. Just enter a few parameters and the task is ready to run. For further details, see: Task Templates
We will scrape the title, price, stock status, and other details from this website. To follow along, you may want to use this URL:
Here are the main steps in this tutorial: [Download demo task file here]
- Open the target web page
- Auto-detect the web page to create a workflow
- Click into each product link to get more detailed information
- Extract data from the product detail page
- Set up wait time to slow down the scraping speed
- Click the "Load More" button
- Run the task and get the data
1) Open the target web page
- Enter the URL on the home page and click Start
If you see any pop-ups on the web page, switch to Browse mode and close them manually. Remember to turn off Browse mode afterward.
2) Auto-detect the web page to create a workflow
- Click "Auto-detect web page data" and wait for the detection to complete
(It might take a long time since the website applies infinite loading)
- Click on "Edit" under "Click on a Load More button" to revise the "Number of clicks" according to how many products you need to scrape (the default setting is to click "1" time)
- Delete unwanted fields or modify field names on the Data Preview
- Click "Create workflow"
- In "Go to Web Page1", set the "After loading page" options as follows
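To pick a value for "Number of clicks", it can help to work out the arithmetic first. The sketch below is only an illustration: `load_more_clicks` and `products_per_load` are hypothetical names, and it assumes the page shows one batch before any click and that every click adds a batch of the same size. Check the real batch size on the page before relying on the result.

```python
import math


def load_more_clicks(products_needed: int, products_per_load: int) -> int:
    """Estimate how many "Load More" clicks reveal at least products_needed items.

    products_per_load is an assumption: the number of products shown before
    the first click, and added again by each later click.
    """
    if products_needed <= 0 or products_per_load <= 0:
        raise ValueError("both arguments must be positive")
    # The first batch is visible without clicking, so it needs no click.
    remaining = max(products_needed - products_per_load, 0)
    # Round up, because a partial batch still costs one full click.
    return math.ceil(remaining / products_per_load)


if __name__ == "__main__":
    # Hypothetical numbers: 100 products wanted, 24 shown per batch.
    print(load_more_clicks(100, 24))
```

Enter the result as the "Number of clicks". If the last batches turn out to be smaller than expected, add a click or two as a margin.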
3) Click into each product link to get more detailed information
- Choose to “Click on link(s) to scrape the linked page(s)”
- Select "Click on an extracted data field" and select "Title_URL" from the drop-down menu
- Click "Confirm"
4) Extract data from the product detail page
- Select information on the web page
- Choose "Extract text of the selected element"
- Repeat the above steps to extract all the data you need
- Rename the fields if needed
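"Extract text of the selected element" returns plain text, so values such as prices arrive as strings. If you plan to analyze the data afterward, a small clean-up step like the sketch below may be useful. It is not part of Octoparse: the field names `Title`, `Price`, and `Stock` are assumptions that depend on how you renamed your fields, and the values in the usage example are placeholders, not real data from the site.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    title: str
    price: Optional[float]  # None when no number could be found
    stock: str


def parse_price(text: str) -> Optional[float]:
    """Return the first number in a price string such as "$1,299.99"."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to a float.
    return float(match.group().replace(",", ""))


def clean_row(row: dict) -> Product:
    """Turn one exported row (field name -> extracted text) into a Product."""
    return Product(
        title=row.get("Title", "").strip(),
        price=parse_price(row.get("Price", "")),
        stock=row.get("Stock", "").strip(),
    )


if __name__ == "__main__":
    # Placeholder row for illustration only.
    print(clean_row({"Title": "  Tent ", "Price": "$49.99", "Stock": " In stock"}))
```

Returning `None` for an unparseable price keeps missing values visible instead of silently turning them into zero.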
5) Set up wait time to slow down the scraping speed
The website uses anti-scraping techniques and will deny access if you scrape too fast, so we need to slow the task down by setting a wait time.
- Double-click "Extract Data1" to open its settings
- Tick "Wait before action"
- Set the wait time to 7-10 seconds
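For readers curious about what a 7-10 second wait looks like outside the Octoparse interface, here is a minimal Python sketch of the same idea: pause for a time within that range before each extraction. It does not show how Octoparse implements "Wait before action". The function name `wait_before_action`, the random choice within the range, and the injectable `sleep` parameter are all assumptions made for illustration.

```python
import random
import time


def wait_before_action(min_seconds: float = 7.0,
                       max_seconds: float = 10.0,
                       sleep=time.sleep) -> float:
    """Pause for a random time between min_seconds and max_seconds.

    The sleep parameter is injectable so the function can be tried
    without actually waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("need 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


if __name__ == "__main__":
    # Pass a no-op sleep to preview the chosen delays without pausing.
    for _ in range(3):
        print(round(wait_before_action(sleep=lambda seconds: None), 2))
```

A longer wait makes the run slower but lowers the chance of being blocked. If the site still denies access, try increasing the wait.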
6) Click the "Load More" Button
- Scroll down to find the "LOAD MORE RESULTS" button
- Click it and select "Loop click single element"
- The final workflow would be:
7) Run extraction - run your task and get data
- Click "Run" on the upper left side
- Select "Run task on your device" to run the task on your computer, or select "Run task in the cloud" to run the task in the Cloud (for premium users only)
Here is the sample output.
Contact us any time if you need our help!