In this tutorial, we will show you how to collect product information on canadiantire.com with Octoparse.
For Canadian Tire, you can also use our easy-to-use "Task Template" on the main screen of the Octoparse scraping tool. Just enter a few parameters and the task is ready to run. For further details, see: Task Templates
We will scrape the title, price and stock of each product from this website. To follow along, you may want to use this URL:
This tutorial will also cover:
- Dealing with AJAX for pagination
Here are the main steps in this tutorial [Download demo task from here]:
- "Go To Web Page" - open the targeted web page
- Create a pagination loop - scrape all the results from multiple pages
- Create a "Loop Item" - click into each item in the list, one by one
- Extract data - select the data for extraction
- Start extraction - run the task and get data
- Click "+ Task" to start a new task with Advanced Mode
- Paste the URL into the "Extraction URL" box
- Click "Save URL" to move on
- Scroll down to the bottom of the page, click the "LOAD MORE RESULTS" button
- Click "Loop click next page" on the "Action Tips" panel
- Uncheck "Retry when page remains unchanged"
- Set the AJAX timeout to 5 seconds (optional; adjust it to your local network conditions)
- Click "Pagination" and then click "End loop when"
- Check "Execution times each" and set it to 5
- Click "OK" to save
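The pagination steps above amount to "click the load-more button repeatedly, but stop after a fixed number of clicks". The following is a minimal Python sketch of that logic only; it is not how Octoparse is implemented. `FakeListingPage`, `has_load_more`, `click_load_more` and `paginate` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeListingPage:
    """Hypothetical stand-in for a listing page with a "LOAD MORE RESULTS" button."""
    batches: list  # each batch is the list of products that one click reveals
    products: list = field(default_factory=list)

    def has_load_more(self) -> bool:
        # The button is assumed to disappear once nothing is left to load.
        return bool(self.batches)

    def click_load_more(self) -> None:
        # Each click appends the next batch, as an AJAX "load more" would.
        self.products.extend(self.batches.pop(0))


def paginate(page: FakeListingPage, max_clicks: int = 5) -> int:
    """Click "load more" until the button is gone or max_clicks is reached."""
    clicks = 0
    while clicks < max_clicks and page.has_load_more():
        page.click_load_more()
        clicks += 1
    return clicks


# Eight batches are available, but the loop ends after five clicks.
page = FakeListingPage(batches=[[f"product-{i}-{j}" for j in range(3)] for i in range(8)])
clicks = paginate(page, max_clicks=5)
print(clicks, len(page.products))
```

Capping the number of clicks keeps a test run short; raising `max_clicks` would collect more results at the cost of a longer run.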
The AJAX timeout can often double as a page-load timeout for a Click action. For example, when a page takes forever to finish loading, long after the data you need has appeared, you can use the AJAX timeout to tell Octoparse to move on to the next action once the set time is reached. If you want to learn more about AJAX, here are the related links.
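The general idea of a timeout, waiting for content but moving on once a time limit is reached, can be sketched in Python as below. This illustrates the concept only and is not Octoparse's internal behavior; `wait_for_ajax`, `is_loaded` and the polling interval are assumptions made for the example.

```python
import time


def wait_for_ajax(is_loaded, timeout_s: float = 5.0, poll_s: float = 0.05) -> bool:
    """Poll is_loaded() until it returns True or timeout_s elapses.

    Returns True if the content showed up in time and False if the
    timeout was hit. Either way, the caller moves on to the next action.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_loaded():
            return True
        time.sleep(poll_s)
    return False


# A page that never reports "loaded": the wait gives up after the timeout.
start = time.monotonic()
loaded = wait_for_ajax(lambda: False, timeout_s=0.2)
elapsed = time.monotonic() - start
print(loaded, elapsed >= 0.2)
```

A longer timeout tolerates a slower network, while a shorter one keeps the run moving when a page never finishes loading.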
- Click on the first three products on the page
- Click "Loop click each element" on the "Action Tips" panel
- Uncheck "Retry when page remains unchanged (use discreetly for AJAX loading)"
- Set the AJAX timeout to 5 seconds (optional)
- Click "Save"
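Conceptually, a "Loop Item" opens each listed product in turn and produces one record per product. Here is a minimal Python sketch of that pattern, assuming a list of item URLs is already available. `loop_items`, `fake_open_detail` and the sample paths are hypothetical and only stand in for the real page visits.

```python
from typing import Callable, Iterable


def loop_items(item_urls: Iterable[str], open_detail: Callable[[str], dict]) -> list:
    """Open each item's detail page in turn and collect one record per item."""
    records = []
    for url in item_urls:
        records.append(open_detail(url))
    return records


def fake_open_detail(url: str) -> dict:
    # Hypothetical opener: a real one would load the page and read its fields.
    return {"url": url}


urls = ["/p/1", "/p/2", "/p/3"]  # made-up paths for illustration
records = loop_items(urls, fake_open_detail)
print(len(records))
```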
- Click on the data you need on the page
- Select "Extract text of the selected element" from the "Action Tips" panel
- Rename the fields by selecting from the pre-defined list or typing your own names
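The extraction step boils down to reading the text of selected elements and storing it under field names of your choice. The sketch below shows the same idea with Python's standard `html.parser`. The sample markup, its class names and its values are invented for illustration and do not reflect the real structure of canadiantire.com.

```python
from html.parser import HTMLParser

# Hypothetical markup: the class names and values below are made up.
SAMPLE_HTML = """
<div class="product">
  <h1 class="title">Example Product</h1>
  <span class="price">$19.99</span>
  <span class="stock">In stock</span>
</div>
"""


class FieldExtractor(HTMLParser):
    """Collect the text of elements whose class attribute is a wanted field."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # field whose element we are currently inside
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.wanted:
            self.current = cls

    def handle_data(self, data):
        if self.current and data.strip():
            self.fields[self.current] = data.strip()

    def handle_endtag(self, tag):
        self.current = None


def extract_fields(html, rename):
    """Extract the wanted fields and rename them to the chosen field names."""
    parser = FieldExtractor(rename.keys())
    parser.feed(html)
    return {rename[name]: text for name, text in parser.fields.items()}


row = extract_fields(SAMPLE_HTML, {"title": "Title", "price": "Price", "stock": "Stock"})
print(row)
```

The `rename` mapping plays the role of the field-renaming step: the keys identify what to extract and the values are the column names that end up in the output.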
- Click "Start Extraction" on the upper left side
- Select "Local Extraction" to run the task on your computer, or select "Cloud Extraction" to run the task in the Cloud (for premium users only)
Here's the data we extracted.
Happy data hunting!