You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

Kijiji is a Canadian online classified advertising website and part of eBay Classifieds Group.

This tutorial will show you how to scrape car information from Kijiji.

1.png

To follow along, you can use this URL:

https://www.kijiji.ca/b-cars-trucks/guelph/chevrolet-camaro/c174l1700242a54a1000054


Here are the main steps in this tutorial: [Download task file here]

  1. Go To Web Page - to open the target web page

  2. Create a "Loop Click Item" - to loop click into each item on each list

  3. Modify XPath for Loop Item - to locate all the items

  4. Set Click Item - to show detailed info

  5. Extract Data - to select the data to scrape

  6. Modify XPath for data fields - to locate elements accurately on each detail page

  7. Run the task - to get the desired Data


1. Go To Web Page - to open the target web page

  • Enter the URL on the home page and click Start

enter.png

2. Create a "Loop Click Item" - to loop click into each item on each list

  • Click on the first item card

  • Click on the second item card

  • Click on "Loop click each element" in the tips box

LOOP.png
  • Click on the Click Item box > Options

  • Tick Open in a new tab

  • Set AJAX Timeout to 7s

  • Click Apply

CLICK_ITEM.png

3. Modify XPath for Loop Item - to locate all the items

After the Loop is set up for the item cards, some items may not be included in it. To capture all of them, we need to modify the XPath of the Loop Item manually.

  • Click Loop Item

  • Choose Variable List as the Loop Mode

  • Enter the XPath //div[@class='clearfix']

  • Click Apply

3.png
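
To illustrate what this expression selects, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports a small XPath subset. The markup, the ad titles, and the `match_loop_items` helper are assumptions made up for this illustration and are not taken from Kijiji's real page; only the expression `//div[@class='clearfix']` comes from this tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup (an assumption for illustration;
# the real page is more complex and may change at any time).
SAMPLE = """
<html><body>
  <div class="clearfix"><a href="/ad/1">2015 Camaro</a></div>
  <div class="clearfix"><a href="/ad/2">2018 Camaro</a></div>
  <div class="clearfix top-ad"><a href="/ad/3">Promoted</a></div>
  <div class="footer"><a href="/help">Help</a></div>
</body></html>
"""


def match_loop_items(markup: str) -> list:
    """Return the link text of every div whose class is exactly 'clearfix'."""
    root = ET.fromstring(markup.strip())
    # ElementTree paths are relative, so ".//" plays the role of "//".
    items = root.findall(".//div[@class='clearfix']")
    return [item.find("a").text for item in items]


titles = match_loop_items(SAMPLE)
print(titles)
```

In this sample the third card is skipped because `@class='clearfix'` compares the whole attribute value, and `clearfix top-ad` is a different string. If items are still missing from the Loop on the real page, a broader test such as `contains(@class,'clearfix')` may be worth trying in Octoparse.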

4. Set Click Item - to show detailed info

Some of the details and the description are hidden on the detail page, so we need to click the "Show more" buttons to load the information fully.

  • Click the "Show more" button for the product details

  • Click "Click button" in the tips box

SHOW.png
  • Click the "Show more" button under the description

  • Click "Click button" in the tips box

more.png

5. Extract Data - to select the data to scrape

  • Click on the data you want to extract

  • After all the chosen data turns green, click "Extract data" in the tips box

DATA.png
  • Double-click a data field if you need to rename it

DATA.png

6. Modify XPath for data fields - to locate elements accurately on each detail page

If some data is missing or ends up in the wrong field, we need to rewrite the XPath of the affected fields so that the elements are located correctly on every detail page.

  • Change the data preview panel to a vertical view by clicking the upper right corner icon

  • Enter the XPath for the field

xpath.png

The XPath for each data field is listed below:

img_url: //div[contains(@class,'backgroundImage')]//img

doors: //dd[@itemprop="numberOfDoors"]

transmission: //dd[@itemprop="vehicleTransmission"]

fuel: //dd[@itemprop="fuelType"]/a

trim: //dd[@itemprop="vehicleConfiguration"]

stock: //dt[contains(text(),'Stock #')]/following-sibling::dd

color: //dd[@itemprop="color"]/a

km: //dd[@itemprop="mileageFromOdometer"]

seats: //dd[@itemprop="seatingCapacity"]

drivetrain: //dd[@itemprop="driveWheelConfiguration"]
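
As a rough way to check how the `itemprop`-based expressions above behave, the sketch below applies a few of them to a small fragment with Python's standard-library `xml.etree.ElementTree`. The fragment, its values, and the `extract_fields` helper are hypothetical and only mimic the structure the XPaths imply. ElementTree supports just a subset of XPath, so the expressions that rely on `contains()` or `following-sibling` (img_url and stock) are left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical detail-page fragment (assumed structure and values).
SAMPLE = """
<dl>
  <dt>Transmission</dt><dd itemprop="vehicleTransmission">Automatic</dd>
  <dt>Fuel Type</dt><dd itemprop="fuelType"><a href="#">Gas</a></dd>
  <dt>Colour</dt><dd itemprop="color"><a href="#">Black</a></dd>
  <dt>Kilometers</dt><dd itemprop="mileageFromOdometer">45,000</dd>
</dl>
"""

# A subset of the field XPaths from this tutorial, written relative (".//")
# because ElementTree does not accept absolute paths on an element.
FIELD_XPATHS = {
    "doors": ".//dd[@itemprop='numberOfDoors']",
    "transmission": ".//dd[@itemprop='vehicleTransmission']",
    "fuel": ".//dd[@itemprop='fuelType']/a",
    "color": ".//dd[@itemprop='color']/a",
    "km": ".//dd[@itemprop='mileageFromOdometer']",
}


def extract_fields(markup: str) -> dict:
    """Map each field name to its text, or None when nothing matches."""
    root = ET.fromstring(markup.strip())
    record = {}
    for name, xpath in FIELD_XPATHS.items():
        node = root.find(xpath)
        has_text = node is not None and node.text
        record[name] = node.text.strip() if has_text else None
    return record


record = extract_fields(SAMPLE)
print(record)
```

The sample has no `numberOfDoors` element, so `doors` comes back as `None`. This is similar to the blank cell you would see in Octoparse when a listing does not provide that attribute, which is different from a field whose XPath is wrong on every page.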


The final workflow will look like this:

WF.png

7. Run the task - to get the desired Data

  • Click the Save button first to save all the settings you have made

  • Then click Run to run your task either locally or in the Cloud

mceclip8.png
  • Select Run on your device and click Run Now to run the task on your local device

  • Wait for the task to complete

mceclip9.png

Below is a sample of the data from a local run. The data can be exported in Excel, CSV, HTML, and JSON formats.

DATA.png
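
Once the data is exported as CSV, it can be cleaned up with a few lines of Python. The sketch below is only an illustration: the column names `title`, `km`, and `transmission` and the sample rows are assumptions, so adjust them to whatever field names you chose in step 5. It reads the rows with the standard `csv` module and converts the `km` text into an integer.

```python
import csv
import io

# Hypothetical export content (assumed column names and values).
SAMPLE_EXPORT = """title,km,transmission
2015 Chevrolet Camaro,"45,000",Automatic
2018 Chevrolet Camaro,,Manual
"""


def load_rows(csv_text: str) -> list:
    """Parse CSV text and turn the km column into an int (or None if blank)."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only the digits so values such as "45,000" become 45000.
        digits = "".join(ch for ch in row["km"] if ch.isdigit())
        row["km"] = int(digits) if digits else None
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_EXPORT)
for row in rows:
    print(row["title"], row["km"])
```

For a real export file, replace `io.StringIO(csv_text)` with `open(path, newline="")`, where `path` points to your exported file; depending on how the file was saved, you may also need to pass an `encoding` argument.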

Tip: Local runs are great for quick runs and small amounts of data. If you are dealing with more complicated tasks or large volumes of data, Run in the Cloud is recommended for higher speed. You are welcome to try this premium feature by signing up for the 14-day free trial here. Cloud tasks can be scheduled hourly, daily, or weekly, with the data delivered on the same schedule.
