Tokopedia is an Indonesian technology company specializing in e-commerce. In this tutorial, we are going to show you how to scrape product information from Tokopedia.

To scrape Tokopedia, you can either use the ready-to-use Task Template available on the home page or follow this tutorial to build the task from scratch.

sto.gif

To demonstrate, we will use the URL below as an example: https://www.tokopedia.com/search?st=product&q=usb
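
The example URL is an ordinary search URL with two query parameters, st and q. As a minimal sketch outside the tool itself, the Python below takes the URL apart and rebuilds it for a different keyword. The helper name build_search_url is hypothetical, and only the two parameters visible in the URL above are assumed.

```python
from urllib.parse import urlsplit, parse_qs, urlencode, urlunsplit

# The example search URL used in this tutorial.
EXAMPLE_URL = "https://www.tokopedia.com/search?st=product&q=usb"


def build_search_url(keyword: str, base_url: str = EXAMPLE_URL) -> str:
    """Return base_url with its q parameter replaced by keyword.

    Hypothetical helper for illustration; it keeps every other
    query parameter (here only st) unchanged.
    """
    parts = urlsplit(base_url)
    params = parse_qs(parts.query)   # {'st': ['product'], 'q': ['usb']}
    params["q"] = [keyword]          # swap in the new search keyword
    new_query = urlencode(params, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


if __name__ == "__main__":
    print(parse_qs(urlsplit(EXAMPLE_URL).query))
    print(build_search_url("usb hub"))
```

Pasting a URL built this way into step 1 of Task 1 would start the same workflow for another search keyword.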

Here are the main steps in this tutorial: [Download demo task file here: Task 1 / Task 2]

Task 1 - Extract Product URLs

  1. Open Target Webpage

  2. Auto-detect Web Page Data

  3. Create Pagination

  4. Check the Workflow

  5. Run Task and Export Data

Task 2 - Extract Data from Detail Page

  1. Create a New Task

  2. Extract Data from the Product Page

  3. Check the Workflow

  4. Run Task and Export Data


Task 1 - Extract Product URLs

1. Open Target Webpage

  • Paste the URL and click Start

mceclip0.png

2. Auto-detect Web Page Data

  • Select Auto-detect web page data on the Tips panel

mceclip8.png
  • After the auto-detection finishes, select Edit under Add a page scroll

Add_page_scroll.jpg
  • Set the Repeats number to 3 and click Confirm, then click Create workflow

scroll_repeat.jpg
  • Go to Data Preview and delete every field except the page URL: click ... (more) next to each field header and delete the field

mceclip11.png

3. Create Pagination

  • Click on the Next button on the web page

  • Click BUTTON at the bottom of the Tips panel

Create_pagination_1.jpg
  • Choose Loop click single button

create_pagination2.jpg
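
Conceptually, a loop that clicks a single Next button keeps extracting the current page and moving on until no Next button is left. The Python below is only an illustration of that loop logic under stated assumptions, not how the tool is implemented: the PAGES dictionary, the example.com URLs, the collect_urls function, and the max_pages safety limit are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for the search result pages.
# Each page holds its product URLs and the key of the next page;
# "next": None models a page that has no Next button.
PAGES: Dict[str, dict] = {
    "page1": {"urls": ["https://example.com/a", "https://example.com/b"], "next": "page2"},
    "page2": {"urls": ["https://example.com/c"], "next": None},
}


def collect_urls(start: str, pages: Dict[str, dict], max_pages: int = 100) -> List[str]:
    """Extract URLs page by page, following "next" until it is missing."""
    collected: List[str] = []
    current: Optional[str] = start
    visited = 0
    while current is not None and visited < max_pages:
        page = pages[current]
        collected.extend(page["urls"])  # extract data from the current page
        current = page["next"]          # "click" Next, or stop if there is none
        visited += 1
    return collected


if __name__ == "__main__":
    print(collect_urls("page1", PAGES))
```

The max_pages limit is a design choice for the sketch: it guarantees the loop ends even if every page keeps offering a Next button.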

4. Check the Workflow

Below is what the final workflow looks like. If everything is in place, you can continue to run the task.

mceclip6.png

5. Run Task and Export Data

  • Run the task from the top right corner: select Run task on your device to run it on your local device (note that a cloud run may not work for this website because it is sensitive to scrapers)

mceclip7.png

Task 2 - Extract Data from Detail Page

1. Create a New Task

  • Select Advanced Mode in the top left corner, select Import from file, and import the Excel file exported from Task 1. Then choose the correct Sheet and Column and click Save to continue

mceclip1.png
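
Before importing, it can help to check the exported URL column for blanks, duplicates, and links that point somewhere other than Tokopedia. The Python below is a minimal sketch of such a check, assuming the column has already been copied into a Python list. The function name clean_product_urls, the sample URLs, and the choice to filter on the www.tokopedia.com host are all assumptions made for illustration.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def clean_product_urls(urls: Iterable[str], allowed_host: str = "www.tokopedia.com") -> List[str]:
    """Drop blank rows, other hosts, and duplicates, keeping the original order."""
    seen = set()
    cleaned: List[str] = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue                      # skip empty cells
        if urlsplit(url).netloc != allowed_host:
            continue                      # skip links to other sites
        if url in seen:
            continue                      # skip repeated URLs
        seen.add(url)
        cleaned.append(url)
    return cleaned


if __name__ == "__main__":
    # Made-up rows standing in for the exported URL column.
    exported = [
        "https://www.tokopedia.com/shop-a/usb-drive",
        "  https://www.tokopedia.com/shop-a/usb-drive  ",
        "",
        "https://example.com/not-a-product",
        "https://www.tokopedia.com/shop-b/usb-cable",
    ]
    for url in clean_product_urls(exported):
        print(url)
```

Removing duplicates first means Task 2 does not visit the same product page twice.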

2. Extract Data from the Product Page

  • Click on any text from the page and select Extract the text of the selected element

sto.gif
  • Go to Data Preview and double-click a field header to rename it

sto.gif

3. Check the Workflow

Below is what the final workflow looks like. If everything is in place, you can continue to run the task.

mceclip2.png

4. Run Task and Export Data

  • Run the task from the top right corner: select Run task on your device to run it on your local device.

Here is the sample output:

mceclip12.png