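Tokopedia is an Indonesian technology company specializing in e-commerce. This tutorial shows how to scrape product information from Tokopedia in two tasks: the first collects product URLs from the search results, and the second extracts data from each product's detail page.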
For Tokopedia scraping, you can use our ready-to-use Task Template available on the home page or follow this tutorial to build the task from scratch.
This tutorial uses the following search URL as its example: https://www.tokopedia.com/search?st=product&q=usb
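If you want to run the same task for other keywords, the example URL can be rebuilt programmatically. The snippet below is a minimal sketch, assuming that the q parameter carries the search keyword and that st=product restricts results to products, as the example URL suggests. The function names build_search_url and read_keyword are hypothetical helpers, not part of any tool mentioned in this tutorial.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Search endpoint taken from the example URL in this tutorial.
SEARCH_BASE = "https://www.tokopedia.com/search"


def build_search_url(keyword: str) -> str:
    """Return a search URL shaped like the tutorial's example.

    Assumption: `q` holds the keyword and `st=product` limits the
    search to products, as the example URL suggests.
    """
    query = urlencode({"st": "product", "q": keyword})
    return f"{SEARCH_BASE}?{query}"


def read_keyword(url: str) -> str:
    """Extract the `q` parameter from a search URL ('' if it is absent)."""
    values = parse_qs(urlsplit(url).query).get("q", [])
    return values[0] if values else ""


print(build_search_url("usb"))
```

Each generated URL can then be pasted into the task in step 1 in place of the example URL.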
Task 1 - Extract Product URLs
1. Open Target Webpage
Paste the URL and click Start
2. Auto-detect Web Page Data
Select Auto-detect web page data on the Tips panel
After the auto-detection finishes, select Edit under Add a page scroll
Set the Repeats number to 3 and click Confirm, then click Create workflow
Go to Data Preview and delete all fields except the page URL by clicking ... (more) next to each field header
3. Create Pagination
Click on the Next button on the web page
Click BUTTON at the bottom of the Tips panel
Choose Loop click single button
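Conceptually, a pagination loop extracts the rows on the current page, clicks Next, and repeats until there is no further page. The sketch below illustrates that general idea only. It is not the tool's internal implementation, and paginate, extract, click_next, and the toy fake_site data are all hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

Page = TypeVar("Page")
Row = TypeVar("Row")


def paginate(
    first_page: Optional[Page],
    extract: Callable[[Page], Iterable[Row]],
    click_next: Callable[[Page], Optional[Page]],
    max_pages: int = 20,
) -> Iterator[Row]:
    """Yield rows page by page until there is no next page.

    `max_pages` is a safety cap so the loop always terminates.
    """
    page = first_page
    visited = 0
    while page is not None and visited < max_pages:
        yield from extract(page)   # collect rows on the current page
        visited += 1
        page = click_next(page)    # None means there is no Next button


# Toy stand-in for three result pages (made-up data, not real results).
fake_site = {
    1: {"urls": ["a", "b"], "next": 2},
    2: {"urls": ["c"], "next": 3},
    3: {"urls": ["d"], "next": None},
}

urls = list(
    paginate(
        fake_site[1],
        extract=lambda page: page["urls"],
        click_next=lambda page: fake_site.get(page["next"]),
    )
)
print(urls)
```

The cap on visited pages is a design choice for the sketch: it guards against a Next button that never disappears.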
4. Check the Workflow
Below is what the final workflow looks like. If everything is in place, you can continue to run the task.
5. Run Task and Export Data
Run the task from the top right corner: choose Run task on your device to run it locally. Note that cloud runs may not work for this website because it is sensitive to scrapers.
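Scrolling and pagination may produce repeated rows, so it can help to deduplicate the exported URLs before reusing them. The snippet below is a minimal sketch under stated assumptions: the data was exported as CSV rather than Excel, and the URL column is named Page_URL. Both the column name and the helper unique_urls are hypothetical, so adjust them to match your actual export. The sample rows are made-up placeholders.

```python
import csv
import io
from typing import Dict, Iterable, List


def unique_urls(rows: Iterable[Dict[str, str]], column: str = "Page_URL") -> List[str]:
    """Return the non-empty URLs from `column`, in order, without duplicates.

    Assumption: `column` is the header of the URL field in the export.
    """
    seen = set()
    result = []
    for row in rows:
        url = (row.get(column) or "").strip()
        if url and url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Made-up placeholder rows standing in for an exported CSV file.
sample_export = io.StringIO(
    "Page_URL\n"
    "https://example.com/item-1\n"
    "https://example.com/item-2\n"
    "https://example.com/item-1 \n"
    "   \n"
)

cleaned = unique_urls(csv.DictReader(sample_export))
print(cleaned)
```

With a real file, replace sample_export with a file opened via open(path, newline="", encoding="utf-8").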
Task 2 - Extract Data From Detail Page
1. Create a New Task
Select Advanced Mode in the top left corner, select Import from file, and import the Excel file exported from Task 1. Then locate the correct Sheet and Column, and click Save to continue
2. Extract Data from the Product Page
Click on any text from the page and select Extract the text of the selected element
Go to Data Preview and double-click a field header to rename it
3. Check the Workflow
Below is what the final workflow looks like. If everything is in place, you can continue to run the task.
4. Run Task and Export Data
Run the task from the top right corner: choose Run task on your device to run it locally.
Here is the sample output: