Scrape product information from Bukalapak
Bukalapak is an Indonesian e-commerce company. It enables small and medium enterprises to go online, and it also supports traditional family-owned businesses.
You can go to "Task Templates" on the main screen of the Octoparse scraping tool and start with a ready-to-use Bukalapak template to save time. With this feature, there is no need to configure a scraping task yourself. For further details, see: Task Templates
This tutorial will show you how to collect product details on bukalapak.com with Octoparse.
To follow along, you may want to use this URL in the tutorial:
Note: To check whether your workflow works correctly, download the OTD file for this case at the bottom of this page.
Here are the main steps in this tutorial:
- Create a Go to Web Page - to open the target website
- Auto-detect the webpage - to create a workflow
- Modify the XPath of Loop Item - to locate the data field(s) more accurately
- Modify the settings of Pagination - to fully load the content on the webpage
- Run the task - to get your desired data
1. Create a Go to Web Page - to open the target website
- Enter the page URL on the home screen and click Start to create a new task
- Click the Go to Web Page box > Option
- Tick Scroll down the page after it is loaded
- Set Scroll to "for one screen" and Repeat times to 12
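To get a feel for how far these scroll settings reach, the arithmetic can be sketched in Python. This is only an illustration: the 800-pixel viewport height is an assumed value, the function names are hypothetical, and none of this is part of Octoparse itself.

```python
def scroll_offsets(viewport_height, repeat_times):
    """Offset of the top of the viewport after each one-screen scroll."""
    return [viewport_height * step for step in range(1, repeat_times + 1)]


def covered_height(viewport_height, repeat_times):
    """Total page height seen: the first screen plus one screen per scroll."""
    return viewport_height * (repeat_times + 1)


# Assumed viewport height in pixels; 12 matches the Repeat times set above.
offsets = scroll_offsets(800, 12)
total = covered_height(800, 12)
print(offsets[-1], total)
```

If the product list is taller than the height covered this way, a larger Repeat times value may be needed.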
2. Auto-detect the webpage - to create a workflow
- Click Auto-detect web page data and wait for the detection to complete
- Check the data fields in Data Preview and delete unwanted fields or rename them if needed
- Uncheck Add a page scroll
- Click Create workflow
3. Modify the XPath of Loop Item - to locate the data field(s) more accurately
- Choose the Loop Item box in the workflow
- Input the Matching XPath as: //div[@class="bl-flex-container flex-wrap is-gutter-16"]/div
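To see what this XPath selects, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample markup, the product names and prices, and the `name` and `price` classes are all made up for illustration; the real page structure may differ. ElementTree does not accept a path that starts with `//`, so the sketch uses the equivalent relative form with a leading dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real Bukalapak page is more complex.
SAMPLE = """
<html><body>
  <div class="bl-flex-container flex-wrap is-gutter-16">
    <div><p class="name">Product A</p><p class="price">Rp10.000</p></div>
    <div><p class="name">Product B</p><p class="price">Rp25.000</p></div>
  </div>
  <div class="footer"><div>Not a product</div></div>
</body></html>
"""

# Same expression as in the tutorial, with a leading "." for ElementTree.
LOOP_XPATH = './/div[@class="bl-flex-container flex-wrap is-gutter-16"]/div'


def extract_items(markup):
    """Return one dict per direct child div of the product container."""
    root = ET.fromstring(markup)
    items = []
    for card in root.findall(LOOP_XPATH):
        items.append({
            "name": card.findtext("p[@class='name']"),
            "price": card.findtext("p[@class='price']"),
        })
    return items


items = extract_items(SAMPLE)
for item in items:
    print(item)
```

Note that `@class="..."` compares the whole attribute value, so the expression stops matching if the site adds, removes, or reorders class names on the container.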
4. Modify the settings of Pagination - to fully load the content on the webpage
- Choose Click to paginate in the workflow > Click Option
- Tick Scroll down the page after it is loaded
- Set Scroll to "to the bottom of the page"
- Set the scroll times to 12
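The effect of scrolling after each pagination click can be sketched with a small simulation. Everything here is an assumption made for illustration: `FakeListingPage` is a hypothetical stand-in for a page that loads more products each time it is scrolled to the bottom, and the item counts and batch size are invented. It is not how Octoparse is implemented.

```python
class FakeListingPage:
    """Hypothetical listing page that reveals items in batches on scroll."""

    def __init__(self, total_items, batch_size):
        self.total_items = total_items
        self.batch_size = batch_size
        # The first batch is visible as soon as the page opens.
        self.loaded = min(batch_size, total_items)

    def scroll_to_bottom(self):
        # Each scroll to the bottom reveals one more batch, up to the total.
        self.loaded = min(self.loaded + self.batch_size, self.total_items)


def load_page(page, scroll_times=12):
    """Scroll to the bottom a fixed number of times, then count loaded items."""
    for _ in range(scroll_times):
        page.scroll_to_bottom()
    return page.loaded


def scrape(pages, scroll_times=12):
    """Visit each page in turn, as Click to paginate would, scrolling on each."""
    return [load_page(page, scroll_times) for page in pages]


# Two assumed result pages with 50 products each, revealed 5 at a time.
full = scrape([FakeListingPage(50, 5), FakeListingPage(50, 5)])
partial = scrape([FakeListingPage(50, 5)], scroll_times=3)
print(full, partial)
```

In this simulation, 12 scrolls are enough to load every item on each page, while 3 scrolls leave part of the list unloaded, which is the situation the pagination scroll settings are meant to avoid.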
5. Run the task - to get your desired data
- Click Save on the upper right to save your task
- Click Run next to it and wait for a Run Task window to pop up
- Select Run on your device to run the task on your local device
- Wait for the task to complete
Here is a sample output from a local run:
If you have further issues with the task or have a suggestion that would make this a better resource for you, we’d love to hear about it. Submit a request here.
Author: Cassie
Editor: Yina