You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
With an extensive hotel network in 200 countries and regions, Trip.com helps customers find comfortable accommodation. Customers can look up information such as a hotel's price, services, and reviews.
This tutorial shows how to collect hotel information, such as hotel name, location, comments, price, and rating, from Trip.com with Octoparse.
To follow along, you can use the URL below:
Here are the main steps in this tutorial: [Download task file here]
1. Create a Go to Web Page - to open the target website
Enter the target URL into the search bar on the home screen and click Start.
NOTE: Trip.com handles pagination in a slightly complicated way. The Search More Hotels button only shows up after the page has been scrolled down several times, so we need to add a page scroll at the beginning.
Click on Go to Web Page > Option
Tick Scroll down the page after it is loaded
Set the Scroll to repeat 10 times and Wait 2s for each scroll
Click Apply to save the settings
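For readers who think in code, the scroll settings above amount to a simple repeat-and-wait loop. The snippet below is only a minimal sketch of that logic in Python. It is not Octoparse's implementation, and the names scroll_page, scroll_once, and sleep are hypothetical, introduced here for illustration.

```python
import time
from typing import Callable


def scroll_page(
    scroll_once: Callable[[], None],
    repeats: int = 10,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call scroll_once `repeats` times, pausing `wait_seconds` after each call.

    scroll_once is a hypothetical callback standing in for "scroll the page
    down once"; sleep can be swapped out so the loop can be tried without
    actually waiting.
    """
    for _ in range(repeats):
        scroll_once()        # scroll down one step
        sleep(wait_seconds)  # give the page time to load new content
    return repeats


if __name__ == "__main__":
    # Dry run with a no-op scroll and no real waiting.
    done = scroll_page(lambda: None, sleep=lambda seconds: None)
    print(f"Scrolled {done} times")
```

The wait after each scroll matters more than the exact number of repeats: if the page has not finished loading before the next scroll, the Search More Hotels button may never appear.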
2. Auto-detect web page data - to create a workflow
Octoparse's Auto-detection function can help you create a workflow quickly according to the design of the target website.
Click Auto-detect web page data in Tips and wait for the detection to complete
Check the data fields in Data preview and delete unwanted fields or rename them if needed
Uncheck both the Paginate to scrape more pages option and the Add a Page Scroll option
Click Create workflow
3. Click the Load More button - to load more hotels
Scroll down to the bottom of the page until you see the Search More Hotels button
Click the Search More Hotels button, then choose Loop click single element in Tips
4. Set up a Scroll Page - to extract the new hotels' information
Click the add step button in the workflow to add a new step
Choose Loop Mode as Scroll Page
Set the Repeats as 10 times and Wait 2s for each scroll
NOTE: After the Search More Hotels button is clicked, Trip.com often takes a long time to finish loading. Thus, we need to set a Wait before action time so that the page scroll only starts once the new hotels have loaded.
Tick Wait before action > set wait time as 10s
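Taken together, steps 3 and 4 describe a loop: click Search More Hotels, wait 10s, scroll 10 times with a 2s pause, and repeat while the button is still there. The sketch below illustrates that control flow in Python under simplifying assumptions. FakeHotelPage and run_workflow are hypothetical names invented for this example; they simulate a page in memory and do not drive a real browser or call any Octoparse API.

```python
import time
from typing import Callable, List


class FakeHotelPage:
    """Hypothetical in-memory stand-in for the hotel results page."""

    def __init__(self, batches: List[List[str]]):
        self._pending = list(batches)  # batches of hotels not yet loaded
        self.visible: List[str] = []   # hotels currently shown on the page
        self._load_next()              # the first batch is shown on page load

    def _load_next(self) -> None:
        if self._pending:
            self.visible.extend(self._pending.pop(0))

    def has_more_button(self) -> bool:
        # The button is assumed to exist while unloaded batches remain.
        return bool(self._pending)

    def click_more(self) -> None:
        self._load_next()

    def scroll(self) -> None:
        pass  # a real page would render lazily loaded rows here


def run_workflow(
    page: FakeHotelPage,
    sleep: Callable[[float], None] = time.sleep,
    wait_before: float = 10.0,
    repeats: int = 10,
    scroll_wait: float = 2.0,
) -> List[str]:
    """Click the load-more button until it disappears, scrolling after each click."""
    while page.has_more_button():
        page.click_more()
        sleep(wait_before)      # Wait before action: let the new hotels load
        for _ in range(repeats):
            page.scroll()
            sleep(scroll_wait)  # Wait 2s for each scroll
    return list(page.visible)


if __name__ == "__main__":
    demo = FakeHotelPage([["Hotel A", "Hotel B"], ["Hotel C"], ["Hotel D"]])
    # Pass a no-op sleep so the demo finishes instantly.
    print(run_workflow(demo, sleep=lambda seconds: None))
```

In this sketch the 10s wait runs once per click, before any scrolling starts, which mirrors the Wait before action setting on the Scroll Page step.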
5. Run the task - to get your target data
Click Save on the upper right to save your task
Click Run next to it and wait for a Run Task window to pop up
Select Run on your device to run the task on your local device
Wait for the task to complete
Here is sample data for your reference: