Scrape hotel information from Trip.com
You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend that you upgrade, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
With an extensive hotel network in 200 countries and regions, Trip.com helps customers choose comfortable accommodation. Customers can find information such as a hotel's price, services, and reviews.
This tutorial shows how to collect hotel information, such as hotel name, location, comments, price, and rating, from Trip.com with Octoparse.
To follow along, you might want to use the URL below:
Note: If you want to check whether your workflow works correctly, please download the OTD file for this case at the bottom of this page.
Here are the main steps in this tutorial:
- Create a Go to Web Page - to open the target website
- Auto-detect web page data - to create a workflow
- Click the Load More button - to load more hotels
- Set up a Scroll Page - to extract the new hotels' information
- Run the task - to get the data you want
1. Create a Go to Web Page - to open the target website
- Enter the target URL into the search bar on the home screen and click Start.
Note: Trip.com handles pagination in a slightly complicated way. The Search More Hotels button only shows up after scrolling down several times, so we need to add a page scroll at the beginning.
- Click on Go to Webpage > Option
- Tick Scroll down the page after it is loaded
- Set the Scroll to repeat 10 times and Wait 2s for each scroll
- Click Apply to save the settings
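The scroll settings above amount to a simple repeat-and-wait loop. The following Python sketch only illustrates that logic and is not Octoparse's own implementation: `scroll_repeatedly` and the `scroll_once` callback are hypothetical names, and waiting after each scroll (rather than before it) is an assumption.

```python
import time
from typing import Callable


def scroll_repeatedly(
    scroll_once: Callable[[], None],
    repeats: int = 10,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Call scroll_once `repeats` times, pausing after each scroll.

    Returns the total time spent waiting, in seconds.
    """
    total_wait = 0.0
    for _ in range(repeats):
        scroll_once()        # hypothetical callback: scroll the page down one step
        sleep(wait_seconds)  # give newly loaded hotels time to render
        total_wait += wait_seconds
    return total_wait


if __name__ == "__main__":
    scrolls = []
    # A no-op sleep keeps the demo instant; the defaults mirror the settings above.
    waited = scroll_repeatedly(lambda: scrolls.append("scroll"), sleep=lambda s: None)
    print(len(scrolls), waited)
```

With the values from this step (10 repeats, 2s each), the page spends about 20 seconds scrolling before extraction starts.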
2. Auto-detect web page data - to create a workflow
Octoparse's Auto-detection function can help you create a workflow quickly according to the design of the target website.
- Click Auto-detect web page data in Tips and wait for the detection to complete
- Check the data fields in Data preview and delete unwanted fields or rename them if needed
- Uncheck Paginate to scrape more pages and Add a Page Scroll
- Click Create workflow
3. Click the Load More button - to load more hotels
- Scroll down to the bottom of the page until you see the Search More Hotels button
- Click Search More Hotels > Loop click single element in the Tips
4. Set up a Scroll Page - to extract the new hotels' information
- Click the add-step icon in the workflow to add a new step
- Select Loop
- Choose Loop Mode as Scroll Page
- Set the Repeats as 10 times and Wait 2s for each scroll
Note: After Search More Hotels is clicked, Trip.com often takes a long time to finish loading, so we need to add a wait time before the page scroll starts.
- Click Option
- Tick Wait before action > set wait time as 10s
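Steps 3 and 4 together can be read as a nested loop: click Search More Hotels, wait for the page to load, then scroll through the new results. The Python sketch below is a rough illustration of that order under stated assumptions, not a description of how Octoparse works internally. The names `load_all_hotels`, `click_load_more`, and `scroll_once` are hypothetical, and the `max_clicks` safety limit is an assumption that does not appear in the tutorial.

```python
import time
from typing import Callable


def load_all_hotels(
    click_load_more: Callable[[], bool],
    scroll_once: Callable[[], None],
    wait_before_scroll: float = 10.0,
    repeats: int = 10,
    wait_per_scroll: float = 2.0,
    max_clicks: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Click the load-more button until it is gone, scrolling after each click.

    click_load_more is a hypothetical callback that returns True when the
    button was found and clicked, and False when it is no longer available.
    Returns the number of successful clicks.
    """
    clicks = 0
    while clicks < max_clicks and click_load_more():
        clicks += 1
        sleep(wait_before_scroll)   # "Wait before action": let the new hotels load
        for _ in range(repeats):    # Scroll Page loop: 10 repeats, 2s each
            scroll_once()
            sleep(wait_per_scroll)
    return clicks
```

Under these settings, each click costs roughly 10 + 10 × 2 = 30 seconds of waiting, which is worth keeping in mind when estimating how long a run will take.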
5. Run the task - to get the data you want
- Click Save on the upper right to save your task
- Click Run next to it and wait for a Run Task window to pop up
- Select Run on your device to run the task on your local device
- Wait for the task to complete
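Once the task has finished and the data has been exported to CSV, a quick check of the rows can be useful. The sketch below is a minimal example, assuming the export has columns named `hotel_name` and `price`; these names are hypothetical and depend on how the fields were renamed in Data preview. It drops rows that repeat a hotel name, in case the same hotel was captured more than once, and converts the price text to a number.

```python
import csv
import io
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Extract a number from a price string such as 'US$120'; None if absent."""
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def load_hotels(csv_text: str,
                name_field: str = "hotel_name",   # assumed column name
                price_field: str = "price") -> list:  # assumed column name
    """Read exported rows, skip repeated hotel names, and parse prices."""
    seen = set()
    hotels = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get(name_field) or "").strip()
        if not name or name in seen:
            continue  # skip blank names and hotels already collected
        seen.add(name)
        hotels.append({**row, price_field: parse_price(row.get(price_field) or "")})
    return hotels


if __name__ == "__main__":
    # Made-up placeholder rows, not real Trip.com data.
    sample = "hotel_name,price\nExample Hotel A,US$120\nExample Hotel A,US$120\n"
    print(load_hotels(sample))
```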
Here is some sample data for your reference:
Author: Cassie
Editor: Yina