
Scrape hotel info from Expedia

Updated over 2 years ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

Expedia is a popular American online hotel booking website for travelers. This tutorial shows you how to scrape basic information such as hotel name, location, price, and amenities from Expedia with Octoparse.

To follow along, here is the example URL:

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create a Go to Web Page - to open the target website

  • Enter the target URL on the Octoparse homepage and click Start


2. Auto-detect the webpage - to create a workflow

  • Click Auto-detect webpage data and wait for it to complete

  • Uncheck Add a page scroll

  • Click Create workflow

  • Go to Data Preview to see if you're okay with the current data output

    • Delete unnecessary data fields directly by clicking More and Delete field

    • Modify the data field names by double-clicking the headers


3. Set up a Page Scroll - to better load the data on the webpage

  • Click Go to Webpage > Options panel

  • Tick Scroll down the page after it is loaded

  • Set Scroll Mode to "For one screen"

  • Set Wait to "2s" before the next scroll

  • Set Scroll times to "100"

  • Tick Stop scrolling when there's no more content to load

  • Click Apply to save the settings

  • Repeat the settings above for the Click on a "Load More" button step
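The scroll settings above can be read as a simple loop: scroll one screen, wait, and stop either at the scroll limit or once nothing new loads. The Python sketch below only illustrates that logic and is not Octoparse's actual implementation. The names `scroll_until_done` and `make_fake_page` are hypothetical, and the "no more content" check (the loaded item count stops growing) is an assumption.

```python
import time
from typing import Callable


def scroll_until_done(
    scroll_one_screen: Callable[[], int],
    max_scrolls: int = 100,        # mirrors Scroll times = "100"
    wait_seconds: float = 2.0,     # mirrors Wait = "2s"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll one screen at a time and return the number of scrolls performed.

    `scroll_one_screen` is a hypothetical callback that scrolls once and
    returns how many items are loaded afterwards.
    """
    scrolls = 0
    last_count = None
    while scrolls < max_scrolls:
        count = scroll_one_screen()
        scrolls += 1
        sleep(wait_seconds)
        # Assumed stop rule: the item count did not change, so nothing new loaded.
        if count == last_count:
            break
        last_count = count
    return scrolls


def make_fake_page(total_items: int, per_scroll: int) -> Callable[[], int]:
    """Build a stand-in page that loads `per_scroll` items per scroll (hypothetical)."""
    loaded = 0

    def scroll() -> int:
        nonlocal loaded
        loaded = min(total_items, loaded + per_scroll)
        return loaded

    return scroll


if __name__ == "__main__":
    page = make_fake_page(total_items=30, per_scroll=10)
    # A no-op sleep keeps the demo fast; a real run would actually wait.
    print(scroll_until_done(page, sleep=lambda _seconds: None))
```

Under this reading, a high Scroll times value such as "100" acts as a safety cap, while the stop rule ends the loop earlier on short result lists.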


4. Modify XPath for the loop and data fields - to make sure Octoparse scrapes correct data

  • Click Loop Item 1

  • Make sure Loop Mode is Variable List

  • Enter the XPath //div[@data-stid="lodging-card-responsive"]

  • Click Apply

  • Click the More button on the data field

  • Choose Customize XPath

  • Enter the XPath //button[@class="uitk-image-link"]/../div/img for the image field

  • Click Apply

  • Enter the XPath for the final price the same way: //div[@data-stid="lodging-card-responsive"]/descendant-or-self::DIV[contains(@class,"uitk-spacing uitk-spacing-padding-block-half")]/span
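To see what the loop and image XPaths select, the sketch below runs them against a tiny, made-up fragment using Python's standard `xml.etree.ElementTree`. The markup, the `h3` name field, and the `extract_cards` helper are assumptions for illustration, not Expedia's real page structure, and the sketch assumes each field XPath is evaluated relative to its loop item. ElementTree supports only a subset of XPath, so the price expression (which uses `contains()` and `descendant-or-self::`) is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a results page.
# Real pages are HTML and would need an HTML-aware parser.
SAMPLE = """
<html><body>
  <div data-stid="lodging-card-responsive">
    <figure>
      <button class="uitk-image-link">Open</button>
      <div><img src="hotel-a.jpg"/></div>
    </figure>
    <h3>Hotel A</h3>
  </div>
  <div data-stid="lodging-card-responsive">
    <figure>
      <button class="uitk-image-link">Open</button>
      <div><img src="hotel-b.jpg"/></div>
    </figure>
    <h3>Hotel B</h3>
  </div>
  <div data-stid="sponsored-banner"><h3>Ad</h3></div>
</body></html>
"""


def extract_cards(markup: str) -> list:
    """Return one row per lodging card, mirroring a loop over list items."""
    root = ET.fromstring(markup.strip())
    rows = []
    # Loop XPath: one match per hotel card; the banner div is skipped.
    for card in root.findall(".//div[@data-stid='lodging-card-responsive']"):
        # Image XPath: from the button, step up to its parent, then down to div/img.
        img = card.find(".//button[@class='uitk-image-link']/../div/img")
        name = card.find(".//h3")  # hypothetical name field
        rows.append({
            "name": name.text if name is not None else None,
            "image": img.get("src") if img is not None else None,
        })
    return rows


if __name__ == "__main__":
    for row in extract_cards(SAMPLE):
        print(row)
```

Anchoring the loop on the `data-stid` attribute rather than on styling classes keeps non-hotel blocks, such as the banner in the sample, out of the loop.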


5. Run the task - to get your target data

  • Click Run to run your task either on your device or in the cloud

  • Select Standard Mode under the Run on your device section to run the task locally

  • Wait for the task to complete


Here's the sample data output for your reference:
