This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

StubHub is a website where fans buy and sell tickets for events. You can search for an event's ticket price, time, and location before deciding whether to buy.

This tutorial will show you how to scrape ticket prices from StubHub.

search.jpg

To follow along with this tutorial, you can use the URL below:

https://www.stubhub.com/find/s/?q=Shawn%20mendes
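
Note that the search term in this URL is percent-encoded, so the space in "Shawn mendes" appears as %20. If you want to prepare the same kind of URL for another search term, here is a minimal Python sketch. It assumes that the q parameter works the same way for other searches, which this tutorial does not confirm, and the function name build_search_url is only illustrative.

```python
from urllib.parse import quote

# Base of the search URL used in this tutorial.
BASE_URL = "https://www.stubhub.com/find/s/"


def build_search_url(term: str) -> str:
    """Return a search URL with the term percent-encoded (space becomes %20)."""
    # Assumption: other search terms use the same ?q= pattern as the example URL.
    return f"{BASE_URL}?q={quote(term)}"


print(build_search_url("Shawn mendes"))
# prints https://www.stubhub.com/find/s/?q=Shawn%20mendes
```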

Here are the main steps of this tutorial: [Download task file here]

  1. Create a Go to Web Page - to open the target website

  2. Auto-detect the webpage - to create a workflow

  3. Modify the XPath of Loop Item - to locate the loop accurately

  4. Create a Pagination - to load more data on the webpage

  5. Run the task - to get your desired data

1. Create a Go to Web Page - to open the target website

  • Enter the target URL into the search bar on the home screen and click Start

start.jpg

2. Auto-detect the webpage - to create a workflow

Octoparse's Auto-detection function can help you quickly create a workflow according to the target website's design.

  • Click Auto-detect web page data in Tips and wait for the detection to complete

auto_detect.jpg
  • Check the data fields in Data preview and delete unwanted fields or rename them if needed

data_preview.jpg
  • Click Create workflow

create_workflow.jpg
  • Modify the XPath of Price

Sometimes, the price of an event might not be a fixed value but a range, so we need to modify the XPath of the Price data field to scrape different types of prices.

change.jpg
field_xpath.jpg

We have prepared the XPath for this field for you. You can copy and paste it into Octoparse. Enjoy!

Price: //div[@class="EventItem__Tickets"]

The price of the event will now be presented as:

price_change.jpg
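
To see why pointing the Price field at the whole EventItem__Tickets container handles both a fixed price and a price range, here is a minimal Python sketch using the standard library. The markup in SAMPLE is hypothetical and much simpler than the real page, and the function name extract_prices is only illustrative. Python's ElementTree supports only a subset of XPath and needs a leading dot, so the expression is written as .//div[...] here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: one fixed price and one price range.
SAMPLE = """
<root>
  <div class="EventItem__Tickets"><span>From</span> <span>$45</span></div>
  <div class="EventItem__Tickets"><span>$45</span> - <span>$120</span></div>
</root>
"""


def extract_prices(markup: str) -> list[str]:
    """Return the full text of every price container in the markup."""
    root = ET.fromstring(markup)
    prices = []
    for node in root.findall(".//div[@class='EventItem__Tickets']"):
        # Join the text of all nested elements, then collapse extra whitespace.
        text = " ".join("".join(node.itertext()).split())
        prices.append(text)
    return prices


print(extract_prices(SAMPLE))
```

Because the expression selects the container rather than a single nested element, the extracted text includes everything inside it, whether that is one value or two.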

3. Modify the XPath of Loop Item - to locate the loop accurately

The auto-generated XPath of Loop Item needs to be modified; otherwise, Octoparse may fail to correctly locate the loop on different web pages.

  • Click Loop Item to open its settings

  • Input the Matching XPath as: //div[@class='Panel Panel-Border EventItem']

  • Click Apply to save the change

studhub_loop.jpg
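
The Matching XPath above compares the entire class attribute, so only elements whose class is exactly 'Panel Panel-Border EventItem' become loop items. Here is a minimal Python sketch of that behavior. The markup in SAMPLE and the function name loop_item_titles are hypothetical, and the expression has a leading dot because Python's ElementTree requires it.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two event panels and one unrelated panel.
SAMPLE = """
<root>
  <div class="Panel Panel-Border EventItem"><h3>Event A</h3></div>
  <div class="Panel Panel-Border"><h3>Sidebar</h3></div>
  <div class="Panel Panel-Border EventItem"><h3>Event B</h3></div>
</root>
"""

# Same predicate as the Matching XPath in this step.
LOOP_XPATH = ".//div[@class='Panel Panel-Border EventItem']"


def loop_item_titles(markup: str) -> list[str]:
    """Return the title of every element matched by the loop XPath."""
    root = ET.fromstring(markup)
    # Only divs whose class attribute equals the full string are matched.
    return [item.findtext("h3") for item in root.findall(LOOP_XPATH)]


print(loop_item_titles(SAMPLE))
```

The panel without EventItem in its class is skipped, which is the point of tightening the loop XPath.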

4. Create a Pagination - to load more data on the webpage

  • Click See more events at the bottom of the webpage

  • Click Loop click single button in the Tips

load_more.jpg

Now, you will see a workflow created like the one below:

workflow.jpg
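
Conceptually, this pagination keeps clicking the same button and collecting the newly loaded items until the button is no longer available. The Python sketch below only simulates that idea with plain lists. It is not how Octoparse implements it, and the names load_all_events, batches, and max_clicks are hypothetical.

```python
def load_all_events(batches, max_clicks=50):
    """Simulate clicking a "See more events" button until it disappears.

    batches[0] is what the page shows at first; each later batch is what
    one more click would load. Returns (all loaded items, clicks made).
    """
    loaded = list(batches[0]) if batches else []
    clicks = 0
    # The button is treated as present while unloaded batches remain.
    while clicks + 1 < len(batches) and clicks < max_clicks:
        clicks += 1
        loaded.extend(batches[clicks])
    return loaded, clicks


events, clicks = load_all_events([["A", "B"], ["C"], ["D", "E"]])
print(events, clicks)
```

The max_clicks cap is a safety limit for the simulation, so that a page that never stops offering more results cannot loop forever.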

5. Run the task - to get your desired data

  • Click Save on the upper right to save your task

  • Click Run next to it and wait for a Run Task window to pop up

  • Select Run on your device to run the task on your local device

  • Wait for the task to complete

Here is the sample output from a local run:

sample.jpg