This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.

Grubhub is a food delivery website. This tutorial shows you how to scrape restaurant information from Grubhub.

1.jpg

To follow along with the tutorial, use the URL below:

https://www.grubhub.com/search?orderMethod=delivery&locationMode=DELIVERY&facetSet=umamiV2&pageSize=20&hideHateos=true&searchMetrics=true&latitude=40.71277618&longitude=-74.00597382&variationId=0.5-new-gotos&sortSetId=umamiV2&sponsoredSize=3&countOmittingTimes=true
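The query string of this URL carries the search settings, including the coordinates of the delivery location. If you want to inspect or adjust them before pasting the URL into Octoparse, a minimal sketch with Python's standard `urllib.parse` module could look like the one below. Note that `with_location` is a hypothetical helper written for this illustration, and that Grubhub accepts edited `latitude` and `longitude` values is an assumption, not something this tutorial verifies.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

SEARCH_URL = (
    "https://www.grubhub.com/search?orderMethod=delivery&locationMode=DELIVERY"
    "&facetSet=umamiV2&pageSize=20&hideHateos=true&searchMetrics=true"
    "&latitude=40.71277618&longitude=-74.00597382&variationId=0.5-new-gotos"
    "&sortSetId=umamiV2&sponsoredSize=3&countOmittingTimes=true"
)


def with_location(url: str, latitude: str, longitude: str) -> str:
    """Return a copy of url with the latitude and longitude parameters replaced.

    Hypothetical helper for illustration; not part of Octoparse.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query)  # maps each name to a list of string values
    params["latitude"] = [latitude]
    params["longitude"] = [longitude]
    new_query = urlencode(params, doseq=True)  # doseq=True flattens the lists again
    return urlunsplit(parts._replace(query=new_query))


params = parse_qs(urlsplit(SEARCH_URL).query)
print(params["pageSize"][0])  # presumably the number of results per page (assumption)
print(params["latitude"][0], params["longitude"][0])
```

Calling `with_location(SEARCH_URL, "41.0", "-87.0")` returns the same URL with only the two coordinate parameters changed.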

Here are the main steps in the tutorial: [Download task file here]

  1. Enter the URL on the home page - to open the target page

  2. Create a pagination loop - scrape all the results from multiple pages

  3. Create a "Loop Item" - click into each item in the list on each page

  4. Extract Data - select the data to scrape

  5. Click the go back button - to go back to the home page

  6. Run the task - to get the desired data


1. Enter the URL on the home page - to open the target page

  • Enter the URL on the home page and click Start

2022-05-27_12-19-12.png

Tip: If you see any pop-ups on the web page, switch to Browse mode and close them manually. Remember to turn off Browse mode afterwards.


2. Create a pagination loop - scrape all the results from multiple pages

  • Scroll down the page and click the >> button

  • Click Loop click single element on the Tips panel

2.png

3. Create a "Loop Item" - click into each item in the list on each page

  • Click the Add Step + button in the workflow

  • Choose Loop

1.png

  • Click on Loop Item

  • Choose Loop Mode as Variable List

  • Enter the XPath //div[@class='s-row restaurantCard-search-body--lessInfo restaurantCard-search-body--altLayout']

  • Click Apply

4.png

  • Click the Add Step button inside the Loop Item

  • Choose Click

CLICK.png

  • In the General settings for Click Item, choose Relative XPath to the Loop Item

  • Enter the XPath /div

  • Click Apply

5.png
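To see how the two XPath expressions in this step relate to each other, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module, which supports a subset of XPath. The markup in `SAMPLE` is hypothetical and heavily simplified; the real Grubhub page is far larger and may change. The sketch assumes that Octoparse resolves the relative XPath /div against each loop item, so that it selects a div directly inside the matched card.

```python
import xml.etree.ElementTree as ET

CARD_CLASS = (
    "s-row restaurantCard-search-body--lessInfo "
    "restaurantCard-search-body--altLayout"
)

# Hypothetical, simplified markup for illustration only.
SAMPLE = f"""
<div id="results">
  <div class="{CARD_CLASS}">
    <div><a>Restaurant A</a></div>
  </div>
  <div class="{CARD_CLASS}">
    <div><a>Restaurant B</a></div>
  </div>
  <div class="s-row restaurantCard-search-body--altLayout restaurantCard-search-body--lessInfo">
    <div><a>Restaurant C</a></div>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Loop Item XPath: every div whose class attribute equals the string exactly.
cards = root.findall(f".//div[@class='{CARD_CLASS}']")

# Relative XPath /div: the first div directly inside each loop item.
names = [card.find("./div").find("a").text for card in cards]
print(names)  # ['Restaurant A', 'Restaurant B']
```

Restaurant C is skipped because its class names appear in a different order: `@class='...'` compares the whole attribute value as one string. If the loop in Octoparse finds fewer items than you expect, a changed class attribute on the page is one possible cause to check.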

4. Extract Data - select the data to scrape

  • Click the data you want to extract

  • After all the selected data turns green, click Extract data on the Tips panel

6.png

5. Click the go back button - to go back to the home page

On this website, clicking a restaurant does not open it in a new tab, so we need to add a step that clicks the Go Back button to return to the home page.

  • Click the Go Back button

  • Select Click element on the Tips panel

7.png

The final workflow will look like this:

8.png
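If it helps to read the workflow as code, the steps above can be sketched as two nested loops. This is only an illustration of the order of operations as described in this tutorial, not how Octoparse is implemented; `FAKE_PAGES` and `run_workflow` are hypothetical names, and the in-memory list stands in for the website.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical stand-in for the website: each inner list is one page of results.
FAKE_PAGES: List[List[Record]] = [
    [
        {"name": "Restaurant A", "rating": "4.5"},
        {"name": "Restaurant B", "rating": "4.1"},
    ],
    [
        {"name": "Restaurant C", "rating": "3.9"},
    ],
]


def run_workflow(pages: List[List[Record]]) -> List[Record]:
    """Visit every item on every page, mirroring the order of the workflow steps."""
    rows: List[Record] = []
    page_index = 0
    while page_index < len(pages):        # pagination loop (step 2)
        for item in pages[page_index]:    # Loop Item (step 3)
            detail = dict(item)           # Click Item: open the restaurant page
            rows.append(detail)           # Extract Data (step 4)
            # Go Back (step 5): return to the list before the next item
        page_index += 1                   # click the >> button; stop when no page is left
    return rows


rows = run_workflow(FAKE_PAGES)
print(len(rows))  # 3
```

The point of the sketch is the nesting: the go back step belongs inside the Loop Item, and the click on the >> button happens only after every item on the current page has been processed.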

6. Run the task - to get the desired data

  • Click the Save button first to save all the settings you have made

  • Then click Run to run your task either locally or in the cloud

mceclip8.png

  • Select Run on your device and click Run Now to run the task on your local device

  • Wait for the task to complete

mceclip9.png

Below is sample data from a local run. You can export it in Excel, CSV, HTML, or JSON format.

data.png
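Once exported, the data can be processed with ordinary tools. As a minimal sketch, the snippet below reads CSV text with Python's standard `csv` module and converts it to JSON. The content of `EXPORTED_CSV` is hypothetical: the real column names depend on the fields you selected in step 4, and with a real export you would read the text from the file instead.

```python
import csv
import io
import json

# Hypothetical export content for illustration only.
EXPORTED_CSV = "name,rating\nRestaurant A,4.5\nRestaurant B,4.1\n"


def csv_to_records(text: str) -> list:
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


records = csv_to_records(EXPORTED_CSV)
as_json = json.dumps(records, indent=2)
print(as_json)
```

All values come back as strings, so convert fields such as ratings to numbers yourself if you need to sort or filter on them.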

Tip: Local runs are great for quick runs and small amounts of data. If you are dealing with more complicated tasks or large amounts of data, Run in the Cloud is recommended for higher speed. You are welcome to try this premium feature by signing up for the 14-day free trial here. Tasks can be scheduled to run hourly, daily, or weekly, with the data delivered to you regularly.
