You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!
Grubhub is a website for food delivery. This tutorial will show you how to scrape restaurant information from Grubhub.
To follow along with the tutorial, you may need the URL below:
Here are the main steps in the tutorial: [Download task file here]
1. Enter the URL on the home page - to open the target page
Enter the URL on the home page and click Start
Tip: If you see any pop-ups on the web page, switch to Browse mode and close them manually. Remember to turn off Browse mode afterwards.
2. Create a pagination loop - scrape all the results from multiple pages
Scroll down the page and click the >> button
Click "Loop click single element" on the Tips panel
3. Create a "Loop Item" - to click into each item in the list
Click the Add Step + button in the workflow
Click on Loop Item
Choose Loop Mode as Variable List
Enter the XPath //div[@class='s-row restaurantCard-search-body--lessInfo restaurantCard-search-body--altLayout']
Click the Add Step button inside the Loop Item
In the General settings for Click Item, choose "Relative XPath to the Loop Item"
Enter the XPath /div
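To see what these two XPath expressions select, here is a minimal sketch using Python's standard library on a simplified, hypothetical snippet of markup (the real Grubhub page is far more complex, and the link text and URLs below are made up). It assumes the relative XPath /div means "the child div of the current loop item". Note that ElementTree supports only a subset of XPath, so the document-wide search is written with a leading dot (.//div) and the relative step as ./div.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for illustration only.
SAMPLE = """
<html><body>
  <div class="s-row restaurantCard-search-body--lessInfo restaurantCard-search-body--altLayout">
    <div><a href="/restaurant/a">Restaurant A</a></div>
  </div>
  <div class="s-row restaurantCard-search-body--lessInfo restaurantCard-search-body--altLayout">
    <div><a href="/restaurant/b">Restaurant B</a></div>
  </div>
  <div class="s-row some-other-card">
    <div><a href="/restaurant/c">Restaurant C</a></div>
  </div>
</body></html>
"""

# The class value from the tutorial. [@class='...'] is an exact string match,
# so a card with any other class string is not selected.
CARD_CLASS = (
    "s-row restaurantCard-search-body--lessInfo "
    "restaurantCard-search-body--altLayout"
)


def find_cards(root):
    """Return every div whose class attribute equals CARD_CLASS (the loop items)."""
    return root.findall(f".//div[@class='{CARD_CLASS}']")


def click_targets(cards):
    """For each loop item, return its first child div (the relative XPath step)."""
    return [card.find("./div") for card in cards]


root = ET.fromstring(SAMPLE)
cards = find_cards(root)
targets = click_targets(cards)
names = [target.find("a").text for target in targets]
print(len(cards), names)
```

Because the match on class is exact, the third card in the sample is skipped. If the site changes the class string even slightly, the loop would find no items, which is a common reason a task like this returns no data.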
4. Extract Data - select the data to scrape
Click the data you want to extract
After all the chosen data turns green, click Extract data in the tips box
5. Click the Go Back button - to go back to the home page
Because this website does not open the clicked item in a new tab, add a step that clicks the Go Back button to return to the home page.
Click the Go Back button
Select Click element in the tips box.
The final workflow will look like this:
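As a rough mental model, the control flow of such a workflow can be sketched as two nested loops. This is only an illustration under stated assumptions, not how Octoparse is implemented: the function name run_workflow, the in-memory pages structure, and the name and rating fields are all hypothetical, and it assumes the Loop Item runs inside the pagination loop before the next-page click.

```python
def run_workflow(pages):
    """Simulate the workflow on in-memory data.

    pages is a hypothetical list of result pages; each page is a list of
    dicts standing in for restaurant cards.
    """
    rows = []
    if not pages:
        return rows

    page_index = 0
    while True:                                 # pagination loop
        for item in pages[page_index]:          # Loop Item: each card on the page
            detail = dict(item)                 # Click Item: open the detail page
            rows.append({                       # Extract Data: keep selected fields
                "name": detail["name"],
                "rating": detail["rating"],
            })
            # Click Go Back: return to the list before the next card

        if page_index + 1 >= len(pages):        # no ">>" button left, stop
            break
        page_index += 1                         # click ">>" to load the next page

    return rows


sample_pages = [
    [{"name": "A", "rating": 4.5}, {"name": "B", "rating": 4.0}],
    [{"name": "C", "rating": 3.9}],
]
result = run_workflow(sample_pages)
print([row["name"] for row in result])
```

The go-back step matters because each click replaces the list page in the same tab; without returning to the list, the loop would have no next card to click.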
6. Run the task - to get the desired data
Click the Save button first to save all the settings you have made
Then click Run to run your task either locally or in the cloud
Select Run on your device and click Run Now to run the task on your local device
Wait for the task to complete
Below is sample data from a local run. Excel, CSV, HTML, and JSON formats are available for export.
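Once the data is exported, it can be processed with ordinary tools. The sketch below reads CSV text with Python's standard library and filters it. The column names (Name, Rating, Address) and the sample rows are hypothetical; the actual columns depend on the fields you selected in step 4.

```python
import csv
import io

# Hypothetical export content; a real export would be read from a file instead.
SAMPLE_CSV = (
    "Name,Rating,Address\n"
    "Restaurant A,4.5,1 Main St\n"
    "Restaurant B,4.0,2 Oak Ave\n"
)


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = load_rows(SAMPLE_CSV)

# Values arrive as strings, so convert before comparing numerically.
high_rated = [row["Name"] for row in rows if float(row["Rating"]) >= 4.5]
print(high_rated)
```

To read an actual export, open the file with open(path, newline="", encoding="utf-8") and pass the file object to csv.DictReader directly.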
Tip: Local runs are great for quick runs and small amounts of data. If you are dealing with more complicated tasks or large amounts of data, Run in the Cloud is recommended for higher speed. You are welcome to try the premium feature by signing up for the 14-day free trial here. Tasks can be scheduled hourly, daily, or weekly, and the data can be delivered on a regular basis.