Scrape information from Groupon
You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend that you upgrade, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.
Groupon is a marketplace that lists deals on local and personal services, including classes, photography, and more.
In this tutorial, we are going to show you how to scrape information about photography services from Groupon.com.
To follow along, you can use this URL:
https://www.groupon.com/browse/chicago?category=personal-services&category2=photography
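Before building the task, it can help to see how the target URL is put together. The sketch below uses only Python's standard library to split the tutorial URL into its location slug and category filters. The function name `describe_browse_url` is made up for this illustration, and the sketch only describes this one URL; it does not claim that other locations or categories follow the same pattern.

```python
from urllib.parse import urlsplit, parse_qs

TUTORIAL_URL = (
    "https://www.groupon.com/browse/chicago"
    "?category=personal-services&category2=photography"
)

def describe_browse_url(url):
    """Split a browse URL into its last path segment and its query filters."""
    parts = urlsplit(url)
    # The last path segment is the location slug ("chicago" in the tutorial URL).
    location = parts.path.rstrip("/").split("/")[-1]
    # parse_qs returns a list per key; each filter appears once here,
    # so keep only the first value.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return location, filters

location, filters = describe_browse_url(TUTORIAL_URL)
print(location)
print(filters)
```

Here the `category` and `category2` parameters carry the "personal-services" and "photography" filters that narrow the listing to the services scraped in this tutorial.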
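Note: The task file (.otd) is attached at the bottom of this tutorial for reference.
Here are the main steps in this tutorial:
- Go to Web Page - open the target web page
- Click "x" - close the ad
- Start Auto-detect - generate a workflow
- Extract Data - select the data to scrape
- Rewrite the XPath - locate the elements accurately
- Run the task - get the desired data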
1. Go To Web Page - open the target web page
- Enter the URL on the home page and click Start
2. Click "x" - close the ad
- Click "x" on the upper right corner of the ad
- Click "Click element" on the tips panel
3. Start Auto-detect - generate a workflow
- Click "Auto-detect web page data" on the tips panel
- Double-click to rename the data fields
- Untick "Add a page scroll"
- Click "Create workflow"
4. Extract Data - select the data to scrape
- Click the data you want to extract
- After all the selected data turns green, click "Extract data" in the tips box
- Edit a field name by double-clicking it
The final workflow will look like:
5. Rewrite the XPath - locate the elements accurately
To locate the target data accurately and avoid missing data, the XPath for the star and rating fields needs to be modified.
Click to change the data field into a vertical view.
Input the following XPath for each field:
- star field: //span[@id="numerical-rating"]
- rating field: //span[@class="star-rating-text"]
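To see what the two XPath expressions in this step select, the following sketch evaluates them against a small piece of sample markup using Python's standard `xml.etree.ElementTree` module. The markup is hypothetical and heavily simplified; only the `id` and `class` values come from the XPaths above. ElementTree supports a limited XPath subset and requires relative paths, so `//span[...]` is written as `.//span[...]`. This is not how Octoparse evaluates XPath internally, only an illustration of the matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for one listing. The real page is more
# complex; only the id/class values are taken from the tutorial's XPaths.
SAMPLE_CARD = """
<div class="deal-card">
  <div class="title">Portrait Photo Session</div>
  <span id="numerical-rating">4.7</span>
  <span class="star-rating-text">(128 ratings)</span>
</div>
"""

def text_or_none(element):
    """Return the stripped text of an element, or None if it is missing or empty."""
    if element is None or element.text is None:
        return None
    return element.text.strip()

def extract_rating_fields(card_xml):
    """Apply the two XPaths from this step to a single listing."""
    card = ET.fromstring(card_xml)
    # ElementTree needs a relative path, hence ".//" instead of "//".
    numerical = card.find(".//span[@id='numerical-rating']")
    rating_text = card.find(".//span[@class='star-rating-text']")
    return {
        "numerical_rating": text_or_none(numerical),
        "rating_text": text_or_none(rating_text),
    }

print(extract_rating_fields(SAMPLE_CARD))
```

Matching on the `id` and `class` attributes, rather than on an element's position, is what keeps the fields aligned when a listing has no rating: the lookup then returns nothing for that listing instead of picking up a neighboring element.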
6. Run the task - to get the desired data
- Click the Save button first to save all the settings you have made
- Then click Run and choose whether to run the task locally or in the cloud
- Select Run on your device and click Run Now to run the task on your local device
- Wait for the task to complete
Below is sample data from a local run. The data can be exported in Excel, CSV, HTML, or JSON format.
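Once the data is exported, it can be processed with any tool that reads these formats. As a minimal sketch, the Python snippet below reads a CSV export with the standard `csv` module and converts one column to a number. The rows and the column names (`title`, `rating_value`, `rating_count`) are made up for illustration; replace them with the field names you chose in step 4.

```python
import csv
import io

# Made-up rows standing in for an exported CSV. The column names are
# hypothetical and should match the field names chosen in the task.
SAMPLE_EXPORT = """title,rating_value,rating_count
Portrait Photo Session,4.7,(128 ratings)
Headshot Package,,
"""

def load_export(handle, numeric_column="rating_value"):
    """Read exported rows and convert one column to float (None if empty)."""
    rows = []
    for row in csv.DictReader(handle):
        raw = (row.get(numeric_column) or "").strip()
        # An empty cell means no value was extracted for that listing.
        row[numeric_column] = float(raw) if raw else None
        rows.append(row)
    return rows

rows = load_export(io.StringIO(SAMPLE_EXPORT))
for row in rows:
    print(row["title"], row["rating_value"])
```

To read a real export instead of the in-memory sample, pass an open file handle to `load_export`.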
If you have further issues with the task or have any suggestions, we’d love to hear about them. Submit a request here.
Writer: Eric
Editor: Fergus