
Scrape reviews from Google Play

Updated over a week ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

🐙 Did you know?

Octoparse offers a variety of preset templates for scraping data from Google Play. We strongly suggest that you test with these templates to determine if any of them meet your data requirements.

Google Play is a great platform for collecting user reviews of mobile applications. These reviews help users make informed decisions and encourage developers to improve their apps.

In this tutorial, we will teach you how to scrape reviews from Google Play.

You can find our easy-to-use Task Templates right on the Octoparse home screen. Just enter a few details, and your task will be ready in no time. For further details, please check it out here: Task Templates

To follow along, you may want to use this URL in the tutorial:

With Octoparse, we will scrape data such as the reviewer name, post time, and review content from each app details page.

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Go to Web Page - to open the target web page

  • Enter the page URL on the home screen and click Start


2. Click See all reviews - to see all the reviews

  • Click See all reviews from the web page

  • Choose Click element on the Tips panel

The workflow will be like this:


3. Auto-detect the web page data - to create the workflow

  • Click Auto-detect webpage data

  • Untick Click on a "Load More" button

  • Click Create workflow in the Tips window

  • Check the data fields in the Data Preview section, then delete or rename any fields as needed


4. Modify the XPath of the Scroll Page - to locate the scrolling area precisely

  • Click on Scroll Page

  • Select the scroll area as Partial

  • Enter the XPath //div[@jsname="bN97Pc"]

  • Choose to scroll for one screen and enter 1000 as the number of repeats

  • Tick End loop when there's no more content to load

  • Click Apply
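The scroll settings above amount to a bounded loop: scroll one screen at a time, up to the configured number of repeats, and stop early once a scroll loads nothing new. The following is only a rough Python sketch of that logic, not how Octoparse implements it. The names `scroll_until_exhausted` and `make_fake_loader` are hypothetical, and the loader is a stand-in for a page that loads reviews on scroll.

```python
from typing import Callable, List


def scroll_until_exhausted(
    load_next_screen: Callable[[], List[str]],
    max_repeats: int = 1000,
) -> List[str]:
    """Collect items by scrolling one screen at a time.

    Stops after max_repeats scrolls, or earlier when a scroll
    loads nothing new (the "end loop when no more content" idea).
    """
    collected: List[str] = []
    for _ in range(max_repeats):
        new_items = load_next_screen()
        if not new_items:  # nothing new loaded: end the loop early
            break
        collected.extend(new_items)
    return collected


def make_fake_loader(total: int, per_screen: int) -> Callable[[], List[str]]:
    """Hypothetical stand-in for a page that loads reviews on scroll."""
    state = {"served": 0}

    def load_next_screen() -> List[str]:
        start = state["served"]
        end = min(start + per_screen, total)
        state["served"] = end
        return [f"review-{i}" for i in range(start, end)]

    return load_next_screen


reviews = scroll_until_exhausted(make_fake_loader(total=25, per_screen=10))
print(len(reviews))  # 25
```

A high repeat count such as 1000 acts as an upper bound. With the early exit enabled, the loop normally ends well before reaching it.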

The final workflow should look like this:

Tip: If you want to learn more about XPath, please check the following tutorial: What is XPath and how to use it in Octoparse
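To illustrate what an XPath such as //div[@jsname="bN97Pc"] selects, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The markup is a made-up, simplified stand-in (the real Google Play page is far more complex, and its attribute values may change over time), so treat it only as a demonstration of attribute-based matching.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup used only for illustration.
SAMPLE = """
<html>
  <body>
    <div jsname="other">
      <p>Page header</p>
    </div>
    <div jsname="bN97Pc">
      <div class="review">First review</div>
      <div class="review">Second review</div>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(SAMPLE)

# ElementTree supports a limited XPath subset; ".//div[@jsname='bN97Pc']"
# plays the role of //div[@jsname="bN97Pc"]: any div whose jsname
# attribute equals "bN97Pc", at any depth.
scroll_area = root.find(".//div[@jsname='bN97Pc']")

# Direct child divs of the matched container.
review_texts = [div.text for div in scroll_area.findall("div")]
print(review_texts)  # ['First review', 'Second review']
```

The expression matches on the attribute value rather than on position in the page, which is why it can pinpoint the scrolling container even when other div elements surround it.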


5. Run extraction - run your task and get data

  • Click Save

  • Click Run on the upper right side

  • Choose a mode under Run on your device to run the task on your computer, or a mode under Run in the cloud to run it in the cloud (premium users only)

Here is the sample output.

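Once the run finishes, the extracted reviews can be processed outside Octoparse. The sketch below assumes the results were exported as a CSV file. The column names (`reviewer_name`, `post_time`, `review_content`) and the rows are invented placeholders, so adjust them to match the field names in your own task.

```python
import csv
import io

# Placeholder export; column names and rows are invented for illustration.
EXPORTED_CSV = """reviewer_name,post_time,review_content
Alice,"January 5, 2024","Works well, easy to use."
Bob,"January 7, 2024",Crashes on startup.
"""


def load_reviews(source):
    """Read review rows from a CSV file-like object into dictionaries."""
    return list(csv.DictReader(source))


# io.StringIO stands in for an opened export file here.
reviews = load_reviews(io.StringIO(EXPORTED_CSV))
for review in reviews:
    print(review["reviewer_name"], "|", review["post_time"], "|", review["review_content"])
```

For a real export, pass an opened file instead, for example `open("reviews.csv", newline="", encoding="utf-8")`, where the file name is again only an example.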