This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

Google Play hosts a large catalog of applications, along with information about each one. In this tutorial, we will scrape the basic information of applications from Google Play.

You can also use the ready-made "Task Template" on the Octoparse home screen. All you need to do is enter a few parameters, and the task is ready to run. For further details, see: Task Templates


To follow along, you may want to use this URL in the tutorial:

With Octoparse, we will scrape data such as the detail page URL, application name, author page URL, and author name.
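To make the target data concrete, here is a minimal Python sketch of how one scraped row could be modeled and written out as CSV. It is only an illustration: the class name `AppRecord`, the field names, and the placeholder values are assumptions made for this example, not Octoparse's own field names or real Google Play data.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class AppRecord:
    """One row of scraped data (hypothetical field names)."""
    detail_page_url: str
    app_name: str
    author_page_url: str
    author_name: str


def records_to_csv(records):
    """Serialize a list of AppRecord objects to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AppRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Placeholder values only; these are not real apps or real URLs.
rows = [
    AppRecord(
        detail_page_url="https://example.com/apps/details?id=com.example.one",
        app_name="Example One",
        author_page_url="https://example.com/apps/dev?id=example-dev",
        author_name="Example Dev",
    )
]
csv_text = records_to_csv(rows)
print(csv_text)
```

Each of the four fields listed above becomes one column, and each application becomes one row.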

Here are the main steps in this tutorial:

  1. Go to Web Page - to open the target web page

  2. Auto-detect the web page data - to create the workflow

  3. Modify the XPath of the Loop Item - to locate the apps accurately

  4. Start extraction - to run your task and get data

1. Go to Web Page - to open the target web page

  • Enter the page URL on the home screen and click Start


2. Auto-detect the web page data - to create the workflow

  • Click Auto-detect the web page data

  • Wait for the detection to complete

  • Check the data fields in the Data Preview section, and delete or rename any fields as needed

  • Click Create workflow in the Tips


3. Modify the XPath of the Loop Item - to locate the apps accurately

  • Click Loop Item in the workflow

  • Enter the XPath //c-wiz[@jsrenderer="PAQZbb"]
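
To see what this XPath selects, here is a minimal Python sketch that applies an equivalent expression to a small piece of sample markup. The markup is a hypothetical, heavily simplified stand-in for the real Google Play page, and the inner `a` elements are assumptions made for illustration. The standard library's `xml.etree.ElementTree` supports only a subset of XPath, so `.//` is used here in place of the leading `//`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure is much larger
# and may change over time.
SAMPLE = """
<div>
  <c-wiz jsrenderer="PAQZbb"><a href="/store/apps/details?id=com.example.one">Example One</a></c-wiz>
  <c-wiz jsrenderer="other"><a href="/store/apps/category/GAME">Games</a></c-wiz>
  <c-wiz jsrenderer="PAQZbb"><a href="/store/apps/details?id=com.example.two">Example Two</a></c-wiz>
</div>
"""

root = ET.fromstring(SAMPLE)

# Select every c-wiz element whose jsrenderer attribute equals "PAQZbb".
# Elements with a different jsrenderer value are skipped.
items = root.findall(".//c-wiz[@jsrenderer='PAQZbb']")

# Within each matched item, read the link text (assumed here to be the app name).
names = [item.find("a").text for item in items]
print(names)
```

Only the elements carrying the matching attribute value become loop items, which is why this XPath locates the apps more accurately than a broader selection would.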


The final workflow should look like this:


Tip: If you want to learn more about XPath, please check the following tutorial:

What is XPath and how to use it in Octoparse

4. Start extraction - to run your task and get data

  • Click Save

  • Click Run in the upper right corner

  • Select Run on your device to run the task on your computer, or select Run in the Cloud to run the task in the Cloud (for premium users only)


Here is the sample output.
