You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend that you upgrade, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

Quora is an American social question-and-answer website. Users collaborate and share ideas by editing questions and commenting on answers submitted by other users.

This tutorial will show you how to scrape questions from Quora.

quora.jpg

To follow along with this tutorial, you can use the URL below:

https://www.quora.com/search?q=webscraping
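
If you want to scrape results for a different keyword, the same URL pattern can be reused. The snippet below is a minimal sketch of that idea; `build_search_url` is a hypothetical helper name, and the only thing taken from this tutorial is the search URL itself with its `q` parameter.

```python
from urllib.parse import urlencode

# Base search endpoint, taken from the tutorial URL above.
QUORA_SEARCH = "https://www.quora.com/search"


def build_search_url(keyword: str) -> str:
    """Return a Quora search URL for the given keyword (hypothetical helper)."""
    # urlencode escapes spaces and special characters in the keyword.
    return f"{QUORA_SEARCH}?{urlencode({'q': keyword})}"


print(build_search_url("webscraping"))
```

The printed URL can be pasted into the Octoparse search bar in step 1.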

Here are the main steps in this tutorial: [Download task file here]

  1. Create a Go to Web Page - to open the target website

  2. Set up a page scroll - to load more data

  3. Create a Loop - to capture the list of questions from the webpage

  4. Create an Extract Data - to extract questions and related information you need

  5. Run the task


1. Create a Go to Web Page - to open the target website

  • Enter the target URL into the search bar on the home screen and click Start

quora_start.jpg

2. Set up a page scroll - to load more data

  • Click the + add step button to add a step in the workflow

  • Click Loop

create_loop.jpg
  • Set the Loop Mode as Scroll Page

  • Set Scroll way as Scroll to the bottom of the page

  • Set Scroll times, for example to 10 (each additional scroll loads more data)

  • Set the Wait time (2-3 seconds is recommended)

  • Click Apply to save the changes
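
To see why the scroll count matters, here is a small simulation of a scroll loop. It is only a sketch under stated assumptions: it does not use Octoparse or Quora, `make_fake_page` is a hypothetical stand-in for a page that reveals a fixed number of items per scroll, and `scroll_and_collect` simply mirrors the Scroll times and Wait time settings above.

```python
import time
from typing import Callable, List


def scroll_and_collect(load_more: Callable[[], List[str]],
                       scroll_times: int = 10,
                       wait_seconds: float = 2.0) -> List[str]:
    """Simulate a scroll loop: scroll, wait, and keep whatever has loaded."""
    items: List[str] = []
    for _ in range(scroll_times):
        batch = load_more()       # stands in for "scroll to the bottom of the page"
        time.sleep(wait_seconds)  # give the page time to render the new items
        items.extend(batch)
    return items


def make_fake_page(total: int, per_scroll: int) -> Callable[[], List[str]]:
    """Hypothetical page that reveals `per_scroll` items on every scroll."""
    state = {"shown": 0}

    def load_more() -> List[str]:
        start = state["shown"]
        end = min(start + per_scroll, total)
        state["shown"] = end
        return [f"question {i}" for i in range(start, end)]

    return load_more


# wait_seconds=0 keeps the simulation fast; a real page needs the 2-3 s pause.
few = scroll_and_collect(make_fake_page(100, 5), scroll_times=3, wait_seconds=0)
many = scroll_and_collect(make_fake_page(100, 5), scroll_times=10, wait_seconds=0)
print(len(few), len(many))
```

In this simulation, 3 scrolls collect 15 items and 10 scrolls collect 50. Once the fake page runs out of items, extra scrolls add nothing.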

quora1.jpg

3. Create a Loop - to capture the list of questions from the webpage

  • Click the + add step button to add a step inside the Scroll Page Loop

  • Select Loop

quora2.jpg
  • Set the Loop Mode as Variable List

  • Input the Matching XPath as: //div[contains(@class,"q-box qu-borderAll")]/div[2]/div

  • Click Apply to save the changes

xpath_loop.jpg

NOTE: If you want to learn more about Loop Item, please check out the tutorial linked here.
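
To make the Matching XPath from step 3 easier to read, the sketch below walks through what each part of it selects. It uses only Python's standard library, which does not support `contains()`, so the expression is emulated by hand. The `SAMPLE` markup is hypothetical and heavily simplified; the real Quora page is more complex and may change over time.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup used only for illustration.
SAMPLE = """
<html><body>
  <div class="q-box qu-borderAll qu-borderRadius">
    <div>Search header</div>
    <div>
      <div>What is web scraping?</div>
      <div>Is web scraping legal?</div>
      <span>not a div, so not matched</span>
    </div>
  </div>
</body></html>
""".strip()


def div_children(element):
    """Return the direct child elements that are <div> tags."""
    return [child for child in element if child.tag == "div"]


def match_loop_items(root):
    """Emulate //div[contains(@class,"q-box qu-borderAll")]/div[2]/div."""
    matches = []
    for div in root.iter("div"):                    # //div
        # contains(@class, ...) is a plain substring test on the attribute.
        if "q-box qu-borderAll" not in div.get("class", ""):
            continue
        children = div_children(div)
        if len(children) < 2:
            continue                                # no second <div> child
        matches.extend(div_children(children[1]))   # /div[2]/div
    return matches


items = match_loop_items(ET.fromstring(SAMPLE))
print([item.text for item in items])
```

Because `contains()` is a substring test, the two class names must appear next to each other and in this order in the `class` attribute. If the loop returns no items, that is the first thing to check against the live page.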


4. Create an Extract Data - to extract questions and related information you need

  • Click the + add step button to add a step in the workflow

  • Select Extract Data

quora3.jpg
  • Click on your target data field and select Extract data in the Tips panel

select.jpg
  • Double-click the data field names to rename them if needed

rename.jpg

5. Run the task

  • Click Save in the upper right corner to save your task

  • Click Run (next to Save) to run the task

  • Select Run on your device to run the task on your local device

  • Wait for the task to complete

Here is sample output from a local run:

data_preview.jpg