
Scrape data from DuckDuckGo search results

Updated over a month ago

This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.

💡 Try our pre-built DuckDuckGo Search template for faster setup!

To follow along with this tutorial, use the following example search URL:

The main steps are shown in the menu on the right and you can download the demo task file here.


1. Task Setup

  • Enter DuckDuckGo search URL

  • Click "Start" to initialize
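If you want to generate a search URL for your own keywords, the following is a minimal Python sketch. It assumes the commonly seen https://duckduckgo.com/?q=<query> pattern, which is not taken from this tutorial; the helper name build_search_url is hypothetical. Check the resulting URL in a browser before entering it in Octoparse.

```python
from urllib.parse import urlencode

def build_search_url(query):
    """Build a DuckDuckGo search URL for a query (assumed URL pattern)."""
    # urlencode escapes spaces and special characters in the query string.
    return "https://duckduckgo.com/?" + urlencode({"q": query})

url = build_search_url("web scraping tools")
print(url)
```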


2. Auto-Detect Workflow

  1. Click "Auto-detect web page data"

  2. Select "Create workflow" after detection

  3. Clean up detected fields:

    • Remove unwanted fields

    • Rename fields by double-clicking headers


3. Configure Pagination

  1. Edit Loop Item XPath for "Load More" button:
    //button[@id="more-results"]

  2. Update results container XPath:
    //ol[@class="react-results--main"]/li[@data-layout="organic"]
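To see what the two XPaths above select, here is a rough Python sketch using the standard library's xml.etree.ElementTree. The markup is a hypothetical, simplified stand-in for the real page, which is more complex and may change. ElementTree supports only a subset of XPath and needs a leading "." for document-wide searches, so the expressions are written as ".//..." here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup used only to illustrate the XPaths.
SAMPLE = """
<html><body>
  <ol class="react-results--main">
    <li data-layout="organic"><article>Result A</article></li>
    <li data-layout="ad"><article>Sponsored</article></li>
    <li data-layout="organic"><article>Result B</article></li>
  </ol>
  <button id="more-results">More results</button>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)

# The "Load More" button is matched by its id attribute.
load_more = root.findall(".//button[@id='more-results']")

# Only list items marked as organic results are kept; the ad item is skipped.
organic = root.findall(
    ".//ol[@class='react-results--main']/li[@data-layout='organic']"
)
organic_texts = [li.find("article").text for li in organic]

print(len(load_more), organic_texts)
```

Note that @class="react-results--main" is an exact match on the attribute value, so the XPath stops matching if the site adds or renames classes on that list.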


4. Refine Data Fields

Switch to Vertical View and update XPaths:

  • Title: //a[@data-testid="result-title-a"]

  • Summary: /article/div[3]/div[1]/span[last()]
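The field XPaths can be checked the same way against a single result item. The sketch below is an illustration only: the markup is a hypothetical reduction of one organic result, and the Summary path is treated as relative to the loop item, which is an assumption about how the workflow applies it. It uses Python's built-in xml.etree.ElementTree, so the paths start with ".".

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one loop item; the real structure may differ.
ITEM = """
<li data-layout="organic">
  <article>
    <div>example.com</div>
    <div><h2><a data-testid="result-title-a">Example title</a></h2></div>
    <div><div><span>Jan 1, 2024</span><span>Example summary text.</span></div></div>
  </article>
</li>
""".strip()

item = ET.fromstring(ITEM)

# Title: the link is located by its data-testid attribute, wherever it sits.
title = item.find(".//a[@data-testid='result-title-a']").text

# Summary: a positional path; span[last()] picks the final span,
# skipping the optional date span that may precede it.
summary = item.find("./article/div[3]/div[1]/span[last()]").text

print(title, "|", summary)
```

The Title XPath depends on an attribute, while the Summary XPath depends on element positions, so the Summary path is the more likely of the two to need updating if the page layout changes.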


5. Optimize Workflow

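To prevent duplicate data, move the Extract Data step outside the pagination loop. Clicking "Load More" appends results to the same page, so extracting inside the loop would capture earlier results again on every pass.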
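The effect of this step can be illustrated with a small Python simulation. It is a sketch under the assumption that each "Load More" click appends a new batch of results to the page already loaded; the batches and function names are hypothetical and do not represent Octoparse internals.

```python
# Hypothetical batches of results, one per "Load More" click.
batches = [["r1", "r2"], ["r3", "r4"], ["r5"]]

def extract_inside_loop(batches):
    """Extract after every click: earlier results are read again each time."""
    page, rows = [], []
    for batch in batches:
        page.extend(batch)   # the click appends new results to the same page
        rows.extend(page)    # extraction re-reads everything currently visible
    return rows

def extract_after_loop(batches):
    """Click through everything first, then extract once."""
    page = []
    for batch in batches:
        page.extend(batch)
    return list(page)

inside = extract_inside_loop(batches)
after = extract_after_loop(batches)
print(len(inside), len(after))
```

With three clicks and five distinct results, extracting inside the loop yields 11 rows, while extracting once after the loop yields 5.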


6. Run & Export Data

  • Save your workflow

  • Run in Standard Mode (local) or Cloud

  • Export formats available:

    • Excel

    • CSV

    • HTML

    • JSON
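Once exported, the data can be post-processed outside Octoparse. The following is a minimal Python sketch that reads CSV rows and drops exact duplicates as a safety net. The column names Title and Summary mirror the fields in this tutorial, but the sample content and the helper load_unique_rows are hypothetical, and a real export may contain additional columns.

```python
import csv
import io

# Hypothetical export content standing in for a real CSV file.
EXPORTED_CSV = """Title,Summary
Example A,First summary
Example B,Second summary
Example A,First summary
"""

def load_unique_rows(handle):
    """Read exported rows and drop exact duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for row in csv.DictReader(handle):
        key = (row["Title"], row["Summary"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = load_unique_rows(io.StringIO(EXPORTED_CSV))
print(len(rows))
```

For a real export, pass a file handle opened with newline="" and the file's actual encoding instead of the io.StringIO object.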

Sample data:

[Screenshot: data_overview.jpg]