Scraping search engine results is a good way to collect information on a single topic. In this tutorial, we will show you how to scrape search result data from Google.
You can go to "Task Templates" on the Octoparse home screen and start with the ready-to-use Google Search Template to save time. With this template, there is no need to configure a scraping task. For further details, see: Task Templates
You can also use our Advanced Mode to create your own task. To follow along, use this URL: https://www.google.com/
We will scrape data such as the title, URL, and description from the search results page with Octoparse.
Here are the main steps in this tutorial: [Download demo task file here]
- Open the targeted web page
- Auto-detect the web page to create steps to enter text
- Modify the settings for the "Click Item"
- Auto-detect the search result page to scrape data
- Set up wait time to slow down the scraping speed
- Save and start to run the task and get data
1) Open the targeted web page
- Enter the URL on the home page and click Start
2) Auto-detect the web page to create steps to enter text
- Click "Auto-detect web page data" and wait for the detection to complete
- Choose "Search with keywords" on the Tips panel and you will see instructions to help you set up steps
a. "Add a search box": click "Settings" and select the search box on the web page
b. "Add keyword(s)": click the box and enter the keyword(s), one keyword per line
- Click "Confirm" to generate the workflow
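If it helps to picture what "one keyword per line" means, each line becomes its own search. The sketch below is only an illustration and not how Octoparse works internally. It assumes Google's public `/search?q=` URL form, and the function name `keywords_to_urls` is hypothetical.

```python
from urllib.parse import urlencode


def keywords_to_urls(text: str) -> list[str]:
    """Turn text with one keyword per line into one search URL per keyword."""
    urls = []
    for line in text.splitlines():
        keyword = line.strip()
        if not keyword:  # ignore blank lines between keywords
            continue
        # urlencode escapes spaces and special characters in the keyword
        urls.append("https://www.google.com/search?" + urlencode({"q": keyword}))
    return urls


if __name__ == "__main__":
    for url in keywords_to_urls("web scraping\noctoparse tutorial\n"):
        print(url)
```

Each keyword yields a separate results page, which is why the workflow loops through the keyword list.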
3) Modify the settings for the "Click Item"
- Double-click the "Click Item" to enter the Action Settings panel
- Tick "Open in a new tab"
- Extend the AJAX Load timeout
4) Auto-detect the search result page to scrape data
- Auto-detect the page again
- Click "Create workflow"
- Rename the fields or delete fields you don't want
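Renaming and deleting fields simply reshapes each scraped row. The following minimal sketch shows the idea on a single row. The auto-generated names `Field1` to `Field4` and the helper `tidy_fields` are hypothetical; the names Octoparse actually generates will differ.

```python
def tidy_fields(row: dict, rename: dict, drop: set) -> dict:
    """Return a copy of row without the dropped fields and with the rest renamed."""
    return {
        rename.get(name, name): value
        for name, value in row.items()
        if name not in drop
    }


# Hypothetical row as it might look before any fields are renamed or deleted.
raw_row = {
    "Field1": "Example title",
    "Field2": "https://example.com/",
    "Field3": "Example description",
    "Field4": "",
}

clean_row = tidy_fields(
    raw_row,
    rename={"Field1": "title", "Field2": "url", "Field3": "description"},
    drop={"Field4"},
)
print(clean_row)
```

Keeping only the title, URL, and description makes the exported data easier to read and work with.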
5) Set up wait time to slow down the scraping speed
- Double-click the Extract Data action
- Tick "Wait before action"
- Set the wait time to 1s-3s
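The idea behind a 1s-3s wait can be sketched in a few lines. This assumes the setting means a pause of somewhere between 1 and 3 seconds before each extraction; how Octoparse picks the exact value is not stated here, and the function name `wait_before_action` is hypothetical.

```python
import random
import time


def wait_before_action(min_s: float = 1.0, max_s: float = 3.0, sleep=time.sleep) -> float:
    """Pause for a random duration between min_s and max_s seconds."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)  # the sleep function is a parameter so it can be replaced in tests
    return delay


if __name__ == "__main__":
    waited = wait_before_action()
    print(f"Waited {waited:.2f}s before extracting data")
```

A varied pause slows the task down so that requests do not reach the site in rapid succession.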
6) Save and start to run the task and get data
- Click "Run" on the upper left side
- Select "Run on your device" to run the task on your computer, or select "Run in the Cloud" to run the task in the Cloud (for premium users only)
Here is the sample output.