Lead generation is one of the most important parts of any sales process. Yellowpages is a useful data source for companies in many industries that want to collect leads. In this tutorial, we will show you how to scrape leads from Yellowpages.
Octoparse also offers an easy-to-use Yellowpages "Task Template" on the main screen of the scraping tool. All you need to do is enter a few parameters, and the task is ready to go. For further details, see: Task Templates
Please follow the steps below if you want to know how to build a task from scratch with Octoparse. We will use the URL below to scrape data such as title, address, telephone, etc.
Here are the main steps in this tutorial: [Download demo task file here]
1) "Go to Web Page" - Open the targeted web page
Enter the URL on the home page and click Start
2) Auto-detect the web page - create a workflow
Click "Auto-detect web page data" and wait for the detection to complete
Go to "Data preview" to see if you're okay with the current data output
Delete unnecessary data fields by clicking the trash icon
Modify the data field names here directly by clicking the pen icon
Uncheck "Add a page scroll"
Click "Create workflow"
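The cleanup done in the data preview (deleting fields and renaming them) can be pictured as a small transformation on the extracted rows. The Python sketch below is only an illustration of that idea and is not part of Octoparse; the function name tidy_fields and the field names Field1, Field2 and Field3 are hypothetical placeholders.

```python
def tidy_fields(rows, drop, rename):
    """Return new rows without the `drop` fields and with `rename` applied."""
    cleaned = []
    for row in rows:
        cleaned.append({
            rename.get(name, name): value  # keep the old name if no new one is given
            for name, value in row.items()
            if name not in drop
        })
    return cleaned


# Hypothetical auto-detected output; real field names will differ.
rows = [{"Field1": "Acme Plumbing", "Field2": "123 Main St", "Field3": "Ad"}]
tidy = tidy_fields(
    rows,
    drop={"Field3"},
    rename={"Field1": "title", "Field2": "address"},
)
print(tidy)
```

The same function could be applied to an exported file if fields need to be adjusted after the run.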
If all the data you need can be scraped from the listing page, you can jump straight to step 5, "Set up wait time to slow down the scraping speed". If you want to click on each detail link to get more information, please follow the next step.
3) Click into each detail link to scrape more information
Choose "Click on link(s) to scrape the linked page(s)" on the Tips panel
Select "Click on an extracted data field" and select the field you want to click on from the drop-down menu (you can confirm whether it is the correct link in the Data Preview)
Click on "Confirm"
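Conceptually, clicking on an extracted data field means taking the link stored in one field of each listing row and opening the page it points to. A minimal Python sketch of that idea follows; it is an assumption about the general technique, not a description of how Octoparse works internally. The function detail_urls, the field name title_url and the example.com URL are all hypothetical.

```python
from urllib.parse import urljoin


def detail_urls(listing_url, rows, link_field):
    """Resolve each row's link field against the listing page URL."""
    urls = []
    for row in rows:
        href = row.get(link_field)
        if href:  # skip rows where no link was extracted
            urls.append(urljoin(listing_url, href))
    return urls


# Placeholder listing URL and rows; not real Yellowpages data.
listing_url = "https://example.com/search?terms=plumber"
rows = [
    {"title": "Acme Plumbing", "title_url": "/biz/acme-plumbing"},
    {"title": "No Link Co", "title_url": ""},
]
urls = detail_urls(listing_url, rows, "title_url")
print(urls)
```

Resolving with urljoin handles both relative and absolute links, which is why the row with an empty link is simply skipped rather than treated as an error.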
4) Extract Data - extract data on the detail pages
Select information from the web page
Choose "Extract text of the element"
Repeat the above steps to extract all the data you need
Double-click on the field name to rename it if needed
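To illustrate what extracting the text of an element amounts to, here is a rough Python sketch using only the standard library. It assumes a made-up HTML snippet; the class names biz-name and phone, the helper extract_text and the sample values are hypothetical and do not reflect the real Yellowpages markup or Octoparse internals.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextOfElement(HTMLParser):
    """Collect the text inside the first element carrying a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0       # greater than 0 while inside the target element
        self.done = False    # True once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self.parts).split())


def extract_text(html, css_class):
    parser = TextOfElement(css_class)
    parser.feed(html)
    return parser.text()


# Made-up detail page fragment.
page = ('<div><h1 class="biz-name">Acme <b>Plumbing</b></h1>'
        '<p class="phone"> (555) 010-0000 </p></div>')
record = {
    "title": extract_text(page, "biz-name"),
    "telephone": extract_text(page, "phone"),
}
print(record)
```

Each key in record plays the role of a renamed data field, and each value is the text of the selected element.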
5) Set up wait time to slow down the scraping speed
Since Yellowpages might block your IP if you send too many requests in a short time, we need to control the scraping speed.
Click on the "Extract Data1" action
Tick "Wait before action" under "Options"
Set the wait time to 5s-10s
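The effect of a 5s-10s wait before each action can be sketched in Python as a pause of random length before every step. This is only an illustration of the throttling idea, under the assumption that the wait is picked somewhere within the range; the names wait_before_action and run_throttled are hypothetical and are not Octoparse functions.

```python
import random
import time


def wait_before_action(low=5.0, high=10.0, sleep=time.sleep):
    """Pause for a random duration between `low` and `high` seconds."""
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


def run_throttled(items, action, low=5.0, high=10.0, sleep=time.sleep):
    """Apply `action` to each item, waiting before every call."""
    results = []
    for item in items:
        wait_before_action(low, high, sleep)
        results.append(action(item))
    return results


# Demo with a fake sleep that records the delays instead of pausing.
waits = []
results = run_throttled(["page-1", "page-2"], str.upper, sleep=waits.append)
print(results, len(waits))
```

Passing the sleep function as a parameter keeps the demo instant; leaving it at the default time.sleep gives the real pauses.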
6) Run extraction - run your task and get data
Click "Run" on the upper left side
Select "Run on your device" to run the task on your computer, or select "Run task in the Cloud" to run the task in the Cloud (for premium users only)
Here is the sample output: