In this tutorial, we will show you how to collect business details on Yell.com with Octoparse.
To demonstrate, we will use the URL below as an example.
We will scrape data such as the title, address, rating, phone number, and reviews from each business detail page.
Here are the main steps in this tutorial:
- Go To Web Page - open the target web page in Octoparse
- Create a pagination loop - enable Octoparse to scrape across all available pages
- Create a "Loop Item" - build a loop over all the businesses on the page, then click into each one
- Extract data - select the data fields needed for the extraction
- Start extraction - run the task and get data
1. Go To Web Page - open the target web page
- Click "+ Task" to start a task using the Advanced Mode
Advanced Mode is a powerful and flexible way to scrape. It lets you build your own workflow to accommodate your specific scraping requirements.
- Copy and paste the target URL into the "Extraction URL" box and click "Save URL" to move on
2. Create a pagination loop - scrape multiple listing pages
- In the built-in browser, scroll down to locate the "Next Page" button then click on it.
- When prompted, click "Loop click next page" on the "Action Tips" panel.
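For readers who want to see the idea behind a pagination loop in code, here is a minimal Python sketch. It is not how Octoparse works internally, which this tutorial does not describe. The `PAGES` dictionary, the URLs, and the `class="next"` markup are all hypothetical stand-ins for a real site; the point is only that the loop keeps following the "Next Page" link until the link is no longer there.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "site": URL -> HTML. A real run would fetch pages over HTTP.
PAGES = {
    "/results?page=1": '<a class="next" href="/results?page=2">Next</a>',
    "/results?page=2": '<a class="next" href="/results?page=3">Next</a>',
    "/results?page=3": "<p>No more results</p>",
}


class NextLinkParser(HTMLParser):
    """Records the href of the first <a class="next"> link (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.next_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "next" and self.next_url is None:
            self.next_url = attrs.get("href")


def find_next(html):
    """Return the next-page URL found in the HTML, or None if there is none."""
    parser = NextLinkParser()
    parser.feed(html)
    return parser.next_url


def paginate(start_url, pages):
    """Visit pages in order, following the next-page link until it disappears."""
    visited = []
    url = start_url
    # The "not in visited" check guards against a next link that loops back.
    while url is not None and url not in visited:
        visited.append(url)
        url = find_next(pages[url])
    return visited


visited_pages = paginate("/results?page=1", PAGES)
print(visited_pages)
```

In the sketch, the last page has no next link, so the loop ends on its own, which matches the stopping behavior you would expect from a "Loop click next page" action.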
3. Create a "Loop Item" - build a loop over all the businesses on the page
- Click on the titles of any two businesses. Octoparse should automatically identify the list of all businesses available on the webpage.
- Then, select "Loop click each element" on the "Action Tips" panel. This tells Octoparse to click into each business on the list one by one.
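The idea of a "Loop Item" can be illustrated with a short Python sketch: every business title on the listing page shares some common pattern, so collecting all elements that match it gives the list to loop over. How Octoparse infers that pattern from two clicks is not covered here; the sketch simply assumes the titles share a class name. The `LISTING_HTML` snippet, the class `business-title`, and the `/biz/...` links are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; class names and URLs are invented for illustration.
LISTING_HTML = """
<div class="listing"><a class="business-title" href="/biz/alpha-cafe">Alpha Cafe</a></div>
<div class="listing"><a class="business-title" href="/biz/beta-books">Beta Books</a></div>
<div class="ad"><a class="promo" href="/ads/1">Sponsored</a></div>
<div class="listing"><a class="business-title" href="/biz/gamma-gym">Gamma Gym</a></div>
"""


class TitleLinkParser(HTMLParser):
    """Collects the href of every <a> whose class matches the given name."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == self.css_class:
            self.links.append(attrs.get("href"))


def collect_items(html, css_class="business-title"):
    """Return the links of all items on the page that share the same pattern."""
    parser = TitleLinkParser(css_class)
    parser.feed(html)
    return parser.links


business_links = collect_items(LISTING_HTML)
for link in business_links:
    # A real scraper would open each detail page here, one by one.
    print("would open:", link)
```

Note that the sponsored link is skipped because it does not match the shared pattern, which is the same reason it helps to click on two genuine business titles when building the loop.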
4. Extract data - select the data fields needed for the extraction
- Once you arrive at the business detail page, click on the target data fields one at a time. Select "Extract text of the selected element" when prompted.
- To extract the reviews, we'll need to create a loop for all the reviews available on the page. Click on any two reviews to have the list of all reviews identified.
- Next, select "Extract text from selected elements" on the "Action Tips" panel so that all the reviews are extracted in a loop.
- Rename the fields as needed.
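The difference between extracting single fields and looping over reviews can be sketched in Python as follows. Single fields (title, address, rating, phone) each yield one value, while reviews are gathered into a list because a page can hold any number of them. The `DETAIL_HTML` snippet, its class names, and all the values in it are made up for illustration, and the parser assumes each target element contains plain text with no nested tags.

```python
from html.parser import HTMLParser

# Hypothetical detail-page markup; every class name and value is made up.
DETAIL_HTML = """
<h1 class="title">Alpha Cafe</h1>
<span class="address">1 High Street, London</span>
<span class="rating">4.5</span>
<span class="phone">020 7946 0000</span>
<p class="review">Great coffee.</p>
<p class="review">Friendly staff.</p>
"""

SINGLE_FIELDS = {"title", "address", "rating", "phone"}


class DetailParser(HTMLParser):
    """Extracts one value per single field and a list of all reviews."""

    def __init__(self):
        super().__init__()
        self.record = {"reviews": []}
        self._current = None  # class of the element currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in SINGLE_FIELDS or css_class == "review":
            self._current = css_class

    def handle_data(self, data):
        text = data.strip()
        if not self._current or not text:
            return
        if self._current == "review":
            # Reviews repeat, so they are appended in a loop-like fashion.
            self.record["reviews"].append(text)
        else:
            self.record[self._current] = text

    def handle_endtag(self, tag):
        # Assumes target elements have no nested tags.
        self._current = None


def extract_details(html):
    """Return a dict of the single fields plus a 'reviews' list."""
    parser = DetailParser()
    parser.feed(html)
    return parser.record


record = extract_details(DETAIL_HTML)
print(record)
```

The keys of `record` play the same role as the field names in the task, so renaming a field corresponds to changing a key.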
5. Start extraction - run the task and get data
- Click "Save"
- Click "Start Extraction" on the upper left side
- Select "Local Extraction" to run the task on your computer, or select "Cloud Extraction" to run the task in the Cloud (for premium users only)
Here is a sample of the extracted data.
Happy data hunting!