Octoparse

<code>You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade <a href="https://www.octoparse.com/download" rel="nofollow noopener noreferrer" target="_blank">here</a> if you haven't already done so!</code>

Clutch is a leading ratings and reviews platform for B2B service providers, featuring companies in over 100 countries and 500 industries. Clutch categorizes companies by geographic location, field of expertise, and proven skills. Based on the data it gathers, Clutch formulates a fair rating for each firm.

This tutorial will show you how to scrape a company listing page for company details from clutch.co with Octoparse.

The sample URL we will use in this tutorial is:

<a href="https://clutch.co/agencies/digital?geona_id=26487" rel="nofollow noopener noreferrer" target="_blank">https://clutch.co/agencies/digital?geona_id=26487</a>

The main steps are shown in the menu on the right, and you can download the sample task file <a href="https://drive.google.com/file/d/1Aha2lHp45nRkvFzg1yD0O8Dtb94LNg-d/view?usp=drive_link" rel="nofollow noopener noreferrer" target="_blank">here</a>.

___________________________________________________________

1. Create a Go to Web Page - to open the target web page

- Enter the page URL on the home screen and click Start to create a new task

2. Set up Pagination Loop - to scrape data from multiple listing pages

To extract data from every listing page, set up pagination first. Scroll to the bottom of the page to find the next-page button, then:

- Click on the next> button
- Select Loop click next page on the Tips panel
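
The pagination loop that Octoparse builds here can be pictured as a simple "extract, then click next, until there is no next button" loop. The sketch below illustrates that idea only; it is not Octoparse's implementation, and the `PAGES` data and the `scrape_all_pages` function are hypothetical stand-ins for real listing pages.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory "site": each page lists its items and names the next page.
PAGES: Dict[str, dict] = {
    "page-1": {"items": ["Company A", "Company B"], "next": "page-2"},
    "page-2": {"items": ["Company C", "Company D"], "next": "page-3"},
    "page-3": {"items": ["Company E"], "next": None},  # last page: no next button
}


def scrape_all_pages(start: str, max_pages: int = 100) -> List[str]:
    """Collect items page by page, following the next link until it runs out."""
    collected: List[str] = []
    current: Optional[str] = start
    visited = 0
    # max_pages is a safety cap so a broken next link cannot loop forever.
    while current is not None and visited < max_pages:
        page = PAGES[current]
        collected.extend(page["items"])  # extract data from the current page
        current = page["next"]           # "click" the next button
        visited += 1
    return collected
```

Calling `scrape_all_pages("page-1")` walks all three hypothetical pages in order and stops when a page has no next link.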

3. Create Loop Item - to go through all the companies

- Click on any of the company names, and all similar titles are highlighted in red.
- Click Select all similar elements on the Tips panel.

- Select Text on the Tips panel

A Loop Item is generated in your workflow, covering all 50 companies on the page.

Note: If the loop contains more than 50 items, it probably includes sponsored results or ads from the page as well.

In that case, change the Loop Item XPath to the following so that sponsored results are excluded: //li[@data-type="Directory"]
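
To see why this XPath filters out sponsored entries, here is a minimal sketch using Python's standard library. The sample markup is an assumption made up for illustration (in particular the `data-type="Sponsor"` value and the `h3` titles); the real Clutch page structure may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup; not copied from clutch.co.
SAMPLE = """
<ul>
  <li data-type="Sponsor"><h3>Sponsored Agency</h3></li>
  <li data-type="Directory"><h3>Agency One</h3></li>
  <li data-type="Directory"><h3>Agency Two</h3></li>
</ul>
"""


def directory_names(markup: str) -> list:
    """Return the titles of list items whose data-type attribute is 'Directory'."""
    root = ET.fromstring(markup.strip())
    # ElementTree needs a relative path, so the tutorial's //li becomes .//li
    items = root.findall(".//li[@data-type='Directory']")
    return [li.findtext("h3") for li in items]


if __name__ == "__main__":
    print(directory_names(SAMPLE))
```

Only the `li` elements carrying `data-type="Directory"` match the attribute predicate, so the sponsored entry never enters the loop.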

4. Extract more data - to extract other information about the companies

To extract information other than the company name:

- Click on your desired data (Location in this case)
- Select Text on the Tips panel

The new data field is added to the Data Preview section.
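
Conceptually, each extra field is read relative to the current loop item, so every company produces one row with several columns. The sketch below shows that idea under stated assumptions: the markup, the `location` class name, and the `extract_rows` function are hypothetical and do not reflect the actual Clutch page or Octoparse internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing markup with a name and an optional location per company.
SAMPLE = """
<ul>
  <li data-type="Directory">
    <h3>Agency One</h3>
    <span class="location">London, England</span>
  </li>
  <li data-type="Directory">
    <h3>Agency Two</h3>
  </li>
</ul>
"""


def extract_rows(markup: str) -> list:
    """Build one row per loop item, reading each field relative to that item."""
    root = ET.fromstring(markup.strip())
    rows = []
    for li in root.findall(".//li[@data-type='Directory']"):
        rows.append({
            "name": (li.findtext("h3") or "").strip(),
            # A missing element yields an empty string instead of an error.
            "location": (li.findtext("span[@class='location']") or "").strip(),
        })
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

Reading fields relative to the loop item keeps each location attached to the right company, even when some companies lack that field.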

5. Set Wait before Action - to make sure data is fully loaded

Wait before action is an option that can be set on every action in the workflow. It makes the task pause for a set time before the action is executed.

In this case, add a Wait before action to the Loop Item in the workflow so that the listing data has time to load.

- Click on Loop Item first and then click on Options
- Tick Wait before action and choose 3s 
- Click Apply
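
The effect of a wait setting can be sketched as "pause, then run the action". The example below is a rough illustration only, not how Octoparse schedules actions; `run_workflow` and the step names are hypothetical, and the `sleep` parameter exists so the pause can be swapped out when experimenting.

```python
import time
from typing import Any, Callable, List, Tuple

# A step is (name, action, seconds to wait before running the action).
Step = Tuple[str, Callable[[], Any], float]


def run_workflow(steps: List[Step],
                 sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, Any]]:
    """Run steps in order, pausing before each step that has a wait configured."""
    results = []
    for name, action, wait_seconds in steps:
        if wait_seconds > 0:
            sleep(wait_seconds)  # give the page time to finish loading
        results.append((name, action()))
    return results


# Hypothetical workflow: only the Loop Item step has a 3-second wait.
EXAMPLE_STEPS: List[Step] = [
    ("Go to Web Page", lambda: "page opened", 0.0),
    ("Loop Item", lambda: "items extracted", 3.0),
]
```

Calling `run_workflow(EXAMPLE_STEPS)` pauses for three seconds before the Loop Item step and runs the first step immediately.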

6. Run the task - to get your desired data

- Click Save on the upper right side to save your task
- Click Run next to it and wait for the Run Task window to pop up
- Select Run on your device to run the task on your local device
- Wait for the task to complete

Here is a sample output from a local run:
