You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend that you upgrade, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
Coinmarketcap is a website that provides free access to current and historical data for Bitcoin and thousands of altcoins. From the website, users can get an overall view of the coin market.
In this tutorial, we are going to scrape crypto coin prices and other info from Coinmarketcap.
Sample URL: https://coinmarketcap.com/
Here are the steps in this tutorial: [Download task file here]
1. Create a Go to Web Page - to open the target website
Create your task by inputting the URL in the search box on the homepage
Click the "Start" button to move on
2. Auto-detect web page data - create a workflow
Click Auto-detect web page data and wait for it to complete
Go to Data preview to remove unwanted data fields or rename them if needed
Delete unwanted data fields directly by clicking the delete icon
Modify the data field names by double-clicking the headers
Click Create a workflow to generate a workflow
The automatically generated workflow looks like the one below:
3. Modify the action order - to correctly load more information
In this case, we need to drag the Loop Item out of the Scroll Page step so that data is extracted only after the whole page has loaded. We also need to set Scroll Page to scroll one screen at a time.
Drag the Loop Item out of Scroll Page and place it below that step
Set Scroll for one screen
Set Repeats to 30 times
TIP: For more tutorials on page scroll-down settings, check here.
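To see why the order matters, here is a minimal Python sketch of the idea behind step 3. It is a toy model, not Octoparse code: the `LazyPage` class, the row counts, and the function names are all hypothetical and only imitate a page that reveals more rows each time it is scrolled.

```python
from typing import List


class LazyPage:
    """Toy stand-in for a page that loads more rows on every scroll (hypothetical)."""

    def __init__(self, total_rows: int, rows_per_screen: int) -> None:
        self.total_rows = total_rows
        self.rows_per_screen = rows_per_screen
        # Only the first screen is loaded before any scrolling happens.
        self.loaded = min(total_rows, rows_per_screen)

    def scroll_one_screen(self) -> None:
        # Each scroll reveals one more screen of rows, up to the total.
        self.loaded = min(self.total_rows, self.loaded + self.rows_per_screen)

    def visible_rows(self) -> List[str]:
        return [f"coin-{i + 1}" for i in range(self.loaded)]


def scroll_then_extract(page: LazyPage, repeats: int) -> List[str]:
    """Scroll `repeats` times first, then read every loaded row once."""
    for _ in range(repeats):
        page.scroll_one_screen()
    return page.visible_rows()


if __name__ == "__main__":
    # Assumed numbers for illustration: 100 rows in total, 10 rows per screen.
    without_scroll = LazyPage(100, 10).visible_rows()
    with_scroll = scroll_then_extract(LazyPage(100, 10), repeats=30)
    print(len(without_scroll), len(with_scroll))
```

In this toy model, extracting without scrolling sees only the first screen of rows, while scrolling first and extracting afterwards sees all of them. Placing the Loop Item after the Scroll Page step follows the same idea.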
4. Run your task - get the data you want
Click the Save button first to save all the settings you have made
Then click Run to run your task either locally or in the Cloud
Select Run on your device and click Run Now to run the task on your local device
Wait for the task to complete
Below is sample data from the local run:
TIP: Local runs are great for quick runs and small amounts of data. If you are dealing with more complicated tasks or large amounts of data, Run in the Cloud is recommended for higher speed. You are very welcome to try this premium feature by signing up for the 14-day free trial here. Tasks can be scheduled hourly, daily, or weekly, with data delivered regularly.