Scrape cryptocurrency information from Yahoo Finance
A cryptocurrency is a digital or virtual currency that is secured by cryptography, which makes it nearly impossible to counterfeit or double-spend. Many cryptocurrencies are decentralized networks based on blockchain technology, a distributed ledger enforced by a disparate network of computers.
Cryptocurrency traders need to monitor price fluctuations closely, because prices can change within seconds. Octoparse can run a scraping task on a schedule so that the data stays up to date.
In this tutorial, we are going to show you how to scrape cryptocurrency information from Yahoo Finance.
For Yahoo Finance, you can also use the easy-to-use "Task Template" on the main screen of the Octoparse scraping tool. All you need to do is enter a few parameters and the task is ready to go. For further details, see: Task Templates
To follow along, you can use this URL:
https://finance.yahoo.com/cryptocurrencies?count=50&offset=0
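The URL above carries two query parameters, `count` and `offset`. As a minimal sketch, assuming that `offset` is the index of the first row and advances by `count` from one page to the next (an assumption about the site, not something Octoparse requires you to know), the listing URLs could be generated like this. The helper name `page_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://finance.yahoo.com/cryptocurrencies"


def page_url(page_index, count=50):
    """Build the listing URL for a zero-based page index.

    Assumption: `offset` is the index of the first row on the page,
    so it grows by `count` for every page.
    """
    query = urlencode({"count": count, "offset": page_index * count})
    return f"{BASE_URL}?{query}"


# Print the URLs of the first three pages.
for index in range(3):
    print(page_url(index))
```

In this tutorial Octoparse handles pagination by clicking the "Next" button, so the sketch is only meant to show how the URL is structured.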
We will scrape data such as the Symbol and Name from the cryptocurrency chart with Octoparse.
Here are the main steps in this tutorial: [Download task file here]
- Go To Web Page - to open the targeted web page
- Auto-detect web page data - to close the pop-up
- Auto-detect web page data - to create the workflow
- Extract data - to modify the data fields
- Modify XPath of Pagination - to fix endless scraping
- Start extraction - to run the task and get data
1. "Go To Web Page" - to open the targeted web page
- Enter the page URL on the home screen and click "Start" to create a new task
2. Auto-detect web page data - to close the pop-up
- Choose "Auto-detect web page data" and wait for the detection to complete
- Choose "Close a popup" on the Tips panel
- Select the "I agree" button and confirm
- Double-click the "Click" action, or click its settings icon, to open the action settings
- Extend the AJAX timeout to 7-10s
3. Auto-detect web page data - to create the workflow
- Choose "Auto-detect web page data" again and wait for the detection to complete
- Click "Switch auto-detect results" on the Tips panel to locate the chart
- Uncheck "Add a page scroll"
- Click "Create workflow"
- Open the settings of the "Click to Paginate" action
- Extend the AJAX timeout to 7-10s
4. Extract data - to modify the data fields
- Open the settings of the "Extract Data" action
- Delete unwanted fields by clicking the delete icon
- Rename the fields by clicking the field name
5. Modify XPath of Pagination - to fix endless scraping
The auto-generated pagination XPath needs to be modified; otherwise the task never stops, because Octoparse keeps scraping the last page over and over. The new XPath matches the "Next" button only while it is not disabled, so nothing is left to click on the last page. Check out details about this issue here.
- Open the settings of the Pagination action
- Input the new XPath: //button[not(@disabled)]//span[text()="Next"]
- Click OK to confirm
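To see why the XPath above ends the pagination loop, here is a minimal sketch that applies the same matching rule to two simplified pagination bars. The markup is hypothetical and much simpler than Yahoo Finance's real HTML. Python's built-in ElementTree does not support `not()` in XPath, so the attribute check is emulated in plain Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified pagination markup (not Yahoo Finance's real HTML).
MIDDLE_PAGE = """
<div>
  <button><span>Prev</span></button>
  <button><span>Next</span></button>
</div>
"""
LAST_PAGE = """
<div>
  <button><span>Prev</span></button>
  <button disabled="disabled"><span>Next</span></button>
</div>
"""


def find_next_span(markup):
    """Emulate //button[not(@disabled)]//span[text()="Next"].

    Returns the matching <span> element, or None when no enabled
    "Next" button exists.
    """
    root = ET.fromstring(markup.strip())
    for button in root.iter("button"):
        if "disabled" in button.attrib:
            continue  # skip the disabled button shown on the last page
        for span in button.iter("span"):
            if span.text == "Next":
                return span
    return None


print(find_next_span(MIDDLE_PAGE) is not None)  # True: there is a page to go to
print(find_next_span(LAST_PAGE) is not None)    # False: pagination stops here
```

On the last page the "Next" button is still present but disabled, so an XPath without the `not(@disabled)` condition would keep matching it.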
6. Start extraction - to run the task and get data
- Click "Save"
- Click "Run" on the upper left side
- Select "Run on your device" to run the task on your computer, or select "Run in the Cloud" to run the task in the Cloud (for premium users only). You can also schedule the task to update the data regularly.
You can export the result data in formats such as Excel, CSV, or JSON, or export it to your database.
Here is the sample output.
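After exporting, the data can be processed with any tool you like. The following is a minimal sketch, assuming a CSV export whose columns were named "Symbol" and "Name" in step 4. The rows below are illustrative placeholders standing in for an exported file, not actual scraped output, and the helper name `load_rows` is hypothetical.

```python
import csv
import io
import json

# Illustrative stand-in for an exported CSV file; the rows are placeholders.
EXPORTED_CSV = """Symbol,Name
BTC-USD,Bitcoin USD
ETH-USD,Ethereum USD
"""


def load_rows(csv_text):
    """Parse exported CSV text into a list of {"Symbol", "Name"} dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"Symbol": row["Symbol"].strip(), "Name": row["Name"].strip()}
        for row in reader
    ]


rows = load_rows(EXPORTED_CSV)

# Build a lookup table from symbol to name and print it as JSON.
names_by_symbol = {row["Symbol"]: row["Name"] for row in rows}
print(json.dumps(names_by_symbol, indent=2))
```

To read a real export, replace `io.StringIO(csv_text)` with an opened file, for example `open(path, newline="", encoding="utf-8")`.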
Spanish version of this tutorial: Scrapear información sobre criptomonedas de Yahoo Finance
You can also read more web scraping articles on the official website
Author: Yina
Contact us at any time if you need our help!