Web scraping tasks created in Octoparse can be run on your local machine (Local Extraction) or in the cloud (Cloud Extraction). Running tasks locally can help you,
Local Extraction is available for both free and premium users.
For free users, each export is limited to 10,000 records, and at most 2 local runs can execute concurrently; for premium users (Standard & Professional), there is no limit on exported records or concurrent local runs.
In this tutorial, we will go through the following features:
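The plan limits described here can be summarized in a small Python sketch. The numbers and plan names come from this article; the `LocalLimits` class, the `LIMITS` table, and both helper functions are hypothetical illustrations and are not part of Octoparse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalLimits:
    """Local Extraction limits for one plan (None means unlimited)."""
    max_records_per_export: Optional[int]
    max_concurrent_runs: Optional[int]


# Values taken from the article; the table itself is only illustrative.
LIMITS = {
    "free": LocalLimits(max_records_per_export=10_000, max_concurrent_runs=2),
    "standard": LocalLimits(None, None),
    "professional": LocalLimits(None, None),
}


def exportable_records(plan: str, extracted: int) -> int:
    """Return how many of the extracted records a single export can include."""
    cap = LIMITS[plan].max_records_per_export
    return extracted if cap is None else min(extracted, cap)


def can_start_run(plan: str, running: int) -> bool:
    """Return True if another local run may start while `running` runs are active."""
    cap = LIMITS[plan].max_concurrent_runs
    return cap is None or running < cap
```

For example, under these assumptions a free user who extracted 25,000 records would get 10,000 of them per export, while a premium user would get all of them.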
Run tasks on Local Extraction
In Wizard Mode, when Octoparse reaches the "complete" step, you can click "Local Extraction" to run the crawler on your local machine.
In Advanced Mode, after you finish configuring your task, click "Start Extraction" and then select "Local Extraction" to run the task locally.
You can then watch the task's progress and view the extracted data.
Settings of Local Extraction
While the task is running, you can modify the "Extraction settings" for your local task. By default, Octoparse disables the following three options. You can enable them based on the requirements of your task.
Display error message: An error message is shown in the built-in browser when an error occurs, such as missing data.
Loading image: Leaving image loading disabled speeds up page loading.
Memory release: Local Extraction can easily use up your computer's memory. Select "Memory release" to free it.
1. Where does the local task run?
Local Extraction runs the crawler on your own machine, using your own IP address. Some websites limit how many visits they accept from the same IP, so the crawler is likely to be blocked once it exceeds that limit.
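To illustrate why a single IP can get blocked, here is a minimal Python sketch of a per-IP visit limit using a sliding time window. It models a hypothetical website's behavior, not Octoparse or any real site; the class name, the limit of 3 visits per 10 seconds, and the IP address are all assumptions chosen for the example.

```python
from collections import deque
from typing import Deque, Dict, Iterable


class PerIpVisitLimit:
    """Hypothetical site rule: at most `max_visits` per `window` seconds per IP."""

    def __init__(self, max_visits: int, window: float) -> None:
        self.max_visits = max_visits
        self.window = window
        self._visits: Dict[str, Deque[float]] = {}  # ip -> recent visit times

    def allow(self, ip: str, now: float) -> bool:
        """Record a visit at time `now` and return False if it would be blocked."""
        recent = self._visits.setdefault(ip, deque())
        # Forget visits that are older than the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_visits:
            return False  # over the limit: this visit is blocked
        recent.append(now)
        return True


def count_blocked(limit: PerIpVisitLimit, ip: str, times: Iterable[float]) -> int:
    """Count how many visits at the given timestamps are blocked."""
    return sum(1 for t in times if not limit.allow(ip, t))


# One visit per second from the same IP exceeds 3 visits per 10 seconds.
fast = count_blocked(PerIpVisitLimit(3, 10.0), "203.0.113.5", [0, 1, 2, 3, 4])
# One visit every 5 seconds stays under the limit.
slow = count_blocked(PerIpVisitLimit(3, 10.0), "203.0.113.5", [0, 5, 10, 15, 20])
print(fast, slow)
```

In this model the fast crawl has its last two visits blocked, while the slower crawl is never blocked, which is why pacing requests matters when every visit comes from the same IP.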
2. What will affect Local Extraction?
Because the crawler runs on your local machine, its performance depends on your local network speed and hardware configuration.