Octoparse offers a powerful Cloud platform that lets premium users (Standard and Professional plans) run tasks 24/7.
When you run a task with "Cloud Extraction", it runs on multiple cloud servers using Octoparse's IPs. You can close the app or shut down your computer while the task is running, so your own hardware is not a limitation. Extracted data is saved in the cloud and can be accessed at any time.
Octoparse Cloud Extraction also supports task scheduling. To keep your data up to date, you can schedule a task to run as frequently as you need.
To run your task with cloud extraction:
When you finish configuring your task, click "Run" and select "Run task in the Cloud" to execute a run in the cloud.
Once a task is set to run in the cloud, its status will change to "Running" on the dashboard.
To batch run tasks with cloud extraction:
Select the tasks you want to run and click "Run (Cloud)". The selected tasks will run together in the Cloud.
Settings of cloud extraction:
Octoparse cloud extraction allows for executing multiple tasks simultaneously.
On the Standard Plan, you can run 6 concurrent tasks in the cloud (6 cloud servers available), and on the Professional Plan, you can run 20 concurrent tasks (20 cloud servers available). To set the maximum number of tasks running in parallel, click and select a desired number from the drop-down options:
1. How’s the performance of cloud extraction?
Extracting data in the Cloud can be much faster than running the task locally, provided the task is splittable.
A splittable task can be broken down into multiple subtasks that run on multiple servers simultaneously, which makes the extraction faster.
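The idea of breaking one task into subtasks can be pictured with a small sketch. This is only an illustration under assumptions: the round-robin rule, the `split_into_subtasks` helper, and the example.com URLs are hypothetical and do not describe how Octoparse actually divides work between its servers.

```python
from typing import List, Sequence


def split_into_subtasks(urls: Sequence[str], servers: int) -> List[List[str]]:
    """Distribute URLs round-robin across at most `servers` subtasks.

    Hypothetical helper for illustration only; not Octoparse's real logic.
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    buckets = [list(urls[i::servers]) for i in range(servers)]
    # Drop empty buckets when there are fewer URLs than servers.
    return [bucket for bucket in buckets if bucket]


# Ten placeholder pages spread over four servers.
urls = [f"https://example.com/page/{n}" for n in range(1, 11)]
subtasks = split_into_subtasks(urls, servers=4)
for index, chunk in enumerate(subtasks, start=1):
    print(f"subtask {index}: {len(chunk)} URLs")
```

Each subtask holds a share of the pages, so the servers can work through them at the same time instead of one page after another.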
2. Can I run more tasks than the maximum number allows?
Yes, you can. The extra tasks will be queued until earlier tasks finish and cloud servers become available.
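To see how queuing plays out, here is a rough simulation, assuming each task occupies one server for a fixed number of minutes and queued tasks start in submission order as soon as a server is free. The `simulate_queue` function and the 10-minute durations are made up for the example; the actual scheduling inside Octoparse Cloud may differ.

```python
import heapq
from typing import List, Tuple


def simulate_queue(durations: List[int], max_concurrent: int) -> List[Tuple[int, int]]:
    """Return (start, end) times in minutes for each task, in submission order.

    Illustrative model only: one task per server, first come first served.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # Min-heap of the times at which each server becomes free.
    free_at = [0] * max_concurrent
    heapq.heapify(free_at)
    schedule = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available server
        end = start + duration
        heapq.heappush(free_at, end)
        schedule.append((start, end))
    return schedule


# Eight 10-minute tasks against the Standard Plan limit of 6 concurrent tasks.
schedule = simulate_queue([10] * 8, max_concurrent=6)
for number, (start, end) in enumerate(schedule, start=1):
    print(f"task {number}: starts at {start} min, ends at {end} min")
```

In this model the first six tasks start immediately, and the remaining two wait until minute 10, when the first servers are released.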
To schedule a run in the cloud:
When you finish configuring your task, click "Run" and select "Schedule(Local)".
Select how frequently you want the task to run (Once, Weekly, Monthly, or Repeat), then set the time and date according to your data requirements. Click "Save and Run" and the task will run as scheduled.
The time of the next execution is shown in the "Next Run" column on the dashboard.
To cancel a scheduled task, click "More" and select "Schedule OFF" under "Cloud runs".
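If you want to sanity-check the value you expect to see in the "Next Run" column for a weekly schedule, the calculation can be sketched as follows. The `next_weekly_run` helper is hypothetical and simply models "the next occurrence of a given weekday and time"; it is not taken from Octoparse.

```python
from datetime import datetime, timedelta


def next_weekly_run(now: datetime, weekday: int, hour: int, minute: int) -> datetime:
    """Next occurrence of the given weekday (Monday=0 ... Sunday=6) and time.

    Hypothetical helper for illustration; times are naive local times.
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # Today's slot has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


now = datetime(2024, 1, 3, 15, 30)  # a Wednesday afternoon
print(next_weekly_run(now, weekday=0, hour=9, minute=0))  # 2024-01-08 09:00:00
```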
What's the default time zone for the Octoparse Cloud platform?
The next execution time shown on the dashboard is in your local time zone (according to your operating system) by default. However, if you've built the task to extract "current date & time" in the Cloud, the extracted time & date will be in UTC±00:00 regardless of your actual location.
Currently, Octoparse does not support changing the timezone.
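Because date and time values extracted in the Cloud are in UTC, you may want to convert them to your own time zone after exporting the data. A minimal sketch, assuming the extracted value looks like `2024-03-01 14:00:00` (the real format depends on how your task is configured) and that you know your UTC offset; the `cloud_time_to_local` name is made up for this example:

```python
from datetime import datetime, timedelta, timezone


def cloud_time_to_local(extracted: str, utc_offset_hours: float) -> datetime:
    """Interpret an extracted timestamp as UTC and shift it to a fixed offset.

    The "%Y-%m-%d %H:%M:%S" format is an assumption about the exported data.
    """
    utc_time = datetime.strptime(extracted, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return utc_time.astimezone(timezone(timedelta(hours=utc_offset_hours)))


local = cloud_time_to_local("2024-03-01 14:00:00", utc_offset_hours=-5)
print(local.isoformat())  # 2024-03-01T09:00:00-05:00
```

A fixed offset keeps the example simple but ignores daylight saving time. Calling `astimezone()` with no argument converts to the operating system's time zone instead.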
To set a schedule for a group of tasks, switch to 'Task Group' mode, select a task group, then choose "Set a schedule for the task group".