Speeding up a task can be a challenge, particularly when the task is complicated. This article helps you troubleshoot a very slow task. It walks through the common causes so you can determine whether the problem is more likely due to the local environment, the website structure, or simply the settings of your task.
Situation 1: There are too many steps in the workflow.
- Simplify your task
A workflow can become slow when it takes many steps to reach the target page. Simplify it by deleting unnecessary steps such as click actions, and start the task from the URL closest to the page that holds your data so the workflow stays short and straightforward.
For example, suppose you want to extract 3D glasses from Amazon. Avoid clicking through items layer by layer to reach the 3D glasses product page.
Instead, start your task directly with the URL of the 3D glasses product page.
- Split your task
When your task needs to click a list of elements to get the data, you can try to split the task into two.
Task 1: Get the URL of each entry from the list page.
Task 2: Use the URL list obtained in Task 1 to set up a new task that extracts data from the detail pages.
You can check this example case for reference: Scraping property data from Realtor.com
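The two-task split can also be pictured outside Octoparse. The following is a minimal Python sketch of the idea, not how Octoparse implements it. The names `LinkCollector`, `collect_detail_urls`, and `extract_details`, the `item-link` class, and the `example.com` URLs are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags that carry a given CSS class."""

    def __init__(self, base_url, css_class):
        super().__init__()
        self.base_url = base_url
        self.css_class = css_class
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.css_class in classes:
            # Turn relative links into absolute URLs.
            self.urls.append(urljoin(self.base_url, href))


def collect_detail_urls(list_page_html, base_url, css_class="item-link"):
    """Task 1: return the detail-page URLs found on one list page."""
    parser = LinkCollector(base_url, css_class)
    parser.feed(list_page_html)
    return parser.urls


def extract_details(urls, fetch, parse):
    """Task 2: open each URL directly instead of clicking through the list."""
    return [parse(fetch(url)) for url in urls]


# Hypothetical list page, used only for illustration.
SAMPLE_LIST_PAGE = """
<ul>
  <li><a class="item-link" href="/listing/1">House 1</a></li>
  <li><a class="item-link" href="/listing/2">House 2</a></li>
  <li><a class="nav" href="/about">About</a></li>
</ul>
"""

urls = collect_detail_urls(SAMPLE_LIST_PAGE, "https://example.com")
```

Because Task 2 only needs the list of URLs, it can start as soon as Task 1 has finished and does not have to repeat any of the clicks on the list page.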
Situation 2: The website applies AJAX but you have not set it up.
- Set a proper AJAX time
Many websites use AJAX to update information without reloading the entire webpage. When a page loads its content with AJAX but you have not set an AJAX time for the action, the task may get stuck waiting and run very slowly. An appropriate AJAX time lets the extraction proceed smoothly.
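To see why a bounded wait matters, here is a minimal Python sketch of the general idea: keep checking for the content, but give up after a fixed time instead of waiting indefinitely. It is an illustration only and does not reflect Octoparse's internal behavior. The function name `wait_for_content` and its parameters are assumptions made for this example.

```python
import time


def wait_for_content(is_loaded, timeout_seconds, poll_interval=0.1):
    """Poll is_loaded() until it returns True or the timeout expires.

    Returns True if the content appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_loaded():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A wait that is too short risks moving on before the data arrives, while a wait that is too long adds idle time to every step, so the value is worth tuning per website.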
Situation 3: The local environment is not good (local runs).
- Improve the local environment
If local extraction runs slowly, the cause is likely the local environment, such as the operating system, hardware capacity, IP address, network bandwidth, CPU performance, and so on. You will need to manually check the current status of each of these factors.
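As a starting point for such a check, the following Python sketch gathers a few basic facts about the machine and times an arbitrary action. It is only a rough aid under stated assumptions: the helper names `environment_summary` and `average_seconds` are hypothetical, and the sketch covers only the operating system, CPU count, and elapsed time, not IP address or bandwidth.

```python
import os
import platform
import time


def environment_summary():
    """Return basic facts about the machine running the task."""
    return {
        "os": platform.platform(),
        "cpu_count": os.cpu_count(),
    }


def average_seconds(action, repeats=3):
    """Run action() several times and return the mean wall-clock duration."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)
```

Passing a function that loads one of your target pages as `action` gives a rough sense of whether the slowdown comes from the network or from the machine itself.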
- Running tasks in the Cloud (premium users only)
These local issues are often difficult to fix. In that case, running tasks in the Cloud is a more effective and feasible way to get faster data extraction with Octoparse.
You can check how to speed up the tasks following this tutorial: How can I scrape data faster in Cloud? (Version 8)
Tip: You can refer to "What is Cloud Extraction?" for more details about Cloud extraction.
Situation 4: The website content takes a long time to load fully.
When a website contains many heavy elements such as images or videos, its pages load more slowly. This is another main factor that slows the overall running speed of a task.
Resolution: Disable image loading
We can choose not to load the images on the web pages to shorten the time of page loading.
- Open task settings
- Tick Disable image loading and click Save
If you have any questions, you are welcome to submit a request here. Our support team will get back to you within 24 hours.