Sometimes a workflow looks perfectly fine, yet the task extracts little or none of the data we want. How can this happen? Why does the data show up in the preview tab but not when the task runs? If this problem bothers you, take a few minutes to go through the tried-and-true solutions below.
Reason 1: The default timeout for the Go to Web Page action is not long enough.
If Octoparse stops extracting before the web page has loaded completely, it is highly likely that the data you need had not loaded yet when Octoparse moved on to the next action.
Solution: Set a longer timeout for the Go to Web Page action to make sure the page loads completely before the next step
- Click the Go to Web Page action in the workflow and set a longer timeout under the General tab for this action.
Reason 2: The target website has a load delay.
If the target website loads completely, but Octoparse still stops and extracts nothing, consider the possibility of a load delay for the page data you want. Many websites fetch content with AJAX requests (often returning JSON) after the initial page load, so some page elements appear later than the rest of the page.
Solution: Set a wait time for the next action after the Go to Web Page action
- Click on the action that follows the Go to Web Page action in the workflow and set a wait before action time under the Options tab. (Check out this article to read the complete guide on how to set up a wait time.)
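To see why a wait time helps, here is a minimal Python sketch of a page with a load delay. It is only a simplified model, not how Octoparse works internally: the `DelayedPage` class, the `run_task` function, and the `wait_before_action` parameter are hypothetical names made up for this illustration, and time is simulated rather than actually waited.

```python
from __future__ import annotations


class DelayedPage:
    """Hypothetical page whose data appears some seconds after the page loads."""

    def __init__(self, data_ready_at: float, rows: list[str]):
        self.data_ready_at = data_ready_at  # seconds until the data is rendered
        self.rows = rows

    def extract(self, seconds_since_load: float) -> list[str]:
        # Until the background request finishes, the target element is empty.
        if seconds_since_load < self.data_ready_at:
            return []
        return list(self.rows)


def run_task(page: DelayedPage, wait_before_action: float) -> list[str]:
    """Extract after a simulated wait of `wait_before_action` seconds."""
    return page.extract(seconds_since_load=wait_before_action)


page = DelayedPage(data_ready_at=3.0, rows=["item A", "item B"])
print(run_task(page, wait_before_action=0.0))  # []
print(run_task(page, wait_before_action=5.0))  # ['item A', 'item B']
```

With no wait, the extraction runs before the data exists and returns nothing. With a wait longer than the delay, the same extraction succeeds.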
Reason 3: The website uses lazy loading to improve its page loading speed.
If the target website loads only the data that is visible before you scroll, it is using a lazy loading strategy. We need to tell Octoparse to scroll down the page after it is loaded.
Solution: Enable scrolling down the page after it is loaded
- Click on your Go to Web Page action in the workflow and check the Scroll down the page after it is loaded option under the Options tab. Adjust the detailed settings according to your needs.
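The effect of scrolling on a lazy-loading page can be sketched in Python as follows. This is an illustrative model under stated assumptions, not Octoparse's implementation: `LazyPage`, `extract_with_scrolling`, `batch_size`, and `max_scrolls` are hypothetical names, and `max_scrolls` simply stands in for however many scrolls you configure.

```python
class LazyPage:
    """Hypothetical page that reveals `batch_size` more items on each scroll."""

    def __init__(self, all_items, batch_size):
        self.all_items = all_items
        self.batch_size = batch_size
        self.visible = min(batch_size, len(all_items))  # first batch only

    def scroll_down(self):
        # Each scroll triggers loading of the next batch, up to the full list.
        self.visible = min(self.visible + self.batch_size, len(self.all_items))

    def visible_items(self):
        return self.all_items[: self.visible]


def extract_with_scrolling(page, max_scrolls):
    """Scroll until nothing new appears or `max_scrolls` is reached, then extract."""
    for _ in range(max_scrolls):
        before = len(page.visible_items())
        page.scroll_down()
        if len(page.visible_items()) == before:
            break  # no new items loaded, so the bottom has been reached
    return page.visible_items()


items = [f"row {i}" for i in range(1, 11)]
print(len(extract_with_scrolling(LazyPage(items, 3), max_scrolls=0)))   # 3
print(len(extract_with_scrolling(LazyPage(items, 3), max_scrolls=10)))  # 10
```

Without scrolling, only the first batch is extracted. With enough scrolls, every item becomes visible before extraction starts.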
Reason 4: The AJAX timeout we set is not long enough.
If your workflow has a Click item that reveals data hidden behind a "show more" or "load more" button, make sure the AJAX load timeout is long enough for the data to update completely.
Solution: Set a longer AJAX timeout for your click item
- Click on the Click item in your workflow, check Load with AJAX, and set a longer timeout.
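As a rough illustration of why the AJAX timeout matters, the Python sketch below models a single click on a "load more" button. It assumes, as a simplification, that the tool moves on once the timeout elapses and that the new rows count only if the response arrived by then. The function `click_load_more` and its parameters are hypothetical and do not correspond to any Octoparse API.

```python
def click_load_more(current_rows, new_rows, response_time, ajax_timeout):
    """Return the rows visible when the tool moves on after the click.

    response_time: simulated seconds the site needs to deliver the new rows.
    ajax_timeout:  simulated seconds the tool waits before moving on.
    """
    if response_time <= ajax_timeout:
        # The update finished within the timeout, so the new rows are present.
        return list(current_rows) + list(new_rows)
    # The tool moved on too early, so only the old rows are available.
    return list(current_rows)


shown = ["review 1", "review 2"]
hidden = ["review 3", "review 4"]
print(click_load_more(shown, hidden, response_time=4.0, ajax_timeout=2.0))
# ['review 1', 'review 2']
print(click_load_more(shown, hidden, response_time=4.0, ajax_timeout=6.0))
# ['review 1', 'review 2', 'review 3', 'review 4']
```

A timeout shorter than the site's response time leaves the newly loaded rows out of the extraction, which is why lengthening it resolves the missing data.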