This tutorial covers the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading: the latest version is faster, easier to use, and more robust.
Suppose you have built a complex crawler with branches and loops, and then you realize that you need to start scraping from multiple links rather than a single link. You look at your existing workflow, which is shown below.
We understand how much you don't want to start over at this point!
Here are two simple solutions that save you the extra work.
Solution 1:
Enter the list of URLs and click Save
A Loop Item containing the URLs will be automatically generated within your workflow.
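Before pasting, it can help to clean up the URL list. The following is a minimal sketch in Python and is not part of Octoparse; it assumes the input box takes one URL per line, and the function names `prepare_url_list` and `to_paste_text` are hypothetical helpers made up for this illustration.

```python
from urllib.parse import urlparse


def prepare_url_list(raw_urls):
    """Return cleaned http(s) URLs, without duplicates, in their original order."""
    seen = set()
    cleaned = []
    for raw in raw_urls:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        parsed = urlparse(url)
        # Keep only absolute http/https URLs that have a host part.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if url in seen:
            continue  # skip exact duplicates
        seen.add(url)
        cleaned.append(url)
    return cleaned


def to_paste_text(urls):
    """Join URLs one per line (assumed input format), ready to paste."""
    return "\n".join(urls)
```

For example, `to_paste_text(prepare_url_list(lines))` turns a messy list of lines into a block of text that can be copied into the URL input box.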
Solution 2:
Solution 1 only works when the URL you need to change is the starting URL, that is, the URL in the first Go to Web Page step. If your workflow contains multiple Go to Web Page steps, you can manually build a Loop that holds the required URLs.
A. Add a Loop to open the list of links
Click Add Step to add a loop
Choose Loop from the dropdown menu
Enter the list of URLs into the input box on the next page. The new loop is now created.
B. Combine the new Loop with the rest of your workflow
Drag the rest of the workflow (including Go to Web Page, other Loop Items, or Branches) into the new Loop Item with URLs, as shown in the picture below:
Click on Go to Web Page
Tick Load URLs in the loop
Apply the setting
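Conceptually, the workflow built in Solution 2 behaves like a simple loop over the URL list. The following Python sketch is only a mental model, not Octoparse code: `run_workflow`, `open_page`, and `extract` are hypothetical names, where `open_page` stands for the Go to Web Page step with Load URLs in the loop ticked, and `extract` stands for the rest of the workflow that was dragged into the Loop Item.

```python
def run_workflow(urls, open_page, extract):
    """Model of a Loop Item holding URLs that wraps an existing workflow."""
    results = []
    for url in urls:                   # the Loop Item iterates over the URL list
        page = open_page(url)          # Go to Web Page takes its URL from the loop
        results.extend(extract(page))  # the rest of the workflow runs on that page
    return results


# Stand-in steps so the sketch can run without a browser (illustration only).
def fake_open_page(url):
    return {"url": url}


def fake_extract(page):
    return [page["url"] + "#item"]
```

Calling `run_workflow(["https://example.com/a", "https://example.com/b"], fake_open_page, fake_extract)` runs the same steps once per URL. If Load URLs in the loop is left unticked, the Go to Web Page step keeps opening its own fixed URL instead of the current loop item.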