Question: When should you select "Use Loop"?
Where is "Use Loop"?
"Use Loop" is an option in the advanced settings.
"Use Loop" applies an action to each of the items in the "Loop Item".
What is a loop (loop item) in Octoparse?
A Loop Item can contain one of three kinds of items: a list of URLs (entered by the user), a list of texts (entered by the user), or a list of items on the page (located by XPath).
So there are three cases. Let's break them down.
1) If the Loop Item is a list of URLs, there must be a "Go To Web Page" action inside the loop to open those URLs. Check the "Use loop URL" (Open the URL in Loop Item) option in the "Go To Web Page" step.
Check the related tutorials on building a loop from a list of URLs:
How to change a single link extraction to multiple links
Extract data from a list of URLs
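As a rough mental model only (Octoparse is configured through its interface, not through code), the URL case behaves something like the loop sketched here. The names `run_url_loop`, `go_to_web_page`, `use_loop_url`, and `fixed_url` are hypothetical and merely mirror the step and option names above. Treating an unchecked option as "open the same fixed URL on every pass" is an assumption made for illustration.

```python
def run_url_loop(loop_item, go_to_web_page, use_loop_url=True, fixed_url=None):
    """Run a 'Go To Web Page' action once per entry in the Loop Item."""
    opened = []
    for url in loop_item:
        # Option on: the step opens the current URL from the loop.
        # Option off (assumed behaviour): the step ignores the loop and opens fixed_url.
        target = url if use_loop_url else fixed_url
        opened.append(go_to_web_page(target))
    return opened


def fake_go_to_web_page(url):
    """Stand-in for a browser: it only reports which URL was requested."""
    return f"opened {url}"


urls = ["https://example.com/page/1", "https://example.com/page/2"]
results = run_url_loop(urls, fake_go_to_web_page)
```

With the flag on, each pass opens a different URL from the list. With it off, every pass would open the same page, which is usually not what a list-of-URLs task is meant to do.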
2) In the second case, the Loop Item contains a list of texts.
There will always be an "Enter Text" action inside the loop. Check "Loop Text" (Use the text in Loop Item to fill in the text box) in the "Enter Text" step.
Check the related tutorial here: Text/keyword input
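The text case can be pictured the same way. This is a minimal sketch, not how Octoparse is implemented: it assumes a hypothetical search page at `https://example.com/search` whose text box is submitted as a `q` query parameter, and `run_text_loop` is an illustrative name.

```python
from urllib.parse import urlencode


def run_text_loop(loop_item, search_url, field="q"):
    """Fill the text box once per text in the Loop Item and return each resulting search URL."""
    submitted = []
    for text in loop_item:
        # "Loop Text" checked: the current text from the loop goes into the text box.
        submitted.append(f"{search_url}?{urlencode({field: text})}")
    return submitted


keywords = ["web scraping", "xpath"]
searches = run_text_loop(keywords, "https://example.com/search")
```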
3) In the third case, the Loop Item contains a list of items on the page (located by XPath). The steps that commonly follow are "loop click the list of items" and "loop extract from the list of items".
"Use Loop" should be checked when the step is a Click Item.
Here is the tutorial on how to build such a Loop Item to click the list of items:
Lesson 5: Getting data - Click on a list and capture data from each item page
If the following step is Extract Data, check "Select Items in Loop Item".
You can also have both Extract Data and Click Item use the Loop Item when you need to scrape from both the listing page and the detail page.
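To illustrate the third case in code terms, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The page markup, the `DETAIL_PAGES` lookup (a stand-in for actually clicking through to a detail page), and the function name `scrape_listing` are all made up for this example. It is only an analogy for how Extract Data and Click Item can both act on the current loop item.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing page with two items.
LISTING_PAGE = """
<html><body>
  <ul>
    <li><a href="/item/1">First item</a></li>
    <li><a href="/item/2">Second item</a></li>
  </ul>
</body></html>
"""

# Hypothetical detail pages keyed by link target, standing in for real page loads.
DETAIL_PAGES = {
    "/item/1": "Details of the first item",
    "/item/2": "Details of the second item",
}


def scrape_listing(page_source, item_xpath=".//li/a"):
    """Loop over the elements matched by item_xpath and collect listing and detail data."""
    root = ET.fromstring(page_source.strip())
    rows = []
    for item in root.findall(item_xpath):        # the Loop Item: elements located by XPath
        title = item.text                         # Extract Data, scoped to the current item
        detail = DETAIL_PAGES[item.get("href")]   # Click Item on the current item, then extract
        rows.append({"title": title, "detail": detail})
    return rows


rows = scrape_listing(LISTING_PAGE)
```

Both actions read from the same `item`, which is the role "Use Loop" and "Select Items in Loop Item" play: they tie the step to the current element of the loop rather than to a fixed element on the page.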
Spanish version of this article: ¿Cuándo seleccionar "Use Loop"?
You can also read web scraping articles on the official website.