If you have set up a loop for a list but find that Octoparse clicks open only the first item and then stops without looping through the rest of the list, Octoparse is most likely failing to return to the original list after loading the first item. The most common causes are listed below.
1. The detail page is not set up to open in a new tab.
Click the "Click Item" step and you will find an advanced option named "New Tab". Select "New Tab", then re-create the steps that follow "Click Item".
Do remember to re-create these steps: Octoparse needs to identify the newly opened page to extract from, and the steps built before the change will no longer work.
2. The website uses AJAX to update information, or the website is not compatible with Octoparse.
If Octoparse still does not loop after you select "New Tab", the website either uses AJAX or may be incompatible with Octoparse. A page loaded with AJAX replaces the previous one in place, so Octoparse cannot get back to the list to reach the next item. The compatibility issue concerns how well Octoparse can adapt to the particular website you want to scrape. In either case, you should divide your task into two steps: first extract the detail page URLs with Octoparse, and then scrape the data you want from that URL list. If you are new to extracting a list of URLs, please follow this video tutorial to learn more. [Click here]
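Octoparse itself is configured through its visual workflow, not through code, but the two-step idea can be easier to see as a small sketch. The Python below is only an illustration under stated assumptions: the sample HTML, the `item-link` class name, the `example.com` URLs, and the helper names `collect_detail_urls` and `scrape_all` are all hypothetical and are not part of Octoparse.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag that carries a given class."""

    def __init__(self, base_url, css_class):
        super().__init__()
        self.base_url = base_url
        self.css_class = css_class
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.css_class in classes:
            # Resolve relative links against the list page URL.
            self.urls.append(urljoin(self.base_url, href))


def collect_detail_urls(list_html, base_url, css_class="item-link"):
    """Step 1: read the list page once and keep only the detail page URLs."""
    parser = LinkCollector(base_url, css_class)
    parser.feed(list_html)
    return parser.urls


def scrape_all(urls, fetch_detail):
    """Step 2: visit each URL on its own; nothing has to return to the list."""
    return [fetch_detail(url) for url in urls]


# Hypothetical list page used only for this illustration.
LIST_HTML = """
<ul>
  <li><a class="item-link" href="/products/1">First</a></li>
  <li><a class="item-link" href="/products/2">Second</a></li>
  <li><a class="nav" href="/about">About</a></li>
</ul>
"""

detail_urls = collect_detail_urls(LIST_HTML, "https://example.com/list")
# A stand-in for real extraction: it just records which URL was visited.
records = scrape_all(detail_urls, lambda url: {"url": url})
```

Because the second step works from a saved list of URLs, it does not matter whether the site replaces the list page through AJAX: each detail page is opened directly.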
You can follow these steps to check manually whether Octoparse can return to the list page once it has entered the detail page.
1. Click “Go To Web Page” to open the website.
2. Click the “Loop Item” box in your workflow.
3. Click “Click Item” to open the detail page.
4. Click “Loop Item” again and see whether Octoparse returns to the list page. If it does not, revise your workflow by following the steps above.
Article in Spanish: ¿Por qué Octoparse solo extrae el primer elemento y produzca duplicados?
You can also read web scraping articles on the official website.