Dealing with pagination (Infinite Scroll)
The updated tutorial for the latest version, 8.1, is available here. Check it out now!
Infinite scrolling, also known as "endless scrolling", is a technique that websites use, most often with JavaScript or AJAX, to load additional content dynamically as users scroll toward the bottom of the page. Usually, when you drag the scroll bar straight to the bottom, a "loading" sign appears and new content is added to the page shortly afterward.
Octoparse scrolls the page the same way you would by hand, once it has the proper settings. All you need to do is tell Octoparse which page to scroll, how many times to scroll, and how long to wait between scrolls.
In this tutorial, we will show you how to deal with infinite scrolling in Octoparse. You may want to use this URL to follow along: https://biomarket.com.ar/product-category/almacen/desayuno/
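As a rough illustration of what these settings control, here is a minimal Python sketch of a scroll loop with a repeat count and a wait time. It is not Octoparse's implementation: the FakePage class, the scroll_page function, and the item counts are all hypothetical stand-ins used to show the idea.

```python
import time
from typing import Callable


class FakePage:
    """Hypothetical stand-in for a page that loads `batch` more items per scroll."""

    def __init__(self, initial: int, batch: int, total: int) -> None:
        self.items = initial
        self.batch = batch
        self.total = total

    def scroll(self) -> None:
        # Each scroll reveals one more batch, never exceeding the total.
        self.items = min(self.items + self.batch, self.total)


def scroll_page(scroll_once: Callable[[], None],
                count_items: Callable[[], int],
                repeats: int,
                wait_seconds: float,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Scroll `repeats` times, pausing `wait_seconds` after each scroll.

    Returns the number of items present once scrolling is finished.
    """
    for _ in range(repeats):
        scroll_once()
        sleep(wait_seconds)  # give the page time to load new content
    return count_items()


# Assumed numbers: 12 items on first load, 12 more per scroll, 60 in total.
page = FakePage(initial=12, batch=12, total=60)
loaded = scroll_page(page.scroll, lambda: page.items,
                     repeats=3, wait_seconds=0, sleep=lambda s: None)
print(loaded)  # 48: three scrolls were not enough to reach all 60 items
```

With too few repeats, some items never load, which is why the repeat count and the wait time both matter.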
1) Use the auto-detect algorithm to deal with it
2) Set up the infinite scroll manually
1) Use the auto-detect algorithm to deal with it
1. Select "Auto-detect web page data" on the Tips panel. Octoparse will start detecting the page data; wait for it to finish.
2. Modify the scroll settings
Click "Edit" under "Add a page scroll" and set the scroll way, repeat times, and wait time as needed. Click "Confirm" to save the settings. Make sure to set enough scroll repeats and a long enough interval between scrolls for new content to load.
3. Create the workflow with the settings
Click "Create workflow" on the Tips panel to generate the workflow with these settings.
4. Check whether the Loop Item that was created can locate all the elements
Go to the settings of the Loop Item to see whether all the elements are located.
Tips! Check that the loop mode of the Loop Item is set to "Variable List". If it is "Fixed List", change it to "Variable List" and modify the XPath. See this tutorial for more details: Infinitive Scroll has setup but no new elements added to the list?
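To see why the loop mode matters, the following Python sketch contrasts a fixed set of positional paths with one general path. It only illustrates the idea and does not reproduce how Octoparse stores its lists: the markup, the paths, and the function names are hypothetical, and it uses the limited XPath subset supported by Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical product list after a scroll has loaded two extra items (C and D).
AFTER_SCROLL = (
    "<ul>"
    "<li class='product'>A</li><li class='product'>B</li>"
    "<li class='product'>C</li><li class='product'>D</li>"
    "</ul>"
)

# "Fixed List" idea: one positional path per item that was visible at setup time.
FIXED_PATHS = ["./li[1]", "./li[2]"]
# "Variable List" idea: a single general path that matches every product.
VARIABLE_PATH = "./li[@class='product']"


def fixed_list(root: ET.Element) -> list:
    """Collect only the items addressed by the stored positional paths."""
    found = []
    for path in FIXED_PATHS:
        found.extend(el.text for el in root.findall(path))
    return found


def variable_list(root: ET.Element) -> list:
    """Collect every item that matches the general path."""
    return [el.text for el in root.findall(VARIABLE_PATH)]


page = ET.fromstring(AFTER_SCROLL)
fixed_result = fixed_list(page)        # ['A', 'B']
variable_result = variable_list(page)  # ['A', 'B', 'C', 'D']
print(fixed_result, variable_result)
```

The positional paths miss the items that appeared after scrolling, while the general path picks them up.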
2) Set up the infinite scroll manually
1. Double-click the Go to Web Page action (or a Click Item), or click the icon that opens the settings menu. Then open the "After loading page" section to find the page scroll-down options.
2. Check the box for "Scroll down the page after it is loaded", and set the scroll way, repeat times, and wait time as needed.
Tips! Find more details about the page scroll-down function at Page scroll-down.
Infinite scrolling is easy to set up, but to find the most appropriate settings, you may want to test-run the task to see whether you have assigned enough scroll repeats and whether the scrolling runs at the right pace.
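Before a test run, a quick estimate can give you a starting value for the repeat count. The Python sketch below is simple arithmetic, not an Octoparse feature: the function name and the example numbers are hypothetical, and real pages may load a different number of items per scroll.

```python
import math


def estimate_repeats(total_items: int, initially_loaded: int, items_per_scroll: int) -> int:
    """Estimate how many scrolls are needed to load every item."""
    if items_per_scroll <= 0:
        raise ValueError("items_per_scroll must be positive")
    remaining = max(total_items - initially_loaded, 0)
    # Round up: a partial final batch still needs one more scroll.
    return math.ceil(remaining / items_per_scroll)


# Assumed numbers: 100 items in total, 20 visible on load, 16 added per scroll.
print(estimate_repeats(100, 20, 16))  # 5
```

Treat the result as a lower bound and add a few extra repeats if the test run still misses items.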
Spanish tutorial: Tratar la paginación (desplazamiento infinitivo)
You can also read more web scraping tutorials on the official website.
If you need any help with task configuration or data collection, submit a ticket to our support team! We'll get back to you soon.
Author: Kara
Editor: Yina