Dealing with pagination (with a "Next" button)
Pagination, also known as paging, is the process of dividing a document into discrete pages, either electronic or printed. When scraping, setting up pagination correctly is what lets a task collect data from many pages instead of only the first one. Common pagination styles include a "Next" button, a "Load More" button, and infinite scroll.
This tutorial shows how to handle pagination with a "Next" button in Octoparse.
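For readers who want to see what "Next" button pagination amounts to outside a visual tool, here is a minimal Python sketch of the idea: extract data from the current page, look for the "Next" link, follow it, and stop when there is none. It is not how Octoparse works internally. The in-memory `PAGES` site, the `item` and `next` class names, and the `scrape_all` helper are all assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "site" (URL -> HTML); a real scraper would fetch pages over HTTP.
PAGES = {
    "/items?page=1": '<ul><li class="item">Aviator</li><li class="item">Wayfarer</li></ul>'
                     '<a class="next" href="/items?page=2">Next</a>',
    "/items?page=2": '<ul><li class="item">Round</li></ul>'
                     '<a class="next" href="/items?page=3">Next</a>',
    "/items?page=3": '<ul><li class="item">Cat-eye</li></ul>',  # last page: no "Next" link
}


class PageParser(HTMLParser):
    """Collects item texts and the href of the link whose class is "next"."""

    def __init__(self):
        super().__init__()
        self.items = []
        self.next_href = None
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "item":
            self._in_item = True
        elif tag == "a" and attrs.get("class") == "next":
            self.next_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self.items.append(data.strip())


def scrape_all(start_url, max_pages=50):
    """Follow the "Next" link page by page until no link is left."""
    results, url, seen = [], start_url, set()
    # The visited set and max_pages cap guard against a "Next" link that loops back.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = PageParser()
        parser.feed(PAGES[url])
        results.extend(parser.items)
        url = parser.next_href  # None on the last page ends the loop
    return results


all_items = scrape_all("/items?page=1")
print(all_items)
```

The loop ends in one of three ways: the last page has no "Next" link, a link points back to a page already visited, or the page cap is reached. The same three conditions are worth keeping in mind when checking a pagination loop in any tool.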
There are two ways to set up pagination with a "Next" button:
1) Use the auto-detect algorithm to set it up
2) Set up the pagination manually
1) Use the auto-detect algorithm to set it up
To follow along, you can use this example link: https://www.amazon.com/s?k=sunglass&ref=nb_sb_noss_2
1. Click "Auto-detect web page data" on the Tips panel.
2. When the auto-detect process is completed, check whether the "Paginate to scrape more pages" option is ticked. (Usually, this option will be shown on the Tips panel automatically.)
3. Click the "Check" button to inspect which button is identified as a "Next" button.
If Octoparse does not identify the correct "Next" button, click "Edit" and select it manually.
4. Click "Create workflow" on the Tips panel and you will see a workflow with pagination created.
5. Test the workflow by clicking "Pagination" and then "Click to paginate" to see whether Octoparse moves to the next page. If it does, the pagination is set up correctly.
Tips! If the "Paginate to scrape more pages" option does not appear on the Tips panel after auto-detection completes, click "Create workflow" first; the option will then appear on the Tips panel.
Select the next page button on the web page and click Confirm.
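The "Check" and "Edit" steps above exist because picking the right "Next" button is not always obvious: a page can contain numbered page links and unrelated links that merely start with the word "Next". The sketch below shows one simple heuristic for choosing among candidate links. It is only an illustration; this article does not describe Octoparse's detection algorithm, and the `find_next_link` helper, the `NEXT_TEXTS` set, and the sample HTML are assumptions.

```python
from html.parser import HTMLParser

# Assumed set of link texts that commonly mean "next page".
NEXT_TEXTS = {"next", "next page", "next >", "›", "»"}


class LinkCollector(HTMLParser):
    """Collects every <a> element as a dict of its attributes plus its text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = dict(attrs)
            self._current["text"] = ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


def find_next_link(html):
    """Return the first link that plausibly means "next page", or None."""
    collector = LinkCollector()
    collector.feed(html)
    for link in collector.links:
        label = (link.get("aria-label") or "").lower()
        if (
            link.get("rel") == "next"                # explicit rel="next" hint
            or "next" in label                       # accessibility label mentions "next"
            or link["text"].lower() in NEXT_TEXTS    # exact match on common button texts
        ):
            return link
    return None


SAMPLE = (
    '<a href="/p/1">1</a>'
    '<a href="/p/2">2</a>'
    '<a href="/help">Next steps for sellers</a>'  # decoy: starts with "Next" but is not pagination
    '<a href="/p/2" aria-label="Go to next page">&rsaquo;</a>'
)

chosen = find_next_link(SAMPLE)
print(chosen["href"], chosen["aria-label"])
```

Matching the visible text exactly, rather than by substring, is what keeps the "Next steps for sellers" link from being chosen. A heuristic like this can still guess wrong, which is why a manual check of the selected button is worthwhile.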
2) Set up the pagination manually
1. Click the next page button/icon. After that, choose "Loop click single URL" on the Tips panel.
2. When the pagination is set up, click the "Pagination" bar and then click "Click to paginate" to check whether Octoparse moves to the next page. If it does, the pagination is set up correctly.
Tips! If "Loop click single URL" is not shown on the Tips panel, you can select "Loop click next page" or "Loop click single element" instead.
If you need any assistance with your data project, feel free to submit a request here to contact our support team anytime!
Article in Spanish: Tratar la paginación (con botón "Siguiente")
You can also read web scraping articles on the official website
Author: Fergus
Editor: Yina