Capture all the images from an image carousel
Many product web pages use image carousels to display multiple images as slides, which you can usually flip through manually. In this tutorial, I will show you how to extract the images of a carousel into your desired format.
1. Scrape one image into one column
2. Scrape images to different lines
3. Scrape all images into one column
1. Scrape one image into one column
Each image URL you extract goes into its own column, so scraping multiple images into different columns is just as easy as scraping one image. Let's use this example URL for demonstration: https://www.ebay.com/itm/Lenovo-Legion-Y540-15-6-144Hz-i7-9750H-16GB-RAM-256GB-SSD-GTX-1660-Ti-Office/303553933195
Simply select one of the images, and select "Extract the URL of the selected image" on the Tips Panel. Repeat the same process to fetch all the other image URLs.
2. Scrape images to different lines
It is also possible to scrape images to different lines of the same column using a loop extract action.
1) Select the first image
2) Go on to select the second image and choose "Extract image URLs".
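To make the difference between the two layouts concrete, here is a minimal Python sketch of the resulting data shapes. It does not reproduce what the tool does internally; the example.com URLs and the field names image_1, image_2 and image_url are placeholders assumed for illustration.

```python
import csv
import io

# Hypothetical image URLs standing in for what the carousel would yield.
image_urls = [
    "https://example.com/img/1.jpg",
    "https://example.com/img/2.jpg",
    "https://example.com/img/3.jpg",
]


def as_columns(urls):
    """Section 1 layout: a single row with one column per image."""
    header = [f"image_{i}" for i in range(1, len(urls) + 1)]
    return [header, list(urls)]


def as_lines(urls):
    """Section 2 layout: a single column with one line per image."""
    return [["image_url"]] + [[url] for url in urls]


def to_csv(rows):
    """Render rows as CSV text so the two layouts can be compared."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(as_columns(image_urls)))
    print(to_csv(as_lines(image_urls)))
```

The column layout needs a new field every time the carousel gains an image, while the line layout grows by one row per image, which is why a loop is the better fit when the number of images varies between pages.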
3. Scrape all images into one column
There are two ways to scrape all images into one column.
Option 1. Combine the extracted image URLs
Once you've loop extracted the image URLs into different lines (following steps in Scrape images to different lines), you can then combine the extracted data to merge the lines into one single line.
1) Click the setting icon for the Extract Data action.
2) Click the "see more" icon for the data field, then select "Combine data", then "Combine the captured data".
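Conceptually, combining the captured data is a join of the per-line values into one string. The Python sketch below illustrates the idea only; the function name combine_lines and the newline separator are assumptions, and the separator the tool actually uses may differ.

```python
def combine_lines(values, separator="\n"):
    """Merge values captured on separate lines into one string.

    Blank entries are skipped and surrounding whitespace is trimmed.
    The default separator is an assumption made for this sketch.
    """
    cleaned = (value.strip() for value in values if value and value.strip())
    return separator.join(cleaned)


if __name__ == "__main__":
    # Hypothetical URLs, one per extracted line.
    lines = [
        "https://example.com/img/1.jpg",
        " https://example.com/img/2.jpg ",
        "",
    ]
    print(combine_lines(lines, separator=" | "))
```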
Option 2. Scrape the HTML code of the carousel and match the image URLs in the code
1) Select the entire carousel and select "Extract the outer HTML of the selected element"
2) Go to the Settings for the "Extract Data" action, click the see more icon for the field, and select "Clean data".
3) Inspect the code to find the starting value and ending value of the image URL.
4) Click "Add step" and choose "Matching with Regular Expression"
5) Click "Try the RegEx tool"
6) Enter the "Start with" and "End with" values to generate a RegEx, then apply the setting.
7) Tick "Match all" and confirm
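The idea behind steps 3) to 7) can be sketched in Python: build a regular expression from a starting value and an ending value, then collect every match. This is an illustration under stated assumptions, not the expression the RegEx tool generates. The sample HTML, the markers src=" and ", and the helper names build_pattern and match_all are hypothetical, and this sketch returns only the text between the two markers.

```python
import re


def build_pattern(start_with, end_with):
    """Compile a regex matching the shortest text between two literal markers."""
    # re.escape makes both markers match literally; (.*?) is a lazy capture.
    return re.compile(
        re.escape(start_with) + r"(.*?)" + re.escape(end_with),
        re.DOTALL,
    )


def match_all(html, start_with, end_with):
    """Return every captured value, analogous to ticking "Match all"."""
    return build_pattern(start_with, end_with).findall(html)


# Hypothetical outer HTML of a carousel.
sample_html = """
<ul class="carousel">
  <li><img src="https://example.com/thumbs/a.jpg" alt="front"></li>
  <li><img src="https://example.com/thumbs/b.jpg" alt="side"></li>
</ul>
"""

urls = match_all(sample_html, 'src="', '"')

if __name__ == "__main__":
    for url in urls:
        print(url)
```

Choosing a starting value that appears only right before an image URL matters: a marker that is too short, such as a lone quotation mark, would also match other attributes in the code.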
Tips! 1. The image URLs scraped are thumbnail URLs. If you need to get the full image URLs, please check this tutorial:
Author: Yina
Editor: Isabel