How to extract multiple pictures from the webpage?
The updated tutorial for the latest version 8.1 is available here. Check it out now!
In general, there are two ways to extract multiple pictures from the webpage with Octoparse.
1. Click on the image to extract the URL directly
Method 1:
Click the first image and select "Extract URL" on the "Action Tips" panel, then click the second image and select "Extract URL" again, and so on for each image. The detailed instructions can be found in Step 4) of this tutorial: Scrape Product Image from Amazon
The data extracted would be in this format:
Method 2:
Click on image1, then image2, image3, and so on until all the desired images are selected, then click "Extract URL" on the "Action Tips" panel. A Loop Item of images will be generated automatically. The detailed instructions can be found in Example 1 of this tutorial: Build an Image Crawler without Coding
The data extracted would be in this format:
2. Scrape the source code first, then extract the image URLs embedded in that source code
To learn how, please check the Tips under Step 4) of this tutorial: Scrape Product Image from Amazon
Note: If selecting "Extract inner HTML" doesn't return the right source code, please try "Extract outer HTML".
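For readers who want to see what "formatting out" the URLs from scraped source code amounts to, here is a minimal Python sketch that runs outside Octoparse. It assumes the extracted inner or outer HTML has been exported as a string and that the images are plain `<img>` tags with a `src` attribute; the sample HTML, the class name `ImageSrcCollector`, and the function name `extract_image_urls` are all hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that carry no src at all
                self.urls.append(src)


def extract_image_urls(html):
    """Return the image URLs found in an HTML fragment, in document order."""
    collector = ImageSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


# Hypothetical fragment standing in for the scraped source code.
sample_html = """
<ul class="thumbs">
  <li><img src="https://example.com/img/a_thumb.jpg" alt="A"></li>
  <li><img src="https://example.com/img/b_thumb.jpg" alt="B"></li>
  <li><img alt="no source"></li>
</ul>
"""

for url in extract_image_urls(sample_html):
    print(url)
```

Pages that load images lazily often keep the real URL in another attribute (for example `data-src`), in which case the attribute name in `handle_starttag` would need to change accordingly.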
Now that we have the image URLs at hand, the last step is to get the full-sized pictures instead of the thumbnails. For this step, we'll use Octoparse's built-in data cleansing tool. Please refer to How to scrape the full image URLs instead of the thumbnails for details.
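The thumbnail-to-full-size step is usually a small text substitution on each URL. The sketch below illustrates the idea in Python, outside Octoparse. It rests on an assumption that is not stated in this article: that a thumbnail URL differs from the full-sized one only by a size token such as `._AC_US40_` sitting just before the file extension. The example URLs, the pattern `SIZE_TOKEN`, and the function name `to_full_size` are hypothetical; check the real URLs on the target site before relying on any such pattern.

```python
import re

# Assumed thumbnail pattern: a size token like "._AC_US40_" or "._SX38_SY50_"
# placed immediately before the final file extension.
SIZE_TOKEN = re.compile(r"\._[A-Za-z0-9_,]+_(?=\.[A-Za-z]+$)")


def to_full_size(url):
    """Strip the assumed size token; URLs without one are returned unchanged."""
    return SIZE_TOKEN.sub("", url)


# Hypothetical thumbnail URLs.
thumbnails = [
    "https://example.com/images/I/71abc._AC_US40_.jpg",
    "https://example.com/images/I/81xyz._SX38_SY50_.png",
    "https://example.com/images/I/91full.jpg",  # already full-sized
]

for thumb in thumbnails:
    print(to_full_size(thumb))
```

Sites differ in how they encode image sizes, so the pattern has to be adapted to whatever the thumbnail and full-sized URLs actually look like side by side.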
The image URLs will all be in one cell, like this:
Article in Spanish: ¿Cómo extraer varias imágenes de la página web?
You can also read web scraping articles on the official website
Author: Momo
Editor: Yanni