Extract star rating information
Sometimes rating information can't be extracted directly the way other text, such as a page title, can. In the example below, the rating is stored in the value of the "alt" attribute of the "img" element. This tutorial shows how to scrape star rating information from web pages.
Example site: https://www.trustpilot.com/review/airforcegiftshop.co.uk
There are two ways to fetch the star rating info:
1) Extract attribute from the source code
2) Extract and cleanse the HTML code
1) Extract attribute from the source code
1. Select the rating area on the web page and choose "Extract the URL of the selected image". You can also choose to extract the text or HTML code here; this step only creates the data field, so the option you pick does not matter.
2. Double click the Extract Data action or click the gear icon to open the settings.
3. Click the "..." icon. Then choose the "Customize field" option.
4. Select "Extract attribute" and then select "alt". The result will be displayed in the "Example" box.
5. Save your changes and return to the home page. The field now shows the rating information.
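For readers who want to see what "Extract attribute" does conceptually, here is a minimal Python sketch that reads the "alt" attribute of every "img" element using only the standard library. The markup in SAMPLE_HTML and the names AltCollector and extract_alt_texts are hypothetical, made up for illustration; the real page's HTML may differ, and this is not how the tool itself is implemented.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = (
    '<div class="star-rating">'
    '<img src="stars-4.svg" alt="Rated 4 out of 5 stars">'
    '</div>'
)


class AltCollector(HTMLParser):
    """Collects the alt attribute of every <img> element it sees."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.alts.append(alt)


def extract_alt_texts(html):
    """Return the alt text of each <img> in the given HTML string."""
    parser = AltCollector()
    parser.feed(html)
    parser.close()
    return parser.alts


print(extract_alt_texts(SAMPLE_HTML))
```

Images without an "alt" attribute are skipped, so the returned list contains only the values that could carry rating text.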
2) Extract and cleanse the HTML code
1. Select the rating area on the web page and choose "Extract the outer HTML of the selected element".
2. Go to the settings of Extract Data and choose "Clean data".
3. After that, click "Add step", and then choose "Match with Regular Expression".
4. If you know how to write regular expressions, enter the expression directly in the Regular Expression box. If you're not familiar with them, click "Not sure about RegEx? Try the RegEx tool!".
5. Click "Start with" and enter the text that appears immediately before the information you need. Next, click "End with" and enter the text that appears immediately after it.
After that, check "Match all" and click "Match" to confirm that the matched text is what you need. Then click "Apply".
6. Back in the settings, double-check the result. Tick the "Match all" option and confirm.
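The "Start with" / "End with" idea above can be sketched as a regular expression in Python. This is a rough illustration under assumptions: the OUTER_HTML string and the helper name match_between are hypothetical, and the RegEx tool may generate a different expression than the one built here.

```python
import re

# Hypothetical outer HTML of the rating element; the real page may differ.
OUTER_HTML = (
    '<div class="star-rating">'
    '<img src="stars-4.svg" alt="Rated 4 out of 5 stars">'
    '</div>'
)


def match_between(text, start, end):
    """Return every shortest substring found between start and end."""
    # re.escape treats the delimiters as literal text, and the
    # non-greedy group stops at the first occurrence of end.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text, flags=re.DOTALL)


# "Start with" is alt=" and "End with" is the closing quote.
print(match_between(OUTER_HTML, 'alt="', '"'))
```

Returning all matches rather than only the first mirrors the "Match all" option; if the element contained several images, each "alt" value would appear in the list.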
If you have any questions, you are welcome to submit a request here. Our support team will get back to you later.
Tutorial in Spanish: Extraer información de clasificación por estrellas
You can also read more web scraping tutorials on the official website
Author: Fergus
Editor: Yina