Sometimes a star rating cannot be extracted the same way as plain-text information such as a page title. In the case below, the rating is stored in the value of the "alt" attribute of an "img" element. This tutorial shows how to scrape this kind of star rating information from web pages.
Example site: https://www.trustpilot.com/review/airforcegiftshop.co.uk
There are two ways to fetch the star rating info:
1. Extract attributes from the source code
1. Select the rating area on the web page and choose Extract the URL of the selected image. You can also choose to extract the text or HTML code instead. This step only creates a data field, which is customized in the following steps.
2. Click the Extract Data action and then click the "..." icon. Then choose Customize field.
3. Select Extract attribute and then select alt.
4. The result will be displayed in the field.
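For readers who want to see what reading the "alt" attribute amounts to outside the tool, here is a minimal Python sketch using only the standard library. The sample markup, the class name AltCollector, and the function name extract_alt_texts are assumptions made for illustration; the markup is not copied from the example site, whose actual HTML may differ.

```python
from html.parser import HTMLParser

# Hypothetical markup modelled on the pattern described above:
# the rating lives in the alt attribute of an img element.
SAMPLE_HTML = (
    '<div class="star-rating">'
    '<img src="stars-4.svg" alt="Rated 4 out of 5 stars">'
    '</div>'
)


class AltCollector(HTMLParser):
    """Collects the alt attribute of every img element it sees."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.alts.append(alt)


def extract_alt_texts(html):
    """Return the alt text of each img element in the given HTML string."""
    parser = AltCollector()
    parser.feed(html)
    parser.close()
    return parser.alts


alts = extract_alt_texts(SAMPLE_HTML)
print(alts)
```

With the sample markup above, the list holds a single string, the rating text stored in "alt".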
2. Extract and cleanse the HTML code
1. Select the rating area on the web page and choose Extract the outer HTML of the selected element
2. Click the "..." icon. Then choose Clean data.
3. After that, click Add Step and then choose Match with Regular Expression.
4. If you know how to write regular expressions, you can enter the expression directly in the Regular Expression box. If you're not familiar with them, click "Not sure about RegEx? Try the RegEx tool!".
5. Click Start with and then enter the part of the string that comes immediately before the information we need. Next, click End with and then enter the part of the string that comes immediately after it.
After that, click Match to see if the matched info is what we need. Then click Apply.
6. Go back to the settings and confirm them.
7. When all the settings are done, click Apply to save.
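The "Start with" and "End with" idea in the second method can be approximated with a regular expression that captures whatever sits between two fixed strings. The Python sketch below is an assumption about how such a match works in general, not a description of the expression the RegEx tool actually generates; the helper name match_between and the sample outer HTML are hypothetical.

```python
import re


def match_between(text, start_with, end_with):
    """Return the shortest text found between start_with and end_with, or None."""
    # re.escape makes both delimiters match literally; (.*?) is a lazy capture,
    # so the match stops at the first end_with after start_with.
    pattern = re.escape(start_with) + r"(.*?)" + re.escape(end_with)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


# Hypothetical outer HTML of the rating element (not taken from the example site).
OUTER_HTML = '<img src="stars-4.svg" alt="Rated 4 out of 5 stars">'

# "Start with" is the text just before the rating, "End with" the text just after it.
rating_text = match_between(OUTER_HTML, 'alt="', '"')

# Optional extra cleaning step: keep only the first number in the matched text.
number = re.search(r"\d+(?:\.\d+)?", rating_text) if rating_text else None
rating_value = float(number.group()) if number else None

print(rating_text, rating_value)
```

The lazy capture matters here: a greedy `.*` would run to the last quotation mark in the string rather than the one that closes the "alt" value.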