Sometimes you need to click an item before the information hidden behind it is displayed. For example, a phone number may be hidden behind a Call button, or some information may sit on a separate tab. How can we scrape this kind of data?
Let's take this website as an example: https://www.cclcomponents.com/fimer-pvs-100kw-solar-inverter-3-phase-with-sx-wiring-box-14293
On this page, the detailed description is shown only after you click the Description tab.
Suppose we want to extract the description text from this page.
Here are two ways to do it:
- Set up a click step - tell Octoparse to click open the description tab
- If the data can be found in the source code of the web page, you can extract it directly
1. Set up a click step - tell Octoparse to click open the description tab
- Click on the Description tab
- Choose Click URL (depending on the element, this option may instead be Click element or Click button)
- Open the Click Item settings
- Go to the Options tab
- Untick the Open in a new tab option
- Set AJAX Load to 2-5 seconds
- Click Apply to save
Once the description text is displayed, we can extract it.
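The click-then-wait idea behind the AJAX Load setting can also be sketched outside Octoparse. The following is a minimal Python sketch, not Octoparse's implementation: `FakeTab`, `wait_until`, and `click_and_read` are hypothetical names, and `FakeTab` only simulates a tab whose content arrives some time after the click. It shows why the wait needs to be at least as long as the load time.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Poll condition() until it is truthy or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll)
    # One last check after the deadline.
    return bool(condition())


class FakeTab:
    """Hypothetical stand-in for a tab whose text loads after a delay."""

    def __init__(self, load_delay, text):
        self.load_delay = load_delay
        self.text = text
        self._clicked_at = None

    def click(self):
        self._clicked_at = time.monotonic()

    def visible_text(self):
        # Nothing is visible before the click or while the content loads.
        if self._clicked_at is None:
            return ""
        if time.monotonic() - self._clicked_at < self.load_delay:
            return ""
        return self.text


def click_and_read(tab, timeout):
    """Click the tab, wait up to `timeout` seconds, then read its text."""
    tab.click()
    if wait_until(lambda: tab.visible_text() != "", timeout=timeout):
        return tab.visible_text()
    return None  # The content did not appear within the timeout.
```

With a 0.2-second simulated load, `click_and_read(FakeTab(0.2, "desc"), timeout=2.0)` returns the text, while a timeout shorter than the load delay returns `None`. This is the situation a longer AJAX Load value is meant to avoid.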
2. If the data can be found in the source code of the web page, you can extract it directly
- Switch on Browse mode
- Manually click the Description tab to open it
- Turn off Browse mode
- Scrape the description text just like any other text
This method works only when the data is present in the page source whether or not the tab has been clicked. If the information is loaded only after the click, use the first method instead.
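One way to check whether the second method applies is to look for the hidden text in the raw page source. Below is a minimal Python sketch using only the standard library. The HTML in `SAMPLE_HTML`, the id `tab-description`, and the helper names are made-up assumptions for illustration; the real page's markup will differ.

```python
from html.parser import HTMLParser

# Hypothetical page source: the description tab is hidden with CSS,
# but its text is still part of the HTML.
SAMPLE_HTML = """
<div class="tabs">
  <div id="tab-overview"><p>Overview text</p></div>
  <div id="tab-description" style="display:none"><p>Hidden description text</p></div>
</div>
"""

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextById(HTMLParser):
    """Collect the text inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def text_in_source(html, target_id):
    """Return the text found under `target_id`, or "" if there is none."""
    parser = TextById(target_id)
    parser.feed(html)
    return " ".join(parser.chunks)


print(text_in_source(SAMPLE_HTML, "tab-description"))
```

If the function returns the text even though the element is hidden, the data is already in the source and can be extracted directly. An empty result suggests the content is loaded only after the click, so the click step is needed.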