A drop-down menu is a list of items that appears when you click on a button or text selection. This tutorial shows you how to select options in a drop-down menu in Octoparse.
You may need this example link to follow through:
1. Click on the drop-down menu and select "Loop through options in the dropdown" from the tips panel.
2. A Loop Item has been created and added to the workflow automatically to loop through options in the drop-down menu.
3. Check whether all the options we need have been included in the Loop Item.
- Click on the Loop Item box for the drop-down, then review the looped items in the list.
- Check whether every item added to the loop is wanted. If not, refine the list using the XPath function position().
For example, in this case, the first option in the drop-down menu is "-Select-", which is a header rather than a real option, so we want to remove it from the list.
To do so, add "[position()>1]" to the end of the current XPath. The Loop Item will then include every option whose position is greater than 1; in other words, it excludes only the first option.
TIP: When Octoparse detects a drop-down menu and creates a loop for it, all available options are selected by default. Besides adding [position()>1] to drop the first item, there are other ways to refine the list with the XPath function position(). Add [position()=x] to the end of the XPath, where x is a number, to include only the option at that position, i.e. position()=1, position()=2, etc. For this example, if you want to choose the year 1996, add [position()=27] to the XPath.
To learn more tricks, please refer to How to select a specific option from the drop-down list?
4. We are now done configuring the drop-down menus. Click on the confirmation button to complete the search.
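The position() filters from step 3 can be tried outside Octoparse to see what they keep. The following is a minimal sketch in Python, assuming a made-up drop-down with four options (the real page's markup and XPath will differ). Python's standard-library ElementTree supports only a subset of XPath, so the [position()>1] filter is emulated with an ordinary Python condition, while the index predicate option[3] plays the role of [position()=3].

```python
import xml.etree.ElementTree as ET

# Hypothetical drop-down markup, invented for illustration only.
MARKUP = """
<select id="year">
  <option>-Select-</option>
  <option>2021</option>
  <option>2020</option>
  <option>2019</option>
</select>
"""

select = ET.fromstring(MARKUP)
options = select.findall("option")  # every option, in document order

# Emulates option[position()>1]: XPath positions start at 1,
# so this keeps everything except the "-Select-" header.
real_options = [
    opt.text for pos, opt in enumerate(options, start=1) if pos > 1
]

# ElementTree's index predicate, the counterpart of option[position()=3].
third = select.find("option[3]").text

print(real_options)
print(third)
```

Because positions are counted from 1 and the header occupies position 1, the first real option sits at position 2. That offset is worth keeping in mind when picking a single option by position.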
As you can see from the GIF above, when there are multiple drop-downs on the web page and we want to loop through all of them, i.e. get the results of the different combinations, we can follow the steps for looping through one drop-down menu and repeat them for each additional menu. Each newly built Loop Item should be placed inside the previous one, so that the loops are nested.
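Nesting one Loop Item inside another behaves like nested for-loops: every option of the outer drop-down is paired with every option of the inner one. The sketch below illustrates this in plain Python, assuming two invented option lists (the names and values are hypothetical and do not come from the example page).

```python
# Hypothetical option lists, with the header option already removed.
years = ["2021", "2020"]
makes = ["Audi", "BMW", "Ford"]

combinations = []
for year in years:      # outer Loop Item: first drop-down
    for make in makes:  # inner Loop Item: nested inside the outer one
        combinations.append((year, make))

print(len(combinations))  # 2 options x 3 options
for combo in combinations:
    print(combo)
```

The number of page loads grows as the product of the option counts, so trimming unneeded options with position() before nesting keeps the run shorter.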