You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

Extracting data from a dropdown list is a common task. Sometimes you need every option on the list; sometimes you need only one or a few specific options. This tutorial shows you how to select whichever options you need.

In short, writing the correct XPath is the fastest way to locate the right option. Let's use an example to show you how to do it.

You may want to use this example link to follow through:

https://www.zazo.de/eliquids/zazo/493/zazo-5-1-gratis-paket-10ml-flaschen

Here is a dropdown list, and it contains a lot of options.

1.png

Let's loop through all the options in the dropdown menu first:

  • Click on the dropdown menu and choose "Loop through the options in the dropdown menu" on the Tips panel

2.png

The default XPath of the Loop Item is:

//form[@class="buybox--form"]/div[1]/div[1]/div[1]/select[1]/OPTION

As you can see, there are 299 items inside the dropdown list.

6.png

We need to modify the XPath for the Loop Item to suit our needs.


Choose a specific option by its index

For example, if we want to select the 5th option, which is "Ananas 12mg", the correct XPath should be:

//form[@class="buybox--form"]/div[1]/div[1]/div[1]/select[1]/OPTION[5]

Just add [X] to the end of the XPath, where X is the position of the option you want, counting from 1. If you replace the default XPath with the new one, you will see only the 5th option in the Loop Item.
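If you would like to try the index predicate outside Octoparse, here is a minimal sketch using Python's standard library. The markup is a hypothetical, simplified stand-in for the dropdown (only "Ananas 12mg" in 5th place comes from the example above), and `option_at` is a helper name made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the dropdown; not the real page markup.
SAMPLE = """<form class="buybox--form">
  <select>
    <option>Please choose</option>
    <option>Apple 0mg</option>
    <option>Apple 6mg</option>
    <option>Ananas 0mg</option>
    <option>Ananas 12mg</option>
    <option>Banana 0mg</option>
  </select>
</form>"""

root = ET.fromstring(SAMPLE)  # root is the <form> element


def option_at(index):
    """Return the text of the option at a 1-based index, or None if absent."""
    node = root.find(f"./select/option[{index}]")
    return None if node is None else node.text


fifth = option_at(5)
print(fifth)
```

An index past the end of the list matches nothing, which is a likely cause of an empty Loop Item after editing the XPath.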

77.png

Choose a specific option by its text

If we want to select all options containing "Banana", the correct XPath should be:

//form[@class="buybox--form"]/div[1]/div[1]/div[1]/select[1]/OPTION[contains(text(),'Banana')]

The contains() function matches every option whose text includes the given string. The match is case-sensitive, so 'Banana' does not match "banana".
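To see what the contains() predicate keeps and what it drops, here is a rough sketch in Python. The standard library's ElementTree does not implement contains(), so the same case-sensitive substring test is written as a plain Python filter; a full XPath engine such as the one in Octoparse evaluates the expression directly. The sample options and the `options_containing` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample options; the real list is much longer.
SAMPLE = """<select>
  <option>Apple 0mg</option>
  <option>Banana 0mg</option>
  <option>Banana 6mg</option>
  <option>banana split 0mg</option>
</select>"""

select = ET.fromstring(SAMPLE)


def options_containing(needle):
    """Mimic OPTION[contains(text(), needle)] with a case-sensitive substring test."""
    return [o.text for o in select.findall("option") if needle in (o.text or "")]


matches = options_containing("Banana")
print(matches)
```

Because the test is case-sensitive, the lowercase "banana split 0mg" entry is left out when searching for "Banana".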

12.png

Choose a specific option by its position

If we want to select all the options except the 1st one, the correct XPath should be:

//form[@class="buybox--form"]/div[1]/div[1]/div[1]/select[1]/OPTION[position()>1]

You can use >, =, or < after position() to adjust the range to your needs. For example, position()<4 selects the first three options.
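The effect of each comparison can be sketched in Python as follows. ElementTree in the standard library does not support position() comparisons, so the sketch numbers the options from 1, as XPath does, and applies the comparison by hand. The sample options and the `by_position` helper are hypothetical.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical sample options; the first one plays the role of a placeholder entry.
SAMPLE = """<select>
  <option>Please choose</option>
  <option>Apple 0mg</option>
  <option>Apple 6mg</option>
  <option>Banana 0mg</option>
</select>"""

options = ET.fromstring(SAMPLE).findall("option")
COMPARE = {">": operator.gt, "=": operator.eq, "<": operator.lt}


def by_position(op, n):
    """Mimic OPTION[position() <op> n]; positions are 1-based as in XPath."""
    keep = COMPARE[op]
    return [o.text for pos, o in enumerate(options, start=1) if keep(pos, n)]


print(by_position(">", 1))  # everything except the first option
```

Skipping the first entry with position()>1 is handy when the first option is only a placeholder rather than a real choice.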

11.png

If we want to select only the last option, the right XPath is:

//form[@class="buybox--form"]/div[1]/div[1]/div[1]/select[1]/OPTION[last()]
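The last() predicate happens to be part of the small XPath subset that Python's standard ElementTree supports, so it can be tried directly. This is only a sketch on hypothetical sample markup, not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample options for illustration.
SAMPLE = """<select>
  <option>Apple 0mg</option>
  <option>Banana 0mg</option>
  <option>Vanilla 12mg</option>
</select>"""

select = ET.fromstring(SAMPLE)

# option[last()] picks the final <option> child, however long the list is.
last_option = select.find("option[last()]")
print(last_option.text)
```

The predicate keeps pointing at the final option even if the site later adds or removes entries, which a fixed index would not.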

12.png

Tip: To check whether your modified XPath works in Octoparse, click Apply to save it, click another action in the workflow, and then click Loop Item again.

_1.gif