I want Octoparse to give me a list of the links labelled "Hauptausschuss", "Bildungsausschuss", or "Stadtentwicklungsausschuss"
I am trying to extract links from multiple websites with similar markup, e.g.:
https://sessionnet.krz.de/itzehoe/bi/si0040.asp?__cjahr=2020&__cmonat=2&__canz=1&__cselect=0
or
https://www.kreis-pinneberg.de/Politik/Kreistagsinformation/Sitzungskalender.html
On the first site, Octoparse reads the first three links but not the last two. On the second site, I try to paginate to the month "Februar" and then extract, but Octoparse won't extract from the paginated page and instead paginates from the current month.
Is the markup of these websites just not clean enough for a web scraper to extract the data reliably?
-
It looks like you need to set up a trigger so that only the specific data you want is scraped (see "Triggers" in the Octoparse help).
Also, use this XPath for the pagination step to move to the other page:
//*[@id="smcfiltermenu"]/ul/li[5]/a
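To make clearer what that XPath selects and what "scrape specific data" amounts to, here is a minimal Python sketch that uses only the standard library. It is not how Octoparse works internally. The page fragment, the `href` values, and the helper names `pagination_link` and `matching_links` are all made up for illustration, and real pages are HTML that may not parse as strict XML. Note that `li[5]` is purely positional: it picks whatever the fifth menu entry is, which is "Februar" only in this invented sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; NOT copied from the real sites.
SAMPLE = """
<html><body>
  <div id="smcfiltermenu">
    <ul>
      <li><a href="?month=10">Oktober</a></li>
      <li><a href="?month=11">November</a></li>
      <li><a href="?month=12">Dezember</a></li>
      <li><a href="?month=1">Januar</a></li>
      <li><a href="?month=2">Februar</a></li>
    </ul>
  </div>
  <table>
    <tr><td><a href="/si/1">Hauptausschuss</a></td></tr>
    <tr><td><a href="/si/2">Bauausschuss</a></td></tr>
    <tr><td><a href="/si/3">Bildungsausschuss</a></td></tr>
    <tr><td><a href="/si/4">Stadtentwicklungsausschuss</a></td></tr>
  </table>
</body></html>
"""

# Link texts to keep (the committee names from the question).
KEYWORDS = ("Hauptausschuss", "Bildungsausschuss", "Stadtentwicklungsausschuss")


def pagination_link(root):
    """Return the <a> in the 5th <li> of the element with id 'smcfiltermenu'."""
    # Same path as the XPath above, written in ElementTree's XPath subset.
    return root.find(".//*[@id='smcfiltermenu']/ul/li[5]/a")


def matching_links(root, keywords):
    """Return (text, href) for every link whose text contains a keyword."""
    return [
        (a.text.strip(), a.get("href"))
        for a in root.iter("a")
        if a.text and any(k in a.text for k in keywords)
    ]


root = ET.fromstring(SAMPLE)
print(pagination_link(root).get("href"))
for text, href in matching_links(root, KEYWORDS):
    print(text, href)
```

If the month you want is not always the fifth entry, a position-based XPath like this will land on the wrong page, so it is worth checking which menu item it actually matches on each site.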