I want to back up a forum. Its home page lists the threads in 'most recent post' order, so I suppose Octoparse could open each thread on the page, go to the next page, and repeat until it runs out of pages.
But for an archive it would make more sense to loop over all the threads directly. Each thread has a URL of the form http://site/comment/56/XXX, where XXX is a number from 1 to about 11,000. That URL leads to a page with the whole thread laid out, where I suppose Octoparse can grab the HTML, pictures, etc.
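To make the idea concrete, here is a minimal Python sketch of the URL list I have in mind. It assumes the IDs run from 1 to 11,000 (the real upper bound is only approximate), and "site" is just the placeholder from the pattern above. The function name thread_urls is my own invention for illustration.

```python
# Sketch: build the list of thread URLs from the numeric pattern.
# "site" is a placeholder host; 11000 is an approximate upper bound (assumption).
BASE_URL = "http://site/comment/56/{}"
LAST_THREAD_ID = 11000


def thread_urls(first=1, last=LAST_THREAD_ID):
    """Return one URL per thread ID in the inclusive range [first, last]."""
    return [BASE_URL.format(thread_id) for thread_id in range(first, last + 1)]


if __name__ == "__main__":
    urls = thread_urls()
    # Show a small sample rather than all of them.
    for url in urls[:3]:
        print(url)
    print("total:", len(urls))
```

I imagine a list like this could be saved to a text file and fed to the scraper as its list of start pages, if it accepts one. Some IDs probably belong to deleted threads, so whatever visits these URLs would need to tolerate missing pages.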
Is this something I can learn to do from one of the tutorials?
Thanks for any guidance.