I have a task that I last used in April and am now trying to run again. It scrapes headlines from the CNN website, starting at the URL http://transcripts.cnn.com/TRANSCRIPTS/2020.04.09.html, with batch generation used to step through the dates. When I run the task, the first page loads fine, but the next URL is for some reason changed to wyciwyg://0/http://transcripts.cnn.com/TRANSCRIPTS/2020.04.10.html, which fails to load anything. After hanging for a minute or so, the task moves on to wyciwyg://1/http://transcripts.cnn.com/TRANSCRIPTS/2020.04.11.html (note that the number between "wyciwyg://" and "http://" increases by 1 each time).
I've tried everything I could think of to fix this. I updated my Octoparse client to 7.3.0, I tried setting it to use several different browsers, I tried using "https://" instead of "http://" (which doesn't work at all because CNN's website isn't set up for that), and I tried rebuilding the task from scratch. No matter what I do, the "wyciwyg://" prefix is always spuriously added and the page doesn't load. What could be causing this? My other tasks run fine, so I'm filing this under Website Issues rather than as an issue with Octoparse.
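For reference, here is a rough Python sketch of the date-based URL pattern and the unwanted prefix described above. It only illustrates the pattern; it is not how Octoparse generates its batch URLs, and the names `transcript_urls` and `strip_wyciwyg` are made up for this example.

```python
import re
from datetime import date, timedelta

# URL pattern taken from the post; the date is formatted as YYYY.MM.DD.
BASE = "http://transcripts.cnn.com/TRANSCRIPTS/{:%Y.%m.%d}.html"

# Matches the prefix seen on the failing URLs, e.g. "wyciwyg://0/".
WYCIWYG_PREFIX = re.compile(r"^wyciwyg://\d+/")


def transcript_urls(start, days):
    """Yield one transcript URL per day, beginning at `start` (hypothetical helper)."""
    for offset in range(days):
        yield BASE.format(start + timedelta(days=offset))


def strip_wyciwyg(url):
    """Return `url` without a leading wyciwyg://N/ prefix (hypothetical helper)."""
    return WYCIWYG_PREFIX.sub("", url)


if __name__ == "__main__":
    for url in transcript_urls(date(2020, 4, 9), 3):
        print(url)
    print(strip_wyciwyg(
        "wyciwyg://0/http://transcripts.cnn.com/TRANSCRIPTS/2020.04.10.html"
    ))
```

The first helper produces the URLs I expect the task to visit; the second shows that the failing URLs are the expected ones with only the "wyciwyg://N/" prefix added in front.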