You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

If you scrape a list of URLs, you may want to capture the original input URL as a field alongside your target data, so you can match the results against your list and see whether any URLs haven't been scraped.

However, a URL may change after it is opened (e.g., some URL parameters change), or it may be redirected to a completely different URL. The Original input URL feature added in Octoparse 8.5 solves this problem. Let's see how to use it.

What's the original URL Octoparse adds as a field?

With this function, Octoparse adds the original URL you entered to start the task.

  • Single URL. If you start the task with a single URL, you will get the URL that you put in the Go to Web Page action.

  • URL list in a loop item. If you are extracting data from a URL list, each row will get the corresponding URL from the list you entered in Loop URLs.


How to add the original URL?

Let's take this link as an example:

Open this link in your browser and you will notice that the URL is redirected to another one:


STEP 1. Input your URL(s) in Octoparse to start a task


STEP 2. Go to the Data Preview section and select Original input URL from Add Custom Field


You will see a new field named Original_URL. Its value is the URL you entered, not the URL the page was redirected to.
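Once the exported data includes the Original_URL field, matching it against your input list can be done outside Octoparse. The following is a minimal Python sketch, assuming the data was exported as CSV with a column named Original_URL. The function name find_unscraped, the sample rows, and the example.com URLs are illustrative assumptions and not part of Octoparse.

```python
import csv
import io

def find_unscraped(input_urls, exported_rows, field="Original_URL"):
    """Return the input URLs that never appear in the export's Original_URL field."""
    # Collect every original URL that made it into the exported data
    scraped = {row[field].strip() for row in exported_rows if row.get(field)}
    # Keep the input order so the result is easy to compare with the original list
    return [url for url in input_urls if url.strip() not in scraped]

# Hypothetical export content; in practice, open the CSV file exported from Octoparse
sample_export = io.StringIO(
    "Title,Original_URL\n"
    "Page A,https://example.com/a\n"
    "Page C,https://example.com/c\n"
)
rows = list(csv.DictReader(sample_export))

input_urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]

missing = find_unscraped(input_urls, rows)
print(missing)  # ['https://example.com/b']
```

Because the comparison uses the URL exactly as you entered it, it still works when the page itself redirects or rewrites its parameters.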


Tip: You can also scrape the URL after redirecting, which means getting the redirected URL instead of the original one. Please check the tutorial Scrape page-level data (metadata, page URL, page title, source code).
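If you capture both the original URL and the page URL after redirecting, comparing the two shows which inputs were redirected. This is a rough Python sketch under assumptions: the column name Page_URL is hypothetical (use whatever name you gave the page URL field), and the rows are made-up sample data.

```python
def find_redirected(rows, original_field="Original_URL", final_field="Page_URL"):
    """Return rows whose final page URL differs from the URL originally entered."""
    return [
        row for row in rows
        if row[original_field].strip() != row[final_field].strip()
    ]

# Made-up sample rows; Page_URL is an assumed field name, not an Octoparse default
rows = [
    {"Original_URL": "https://example.com/old", "Page_URL": "https://example.com/new"},
    {"Original_URL": "https://example.com/same", "Page_URL": "https://example.com/same"},
]

redirected = find_redirected(rows)
for row in redirected:
    print(row["Original_URL"], "->", row["Page_URL"])
```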
