When you use the Current date & time function (available under Add custom data field) to record the scraping time during Cloud Extraction, the exported time is in UTC+0 by default, which may differ from your local time.
Note: Timezone conversion only applies to data in a date & time format such as "YYYY-MM-DD HH:MM". If your data is in another format, use Reformat date & time to clean it first: Refine extracted data (replace content, add a prefix, ..)
Time in the Data Preview when creating a task:
Time in the exported data of the same task using Cloud Extraction:
How to convert the time zone:
Click the More button beside the Current time field
Choose Clean Data
Click Add Step
Choose Timezone conversion
Choose your preferred time zone from the dropdown list
Check the Output field to see the converted time, and click Confirm if it matches the time you expect.
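If you have already exported data with UTC+0 timestamps, the same shift can be reproduced outside Octoparse. The following is a minimal Python sketch, not Octoparse's own implementation; the function name utc_to_offset and the "YYYY-MM-DD HH:MM" input format are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def utc_to_offset(text: str, offset_hours: int) -> str:
    """Shift a 'YYYY-MM-DD HH:MM' timestamp from UTC+0 to a fixed UTC offset.

    Hypothetical helper for illustration; it is not part of Octoparse.
    """
    # Read the exported value as a UTC+0 time.
    utc_time = datetime.strptime(text, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    # Build the target zone from a whole-hour offset, e.g. 8 for UTC+8.
    target = timezone(timedelta(hours=offset_hours))
    return utc_time.astimezone(target).strftime("%Y-%m-%d %H:%M")


print(utc_to_offset("2022-02-15 05:40", 8))  # 2022-02-15 13:40
```

A fixed offset is used here for simplicity; it does not account for daylight saving time.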
Timezone conversion can also be used to convert time data scraped from your target website. For example, if you scrape the post date of a news article, you may want to convert the post time to your local time.
If the time data you scraped includes the timezone info, for example, 2022-2-15 05:40 +00:00, then the Timezone conversion will convert the time based on the timezone info.
If the scraped data does not include the timezone, then the Timezone conversion will convert the time based on your local time (the time showing on your computer).
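The two rules above can be sketched in Python as follows. This is one interpretation of the described behavior, not Octoparse's actual code: it assumes that a value without timezone info is read as a time in your local zone. The function name convert and its parameters are hypothetical, and the local zone is passed in explicitly so the result does not depend on the machine running the example.

```python
from datetime import datetime, timedelta, timezone


def convert(text: str, target: timezone, local: timezone) -> str:
    """Convert a scraped timestamp to the target zone (illustrative only)."""
    try:
        # Rule 1: the value carries an offset such as "+00:00", so use it.
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M %z")
    except ValueError:
        # Rule 2 (assumed): no offset, so treat the value as local time.
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M").replace(tzinfo=local)
    return parsed.astimezone(target).strftime("%Y-%m-%d %H:%M")


utc8 = timezone(timedelta(hours=8))
print(convert("2022-2-15 05:40 +00:00", target=utc8, local=utc8))  # 2022-02-15 13:40
print(convert("2022-2-15 05:40", target=utc8, local=utc8))         # 2022-02-15 05:40
```

Under this reading, a value without timezone info is left unchanged when the target zone is the same as your local zone.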
However, the timezone of the scraped time data may differ from your local timezone. For example, a foreign news website may present the post time in UTC+9 while your local timezone is UTC+8. If you want the scraped data presented in your own timezone, you can add the timezone info as a suffix to the scraped data before converting it.
For example:
Default time data: 2022-2-15 05:40 (your local time, supposing you are in the UTC+8 zone)
Scraped time data: 2022-2-15 06:40 (a UTC+9 time, as the website indicates)
Your preferred output data: 2022-2-15 05:40
Steps to achieve that goal:
Click the More button next to your data field and choose Clean Data
Click Add Step and select Add a suffix
Enter +09:00 (which means UTC+9) in the text box and click Confirm
Click Add Step again and select Timezone conversion
Choose your preferred timezone, check the Output field, and click Confirm
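The effect of these steps can be checked with a short Python sketch. It is an illustration under stated assumptions rather than a description of how Octoparse works internally: the helper names add_suffix and to_timezone are hypothetical, and the suffix is assumed to be appended directly to the scraped value.

```python
from datetime import datetime, timedelta, timezone


def add_suffix(value: str, suffix: str) -> str:
    """Mimic the 'Add a suffix' step by appending the offset text."""
    return value + suffix


def to_timezone(value: str, target: timezone) -> str:
    """Mimic the 'Timezone conversion' step for a value ending in an offset."""
    # %z reads offsets written as "+09:00".
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M%z")
    return parsed.astimezone(target).strftime("%Y-%m-%d %H:%M")


scraped = "2022-2-15 06:40"                 # UTC+9 time shown on the website
with_offset = add_suffix(scraped, "+09:00")  # "2022-2-15 06:40+09:00"
utc8 = timezone(timedelta(hours=8))          # preferred zone in the example
print(to_timezone(with_offset, utc8))        # 2022-02-15 05:40
```

Because the offset is attached before the conversion, the result no longer depends on the timezone of the computer doing the conversion.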