Get page-level data (metadata, URL, title & HTML)
In this tutorial, we will show you how to use Octoparse to extract page-level data: the webpage URL, page title, meta description, meta keywords, and HTML source code.
STEP 1. Select an Extract Data step in the workflow
STEP 2. Go to the Data Preview section, then click the Add Custom Fields button
STEP 3. Select your target data field from Page-level data
STEP 4 (optional). Rename the data field by double-clicking the field name
- Page URL: URL of the current page
- Page title: title of the current page, a short description of the webpage that appears at the top of the browser window
- Meta description: meta description tag of the current page, which contains a summary of the page.
- Meta keyword: meta keywords tag of the current page, which lists the keywords declared for the page
- HTML source code: the complete HTML code of the web page
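
To make it clearer where each of these fields comes from in a page's markup, here is a minimal Python sketch that pulls the same five values out of an HTML string using only the standard library. It is an illustration, not how Octoparse works internally: the names `PageLevelParser` and `page_level_data` are hypothetical, and the sketch assumes the description and keywords live in standard `<meta name="description">` and `<meta name="keywords">` tags.

```python
from html.parser import HTMLParser


class PageLevelParser(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Assumption: only name="description" and name="keywords" matter here.
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside <title>...</title>.
        if self._in_title:
            self.title += data


def page_level_data(url, html):
    """Return a dict keyed by the field names listed above (hypothetical helper)."""
    parser = PageLevelParser()
    parser.feed(html)
    return {
        "Page URL": url,
        "Page title": parser.title.strip(),
        "Meta description": parser.meta.get("description", ""),
        "Meta keyword": parser.meta.get("keywords", ""),
        "HTML source code": html,
    }


if __name__ == "__main__":
    sample = (
        "<html><head><title>Example Page</title>"
        '<meta name="description" content="A short summary.">'
        '<meta name="keywords" content="scraping, example">'
        "</head><body><p>Hello</p></body></html>"
    )
    for field, value in page_level_data("https://example.com/", sample).items():
        print(f"{field}: {value}")
```

A field comes back as an empty string when the page does not declare the corresponding tag, which is common for meta keywords.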