This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

Often described as the second most popular search engine after Google, YouTube hosts videos for viewers around the world, and the data associated with those videos, such as comments, can be just as valuable.

This tutorial will show you how to scrape comments from a YouTube video in only three steps with the Octoparse auto-detection feature.

youtube_search.jpg

To follow through with the tutorial, you may want to use the URL below:

https://www.youtube.com/watch?v=CICqmmpOPs4
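
If you plan to feed several video URLs into the same task, it can help to check first that each one is a regular watch page. The snippet below is a minimal sketch of such a check in Python, using only the standard library. It is not part of Octoparse, and the function name `youtube_video_id` is our own, chosen for illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID of a YouTube watch URL, or None if it is not one.

    Assumption: only the standard "/watch?v=..." form is handled here;
    shortened or embedded link formats are deliberately out of scope.
    """
    parsed = urlparse(url)
    if parsed.hostname not in {"www.youtube.com", "youtube.com"}:
        return None
    if parsed.path != "/watch":
        return None
    # parse_qs maps each query parameter to a list of its values
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


if __name__ == "__main__":
    print(youtube_video_id("https://www.youtube.com/watch?v=CICqmmpOPs4"))
```

A URL that passes this check has the same shape as the sample URL above and can be entered on the Octoparse home screen as described in step 1.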

Here are the main steps in this tutorial: [Download task file here]

  1. Create a Go to Web Page - to open the target website

  2. Auto-detect the webpage - to create a workflow

  3. Run the task - to get your desired data


1. Create a Go to Web Page - to open the target website

  • Enter the target URL into the search bar on the home screen and click Start.

youtube_start.jpg

2. Auto-detect the webpage - to create a workflow

  • Click Auto-detect web page data in Tips and wait for the detection to complete

auto_detect.jpg
  • Check the data fields in Data preview and delete unwanted fields or rename them if needed

data_preview.jpg
  • Click Create workflow

create_workflow.jpg

NOTE: YouTube loads more comments as you scroll down the page. To make sure that all the comments are loaded and extracted, we need to set an appropriate wait time for the Page Scroll-down.

  • Click Scroll Page in your workflow

  • Set Wait time: 2-3 seconds recommended

  • Click Apply

wait.jpg
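
To see why the wait time matters, here is a minimal Python sketch of a generic scroll-and-wait loop. It is an illustration only and not how Octoparse is implemented: `scroll_until_stable` and `FakeCommentPage` are hypothetical names, and the fake page uses a simulated clock instead of a real browser. The loop scrolls, waits, and stops as soon as a scroll brings in no new comments.

```python
import time
from typing import Callable, List


def scroll_until_stable(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    wait_seconds: float = 2.0,
    max_scrolls: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll, wait, and repeat until a scroll loads no new items."""
    previous = count_items()
    for _ in range(max_scrolls):
        scroll_once()
        sleep(wait_seconds)  # give lazily loaded comments time to arrive
        current = count_items()
        if current == previous:
            break  # nothing new appeared, so we assume the end was reached
        previous = current
    return previous


class FakeCommentPage:
    """Hypothetical stand-in for a lazily loading page (no real browser)."""

    def __init__(self, total: int, batch: int, load_delay: float) -> None:
        self.total = total
        self.batch = batch
        self.load_delay = load_delay
        self.now = 0.0                    # simulated clock, in seconds
        self.loaded = min(batch, total)   # first batch is visible at once
        self.pending: List[float] = []    # arrival times of requested batches

    def scroll(self) -> None:
        # Each scroll requests one more batch, which arrives after load_delay.
        self.pending.append(self.now + self.load_delay)

    def sleep(self, seconds: float) -> None:
        self.now += seconds               # advance the simulated clock

    def count(self) -> int:
        arrived = [t for t in self.pending if t <= self.now]
        self.pending = [t for t in self.pending if t > self.now]
        self.loaded = min(self.total, self.loaded + self.batch * len(arrived))
        return self.loaded


def simulate(wait_seconds: float) -> int:
    page = FakeCommentPage(total=45, batch=20, load_delay=1.5)
    return scroll_until_stable(
        page.scroll, page.count, wait_seconds=wait_seconds, sleep=page.sleep
    )


if __name__ == "__main__":
    print(simulate(2.0))  # wait is longer than the simulated load delay
    print(simulate(0.5))  # wait is shorter than the simulated load delay
```

In this simulation, a 2-second wait collects all 45 comments, while a 0.5-second wait stops at the first 20 because the loop checks for new comments before they have arrived. A wait time that is too short has the same effect in a real task: the scroll ends early and comments are missed.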

To learn more about Page Scroll-down, or to set one up manually, please check out here.


3. Run the task - to get your desired data

  • Click Save on the upper right to save your task

  • Click Run next to it and wait for a Run Task window to pop up

  • Select Run on your device to run the task on your local device

  • Wait for the task to complete

Here is a sample output from a local run:

youtube_data.jpg