Lesson 2: Getting to know Octoparse
The latest version of this tutorial, for Octoparse 8.1, is available here. Check it out now!
In this tutorial, we’ll introduce the user interface of Octoparse Version 7.X. By the end, you should know exactly where to add a new task, where to check your data when the extraction is done and, most importantly, where to get help when you need it. Getting familiar with the Octoparse UI is an essential first step in preparing for a successful scraping experience. Let’s take a quick tour of Octoparse V7.X!
The Octoparse user interface has two main parts: the sidebar navigation and the home screen. Clicking any item in the sidebar navigation menu opens a new tab on the home screen.
Dashboard is the main console for all your task management: running and stopping tasks, setting schedules, grouping tasks and more. You can also check the status of your tasks and easily access the extracted data here.
Tips! 1. Select more than one task to manage them as a batch. 2. Tasks can be sorted by time or by group. Click the dashboard view icon
Tools provides extra help with XPath generation, regular expressions, exporting to a database and the data API.
Tutorials contains a wealth of learning material covering all features of Octoparse, as well as step-by-step tutorials for the most popular websites.
Data Service takes care of your data scraping requests if you are looking for additional help, such as a task configuration service or a data delivery service.
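To make the role of XPath and regular expressions more concrete, here is a minimal Python sketch of what the two techniques do in general. It is not Octoparse code and does not use any Octoparse API: the sample page, the class names `product`, `title` and `price`, and the helper `extract_products` are all assumptions made up for illustration. An XPath expression locates elements on the page, and a regular expression then cleans up the captured text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="product">
      <h2 class="title">Blue Kettle</h2>
      <span class="price">Price: $24.99</span>
    </div>
    <div class="product">
      <h2 class="title">Red Toaster</h2>
      <span class="price">Price: $39.50</span>
    </div>
  </body>
</html>
"""


def extract_products(page):
    """Return one dict per product block found in the page."""
    root = ET.fromstring(page.strip())
    products = []
    # XPath: select every <div class="product"> anywhere in the document.
    for node in root.findall(".//div[@class='product']"):
        title = node.find("./h2[@class='title']").text
        raw_price = node.find("./span[@class='price']").text
        # Regular expression: keep only the numeric part of the price text.
        match = re.search(r"\d+(?:\.\d+)?", raw_price)
        price = float(match.group()) if match else None
        products.append({"title": title, "price": price})
    return products


if __name__ == "__main__":
    for row in extract_products(SAMPLE_PAGE):
        print(row)
```

The standard-library `xml.etree.ElementTree` module supports only a subset of XPath and needs well-formed markup, so this sketch is meant for understanding the idea rather than for scraping real websites.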
Contact support for any questions regarding getting data with Octoparse or any other data scraping inquiries.
Tips! 1. Hover over the account username to see your account status and when your account expires. 2. Right below the account username, there are two handy icons: click 3. The side navigation menu is collapsible by clicking 4. You can always set Workflow Mode as the default mode in Account Setting.
Now, let’s quickly start a new task and check out the task configuration interface.
1) The Select Mode
The Octoparse Select Mode is designed to make capturing web data easy with simple clicks. You can interact with the webpage the same way you do when browsing the website. Octoparse then "guesses" what you may want to do by offering a set of possible "actions" in Action Tips. You can build a scraper to fetch the data you need simply by following the step-by-step instructions provided in Action Tips.
Tips! 1. Click on
The Octoparse Select Mode gives you an easy start on any web scraping job. But what if you want to see how the task is set up from the beginning, or revise a previous step? You can do this by switching to the Workflow Mode.
2) The Workflow Mode
The Workflow Mode offers far more flexibility, as it provides all the advanced options for every action in the workflow, such as adding wait time, adjusting for AJAX and more.
The Workflow Designer shows explicitly how one action is connected to the next. All extraction actions can be dragged and added to the workflow manually. By clicking through each action in the workflow, you can easily see how Octoparse interacts with the website and whether the target data fields can be extracted as expected.
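One way to picture a workflow is as an ordered list of actions, each carrying its own advanced options. The Python sketch below models that idea only. It is not how Octoparse stores tasks internally, and the names `Action`, `Workflow`, `wait_before_seconds`, `ajax_timeout_seconds` and the example action names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """One step in a workflow (hypothetical model, not an Octoparse type)."""
    name: str
    wait_before_seconds: float = 0.0               # assumed "wait time" option
    ajax_timeout_seconds: Optional[float] = None   # assumed "AJAX" option


@dataclass
class Workflow:
    """An ordered chain of actions; order defines how steps connect."""
    actions: List[Action] = field(default_factory=list)

    def add(self, action):
        self.actions.append(action)
        return self  # returning self allows chained calls

    def describe(self):
        """Return one readable line per action, in execution order."""
        lines = []
        for step, action in enumerate(self.actions, start=1):
            options = []
            if action.wait_before_seconds > 0:
                options.append(f"wait {action.wait_before_seconds:g}s")
            if action.ajax_timeout_seconds is not None:
                options.append(f"AJAX timeout {action.ajax_timeout_seconds:g}s")
            suffix = f" ({', '.join(options)})" if options else ""
            lines.append(f"{step}. {action.name}{suffix}")
        return lines


def build_example_workflow():
    # Illustrative action names only.
    return (
        Workflow()
        .add(Action("Open page"))
        .add(Action("Click item", wait_before_seconds=2, ajax_timeout_seconds=5))
        .add(Action("Extract data"))
    )


if __name__ == "__main__":
    for line in build_example_workflow().describe():
        print(line)
```

Reading the printed steps from top to bottom mirrors clicking through each action in the Workflow Designer to check what happens at that point.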
Tips! 1. Switch between the Select Mode and the Workflow Mode using the on-and-off button
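Article in Japanese: レッスン2:Octoparseを知ってもらおう! (Lesson 2: Getting to know Octoparse)
Articles about web scraping are also available on the official website.
Article in Spanish: Lección 2: Conociendo Octoparse (Lesson 2: Getting to know Octoparse)
You can also read web scraping articles on the official website.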
Lesson 3: Getting data - Capture text from a page
From: https://www.octoparse.com/tutorial-7/getting-to-know-octoparse