You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Bing Maps is a geospatial mapping platform that allows developers to create applications that layer location-relevant data on top of licensed map imagery.

This tutorial will show you how to scrape business info from Bing Maps.

1.png

The URL being used in this tutorial is: https://www.bing.com/maps/

Here are the main steps of this tutorial: [Download task file here]

  1. Create a Go to Web Page - to open the target website

  2. Enter text - to search for the result

  3. Set up loop click - to expand the detailed card

  4. Extract data - to get the desired data

  5. Set up a loop for the images - to extract images on the detailed content card

  6. Run the task - to get the desired data


1. Create a Go to Web Page - to open the target website

  • Enter the target URL into the search bar on the home screen and click Start

OPEN_PAGE.png

2. Enter text - to search for the result

  • Click on the search box

  • Click Enter Text from the Tips box after it turns green

enter_text.png
  • Input "Restaurant" in the Textbox

  • Click Confirm

RESTURANT.png
  • Click on the search icon

  • Click Click URL from the Tips box

search.png
  • Untick Open in a new tab for the generated Click Item operation

  • Set AJAX Timeout to 7s

  • Click Apply

AJAX.png

3. Set up loop click - to expand the detailed card

  • Click on the first item card, then click on the second item card

  • Click Loop click each element from the Tips box after both item cards turn green

LOOP_CLICK.png

Modify the XPath of the Loop Item to locate all item cards

  • Click on the Loop Item frame

  • Choose Variable List as Loop Mode

  • Input XPath as: //li[@data-priority]

  • Click Apply

LOOP.png
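
To see what this loop XPath selects, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The markup, the "ad" class, and the business names are hypothetical and heavily simplified; the real Bing Maps page is more complex and may change. ElementTree evaluates paths relative to an element, so //li[@data-priority] is written as .//li[@data-priority].

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified results list; the real page markup differs.
SAMPLE = """
<ul>
  <li data-priority="1"><a href="#card1">Cafe Alpha</a></li>
  <li data-priority="2"><a href="#card2">Bistro Beta</a></li>
  <li class="ad"><a href="#ad">Sponsored</a></li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# Equivalent of //li[@data-priority]: every <li> that carries a
# data-priority attribute, whatever its value.
cards = root.findall(".//li[@data-priority]")

# Collect the link text of each matched card.
card_names = [li.find("a").text for li in cards]
print(card_names)
```

The third list item has no data-priority attribute, so it is left out of the loop. This is why the attribute test can be more dependable than selecting every li on the page.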

Set up the XPath and AJAX timeout for Click Item1

  • Click Click Item1 in the Loop Item box

  • Tick Relative Xpath to the Loop Item

  • Input XPath as: /a

  • Click Apply

click1.png
  • Click Options

  • Set AJAX Timeout to 10s

  • Click Apply

10S.png
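
The relative path /a means "the link that is a direct child of the current loop item". The sketch below illustrates that idea with Python's standard xml.etree.ElementTree on hypothetical markup; it is not how Octoparse resolves the path internally, and the nested "Menu" link is an assumption added to show what a direct-child path skips.

```python
import xml.etree.ElementTree as ET

# Hypothetical card markup: one direct-child link plus a nested link.
SAMPLE = """
<ul>
  <li data-priority="1">
    <a href="#card1">Cafe Alpha</a>
    <div><a href="#menu1">Menu</a></div>
  </li>
  <li data-priority="2">
    <a href="#card2">Bistro Beta</a>
  </li>
</ul>
"""

root = ET.fromstring(SAMPLE)

click_targets = []
for item in root.findall(".//li[@data-priority]"):
    # Relative path "/a": the <a> that is a direct child of this loop item.
    link = item.find("./a")
    if link is not None:
        click_targets.append(link.get("href"))

print(click_targets)
```

Because the path is evaluated against each loop item in turn, every iteration clicks the link of its own card rather than the first link on the page.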

4. Extract data - to get the desired data

  • Click on the data you want on the detailed card

  • Click Extract data from the Tips box after the data turns green

data.png
  • Double click to rename data fields if needed

rename.png

By default, automatic identification extracts the text of the Website field. To get the URL instead, change the field to extract the "href" attribute.

  • Click ... icon

  • Click on Customize field

CUSTOMIZE.png
  • Click Select URL (href attribute)

HREF.png
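
The difference between the two extraction modes can be illustrated with a short Python sketch. The link below is hypothetical, and example.com stands in for a real business website.

```python
import xml.etree.ElementTree as ET

# Hypothetical link as it might appear on a detailed card.
link = ET.fromstring(
    '<a aria-label="Website" href="https://example.com/">Website</a>'
)

visible_text = link.text       # what a text extraction would return
target_url = link.get("href")  # what an href extraction would return

print(visible_text)
print(target_url)
```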

Modify the XPath of the data fields to locate the data in every detailed content card

  • Turn the Data preview into a vertical view

  • Input the XPaths as below

    • tel: //a[contains(@href,"tel:")]

    • website: //a[@aria-label="Website"]

    • opentime: //span[@class="opHours"]/span

xpath.png
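
As a rough illustration of what these three expressions match, the following Python sketch runs equivalent lookups on a hypothetical detailed card. The markup, phone number, and opening hours are invented for the example. The standard library's xml.etree.ElementTree does not support the XPath contains() function, so that test is emulated with a plain Python filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified detailed card; real markup differs.
SAMPLE = """
<div class="card">
  <a href="tel:+15550100">+1 555 0100</a>
  <a aria-label="Website" href="https://example.com/">Website</a>
  <span class="opHours"><span>Open until 10 PM</span></span>
</div>
"""

card = ET.fromstring(SAMPLE)

# tel: //a[contains(@href,"tel:")]
# contains() is emulated by filtering on the href value in Python.
tel_links = [a for a in card.findall(".//a") if "tel:" in a.get("href", "")]
tel = tel_links[0].text if tel_links else None

# website: //a[@aria-label="Website"], reading the href attribute
site = card.find(".//a[@aria-label='Website']")
website = site.get("href") if site is not None else None

# opentime: //span[@class="opHours"]/span
hours = card.find(".//span[@class='opHours']/span")
opentime = hours.text if hours is not None else None

print({"tel": tel, "website": website, "opentime": opentime})
```

Each lookup falls back to None when nothing matches, mirroring the fact that some cards may have no phone number, website, or opening hours.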

5. Set up a loop for the images - to extract images on the detailed content card

  • Click on either of the two images

  • Click Extract image URLs and download linked files

IMAGE.png

Modify the XPath of Loop Item1 to locate all images

  • Click on the Loop Item1 frame

  • Choose Variable List as Loop Mode

  • Input XPath as: //div[@id="locovl_imgcol"]//img

  • Click Apply

LOOP1.png
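
The image XPath first selects the container with id "locovl_imgcol" and then every img at any depth inside it. A minimal Python sketch of that selection, on hypothetical markup with placeholder image URLs:

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: two images inside the image column
# and one unrelated image outside it.
SAMPLE = """
<div>
  <div id="locovl_imgcol">
    <div><img src="https://example.com/a.jpg" /></div>
    <div><img src="https://example.com/b.jpg" /></div>
  </div>
  <img src="https://example.com/logo.png" />
</div>
"""

root = ET.fromstring(SAMPLE)

# Equivalent of //div[@id="locovl_imgcol"]//img
images = root.findall(".//div[@id='locovl_imgcol']//img")
image_urls = [img.get("src") for img in images]

print(image_urls)
```

The logo outside the container is not matched, so only images that belong to the detailed card are collected.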

Clear XPath for the data field

  • Turn the Data preview into a vertical view

  • Clear the XPath of the image field, as shown below

remove.png

Note: For more details on how to set up the download location, please see this tutorial: Scrape and download files

The final workflow will look like this:

WORKFLOW.png

6. Run the task - to get the desired data

  • Click the Save button first to save all the settings you have made

  • Then click Run to run your task either locally or in the cloud

mceclip8.png
  • Select Run on your device and click Run Now to run the task on your local device

  • Wait for the task to complete

Below is sample data from a local run. The data can be exported in Excel, CSV, HTML, and JSON formats.

data.png