Scraping product information from Target.com
Target.com is one of the largest online retailers in the United States, with 1,931 stores and 51 supply chain facilities.
This tutorial shows you how to scrape product information, such as product name, price, and brand, from Target.com.
To follow along, you can use the URL below:
https://www.target.com/c/milk-substitutes-dairy-grocery/-/N-5xszh?lnk=MilkMilkSubstit
Here are the main steps in this tutorial:
- Create a Go to Web Page - to open the target website
- Save the cookies - to load product information
- Auto-detect the webpage - to create a workflow
- Modify the settings of Pagination - to fully load the content on the webpage
- Run the task - to get your desired data
1. Create a Go to Web Page - to open the target website
- Enter the page URL on the home screen and click Start to create a new task
- Click Option
- Tick Scroll down the page after it is loaded
- Set Scroll for one screen
- Set the Wait time to 3s and Repeats to 15 times
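As a rough guide to what these scroll settings cost in run time, the short Python sketch below multiplies the wait time by the number of repeats. The function name scroll_budget and the page count of 10 are illustrative assumptions, not Octoparse settings, and the result is only a lower bound because it ignores the time the page itself takes to load.

```python
def scroll_budget(wait_seconds, repeats):
    """Return the minimum seconds spent waiting while scrolling one page."""
    if wait_seconds < 0 or repeats < 0:
        raise ValueError("wait_seconds and repeats must be non-negative")
    return wait_seconds * repeats


# Settings from this step: wait 3s after each scroll, repeat 15 times.
per_page = scroll_budget(wait_seconds=3, repeats=15)
print(per_page)  # 45

# Hypothetical example: a category that spans 10 result pages.
assumed_pages = 10
print(per_page * assumed_pages)  # 450
```

If products at the bottom of the page are still missing after a run, increasing Repeats is the first thing to try, at the cost of a longer run.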
2. Save the cookies - to load product information
Target.com only shows product information after a store has been selected, so we need to choose a store and tell Octoparse to save that choice in a cookie.
- Turn on the Browse Mode
- Click Please select a Store
- Input your zip code to find stores nearby > choose the store you need
- Tick Use Cookie > Click Use cookie from current page > Apply
- Turn off the Browse Mode
3. Auto-detect the webpage - to create a workflow
- Click Auto-detect web page data and wait for the detection to complete
- Check the data fields in Data Preview and delete unwanted fields or rename them if needed
- Uncheck Add a page scroll
- Click Create workflow
4. Modify the settings of Pagination - to fully load the content on the webpage
- Choose Click to paginate in the workflow > Click Option
- Tick Scroll down the page after it is loaded
- Choose Scroll for one screen
- Set the Wait time to 3s and Repeats to 15 times
- Choose the Pagination box in the workflow
- Input the Matching XPath as: //button[@data-test='next' and not(@disabled)]
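The XPath above matches the "next" button only while it is still clickable, which is what lets the pagination loop stop on the last page. The Python sketch below illustrates that condition. The markup is a made-up stand-in for the pagination controls (only the data-test='next' and disabled attributes come from the XPath), and because the standard library's ElementTree does not support not(), the same test is written out as a plain Python check rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a middle page where "next" is still clickable.
MIDDLE_PAGE = """
<div>
  <button data-test="previous">Previous</button>
  <button data-test="next">Next</button>
</div>
"""

# Hypothetical markup: the last page, where "next" is disabled.
LAST_PAGE = """
<div>
  <button data-test="previous">Previous</button>
  <button data-test="next" disabled="disabled">Next</button>
</div>
"""


def find_next_button(markup):
    """Mimic //button[@data-test='next' and not(@disabled)].

    Returns the first matching element, or None if there is none.
    """
    root = ET.fromstring(markup.strip())
    for button in root.iter("button"):
        is_next = button.get("data-test") == "next"
        is_enabled = "disabled" not in button.attrib
        if is_next and is_enabled:
            return button
    return None


print(find_next_button(MIDDLE_PAGE) is not None)  # True
print(find_next_button(LAST_PAGE) is not None)    # False
```

When nothing matches on the last page, there is no button left to click, so the pagination loop ends instead of clicking a disabled button indefinitely.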
5. Run the task - to get your desired data
- Click Save on the upper right to save your task
- Click Run next to it and wait for a Run Task window to pop up
- Select Run on your device to run the task on your local device
- Wait for the task to complete
Here is the sample output from a local run:
If you have further issues with the task or have a suggestion that would make this a better resource for you, we’d love to hear about it. Submit a request here.
Author: Cassie
Editor: Yina