Office Depot is a leading American office supply retailer owned by The ODP Corporation, which also owns the OfficeMax and Grand & Toy brands. The company has over 1,400 stores and over 38,000 associates, and generates over $11 billion a year. Its website hosts a wide variety of products, including office chairs, desks, stationery, and school supplies. In this blog post, we will show you how to scrape Office Depot with ParseHub, our free web scraping tool.
To follow along with this tutorial, download ParseHub and register for free.
Let’s begin scraping Office Depot!
- Firstly, open the ParseHub application and log in.
- Click the blue “New Project” button to start your project.
- Enter the Office Depot URL you wish to scrape. We will scrape office chairs using this URL: https://www.officedepot.com/a/browse/office-chairs/N=5+593067/
- Once the page loads, click the first chair’s name and description to select it.
- The rest of the products will now be highlighted in yellow. Click the next product to select them all.
- All 24 products on the first page will now be extracted!
- Rename this selection to “product” on the left pane, where you see “selection1”.
Scraping Additional Data
To scrape additional data for each product, we need to use ParseHub’s Relative Select tool:
- To scrape the respective prices, start by clicking the PLUS(+) button next to your first selection.
- Choose “Relative Select” and click the first product’s name to select it.
- Point the arrow to the product’s price to connect it.
- All prices should now be extracted for each product!
- Rename this selection to “price” on the left pane.
- Repeat the five steps above for other data, such as the item number or the reviews!
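Scraped prices arrive as text, so it can help to turn them into numbers before doing any analysis. The snippet below is a minimal sketch in Python, not part of ParseHub itself. The `parse_price` helper and the sample record are hypothetical, and the price format (a dollar sign, optional thousands separators, optional cents) is an assumption about how the price text might appear.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first number found in a scraped price string, or None.

    Assumes US-style formatting such as "$1,149.99" (an assumption,
    not something ParseHub guarantees).
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to an exact decimal.
    return Decimal(match.group().replace(",", ""))


# Hypothetical record: one "product" with its relatively selected "price".
record = {"name": "Example Mesh Task Chair", "price": "$1,149.99"}
print(record["name"], parse_price(record["price"]))
```

`Decimal` is used instead of `float` so that cents are kept exactly, which matters if you later sum or compare prices.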
If we ran the scrape right now, ParseHub would only scrape the 24 products on the first page. To scrape multiple pages, we need to set up pagination in ParseHub.
- Begin by scrolling down the page until you see the page navigation bar.
- Click the PLUS(+) button next to your “page” selection at the top of the left pane.
- Choose “Select” and click the “Next” link.
- Rename this selection from “selection1” to “pagination”.
- Click the expand drop-down icon and delete the two extractions.
- Click the PLUS(+) button next to your pagination selection and choose “Click”.
- Choose “Yes” on the popup to confirm this is a next page button.
- Choose the number of additional pages you wish to scrape. We will choose 2, which means 3 pages of scraped data in total!
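To sanity-check a finished run, you can estimate how many products to expect from the number of additional pages you entered. This is a rough sketch: the `expected_totals` helper is hypothetical, and it assumes every page lists 24 products, as the first page did in this tutorial. The last page of a category may hold fewer, so treat the result as an upper bound.

```python
PRODUCTS_PER_PAGE = 24  # products seen on the first page in this tutorial


def expected_totals(additional_pages, per_page=PRODUCTS_PER_PAGE):
    """Return (total pages, maximum number of products) for a run."""
    # The first page is always scraped, plus each additional "Next" click.
    pages = 1 + additional_pages
    return pages, pages * per_page


pages, max_products = expected_totals(2)
print(pages, max_products)  # prints "3 72"
```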
At the time of this blog post, we were not blocked when scraping Office Depot, so you can probably scrape large amounts of data without interruption. However, if you do run into blocks or empty results, you will need to enable ParseHub’s IP Rotation.
To begin your scrape, click the green “Get Data” button on the left pane. You can now choose to Run your project, Schedule it, or even Test Run it if you run into any issues. If you have issues scraping and cannot solve them with a test run, feel free to reach out to our live chat support!
If you followed the steps correctly, your data export should contain one entry per product, with its name, price, and any other data you selected.
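If you download the results as JSON, a few lines of Python can flatten them into rows for a spreadsheet. The sketch below rests on assumptions: it supposes the export is an object with a "product" list (matching the selection name used in this tutorial) whose entries carry "name" and "price" keys. The `export_to_csv` helper and the sample values are hypothetical, so check the keys against your own export before relying on it.

```python
import csv
import io
import json


def export_to_csv(export_json, fields=("name", "price")):
    """Flatten an assumed {"product": [...]} export into CSV text."""
    data = json.loads(export_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for product in data.get("product", []):
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({field: product.get(field, "") for field in fields})
    return buffer.getvalue()


# Hypothetical sample data, not real Office Depot results.
sample = '{"product": [{"name": "Chair A", "price": "$89.99"}, {"name": "Chair B"}]}'
print(export_to_csv(sample))
```

Products that are missing a price still get a row, which makes gaps in the scrape easy to spot.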
If you enjoyed this guide on scraping Office Depot, here is a guide on scraping any e-commerce website!