In this tutorial, we will show you how to scrape products from a WordPress website, which is most likely built using the open-source WooCommerce plugin. We will be using ParseHub, our free web scraping tool, to scrape any WordPress website that uses WooCommerce or another eCommerce plugin.
According to BuiltWith, over 13 million websites are using WooCommerce. BuiltWith is a free tool that can analyze a specific URL and give you a detailed report on the technologies and plugins running on the website. We recommend you use BuiltWith to check whether a website is running WordPress and WooCommerce before using this guide! If the store is hosted with Shopify instead, you can follow our Shopify scraping guide. The steps are more or less the same either way, as scraping eCommerce products with ParseHub is super easy regardless of the technology behind the website.
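If you would rather do a quick check yourself, here is a minimal Python sketch of the same idea. It is not how BuiltWith works internally. It only looks for markers that a default WordPress or WooCommerce install usually leaves in a page's HTML (asset paths under `/wp-content/` and a `woocommerce` CSS class). Those markers and the function name `detect_platform` are our assumptions for illustration, and a heavily customized store may not match them.

```python
import re

# Heuristic markers, assumed for illustration only. A default WooCommerce
# install usually loads assets from its plugin folder and adds a
# "woocommerce" class to the page markup.
WOOCOMMERCE_MARKERS = (
    re.compile(r"/wp-content/plugins/woocommerce/", re.IGNORECASE),
    re.compile(r"""class=["'][^"']*\bwoocommerce\b""", re.IGNORECASE),
)
WORDPRESS_MARKERS = (
    re.compile(r"/wp-content/", re.IGNORECASE),
    re.compile(r"/wp-includes/", re.IGNORECASE),
)


def detect_platform(html: str) -> dict:
    """Return a rough guess of whether the HTML comes from WordPress/WooCommerce."""
    return {
        "wordpress": any(p.search(html) for p in WORDPRESS_MARKERS),
        "woocommerce": any(p.search(html) for p in WOOCOMMERCE_MARKERS),
    }


if __name__ == "__main__":
    # Hypothetical page source, pasted in from the browser's "View Source".
    sample = (
        '<body class="archive woocommerce woocommerce-page">'
        '<link href="/wp-content/plugins/woocommerce/assets/css/woocommerce.css">'
        "</body>"
    )
    print(detect_platform(sample))
```

A positive result is only a hint, so it is still worth confirming with BuiltWith before you start.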
Let's begin scraping a WordPress website that is using WooCommerce!
Remember: You can use BuiltWith to check if a website is using WooCommerce.
Step 1: Scraping Products
- First, open the ParseHub application on your computer.
- Start a new project by clicking the “New Project” button.
- Enter the URL of the WooCommerce store you wish to scrape. We will be scraping this online comic store: https://certifiedcomic.shop/graded-comics/?orderby=price-desc
- Once the WordPress website loads, click the first product’s name to extract it.
- The other product names should turn yellow. Click the yellow ones to train the algorithm.
- Continue until all 18 products have been extracted.
- Rename this selection to “product” on the left.
Step 2: Scraping Prices
- First, click the PLUS(+) button next to your product selection from the last step.
- Choose “Relative Select” and click the first product and then the product’s price.
- All prices should now be extracted. Rename this selection to “price” on the left.
If you’d like to scrape additional product information by parsing into each product, read this guide.
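To make the idea behind the “product” and “price” selections more concrete, here is a rough Python sketch of the same pairing done by hand: each product is one container, and the price is looked up relative to that container. This is not ParseHub's implementation. The class names (`product`, `woocommerce-loop-product__title`, `price`) are what WooCommerce's default templates usually emit, which is an assumption about any given store, and the sample HTML and product names are made-up placeholders.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collect one {"name", "price"} row per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.products = []     # finished rows
        self._current = None   # row for the product currently being read
        self._field = None     # "name" or "price" while inside that element
        self._depth = 0        # tag nesting depth inside the active field

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "product" in classes:
            self._current = {"name": "", "price": ""}
        elif self._current is None:
            return  # ignore everything outside a product container
        elif self._field is not None:
            self._depth += 1  # nested tag inside the name or price
        elif tag == "h2" and "woocommerce-loop-product__title" in classes:
            self._field, self._depth = "name", 1
        elif tag == "span" and "price" in classes:
            self._field, self._depth = "price", 1

    def handle_endtag(self, tag):
        if self._field is not None:
            self._depth -= 1
            if self._depth == 0:
                self._field = None  # left the name or price element
        elif tag == "li" and self._current is not None:
            self.products.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data.strip()


# Placeholder markup in the assumed WooCommerce shape, not real store data.
SAMPLE_HTML = """
<ul class="products">
  <li class="product">
    <a href="#"><h2 class="woocommerce-loop-product__title">Example Comic A</h2></a>
    <span class="price"><span class="amount"><span>$</span>120.00</span></span>
  </li>
  <li class="product">
    <a href="#"><h2 class="woocommerce-loop-product__title">Example Comic B</h2></a>
    <span class="price"><span class="amount"><span>$</span>95.50</span></span>
  </li>
</ul>
"""

if __name__ == "__main__":
    parser = ProductParser()
    parser.feed(SAMPLE_HTML)
    for row in parser.products:
        print(row["name"], row["price"])
```

Keeping the price lookup tied to its product container is what stops names and prices from drifting out of step when one product has no price, which is the same reason we use “Relative Select” instead of a second independent selection.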
Step 3: Pagination
If we ran the scrape now, only the first page’s products would be scraped. To scrape multiple pages, or all of them, we need to use ParseHub’s pagination.
- Scroll to the bottom of the WooCommerce store until you see the nav bar.
- Click the PLUS(+) button next to the “page” selection (not to be confused with the product one from the earlier step).
- Choose “Select” and click the next button arrow to extract it.
- Rename the selection to “pagination”, expand it, and delete the two extractions.
- Click the PLUS(+) button next to your pagination selection and choose “Click”.
- Choose “Yes” as this is a next page button, and choose the number of additional pages to scrape.
- We chose 2, which means 3 pages of scraped data in total.
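The page count above can be sketched in a few lines of Python. ParseHub clicks the next button for you, so you never need to build these URLs yourself. The sketch just shows that 2 additional pages means 3 pages in total. It assumes the store uses WordPress's common `/page/N/` permalink pattern, which we have not verified for this particular store, and `page_urls` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def page_urls(first_page_url: str, additional_pages: int) -> list[str]:
    """Return the first page plus `additional_pages` follow-up page URLs.

    Assumes the common WordPress pattern <listing path>/page/<N>/,
    with any query string (such as the sort order) carried over.
    """
    parts = urlsplit(first_page_url)
    base_path = parts.path if parts.path.endswith("/") else parts.path + "/"
    urls = [first_page_url]
    # Page 1 is the URL we started from, so numbering continues at 2.
    for page in range(2, additional_pages + 2):
        paged = parts._replace(path=f"{base_path}page/{page}/")
        urls.append(urlunsplit(paged))
    return urls


if __name__ == "__main__":
    start = "https://certifiedcomic.shop/graded-comics/?orderby=price-desc"
    for url in page_urls(start, additional_pages=2):
        print(url)
```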
Step 4: Start WooCommerce Product Scraping
If you followed along, you should have successfully extracted products, their prices, perhaps some additional information on each product page, and finally set up pagination to scrape hundreds of WooCommerce products! To begin the scraping process on ParseHub’s servers, click the green “Get Data” button on the left side.
You can choose to Test, Run, or Schedule your scrape. We chose to run the scrape a single time, which got us 3 pages of data in total, as specified in the pagination step!
Here’s what ParseHub’s data export produced:
If you need further help scraping WooCommerce websites, or any website, contact our live support.
Happy Scraping! 💻