Groupon is one of the largest marketplaces for local businesses to find customers online.
Through Groupon, local businesses can directly target Groupon’s large user base with exclusive deals.
With 45.3 million active users as of Q3 2019, Groupon attracts a lot of businesses to its platform.
As a result, the product and business data on Groupon’s website can be incredibly valuable. For example, it could give you great insight into deals that your competitors are offering on the platform.
However, manually extracting data from Groupon’s site could be extremely time-consuming and inconvenient. This is where web scraping comes into play.
Groupon and Web Scraping
A web scraper will allow you to select the specific data you’d like to extract from Groupon and download it as an Excel spreadsheet or JSON file.
You could even set up your scraper to run on a schedule and export your results to Google Sheets, so that you always have access to the most recent data.
For this example, we will scrape the Greater Toronto Area Deals page on Groupon. We will extract data on deals, prices, reviews, and addresses.
To do this, we will use ParseHub, a free and powerful web scraper that can easily complete this task.
How to Scrape Groupon Data
It’s time to get scraping. First, make sure to download and install ParseHub for free.
- Open ParseHub, click on “New Project” and enter the Groupon URL you want to scrape. The page will now render in the app.
- Start by clicking on the business name of the first result on the page. It will be highlighted in green to indicate it has been selected. Click on the second business name in the list to select all the listings on the page.
- In the left sidebar, rename your selection to business.
- ParseHub is now pulling the business name and the deal URL.
- Now, click on the PLUS (+) sign next to your business selection and choose the Relative Select command.
- With the Relative Select command, click on the business name of the first result and then on its rating score. An arrow will appear to indicate the relationship between these data points.
- Rename your new command to rating.
- Expand the rating command and remove its URL extraction, since this URL is already being pulled.
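- Repeat the previous four steps (adding a Relative Select command, selecting the data point, renaming the command and removing its URL extraction) to extract more data such as number of reviews, deal price, offer percentage, offer details and business address. Your final project should then contain one Relative Select command for each of these fields under the business selection.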
Interested in scraping images from this page as well? Check out our guide on how to scrape and download images from any website.
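To make the result of the selection steps above more concrete, here is a minimal Python sketch of how the extracted data might be read once exported as JSON. It assumes the export has a top-level `business` list whose entries use keys named after your commands (`name`, `url`, `rating`, `price`); the real key names depend on how you renamed your selections, and the sample values below are invented purely for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical export: key names mirror the command names chosen in the
# steps above, and the values are made up for illustration only.
SAMPLE_EXPORT = """
{
  "business": [
    {"name": "Example Spa", "url": "https://www.groupon.com/deals/example-spa",
     "rating": "4.5", "price": "$49.00"},
    {"name": "Example Bistro", "url": "https://www.groupon.com/deals/example-bistro",
     "price": "$25.50"}
  ]
}
"""


@dataclass
class Deal:
    name: str
    url: str
    rating: Optional[float]
    price: Optional[float]


def to_number(text):
    """Turn strings such as '$49.00' or '4.5' into floats; None if missing or invalid."""
    if not text:
        return None
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def parse_deals(raw_json):
    """Build one Deal per entry in the (assumed) top-level 'business' list."""
    data = json.loads(raw_json)
    return [
        Deal(
            name=item.get("name", ""),
            url=item.get("url", ""),
            rating=to_number(item.get("rating")),
            price=to_number(item.get("price")),
        )
        for item in data.get("business", [])
    ]


deals = parse_deals(SAMPLE_EXPORT)
for deal in deals:
    print(deal)
```

Listings that lack a field, such as a deal with no rating yet, simply end up with `None` for that attribute rather than breaking the parse.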
Deal with Pagination
ParseHub is now extracting the data you have selected from every result on the first page of results. We will now instruct ParseHub to navigate to the next pages of results and extract more data.
- First, click on the PLUS (+) sign next to your page selection and choose the Select command.
- With the Select command, scroll all the way to the bottom of the page and click on the “Next Page” button. Rename your selection to next.
- Expand your next selection and remove both extract commands under it.
- Click on the PLUS (+) sign next to your next selection and choose the Click command.
- A pop-up will appear asking you if this is a next page button. Click “Yes” and enter the number of times you’d like to repeat this process. For this example, we will repeat it 5 times. Then click on the “Repeat Current Template” button.
- Lastly, select your click command and under it, tick the “Uses AJAX” option.
Groupon’s website does not load all results on a page unless the user scrolls further down the page.
As a result, we will need to tell ParseHub to scroll to the bottom of the page before it starts extracting data from the page.
- First, click on the PLUS (+) sign next to your page selection, click on Advanced and choose “Scroll”.
- Then, drag your new scroll command to the top of your project, right under the page selection, so that it runs before any data is extracted from the page.
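As a rough mental model of the pagination and scrolling logic configured in this section, the Python sketch below simulates the loop: scroll, extract, then click “Next Page” a limited number of times. It is not ParseHub code. The function name, the list-of-lists page model, and the reading of the repeat count as the number of “Next Page” clicks are all assumptions made for illustration.

```python
def scrape_with_pagination(pages, max_clicks):
    """Collect listings from the first page plus up to `max_clicks` further pages.

    `pages` is a hypothetical stand-in for the site: a list of pages, each a
    list of listing names that only become visible after scrolling down.
    """
    if not pages:
        return []

    results = []
    page_index = 0
    clicks = 0
    while True:
        # Scroll first so lazily loaded listings are present, then extract.
        visible_listings = list(pages[page_index])
        results.extend(visible_listings)

        # Stop when there is no next page or the click budget is used up.
        has_next = page_index + 1 < len(pages)
        if not has_next or clicks >= max_clicks:
            break

        # "Click" the next page button and repeat the same template.
        page_index += 1
        clicks += 1
    return results


# Made-up listings spread over three pages.
demo_pages = [["A", "B"], ["C"], ["D", "E"]]
print(scrape_with_pagination(demo_pages, max_clicks=1))
print(scrape_with_pagination(demo_pages, max_clicks=5))
```

With a click limit of 1 the loop covers the first two pages only, while a limit of 5 runs out of pages first and returns every listing.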
Running Your Scrape
You are now ready to run your scrape job. To do this, click on the green “Get Data” button on the left sidebar.
Here, you will be able to choose if you want to test your scrape run, schedule it for later or run it right away. For larger scrape jobs, we recommend that you test your run first to verify everything is working correctly. In this case, we will just run it right away.
ParseHub will now go and scrape the data you’ve selected. Once the scrape is completed, you’ll be able to download your scrape as an Excel or JSON file.
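If you download the JSON file and later want a flat table, a short script can do the conversion. The sketch below is one possible approach using only Python’s standard library. It assumes the same top-level `business` list as the project built above, and the sample records are invented for illustration.

```python
import csv
import io
import json

# Hypothetical export contents; in practice you would read the downloaded file.
SAMPLE_JSON = """
{"business": [
  {"name": "Example Spa", "rating": "4.5", "price": "$49.00"},
  {"name": "Example Bistro", "price": "$25.50"}
]}
"""


def export_to_csv(raw_json, list_key="business"):
    """Convert the list stored under `list_key` into CSV text."""
    rows = json.loads(raw_json).get(list_key, [])

    # Use the union of all keys, in first-seen order, as the column headers.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in an empty cell when a listing lacks a field.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = export_to_csv(SAMPLE_JSON)
print(csv_text)
```

Building the header from every record, rather than only the first, keeps columns from being dropped when some listings are missing a rating or review count.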
Groupon might sometimes block you from scraping their website. Your scrape jobs will come back blank when this happens.
In order to get around this, you’ll have to enable IP Rotation on ParseHub, which is a paid feature.
To enable it, click on the settings icon on the top left of your project and tick the “Enable IP Rotation” box. Then go back and run your scrape job again.
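If you process your exports with a script, it can help to detect a blank run automatically before using the data. The check below is a minimal sketch. It assumes that a blocked run shows up as an empty file, an empty object, or an empty `business` list, which may not match every case you encounter.

```python
import json


def is_blank_run(raw_json, list_key="business"):
    """Return True when the export contains no listings under `list_key`."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        # An empty or malformed file is treated as a blank run.
        return True
    if not isinstance(data, dict):
        return True
    # A missing key and an empty list both count as blank.
    return not data.get(list_key)


print(is_blank_run("{}"))
print(is_blank_run('{"business": [{"name": "Example Spa"}]}'))
```

A check like this could decide whether to re-run the job with IP Rotation enabled instead of overwriting earlier, good data with an empty result.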
And that’s all there is to it! You now know how to scrape data from Groupon’s website.
You can also repeat this process with search result pages and other listing pages.
If you run into any issues, you can contact us via chat or email and we’ll be happy to assist you.