Scraping addresses from the internet can be done for many reasons.
After all, this kind of data can be pretty valuable.
The problem is how tedious the data collection process can be, especially if you want to export the data to an Excel sheet or JSON file.
This is where web scraping can help.
Free and Easy Web Scraping
A web scraper lets you load a website, select the data you want, and automatically extract it into a new format.
ParseHub is a powerful and free web scraper that works with any website. With ParseHub, we can load up a website such as Yelp and scrape as many addresses as we’d want.
This process requires us to select the data we want and let ParseHub run the scrape for us. Quick and easy.
Scraping Addresses from Yelp
Before we get started, make sure to download and install ParseHub.
- Open ParseHub, click on New Project, and enter the URL you want to scrape. In this case, we will scrape the Yelp search results page for Coffee Shops in Toronto. Once you submit the URL, the page will be rendered inside the app.
- Click on the name of the first business on the page. It will be highlighted in Green to indicate that it has been selected. The rest of the names on the page will be highlighted in Yellow. In the left sidebar, rename your selection to business.
- Now click on the second business name on the list to select all of the business names on the page.
- Use the PLUS(+) sign next to your business selection and choose the Relative Select command.
- Using the Relative Select command, click on the name of the first business on the list and then on the business address. An arrow will appear to show the association you’re creating. Rename your new selection to address.
- You can repeat the previous step to also scrape other data such as phone number, category and more. In this case, we will keep it to just the address.
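Conceptually, the Relative Select steps above tie each address to the business name it belongs to, so every business becomes one record. A minimal Python sketch of that pairing follows. The `listings` sample data, the `Business` class, and the `pair_addresses` helper are hypothetical illustrations, not ParseHub internals or its export format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Business:
    name: str
    address: Optional[str]  # None when a listing shows no address


def pair_addresses(listings: list[dict]) -> list[Business]:
    """Attach each listing's address to its own business name."""
    return [Business(item["name"], item.get("address")) for item in listings]


# Hypothetical sample data standing in for listings on a results page.
listings = [
    {"name": "Example Cafe", "address": "123 Sample St"},
    {"name": "Placeholder Roasters"},  # no address on this listing
]

records = pair_addresses(listings)
for record in records:
    print(record.name, "->", record.address)
```

Keeping the address inside the business record, rather than in a separate list, is what prevents names and addresses from drifting out of step when a listing has no address.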
ParseHub is currently only pulling data from the first page of listings. Next up, we’ll set it up to scrape additional pages of data.
- Click on the PLUS(+) sign next to your page selection and choose the Select command.
- Scroll all the way to the bottom of the page and click on the “next page” link. Rename your selection to next.
- Use the icon beside your next selection to expand it.
- Then, delete both extract commands under your selection.
- Use the PLUS(+) sign next to your next selection and choose the Click command. A pop-up will appear asking you if this is a “next page” button. Click on “Yes” and enter the number of times you’d like to repeat this process. In this case, we will run it 5 more times.
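The Click command with a repeat count behaves like a bounded loop: scrape the current page, follow the next link, and stop after the chosen number of repeats or when no next link exists. The sketch below illustrates that logic in Python against a small in-memory stand-in for a site. `PAGES`, `scrape_pages`, and the page contents are assumptions made for illustration, not ParseHub code.

```python
# Hypothetical in-memory "site": each page has items and an optional next page.
PAGES = {
    "page1": {"items": ["A", "B"], "next": "page2"},
    "page2": {"items": ["C"], "next": "page3"},
    "page3": {"items": ["D"], "next": None},
}


def scrape_pages(start: str, max_clicks: int) -> list[str]:
    """Collect items from the start page plus up to max_clicks further pages."""
    collected: list[str] = []
    current = start
    clicks = 0
    while current is not None:
        page = PAGES[current]
        collected.extend(page["items"])
        if clicks == max_clicks:
            break  # repeat limit reached
        current = page["next"]  # None when there is no next-page link
        clicks += 1
    return collected


print(scrape_pages("page1", max_clicks=1))  # first page plus one more
print(scrape_pages("page1", max_clicks=5))  # stops early: only three pages exist
```

With a repeat count of 5, as in this tutorial, up to six pages are visited: the first page plus five clicks.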
Running Your Scrape
Now it’s time to run your new scraping project.
To do this, click on the green “Get Data” button on the left sidebar. Here you can test, schedule, or run your scraping project.
In this case, we will run it right away. ParseHub will now scrape all the data you’ve selected in the cloud. You will receive a notification once your scrape is complete.
After your scrape is complete, you will be able to download your data as an Excel sheet or JSON file.
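Once downloaded, the JSON file can be processed with a few lines of Python. The sketch below assumes the export is a single object with a `business` list whose entries carry `name` and `address` keys, mirroring the selection names used above. The key names, the `export_addresses` helper, and the file names are assumptions, so check them against your own export before relying on this.

```python
import csv
import json
import tempfile
from pathlib import Path


def export_addresses(json_path: Path, csv_path: Path) -> int:
    """Write name/address pairs from a JSON export to a CSV file.

    Returns the number of rows written.
    """
    data = json.loads(json_path.read_text(encoding="utf-8"))
    rows = data.get("business", [])  # assumed top-level key
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "address"])
        writer.writeheader()
        for row in rows:
            writer.writerow({
                "name": row.get("name", ""),
                "address": row.get("address", ""),  # blank if missing
            })
    return len(rows)


if __name__ == "__main__":
    # Hypothetical sample written to a temporary folder so the sketch
    # runs without a real export and does not touch existing files.
    sample = {"business": [{"name": "Example Cafe", "address": "123 Sample St"}]}
    with tempfile.TemporaryDirectory() as folder:
        source = Path(folder) / "results.json"
        target = Path(folder) / "addresses.csv"
        source.write_text(json.dumps(sample), encoding="utf-8")
        print(export_addresses(source, target), "row(s) written")
        print(target.read_text(encoding="utf-8"))
```

To use it on a real download, call `export_addresses` with the path to your exported JSON file and the path where the CSV should be written.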