Yellow Pages is one of the largest business directories in the world.
While the days of thick yellow books might be over, Yellow Pages’ online directory is chock-full of valuable business information.
Unfortunately, there’s no easy way to download all of that data, including business names, addresses, phone numbers and more, from Yellow Pages to an Excel spreadsheet.
That is, unless you use a web scraper to extract and download the data you want.
Today, we’ll use a free web scraper to extract data from Yellow Pages.
A Free Yellow Pages scraper
ParseHub is a free and powerful web scraper that can scrape data from any website.
Make sure to download ParseHub for free before we get started.
For today’s example, we will be scraping data from the Yellow Pages search results page for coffee shops in Los Angeles.
Now, let’s get scraping.
Scraping Yellow Pages data
It’s now time to start scraping data from Yellow Pages.
- Install and open ParseHub, click on “New Project” and enter the URL you will be scraping. In this case, we are scraping the search results page for coffee shops in Los Angeles. The page will now render inside the app.
- Make your first selection by clicking on the name of the first business on the list. It will be highlighted in green to indicate that it has been selected. The rest of the business names will be highlighted in yellow. In the left sidebar, rename your selection to “business”.
- Now click on the second business name on the page to select them all. All business names on the page will now be highlighted in green.
- ParseHub is now extracting the name and Yellow Pages link for each business on the page. Let’s extract more data. Start by clicking on the PLUS(+) sign next to your “business” selection and choose the “Relative Select” command.
- Now click on the name of the first business on the page and then on the phone number next to it. An arrow will appear to show the association you’re creating. On the left sidebar, rename your selection to “phone”.
- Repeat steps 4-5 to select and extract more data from this page. We will repeat these steps and extract the business address, number of reviews and business website. Your project should look something like this:
Scraping Detailed Data
Now, you might want ParseHub to scrape business data that you cannot find on the search results page. So let’s set up ParseHub to click on each listing on the page and extract more data. If you already have all the data you want, skip ahead to the pagination steps.
- First, click on the PLUS(+) sign next to your “business” selection and choose the “click” command.
- A pop-up will appear asking you if this is a “next page” button. Click on “No” and name your template “product_template”.
- The business page of the first business on the list will now open inside the app and a new select command will be automatically created.
- We will use this automatically created command and click on the “Email Business” button to extract the business email. In the left sidebar, rename your selection to “Email”.
- Use the PLUS(+) sign next to your page selection to add additional “Select” commands for any other data you might want to extract.
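Conceptually, the “click” command visits each listing and merges what it finds there into that business’s record. The snippet below is only a minimal sketch of that idea in Python, not how ParseHub works internally. The function names `add_details` and `fetch_details`, and the `email` field, are hypothetical and used purely for illustration.

```python
def add_details(businesses, fetch_details):
    """Return new records that combine each listing with its detail-page data.

    businesses    -- list of dicts, each with at least a "url" key
    fetch_details -- hypothetical callable: url -> dict of extra fields
    """
    enriched = []
    for business in businesses:
        details = fetch_details(business["url"])
        # Build a new dict so the original listing record is left untouched.
        enriched.append({**business, **details})
    return enriched


if __name__ == "__main__":
    # Made-up data standing in for real listings and detail pages.
    listings = [{"name": "Example Coffee Co.", "url": "listing-1"}]
    detail_pages = {"listing-1": {"email": "hello@example.com"}}
    print(add_details(listings, lambda url: detail_pages[url]))
```

Passing the detail lookup in as a function keeps the sketch independent of any particular way of loading pages.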
ParseHub is now extracting all the data you’ve selected from every business on the first page of search results. Let’s now set it up to extract data from more pages of results.
- First, use the browser tabs and the tabs on the left side of the app to go back to your main template and the search results page.
- Now click on the PLUS(+) sign next to your “page” selection and choose the select command.
- Scroll all the way down to the bottom of the page and click on the “next page” link. Rename your selection to “pagination”.
- Use the icon next to your “pagination” selection to expand it.
- Now delete both extract commands under your “pagination” selection by using the icons next to them.
- Now use the PLUS(+) command next to your “pagination” selection and choose the “click” command.
- A pop-up will appear asking you if this is a “next page” link. Click on “Yes” and enter the number of additional pages you’d like to scrape. In this case, we will scrape 4 more pages.
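If it helps to picture what the pagination setting does, here is a rough Python sketch of the same logic: scrape the first page, then follow the “next page” link a limited number of times. This is an illustration under our own assumptions rather than ParseHub’s actual implementation; `scrape_with_pagination`, `fetch_page` and the fake site are all hypothetical names.

```python
def scrape_with_pagination(start_url, fetch_page, extra_pages=4):
    """Collect records from the first page plus up to `extra_pages` more.

    fetch_page -- hypothetical callable: url -> (records, next_url or None)
    """
    results = []
    url = start_url
    pages_visited = 0
    # The first page is always scraped; extra_pages caps the clicks after it.
    while url is not None and pages_visited <= extra_pages:
        records, next_url = fetch_page(url)
        results.extend(records)
        pages_visited += 1
        url = next_url
    return results


# A made-up site with 7 pages of results, one business per page.
FAKE_SITE = {
    f"page-{i}": ([f"business-{i}"], f"page-{i + 1}" if i < 7 else None)
    for i in range(1, 8)
}


def fake_fetch(url):
    return FAKE_SITE[url]


if __name__ == "__main__":
    print(scrape_with_pagination("page-1", fake_fetch, extra_pages=4))
```

With `extra_pages=4`, the loop stops after five pages in total, even though the fake site has seven. It also stops early if a page has no “next page” link.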
Running your Scrape
It is now time to run your scrape.
To do this, click on the green “Get Data” button in the left sidebar. Here, you can test, run or schedule your scrape.
In this case, we will run it right away. ParseHub is now off to scrape the data you have selected from Yellow Pages.
Once your scrape has completed, you will be able to download your data as an Excel or JSON file.
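If you choose the JSON download and would like to turn it into a spreadsheet-friendly CSV yourself, a short Python script can do it. The sketch below assumes the export is an object with a “business” list whose entries carry the fields you selected (name, url, phone and so on). That layout is an assumption based on the selection names used in this tutorial, and the sample data is made up, so adjust the keys to match your own file.

```python
import csv
import io
import json

# Hypothetical sample shaped like the selections made in this tutorial.
SAMPLE_EXPORT = """
{
  "business": [
    {"name": "Example Coffee Co.", "url": "https://example.com/listing-1",
     "phone": "(555) 010-0001", "address": "1 Example St"},
    {"name": "Sample Roasters", "url": "https://example.com/listing-2",
     "phone": "(555) 010-0002"}
  ]
}
"""


def export_to_rows(export, list_key="business"):
    """Return (columns, rows), filling fields a record lacks with ''."""
    records = export.get(list_key, [])
    # Collect field names in first-seen order so the column order is stable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    rows = [{col: record.get(col, "") for col in columns} for record in records]
    return columns, rows


def rows_to_csv(columns, rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    columns, rows = export_to_rows(json.loads(SAMPLE_EXPORT))
    print(rows_to_csv(columns, rows))
```

To use it on a real download, replace `json.loads(SAMPLE_EXPORT)` with `json.load(open("your_file.json"))`, where the file name is whatever you saved the export as.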
If you run into any issues during your project, please reach out to us via the live chat on our site and we will be happy to assist you.