We often receive requests asking how to scrape and work with data extracted from a map. For example, consultants, analysts and marketers are often interested in scraping business or store locations from a map for market research or business planning.
Being a foodie who is relatively new to Toronto, I was interested in looking into where restaurants are located by food type. To do so, I scraped the latitude and longitude of every Toronto restaurant in the Yelp directory and plotted them on a Tableau Public map that can be filtered by the restaurant’s category.
![Map of Toronto Restaurants by Category](/blog/content/images/2018/05/MapofTorontoRestaurantsbyCategory.png)
The map allows you to select one or more categories on the right-hand sidebar to view the location of Toronto restaurants in those categories (use Shift or Ctrl/Command to select multiple). Clicking into each restaurant on the map will provide you with more information.
How did I do it?
The data is extracted from Yelp’s Toronto Restaurant Categories page. My project loops through every category on the page, clicks into each one and extracts every restaurant that appears, along with its latitude, longitude, number of reviews, rating, address, phone number, restaurant name and price range.
Exact details on how to do this are provided at the end of the article.
Creating the map
I considered several alternatives for plotting my map data, such as the Google Maps API, but ultimately decided on Tableau Public, which offers a free version of its tool. This method was surprisingly easy: I just had to import my CSV/Excel file and specify which columns were the latitude and longitude.
This article explains how to do so in more detail.
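Before importing a file into a mapping tool, it can help to check that the coordinate columns are actually numeric. The following is a minimal sketch of such a check, not part of the original project: the column names `latitude` and `longitude` and the sample rows are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical column names; use whatever your own export calls them.
LAT_COL, LNG_COL = "latitude", "longitude"


def clean_rows(rows):
    """Keep only rows whose latitude/longitude parse as in-range numbers."""
    kept = []
    for row in rows:
        try:
            lat = float(row[LAT_COL])
            lng = float(row[LNG_COL])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        if -90 <= lat <= 90 and -180 <= lng <= 180:
            kept.append({**row, LAT_COL: lat, LNG_COL: lng})
    return kept


# Made-up sample standing in for a scraped export.
sample = io.StringIO(
    "name,latitude,longitude\n"
    "Example Cafe,43.6532,-79.3832\n"
    "Missing Coords,,\n"
    "Bad Value,not-a-number,-79.4\n"
)
rows = clean_rows(csv.DictReader(sample))
print(len(rows), "usable row(s)")
```

With a real export, `io.StringIO(...)` would be replaced by an opened file, and the cleaned rows could be written back out with `csv.DictWriter` before importing.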
What did I find?
Toronto is known for its multiculturalism, with some, such as BBC Radio, referring to it as the most diverse city in the world. Over half the population was born outside of Canada, and residents represent 232 different nationalities. The 132 restaurant categories on Yelp are a reflection of the various cultures that share the city.
According to Wikipedia’s List of Neighbourhoods in Toronto, there are several neighbourhoods that correspond to different communities, such as Little India, Little Italy, Little Portugal, Little Tibet, Greektown, Chinatown, Koreatown and Little Jamaica. I was curious to see whether the majority of restaurants from these cultures were found in their respective areas, so I looked up the boundaries of each neighbourhood to compare with the restaurants on my map.
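I made this comparison by eye on the map. If you wanted to put a number on it instead, one rough approach is to count how many of a category's restaurants fall inside a rectangle drawn around the neighbourhood. The sketch below assumes a simple bounding box as a crude stand-in for the real boundary; the box and the points are placeholder values, not actual neighbourhood limits or scraped results.

```python
def share_inside(points, box):
    """Fraction of (lat, lng) points that fall inside a bounding box.

    box is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = box
    if not points:
        return 0.0
    inside = sum(
        1 for lat, lng in points
        if south <= lat <= north and west <= lng <= east
    )
    return inside / len(points)


# Placeholder values for illustration only, not a real neighbourhood boundary.
placeholder_box = (43.66, -79.33, 43.68, -79.31)
placeholder_points = [
    (43.670, -79.320),  # inside the box
    (43.665, -79.325),  # inside the box
    (43.700, -79.400),  # outside the box
]

share = share_inside(placeholder_points, placeholder_box)
print(f"{share:.0%} of the sample points are inside the box")
```

A share well above one half would suggest a category is concentrated in its neighbourhood, while a low share would suggest it is spread across the city.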
Some restaurant types can be found in most neighbourhoods:
- While Indian food is widespread throughout the city, there does appear to be a cluster of restaurants in the area See Toronto describes as Little India - on Gerrard Street East, between Coxwell Avenue and Greenwood Avenue.
- Similar to Indian food, Italian food is also widespread but it is true that there is a small cluster in the area defined as Little Italy.
Some restaurant types are indeed clustered mostly around their ethnic neighbourhood:
- The majority of Portuguese restaurants are in and around Toronto’s Little Portugal neighbourhood.
- There are not too many Tibetan restaurants in the city, but there is a cluster of quite a few at the far west of Queen Street West.
Some have clusters in different areas:
- There is a cluster of Greek restaurants in Greektown on the Danforth, but there also appear to be quite a few in the downtown Financial District.
- Chinese restaurants are widespread throughout the city. There are clusters in the downtown area around Chinatown, at the north of Yonge Street and also in the north-east in Markham.
- Korean restaurants are largely found downtown with a cluster on Bloor Street West in Koreatown and another at the north of Yonge Street.
Some didn’t appear to have any specific clusters:
- While there are plenty of Caribbean restaurants in the Little Jamaica area, there are also plenty downtown and in the east of the city. There are very few on Yonge Street and its surrounding areas outside of the downtown core.
I also noticed that some restaurants from other cultures are primarily located in specific areas:
- There is a cluster of Ethiopian restaurants on the Danforth
- There are plenty of Tapas and Spanish restaurants just west of downtown
- Running along the Humber River and in the west end of the city are plenty of South American restaurants
While clicking through the different categories, I noticed a distinct clustering of Sandwich, Cafes, Soup and Salad restaurants around the Financial District. These are likely the types of places where people eat lunch during their workdays.
The Waffle Mile
Did anyone know we have a Waffle Mile? I could not help but notice that there is a path of Waffle restaurants downtown. It’s probably more than a mile, but worth every step!
Recently there has been news about vegan restaurants in Parkdale attempting to refer to it as Vegandale, but where are all the vegan restaurants?
They do appear to be mostly spread throughout the western side of downtown although Parkdale itself does not necessarily have more vegan restaurants. That being said, the naming also comes from upcoming restaurants that do not currently appear on the map.
There are plenty of other interesting restaurant category trends you can find on the map. Play around with it and let us know in the comments if you’ve spotted any!
Details on building the project
If you want to replicate the project that I built on ParseHub, you can do the following.
Open the ParseHub client and click on New Project
Input https://www.yelp.ca/c/toronto/restaurants as the URL and click on Start project on this URL
Click on the first category (e.g. Afghan) and continue to click on categories until all are selected. You can rename the selection from selection1 to Category.
I wanted to extract the category’s code. To do so, I took the URL Extract command and used RegEx to extract that category. Rename the selection to code. I used the following RegEx:
Click on the + sign next to Begin new entry in Category and choose a Go to Template command. For the URL, we want it to go to the Yelp search page for the category code we extracted previously, using the following expression:
'https://www.yelp.ca/search?cflt=' + code + '&find_loc=Toronto%2C+ON%2C+Canada'
Set it to go to a new template named "category".
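To see what that expression produces outside of ParseHub, here is a small Python sketch that assembles the same search URL. The function name and the example code `afghani` are assumptions for illustration; the real codes are whatever your code selection extracts.

```python
from urllib.parse import urlencode


def category_search_url(code):
    """Build the Yelp search URL for one category code."""
    # urlencode escapes the comma as %2C and the spaces as +,
    # matching the hand-written find_loc value in the expression.
    query = urlencode({"cflt": code, "find_loc": "Toronto, ON, Canada"})
    return "https://www.yelp.ca/search?" + query


# "afghani" is a hypothetical category code used only as an example.
url = category_search_url("afghani")
print(url)
```

Letting `urlencode` handle the escaping avoids typing `%2C` and `+` by hand, which is convenient if you later want to search a different location.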
A category will open along with the new category template. From here, you can follow the instructions in this tutorial to scrape all of the restaurants in that category.
On the details template, we used a Select command to select the map that appears and then added two Extract commands below it, each extracting the Image URL from the drop-down. The first uses the RegEx
center=(-?[\d.]*)
for the latitude and the second uses the RegEx
center=[-?\d.]*%2C([-?\d.]*)
for the longitude.
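If you want to try the latitude and longitude patterns before adding them to your project, the sketch below applies them in Python. It assumes each character class is followed by `*` so that the whole number is captured rather than a single character, and it uses a made-up image URL on a placeholder domain; the real map image URL on the page may look different.

```python
import re

# Latitude: the number right after "center=".
LAT_RE = re.compile(r"center=(-?[\d.]*)")
# Longitude: the number after the URL-encoded comma (%2C).
LNG_RE = re.compile(r"center=[-?\d.]*%2C([-?\d.]*)")


def extract_lat_lng(image_url):
    """Return (latitude, longitude) as floats, or None if not found."""
    lat = LAT_RE.search(image_url)
    lng = LNG_RE.search(image_url)
    if not lat or not lng:
        return None
    try:
        return float(lat.group(1)), float(lng.group(1))
    except ValueError:
        return None  # "center=" was present but not followed by numbers


# Hypothetical URL for illustration only.
sample_url = "https://maps.example.com/staticmap?center=43.6532%2C-79.3832&zoom=15"
print(extract_lat_lng(sample_url))
```

The capture group in each pattern is what ParseHub keeps as the extracted value, so the first returns only the latitude and the second only the longitude.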
You should be able to easily replicate this and similar projects using ParseHub. If you have any questions while you build your own project, you can always contact us at email@example.com or book a demo call here and we would be happy to assist.
For information on how to scrape data from a store locator, check out this blog post.