Kijiji is an excellent consumer-to-consumer and business-to-consumer website! Many ads are placed on Kijiji every day and many great deals can be found on used items. You can sell a wide variety of items like:
- professional services.
- Many more
Kijiji is a great place to find real estate properties that are for rent and for sale. This information is useful for both real estate agents and buyers/renters.
We are ParseHub, and today we will show you how to web scrape a website like Kijiji. You'll be able to export the extracted data into a CSV/Excel or JSON file.
So let’s get started!
Web scraping Kijiji
For this example, we are going to scrape apartments for rent in the province of Ontario. We will extract:
- Date posted
- Listing URL
- Apartment details
If you want to follow along with the example you can use this link.
Scraping the Kijiji Results Page
1. Once ParseHub is downloaded and installed, open the app, click on the green “New Project” button and paste the URL from the Kijiji results page. The page will now load inside the app.
2. Once the website is rendered, a selection function will automatically be created. If not, you can click on the plus sign next to the page selection. Click on the first headline on the page. The headline you’ve clicked will become green to indicate that it’s been selected.
3. ParseHub will now suggest the other elements you want to extract in yellow. Click on the second headline on the page. All of the items that were highlighted in yellow will now turn green to show that they are selected.
4. On the left sidebar, rename your selection to “Listing”. You will notice that ParseHub is now extracting the headline and URL for each listing.
5. On the left sidebar, click the PLUS(+) sign next to the listing selection and choose the Relative Select command.
6. Using the Relative Select command, click on the first headline of the listing on the page that is highlighted in orange and then on the price. You will see an arrow connect the two selections.
7. Repeat steps 5-6 to also extract the date, location, number of rooms and city. Make sure to rename your new selections accordingly.
We have now selected all the data we wanted to scrape from the results page. Your project should now look like this:
Scraping more data from each Kijiji listing
Let’s scrape more data from each listing! We will tell ParseHub to click on each listing we’ve selected and extract additional data from each page. In this case, we will extract:
- Address/ map URL
- Apartment Amenities
First, on the left sidebar, click on the 3 dots next to the main_template text and click on “Rename template”.
Rename your template to “results_page” or anything you see fit. Templates help ParseHub keep different page layouts separate, and will help you organize your project.
1. Now use the PLUS(+) button next to the listing selection and choose the “Click” command. A pop-up will appear asking you if this link is a “next page” button. Click “No” and, next to Create New Template, enter a new template name. In this case, we will use listing_page.
2. ParseHub will now automatically create this new template and render the first property listing on the results page
3. Click on the “View Map” link. ParseHub will now extract the URL of the map. Rename your selection to map_URL.
4. Click on the PLUS(+) sign next to the page selection and choose the Select command.
5. While using the new select command, click on the address of the apartment. Rename your selection to “address” or anything you see fit.
6. Now let’s extract the apartment amenities! Click on the PLUS(+) sign next to the page selection and choose the Select command, then click on one of the amenities under the heading. Once you’ve selected one, click on the next amenity highlighted in yellow to extract all of them.
7. Rename your selection to amenities.
Your listing_page template should look like this:
Dealing with pagination (optional)
We can add pagination to this project depending on how many listings you want to scrape. Since Kijiji provides multiple pages of apartments for rent, let’s show you how you can scrape multiple pages.
Let’s set up ParseHub to navigate to the next results pages.
1. On the left sidebar, return to the results_page template. You might also need to change the browser tab to the search results page.
2. Click on the PLUS(+) sign next to the page selection and choose the Select command.
3. Then select the Next page link at the bottom of the Kijiji website. Rename the selection to next_button.
4. By default, ParseHub will extract the text and URL from this link, so expand your new next_button selection and remove these 2 commands.
5. Now, click on the PLUS (+) sign of your next_button selection and use the Click command.
6. A pop-up will appear asking if this is a “Next” link. Click “Yes”, enter the number of additional pages you’d like to navigate to (in this case, 4), then press “Repeat Current Template”.
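To picture what the “Next” link and “Repeat Current Template” settings amount to, here is a minimal Python sketch of the same idea: scrape the first page, then follow the next link a limited number of times. It is only an illustration, not how ParseHub works internally. The function `scrape_with_pagination`, the `fetch` callback and the `FAKE_SITE` data are all hypothetical names made up for this example.

```python
from typing import Callable, Optional


def scrape_with_pagination(
    first_page: str,
    fetch: Callable[[str], dict],
    extra_pages: int,
) -> list[str]:
    """Collect listings from the first page plus up to `extra_pages` more."""
    listings: list[str] = []
    current: Optional[str] = first_page
    pages_followed = 0
    while current is not None:
        page = fetch(current)
        listings.extend(page["listings"])
        # Stop once the requested number of additional pages is reached.
        if pages_followed >= extra_pages:
            break
        # Follow the "Next" link; None means there are no more pages.
        current = page["next"]
        pages_followed += 1
    return listings


# Made-up pages for illustration only (not real Kijiji data).
FAKE_SITE = {
    "p1": {"listings": ["A", "B"], "next": "p2"},
    "p2": {"listings": ["C"], "next": "p3"},
    "p3": {"listings": ["D"], "next": None},
}

print(scrape_with_pagination("p1", FAKE_SITE.__getitem__, extra_pages=1))
```

Asking for more pages than exist is harmless in this sketch: the loop simply ends when there is no next link left to follow.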
Your final project should look something like this:
Running and Exporting your Project
Now our project is ready to scrape Kijiji. To run it, click on the green “Get Data” button on the left sidebar.
You’ll be brought to this page:
This is where you can test, run or schedule your project. For longer and bigger projects, we recommend doing a Test Run just to make sure your data will be extracted and formatted correctly.
For this project, click on the “Run” button to begin your scrape.
Once ParseHub is done scraping the website, you will be notified by email and you’ll be able to download your extracted data as an Excel/CSV or as a JSON file.
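If you download the JSON file and want to work with it in your own scripts, a small amount of Python can flatten it into spreadsheet-style rows. The sketch below is written under assumptions: it assumes the export has a top-level key named after the “Listing” selection, and that each listing carries fields such as `name`, `url`, `price`, `address` and a nested `amenities` list. Your actual field names will follow whatever you named your selections, so check your own file first. The sample data is made up.

```python
import csv
import io
import json

# Made-up sample in an ASSUMED export layout: a top-level key named after
# the "Listing" selection, holding one object per listing.
SAMPLE_EXPORT = """
{
  "Listing": [
    {
      "name": "Bright 2 bedroom near transit",
      "url": "https://example.com/listing/1",
      "price": "$1,850.00",
      "address": "123 Example St, Toronto, ON",
      "amenities": [{"name": "Laundry"}, {"name": "Balcony"}]
    },
    {
      "name": "Studio downtown",
      "url": "https://example.com/listing/2",
      "price": "Please Contact"
    }
  ]
}
"""

# Hypothetical column names; adjust them to match your own selections.
FIELDS = ["name", "url", "price", "address", "amenities"]


def flatten(listing: dict) -> dict:
    """Turn one nested listing into a flat row; missing fields become ''."""
    row = {field: listing.get(field, "") for field in FIELDS}
    # Join the nested amenities list into a single cell.
    amenities = listing.get("amenities", [])
    row["amenities"] = "; ".join(a.get("name", "") for a in amenities)
    return row


def export_to_csv(raw_json: str) -> str:
    """Convert the assumed JSON layout into CSV text with a header row."""
    data = json.loads(raw_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for listing in data.get("Listing", []):
        writer.writerow(flatten(listing))
    return buffer.getvalue()


print(export_to_csv(SAMPLE_EXPORT))
```

Using `dict.get` with a default keeps the script from failing on listings that lack a field, which is common when some ads omit an address or amenities.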
Now you know how to extract data from Kijiji and export it into a CSV/Excel or JSON file. Many items are listed for sale on Kijiji every day, and you can use a web scraping tool like ParseHub to build lists like this one for:
- price comparison
- industry insights
- List of leads
You can recreate this list for cars as well!
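As a rough illustration of the price comparison use case, the Python sketch below computes the median asking price per city from scraped rows. The `city` and `price` field names, the price format and the sample rows are assumptions made for this example. Listings without a numeric price are skipped.

```python
import re
from collections import defaultdict
from statistics import median
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Return the first number found in `text`, or None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def median_price_by_city(rows: list[dict]) -> dict[str, float]:
    """Group numeric prices by city and take the median of each group."""
    prices = defaultdict(list)
    for row in rows:
        price = parse_price(row.get("price", ""))
        if price is not None:
            prices[row.get("city", "unknown")].append(price)
    return {city: median(values) for city, values in prices.items()}


# Made-up rows for illustration only.
ROWS = [
    {"city": "Toronto", "price": "$2,400.00"},
    {"city": "Toronto", "price": "$1,900.00"},
    {"city": "Ottawa", "price": "$1,650.00"},
    {"city": "Ottawa", "price": "Please Contact"},
]

print(median_price_by_city(ROWS))
```

The median is used here instead of the mean because a few unusually expensive or mispriced ads would otherwise skew the comparison.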
We understand that projects can get quite complex. If you need any help you can contact our customer support team using our live chat. We will be more than happy to assist you!
What will you scrape?