Indeed is one of the most popular job posting websites.
In 2010, they passed Monster.com to become the highest-traffic job listing site in the United States.
As a result, their job listing database contains tons of valuable information. Web scraping can help you uncover this value.
Indeed and Web Scraping
With the help of a web scraper, you’ll be able to extract data related to job listings such as companies that are hiring, roles in demand, average salaries and more.
Are some cities hiring more for specific professions? What are the average salaries per industry in each city?
This analysis becomes easy to conduct with the help of web scrapers and Indeed’s data.
To do this, we will use ParseHub, a free web scraper that can easily tackle this task.
How to Scrape Indeed Data
Now, it’s time to get into the nitty-gritty of things. Here’s how to scrape data from Indeed.
- Make sure to download and install ParseHub for free. Boot it up and click on New Project.
- Next, enter the URL of the job listings page you want to scrape. For this example, we will scrape the results page for “Operations Manager” jobs in Toronto. Once you enter the URL, the site will load inside the app.
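If you plan to scrape several job titles or cities, it can help to generate the results page URLs rather than copy them by hand. The snippet below is a minimal sketch: the base URL and the `q` and `l` query parameter names are assumptions about how Indeed's search URLs are built, so compare the output with the URL shown in your browser before pasting it into ParseHub.

```python
from urllib.parse import urlencode

# Assumed base URL for Indeed's Canadian site; adjust it for your country.
BASE_URL = "https://ca.indeed.com/jobs"


def build_search_url(query: str, location: str) -> str:
    """Return a search results URL for a job title and a location."""
    # "q" (job title) and "l" (location) are assumed parameter names.
    return f"{BASE_URL}?{urlencode({'q': query, 'l': location})}"


search_url = build_search_url("Operations Manager", "Toronto")
print(search_url)
```

The helper name `build_search_url` is only illustrative and is not part of ParseHub.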
Now it’s time to set up your web scraping project in ParseHub.
- Start by clicking on the title of the first job listing on the page. It will be highlighted in green to indicate that it has been selected.
- The rest of the titles will be highlighted in yellow. Click on the second one on the list to select them all. In the left sidebar, rename your selection to listing.
- Now, click on the PLUS (+) sign next to your listing selection and choose the “Relative Select” command.
- Using the “Relative Select” command, click on the title of the first listing on the page. Then, click on the name of the company under it. An arrow will appear to show the association you’re creating. You might have to repeat this step with the second listing as well to fully train the scraper. In the left sidebar, rename your selection to company.
- Repeat the previous two steps (adding a “Relative Select” command, then clicking from the listing title to the target element) to also pull the company’s location, the listing’s salary and the listing’s rating. Rename your selections accordingly.
Scraping more listing details
You might want to scrape even more data from each listing that is only accessible once you visit the listing page.
We will now set up ParseHub to click on each listing and extract more data.
- Click on the PLUS (+) sign next to your listing selection and choose the “click” command.
- A pop-up will appear asking you if this is a “next page” link. Click on No and rename your new template to “listing_page”.
- You can now use the select command that is auto-generated to select the data you’d like to scrape from this page. In this case, we will select the job listing description. Rename your selections accordingly.
- If you want to extract additional data, just make sure to create additional “select” commands.
Dealing with Pagination
ParseHub is currently only extracting the details we’ve selected from the first page of search results. We will now set it up to extract data from further pages of search results.
- In the left-most tabs in ParseHub, go back to main_template. At the top of ParseHub, click back to the browser tab for the search results page.
- Click on the PLUS (+) sign next to your page selection, choose the select command and click on the “next page” link at the bottom of the page. Rename your selection to next.
- Click on the Expand icon next to your next command.
- Delete both extractions below the next command.
- Click on the PLUS (+) sign next to the next command and choose the “click” command.
- A pop-up will appear asking you if this is a “next page” link. Click on Yes and enter the number of times you’d like to repeat your scrape. To scrape 5 pages, we will need 4 repetitions.
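The repetition count is always one less than the number of pages you want, because the first page of results is already loaded by the project's starting URL. As a quick illustration (the function name is hypothetical and not part of ParseHub):

```python
def repetitions_needed(total_pages: int) -> int:
    """Number of "next page" clicks needed to cover total_pages result pages."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    # The first page comes from the starting URL, so it needs no click.
    return total_pages - 1


print(repetitions_needed(5))  # prints 4
```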
Running Your Scrape
You can now run your web scraping project and download the data you have selected.
To do this, click on the green “Get Data” button in the left sidebar. Here you can choose to Test, Run or Schedule your scrape project.
For larger projects, we recommend running a test scrape first. In this case, we will run it right away.
Once your run has finished, you will be able to download the data as a CSV or JSON file.
You will now have access to all the employment data you requested.
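Once you have the export, questions such as the average salary per city can be answered with a few lines of Python. The sketch below rests on several assumptions: that the JSON export holds a top-level `listing` array with `location` and `salary` fields named after the selections in this project, and that salaries appear as text such as "$60,000 - $80,000 a year". The sample records are made up for illustration, so check the field names and salary format in your own file.

```python
import json
import re
from collections import defaultdict
from statistics import mean

# Hypothetical sample shaped like the export this project might produce.
SAMPLE_EXPORT = """
{"listing": [
  {"name": "Operations Manager", "company": "Acme Corp",
   "location": "Toronto, ON", "salary": "$60,000 - $80,000 a year"},
  {"name": "Operations Manager", "company": "Globex",
   "location": "Toronto, ON", "salary": "$90,000 a year"},
  {"name": "Senior Operations Manager", "company": "Initech",
   "location": "Mississauga, ON"}
]}
"""


def salary_midpoint(text):
    """Return the midpoint of the dollar amounts in a salary string, or None."""
    found = re.findall(r"\$(\d[\d,]*(?:\.\d+)?)", text or "")
    amounts = [float(value.replace(",", "")) for value in found]
    return mean(amounts) if amounts else None


def average_salary_by_location(listings):
    """Average the salary midpoints of the listings for each location."""
    by_location = defaultdict(list)
    for item in listings:
        midpoint = salary_midpoint(item.get("salary"))
        # Listings without a posted salary are skipped.
        if midpoint is not None:
            by_location[item.get("location", "unknown")].append(midpoint)
    return {place: mean(values) for place, values in by_location.items()}


listings = json.loads(SAMPLE_EXPORT)["listing"]
averages = average_salary_by_location(listings)
print(averages)
```

To use your own data, replace `SAMPLE_EXPORT` with the contents of the downloaded JSON file. This sketch treats every amount the same way, so hourly and yearly salaries should be separated or converted before averaging.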
If you run into any issues while setting up your project, make sure to reach out to us via the live chat on our site. We’ll be happy to assist you through your project.