Downloading lots of images from a website can be quite time-consuming.
Right-click, Save Image As…, repeat ad nauseam.
In these cases, web scraping is the solution to your problem. In this tutorial, we will go over how to extract the URL for every image on a webpage using a web scraper.
We will also go over how to use this extracted list to quickly download all the images to your computer.
ParseHub and Web Scraping
In order to complete this simple task, you’ll need a web scraper that can collect the URLs in question. ParseHub is a free and incredibly powerful web scraper, the perfect candidate for this task.
Make sure to download and install ParseHub before getting started.
Scraping Image URLs
For this example, we will assume that we are interested in downloading every image from the first 5 pages of results on Amazon.ca for “wireless earbuds”. This information could be valuable for competitor analysis.
- After downloading ParseHub, make sure you have it up and running on your computer.
- Get the specific URL of the page we will be scraping.
Creating a Project
- In ParseHub, click on “New Project” and enter the URL from the Amazon website that we will be scraping.
- The webpage will now render in ParseHub and you will be able to choose the images you want to scrape.
Select Images to Scrape
- Begin by selecting the first image from the search results. It will then turn green, meaning it has been selected to be scraped.
- The rest of the images on the search results page will then turn yellow. Click on the second image to select all the images on the page. They will all turn green, which means they have been selected to be extracted.
- Since these images also act as links to the product pages, ParseHub is extracting both the image URL and the link it is pointing to (product page). As a result, we will delete the URL selection from the left sidebar and only keep the image selection.
- Now ParseHub will scrape every image URL for the first page of results.
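For readers curious about what "extracting image URLs" amounts to, here is a minimal Python sketch using only the standard library. It is not how ParseHub works internally; it simply collects the src attribute of every img tag in a piece of HTML. The sample markup and the example.com addresses are made up for illustration and are not real Amazon HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageURLCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative paths such as /images/a.jpg into absolute URLs.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return the absolute URL of every image found in the HTML string."""
    collector = ImageURLCollector(base_url)
    collector.feed(html)
    return collector.image_urls


# Hypothetical sample markup: each image is wrapped in a product link,
# but only the image URL is kept.
sample_html = """
<div><a href="/product/1"><img src="/images/earbuds-1.jpg" alt="Earbuds 1"></a></div>
<div><a href="/product/2"><img src="https://cdn.example.com/earbuds-2.jpg"></a></div>
"""
print(extract_image_urls(sample_html, "https://shop.example.com/search"))
```

As in the ParseHub project above, the link around each image is ignored and only the image address is collected.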
Now we need to tell ParseHub to extract the same information from the next 5 pages of search results.
- Click on the PLUS(+) sign next to the page selection and use the select command.
- Then click on the “Next” button at the bottom of the search results page.
- By default, ParseHub will extract the link from the Next button. So we will click on the icon next to the “Next” selection and remove the two items under it.
- We’ll then use the PLUS(+) sign next to the “next” selection and use the “click” command.
- A window will pop up asking if this is a Next Page link. Click “Yes” and enter the number of times you’d like this cycle to repeat. For this example, we will do it 5 times.
Scrape and Export Data
Now comes the fun part: we will let ParseHub run and extract the list of URLs for every image we have selected.
- Click on the “Get Data” button on the left sidebar.
- Here you can select when to run your scrape. Although we always advise testing your scrape runs before running a full scrape, we’ll just run the scrape right now for this example.
- Now ParseHub will scrape the image URLs you’ve selected. You can either wait on this screen or leave ParseHub; you will be notified once your scrape is complete. This process took less than 1 minute in this case.
- Once your data is ready to download, click on the CSV/Excel button. Now you can save and rename your file.
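If you want to pull the URLs out of the exported file with a script rather than by copying the column by hand, a short Python sketch could look like the following. The column names in your export depend on how you named your selections, so this sketch does not assume a header; it keeps every cell that looks like an http(s) URL and drops duplicates. The in-memory sample and its image_url header are made up for illustration.

```python
import csv
import io


def urls_from_csv(csv_file):
    """Return every cell that looks like an http(s) URL, in order, without duplicates."""
    seen = set()
    urls = []
    for row in csv.reader(csv_file):
        for cell in row:
            value = cell.strip()
            if value.startswith(("http://", "https://")) and value not in seen:
                seen.add(value)
                urls.append(value)
    return urls


# In-memory stand-in for the exported file; the header name is hypothetical.
sample = io.StringIO(
    "image_url\n"
    "https://cdn.example.com/a.jpg\n"
    "https://cdn.example.com/b.jpg\n"
    "https://cdn.example.com/a.jpg\n"
)
print(urls_from_csv(sample))
```

To run this against your own export, pass an opened file instead of the sample, for example open("my_export.csv", newline="", encoding="utf-8"), where my_export.csv is a placeholder for whatever name you saved the file under.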
Download Images to your Device
Now that we have a list of all the URLs for every image, we will go ahead and download them to our device with one simple tool.
For this, we will use the Tab Save Chrome extension.
Once it is installed in your browser, open the extension by clicking on its icon. Then click on the edit button at the bottom left and enter the URLs we just extracted.
When you click on the download icon at the bottom right of the extension window, all images will automatically be downloaded to your device. This might take a couple of seconds depending on how many images you are downloading.
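If you would rather script this step than use a browser extension, the same idea can be sketched in Python with the standard library. This is an illustrative alternative, not part of Tab Save or ParseHub. The function names, the numbered file naming scheme, and the example.com address are assumptions made for the example, and the demonstration at the end uses a stand-in fetch function so that it runs without touching the network.

```python
import os
import tempfile
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_bytes(url):
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def filename_for(url, index):
    """Build a numbered local file name from the last path segment of the URL."""
    name = os.path.basename(urlparse(url).path) or "image"
    return f"{index:04d}_{name}"


def download_all(urls, folder, fetch=fetch_bytes):
    """Save every URL into folder; return the paths written, skipping failures."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        try:
            data = fetch(url)
        except OSError as error:  # urllib's URLError is a subclass of OSError
            print(f"skipped {url}: {error}")
            continue
        path = os.path.join(folder, filename_for(url, index))
        with open(path, "wb") as handle:
            handle.write(data)
        saved.append(path)
    return saved


# Offline demonstration with a stand-in fetch function; drop the fetch
# argument to download over the network instead.
demo_folder = tempfile.mkdtemp()
demo_paths = download_all(
    ["https://cdn.example.com/earbuds-1.jpg"],
    demo_folder,
    fetch=lambda url: b"not a real image",
)
print(demo_paths)
```

The index prefix keeps files from overwriting each other when two URLs end in the same file name. A failed download is reported and skipped, so one bad URL does not stop the rest of the batch.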
By following every step in this guide, you will end up with a folder of all the images you needed to download. In this case, we downloaded over 330 images from Amazon in less than 5 minutes.
Now, if you’ll excuse me, I’ve got to go and delete all these images from my hard drive.