If you’re an online shopper, you have probably heard of Shein! Since 2021, it has been the most popular shopping app on iOS and Android and has generated over $10 billion in revenue annually. The company has over 100,000 employees and 44 million users per month. Their website hosts a large number of products, from men’s and women’s wear to accessories and much more. All these products can be scraped easily with ParseHub, a free, visual, no-code web scraping tool.
In order to follow along, download ParseHub and register for free.
Let’s start scraping!
- To begin, download ParseHub and open it on your PC, Mac or Linux computer.
- Click the “New Project” button to create a new project.
- Enter the URL you wish to scrape. We will use this URL to scrape popular men’s wear on Shein: https://ca.shein.com/hotsale/Men-Top-Rated-Clothing-sc-00331083.html?ici=ca_tab04navbar04menu03&scici=navbar_MenHomePage~~tab04navbar04menu03~~4_3~~itemPicking_00331083~~~~0&src_module=topcat&src_tab_page_id=page_home1664215983546&src_identifier=fc%3DMen%60sc%3DCLOTHING%60tc%3DTOP%20RATED%60oc%3D0%60ps%3Dtab04navbar04menu03%60jc%3DitemPicking_00331083&srctype=category&userpath=category-CLOTHING-TOP-RATED
- Once the page has loaded, click the first product’s name to extract it.
- The rest of the products will turn yellow. Click the next product’s name to train the algorithm.
- Rename this extraction on the left to “product”.
- All 120 products on the first page will now be extracted!
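The start URL above is long mostly because of its query string. If you are curious what it contains, here is a minimal Python sketch that splits it apart using only the standard library. The helper names `query_params` and `strip_query` are our own, and whether Shein serves the same listing without the query string is an assumption you would need to check in your browser before using the shorter URL in ParseHub.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

# The start URL from the step above, split over several lines for readability.
START_URL = (
    "https://ca.shein.com/hotsale/Men-Top-Rated-Clothing-sc-00331083.html"
    "?ici=ca_tab04navbar04menu03"
    "&scici=navbar_MenHomePage~~tab04navbar04menu03~~4_3~~itemPicking_00331083~~~~0"
    "&src_module=topcat"
    "&src_tab_page_id=page_home1664215983546"
    "&src_identifier=fc%3DMen%60sc%3DCLOTHING%60tc%3DTOP%20RATED%60oc%3D0%60ps%3Dtab04navbar04menu03%60jc%3DitemPicking_00331083"
    "&srctype=category"
    "&userpath=category-CLOTHING-TOP-RATED"
)


def query_params(url):
    """Return the query string as a dict of parameter name -> list of values."""
    return parse_qs(urlsplit(url).query)


def strip_query(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    for name, values in query_params(START_URL).items():
        print(name, "=", values[0])
    print(strip_query(START_URL))
```

Either form of the URL can be pasted into ParseHub. The sketch only helps you see which part identifies the category page and which parts are extra parameters.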
ParseHub’s Relative Select tool is used to extract relative data, such as prices, from each product. Here’s how:
- Begin by clicking the PLUS(+) button next to the “product” selection.
- Click “Relative Select” and once again, click the first product’s name.
- An arrow will appear. Point it at the respective product’s price and click to connect it.
- All 120 prices should now be extracted!
- Rename this selection to “price” on the left.
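Because the price is selected relative to each product, every product in the export carries its own price. The Python sketch below illustrates that pairing and shows one way to turn the scraped price text into a number. The export shape (a "product" list whose items have "name" and "price" fields), the sample items and the helper names `parse_price` and `to_rows` are all assumptions made for illustration. Check them against your own export.

```python
import re
from decimal import Decimal

# Hypothetical export in the shape we assume for this project:
# one entry per "product", each with its related "price" text.
sample_export = {
    "product": [
        {"name": "Example Tee", "price": "CA$12.00"},
        {"name": "Example Hoodie", "price": "CA$1,234.50"},
    ]
}


def parse_price(text):
    """Pull the first number out of a price string, or return None if absent."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to a Decimal.
    return Decimal(match.group().replace(",", ""))


def to_rows(export):
    """Return (name, numeric price) pairs, one per scraped product."""
    return [
        (item.get("name"), parse_price(item.get("price", "")))
        for item in export.get("product", [])
    ]


if __name__ == "__main__":
    for name, price in to_rows(sample_export):
        print(name, price)
```

We use `Decimal` rather than `float` here so that prices keep their exact cents when you sum or compare them later.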
Scraping Multiple Pages
Using ParseHub’s pagination, we can scrape more than a single page, which will get us more than 120 rows of data in our export!
- To scrape multiple pages, scroll all the way down the page until you see the page navigation bar.
- Click the PLUS(+) button next to the “page” selection on the left pane, and choose “Select”.
- Click the next arrow link to extract it, and rename this selection to “pagination”.
- Expand the selection and delete the extraction, as it adds an unnecessary column to your data.
- Now click the PLUS(+) button next to your pagination selection and choose “Click”.
- Choose “Yes” when the popup appears, as this is a next page button.
- You can now choose the number of additional pages you wish to scrape. We will choose 2, which means 3 pages of products in total!
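The page arithmetic is easy to get off by one, so here it is as a tiny Python sketch. The figure of 120 products per page comes from this tutorial. The function names are our own, and the row count is an upper bound, since a final page may hold fewer products.

```python
PRODUCTS_PER_PAGE = 120  # products per listing page, as seen in this tutorial


def total_pages(additional_pages):
    """The first page is always scraped, plus each additional page clicked."""
    return 1 + additional_pages


def max_rows(additional_pages, per_page=PRODUCTS_PER_PAGE):
    """Upper bound on exported rows for a given number of additional pages."""
    return total_pages(additional_pages) * per_page


if __name__ == "__main__":
    print(total_pages(2))  # 3 pages in total
    print(max_rows(2))     # at most 360 rows
```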
Starting Your Scrape
Now that you have your product selections, relative price selections and pagination set up, you are ready to begin scraping!
Begin by clicking the “Get Data” button on the left side of ParseHub. You can choose to test, run or schedule your scrape. In our example, we chose “Run”, which will scrape 3 pages of data in total.
Your data should look similar to this if you followed our guide correctly:
Note: If you are scraping a large number of pages or a large amount of data, you may need to enable ParseHub’s IP Rotation.
We hope you enjoyed this tutorial on scraping Shein. We also have one on scraping ASOS products, another well-known online fashion retailer.
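Once the run has finished and you have downloaded the results as a CSV file, a quick sanity check can be done in a few lines of Python. This is only a sketch: the column names `product_name` and `product_price` and the sample rows are hypothetical, so replace them with the headers from your own file. It loads the rows and drops exact duplicates, which can appear if the same product shows up on more than one page.

```python
import csv
import io

# Hypothetical CSV export; your real file will have your own headers and rows.
SAMPLE_CSV = """product_name,product_price
Example Tee,CA$12.00
Example Tee,CA$12.00
Example Hoodie,CA$30.00
"""


def load_rows(csv_file):
    """Read every CSV row into a dict keyed by column header."""
    return list(csv.DictReader(csv_file))


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    rows = load_rows(io.StringIO(SAMPLE_CSV))
    unique_rows = drop_duplicates(rows)
    print(len(rows), "rows,", len(unique_rows), "unique")
```

To use it on a real export, open your downloaded file with `open(path, newline="", encoding="utf-8")` and pass the file object to `load_rows` instead of the sample string.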
Happy Scraping! 🛍️