Python Tutorial 21: Amazon Best Selling Product Scraper – Awesome Approaches to Find Niche Products, Monitor Competitors and Identify Potential Clients

Amazon, the global eCommerce giant, serves as a lighthouse of product trends for millions of sellers and hundreds of brands worldwide. Most of them keep an eye on which products are selling well and which trends are gaining momentum on Amazon. So where do you find those trends? The Amazon Best Sellers pages for products and brands are the place you should drop by regularly. The better question is how to do this more efficiently, in an automated way. If that is what you are asking, this piece can help.


Amazon Best Sellers lists are updated hourly, every day, based on each product's sales performance and customer ratings. It is also worth looking carefully at who is selling those products. The top-selling merchants stay largely the same from hour to hour, although their positions may vary; only occasionally does a new seller stand out by selling a popular new niche SKU.

That said, it is neither realistic nor necessary to manually monitor every category and SKU every hour. What's more, the Amazon Best Sellers page interface is not a friendly way to digest that performance in a data format. So in this article, I will walk you through how to create an Amazon best selling product scraper using Python. By the end of this piece, you will know how to code a scraper that fetches your competitors' up-to-date product performance and pricing, finds potential customers, and identifies potential products. You can then set up Crontab to automate the process and refresh your dashboard.

Amazon Best Selling Product Ranking Pages by Category

Amazon has hundreds of categories and subcategories in its Best Sellers pages. Your brand and business will not relate to every category and product, but you will want to know which pages are most worth checking. That information is valuable for exploring new niche products, monitoring competitors' SKUs, and spotting potential clients.


Amazon ranks best selling products by department. Each department has categories, and each category in turn has subcategories and sub-subcategories of its own. Take Amazon Devices & Accessories as an example, as listed below. It looks like an onion, which can be peeled layer by layer.

https://www.amazon.com/Best-Sellers/zgbs/amazon-devices
https://www.amazon.com/Best-Sellers-Amazon-Device-Accessories/zgbs/amazon-devices/370783011
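Each Best Sellers URL ends with a slug and, for subcategory pages, a numeric browse-node ID (370783011 above). If you later want to key your scraped rows by node, a quick parse of the URL path recovers it; the helper name `browse_node` below is my own, not part of the tutorial script:

```python
def browse_node(url: str) -> str:
    """Return the last path segment of a Best Sellers URL; for
    subcategory pages this is the numeric browse-node ID."""
    return url.rstrip('/').split('/')[-1]

print(browse_node(
    'https://www.amazon.com/Best-Sellers-Amazon-Device-Accessories'
    '/zgbs/amazon-devices/370783011'
))  # prints 370783011
```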

So, first things first, we can fetch all the department URLs starting from the root Best Sellers page:

https://www.amazon.com/Best-Sellers/zgbs


The left-hand side menu on every category page uses the same HTML naming, so one set of selectors works across departments. I assume you have already fetched all the department URLs using the root URL above. Below is the full list of department URLs.

If you would like to scrape the categories under each department, you need to create a loop using Selenium, gspread, and BeautifulSoup.

First things first, you need a browser-simulation driver to open the Amazon Best Sellers page, because Amazon blocks BeautifulSoup from fetching the page HTML directly.

## Selenium ##
from selenium import webdriver

amazonSERP = []  # will collect one dict per scraped row

# Note: in Selenium 4+, executable_path is deprecated in favour of a
# Service object; the path below is a placeholder for your own setup.
driver = webdriver.Chrome(executable_path='your chrome driver local path')

Then, you can create a table in Google Sheets and paste all the department URLs into a column. You can use gspread to read those URLs.

## gspread ##
import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://www.googleapis.com/auth/spreadsheets',
         'https://www.googleapis.com/auth/drive.file',
         'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name(
    'your google api credential account json file.json', scope)
client = gspread.authorize(creds)

sh = client.open('AmazonPriceTracker')
worksheet = sh.get_worksheet(5)
values_list = worksheet.col_values(3)

Last but not least, the category URLs on the Amazon Best Sellers pages sit under the element with id zg_browseRoot, with each category wrapped in <li></li> tags. Thus, below is the looping scraper code:

## Fetch using beautifulsoup ##
import time

import pandas as pd
from bs4 import BeautifulSoup

for url in values_list:
    driver.get(url)
    time.sleep(5)  # give the page time to render before parsing
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    results = soup.find('ul', {'id': 'zg_browseRoot'})
    category_li = results.find_all('li')

    # Skip the first <li>, which is the department itself, not a category.
    for tag in category_li[1:]:
        title = tag.text.strip()
        category_url = tag.find('a')['href']
        element_info = {"Category": title, "URL": category_url}
        amazonSERP.append(element_info)

df = pd.DataFrame(amazonSERP)

Fetch Amazon Best Selling Product Data by Category

Once you have a full list of category URLs on hand, you can use the specific category URL to set up the scraper.

First things first, each category's best sellers span two pages: one for the top 50, and another for 51–100. You can append the parameter ?pg= to the category URL.
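As a quick sketch, the two page URLs for a category can be generated like this (using the lawn-and-garden category URL from the snippet further below):

```python
base = ('https://www.amazon.com/Best-Sellers-Garden-Outdoor-Cooking'
        '/zgbs/lawn-garden/553760')

# Page 1 covers ranks 1-50, page 2 covers ranks 51-100.
page_urls = [base + '?pg=' + str(page) for page in range(1, 3)]
for u in page_urls:
    print(u)
```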

What's more, each product block carries the class aok-inline-block zg-item. So we can use Selenium and BeautifulSoup to find all the product blocks in the best selling dataset.

## Top 100 product data blocks ##
for x in range(1, 3):
    driver.get('https://www.amazon.com/Best-Sellers-Garden-Outdoor-Cooking'
               '/zgbs/lawn-garden/553760' + '?pg=' + str(x))
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    results = soup.find_all('span', {'class': 'aok-inline-block zg-item'})

Then, we need another loop to fetch the specific data points from each product, such as the title, product URL, review count, star rating, and price.

for tag in results:
    title = tag.find('div', {'class': 'p13n-sc-truncated'}).text.strip()
    url = tag.find('a', {'class': 'a-link-normal a-text-normal'})['href']
    review = tag.find('a', {'class': 'a-size-small a-link-normal'}).text.strip()
    Stars = tag.find('span', {'class': 'a-icon-alt'}).text.strip()
    price = tag.find('span', {'class': 'a-size-base a-color-price'}).text.replace("$", "").strip()
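One caveat worth knowing: some listings have no price or no reviews yet, in which case find() returns None and .text raises AttributeError. A small defensive helper (the name `safe_text` is my own, not from the tutorial script) keeps the loop alive:

```python
from bs4 import BeautifulSoup

def safe_text(node, default='N/A'):
    """Return the stripped text of a bs4 node, or a default when the
    selector matched nothing (find() returned None)."""
    return node.text.strip() if node else default

# A product block with no price span, as sometimes happens:
soup = BeautifulSoup(
    '<span class="aok-inline-block zg-item">'
    '<div class="p13n-sc-truncated"> Widget </div></span>',
    'html.parser')
block = soup.find('span', {'class': 'aok-inline-block zg-item'})

title = safe_text(block.find('div', {'class': 'p13n-sc-truncated'}))
price = safe_text(block.find('span', {'class': 'a-size-base a-color-price'}))
print(title, price)  # prints: Widget N/A
```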

Last but not least, we can append the variables created in the scraping code above into a list of dicts, load it into a pandas DataFrame, and automatically push the data to Google Sheets.

## append the data and upload to Google Sheets ##
# Market, Channel, Tier and category hold your own sheet metadata;
# they are defined earlier in the full script.
element_info = {
    "Market": Market,
    "Channel": Channel,
    "Tier of Cate": Tier,
    "Name of Cate": category,
    "Title": title,
    "URL": url,
    "Review": review,
    "Stars": Stars,
    # "Min-Price": min_price,
    "Max-Price": price
}

amazonSERP.append(element_info)

df = pd.DataFrame(amazonSERP)

value_list2 = sh.values_update(
    'sheet position',
    params={'valueInputOption': 'USER_ENTERED'},
    body=dict(values=df.T.reset_index().T.values.tolist()))
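
To automate the refresh with Crontab, as mentioned at the start, you can schedule the finished script to run hourly. The script path and log path below are placeholders for your own setup:

```shell
# Edit your crontab with: crontab -e
# Run the scraper at minute 0 of every hour; append output to a log for debugging.
0 * * * * /usr/bin/python3 /home/you/amazon_bestsellers.py >> /home/you/scraper.log 2>&1
```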

Full Script of Amazon Best Selling Product Scraper

If you would like the full version of the Python script for the Amazon Best Selling Product Scraper, please subscribe to our newsletter and add the message Python Tutorial 21. We will send the script to your mailbox immediately.

Contact us

I hope you enjoyed reading Python Tutorial 21: Amazon Best Selling Product Scraper – Awesome Approaches to Find Niche Products, Monitor Competitors and Identify Potential Clients. If you did, please support us by doing one of the things listed below, because it always helps out our channel.

  • Support my channel through PayPal (paypal.me/Easy2digital)
  • Subscribe to my Easy2Digital YouTube channel and turn on the notification bell
  • Follow and like my Easy2Digital Facebook page
  • Share the article on your social network with the hashtag #easy2digital
  • Buy products with the Easy2Digital 10% OFF discount code (Easy2DigitalNewBuyers2021)
  • Sign up for our weekly newsletter to receive Easy2Digital's latest articles, videos, and discount codes on Buyfromlo products and digital software
  • Subscribe to our monthly membership through Patreon to enjoy exclusive benefits (www.patreon.com/louisludigital)
