Scraping Instagram influencers and their posts can not only help collect influencer candidate lists to work with in the future but also give you content marketing insights about what content might have better engagement with audiences. If you’re looking to strengthen your brand identity, learning from the profiles of influencers is a good place to start.
However, scraping social data is not the same as scraping a regular website, because almost all social platforms require you to log in before you can use any of their features. So in this Python Tutorial, I will walk you through how to use Selenium to simulate logging in to the platform, browsing Instagram, and searching hashtags to download the links of the top posts. By the end of the Python Tutorial, you can start downloading the top Instagram posts for any hashtag you like simply by changing the hashtag.
Table of Contents: Build an Instagram Bot and Use Hashtags to Scrape Top Instagram Posts and Instagram Users
- Install Selenium and ChromeDriver
- Log in to your Instagram account
- Simulate selecting the pop-up options
- Search posts using hashtags and scroll down for more posts
- Find the elements you want to scrape and save them in a CSV file
- Full Python Script of Instagram Top Ranking Post and Influencer Profile Scraper By Using Hashtag
- Instagram Latest Trending API Endpoint Recommendation
Instagram Bot – Install Selenium and ChromeDriver
Selenium is a free open-source automated testing framework used to validate web applications across different browsers and platforms. You can use multiple programming languages like Java, C#, Python, etc to create Selenium Test Scripts. Testing done using the Selenium testing tool is usually referred to as Selenium Testing.
If you have read my earlier Python Tutorial article on setting up pip3, installing Selenium is very easy. You just need to type pip3 install selenium in your Mac terminal.
Then, you need a browser driver to act on your behalf in the process. I recommend ChromeDriver in this tutorial. First things first, please go to Google, search for ChromeDriver, and click through to its website. You will basically see two versions: the beta and the latest stable release. Just pick the stable one for now, and check that it matches the version of Chrome installed on your machine.
Select the build that is compatible with your device; here we select mac64.zip. After downloading, extract the zip and install ChromeDriver. Remember to copy the ChromeDriver location path to the clipboard, because it will be used in a moment.
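In the Python script, first of all, we need to import the modules we will use (webdriver, By, WebDriverWait, and expected_conditions as EC from Selenium, plus time), as well as any other Python scripts we created before.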
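Then, we create a variable called driver and pass in the path we just copied using executable_path. We also add a line that tells the driver to open instagram.com. It is similar to requests.get, but in a Selenium environment the request goes through the driver. Note that executable_path works in Selenium 3 and early Selenium 4 releases; it was removed in Selenium 4.10, where the path is passed through a Service object instead.
driver = webdriver.Chrome(executable_path='/Users/louislu/Desktop/Python/chromedriver')
driver.get('https://www.instagram.com/')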
Instagram Bot – Log in to your Instagram account
Basically, Selenium simulates my normal browsing on Instagram, so the first step has to be the account login.
First, we go to the login page, right-click, and select Inspect to find out which elements make up the username and password boxes. As we can see, the username box is an <input name="username"> element, and the password box is an input element as well. So we can use By.CSS_SELECTOR to point to each of them specifically.
Selenium's expected_conditions module provides a condition, element_to_be_clickable, which is satisfied once the element is visible and enabled. Because we also need to allow for loading time, we wrap it in WebDriverWait, which here keeps checking for up to 10 seconds before giving up.
Here is the code:
username = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"input[name='username']")))
password = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"input[name='password']")))
Secondly, we send the account username and password to the boxes. Before that, I recommend clearing each box first to make sure it is empty. Then we use send_keys, a method from the Selenium API, to type the value into the box.
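The clear-then-type step can be sketched as a small helper. This is only a minimal sketch, not the script from this tutorial: fill_field, read_credentials, and the IG_USERNAME / IG_PASSWORD environment variable names are hypothetical, and username and password are assumed to be the two elements located with WebDriverWait above.

```python
import os


def fill_field(field, value):
    """Empty an input box, then type value into it.

    field is assumed to be a Selenium WebElement; any object with
    clear() and send_keys() methods behaves the same way here.
    """
    field.clear()           # remove any pre-filled text first
    field.send_keys(value)  # then type the new value


def read_credentials(env=os.environ):
    """Read login details from hypothetical environment variables."""
    return env.get("IG_USERNAME", ""), env.get("IG_PASSWORD", "")


# Usage, assuming username and password were located as shown above:
#   my_username, my_password = read_credentials()
#   fill_field(username, my_username)
#   fill_field(password, my_password)
```

Reading the credentials from the environment is one way to keep them out of the script itself; typing them directly into send_keys works just as well for a quick test.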
Last but not least, we need to inspect which element the login button is, just as we did for the username and password boxes. We again use element_to_be_clickable and By.CSS_SELECTOR. Since we need to click the button, we add the click() method at the end.
log_in = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"button[type='submit']"))).click()
Simulate selecting the pop-up options
Some platforms show pop-up windows after you log in, so you also need to work out which pop-ups might appear. Instagram generally shows two, and to get to our target content smoothly, we can click Not Now on each of them.
Here we can use By.XPATH to find the Not Now button by its text. Here is the code:
not_now = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//button[contains(text(),'Not Now')]"))).click()
not_now2 = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//button[contains(text(),'Not Now')]"))).click()
Search posts using hashtags and scroll down for more posts
To search posts by hashtag, Instagram has a fixed path, which is https://www.instagram.com/explore/tags/ + keyword. So we first create a query variable; here I assume we are searching for “moussy”. Then we create a variable, page, to visit that URL.
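The query step can be sketched like this. It is a minimal sketch rather than the original script: build_hashtag_url and TAG_BASE_URL are hypothetical names, and the trailing slash and the stripping of a leading "#" are my assumptions.

```python
from urllib.parse import quote

# Fixed hashtag path described above
TAG_BASE_URL = "https://www.instagram.com/explore/tags/"


def build_hashtag_url(query):
    """Return the hashtag page URL for a keyword such as 'moussy'."""
    tag = query.strip().lstrip("#")  # accept 'moussy' or '#moussy'
    if not tag:
        raise ValueError("query must contain a hashtag keyword")
    # quote() percent-encodes characters that are not URL-safe
    return f"{TAG_BASE_URL}{quote(tag)}/"


query = "moussy"
url = build_hashtag_url(query)
print(url)
# With a Selenium driver already logged in, the page would be opened with:
#   driver.get(url)
```

Changing the value of query is then the only edit needed to scrape a different hashtag.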
When you scroll down for more posts, you will find that they take a while to load. So we need code that scrolls, plus code that keeps the scraper from stopping short because of the loading time. We use the window.scrollBy() and time.sleep() methods. The x and y numbers in the scroll method are the horizontal and vertical distances, in pixels, to scroll from the current position, so y is how far down you want to go. Because scrolling stops wherever the page has not finished loading, I recommend setting a bigger number first and adding more scroll-and-sleep lines if you aim to scrape more posts.
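The scroll-and-wait idea can be sketched as a loop instead of repeated lines. This is a minimal sketch under assumptions: scroll_page and its default values are hypothetical, and driver is assumed to be the Selenium driver created earlier (only its execute_script method is used).

```python
import time


def scroll_page(driver, times=3, pixels=5000, pause=3.0):
    """Scroll down `times` times, pausing after each scroll.

    pixels is the vertical distance passed to window.scrollBy, and
    pause is the number of seconds given to the page to load more posts.
    """
    for _ in range(times):
        # arguments[0] is the first extra value passed to execute_script
        driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
        time.sleep(pause)  # wait for the next batch of posts to load


# Usage, assuming driver is already on the hashtag page:
#   scroll_page(driver, times=5)
```

Raising times has the same effect as adding more scroll lines by hand; a longer pause helps on slow connections.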
Find the elements you want to scrape and save them in a CSV file
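Now basically all posts are ready, and what we need to do is fetch the post links. Again, we can inspect the page and find the elements. Two Selenium methods do the job: find_elements collects every anchor (a) tag on the page, and get_attribute reads the href from each one. The older find_elements_by_tag_name shortcut was removed in Selenium 4.3, so the code below uses find_elements with By.TAG_NAME, which works in both older and newer versions.
links = driver.find_elements(By.TAG_NAME, 'a')
links = [link.get_attribute('href') for link in links]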
If you print links and a list of URLs comes up, it means the code is working.
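The links list holds every anchor on the page, not only posts, so a filtering step can help. The sketch below is an assumption on my part rather than part of the original script: extract_post_links is a hypothetical helper, and it assumes that post URLs contain "/p/" in their path.

```python
def extract_post_links(hrefs):
    """Keep unique post URLs, preserving the order they appeared in.

    Assumption: an Instagram post URL contains '/p/' in its path.
    Anchors without an href come back as None and are skipped.
    """
    seen = set()
    post_links = []
    for href in hrefs:
        if href and "/p/" in href and href not in seen:
            seen.add(href)
            post_links.append(href)
    return post_links


# Made-up sample data in the shape that get_attribute('href') returns
sample = [
    "https://www.instagram.com/p/ABC123/",
    None,
    "https://www.instagram.com/explore/",
    "https://www.instagram.com/p/ABC123/",
    "https://www.instagram.com/p/XYZ789/",
]
print(extract_post_links(sample))
```

Removing duplicates matters here because the same post can be picked up more than once after several scrolls.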
You can then use Pandas to put the links in a column and save it as a CSV file. I covered this in a previous article, so I am not going to elaborate on the details here.
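For readers who have not seen that article, here is a minimal sketch of the saving step. Note that it swaps in Python's built-in csv module instead of Pandas to stay dependency-free; save_links_to_csv, the post_url column name, and the file name are all hypothetical.

```python
import csv


def save_links_to_csv(links, path):
    """Write one link per row under a single 'post_url' column."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["post_url"])  # header row
        for link in links:
            writer.writerow([link])


# Usage, assuming links is the list of post URLs collected above:
#   save_links_to_csv(links, "instagram_top_posts.csv")
```

With Pandas, the equivalent would be to build a DataFrame with one column from the list and call its to_csv method.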
Full Python Script of Instagram Top Ranking Post and Influencer Profile Scraper By Using Hashtag
If you would like to have the full version of the Python Script of Instagram Post and Influencer Scraper By Using Hashtag, please subscribe to our newsletter by adding the message “Chapter 12”. We would send you the script asap to your mailbox.
So easy, right? I hope you enjoy reading Chapter 12 – Build an Instagram Bot and Use Hashtags to Scrape Top Instagram Posts and Instagram Users. If you did, please support us by doing one of the things listed below, because it always helps out our channel.
- Support and donate any amount to our channel through PayPal (paypal.me/Easy2digital)
- Subscribe to my channel and turn on the notification bell Easy2Digital Youtube channel.
- Follow and like my page Easy2Digital Facebook page
- Share the article on your social network with the hashtag #easy2digital
- Buy products with Easy2Digital 10% OFF Discount code (Easy2DigitalNewBuyers2021)
- Sign up for our weekly newsletter to receive Easy2Digital latest articles, videos, and discount codes
Q1: What is Instagram Post Scraper?
A: Instagram Post Scraper is a tool designed to extract data from Instagram posts, including images, captions, likes, and comments.
Q2: How does Instagram Post Scraper work?
A: Instagram Post Scraper works by using an API to access and retrieve data from Instagram’s servers. It analyzes the HTML structure of posts and extracts the desired information.
Q3: What can I do with Instagram Post Scraper?
A: With Instagram Post Scraper, you can gather data from Instagram posts to analyze trends, track user engagement, monitor competitors, or create reports for marketing purposes.
Q4: Is Instagram Post Scraper legal?
A: Using Instagram Post Scraper is legal as long as you comply with Instagram’s terms of service and respect the privacy settings of users. Make sure to use the tool responsibly and ethically.
Q5: Does Instagram Post Scraper require coding knowledge?
A: No, Instagram Post Scraper is designed to be user-friendly and does not require coding knowledge. You can easily navigate and use the tool’s features without any technical expertise.
Q6: Can I scrape data from private Instagram accounts?
A: No, Instagram Post Scraper cannot extract data from private Instagram accounts. It can only access and retrieve data from public posts.
Q7: Is Instagram Post Scraper compatible with all devices?
A: Yes, Instagram Post Scraper is compatible with all devices, including desktop computers, laptops, smartphones, and tablets. It works on various operating systems and web browsers.
Q8: What is the pricing of Instagram Post Scraper?
A: The pricing for Instagram Post Scraper varies depending on the subscription plan you choose. Visit our website for detailed pricing information.
Q9: Is there a free trial available for Instagram Post Scraper?
A: Yes, we offer a free trial of Instagram Post Scraper. You can sign up on our website to try out the tool’s features and see how it can benefit your business.
Q10: What support options are available for Instagram Post Scraper?
A: We provide customer support for Instagram Post Scraper through email and live chat. Our team is available to assist you with any questions or issues you may have.
Instagram API Endpoint Recommendation
Instagram Trending Post Scraper API
The Instagram trending post scraper crawls the most trending and popular posts from Instagram.com. Users input a hashtag keyword and their account credentials to generate a list of specific posts. It returns the post URL, profile name, and post copy. The API also lets you fetch Instagram profile data, which includes profile followers, historical posts, post performance, etc.
More API options from the Instagram collection.