EASY2DIGITAL

Chapter 31 – Build a Zhihu Bot & Scraper for Grabbing Top and Trending Q&A, Blogger Content

In this chapter, I’ll walk you through the elements you need to create a Zhihu bot and run the scraping. By the end of this chapter, you will know how to write the Python script.

– Components: a Zhihu account, verified or not. Automatic messaging requires a verified Zhihu account (personal ID or passport).

Why you need a Zhihu Bot for marketing purposes in China

First things first, Zhihu is one of the largest Q&A communities in China and is recognized for its quality content. People are used to going there to find answers about daily life problems, brand word of mouth, product reviews, healthcare information, and professional knowledge. As a result, the platform attracts in-demand traffic.

Zhihu is also accessible to foreigners, because the platform allows users to sign up with a foreign mobile phone number. Users who don’t verify their identity with a personal identity card face limits on publishing and commenting. Even so, browsing the platform and scraping the top-ranking content, KOLs, and KOCs is not a problem.

Unlike Instagram, Zhihu encourages users to message each other and invite others to comment, so there isn’t a limit on daily messaging, commenting, following, etc. This gives marketers a friendly environment for automating data collection and outreach on the platform. A Zhihu bot therefore makes marketing and recruitment in China easier.

Define a ZhihuLogin Function

As a foreigner, you need to install the Zhihu app on your phone and complete the sign-up on the mobile device. I’ll release another article with more details on how to sign up for a Zhihu account using a non-China mobile number.

The Zhihu login journey is very simple and has 4 steps. Below is the code I wrote for these 4 steps.

Last but not least, there is a verification process after you click submit. I suggest completing it manually, as it is a one-off and you don’t need to verify again. I will release another article on how to pass security checks using Python.

Tips and codes to scrape Zhihu SERP data

You can filter the posts in the SERP by dimensions such as most upvotes or published within the past year. Here is the URL with the parameters for the most-upvoted content from the past year, where keywords is your search term.

'https://www.zhihu.com/search?q=' + keywords + '&sort=upvoted_count&type=content&time_interval=a_year'
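
Plain string concatenation breaks as soon as the keyword contains spaces or Chinese characters, so it can help to let the standard library encode the query. The sketch below rebuilds the same URL with urllib.parse.urlencode. The parameter names and values come from the URL above; the function name build_search_url is only an illustrative choice, not part of the original script.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://www.zhihu.com/search"


def build_search_url(keywords: str) -> str:
    """Return the search URL for the most-upvoted content of the past year."""
    params = {
        "q": keywords,               # search term, percent-encoded by urlencode
        "sort": "upvoted_count",     # order by number of upvotes
        "type": "content",           # content results only
        "time_interval": "a_year",   # published within the past year
    }
    return f"{SEARCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("python"))
```

Calling build_search_url("python") gives the same address as the concatenated string, while keywords with spaces or non-ASCII characters are encoded safely.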

One trick when scraping Zhihu SERP data: you need to scroll down manually and load one more page of results first. Otherwise, the HTML elements are not reachable and your bot won’t work.

In the SERP, the scrapable data are the post link, profile name, post title, and the numbers of likes and comments.

3 types of post links in SERP

In the SERP, there are three types of post links: the Q&A post, the video, and the column. Here is a sample URL for each, in that order. For the differences in content between them, please refer to this article.

https://www.zhihu.com/question/460666810/answer/1906844914
https://www.zhihu.com/zvideo/1401218993156419584
https://zhuanlan.zhihu.com/p/405042094
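
Because each link type needs different handling later on, the script can sort the scraped links first. Here is a minimal sketch that classifies a link by host and path, based only on the three sample URLs above. The function name and the labels it returns are assumptions made for illustration, and Zhihu may use other URL shapes that this does not cover.

```python
from urllib.parse import urlparse


def classify_post_link(url: str) -> str:
    """Return 'answer', 'video', 'column', or 'unknown' for a Zhihu post URL."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    path = parsed.path

    # Column articles live on their own subdomain.
    if host == "zhuanlan.zhihu.com" and path.startswith("/p/"):
        return "column"

    if host == "www.zhihu.com":
        if path.startswith("/zvideo/"):
            return "video"
        # Q&A links carry both a question ID and an answer ID.
        if path.startswith("/question/") and "/answer/" in path:
            return "answer"

    return "unknown"


if __name__ == "__main__":
    print(classify_post_link("https://zhuanlan.zhihu.com/p/405042094"))
```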

Zhihu Marketing Guide for Western Company’s SEO and Branding in China

Zhihu Q&A

For the Q&A post links, the script can grab the profile’s follower count directly, because the follower and following counts appear on the right-hand side of the page.

Then you can add a condition: if the follower count is larger than a chosen threshold, the script automatically messages the KOL.

Here is the Python code for automatic messaging.
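
The threshold check itself can be sketched independently of the browser automation. The example below assumes the scraped follower count arrives as display text, either with thousands separators (for example "12,345") or abbreviated with 万 for ten thousand (for example "1.2 万"). Both formats, the function names, and the default threshold of 10,000 are assumptions for illustration and not taken from the original script.

```python
def parse_follower_count(text: str) -> int:
    """Convert displayed text such as '12,345' or '1.2 万' to an integer."""
    cleaned = text.strip().replace(",", "").replace(" ", "")
    if cleaned.endswith("万"):
        # 万 means ten thousand, so '1.2万' is 12000.
        return int(round(float(cleaned[:-1]) * 10_000))
    return int(cleaned)


def should_message(follower_text: str, min_followers: int = 10_000) -> bool:
    """Return True when the follower count is larger than the threshold."""
    return parse_follower_count(follower_text) > min_followers


if __name__ == "__main__":
    for sample in ["980", "12,345", "1.2 万"]:
        print(sample, should_message(sample))
```

Keeping the parsing and the decision in small functions makes it easy to change the threshold without touching the messaging code.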

Zhihu Column

In the SERP, some articles come from Zhihu Column, and when you open one of them you cannot find the author’s follower count on the page. So in the Python script, you need to break this into two steps: first scrape the profile page URL from the article, then fetch the follower count from that profile URL.
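
The first of the two steps can be sketched with the standard library alone. The example below scans the article HTML for the first link whose path starts with /people/, which is assumed here to be the shape of a Zhihu profile URL. The class and function names are hypothetical, and the real page markup may differ, so treat this as a starting point rather than the script's actual logic. The second step would load the returned URL and read the follower count from the profile page.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlparse


class ProfileLinkFinder(HTMLParser):
    """Remember the first anchor whose path looks like /people/<token>."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.profile_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.profile_url is not None:
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative and protocol-relative links against the article URL.
        absolute = urljoin(self.base_url, href)
        if urlparse(absolute).path.startswith("/people/"):
            self.profile_url = absolute


def find_profile_url(article_html: str, article_url: str) -> Optional[str]:
    """Return the author's profile URL found in the article HTML, or None."""
    finder = ProfileLinkFinder(article_url)
    finder.feed(article_html)
    return finder.profile_url


if __name__ == "__main__":
    sample = '<div><a href="//www.zhihu.com/people/example-user">Author</a></div>'
    print(find_profile_url(sample, "https://zhuanlan.zhihu.com/p/405042094"))
```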

Full Python Script of Zhihu Bot

If you are interested in the full script of the Zhihu bot, please subscribe to our newsletter and include the message “Chapter 31”. We will send the script to your mailbox right away.

Contact us

I hope you enjoy reading Chapter 31 – Build a Zhihu Bot & Scraper for Grabbing Top and Trending Q&A, Blogger Content. If you did, please support us by doing one of the things listed below, because it always helps out our channel.

FAQ:

Q1: What is Zhihu Post Scraper?

A: Zhihu Post Scraper is a tool specifically designed for extracting data from Zhihu, a popular question-and-answer platform in China. It enables users to scrape and collect information from Zhihu posts, including questions, answers, comments, and user profiles.

Q2: How does Zhihu Post Scraper work?

A: Zhihu Post Scraper uses advanced web scraping techniques to extract data from Zhihu. It simulates human behavior to navigate through Zhihu pages, clicks on posts, scrolls to load more content, and collects the desired information. It then organizes the data into a structured format for further analysis or use.

Q3: What data can I extract with Zhihu Post Scraper?

A: With Zhihu Post Scraper, you can extract various data from Zhihu posts, including the post title, content, author information, post date, number of views, number of upvotes, number of comments, and more. It provides a comprehensive set of data points to analyze and understand Zhihu posts.

Q4: Is Zhihu Post Scraper legal to use?

A: Zhihu Post Scraper operates within the legal boundaries of web scraping. However, it is important to use the tool responsibly and abide by the terms and conditions of Zhihu. It is advised to check the website’s scraping policy and respect any limitations or restrictions imposed by Zhihu to ensure compliance.

Q5: How can Zhihu Post Scraper benefit my eCommerce business?

A: Zhihu Post Scraper can provide valuable insights into customer opinions, preferences, and trends related to your eCommerce products or industry. By analyzing the data extracted from Zhihu posts, you can gain a competitive advantage, understand customer needs, improve your products, and make informed business decisions.

Q6: Can I extract data from specific Zhihu posts with Zhihu Post Scraper?

A: Yes, Zhihu Post Scraper allows you to specify the target Zhihu posts you want to extract data from. You can provide URLs or post IDs to scrape data from specific posts, enabling you to focus on relevant content and extract the information that is most important for your analysis.

Q7: Is Zhihu Post Scraper suitable for both small and large eCommerce businesses?

A: Yes, Zhihu Post Scraper is suitable for businesses of all sizes. Whether you are a small eCommerce startup or a large enterprise, the tool can help you extract and analyze valuable data from Zhihu posts. It offers scalability and flexibility to meet the needs of different business sizes.

Q8: Can I schedule automated data extraction with Zhihu Post Scraper?

A: Yes, Zhihu Post Scraper provides the option to schedule automated data extraction. You can set up recurring scraping tasks to extract data from Zhihu posts at specific intervals or times. This allows you to gather updated information regularly without the need for manual intervention.

Q9: What format is the extracted data provided in?

A: Zhihu Post Scraper provides the extracted data in a structured format, such as JSON or CSV. This makes it easy to import the data into various analytical tools or databases for further processing, visualization, or integration with other systems.

Q10: Is technical knowledge required to use Zhihu Post Scraper?

A: While basic technical knowledge can be helpful, Zhihu Post Scraper is designed to be user-friendly and accessible to users without extensive technical expertise. The tool offers a simple and intuitive interface, making it easy for eCommerce professionals to extract and analyze data from Zhihu posts.
