This paper aims to study the characteristics and evolution patterns of tagging knowledge networks for users with different activity levels in a question-and-answer (Q&A) community, represented by Zhihu.
A random sample of question tag data generated by topics in the Zhihu network environment is selected. User quality is defined, and the analysis focuses on the top 20% and bottom 20% of users by that measure, i.e. top users and bottom users. The authors apply time slicing to the data of both groups to construct tagging knowledge networks, use Q-Q plots and ARIMA models to analyze network indicators, and introduce the theory and methods of network motifs.
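The user split and time slicing described above can be sketched as follows. This is only an illustrative sketch: the abstract does not give the formula for user quality, so the `quality` scores, the monthly slice labels, the record layout and the function names `split_users` and `sliced_tag_networks` are all assumptions, and tags are linked when they co-occur on the same record.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split_users(quality, frac=0.2):
    """Return (top, bottom) sets of user ids: the highest and lowest `frac` by score."""
    ranked = sorted(quality, key=quality.get, reverse=True)
    k = max(1, int(len(ranked) * frac))
    return set(ranked[:k]), set(ranked[-k:])


def sliced_tag_networks(records, users):
    """Map each time slice to a Counter of weighted tag co-occurrence edges.

    records: iterable of (user_id, time_slice, tags) tuples (hypothetical layout).
    Only records belonging to `users` contribute edges.
    """
    networks = defaultdict(Counter)
    for user, time_slice, tags in records:
        if user not in users:
            continue
        # Every unordered pair of distinct tags on one record becomes an edge.
        for a, b in combinations(sorted(set(tags)), 2):
            networks[time_slice][(a, b)] += 1
    return dict(networks)


# Toy data; scores and tags are made up for illustration.
quality = {"u1": 9.0, "u2": 7.5, "u3": 5.0, "u4": 2.0, "u5": 0.5}
records = [
    ("u1", "2019-01", ["python", "data", "ml"]),
    ("u1", "2019-02", ["python", "ml"]),
    ("u5", "2019-01", ["movies", "music"]),
    ("u3", "2019-01", ["python", "music"]),
]

top, bottom = split_users(quality)
top_nets = sliced_tag_networks(records, top)
bottom_nets = sliced_tag_networks(records, bottom)
```

Each slice then yields one network per user group, so indicators can be computed slice by slice and compared as time series.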
This study shows that when the power-law exponent of the degree distribution is less than or equal to 3.1, the ARIMA model for the rank index of the tagging network fits better. As the community develops, the correlation between tags in the tagging knowledge network is very weak.
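To make the 3.1 threshold concrete, a power-law exponent can be estimated from the node degrees of one time slice. The abstract does not say which estimator the authors used; the sketch below assumes the common approximate maximum-likelihood estimator for discrete data, with a hypothetical lower cutoff `k_min` and made-up degrees.

```python
import math


def powerlaw_exponent(degrees, k_min=1):
    """Approximate MLE of a discrete power-law exponent for degrees >= k_min.

    Uses alpha ~= 1 + n / sum(ln(k / (k_min - 0.5))), an approximation that is
    an assumption of this sketch, not a method stated in the paper.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    return 1.0 + len(tail) / sum(math.log(k / (k_min - 0.5)) for k in tail)


# Toy degree sequence for one time slice (illustrative values only).
degrees = [1] * 8 + [2] * 2 + [4]
alpha = powerlaw_exponent(degrees)

# Condition from the findings: exponent less than or equal to 3.1.
meets_condition = alpha <= 3.1
```

Applied per time slice, such an estimate would indicate which slices fall under the condition reported in the findings.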
Classifying users only by activity level is neither comprehensive nor sufficient, and traditional statistical analysis is not applicable to large data sets. In follow-up work, the authors will explore the characteristics of the network at a larger scale and over a longer timescale and will consider adding more node features as well as some edge features. Users will then be statistically classified according to node and edge attributes to construct complex networks, and machine learning and deep learning algorithms will be applied to large-scale data sets to study the evolution of knowledge networks in depth.
This paper uses real data from the Zhihu community, divides users by activity level, and combines statistical testing, time series analysis and network motifs to trace the temporal evolution of the knowledge network of the Q&A community. These research methods offer new ideas for other network problems. The study finds that user activity affects the evolution of the tagging network: the tagging network followed by users with a high activity level tends to be stable, while the tagging network followed by users with a low activity level gradually fluctuates.
For the community, understanding the formation mechanism of its network structure and the key nodes in the network helps improve the knowledge system of the content, identify user behavior preferences and improve user experience. Future research will focus on identifying outbreak points among a large number of topics, predicting topic trends and conducting timely public opinion guidance and control.
In terms of data selection, user quality is defined; the Zhihu tags are divided into two categories for time slicing; and network indicators and network motifs are compared and analyzed. In addition, statistical tests, time series analysis and network motif theory are used to analyze the tags.
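As a rough illustration of the motif comparison, the sketch below counts the two connected three-node subgraphs of an undirected tag network: triangles and open triads. It covers only the counting step; a full motif analysis would also compare these counts against randomized networks, which is not shown here. The function name and toy edges are hypothetical, and the abstract does not state which motif sizes or tools the authors used.

```python
from itertools import combinations


def count_three_node_motifs(edges):
    """Count triangles and open triads (paths of length 2) in an undirected graph."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    triangle_corners = 0
    open_triads = 0
    for center, neighbors in adj.items():
        for u, v in combinations(neighbors, 2):
            if v in adj[u]:
                triangle_corners += 1  # each triangle is seen once per corner
            else:
                open_triads += 1  # u - center - v with no edge between u and v
    return {"triangle": triangle_corners // 3, "open_triad": open_triads}


# Toy tag network for one time slice (illustrative only).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
motifs = count_three_node_motifs(edges)
```

Running the same count on each time slice for top users and bottom users gives two motif series that can be compared directly.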
The authors acknowledge the financial support from the National Natural Science Foundation of China (No. 11905042), the Youth Fund for Humanities and Social Sciences Research for the Ministry of Education (Nos. 16YJC630022 and 20YJC870005) and Hebei Federation of Social Science Circles Key Project, China (No. 201802120102).
Feng, X., Li, L., Li, J., Cui, M., Sun, L. and Wu, Y. (2021), "Temporal evolution of tagging subnetwork features and motif under different activity levels – take the Q&A community Zhihu as an example", Information Discovery and Delivery, Vol. 49 No. 2, pp. 151-161. https://doi.org/10.1108/IDD-07-2020-0084
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited