Twitter user tagging method based on burst time series
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 15 August 2016
Abstract
Purpose
Many Twitter users post tweets related to their particular interests, and users can also collect information by following other users. One way to clarify user interests is to tag users with labels. Such a user tagging method is important for discovering candidate users with similar interests. This paper aims to propose a new user tagging method that uses time series data of the number of tweets posted.
Design/methodology/approach
The hypothesis focuses on the relationship between a user’s interests and the posting times of tweets: users who are interested in a topic will post more tweets when related events occur than at other times. The authors treat hashtags as the tags to assign to users and observe their occurrence counts at each timestamp. The authors extract burst timestamps using Kleinberg’s burst enumeration algorithm and estimate the burst levels. The authors treat the burst levels as term frequencies in documents and calculate the scores using typical methods such as cosine similarity, Naïve Bayes and term frequency-inverse document frequency (TF-IDF).
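As a rough illustration of the scoring step, the sketch below treats per-timestamp burst levels as if they were term frequencies and ranks candidate hashtags for a user by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the vectors are built, so the function names, the hashtags and all burst-level values are hypothetical, and the burst levels are assumed to have already been produced by a burst detector such as Kleinberg's algorithm.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length burst-level vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a series with no bursts is treated as matching nothing
    return dot / (norm_a * norm_b)


def rank_hashtags(user_bursts, hashtag_bursts):
    """Return (hashtag, score) pairs sorted by descending similarity."""
    scores = {
        tag: cosine_similarity(user_bursts, levels)
        for tag, levels in hashtag_bursts.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical burst levels, one entry per timestamp (0 means no burst).
user = [0, 2, 0, 0, 1, 0]
hashtags = {
    "#worldcup": [0, 3, 0, 0, 1, 0],  # bursts at the same timestamps as the user
    "#election": [1, 0, 0, 2, 0, 0],  # bursts at different timestamps
}

ranking = rank_hashtags(user, hashtags)
for tag, score in ranking:
    print(tag, round(score, 3))
```

In this toy data the hashtag whose bursts coincide with the user's bursts ranks first, which mirrors the hypothesis that interested users post more when related events occur. Because the features are only burst levels per timestamp, nothing in the sketch depends on the language of the tweets.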
Findings
In experimental evaluations, the authors demonstrate the high efficiency of the tagging method. Naïve Bayes and cosine similarity are particularly suitable for the user tagging and tag score calculation tasks, respectively. Some users whose hashtags were appropriately estimated by the methods had a higher maximum number of tweets than other users.
Originality/value
Many approaches estimate user interest based on the terms in tweets and apply graph theory to structures such as following networks. The authors propose a new estimation method that uses time series data of the number of tweets. The merits of estimating user interest from time series data are that the method does not depend on language and can decrease the calculation costs compared with the above-mentioned approaches because it uses fewer features.
Keywords
Acknowledgements
This work was supported by Grants-in-Aid for Scientific Research No. 25280110 and No. 15J05599 and NII’s strategic open-type collaborative research.
Citation
Yamamoto, S., Wakabayashi, K., Kando, N. and Satoh, T. (2016), "Twitter user tagging method based on burst time series", International Journal of Web Information Systems, Vol. 12 No. 3, pp. 292-311. https://doi.org/10.1108/IJWIS-03-2016-0012
Publisher
Emerald Group Publishing Limited
Copyright © 2016, Emerald Group Publishing Limited