Twitter user tagging method based on burst time series

Shuhei Yamamoto (Graduate School of Library, Information and Media Studies, University of Tsukuba, Tsukuba, Japan)
Kei Wakabayashi (Faculty of Library, Information and Media Science, University of Tsukuba, Tsukuba, Japan)
Noriko Kando (Information and Society Research Division, National Institute of Informatics, Tokyo, Japan)
Tetsuji Satoh (Faculty of Library, Information and Media Science, University of Tsukuba, Tsukuba, Japan)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 15 August 2016

Abstract

Purpose

Many Twitter users post tweets related to their particular interests, and users can also collect information by following other users. One way to clarify a user's interests is to attach tags (labels) to that user. User tagging is important for discovering candidate users with similar interests. This paper proposes a new user tagging method that uses time series data of the number of tweets posted.

Design/methodology/approach

The hypothesis concerns the relationship between a user's interests and the posting times of tweets: users who are interested in a topic will post more tweets when related events occur than at other times. The authors treat hashtags as the tags to be assigned to users and count their occurrences at each timestamp. The authors extract burst timestamps using Kleinberg's burst enumeration algorithm and estimate the burst levels. The authors then treat the burst levels as term frequencies in documents and calculate scores using standard methods such as cosine similarity, Naïve Bayes and term frequency-inverse document frequency (TF-IDF).
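The scoring step can be illustrated with a minimal sketch, assuming burst levels have already been estimated for each timestamp. It shows only the cosine similarity variant: a user's burst-level series and each hashtag's burst-level series are treated as term-frequency vectors and compared. The function names, the example hashtags and the burst-level values are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length burst-level vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same timestamps")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A series with no bursts carries no signal, so score it as 0.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_hashtags(user_bursts, hashtag_bursts):
    """Rank candidate hashtags by similarity to a user's burst levels."""
    scores = {
        tag: cosine_similarity(user_bursts, levels)
        for tag, levels in hashtag_bursts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical burst levels over six timestamps (0 = no burst).
user = [0, 2, 0, 0, 1, 0]
hashtags = {
    "#soccer": [0, 3, 0, 0, 1, 0],
    "#election": [1, 0, 0, 2, 0, 0],
}

ranking = rank_hashtags(user, hashtags)
for tag, score in ranking:
    print(f"{tag}: {score:.3f}")
```

In this toy example the user's bursts coincide with those of "#soccer" and never with those of "#election", so "#soccer" ranks first. Because the vectors contain only one value per timestamp, their length depends on the number of timestamps rather than on vocabulary size.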

Findings

Through experimental evaluations, the authors demonstrate the high efficiency of the tagging method. Naïve Bayes is particularly suitable for the user tagging task, and cosine similarity for the tag score calculation task. Some users whose hashtags were appropriately estimated by the proposed methods had a higher maximum number of tweets than other users.

Originality/value

Many approaches estimate user interest based on the terms in tweets or apply graph theory to structures such as following networks. The authors propose a new estimation method that uses time series data of the number of tweets. Estimating user interest from time series data has two merits: it does not depend on language, and it can reduce calculation costs compared with the above-mentioned approaches because it uses fewer features.

Acknowledgements

This work was supported by Grants-in-Aid for Scientific Research No. 25280110 and No. 15J05599 and by NII's strategic open-type collaborative research.

Citation

Yamamoto, S., Wakabayashi, K., Kando, N. and Satoh, T. (2016), "Twitter user tagging method based on burst time series", International Journal of Web Information Systems, Vol. 12 No. 3, pp. 292-311. https://doi.org/10.1108/IJWIS-03-2016-0012

Publisher

Emerald Group Publishing Limited

Copyright © 2016, Emerald Group Publishing Limited
