This paper aims to estimate the population in a specific space from the number of posted tweets and the number of users who posted them, exploiting the real-time nature of Twitter and its location information.
The population to be estimated was the attendance at each game played by the six teams of the Japan Professional Baseball Pacific League at their respective main stadiums. The relation between attendance and Twitter data was analyzed, and regression models using Twitter data were used to estimate attendance.
The correlation coefficient with attendance tended to be larger for the number of tweeting users than for the number of tweets. Furthermore, a comparison of several regression models combining Twitter data, game data and weather data showed that Twitter data is useful for estimating attendance, and that using the number of tweeting users improved the accuracy of population estimation.
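The kind of analysis described above can be illustrated with a minimal sketch. It computes the Pearson correlation of attendance with each Twitter variable and fits an ordinary least-squares regression on each. Everything here is an assumption made for illustration: the per-game numbers are made up and are not data from the paper, the helper names (`pearson_r`, `fit_linear`, `predict`, `rmse`) are hypothetical, and plain linear regression stands in for the paper's regression models, whose exact form is not given in this abstract.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def fit_linear(features, target):
    """Ordinary least squares with an intercept.

    `features` may be 1-D (one predictor) or 2-D (one column per predictor).
    Returns the coefficient vector [intercept, slope_1, ...].
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict(coef, features):
    """Apply coefficients returned by fit_linear to new feature rows."""
    features = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(features.shape[0]), features])
    return design @ coef


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Made-up per-game values, for illustration only (NOT data from the paper).
tweets = np.array([310, 420, 380, 650, 720, 510, 900, 830])
users = np.array([120, 150, 160, 240, 270, 200, 340, 330])
attendance = np.array([14200, 16100, 17000, 22500, 24800, 19600, 30100, 29400])

print("r(attendance, tweets):", pearson_r(attendance, tweets))
print("r(attendance, users): ", pearson_r(attendance, users))

# Fit one single-predictor model per Twitter variable and compare the fit.
for name, feats in {"tweets": tweets, "users": users}.items():
    coef = fit_linear(feats, attendance)
    print(name, "in-sample RMSE:", rmse(attendance, predict(coef, feats)))
```

Game and weather variables could be added as extra columns passed to `fit_linear`. The error printed here is in-sample only, so a real comparison of models would need held-out games or cross-validation.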
While there are many studies on event detection or location identification using Twitter data, no study has been reported on the estimation of the population in a specific space using the “time information” and “location information” characteristic of Twitter data. Because Twitter data also contains users' messages, population estimation based on it can be extended to various types of analyses, such as the analysis of the feelings and opinions of the groups in the space.
Hara, H., Fujita, Y. and Tsuda, K. (2021), "Population estimation using Twitter for a specific space", Data Technologies and Applications, Vol. 55 No. 3, pp. 430-445. https://doi.org/10.1108/DTA-03-2020-0065
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited