The steady increase of data on human behavior collected online holds significant research potential for social scientists. The purpose of this paper is twofold: to contribute a systematic discussion of different online services, their data generating processes, and the offline phenomena connected to these data; and to demonstrate, in a proof of concept, a new approach for detecting extraordinary offline phenomena through the analysis of online data.
To detect traces of extraordinary offline phenomena in online data, the paper determines the normal state of the respective communication environment by measuring the regular dynamics of specific variables in data documenting user behavior online. In its proof of concept, the paper does so by concentrating on the diversity of hashtags used on Twitter during a given time span. The paper then uses the seasonal-trend decomposition procedure based on loess (STL) to identify large deviations between the state of the system as forecast by the model and the empirical data. The paper takes these deviations as indicators of extraordinary events that led users to deviate from their regular usage patterns.
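The general idea of the approach can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: hashtag diversity is measured here as Shannon entropy per time window, and the regular pattern is approximated by a per-position seasonal median rather than by STL itself, so the code is a simplified stand-in for the decomposition step, not the authors' procedure. The function names, the `period` parameter, and the `threshold` value are hypothetical and chosen for illustration only.

```python
import math
import statistics
from collections import Counter


def hashtag_entropy(hashtags):
    """Shannon entropy (in bits) of the hashtags used in one time window.

    Assumption: entropy is only one possible diversity measure.
    """
    counts = Counter(hashtags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def seasonal_baseline(series, period):
    """Median value at each position of the cycle (e.g. each hour of the day).

    Simplified stand-in for the seasonal component that STL would estimate.
    """
    return [statistics.median(series[pos::period]) for pos in range(period)]


def flag_deviations(series, period, threshold=5.0):
    """Return indices whose residual from the seasonal baseline is unusually large.

    Residuals are scaled by their median absolute deviation (MAD), so that
    a few extreme windows do not inflate the yardstick they are measured by.
    The threshold is an arbitrary illustrative value.
    """
    baseline = seasonal_baseline(series, period)
    residuals = [x - baseline[i % period] for i, x in enumerate(series)]
    center = statistics.median(residuals)
    mad = statistics.median(abs(r - center) for r in residuals)
    if mad == 0:
        return []  # no spread in the residuals, nothing to compare against
    return [i for i, r in enumerate(residuals) if abs(r - center) / mad > threshold]
```

In use, one would compute `hashtag_entropy` for each time window of collected messages, pass the resulting series to `flag_deviations` with a `period` matching the expected cycle (for example 24 for hourly windows with a daily rhythm), and then inspect the flagged windows for a plausible link to offline events. A full analysis would replace the seasonal median with an actual STL decomposition that also models the trend component.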
The proof of concept shows that this method detects deviations in the data and that these deviations are clearly linked to changes in user behavior triggered by offline events.
The paper adds to the literature on the link between online data and offline phenomena. The paper proposes a new theoretical approach to the empirical analysis of online data as indicators of offline phenomena. The paper will be of interest to social scientists and computer scientists working in the field.
Jungherr, A. and Jürgens, P. (2013), "Forecasting the pulse: How deviations from regular patterns in online data can identify offline phenomena", Internet Research, Vol. 23 No. 5, pp. 589-607. https://doi.org/10.1108/IntR-06-2012-0115
Emerald Group Publishing Limited
Copyright © 2013, Emerald Group Publishing Limited