The purpose of this paper is to present a Big Data solution as a methodological approach to the automated collection, cleaning, collation, and mapping of multimodal, longitudinal data sets from social media. The paper constructs social information landscapes (SIL).
The research presented here adopts a Big Data methodological approach for mapping user-generated content in social media. The methodology and algorithms presented are generic and can be applied to diverse types of social media or user-generated content involving user interactions, such as blogs, comments on product pages, and other forms of media, so long as the formal data structure proposed here can be constructed.
The sequential listing of content in social media and Web 2.0 pages, as viewed in web browsers or on mobile devices, does not necessarily reveal or make obvious a little-known property of the medium: every participant, from content producers to consumers, followers, and subscribers, together with the content they produce or subscribe to, is intrinsically connected in a hidden but massive network. Such networks, when mapped, could be quantitatively analysed using social network analysis (e.g. centralities), and their semantics and sentiments could equally reveal valuable information given appropriate analytics. The difficulty lies in the traditional approach to collecting, cleaning, collating, and mapping such data sets into a sample large enough to yield important insights into community structure and into the direction and polarity of interaction on diverse topics. This research solves this particular strand of the problem.
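As an illustration of the kind of analysis a mapped network permits, the following minimal sketch builds a small directed interaction network and computes a normalised in-degree centrality. It is not the mapping algorithm of the paper: the record layout `(author, replied_to, polarity)`, the sample users, and the function names `build_graph` and `in_degree_centrality` are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical interaction records: (author, replied_to, polarity).
# Polarity (+1 / -1) stands in for the sentiment of the interaction.
interactions = [
    ("alice", "bob", 1),
    ("carol", "bob", -1),
    ("bob", "alice", 1),
    ("dave", "bob", 1),
]


def build_graph(records):
    """Return the node set and a directed adjacency map (source -> targets)."""
    nodes = set()
    edges = defaultdict(set)
    for source, target, _polarity in records:
        nodes.update((source, target))
        edges[source].add(target)
    return nodes, edges


def in_degree_centrality(nodes, edges):
    """In-degree of each node divided by (n - 1), the maximum possible."""
    counts = {node: 0 for node in nodes}
    for targets in edges.values():
        for target in targets:
            counts[target] += 1
    denominator = len(nodes) - 1
    if denominator <= 0:
        return {node: 0.0 for node in nodes}
    return {node: count / denominator for node, count in counts.items()}


nodes, edges = build_graph(interactions)
centrality = in_degree_centrality(nodes, edges)

# List participants from most to least replied-to.
for user, score in sorted(centrality.items(), key=lambda item: (-item[1], item[0])):
    print(f"{user}: {score:.2f}")
```

In this toy sample every other participant replies to `bob`, so `bob` receives the highest score. On a network of the scale discussed here, a dedicated graph library or distributed processing would likely be needed in place of in-memory dictionaries.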
The automated mapping of extremely large networks, involving hundreds of thousands to millions of nodes and encapsulating high-resolution, contextual information over a long period of time, could help to prove or even disprove theories. The goal of this paper is to demonstrate the feasibility of using automated approaches to acquire massive, connected data sets for academic inquiry in the social sciences.
The methods presented in this paper, together with the Big Data architecture, offer individuals and institutions with a limited budget practical approaches to constructing SIL. The integrated software-hardware architecture uses open source software, and the SIL mapping algorithms are easy to implement.
The majority of research in the literature uses traditional approaches for collecting social network data. Traditional approaches can be slow and tedious, and they do not yield sample sizes adequate to be of significant value for research. Whilst traditional approaches collect only a small percentage of the data, the original methods presented here are able to collect and collate entire data sets in social media owing to their automated and scalable mapping techniques.
The author acknowledges the financial support from the International Doctoral Innovation Centre, Ningbo Education Bureau, Ningbo Science and Technology Bureau, China’s MoST and The University of Nottingham. The project is partially supported by NBSTB Project 2012B10055.
Ch'ng, E. (2015), "Social information landscapes: Automated mapping of large multimodal, longitudinal social networks", Industrial Management & Data Systems, Vol. 115 No. 9, pp. 1724-1751. https://doi.org/10.1108/IMDS-02-2015-0055
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited