Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value from Social Media, and Add Big(ger) Text to Big Data

Behrooz Bayat (Islamic Azad University of Hamedan, Hamedan, Iran)

The Electronic Library

ISSN: 0264-0473

Article publication date: 6 November 2017

Citation

Bayat, B. (2017), "Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value from Social Media, and Add Big(ger) Text to Big Data", The Electronic Library, Vol. 35 No. 6, pp. 1269-1270. https://doi.org/10.1108/EL-09-2017-0188

Publisher: Emerald Publishing Limited

Copyright © 2017, Emerald Publishing Limited


Deep text is an approach to text analytics that involves using computerized techniques for gaining insights into large volumes of unstructured text. This book looks in depth at what text analytics is and how it can be practiced in a way that goes beyond text mining. It describes the nature of text analytics generally and the vital role that a deep text approach can play in making text analytics successful. The book gives an understanding of text analytics and how it can be carried out, and also the kinds of applications text analytics can support. The context is usually the corporate sector.

The book is divided into five parts, each comprising three chapters. The first part, “Text Analytics Basics”, lays the foundations by providing a general picture of the concept. Chapter 1 presents a broad definition of text analytics, what it involves and what it can provide. This chapter also discusses the importance of content models and metadata in adding structure to unstructured texts and briefly describes the technology behind text analytics. Chapter 2 looks at the core capability areas within text analytics, including text mining, extraction, summarization, sentiment analysis and auto-categorization, all of which require the design of difficult and often expensive software. Chapter 3 considers the important issue of the return on investment of text analytics and analyzes its basic business logic, which is to add structure to an enormous amount of unstructured text and to get value from it. The chapter also describes three major areas in which text analytics appears significantly beneficial to an organization, namely enterprise search, social media and multiple text analytics.

The second part of the book, “Getting Started in Text Analytics”, suggests that the necessary first steps are to become familiar with the text analytics software available on the market and to research the information environment and the company's needs. To this end, this part includes chapters on the current state of text analytics software, a smart start to text analytics and the evaluation of text analytics software.

The third part describes the process of implementing text analytics applications. Chapter 7 looks at the development of auto-categorization, which is the most challenging and, at the same time, one of the most fundamental aspects of the job. The chapter also discusses the three major phases of text analytics categorization development projects and describes the best practices within each phase. Chapter 8 covers social media analysis and its variety of applications, content and approaches. It also examines the requirements of advanced sentiment analysis and other development processes in this field. Chapter 9 reviews some of the projects undertaken in text analytics to provide a more tangible understanding of the abstract concepts presented in the previous chapters.

The fourth part, “Text Analytics Applications”, considers three main application areas of text analytics, namely, enterprise search, apps and social media. Its chapters offer useful hints about the wide range of applications text analytics has found in each of these three areas.

The last part of the book, “Enterprise Text Analytics as a Platform”, suggests that the best overall approach to text analytics is an enterprise text analytics platform that supports all possible applications, rather than a series of independent applications. To that end, Chapter 13 discusses different approaches to text analytics and suggests that text analytics be treated as a semantic infrastructure that is strategic rather than tactical. Chapter 14 describes the main features of enterprise text analytics. The concept of semantic infrastructure is presented as more important than the tactical infrastructure most organizations try to develop, because it supports the use of language, or semantics, across the organization and so makes possible services such as taxonomies and other communication models. Finally, the last chapter examines the idea of a semantic infrastructure in more detail and discusses how developing such a foundation can bring more value to the organization.

All in all, the book is a must-read for anyone dealing with unstructured content who wants to add value and power to their company by using an ever-growing mass of unstructured text. It offers insights into, and practical examples of, the main functionalities of text analytics in an accessible, easy-to-grasp manner.
