CHAPTER 8 TAILORING TOOLS TO THE RESCUE: LESSONS LEARNED FROM DEVELOPING A SOCIAL MEDIA INFORMATION GATHERING TOOL

This chapter describes how researchers and developers may improve the design of technical innovations for crisis communicators by testing how user-friendly the innovation is for its intended end users. In the RESCUE project, a tool for social media information gathering was developed. During this process, tool usability was thoroughly tested. Good usability allows the user to complete tasks and achieve goals with effectiveness, efficiency and satisfaction. The purpose of the r [...]
Klas Backholm, Joachim Högväg, Jørn Knutsen, Jenny Lindholm and Even Westvang. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


INTRODUCTION
The aim of this chapter is to discuss how a usability testing process in developing a tool can contribute to improved working conditions for crisis communicators in emergencies. We describe the RESCUE project's work with developing and testing a tool intended for information gathering from across social media platforms. When product designers test the usability of a product, they strive to optimise the design and features of a final product according to user needs. They do not want to force users to change their behaviour to fit the product's requirements (Wallach & Scholz, 2012). In an emergency context,¹ usability testing will not only contribute to an improved product design, but also to better crisis management. A crisis management product that works well will contribute to the overall sense making of the situation, whereas a product with poor design, in the worst case, may add to negative developments during the unfolding emergency (Coombs, 2015; Endsley, 2009).
The RESCUE tool is designed for users who work professionally with gathering and validating information from, for example, social media. Hence, the main user groups are emergency management organisations such as authorities, first response rescuers, non-governmental organisations (NGOs) and news organisations. Based on theoretical contributions from areas of mediatisation, computational thinking, situation awareness (SA), mental modelling and activity theory (Endsley, 2009; Johnson-Laird, 1983; [...] tool and the three steps of usability testing we implemented. The steps stretch from a pre-design mapping of user needs to a test of a high-fidelity prototype at the end of the design process, and include methods ranging from interviews and crisis training scenario observations to tests of psychophysiological reactions in laboratory settings. We end the chapter with a summary of how usability tests may add to improved crisis management across user groups.
We aim to answer two research questions in this chapter:
RQ1: How can usability tests of a product prototype contribute to better SA in technical innovations for emergency use?
RQ2: How should usability tests be included throughout the product prototype development process to contribute to better SA?

WHY DO WE NEED AN EMERGENCY INFORMATION GATHERING TOOL?
In an emergency, information is crucial. While communication professionals working in a crisis will have different reasons for gathering information, the basic need to create an overall picture of the situation by collecting pieces of information from social media and other channels is similar across groups (Coombs, 2015; Ruggiero & Vos, 2014). For example, authorities and first responders may use collected information to identify where rescue efforts are acutely needed, or to inform inhabitants in affected areas about crisis developments (Hughes, St. Denis, Palen, & Anderson, 2014). Non-governmental emergency support organisations or private companies may use gathered information to identify and respond to emerging hot topics in social media, or to prevent rumours with a potentially harmful effect on the organisation (Coombs, 2015; Hornmoen et al., 2018). Finally, news organisations may use gathered information for producing news reports about the emergency itself, or how authorities are managing the situation (Brandtzaeg, Luders, Spangenberg, Rath-Wiggins, & Folstad, 2016; Schifferes et al., 2014).
However, the current communication and media landscapes are complex and dynamic. As stated by Lundby (2009), modern societies are mediatised. Varying forms of media outlets and technological equipment are present in individuals' day-to-day activities, and new forms of communication are developing rapidly (e.g. new social media platforms or smartphone apps). In unfolding emergencies, communicators and news organisations may struggle with information gathering processes due to the vast amounts of information available. Furthermore, they may struggle with validation issues, as crisis information is rapidly passed on between users within and across media outlets, and modified along the way. Soon, it becomes difficult to tell whether possibly interesting pieces of information are true or false (Brandtzaeg et al., 2016; Coombs, 2015; Eriksson, 2012; Ruggiero & Vos, 2014).
Information handling tools or similar technical innovations may be part of the solution to this problem (Ruggiero & Vos, 2014; Schifferes et al., 2014). Wing (2006) introduced the concept of computational thinking, that is, to actively incorporate, for instance, technical innovations, programming and algorithms to solve an identified problem such as the information handling challenges listed earlier. The problem can then be solved by either a human or a machine, or a combination of both (Wing, 2006). Thus, identifying or developing relevant tools, and combining them with established user group work routines into well-functioning solutions, are relevant strategies for improving emergency communication in a mediatised society, and as shown further in the following text, usability testing sets the basis for applying such strategies.

THE RESCUE TOOL
In the following text, we describe the features of the tool developed in the RESCUE project. We then show how usability testing was implemented throughout the prototype development process. Currently, the tool is a functioning high-fidelity prototype that is ready to be introduced into user organisations' existing software and technology structures. The tool is, however, not entirely finalised, as its implementation in workplace settings needs further testing.
The tool is conceptually centred around events. An event is a field of interest that a user is working on, like an unfolding incident. The user can collect content from different platforms (tweets, Instagram images, news articles, etc.) or subscribe to a range of different feeds or searches. For example, the user can set up subscriptions to a specific hashtag on Twitter, a single user on Instagram or an RSS feed. Each event contains a feed (similar to Facebook feeds) where content from these searches and subscriptions will appear, with the newest appearing on top. The idea is that the user can sift through the incoming feed, getting an overview of the content.
If users notice content that is of particular interest, they can save that specific content into a separate tab called saved. When the user selects the content, it is possible to add metadata, such as how trustworthy the content is, the geographical location (if this is not provided in the content source) or a comment which other users can see and respond to. The user is also given a range of ways to visualise content, including a timeline view, an option where objects are presented on a map, or a diagram option where evaluated content is displayed in accordance with the evaluations given.
In addition to the 'on-going event' view, the tool includes a monitor view. This is a feature where a user can set up searches or subscriptions that are not related to a specific event. Rather, the user could benefit from this view when wanting to monitor a specific geographic area or a list of trusted sources on various topics.
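The event-centred structure described in this section can be made more concrete with a minimal Python sketch. It is an illustration only: the chapter does not describe the prototype's internal implementation, and all class, field and method names used here (ContentItem, Event, feed, save, saved_items) are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ContentItem:
    """One piece of gathered content, such as a tweet or a news article."""
    source: str                 # hypothetical label, e.g. "twitter" or "rss"
    text: str
    published: datetime
    saved: bool = False
    # Optional metadata a user may add when saving (field names are assumptions).
    trustworthiness: Optional[int] = None
    location: Optional[str] = None
    comments: List[str] = field(default_factory=list)


@dataclass
class Event:
    """A field of interest a user is working on, like an unfolding incident."""
    name: str
    subscriptions: List[str] = field(default_factory=list)
    items: List[ContentItem] = field(default_factory=list)

    def feed(self) -> List[ContentItem]:
        """Return all incoming content with the newest item on top."""
        return sorted(self.items, key=lambda item: item.published, reverse=True)

    def save(self, item: ContentItem, trustworthiness: Optional[int] = None,
             location: Optional[str] = None, comment: Optional[str] = None) -> None:
        """Mark an item as saved and attach whatever metadata was supplied."""
        item.saved = True
        if trustworthiness is not None:
            item.trustworthiness = trustworthiness
        if location is not None:
            item.location = location
        if comment is not None:
            item.comments.append(comment)

    def saved_items(self) -> List[ContentItem]:
        """Return only the content the user has picked out, newest first."""
        return [item for item in self.feed() if item.saved]
```

In such a sketch, a feed tab would display the result of feed() and a saved tab the result of saved_items(). Keeping the metadata arguments optional mirrors the description above, where adding an evaluation, a location or a comment is possible but not required.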

USABILITY TESTING AND EMERGENCIES
In the remainder of this chapter, we shift focus to looking at how usability testing can be applied to improve emergency-related communication.
Usability is about the extent to which a specified user group can use a product to achieve goals with effectiveness, efficiency and satisfaction in a specified context of use (Hornbaek, 2006; ISO 9241, 2010). In an emergency context, effectiveness and efficiency become especially important due to the combination of time criticality and the vast flow of information to be handled (Edland & Svenson, 1993). Furthermore, dissatisfaction and subsequent frustration with a tool in a stressful situation can tax the cognitive systems to a larger degree than in everyday work, and lead to dangerous errors. Research finds that stressors 'appear to cause shifts, lapses, and narrowing of attention, and can also influence decision speed' (Mendl, 1999, p. 221).
Thus, a high-stress context demands an even higher level of usability within the system, to keep the user performing well and feeling satisfied with the tool (Wastell & Newman, 1996). Endsley (2009) [...] During high-stress assignments, this may include having a good overview of complex information flows (Salmon et al., 2008; Yin, Lampert, Cameron, Robinson, & Power, 2012).
When conducting usability tests of tools for high-stress environments, we find it useful to divide addressed issues into two general categories of functionality problems (Backholm, Högväg, & Lindholm, 2017). We refer to these as lower- and higher-level discrepancies between the user's mental models about a task, and how the features intended to support the task are designed in the tool. Mental models are widely referred to in the human-computer interaction literature and reflect the user's existing cognitive structures about, for example, how a specific function or work task should be carried out. The user retrieves such models from memory when needed, and applies them to the situation at hand (Johnson-Laird, 1983; Norman, 2013). As stated within activity theory (Kaptelinin, 1994), usability should go beyond only designing a well-functioning system. It should aim at supporting human activities in general in everyday contexts, and therefore, for good usability, a product should fit in well with a user's mental models.
We link lower-level discrepancies between tool design and mental models to basic functionality issues in the product. Such discrepancies can be about where on a computer screen a user would normally start looking for cues about how to carry out a task compared with where on the screen such cues have been placed in the product. Lower-level problems reflect an imbalance between the user's expectations about where to find cues and where they are actually placed. Higher-level discrepancies stem from a disconnection between the user's broader mental schemata for work in emergencies and the tool functionality. Such discrepancies may be about how a user from a first response organisation would monitor and mark potentially interesting social media content, as against the product's monitor and marking features (Backholm, Högväg, et al., 2017).
Avoiding lower- and higher-level usability discrepancies in a product is especially important to maintain SA in high-stress environments. Although not explicitly using the low/high level categorisation, Endsley and Jones (2004) provide a useful description of potential problems that designers need to address. For example, systems that include too many features, or where the most salient features are not necessarily the most important ones, may lead to the user focusing on the wrong tool details. This clearly reflects possible lower-level problems. Furthermore, technological equipment designed to handle work tasks may take over tasks to such a degree that users have difficulties understanding what is going on. As users lose the ability to link emergency-related tool actions to their own higher-level mental models, they fall out of the loop and thereby lose control of the situation.

TESTING USABILITY OF A SOCIAL MEDIA INFORMATION GATHERING TOOL
In the following text, we exemplify how usability testing can be used to secure SA, by referring to our work in the RESCUE project. One of the general aims of the project was to investigate whether professional emergency communicators need a tool or system that automatises social media information gathering tasks, and if so, how a tool should be designed to reflect user needs. We included authorities, first responders, NGO representatives and news journalists. We conducted three steps of testing, and in this section, we present the setup of the three phases, as well as the main results of our studies. In Figure 1, the three phases are summarised and related to the level of tool development.

PHASE ONE: BUILDING A CONCEPTUAL MODEL
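The International Organisation for Standardisation (ISO 9241, 2010) provides recommendations for how designers can test usability throughout the product development process. Designers should start the planning phase by identifying what problem a product should solve, and then specify the contexts and requirements related to how the product will be used. In RESCUE, we collected three sets of data for phase one. The data sets consisted of: (1) semi-structured interviews with crisis communicators from authorities/NGOs, first response teams, and news journalists (four European countries, N = 15; Hornmoen et al., 2018); (2) semi-structured interviews with news journalists (N = 22) from mid-sized media organisations in three European countries (Backholm, Ausserhofer, et al., 2017) and (3) observations of, and semi-structured interviews with, five journalism students participating as journalists in a regional crisis training scenario (information about region not provided due to participant anonymity; Backholm, Ausserhofer, et al., 2017). Furthermore, we mapped existing tools and software (unpublished data), to avoid duplicating already existing products.
The interviews in dataset 1 showed that all communicator groups saw a relevance in implementing social media in their emergency information gathering and distribution routines. As familiar from previous research, the underlying needs varied between groups in our dataset. For example, authorities/NGOs need to listen in to public concerns, while first responders need to validate the trustworthiness of potentially important information from the crisis scene.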
At the time of data collection (2014–2015), news organisations had come further than other communication groups in their collective [...]
Results from phase one showed that respondents would benefit from a tool that focuses on gathering and handling information. Particularly useful features would be to channel content from several platforms into one workspace, and to present content according to the journalist's personal needs. Journalists wanted personalised search-related options such as choosing the content format or media outlet as well as varying options to choose between when visualising content (e.g. timeline or geolocation on a map).
The interviews also showed that journalists would not use a tool designed for automatic content verification purposes. This is such a crucial part of journalistic work that journalists preferred to make final decisions about content trustworthiness manually, as they would not trust tool algorithms to make important decisions. Rather, the tool should offer support for manual decisions by providing clear content overviews gathered into a user-friendly format.
Our mapping of existing tools/software for journalistic information gathering and handling routines showed that this is a rapidly developing area. At the time of data collection, several products that partially answer to journalists' needs existed. Recommendations and guidelines for how to combine available tools in journalistic work also existed, but a tool that directly corresponded to our identified user group needs did not.

PHASE TWO: TESTING THE MAIN TOOL CONCEPT
Based on the results, we constructed a conceptual model for the tool and started designing a first prototype version. Approximately one year after the pre-design mapping, we conducted a usability test with a semi-interactive wireframe prototype. The interface did not yet have the intended final design, and several features did not work. However, the users could navigate through static images by clicking on predefined areas.
The usability test was conducted in a laboratory setting. A total of 17 news journalists (11 females, 65%; sample age 27–60, M = 38) from mid-sized news organisations participated, representing varying publication formats (newspaper, radio, television and web). One journalist participated per session, and each session took between 1 and 2 hours. We collected subjective data (interviews, concurrent think-aloud, observations and surveys) and biometric data (eye movement patterns). Participants were seated in front of a computer screen and shown the tool prototype.
As the first version of the prototype was rudimentary, the main idea was to collect first main impressions of the overall tool idea and included features. Thus, we were interested in possible higher-level discrepancies between users' mental models for journalistic work and the tool idea. We mainly used subjective data in analyses, as it is not relevant to analyse detailed biometric data when studying a low-fidelity prototype where the workspace design is a rough first draft. However, biometric data were used for verifying observations made during the test.
A majority of participants had a positive or neutral first impression of the tool (N = 14), and 10 would use a finished version in their work. As can be expected with a rudimentary prototype, all participants struggled with understanding at least a few specific detailed features. However, only one journalist had problems grasping the overall idea of the product. Some central usability problems emerged. Most participants understood how to search for social media content and recognised that identified results would be presented in a feed. As the feed was presented as a timeline, this design was probably familiar from popular social media outlets like Facebook. However, the next step of sorting out information proved problematic. As explained in the earlier section describing the tool, users can pick out especially interesting content from the feed and move it to another tab, called saved (which in the prototype version was called 'selected'). Participants struggled with understanding this feature and thought that the tab was another way of showing all feed content in more detail.
This usability problem reflected a mismatch with participants' mental models for how to handle social media content. It was easy to grasp how information searches and content feeds should be handled, but the option to then make a second selection from the content feed was counterintuitive. As a response to this, we chose to make two prototype changes: the tab name was changed from 'selected' to 'saved' to clarify the intended action, and a function was added in the feed tab that clearly marks which content has been saved (the content changes to red colour). We did not remove the saved feature from the prototype. This would have left the tool user with only the option to sift through all content in the feed, increasing the risk of information overload and diminishing SA.
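The second main usability problem had to do with collaborative work and communication in the prototype. In the pre-design mapping, journalists had mentioned a need to share found content with other team members. Thus, the prototype included a third tab (named tasks), where users could comment on content, distribute tasks related to further validation of a specific piece of information or carry out similar collaborative tasks. Most participants struggled with varying features related to this tab, from not understanding the overall feature idea to details such as who sends tasks or what tasks consist of. This problem may be linked to the fact that this feature introduced a new way of thinking in the prototype, and thus required an additional cognitive effort from users. Until now, they had mainly focused on individual social media information handling, but in this feature, they had to handle collective work distribution and communication. This clearly caused problems that would affect SA negatively, and several participants stated that they already have other established communication channels they would use instead. Thus, we chose to remove this feature from subsequent tool versions.

PHASE THREE: TESTING THE DETAILED TOOL DESIGN
The second usability test was conducted in the same lab, approximately 9–10 months after the first one. Fifteen journalists from the first sample participated; dropouts were due to relocation and sick leave (nine females, 60%; age 28–61, M = 40). Subjective (interviews, subsequent think-aloud, observations and surveys) and biometric (eye movement patterns, facial responses and skin conductance) data were collected. In this test, biometric data had a more central role and were used in analyses. For instance, facial responses and eye-tracking were used to further verify or reject subjective data. Biometric data in usability tests are explained in more detail in the subsequent chapter by Lindholm, Backholm and Högväg in this book.
For this test, we were interested in more detailed prototype features, that is, lower-level discrepancies between mental models and tool functions. However, possible remaining higher-level discrepancies were naturally also of interest. The tested prototype was a close-to-working version, with a design and functionality that was reminiscent of the intended final product (we illustrate the prototype in Figure 2). It had an interface that responded to user input and worked with live data from Twitter and RSS feeds. Therefore, a user could create new events and monitor, add or modify sources.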
As we wanted to study how well users understood tool features during an on-going event, we developed a simulation of a real-life emergency (a fire at the regional airport) that would unfold during the test. For this, we used a premade dataset of social media content that would be played back during the test sessions, with each piece of content appearing in the prototype at specific time intervals. However, we did not want the scenario to disturb test performance by causing additional stress, and thus participants were not dependent on following each update in detail or creating a thorough understanding of the unfolding crisis.
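The timed playback of a premade dataset can be illustrated with a short Python sketch. This is not the RESCUE implementation, which the chapter does not describe; the function name playback, the (offset, content) tuple layout and the injectable sleep parameter are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical script entry: (seconds after scenario start, content text).
ScriptedItem = Tuple[float, str]


def playback(script: Iterable[ScriptedItem],
             sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield each scripted piece of content once its time offset is reached."""
    elapsed = 0.0
    for offset, content in sorted(script):
        wait = offset - elapsed
        if wait > 0:
            sleep(wait)          # pause until this item is due
            elapsed = offset
        yield content            # the caller would append this to the feed
```

Passing the sleep function as a parameter is a design choice for the sketch: a test session would use the real clock, while a developer checking the script could substitute a function that only records the waits, so that a long scenario can be verified without running in real time.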
Most participants saw the second prototype as easier to approach and more intuitive than the previous one. Furthermore, a majority mentioned that the prototype seemed to require less initial learning than is usually the case with new systems, which naturally may be a consequence of having participated in the previous test one year earlier.
Journalists mainly had difficulties with more detailed features (e.g. keeping track of available search/visualisation options or navigating through pop-up windows), that is, lower-level discrepancies. The overall tool logic and main features worked well. We evaluated identified lower-level problems in detail after the tests, and made necessary changes to the prototype.²

Two central higher-level discrepancies emerged in the tests. One had to do with an option to add cues about the level of content trustworthiness. In this prototype, when saving content, there was also an option to manually evaluate the trustworthiness and importance of the specific content on a 10-point scale. This option was part of the former prototype as well, but had better functionality in this one.
While the main idea of evaluating content corresponded to journalists' mental models, the feature design was seen as too complex and time-consuming. Content evaluation is a necessity in journalistic work, but especially during high-stress assignments, there is no time to grade several options for each piece of content. Also, this will require cognitive resources needed for other tasks. As a response to this, we changed the evaluation scale from 10 to 3 points, and designed simpler evaluation buttons. However, we chose to keep the feature in the tool, as journalists in the pre-design mapping phase had asked for ways to add information about found content. In addition, the evaluation feature is optional, not a required action in the tool.
In the first usability test, the feature to save identified content caused problems. This was somewhat problematic in the latter test as well. Several participants did not understand what was happening when they saved an object in the content feed. The saved content was indicated with a red colour, and participants thought that they had deleted rather than saved the object. The problems can be interpreted as a combination of higher- and lower-level issues. While the 'old' higher-level problem with not understanding the logic between the feed and saved tabs persisted to a degree, a new lower-level problem occurred related to how saved content was visualised in the feed. As post-test improvements, we changed the colour of the indicator for saved content, and further clarified buttons and similar details in the save feature.

DISCUSSION
Activity theory states that usability should be about more than designing well-functioning systems, by also contributing to meaningful human activities in general (Kaptelinin, 1994). Our work on developing an information handling tool shows that computational thinking (Wing, 2006) [...]
The first research question we wanted to answer in this chapter was how usability tests may contribute to better SA. Our results showed that usability tests do contribute and that a tool designed for work in high-stress situations needs to be carefully balanced between including necessary features and avoiding tasks that require time-consuming manual actions (Endsley, 2009). For example, while our pre-design samples wanted several support options in the tool, such as to evaluate social media content or communicate such content to colleagues, tests showed that tool features proved difficult to combine with users' mental models about how to do this work in emergencies.
Furthermore, we showed that usability tests are necessary to identify factors that contribute to an intuitive tool interface. As a clear interface is especially important when working in high-stress environments (ISO 9241, 2010), illogical steps between main tool features should be avoided. Our sample struggled with the steps between both the feed and saved tabs and the saved and tasks tabs, but for different reasons. While the former could be solved by redesigning the interface, the latter was such a severe discrepancy with users' mental models that the whole feature was removed from the tool.
In this chapter, the second research question was about how usability tests should be implemented during the product development phases to contribute to SA in the best possible way. Our usability tests showed that even though the design was based on a mapping of user needs, the tool design and functionality still needed thorough and repetitive usability testing (Figure 1). While problems may be identified and solved in a first test, new problems related to the solution may emerge in a second test, as seen with the saved feature in the RESCUE tool. Thus, designers should include a pre-production mapping of user needs as well as repeated usability testing throughout the product development phase (ISO 9241, 2010).
From a methodological viewpoint, laboratory tests should ideally be combined with tests in real-life scenarios when the tool development process has reached the stage where a relatively finished version of the tool exists. This final part is still missing in our usability testing work with the RESCUE tool and will be a natural next step when implementing the tool in user organisations' work settings. We conducted our tests in a lab setting, and while we strived to include a sample that represents the occupational group, the interpretability of the results beyond this sample is limited.

NOTES
1. In this chapter, we will refer somewhat interchangeably to emergencies and crises. The distinction between the events can be either the scale of the disruption, or the cause of the event. However, such events cause a threat to societal values and demand some sort of response from different actors.
2. As these are changes related, for instance, to moving the position or changing the wording of a button, we will not list them in further detail in this chapter. For more information about test results, see Backholm, Högväg, Lindholm, Knutsen, and Westvang (in press).

Figure 1: Three Phases of Usability Testing in the RESCUE Project.

Figure 2: The Close-to-Working Prototype with a Reactive Interface Used in Test 2.