A systematic literature review of how cybersecurity-related behavior has been assessed

Purpose – Cybersecurity attacks on critical infrastructures, businesses and nations are rising and have reached the interest of mainstream media and the public's consciousness. Despite this increased awareness, humans are still considered the weakest link in the defense against an unknown attacker. Whatever the reason (naïve, unintentional or intentional behavior of a member of an organization), the result of an incident can have a considerable impact. A security policy with guidelines for best practices and rules should guide the behavior of the organization's members. However, this is often not the case. This paper aims to provide answers to how cybersecurity-related behavior is assessed.
Design/methodology/approach – Research questions were formulated, and a systematic literature review (SLR) was performed by following the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement. The SLR initially identified 2,153 articles, and the paper reviews and reports on 26 articles.
Findings – The assessment of cybersecurity-related behavior can be classified into three components, namely, data collection, measurement scale and analysis. The findings show that subjective measurements from self-assessment questionnaires are the most frequently used method. Measurement scales are often composed based on existing literature and adapted by the researchers. Partial least squares analysis is the most frequently used analysis technique. Even though useful insight and noteworthy findings regarding possible differences between manager and employee behavior have appeared in some publications, conclusive answers to whether such differences exist cannot be drawn.
Research limitations/implications – Research gaps have been identified that indicate areas of interest for future work. These include the development and employment of methods for reducing subjectivity in the assessment of cybersecurity-related behavior.
Originality/value – To the best of the authors' knowledge, this is the first SLR on how cybersecurity-related behavior can be assessed. The SLR analyzes relevant publications and identifies current practices as well as their shortcomings, and outlines gaps


Introduction
The importance of information systems (IS) security has increased because the number of unwanted incidents has continued to rise over the last decades. Organizations can take several paths to secure their IS. Technical solutions like whitelisting, firewalls and antivirus software enhance security, but research has shown that when people within the organization do not follow policies and guidelines, these technical safeguards will be in vain.

1.1 Aims of the paper
Of the 26 articles included in this review, 10 used some variation of the phrase "humans are the weakest link in cybersecurity" in either the abstract or the introduction. All of these articles cite multiple authors, accumulating a significant number of previous works that make the same claim. One might agree with Kruger et al. (2020) that it is common knowledge that humans are the weakest link in information security.
Given the premise that humans are the weakest link and the acknowledgment that technology cannot be the single solution for security, research should investigate how organizations can assess the cybersecurity-related behavior of their employees. The aims of this systematic literature review (SLR) are to identify, evaluate and summarize the methods and findings of all relevant literature resources addressing the issue, thereby systematizing the available knowledge and making it more accessible to researchers, while also identifying relevant research gaps.

Background
Recent years have shown that cyberattacks are a global issue; one example is the extensive power outage that caused a blackout across Argentina and Uruguay in 2019 (Kilskar, 2020). In January 2018, the medical records of nearly 3 million people, or roughly 50% of the Norwegian population, were compromised by a cyberattack. Threats range from viruses, worms, trojan horses, denial of service, botnets and man-in-the-middle attacks to zero-day threats (Pirbhulal et al., 2021). The above-listed threats include technical terms with a distinctive flair and uniqueness that are hard to comprehend for employees without a technical background. Moreover, most information security issues are complicated, and fully understanding them requires advanced technical knowledge.
With threats originating from internal and external sources, the need to communicate security measures from the management to the organization's members is of great importance (Sommestad et al., 2014). Developing an organization's security policy is central to increasing knowledge and awareness. According to the ISO 27002:2022 standard, the information security policy sets out the organization's approach to managing its information security. It should contain statements concerning the following: definition of information security; information security objectives or the framework for setting information security objectives; principles to guide all activities relating to information security; commitment to satisfy applicable requirements related to information security; commitment to continual improvement of the information security management system; assignment of responsibilities for information security management to defined roles; and procedures for handling exemptions and exceptions (ISO, 2022).
The extent to which an employee is aware of and complies with information security policy defines the extent of their information security awareness (ISA). ISA is critical in mitigating the risks associated with cybersecurity and is defined by two components, namely, understanding and compliance. Compliance is the employees' commitment to follow best-practice rules defined by the organization (Reeves et al., 2020). Ajzen (1991) defines a person's intention to comply as the individual's motivation to perform a described behavior. The intention to comply captures the motivational factors that influence behavior. As a general rule, the stronger the intention, i.e. the willingness and effort to perform a behavior, the more likely it is that the behavior will be performed.
Several frameworks or theories can be applied to research human behavior. For cybersecurity, behavior can be viewed through lenses and theories borrowed from disciplines such as criminology (e.g. deterrence theory), psychology (e.g. theory of planned behavior) and health psychology (e.g. protection motivation theory) (Moody et al., 2018; Herath and Rao, 2009). The most commonly used models in the context of cybersecurity are the general deterrence theory, the theory of planned behavior and the protection motivation theory (Alassaf and Alkhalifah, 2021).
Staff's attitude and awareness can pose a security problem. In those settings, it is relevant to consider why the situation exists and what can be done about it. In many cases, a key reason will be the limited extent to which security is understood, accepted and practiced across the organization (Furnell and Thomson, 2009). As a mitigating step toward compliance, decision-makers will need guidance on achieving compliance and discouraging misuse when developing information security policies (Sommestad et al., 2014). Therefore, the ability to assess behavior is a prerequisite for decision-makers in their quest to develop the organizations' information security policies. The development and responsibility for implementing policies lie within the purview of management (Höne and Eloff, 2002). Accordingly, understanding the differences in cybersecurity-related behavior between management and employees will benefit the development of more secure organizations.

Structure of the paper
The rest of this paper is organized as follows: Section 2 describes the methodology for conducting the SLR; the research questions; the record search process; and the assessment criteria. In Section 3, the results and the findings are presented. A discussion of the findings is presented in Section 4. Section 5 summarizes our conclusions and outlines directions for future research.

Method
This section discusses the fundamental stages of conducting an SLR. The SLR constructs are obtained by following the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (Page et al., 2021) and those of Fink (2019) and Weidt and Silva (2016).
The foremost step is to investigate whether a similar review has already been conducted. Searching for and studying other reviews helps refine both research questions and search strings. The search did not discover any similar reviews. Keywords, search strings and research questions were collected and categorized in a literature index tool and used to optimize search strings and verify that this review's chosen research questions are relevant and valuable to the body of knowledge.
A research review is explicit about the research questions, search strategy, inclusion and exclusion criteria, data extraction method and steps taken for analysis. Research reviews are, unlike subjective reviews, comprehensible and easily reproducible (Fink, 2019). The remainder of this section elaborates on the components of the performed SLR.

Research questions
The idea of such review studies is to broaden and deepen the understanding of where the edge of current knowledge resides. The research questions should be broad enough to include relevant literature and precise enough to guide the review (Fink, 2019). Research questions are tailored to a topic and to a context; in this instance, the context is the human aspect of cybersecurity. Specifically, how can we assess behavior adhering to rules or policies of an organization in a cybersecurity context? Accordingly, our main research question is: RQ1. How is cybersecurity-related behavior assessed?
Such behavior may be affected by an individual's position within the organization. Considering the leading role that the management is expected to have in improving the cybersecurity culture in an organization, exploring possible behavioral differences between management and employees is also significant. Accordingly, a secondary research question is: RQ2. Are there differences between manager and employee behavior in a cybersecurity context?

Record searching process
Various search strings were used in this SLR, depending on the database. The keywords were kept unchanged, but the syntax of each database differs; hence, the search strings have minor differences. This study includes the following databases: Scopus, IEEE, Springer, Engineering Village, ScienceDirect and ACM. The following keywords (exact and stemmed words) were used, adapted to the syntax of each database: Cyber, Security, Information, policy, compliance, measure, behavior. As an example, the following is the search used in Scopus: TITLE-ABS-KEY ((information AND security AND policy OR information AND security AND compliance OR policy AND compliance) AND (information AND security AND behavior)) AND PUBYEAR > 2001. To increase the precision of the searches, title, abstract and keywords were used as a limiter in all the databases.
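The structure of the Scopus search string can be made explicit with a short sketch. The function name `scopus_query` and its parameters are illustrative assumptions, not part of the review's tooling; the sketch only reproduces the Scopus syntax quoted above and makes no claim about the syntax of the other databases.

```python
def scopus_query(concept_groups, required_terms, after_year):
    """Assemble a TITLE-ABS-KEY query shaped like the Scopus string quoted in the text.

    concept_groups: alternative keyword combinations, joined with OR
    required_terms: keywords that must all be present, joined with AND
    after_year:     lower bound used with the PUBYEAR > operator
    """
    alternatives = " OR ".join(" AND ".join(group) for group in concept_groups)
    required = " AND ".join(required_terms)
    return (
        f"TITLE-ABS-KEY (({alternatives}) AND ({required})) "
        f"AND PUBYEAR > {after_year}"
    )


query = scopus_query(
    concept_groups=[
        ["information", "security", "policy"],
        ["information", "security", "compliance"],
        ["policy", "compliance"],
    ],
    required_terms=["information", "security", "behavior"],
    after_year=2001,
)
print(query)
```

Keeping the keyword groups as data and the database syntax in one function is one way to hold the keywords constant while only the syntax varies per database.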

Assessment criteria
This section describes the screening methodology and eligibility criteria used in this study. First, duplicates were removed based on each entry's digital object identifier (DOI). No dedicated tool other than a spreadsheet was used to perform the removal. In cases where an entry from the database did not include a DOI, a manual search and removal process was performed using the title, author, year or similar information that could uniquely identify the entry. Second, inclusion and exclusion criteria were applied. For this study, the following criteria are defined:
(1) Exclusion criteria: studies from organization reports, guidelines and technical opinion reports; by research design, reviews, editorials and testimonials, as using secondary data (data from other reviews, etc.) would make this review a tertiary one; and nonresearch literature.
(2) Inclusion criteria: written in English; published in 2001-2022; original studies using theoretical or empirical data; and studies published in journals, conference proceedings and books/book sections.
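The two screening steps described above (duplicate removal, then eligibility criteria) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the review relied on a spreadsheet and manual checks. The record fields (`doi`, `title`, `first_author`, `year`, `language`, `type`, `design`) and the sample records, including their DOIs, are hypothetical.

```python
import re

ALLOWED_TYPES = {"journal", "conference", "book", "book section"}
EXCLUDED_DESIGNS = {"review", "editorial", "testimonial"}


def normalize(text):
    """Lowercase and collapse punctuation so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def record_key(record):
    """Prefer the DOI; fall back to title, first author and year when it is missing."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("meta", normalize(record["title"]),
            normalize(record["first_author"]), record["year"])


def deduplicate(records):
    """Keep the first record seen for each key."""
    seen, unique = set(), []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def is_eligible(record):
    """Apply the inclusion and exclusion criteria listed in the text."""
    return (record["language"] == "English"
            and 2001 <= record["year"] <= 2022
            and record["type"] in ALLOWED_TYPES
            and record["design"] not in EXCLUDED_DESIGNS)


def make(doi, title, author, year, kind, design):
    return {"doi": doi, "title": title, "first_author": author, "year": year,
            "language": "English", "type": kind, "design": design}


# Made-up records for illustration only.
records = [
    make("10.1000/A1", "Policy compliance at work", "Doe", 2015, "journal", "empirical"),
    make("10.1000/a1 ", "Policy compliance at work", "Doe", 2015, "journal", "empirical"),
    make(None, "Security Behavior: A Survey", "Roe", 2019, "conference", "empirical"),
    make("", "security behavior - a survey", "roe", 2019, "conference", "empirical"),
    make("10.1000/b2", "A review of reviews", "Poe", 2020, "journal", "review"),
]

unique = deduplicate(records)
eligible = [r for r in unique if is_eligible(r)]
print(len(records), len(unique), len(eligible))
```

A record that appears once with a DOI and once without one is not caught by this key, which is one reason a manual pass of the kind described above remains necessary.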

Analysis of included articles
The results presented in this review are based on the abstraction of data from the articles. The descriptive synthesized results are based on the reviewers' experience and the quality and content of the available literature (Fink, 2019). All results are based on an abstraction of data except for those in Section 3.3.4, where the NVIVO software was used to uncover the most frequently used words in a text compiled from the analysis sections of all articles in the review.
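The word-frequency step can be illustrated with a small sketch. It uses Python's standard library as a stand-in and does not reproduce NVIVO's behavior; the stop-word list, the minimum word length and the two sample sentences are assumptions made for the illustration and are not taken from the reviewed articles.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "was", "were",
             "is", "for", "with", "on"}


def word_frequencies(texts, min_length=3):
    """Count words across all texts, ignoring stop words and very short words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words
                      if len(w) >= min_length and w not in STOPWORDS)
    return counts


# Placeholder text standing in for the compiled analysis sections.
analysis_sections = [
    "The model was tested with partial least squares.",
    "Regression analysis and partial least squares were used.",
]

counts = word_frequencies(analysis_sections)
for word, n in counts.most_common(3):
    print(word, n)
```

Counting single words splits multi-word method names such as "partial least squares" into their parts, so the ranked list still needs to be read and interpreted by the reviewer.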

Identification, screening, eligibility and inclusion mechanism
The search returned 2,153 records. The first step before any analysis is to remove any duplicates. After removing duplicates, a total of 1,611 unique records remained. Following the recommendation from Weidt and Silva (2016), the first analysis step is screening by title and abstract. A total of 1,517 records were found to be irrelevant for this review, leaving 94 articles for additional screening. The (optional) second screening, depending on the number of articles, involves an analysis of each article's introduction and conclusion. For this study, an analysis of the method section was also included in the second screening step. This narrowed the number down to 28, of which another 2 articles were excluded because of a lack of empirical data and irrelevance to the topic being reviewed, leaving a total of 26 articles for complete text analysis. Figure 1, adapted from Page et al. (2021), depicts the screening process.
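The counts reported for the screening process can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the stage labels are shorthand chosen for the illustration.

```python
# Number of records remaining after each stage, as reported in the text.
stages = [
    ("records identified", 2153),
    ("after duplicate removal", 1611),
    ("after title/abstract screening", 94),
    ("after second screening", 28),
    ("included for full-text analysis", 26),
]


def removed_per_step(stages):
    """Return how many records were dropped between consecutive stages."""
    return [
        (f"{before[0]} -> {after[0]}", before[1] - after[1])
        for before, after in zip(stages, stages[1:])
    ]


removed = removed_per_step(stages)
for step, count in removed:
    print(f"{step}: {count} removed")
```

The differences come to 542 duplicates, 1,517 records removed at title and abstract screening, 66 removed at the second screening and 2 removed before full-text analysis, which agrees with the numbers stated above.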

Trend and classification of included studies
Of the 26 selected articles, 19 were published in journals, and the remaining 7 in conferences, or 73% and 27%, respectively (see Figure 2). The figure also demonstrates the increased interest in the subject in the past two years.

Findings
3.3.1 How is cybersecurity-related behavior assessed? Of the selected 26 articles in this review, 24 or 92% provide insight into how cybersecurity-related behavior is assessed. A three-step process emerges as the way to assess such behavior: First, information from subjects needs to be collected. This is referred to as data collection. Second, a measurement scale is deployed to ensure that the data collected is relevant and encompasses the research topic. The final step is the data analysis.
3.3.2 Data collection. Two forms of data can be collected, qualitative or quantitative. Both of these types of data can be subjective or objective; the two distinctions are independent of each other. The most common way to collect subjective data is using a questionnaire with questions whose answers fit into a five- or seven-point Likert scale. Within a survey, questions may be asked that are subjective, biased or misleading when viewed alone, but the results can easily be used quantitatively (O'Brien, 1999). Given the ubiquity of qualitative data, how to quantify such data, assign "good" numerical values and make the data amenable to more meaningful analysis has been a topic for research since the first methods for quantification began to appear around 1940 (Young, 1981).
Subjective data can lead to inaccurate or skewed results. In contrast, objective data are free from the subject's opinions. This can be, for example, the number of attacks prevented or the number of employees clicking the link in a phishing campaign (Black et al., 2008).
The SLR revealed six types of data collection methods, namely, self-assessment questionnaire (SAQ); interview; vignette; experiment with vignettes; affective computing and sentiment analysis; and clicking data from a phishing campaign. An overview of all articles and the data collection method used in each is presented in Table 1. The most prominent form of data collection is self-assessment (SA). This subjective data collection method is defined by Boekaerts (1991) as a form of appraisal that compares one's behavioral outcomes to an internal or external standard. In total, 22 of the 24 articles used SA as the primary data collection method. The most common way to collect data is through a questionnaire (SAQ). A total of 17 or 71% of the articles used an SAQ as their sole method for data collection.
Of the remaining five articles with results stemming from subjective data, two used vignettes in combination with a regular SAQ. Vignettes are hypothetical scenarios that the subject reads and forms an opinion about based on the information given. Barlow et al. (2013) performed a factorial survey method (FSM) experiment with vignettes by inserting randomly manipulated elements into sentences in the scenarios instead of using static text. Both regular questionnaires and vignettes use the same Likert scale.
The average number of respondents in the included papers is n = 356, with 52% males and 48% females. The most common way to deploy the SAQ is online.
Two studies used interviews to collect information: one used interviews with an SAQ, and the other used interviews as the sole input. Interviews provide in-depth information and are suitable for uncovering the "how" and "why" of critical events as well as the insights reflecting the participants' relativist perspectives (Yin, 2018).
Only two studies used objective, quantitative data: Kruger et al. (2020) used affective computing and sentiment analysis. With the help of a deep learning neural network, the study accurately classified opinions as positive, neutral or negative based on facial expressions. Jalali et al. (2020) used a phishing campaign in conjunction with an SAQ to investigate whether there were any differences between intention to comply and actual compliance.
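The breakdown of data collection methods can be tallied to confirm that the reported counts and percentages are mutually consistent. The grouping below is one reading of the figures given in the text, not a reproduction of Table 1, and the category labels are shorthand chosen for the illustration.

```python
# Number of articles per data collection approach, as read from the text.
method_counts = {
    "SAQ only": 17,
    "vignettes + SAQ": 2,
    "factorial survey experiment with vignettes": 1,
    "interviews + SAQ": 1,
    "interviews only": 1,
    "affective computing and sentiment analysis": 1,
    "phishing campaign + SAQ": 1,
}

# Approaches that the text describes as using objective data.
OBJECTIVE = {
    "affective computing and sentiment analysis",
    "phishing campaign + SAQ",
}


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


total = sum(method_counts.values())
self_assessment = sum(n for m, n in method_counts.items() if m not in OBJECTIVE)

print(total, self_assessment)
print(share(method_counts["SAQ only"], total), share(self_assessment, total))
```

Under this reading, the seven categories add up to the 24 articles that address the first research question, 22 of which rely primarily on self-assessment, and the SAQ-only share rounds to the 71% reported above.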
3.3.3 Measurement scale. A measurement scale ensures that the collected data encompass a topic or subject and do not miss any crucial facets. The role of a measurement scale is to ensure that the data collected is holistic and reproducible. Researchers can use predefined scales developed by others or self-developed ones. Those of the reviewed articles that use the latter form of scale are often not fully transparent about the content of the scale.
This SLR shows that 13 of the 22 articles that used a measurement scale used an unspecified scale. The most frequently used specified scale (in seven papers) is the Human Aspect of Information Security Questionnaire (HAIS-Q), developed by Parsons et al. (2014). When used in conjunction with other scales, HAIS-Q is often the most prominent.
Several pitfalls exist and must be considered when researchers select their measurement scale. If choosing to develop an unspecified scale, as found to be the most deployed alternative in this SLR, length, wording, familiarity with the topic, natural sequence of time and questions in a logical order are some of the topics that researchers should be mindful of (Fink, 2015). Especially the length of the questionnaire is significant; how much time do the respondents have to spend answering the survey? Another critical element when designing a measurement scale instead of using an existing one is validity and reliability. Proper pilot testing is required when choosing not to use an already-validated survey (Fink, 2015).
The HAIS-Q is designed to measure information security awareness in the workplace (McCormac et al., 2017). The Knowledge, Attitude and Behavior (KAB) model is at the center of HAIS-Q. The hypothesis is that when computer users gain more knowledge, their attitude toward policies will improve, translating into more risk-averse behavior (Pollini et al., 2021). The HAIS-Q comprises 63 questions covering 7 focus areas (internet use, email use, social networking site use, password management, incident reporting, information handling and mobile computing). Each focus area is divided into equal parts for KAB, resulting in 21 questions for each KAB element, spread evenly across the seven focus areas. For a detailed overview of the other scales used in conjunction with HAIS-Q, see the last column in Table 1.
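The layout of the questionnaire follows from the numbers above: 63 questions over 7 focus areas and 3 KAB elements implies 3 questions per combination of focus area and element. The sketch below builds that grid and averages Likert responses per KAB element. Scoring by a plain mean is an assumption made for illustration and is not claimed to be the scoring rule of HAIS-Q; the item identifiers and the uniform sample responses are likewise hypothetical.

```python
from itertools import product
from statistics import mean

FOCUS_AREAS = [
    "internet use", "email use", "social networking site use",
    "password management", "incident reporting",
    "information handling", "mobile computing",
]
KAB_ELEMENTS = ["knowledge", "attitude", "behavior"]
ITEMS_PER_CELL = 63 // (len(FOCUS_AREAS) * len(KAB_ELEMENTS))  # 3

# One identifier per question: (focus area, KAB element, running number).
items = [
    (area, element, number)
    for area, element in product(FOCUS_AREAS, KAB_ELEMENTS)
    for number in range(1, ITEMS_PER_CELL + 1)
]


def element_means(responses):
    """Average the Likert scores per KAB element (illustrative scoring only)."""
    return {
        element: mean(score for (_, e, _), score in responses.items()
                      if e == element)
        for element in KAB_ELEMENTS
    }


# Hypothetical respondent who answers 4 on every five-point item.
responses = {item: 4 for item in items}
scores = element_means(responses)
print(len(items), scores)
```

Counting the grid gives 21 questions per KAB element and 9 per focus area, in line with the description of the questionnaire.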
The KAB model that underpins HAIS-Q has been criticized by researchers when used in, e.g. health and climate research. Both Parsons et al. (2014) and McCormac et al. (2016) cite McGuire (1969), who suggests that the problem is not with the model itself but with how it is applied. Parsons et al. (2014) highlight essential differences between environmental and health studies and the field of information security. Much ambiguity and unclear or contradictory information exist in the two former topics, while most organizations have an information security policy, either written or informal, indicating what is expected from employees (Parsons et al., 2014). Barlow et al. (2013) advocate using scenarios instead of direct questions, like those in HAIS-Q, because it is difficult to assess actual deviant behavior by observation or direct questioning.
Another critique of the HAIS-Q is the length of the questionnaire. With 63 questions, respondents might lose interest, be inattentive to the questions and sometimes give false answers (Velki et al., 2019). On the other hand, Parsons et al. (2017) show that the HAIS-Q questionnaire is a reliable and validated measurement scale and accommodates some of the concerns raised by Fink (2015). Pollini et al. (2021) caution that a questionnaire only considers the individual level and may not capture a holistic and accurate measurement of the organization. Therefore, in their study, HAIS-Q questionnaires were deployed at the individual level, and interviews were used to assess the organizational level.
3.3.4 Analysis. To uncover how the included articles had analyzed their results, NVIVO, a qualitative data analysis software, was used to identify the most frequently used words in each article. A compiled document containing each article's analysis section was analyzed in NVIVO. All articles use some sort of validation and statistical verification of the collected data. The use of word count provides both a structured presentation and an unbiased account of how often keywords affiliated with the technical part of the analysis are used. The result from NVIVO shows that partial least squares (PLS) is the most frequently used method. Herman Wold first coined PLS in 1975; it can be preferable in cases where constructs are measured primarily by formative indicators, e.g. managerial research, or when the sample size is small (Haenlein and Kaplan, 2004). This result is also in line with the finding in Kurowski (2019): "Most of policy compliance research uses partial least squares, regression modeling or correlation analyses."
3.3.4.1 Are there differences between manager and employee intention and behavior in a cybersecurity context? Only five articles, or 19%, provide insight into the second research question. However, none provides a clear-cut response to this research question. There is a consensus in all five articles that organizational culture is a cornerstone for security and policy-compliant behavior (Reeves et al., 2020; Hwang et al., 2017; Alzahrani, 2021; Parsons et al., 2015; Li et al., 2019).
Among the articles, there is also a broad agreement that peers' behavior, i.e. the influence that peers have on our behavior, is vital for a positive cybersecurity outcome (Li et al., 2019; Alzahrani, 2021; Hwang et al., 2017). Peer- and policy-compliant behavior can only be achieved when the organization has a positive cybersecurity culture. The development of organizational culture often comes from the top management; hence, the development and continued improvement of culture will be assigned to management (Li et al., 2019; Reeves et al., 2020). One interesting finding in the context of developing or harnessing a security culture is that managers have a much lower information security awareness; Reeves et al. (2020) therefore recommend that future training should be targeted to management. This small paradox is at least something to dwell on, given that culture is built from the top.
All the articles provide reasons for noncompliance in their findings. In a hectic environment, employee workload has been shown to negatively impact compliance (Jalali et al., 2020). Connected to workload are work goals. Security will draw the short straw when goals and security do not align. If security is viewed as a hindrance, noncompliant behavior will arise (Reeves et al., 2020; Hwang et al., 2017; Alzahrani, 2021; Parsons et al., 2015). Also, when employees lack knowledge or have not been given sufficient information about the organization's security policies, compliant behavior will be impacted (Hwang et al., 2017; Alzahrani, 2021; Parsons et al., 2015; Li et al., 2019).

Discussion
The findings of this SLR have shown that subjective data predominate in the measurement of cybersecurity-related behavior. Over 90% of the included articles use subjective data to measure behavior. Only one article relies solely on objective measurements. The availability and ease of use of subjective methods might be the reason. An interview can be done without much cost or planning, whereas using objective methods, e.g. a phishing campaign, will require more resources.
However, the use of subjective data can lead to biased responses from the subjects. This bias can be problematic. According to Kurowski (2019), "For instance, survey reports of church attendance and rates of exercise are found to be double the actual frequency when self-reported." Almost all articles address the issue of biased measurement. Many refer to Podsakoff et al. (2003) and the recommendation therein to assure respondents that their identity will be kept anonymous. Several researchers seem to regard anonymization as an acceptable way to remove the risk of bias. However, as Kurowski (2019) finds, bias does exist in today's research. In his paper, to test for biased responses, two questionnaires were used, one using standard, straightforward compliance questions and one using vignettes, see Table 1. Kurowski (2019) found that generic questionnaires may capture biased policy compliance measures. If an individual reports policy compliance on the literature-based scale, it may mean any of the following: an individual is indeed compliant; an individual does not know the policy and does not act compliant; or an individual thinks they are compliant with the policy because they behave securely, but do not know the policy. This does not imply that existing research fails to measure policy compliance entirely, but it fails to measure it reliably (Kurowski, 2019).
Jalali et al. (2020) included objective and subjective measurements. They compared the employees' intention to comply with their actual compliance by examining whether the employees had clicked the link in the phishing campaign or not. They found no significant relationship between the intention to comply and the actual behavior. This result is not in line with previous studies that used self-reported data, a method that leaves room for socially desirable answers (Podsakoff et al., 2003) and in which previous answers could influence later answers (Jalali, 2014).
Even the HAIS-Q, the single most used questionnaire, used seven times in this SLR, is not free from biased responses. Even though the questionnaire was validated and tested by Parsons et al. (2017), McCormac et al. (2017), who examined it for biased responses, showed that social desirability bias can be present. This means that further research is needed to exclude biased responses from HAIS-Q.

Conclusion
This SLR, which started with 2,153 records and was reduced during several analysis steps to 26 articles, provides insights into the predefined research questions.
The main research question was: RQ1. How is cybersecurity-related behavior assessed?
When excluding all preparatory work before a study is performed, the assessment of behavior can be classified into three components: data collection, measurement scale and, lastly, analysis. This research found that, in the context of cybersecurity, subjective data are collected to a much larger extent than objective data, with online SAQ as the most prominent way to collect data. Measurement scales are often composed based on existing literature and adapted by the researchers. The most commonly used questionnaire is HAIS-Q, developed by Parsons et al. (2014). Finally, an analysis is performed to test for internal and external validation of the collected data. PLS analysis is the most frequent technique in the selected articles. Although a clear path to assess behavior is uncovered, the prevailing self-assessment method can produce biased data. Thus, future research should address the problem of objectively assessing cybersecurity-related behavior and the factors affecting it. The second research question, i.e. whether there exist differences between manager and employee behavior, was not conclusively answered. Of the relatively small number of articles, several provide insights and noteworthy findings but not conclusive answers to this research question. In light of the significance of the matter for improving the cybersecurity culture in an organization, this constitutes another interesting research gap.
Future research should bridge the above research gaps, and studies should include employees and management from the same organization. This will require more planning and coordination than simply deploying a questionnaire online. Extra effort in anonymizing personal data must be in place because subjects come from the same organization. The uncertainty surrounding anonymization and the risk of biased responses concerning anonymization must be mitigated. This can be achieved by using a hybrid method consisting of objective and subjective data collection, e.g. self-assessment questionnaires and phishing campaigns. Future research should collect holistic data within a market, country, segment or similar, as research into compliance is context-dependent (Jalali et al., 2020).