Psychological assessment in human resource management: discrepancies between theory and practice and two examples of integration

Purpose – Psychological assessment refers to the process whereby different methods and techniques are used to test hypotheses about people and their psychological characteristics. Understanding employees' psychological makeup is key to allowing effective human resource management, from hiring to retirement. However, the gap between scientific evidence and organizational practices dealing with psychological assessment is still great. Design/methodology/approach – General review along with case study. Findings – This paper shows the differences between research and practice, i


Introduction
Assessment and psychological assessment
Assessment refers to a process aimed to deliver judgment and make an evaluation or decision (McDermott, 2012; Ceschi et al., 2017a, b, c). People can be assessed for several reasons, e.g. to monitor learning, to make a diagnosis, to decide who to hire (Sartori and Ceschi, 2013). When specifically aimed at investigating psychological characteristics, psychological assessments are carried out by using a combination of methods and techniques (Sartori and Pasini, 2007). These can either refer to the idiographic or clinical approach, aiming at a global evaluation of people, for example by using interviews, or to the nomothetic or psychometric approach, which focuses on a targeted assessment of specific features and mainly makes use of standardized instruments such as psychological tests (Luthans and Davis, 1981; Sartori, 2010).
Research has shown that the integration of different methods and techniques can increase the validity and reliability of psychological assessment and improve its predictive value (Sartori and Pasini, 2007; Sartori, 2010). Moreover, while assessment situations vary, the use of standardized tools is largely encouraged to avoid biases in evaluations (Sartori and Pasini, 2007). Besides, psychological assessment is often referred to as psychological testing, through which people can be described and differentiated based on a set of unidimensional psychological characteristics (Sartori, 2006), such as intelligence and personality traits (Sartori, 2006; Sartori and Pasini, 2007). These can be measured by a variety of different instruments, which have to meet criteria regarding validity, i.e. the extent to which they measure a specific construct, and reliability, i.e. the extent to which their results are consistent and stable over time (Sartori and Pasini, 2007; Sartori, 2010).
Yet, despite the centrality of psychological assessment for HRM, a gap exists between evidence-based recommendations and organizational practices (Highhouse et al., 2016), where psychological instruments are rarely, if ever, employed (Ones et al., 2007). For example, according to a survey conducted among 1627 HR managers representing large organizations in the US, while 68% of employers engage in various forms of job skill testing, only 29% of them use one or more forms of psychological measurements (SIOP, http://www.siop.org/workplace/employment%20testing/usingoftests.aspx). These data are consistent with other findings showing that less than 20% of US companies currently use personality tests and that 82% of organizations do not use personality tests in the hiring or employee promotion process (Dattner, 2013).
Likewise, research shows that when it comes to personnel selection, unstructured interviews are still the most common tool used to make hiring decisions despite abounding evidence on their lower validity and reliability compared to structured and standardized instruments (Sartori and Pasini, 2007; Sartori, 2010; Cubico et al., 2010). In a similar vein, a study conducted in the Netherlands showed that HR managers hold stronger intentions toward unstructured interviewing compared to structured interviewing (van der Zee et al., 2002). Similarly, in a study conducted in Italy among 21 HR managers and recruiters, participants perceived individual interviews as unavoidable to assess candidates' psychological characteristics (Sartori et al., 2017). Moreover, results from this study showed that psychological tests were perceived as lacking a fit with specific organizational needs or as too time-consuming with regard to administration and analysis of results.
Overall, while larger organizations generally make more extensive use of psychological tests compared to small enterprises, evidence shows that interviews are nevertheless considered a final essential step of the selection process to allow real understanding of applicants' psychological characteristics (van der Zee et al., 2002). Such phenomena can be explained as the result of at least two cognitive biases, i.e. the illusion of control (Langer, 1975), which refers to the tendency to overestimate one's ability to control events, and the overconfidence effect, which occurs when subjective confidence in one's judgments is greater than one's objective accuracy (Sartori and Ceschi, 2013; Ceschi et al., 2019). For example, research shows that while HR managers are aware of the availability of standardized tests and instruments, their beliefs regarding the validity of such tools are mixed and include perceptions of being skilled enough to reliably assess psychological traits through unstructured interviews (Sartori et al., 2017).

Personality assessment in HRM
In the literature, the importance and limited use of personality tests in organizations have been subject to considerable discussion. According to Morgeson et al. (2007), personality measures in HRM are useless because of their low validity and perceived problems with response distortion. On the other hand, others argue that personality constructs have been shown to explain and predict attitudes, behaviors, performance and organizational outcomes (Ones et al., 2007), with hundreds of primary studies and dozens of meta-analyses indicating strong support for the use of personality measures in staffing decisions (Ones et al., 2007). Also, research has shown that employing different methods and techniques can improve predictive validity (Furnham et al., 2008; Gaugler et al., 1987; Goldstein et al., 1998; Hardison, 2006; Krause et al., 2006). Yet, studies reveal that the use of multiple methods is often considered expensive and time-consuming (Sartori and Ceschi, 2013; Krause et al., 2006), which leads HR managers to use intuitive and unstructured interviews (van der Zee et al., 2002). Moreover, from a methodological point of view, while unstructured interviews can pose problems in terms of reliability and validity (Ceschi et al., 2017a, b, c), the use of different methods and techniques can result in too much information that may be contradictory and difficult to manage, eventually leading to biased evaluations (Ceschi et al., 2019; Sartori and Ceschi, 2013). Against this background, research is needed to shed light on how to develop assessment tools that can meet organizational needs while being at the same time reliable and valid. To reach this aim, researchers are called to be aware of the needs from applied contexts, i.e. to tailor and develop solutions that meet scientific criteria and are perceived as useful by practitioners. We present two such examples below, which highlight the efforts and ways to integrate the different needs of scholars and practitioners. In doing so, we aim to show how the two worlds of research and practice can be brought together to give life to tools of psychological assessment that are both valid and reliable from a scientific point of view and useful and useable from a professional point of view.

Case study 1: Development of a psychological test for an Italian health association
The first case study is about an Italian health association offering emergency first aid assistance in accidents, disasters and calamities (Sartori et al., 2014; Sartori and Ceschi, 2015). The association is composed of 12 branches with about 70 employees and 1500 volunteer rescuers who work in the ambulance. The occasion for the development of the specific psychological test presented here is the assessment and selection of the numerous candidate volunteer rescuers. Every year, 100 people are admitted for the training courses (two courses per year) at the end of which, if they pass the final test, they can access the association and operate as volunteer rescuers in the ambulance.
The organization, through its board of six directors, expressed the need for a psychological test with the following characteristics:
(1) Tailored to the population of volunteer rescuers of the association.
(2) Short and easy to administer, as well as valid and reliable.
(3) Not too selective from the personnel selection point of view since the organization needs volunteers to provide its services and the number of dropouts is generally high.
To reach these aims, the authors adopted an approach combining both qualitative and quantitative techniques (for further details, see Sartori et al., 2014, pp. 3039-3042).
The qualitative part comprised four focus groups with 45 volunteer rescuers divided into groups of 10-12 people each and two two-hour group discussions with the six directors of the association, to define the characteristics of the test in terms of length, agility and psychological constructs to be measured. The two group discussions were carried out at the beginning and the end of the development process of the test. The authors decided to carry out four focus groups rather than only one or two to access the highest number of ideas on the test to be developed, without, however, excessively prolonging this phase of data collection. Qualitative data were interpreted in the light of previous literature showing that volunteers are characterized by specific attitudes (Lammers, 1991; Sundeen, 1992; Chacon et al., 2011) and reasoning style (Haan et al., 1968; Briggs et al., 2010; Stolinski et al., 2004). Accordingly, it was established that the test should measure two such constructs, i.e. one referring to the attitude candidates should have and the other referring to their reasoning. At the end of the qualitative phase, including data collection and interpretation in the light of previous evidence, the newly developed test was composed of 20 items, nine belonging to the dimension of attitude and 11 belonging to the dimension of reasoning (for further details, see Sartori et al., 2014, Tables 1 and 2).
PR 51,1
The items measuring attitude and reasoning were either newly developed or drawn from previous research and adapted to the target. For attitude, the response scale ranged from 1 = completely false to 6 = completely true, with an even number of options to avoid responses on the central point. Also, items measuring attitude were all reversed, meaning that higher scores indicated a less desirable attitude with respect to the values of the association. This methodological choice was thoroughly discussed during focus groups and group discussions and made according to the concept of face validity in personnel selection contexts (see Sartori, 2010). The 11 items measuring reasoning comprised tasks or problems with only one correct answer. Each item had six alternatives, following the 6-point rating scale of the items measuring attitude (Burro et al., 2011). Moreover, the items measuring attitude and reasoning were mixed, a methodological choice thoroughly discussed in the light of literature (Nunnally, 1978; Kline, 1998).
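As an illustration of the item format just described, the following minimal Python sketch shows how responses of this kind might be scored. It assumes that reversing a 6-point rating means mapping 1 to 6, 2 to 5 and so on, and that each reasoning item is scored against a single keyed alternative. The function names, the answer key and the respondent data are hypothetical and are not taken from the published test.

```python
from typing import Sequence

ATTITUDE_MIN, ATTITUDE_MAX = 1, 6  # 6-point scale described in the text


def reverse_attitude(response: int) -> int:
    """Reverse a 1-6 rating so that higher scores mean a more desirable attitude."""
    if not ATTITUDE_MIN <= response <= ATTITUDE_MAX:
        raise ValueError(f"response out of range: {response}")
    return ATTITUDE_MIN + ATTITUDE_MAX - response


def score_reasoning(chosen: Sequence[int], answer_key: Sequence[int]) -> list[int]:
    """Score each reasoning item 1 if the chosen alternative matches the key, else 0."""
    if len(chosen) != len(answer_key):
        raise ValueError("chosen and answer_key must have the same length")
    return [int(c == k) for c, k in zip(chosen, answer_key)]


# Hypothetical respondent: nine attitude ratings and eleven reasoning choices.
attitude_raw = [2, 1, 3, 2, 1, 4, 2, 1, 2]
attitude_total = sum(reverse_attitude(r) for r in attitude_raw)

# Invented answer key (alternatives numbered 1-6), for illustration only.
answer_key = [3, 1, 6, 2, 4, 5, 1, 2, 3, 6, 4]
chosen = [3, 1, 6, 2, 4, 5, 1, 2, 3, 1, 1]
reasoning_total = sum(score_reasoning(chosen, answer_key))

print(attitude_total, reasoning_total)
```

Keeping the two totals separate mirrors the two-construct structure of the test; whether and how they are combined into a single judgment is not specified here.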

Psychological assessment in HRM
Before going on with the administration of the test, 11 items collecting personal data were developed. The 20-item test resulting from theoretical considerations and the qualitative phase previously described was first administered to a pilot sample of 54 volunteers who did not participate in the focus groups and then to 481 participants to assess the properties of the newly developed test (see Tables 3 and 4).
After reversing all the answers to the items measuring attitude and transforming the answers to the 11 reasoning items into 0-1 scores (where 0 = wrong answer, 1 = right answer), descriptive statistics (i.e. frequencies, percentages, minimum, maximum, mode, median, mean, SD, skewness and kurtosis) were computed to test whether the variables fit a normal distribution, an assumption to be respected before proceeding with inferential statistics (Sartori, 2006). Then, two separate sets of principal component analyses were computed, one for the items measuring attitude and one for the items measuring reasoning (see Tables 5 and 6). Finally, item-total correlations and Cronbach's alpha coefficients were computed as reliability measures (for further details, see Sartori et al., 2014, pp. 3043-3049). Results were in line with the theoretical considerations and showed, for example, that the correlation between attitude and reasoning was not statistically significant, suggesting that the measured attitude and reasoning are different psychological constructs, as hypothesized.
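The two reliability measures mentioned above can be sketched as follows. This is a minimal illustration using NumPy and the standard textbook formula for Cronbach's alpha, together with corrected item-total correlations (each item against the sum of the remaining items). Whether the original analyses used corrected or uncorrected item-total correlations is not stated, so that choice is an assumption, and the small score matrix is invented.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return float((k / (k - 1)) * (1.0 - item_variances.sum() / total_variance))


def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    totals = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], totals - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])


# Hypothetical data: five respondents, three already-reversed attitude items.
scores = np.array([
    [5, 4, 5],
    [3, 3, 2],
    [6, 5, 6],
    [2, 2, 3],
    [4, 4, 4],
], dtype=float)

alpha = cronbach_alpha(scores)
item_total = corrected_item_total(scores)
print(round(alpha, 3), np.round(item_total, 3))
```

In practice the same two functions would be applied separately to the attitude items and to the 0-1 reasoning items, since the two sets are treated as distinct constructs.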
Since its validation (Sartori et al., 2014), the test has been used to assess and select the candidate volunteer rescuers who want to enter the association. So far, about 3000 people (without considering the 481 participants belonging to the test sample) have been tested. The association expressed full satisfaction with the test and its results. Validity, reliability and statistical norms are computed after every administration with satisfactory results. Interestingly, the statistical norms resulting from the 54 people of the pilot sample overlap with those calculated on the overall sample of people tested so far. This finding suggests that the development of the test in conjunction with the target population led to a highly reliable tool.

Finally, so far, a percentage varying from 5 to 11% of candidates have been found "not adequate" according to the statistical norms, meaning that the newly developed test, as requested, is not too selective.

Case study 2: Development of a personality test for the assessment of candidates and employees
FLORA (Sartori, 2014; Sartori et al., 2016a, b) is the name of an Italian personality test developed for the assessment of specific professional profiles in organizations and based on the five-factor model (FFM, also referred to as the Big Five model; Goldberg, 1981, 1990; McCrae and Costa, 1999). This test was commissioned by a consulting firm dealing with personnel selection, assessment and development. The consulting firm expressed the need for developing an evidence-based personality test able to identify the most relevant dimensions during assessment for different professional profiles. Hence, while the final version of the test is composed of many items referring to several personality dimensions, each dimension is weighted and has different validity and reliability indexes based on the specific professional profile to be assessed. From a theoretical point of view, the FFM was chosen because it allows for identifying a number of basic dimensions describing individual differences in personality and professional profiles in organizations (Holland, 1966; Rothmann and Coetzer, 2003; van der Linden et al., 2010; Soto et al., 2011). According to the FFM, five personality traits, i.e. agreeableness, conscientiousness, emotional stability, extraversion and openness, can explain and predict individual differences over a wide range of settings, including job performance (Ones et al., 2007; Barrick and Mount, 1991; Barrick et al., 2001; Rothmann and Coetzer, 2003). Moreover, evidence from research shows that it is possible to detect lower-order traits, which contribute to describing different facets of the five personality traits. Findings showed that these facets can range from 12 (Mount et al., 1999) to 45 (Hofstee et al., 1992), passing through 18 (Saucier and Ostendorf, 1999), 30 (Costa and McCrae, 1992), 32 (Schmit et al., 2000) and 44 (Hogan and Hogan, 1992). Also, previous studies have shown that the five factors are relevant to different cultures (McCrae and Costa, 1997; McCrae et al., 2005; De Fruyt et al., 2004) and have been found consistently in factor analyses of peer- and self-ratings of trait descriptors involving diverse conditions, samples and factor extraction and rotation methods (Costa and McCrae, 1988; Grucza and Goldberg, 2007). Yet, in the Italian context, while many personality tests exist, also based on the FFM, none of them was specifically designed to be used in organizations […] facets of the five main factors. For example, the Big Five Questionnaire (BFQ; Caprara et al., 1993) and the Big Five Questionnaire 2 (BFQ 2; Caprara et al., 2007) were developed to measure only the five main factors but not their facets and were not focused on the organizational context. Against this background, FLORA, an Italian psychometric test developed based on the FFM, expressly aims at assessing personality in specific professional profiles described by numerous facets. Given the specific characteristics that the test was supposed to have, the process of its development and validation was split into two phases:
(1) A qualitative phase (i.e. test development), consisting of interviews with employees to detect the personal characteristics involved in successful performance, literature review to organize the characteristics previously detected according to the FFM, theoretical construction and development of the first version of the test;
(2) A quantitative phase (i.e. validation process), consisting of the administration of the first version of the test to a validation sample and, after changes due to exploratory statistical analyses, to a confirmation sample for confirmatory statistical analyses, monitoring of concurrent validity and calculation of the correlations between the test and job performance.
In the qualitative phase, 32 interviews with 16 different job profiles were carried out (for further details, see Sartori et al., 2016a, b, p. 2057). Two organizational psychologists were involved in each interview, one as a primary interviewer, the other one as an assistant taking notes. Each interview was audio-recorded. Audio recordings and notes were given to five organizational psychologists who worked together on the extrapolation of the personal characteristics that emerged in interviews and the categorization of the personal characteristics according to the Big Five (for further details on the procedure, see Barrick and Mount, 1991, pp. 8-9). Characteristics such as abilities, capabilities, skills, competencies, aptitudes and attitudes were eliminated to keep personality traits only (78% of all the characteristics that emerged). As for personality traits, synonyms and antonyms referring to the same characteristic were unified under one label. The personality traits not related to the Big Five, such as the ones referring to the honesty-humility dimension of the HEXACO model (Ashton and Lee, 2007), were eliminated. Content analyses of interviews led to the identification of 28 different personality traits involved in successful performance.
Each dimension was labeled and operationally defined according to the literature and the organizational aims of the test (Sartori, 2014). For each of the 28 dimensions, six items were generated, three positively and three negatively worded. In the end, 168 items were developed. Another eight items, drawn from the literature and aimed at measuring social desirability (Crowne and Marlowe, 1960; Manganelli Rattazzi et al., 2000), were added to form a Lie Scale. All the 176 items were randomized and accompanied by a 7-point rating scale, ranging from 1 = totally disagree to 7 = totally agree.
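The composition of the first version of the test can be checked with a short sketch. The Python code below assembles a pool of 28 x 6 dimension items plus eight Lie Scale items, randomizes their order and reverse-keys negatively worded items on the 7-point scale. The item labels, the fixed random seed, the keying of Lie Scale items and the reversal rule (8 minus the response) are illustrative assumptions, not details reported for FLORA.

```python
import random

N_DIMENSIONS = 28
ITEMS_PER_DIMENSION = 6      # three positively and three negatively worded
N_LIE_ITEMS = 8
SCALE_MIN, SCALE_MAX = 1, 7  # 1 = totally disagree, 7 = totally agree


def build_item_pool(seed: int = 0) -> list[dict]:
    """Assemble the item pool and return it in randomized order."""
    items = []
    for d in range(1, N_DIMENSIONS + 1):
        for i in range(ITEMS_PER_DIMENSION):
            # Hypothetical convention: the last three items are negatively worded.
            items.append({"scale": f"dimension_{d:02d}", "negative": i >= 3})
    # Assumption: Lie Scale items are treated as positively keyed here.
    items += [{"scale": "lie_scale", "negative": False} for _ in range(N_LIE_ITEMS)]
    random.Random(seed).shuffle(items)
    return items


def keyed_score(response: int, negative: bool) -> int:
    """Return the response, reversed when the item is negatively worded."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError(f"response out of range: {response}")
    return SCALE_MIN + SCALE_MAX - response if negative else response


pool = build_item_pool()
n_dimension_items = sum(item["scale"] != "lie_scale" for item in pool)
n_negative = sum(item["negative"] for item in pool)
print(len(pool), n_dimension_items, n_negative)
```

Reverse-keying at scoring time keeps the administered wording untouched while letting all six items of a dimension be summed in the same direction.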
The quantitative part involved a validation sample composed of 407 employees and a confirmation sample composed of 418 employees (for further details, see Sartori et al., 2016a, b, pp. 2058-2069) (see Tables 7 and 8).
As for the exploratory analyses, principal factor analyses (PFA) and principal component analyses (PCA) with the criterion of eigenvalue > 1 and different rotation methods (oblique and orthogonal) were carried out to explore the latent structure underlying the items and to monitor construct validity (factor loading cut-off = 0.30; cf. Cronbach and Meehl, 1955; Kline, 1993, 1998). Based on the factor solutions obtained through exploratory analyses, confirmatory analyses were carried out using structural equation models with maximum likelihood estimator to test the robustness of the factor models previously identified. Analyses were carried out for each trait separately (extraversion, sociability, conscientiousness, openness and emotionality) and, within each trait, for each dimension of FLORA. Besides, the items belonging to the Lie Scale were analyzed, and correlation indexes (r) and coefficients of determination (r²) were computed between each dimension of FLORA and the Lie Scale total score to test whether and how each dimension was affected by social desirability. Second-order factor analyses (PFA and PCA) were carried out to test whether FLORA's dimensions overlapped with the original FFM. Also, Cronbach's alphas were calculated as reliability measures in terms of internal consistency between items.
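Two of the computations named above lend themselves to a compact illustration: the eigenvalue > 1 retention criterion and the r and r² indexes. The sketch below uses NumPy on simulated data and covers only the PCA case (eigenvalues of the item correlation matrix); it does not reproduce rotation, principal factor analysis or the confirmatory models. The simulated two-factor structure, the sample size and all variable names are assumptions made for the example.

```python
import numpy as np


def kaiser_component_count(data: np.ndarray) -> tuple[int, np.ndarray]:
    """Count components with eigenvalue > 1 in the item correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted in descending order
    return int((eigenvalues > 1.0).sum()), eigenvalues


def r_and_r_squared(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Pearson correlation and coefficient of determination between two scores."""
    r = float(np.corrcoef(x, y)[0, 1])
    return r, r ** 2


# Simulated responses: six items, three loading on each of two latent factors.
rng = np.random.default_rng(0)
n_respondents = 500
factors = rng.standard_normal((n_respondents, 2))
loadings = np.array([
    [0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8],
])
items = factors @ loadings.T + 0.6 * rng.standard_normal((n_respondents, 6))

n_retained, eigenvalues = kaiser_component_count(items)

# Hypothetical dimension score and Lie Scale total, here simulated as unrelated.
dimension_score = items[:, :3].sum(axis=1)
lie_total = rng.standard_normal(n_respondents)
r, r2 = r_and_r_squared(dimension_score, lie_total)
print(n_retained, round(r, 3), round(r2, 3))
```

A small r² between a dimension and the Lie Scale total indicates that little of the dimension's variance is shared with social desirability, which is the property the analysis above is meant to monitor.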
Concurrent validity was also tested by administering FLORA together with the test presented in the first case study to 1028 subjects. Moreover, in line with research by Rothmann and Coetzer (2003) and van der Linden et al. (2010), FLORA was administered to 220 trade agents to test whether and how different facets were associated with job performance expressed in terms of sales figures (Sartori et al., 2016a, b). Overall, results from the different analyses conducted showed that the different dimensions of FLORA are sufficiently uncorrelated with each other and with the Lie Scale measuring social desirability, suggesting that the test was appropriately developed according to both the theoretical model and the needs of the consulting firm and is now a valid and reliable tool for personality assessment. Also, the correlations between attitude and reasoning measured by the test presented above and the different facets of FLORA were aligned with previous literature. Based on these results, FLORA is currently an Italian personality test based on the FFM and measuring 24 dimensions […] Lie Scale. It is composed of 149 items, 78 of which positively worded, 71 negatively worded. Moreover, three new dimensions have been added lately, based on emerging needs from different organizations, i.e. impulsivity (belonging to emotionality), openness to diversity and openness to change (both belonging to openness). The characteristics of FLORA, which was developed starting from interviews with employees, seem to meet the criteria to make it useable for the assessment of specific professional profiles in organizations. Hence, it has reached the goal to be both a scientific instrument and a professional tool.

Final considerations and implications
While psychological assessment in organizations can contribute to better decision-making related to HR functions, often psychological tests may sound cumbersome to practitioners and employees (Hogan et al., 1996; Sartori et al., 2015a, b). As a result, personality profiles and other outputs from psychological assessment may sound meaningless or even abstruse to managers and decision-makers. Think about the Minnesota Multiphasic Personality Inventory (MMPI), for example, which is a standardized psychometric test of adult personality and psychopathology measuring 10 clinical dimensions and which is also used in employment (Zapata-Sola et al., 2009). In this paper, we aimed to present findings from research attempting to fill the research-practice gap regarding psychological assessment in organizations.
The cases presented above show examples of how it is possible to develop instruments that are accessible and understandable to practitioners based on specific assessment needs. In doing so, the studies reviewed different processes that can be used to create assessment tools based on validated theoretical models, which are valid and reliable and meet organizational needs. Accordingly, a main implication of this contribution is the recognition of the possibilities deriving from the integration of research and practice. The studies show that the different needs from research and practice can be not only acknowledged but also integrated, leading to a process where research and practice enrich each other and result in assessment tools that are valid to both researchers and practitioners.
The first instrument presented in this article solved the problem of assessing a large number of candidate volunteer rescuers when they want to access an Italian health association. The collaboration between the association and academia has resulted in an instrument which is, as expected, dedicated to the population of candidate volunteer rescuers, valid and reliable, short and easy to administer, and not too selective. The second instrument presented in the article, FLORA, has filled a gap in the context of Italian personality tests. It is a test composed of 24 (now 27) dimensions grouped according to the FFM and useable to assess different professional profiles. Feedback from the consulting firm, which has continually proposed the use of FLORA to its clients since the publication of the test (Sartori, 2014), is positive. Specifically, clients report being satisfied with the language used in the test and its outputs, perceived as accessible.
In conclusion, this article stems from the desire to show how the world of research and the world of practice can meet to develop psychological assessment tools that are both valid and reliable and actually useable in the HRM perspective. The main implication is that the combination of practice and research can give birth to valid and reliable psychological assessment tools that companies and organizations can trust and use for their psychological assessment activities included in HRM.
"You did it!", says the wife who wants to blame the husband. Based on the information in your possession, you can conclude that...
[…] did it!", says the culprit who committed the crime. Based on the information in your possession, you can conclude that...
"You did it!", says the wife who saw the husband commit the crime. Based on the information in your possession, you can conclude that...
"I did it!", says the innocent who did not commit the crime. Based on the information in your possession, you can conclude that...
Caprara et al., 1993) facets of the five main factors. For example, the Big Five Questionnaire (BFQ - Caprara et al., 1993) and the Big Five Questionnaire 2 (BFQ 2 - Caprara organizations Lie Scale. It is composed of 149 items, 78 of which positively worded, 71 negatively worded. Moreover, three new dimensions have been added lately, based on emerging needs from different organizations, i.e. impulsivity (belonging to emotionality), openness to diversity and openness to change (both belonging to openness).