Search results
1 – 10 of over 29000
Jose Manuel Azevedo, Ema P. Oliveira and Patrícia Damas Beites
Abstract
Purpose
The purpose of this paper is to find appropriate forms of analysis of multiple-choice questions (MCQ) to obtain an assessment method that is as fair as possible for the students. The authors intend to ascertain whether it is possible to control the quality of the MCQ contained in a bank of questions, implemented in Moodle, presenting some evidence with Item Response Theory (IRT) and Classical Test Theory (CTT). The techniques used can be considered a type of Descriptive Learning Analytics, since they allow the measurement, collection, analysis and reporting of data generated from students’ assessment.
Design/methodology/approach
A representative data set of students’ grades from tests, randomly generated with a bank of questions implemented in Moodle, was used for analysis. The data were extracted from a Moodle database using MySQL with an ODBC connector and collected in MS Excel™ worksheets; appropriate macros programmed with VBA were also used. The CTT analysis was done with MS Excel™ formulas, and the IRT analysis was done with an MS Excel™ add-in.
Findings
The Difficulty and Discrimination Indexes were calculated for all the questions with enough answers. The majority of the questions presented values for these indexes which lead to the conclusion that these questions are of good quality. The analysis also showed that the bank of questions presents some internal consistency and, consequently, some reliability. Groups of questions with similar features were obtained, which is very important for the teacher to develop tests that are as fair as possible.
Originality/value
The main contribution and originality of this research is the definition of groups of questions with similar features, regarding their difficulty and discrimination properties. These groups allow the identification of difficulty levels in the bank of questions, thus allowing teachers to build tests, randomly generated with Moodle, that include questions with several difficulty levels, as they should. To the best of the authors’ knowledge, there are no similar results in the literature.
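The abstract above does not state which formulas the authors used for the Difficulty and Discrimination Indexes. As a minimal sketch, assuming the common textbook CTT definitions (difficulty as the proportion of correct answers, discrimination as the difference in that proportion between the top and bottom 27% of examinees by total score), the calculation could look like this. The function names and the sample data are hypothetical.

```python
from typing import List


def difficulty_index(item_scores: List[int]) -> float:
    """Proportion of examinees who answered the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(
    item_scores: List[int],
    total_scores: List[float],
    fraction: float = 0.27,  # assumed group size; 27% is a common convention
) -> float:
    """Upper-lower discrimination: p(correct | top group) - p(correct | bottom group)."""
    n = len(item_scores)
    k = max(1, round(n * fraction))
    # Rank examinees by total test score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: one item answered by ten examinees, plus their total scores.
item = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
totals = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

print("difficulty:", difficulty_index(item))
print("discrimination:", discrimination_index(item, totals))
```

Under these definitions, a difficulty value near 1 marks an easy item, and a discrimination value near 0 or below marks an item that fails to separate stronger from weaker examinees.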
Natalia Velikova, Roy D. Howell and Tim Dodd
Abstract
Purpose
The purpose of this paper is to address the issue of objective knowledge operationalisation with specific focus on varying levels of scale items’ difficulty. The ultimate goal of the study was to develop a scale to measure objective wine knowledge, which would address the domain of wine knowledge and differentiate varying levels of consumer wine knowledge.
Design/methodology/approach
The development of the items was guided by the recommendations of DeVellis (2003) in his influential work on the theory and application of scale development. The items’ performance was examined through a series of field tests with consumer samples (N = 756) in a US wine region. An item response theory (IRT) approach was applied to test the items. The developed items were analysed using the two-parameter logistic model in Mplus Version 5.
Findings
The study offers a 44-item test suitable for assessing wine knowledge across a broad spectrum of expertise. For example, if the goal is to assess wine knowledge differences among relatively knowledgeable respondents, a subset of more difficult items could be chosen. Alternatively, a test for novices could be constructed from the scale’s easier items.
Research limitations/implications
For researchers, the study offers conceptualisation of the wine knowledge domain, suggests a parsimonious instrument to measure the construct, offers a valid and reliable measure for use in testing theories of consumer knowledge and provides empirical evidence of the value and usefulness of the developed scale.
Practical implications
For professionals, the proposed test may be used to test consumer knowledge and to help assess a prospective employee’s general knowledge of wine. The test can also be given at hospitality programs, outreach and continuing education programs.
Originality/value
The current paper takes an alternative approach to classical test theory and offers an objective wine knowledge scale tested through IRT. This approach avoids shortcomings associated with classical measurement and offers an original scale that can discriminate among respondents with different levels of wine knowledge.
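The two-parameter logistic model named in the abstract above gives the probability of a correct answer as a function of the respondent's latent knowledge level and two item parameters, discrimination and difficulty. The sketch below uses the standard form of that model; the parameter values are hypothetical and are not taken from the study.

```python
import math


def prob_correct_2pl(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic model.

    theta: respondent's latent trait level (e.g. wine knowledge)
    a:     item discrimination (slope)
    b:     item difficulty (trait level at which P(correct) = 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items: an easy one for novices and a hard one for experts.
easy_item = {"a": 1.2, "b": -1.0}
hard_item = {"a": 1.2, "b": 1.5}

for theta in (-2.0, 0.0, 2.0):
    p_easy = prob_correct_2pl(theta, **easy_item)
    p_hard = prob_correct_2pl(theta, **hard_item)
    print(f"theta={theta:+.1f}  easy={p_easy:.2f}  hard={p_hard:.2f}")
```

Because an item separates respondents best near its own difficulty, choosing items by their estimated difficulty is what allows a harder subset to be assembled for knowledgeable respondents and an easier one for novices.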
Abstract
The purpose of this chapter is to provide researchers with a summary of some of the latest developments in item response theory (IRT), and to help these groups realize that psychometric tools can now be used for theory testing in addition to the traditional role of improving construct measurement. The author first reviews some of the fundamental tenets of classical test theory to contrast with IRT. He then describes recent advances in goodness-of-fit tests that have helped turn IRT into a model-testing tool. Finally, the author reviews several new test models that provide new flexibilities, summarizing several examples of research that has used these new models in organizational research. At the end of this review, the author provides suggestions to help researchers better use these new IRT tools. Although there have been significant advances in IRT in the past decade, there has not been a systematic review of these developments. This review places those developments in context to give readers a real appreciation of these breakthroughs.
Thomas Salzberger and Rudolf R. Sinkovics
Abstract
Purpose
The paper investigates the suitability of the Rasch model for establishing data equivalence. The results based on a real data set are contrasted with findings from standard procedures based on CFA methods.
Design/methodology/approach
Sinkovics et al.'s data on technophobia was used and re‐evaluated using both classical test theory (CTT) (multiple‐group structural equations modelling) and Rasch measurement theory.
Findings
Data equivalence in particular and measurement in general cannot be addressed without reference to theory. While both procedures can be considered best practice approaches within their respective theoretical foundation of measurement, the Rasch model provides some theoretical virtues. Measurement derived from data that fit the Rasch model seems to be approximated by classical procedures reasonably well. However, the reverse is not necessarily true.
Practical implications
The more widespread application of Rasch models would lead to a stronger justification of measurement, in particular, in cross‐cultural studies but also whenever measures of individual respondents are of interest.
Originality/value
Measurement models outside the framework of CTT are still scarce exceptions in marketing research.
Abstract
The present chapter addresses a topic that is of growing interest – namely, the exploration of alternative item response theory (IRT) models for noncognitive assessment. Previous research in the assessment of trait emotional intelligence (or “trait emotional self-efficacy”) has been limited to traditional psychometric techniques (e.g., classical test theory) under the notion of a dominance response process describing the relationship between individuals' latent characteristics and individuals' response selection. The present study presents the first unfolding IRT modeling effort in the general field of emotional intelligence (EI). We applied the Generalized Graded Unfolding Model (GGUM) in order to evaluate the response process and the item properties on the short form of the trait emotional intelligence questionnaire (TEIQue-SF). A sample of 866 participants completed the English version of the TEIQue-SF. Results suggest that the GGUM has an adequate fit to the data. Furthermore, inspection of the test information and standard error functions revealed that the TEIQue-SF is accurate for low and middle scores on the construct; however, several items had low discrimination parameters. Implications for the benefits of unfolding models in the assessment of trait EI are discussed.
Yvonne Mery, Jill Newby and Ke Peng
Abstract
Purpose
With a call for increased accountability for student learning across higher education, it is becoming more important for academic libraries to show their value to the greater university community with the use of quantitative data. This paper seeks to describe the development of an information literacy test at the University of Arizona to measure student learning in an online credit course. In order to measure the impact of an online course, a statistically valid and reliable test was created by local librarians.
Design/methodology/approach
The methodology involved administering test items to undergraduate students enrolled in an online information literacy course and applying both classical test theory and item response theory models to evaluate the validity and reliability of test items. This study included the longitudinal and cross‐sectional development of test items for pre and post‐testing across different student groups. Over the course of two semesters, 125 items were developed and administered to over 1,400 students.
Findings
The creation of test items and the process of making them reliable and valid are discussed in detail. Items were checked for construct validity with the use of a national standardized test of information literacy (SAILS). Locally developed items were found to have a higher than average reliability rating.
Practical implications
The process described here offers a method for librarians without a background in assessment to develop their own statistically valid and reliable instrument.
Originality/value
One of the unique features of this research design was the correlation of SAILS items with local items to test for validity. Although SAILS items have been used by many libraries in the past, they have not been used to create new test items. The use of the original SAILS test items is a valuable resource for instruction librarians developing items locally.
Francisco J. Martínez‐López, Juan C. Gázquez‐Abad and Carlos M.P. Sousa
Abstract
Purpose
Structural equation modelling (SEM) is a method that is very frequently applied by marketing and business researchers to assess empirically new theoretical proposals articulated by means of complex models. It is therefore logical to think that the quality of the new advances in marketing and business theory depends, in part, on how well SEM is applied. This study aims to conduct an extensive review and empirical analysis of a broad variety of classic and recent controversies and issues related to the use of SEM, in order to identify problematic questions and prescribe a compendium of solutions for its suitable application.
Design/methodology/approach
The main analyses were conducted on a sample of 191 SEM‐based papers and 472 applications, i.e. all the SEM‐based studies published in four leading marketing journals during the period 1995‐2007.
Findings
Despite the maturity of SEM, its application in marketing research still has notable room for improvement. This is a general conclusion based on numerous problems detected and discussed here.
Practical implications
The study provides plausible solutions to the problems identified, a useful guide that is easy to follow and to apply adequately to present SEM issues in marketing and business studies.
Research limitations/implications
The sample of SEM‐based papers and applications is limited to four publication outlets. A wider set and/or journals other than those analyzed here may be preferable.
Originality/value
This is a valuable and timely study of the application of SEM in marketing and business research, and is also useful as a guiding framework for good practice. Likewise, as the problems discussed here presumably occur in other areas of social science, this paper should be welcome beyond the borders of the business disciplines.
Abstract
Purpose
This paper aims to present the results of a systematic review of the evidence on psychometric properties of information literacy (IL) tests.
Design/methodology/approach
A two-stage search strategy was used to find relevant studies in two subject and three general databases. A descriptive review of test characteristics and psychometric properties was presented. The review included 29 studies describing psychometric properties of 18 IL tests.
Findings
It was found that classical test theory was applied to all tests. However, item response theory was also applied in three cases. Most of the psychometric tests were developed in the USA using ACRL IL competency standards. The most commonly used psychometric analyses include content validity, discriminant validity and internal consistency reliability.
Research limitations/implications
Only studies in the English language are included in this review.
Practical implications
The study recommends that standards should be developed for the use and reporting of psychometric measures in designing IL tests. Librarians need to be trained in psychometric analysis of tests.
Originality/value
This is the first study to systematically review the psychometric properties of IL tests. The findings are useful for librarians who teach IL courses.
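Internal consistency reliability, listed above among the most commonly used psychometric analyses, is often reported as Cronbach's alpha. The review does not say which coefficient each study used, so the following is only an illustrative sketch of that one statistic, with hypothetical function names and data.

```python
from statistics import variance
from typing import List


def cronbach_alpha(responses: List[List[float]]) -> float:
    """Cronbach's alpha for a score matrix (rows = respondents, columns = items).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])  # number of items
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scored answers (1 = correct, 0 = incorrect) for five respondents.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(f"alpha = {cronbach_alpha(scores):.3f}")
```

Values closer to 1 indicate that the items vary together, which is the sense in which a test is described as internally consistent.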