Causal modelling of testing and assessment ethics among proctors of public examinations in Oyo state

Purpose – Upholding assessment ethics is a common concern during the annual appraisal of public examination performance. Previous studies have focused more on other examination stakeholders, chiefly testees, than on proctors. However, assessment ethics cannot be studied while excluding proctor variables. The study therefore investigated the consistency of a structural equation model of security, environment, professionalism, and testing and assessment ethics.
Design/methodology/approach – An ex-post facto design was adopted. Simple random sampling was employed to choose 90 proctors drawn from 45 colleges. The Proctors Examination Ethics Questionnaire (reliability = 0.86) was used to collect data for the study. Data collected were analysed using path analysis at the 0.05 significance level.
Findings – Four out of the six hypothesised paths significantly explained the consistency of the causal model. Test security, environment and professionalism accounted for both direct and indirect effects on assessment ethics. All model fit indices indicated that the testing and assessment model fits the data.
Research limitations/implications – Few proctor variables were studied; therefore, assessment ethics may not be explained other than through the proctor variables considered in this study.
Practical implications – Assessment ethics may not be violated if test security, testing environment and professionalism are properly cared for during test administration, as shown in the study.
Social implications – The study adds to the knowledge base on the ethical areas of assessment: for 21st-century proctors to uphold testing and assessment ethics, security, environment and professionalism are to be considered.
Originality/value – There was a positive causal effect of security, environment and professionalism on testing and assessment ethics among proctors in public examinations.


Introduction
Testing and assessment are often used loosely and interchangeably in education parlance. In educational evaluation, assessment can be thought of as the diverse means of gathering data on the ability or achievement of an individual. It involves both quantitative and qualitative means of data collection on learners' achievement. Assessment is an umbrella term encompassing measurement instruments as well as qualitative methods of monitoring and recording student learning, such as observation, simulations or project work. Olutola, Daramola, and Ogunjinmi (2016) define assessment as teachers' activities directed at helping learners to learn and at determining their progress and performance. Assessment is often seen as a tool to measure the progress of individual students. Testing, on the other hand, is the process of presenting a stimulus in order to elicit a response. A test can be seen not only as a tool but also as a set procedure employed to systematically measure a sample of behaviour by asking a series of questions. Tests are designed to measure the quality, competence, skills or knowledge of a sample against specific criteria, and performance against these criteria is typically judged acceptable or unacceptable. In educational practice, testing is a method used to determine whether students have learnt what they are supposed to learn, can complete a particular task, or can demonstrate proficiency in a skill or content (Kelly, 2023). Testing is done with the help of an instrument called a test, which can take different forms or types. The form can be oral or written; the type can be cognitive or non-cognitive.
Assessment is undergoing a paradigm shift. This shift includes, but is not limited to, a shift from assessment of learning (AOL) to assessment for learning (AFL), from psychometrics to a broader model of educational assessment, and from a testing and examination culture to an assessment culture. The whole essence of the paradigm shift is to ensure that assessment serves its basic purpose in the educational system. AOL is a type of assessment which is intended to inform the teaching and learning process (Oyinloye & Imenda, 2019). Hafeez et al. (2022) refer to AOL as gathering and using evidence purposely to report a summary of learning at a particular point in time in order to improve learning. Learning assessment is holistic because it aims to measure learning outcomes and report those outcomes to students, parents, administrators and other stakeholders. AFL, on the other hand, occurs throughout the learning process by providing immediate feedback to both teachers and learners.
Assessment and testing involve the construction and use of several instruments as well as the organisation of the data collected by examiners. Similarly, when testing and assessment programmes are chosen, administered and employed correctly, they can make a valuable contribution to the nation's educational system; this is a matter of ethics (Okwilagwe & Jinadu, 2016). There is a growing international consensus that ethics is of increasing importance to education in assessment and testing, and that it must become part of the language that proctors, as stakeholders in assessment and testing, are comfortable using. The study of ethics can be seen in many fields. It is an academic field of study belonging mainly to philosophy, where it is studied either on a theoretical level or on a practical or applied level. Research on ethics and morality has examined the interplay of these two terms. Muleya et al. (2017) argued that ethics are explicit guidelines for regulating activities. The ultimate goal is to establish rules that regulate human activities and behaviours, and to identify the desirable values and personality traits worth developing. Botha (2016) holds a similar view and describes ethics as a moral code and cross-cultural consideration that defines obligations of what is right. This view means our actions are ethical when we ensure that they are always good for all stakeholders in all circumstances. Ethics have an important role in guiding standards of behaviour as to what is right in the conduct of assessment and testing; this is why Macfarlane et al. (2012) state that the ethics of testing and assessment, taught in general ethics classes as part of commercial training, also arouses great interest. In college ethics, colleges typically provide relevant training and further education to prospective and part-time teachers.
Despite the germaneness of ethics in assessment, upholding assessment ethics among proctors of public examinations remains a cause for concern. The misuse of testing may not be intentional in many instances. Individuals involved in the preparation and conduct of examinations may not understand their expected roles or the approved practices in standardised testing. Several international bodies have identified the appropriate roles of test administrators in assessment, for example, The Design and Delivery of Assessment Centres (British Psychological Society, 2015) and the Guidelines and Ethical Considerations for Assessment Center Operations (6th edition) (International Taskforce on Assessment Center Guidelines, 2015). These guidelines address aspects such as assessor training, validation issues and technology. They also include a section on ethical, legal and social responsibility issues, such as informed participation, data security and unfair discrimination. However, many proctors do not normally adhere to them.
These responsibilities include the selection of assessment tools, the preparation of students for the test, the administration of the test itself, and the interpretation and use of its results. Any practice that follows and conforms to the fundamental reason for a testing programme falls within ethical practice in testing and assessment. Under such practices, laid-down protocols are not to be distorted in order to raise test scores inappropriately. To that end, the many parties involved in the administration of a test are to follow laid-down procedures to the letter. However, the basis for inappropriate practices in testing can be harder to pin down than it appears. The causes can arise from any person or any stage in the exercise. These can be grouped into security, environment and professionalism (International Taskforce on Assessment Center Guidelines, 2015). Test security covers all means of keeping test items, as well as the entire assessment process, from breaches of protocol. Preserving the security of assessments mainly involves reviewing procedures with a view to thwarting attempts to tamper with test scores through inappropriate preparation practices. Tests should therefore be kept secure during their development stages (Murchan & Siddiq, 2021).
Test environment is yet another variable that can influence assessment ethics. Kelly (2023) indicates that the test environment deals with noise levels and the conditions of the examination. According to Kelly (2023), a noiseless, bright and comfortable space should improve evaluation results. Similarly, during administration, disclosure of the purposes of testing to all parties involved is essential. Monitoring the administration of the test, dealing with breaches of protocol and maintaining a good testing environment are key. It is pertinent to adhere to the terms set by the developer of an assessment; these are herein referred to as the environment. Professionalism refers to the behaviour, competences and attitudes shown towards clients or colleagues in the practice of a profession, which are regulated in all professional organisations. Technical competencies include communication, knowledge, technical skills and reasoning. Prayudi (2012) states that professionalism affects not only teacher performance and student learning outcomes but also the performance of classroom assessment duties. The study raises the issue of not mastering classroom conditions and learning materials at SMP Negeri 19 Bandar Lampung. A high degree of professionalism is required of teachers in order to master their subjects and not face obstacles in the provision of education. In other words, student assessment must be ethical, fair, useful, feasible and accurate (Trina et al., 2019).
Previous studies have investigated ethics, assessment, or structural equation modelling (SEM) of variables other than assessment ethics variables. For instance, Reuben and Eremie (2020) consider the future and ethical design of psychological testing. They take into account some of the professional concerns that play an important role in the current and future psychological testing landscape, and express hope for new and improved testing. With increasing awareness of the psychological needs of test users, and especially of changes in human relations that can psychologically affect personality, psychologists are looking for new tests that meet growing demands: future tests need to be more creative, to serve a growing population and to revise existing tests continually while still achieving the goals of psychological testing.
Other studies include Jinadu (2020), who investigated the consistency of the SEM of digital nativity, digital literacy, category of adoption of digital devices, and digital citizenship. That study adopted an ex-post facto design with a simple random sampling technique to select three states from South West, Nigeria. Twenty postgraduate students were randomly selected from the chosen departments in the federal and state universities, while ten postgraduate students were randomly selected from three departments in the private universities, making 690 participants for the study. A Digital Literacy test and a Digital Construct Response Scale were the instruments. Data were analysed using path analysis (PA) at the 0.05 significance level. Four out of the six hypothesised paths significantly explained the consistency of the causal model. Digital nativity, category of adoption of digital devices and digital literacy accounted for a high proportion of the direct effects on digital citizenship, whereas digital literacy accounted for a small proportion of the indirect effects on digital citizenship. The goodness-of-fit index and the other fit indices all indicated good fit. There was a positive causal effect among the variables; therefore, the study recommended that higher degree students should consider digital nativity, with inputs from the category of adopting digital technology, to become digital citizens. However, the study did not consider assessment ethics or its explanatory variables, as is the case in the present study. Also, Akanimoh (2022) investigated examination bodies' compliance with ethics and social responsibility in assessment from the testees' perspectives. The study used an ex-post facto design with 2,500 respondents drawn from three states in South-Western Nigeria. The Testees' Response to Ethics and Responsibility Questionnaire was used to collect data. Frequency counts and correlations were used to analyse the data collected. It was deduced from the findings that the level of compliance varies across the examination bodies. The study recommends that examination bodies that are non-compliant with ethics and social responsibility in assessment should seek ways of improving their development and administration of test items, as well as their item analysis, for high ethical and social responsibility. These relationships were observed only through linear correlation, not as causal or additive relationships as in SEM. The study did not test any hypothesised model of the assessment ethics concepts or variables considered, as was done in the current study.
SEM is a generic term used to describe a group of statistical models employed to test the validity of substantive propositions with observed data. It is a statistical procedure that tests hypotheses by appraising a structural proposition about some phenomenon. This proposition represents causal processes that yield observations on many variables (Byrne & Cahyono, 2022). The term SEM conveys that the causal processes are represented by a number of regression equations, and that these connections can be modelled clearly to allow a well-defined formulation of the proposition under investigation (Marcoulides & Falk, 2018; Byrne & Cahyono, 2022; Okwilagwe & Jinadu, 2016).
Several features of SEM distinguish it from the older generation of multivariate procedures. It takes a confirmatory rather than an exploratory approach to data analysis by requiring that the pattern of inter-variable relations be specified in advance on logical grounds. SEM thus lends itself to the analysis of data for inferential purposes (Byrne & Cahyono, 2022). By contrast, many other multivariate procedures are fundamentally descriptive, which makes hypothesis testing difficult and sometimes impossible. Also, conventional multivariate methods are incapable of either assessing or correcting for measurement error, whereas SEM provides explicit estimates of these error variance parameters. Indeed, alternative methods such as those rooted in regression ignore errors in the independent variables. Such faults are avoided when SEM analyses are employed (Okwilagwe & Jinadu, 2016).

Statement of the problem
The assessment and testing literature provides some guidance for teachers and other test administration officials on ethical and unethical practices in standardised testing. Previous studies that have investigated assessment have done so using core public examination officials, and these investigations were tied to variables such as responsibility, achievement, interest and attitude, which are outside the ethics being considered in this study. However, these variables cannot be limited to core public examination officials and their responsibilities. It is possible to extend them to proctors, as was done in this study. Literature on variables that have a causal relationship with ethics in testing and assessment is rare, and the few existing studies did not test hypothesised models in a path-analytic study comprising security, environment, professionalism, and testing and assessment ethics. Therefore, the researcher investigated the extent to which security, environment, professionalism, and testing and assessment ethics are causally related.

Research questions
RQ1. How consistent are the causal effects among test security, environment, professionalism and ethics in testing and assessment with empirical data?
RQ2. What are the most meaningful causal paths and models involving the causal effect among the variables (test security, environment, professionalism and ethics in testing and assessment)?
RQ3. What are the fit indices of the re-specified causal path model?
RQ4. What are the effects of the causal model?

Methodology
The study adopted an ex-post facto design of the correlational type because the variables had already occurred before measurement. The exogenous variables are security and environment, the endogenous variable is professionalism, and the criterion variable is ethics in testing and assessment. The target population comprises all the proctors for certificate examinations at the senior secondary school level in Oyo State, Nigeria. A multi-stage sampling procedure was adopted. In the first stage, Oyo was stratified along the existing three senatorial districts, and simple random sampling was used to select three local governments from each of the districts. In the second stage, simple random sampling was employed to choose five schools from each of the local governments selected. Random sampling was further used to choose, from each school, two teachers who had been supervisors for WAEC and NECO for not less than three years. A total of 90 proctors were drawn from 45 colleges. The sample distribution is shown in Table 1. The Proctors Examination Ethics Questionnaire (PEEQ) was used to collect data. PEEQ was developed by the researcher to measure proctors' responses to examination ethics in testing and assessment. Part I deals with proctors' demographic information such as the name of the school, gender, age, highest educational qualification, grade level and number of years of experience in examination supervision. Section B is on examination ethics. The initial instrument contained 28 items to which participants were asked to respond on a four-point scale of always (4), sometimes (3), rarely (2) and never (1); the scoring was reversed for negative items. These items were subjected to pilot testing using testees who were not part of the final sample for the study. Content validity was established by giving the draft to psychometricians in the field of assessment and testing; irrelevant items were deleted or modified and the others were retained. To determine the internal consistency of the instrument, Cronbach's alpha was computed, which yielded a value of 0.862.
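The scoring and reliability steps described above can be sketched in code. This is a minimal illustration under stated assumptions, not the study's actual computation: the software used for the reliability analysis is not named in the text. The sketch reverses negatively worded items on a four-point scale and applies the standard Cronbach's alpha formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The function names and the small response matrix are hypothetical and serve only as an example.

```python
from statistics import variance


def reverse_score(score, scale_max=4, scale_min=1):
    """Reverse a Likert score, e.g. 4 -> 1 and 1 -> 4 on a four-point scale."""
    return scale_max + scale_min - score


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


# Hypothetical responses: 5 respondents x 4 items (illustration only, not study data).
# The item at index 3 is treated as negatively worded and reversed before analysis.
raw = [
    [4, 4, 3, 1],
    [3, 3, 3, 2],
    [2, 3, 2, 3],
    [4, 3, 4, 1],
    [1, 2, 1, 4],
]
scored = [row[:3] + [reverse_score(row[3])] for row in raw]
alpha = cronbach_alpha(scored)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Sample variances (n - 1 denominator) are used for both the items and the totals; the choice does not change alpha as long as it is applied consistently to both.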
The researcher himself monitored the data-gathering exercise. The administration was carried out in sequence based on the days and periods allowed by the heads of the schools used. The data collection exercise lasted six weeks, and the data collected were analysed by PA.

Results
How consistent are the causal effects among test security, environment, professionalism and ethics in testing and assessment with empirical data?
Figure 1 reveals the path coefficients and associations among the variables. These need to be examined in order to decide which paths to delete and which to retain. Two sets of SEM analyses (Amos Version 23.0) were conducted in line with the structural diagram (Figure 1). For the first SEM analysis, that is, for the hypothesised model, the path coefficients and zero-order correlations revealed that two of the six paths were not significant and were therefore to be deleted. Accordingly, paths P31 and P32 (correlation between professionalism and security, r = −0.48; p > 0.05, and correlation between professionalism and environment, r = 0.06; p > 0.05) were deleted. The second set of SEM analysis was conducted without the deleted paths to depict the meaningful paths.
What are the most meaningful causal paths and models involving the causal effect among the variables (test security, environment, professionalism and ethics in testing and assessment)?
What are the fit indices of the re-specified causal path model?
Key: χ² = Chi-square; df = Degrees of freedom; NFI = Normed fit index; GFI = Goodness-of-fit index; AGFI = Adjusted goodness-of-fit index; RMSEA = Root mean square error of approximation.
Table 2 shows the fit indices of the re-specified model, which is consistent with the empirical data. The table indicates the goodness of fit of the model: χ²(2) = 2.188; Comparative-Fit Index = 0.98; Absolute-Goodness-of-Fit Index = 0.98; Root Mean Square Error of Approximation = 0.01.
What are the effects of the causal model?
Table 3 shows the causal effects. The table indicates that test security (0.65), testing environment (0.05) and professionalism (0.14) accounted for 98.8% of the direct effects on ethics in assessment and testing, whereas test security (0.01) accounted for 1.17% of the indirect effects on ethics in assessment and testing.
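The percentages reported for the causal effects can be checked with simple arithmetic. The sketch below assumes that each percentage is a share of the total effect (direct plus indirect); this reading is inferred from the reported numbers and is not stated explicitly in the text. The variable and function names are illustrative only.

```python
# Effects on ethics in assessment and testing, as reported in the text for Table 3.
direct_effects = {
    "test security": 0.65,
    "testing environment": 0.05,
    "professionalism": 0.14,
}
indirect_effects = {"test security": 0.01}


def effect_shares(direct, indirect):
    """Return (direct %, indirect %) as shares of the total effect."""
    total_direct = sum(direct.values())
    total_indirect = sum(indirect.values())
    total = total_direct + total_indirect
    return 100 * total_direct / total, 100 * total_indirect / total


direct_pct, indirect_pct = effect_shares(direct_effects, indirect_effects)
print(f"direct: {direct_pct:.1f}%  indirect: {indirect_pct:.1f}%")
```

Under this assumption the direct share comes to about 98.8% and the indirect share to about 1.2%, which agrees with the reported figures up to rounding.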

Discussion
The result on the model, which describes the causal effects among test security, environment, professionalism and ethics in testing and assessment as consistent with empirical data, reveals the path coefficients and associations among the variables. The result showed that paths P31 and P32 (correlation between professionalism and security, r = −0.48; p > 0.05, and correlation between professionalism and environment, r = 0.06; p > 0.05) were deleted. This result may be because ensuring test security is ordinarily part of being professional as a test administrator, and the same holds for the test environment; hence the deletion of those two paths. The result may also be due to the sample size used for the study, since the number of proctors obtainable from a school for a given public examination is limited, whereas SEM typically calls for large samples.
The result of this study is in tune with that of Kelly (2023), who reported that the test environment addresses noise levels in examinations. A noiseless, bright and soothing space was expected to improve scores, a hallmark of ethics. This result is also in tandem with that of Akanimoh (2022), who found that examination bodies' compliance with ethics and social responsibility in assessment varied based on the testees' perspectives. That study recommends that examination bodies that are non-compliant with ethics and social responsibility in assessment should seek ways of improving their development and administration of test items, as well as their item analysis, for high ethical and social responsibility.
The finding on the most meaningful causal paths and models involving the causal effect among the variables (test security, environment, professionalism and ethics in testing and assessment) shows that four out of the six hypothesised paths were significant and meaningful. The significant paths reported were between security and ethics, X1-X4 (r = 0.089; p < 0.05), between environment and ethics, X2-X4 (r = 0.044; p < 0.05), and between professionalism and ethics, X3-X4 (r = −0.203; p < 0.05). The other two paths, which were not significant, were trimmed off to revalidate the model so that it would be consistent with the empirical data. Trimming of this nature is expected under the ground rules of SEM, which hold that paths that are not significant should be trimmed off. The finding of this study corroborates that of Jinadu (2020), who found that four out of the six hypothesised paths significantly explained the consistency of a causal model in which digital nativity, category of adoption of digital devices and digital literacy accounted for a high proportion of the direct effects on digital citizenship, whereas digital literacy accounted for a small proportion of the indirect effects on digital citizenship. That study also found a positive causal effect among the variables and therefore recommended that higher degree students should consider digital nativity, with inputs from the category of adopting digital technology, to become digital citizens.
The result on the fit indices of the re-specified causal path model shows the goodness of fit of the model: χ²(2) = 2.188; Comparative-Fit Index = 0.98; Absolute-Goodness-of-Fit Index = 0.98; Root Mean Square Error of Approximation = 0.01. The findings on the fit indices revealed that the fit of the initial model is inferior to that of the re-specified model. The non-significant chi-square of the re-specified model indicates that the difference between the initial and re-specified models is not substantial; hence, the re-specified model fits. This judgement rests on the premise that the calculated goodness of fit depends on sample size, that is, it is sample-sensitive. The smaller the chi-square, the more desirable the model. This is based on the recommendation of Byrne and Cahyono (2022). Other measures of model fit that are not sample-sensitive also indicated that the model fits the data, in agreement with Marcoulides and Falk (2018), James and David (2020), and Byrne and Cahyono (2022), who recommended various thresholds of fit. As they note, in practice the chi-square test of model fit is strongly influenced by sample size, because statistical power increases as sample size increases; hence the use of other fit indices.
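The non-significance of the reported chi-square can be verified by hand. The sketch below is an illustration only and is not the Amos output used in the study: it relies on the fact that, with exactly 2 degrees of freedom, the chi-square distribution is exponential with mean 2, so the upper-tail p-value has the closed form exp(-x/2). The function name is hypothetical.

```python
import math


def chi2_sf_df2(chi_square):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom.

    For df = 2 the chi-square distribution is exponential with mean 2,
    so the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-chi_square / 2)


chi_square = 2.188   # reported for the re-specified model, df = 2
alpha_level = 0.05   # significance level used in the study

p_value = chi2_sf_df2(chi_square)
print(f"p = {p_value:.3f}; significant at 0.05: {p_value < alpha_level}")
```

The resulting p-value of roughly 0.33 is well above 0.05, consistent with the description of the chi-square as non-significant. The closed form applies only to df = 2; other degrees of freedom would need a general chi-square routine.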
The finding on the effects of the causal model shows that test security (0.65), testing environment (0.05) and professionalism (0.14) accounted for 98.8% of the direct effect on ethics in assessment and testing, whereas test security (0.01) accounted for 1.17% of the indirect effect on ethics in assessment and testing. The finding regarding direct, indirect and total effects among security, environment and professionalism also revealed that the direct effects are greater than the indirect effects. The result is in tune with that of Jinadu (2020), who reported a higher proportion of direct effects than of indirect effects. In his study, the direct effects accounted for 62% of the total effects, which is in tune with the current study. The researcher also reported that the total effect (direct plus indirect) of all the predictor variables accounts for a higher percentage of the variability in the criterion.

Conclusions
The study has established a positive causal relationship among test security, environment, professionalism and ethics in testing and assessment. It was found that test security, test environment and professionalism had greater direct effects than indirect effects on ethics in assessment and testing, with four out of the six paths explaining the consistency of the model. Assessment ethics may not be violated if test security, testing environment and professionalism are properly taken care of during test administration. The study also shed light on the ethical areas of assessment and testing. It is therefore recommended that proctors should consider the security of tests, as well as inputs from a compliant testing environment and proctor professionalism, to attain acceptable ethics in assessment and testing.

Figure 1. Hypothesised recursive path model of the four variables

Table 1. Sampling frame (Source: Table by author)

Table 2. Model fit summary of the re-specified model