Assessing university students’ perception of academic quality using machine learning

Purpose – The aim of this research is to assess the influence of the underlying service quality variable, usually related to university students’ perception of the educational experience. Another aspect analysed in this work is the development of a procedure to determine which variables are more significant to assess students’ satisfaction.
Design/methodology/approach – In order to achieve both goals, a twofold methodology was adopted. In the first phase of research, an assessment of the service quality was performed with data gathered from 580 students in a process involving the adaptation of the SERVQUAL scale through a multi-objective optimization methodology. In the second phase of research, results obtained from students were compared with those obtained from the teaching staff at the university.
Findings – Results from the analysis revealed the most significant service quality dimensions from the students’ viewpoint according to the scores that they provided. Comparison of the results with the teaching staff showed noticeable differences when assessing academic quality.
Originality/value – Significant conclusions can be drawn from the theoretical review of the empirical evidence obtained through this study, helping with the practical design and implementation of quality strategies in higher education, especially in regard to university education.


Introduction
Nowadays, higher education institutions are making an effort to change and evolve in order to develop and promote excellence in quality education models, putting students at the centre of the teaching-learning process. Quality assessment in higher education is of global interest; governmental and public demand for accountability from higher education institutions has steadily increased over the past decade [1]. Under this demand, the need for ensuring the validity and utility of the assessment process has also increased [2]. In this regard, given the importance and influence of quality in service business and the difficulty associated with the measurement of perceived quality, numerous researchers have focussed on the development of a variety of universal tools which could be employed to properly assess service quality across a diverse array of business sectors [3]. The concept of perceived quality was defined by Zeithaml in [4] as "the consumer's judgment about a product's overall excellence or superiority". With regard to the perceived quality of a particular service, the consumer's attitude involves an overall assessment of the superiority of said service [5]. In this sense, the majority of definitions in the scientific literature assessing service quality revolve around service users' perceptions of the global excellence associated with a particular service. The level of excellence is determined by the assessment of the technical and functional characteristics. Multiple researchers have examined service quality through the implementation of the confirmatory paradigm, estimating the perceived quality of services from the differences between expectations and results [6]. In this sense, this relationship springs from a continuous assessment of quality [7].
On the other hand, [8] clearly underlines the complexity of gauging the quality of services given their inherent characteristics (intangibility, heterogeneity and inseparability), which forces companies to approach service quality from the service users' viewpoint. The present study analyses service quality from the perspective of the different services delivered by university institutions (internal analysis) and the point of view of students or users (external analysis).
Higher education institutions have adopted quality management models aiming to improve their performance, leading to multiple benefits. These include improved management of development in key areas, an accurate assessment of the improvement of business processes and, finally, a greater involvement of the staff in their daily work, with increased motivation and subsequently higher productivity [9]. In addition, universities lay out their own plans to meet certain quality goals in order to ensure quality higher education, so that both the university community and society in general can benefit from quality services, quality teaching services and quality degrees and courses [10].
Higher education reforms over the past decade, which have taken place within the framework of the European Higher Education Area, have greatly influenced quality assurance. In this sense, a report issued in 2012 by the European Association for Quality Assurance in Higher Education (ENQA, [11]) provided a guideline to enforce quality assurance procedures in the near future. In addition, this report assessed a survey involving 28 quality agencies in 19 different countries, finding a significant quality in teaching services. Also, according to a report issued by the Education, Audiovisual and Culture Executive Agency of the European Commission in 2018 [12], most countries have developed institutions and procedures to reach and maintain a high level of quality in university education systems. In the case of Spain, the Ministry of Education, Culture and Sport periodically assesses the service quality of the university education system through different indicators obtained from the Integrated System of University Information established by the Ministry.
Educational services are considered central to the overall development of countries (especially in terms of economic growth). However, educational services are still far from straightforward to assess and measure. The intricate and abstract nature of the service quality variable hampers a proper assessment of its relevance [13]. Multiple authors have therefore approached this dimension with different results; there are remarkable efforts such as the work of Parasuraman et al. [5], which developed the SERVQUAL tool in order to measure service quality through different factors. SERVQUAL has been widely accepted as a valid instrument to measure service quality in multiple research fields (e.g. online banking [14], information technology [15,16], healthcare [17], professional services [18], freemium services [19] and telecommunication [20], among others).
On the other hand, most countries nowadays are promoting educational policies to stimulate research in this area of knowledge, leading to an increasing number of published studies [21][22][23][24][25][26][27][28][29][30]. A recent bibliometric study on service quality [31] corroborates the relevance of the scale used in this research to assess quality.
In this regard, the purpose of this research is to adapt SERVQUAL to teaching services through a twofold, internal and external approach, that is, from the perspective of the services provided by university institutions (internal analysis) and the point of view of students as the service users (external analysis). With that objective in mind, the following section focusses on a comprehensive review of the extant scientific literature with regard to the area of knowledge approached by this research. In addition, Section 3 examines several key aspects of the methodology, and Section 4 assesses obtained results. Lastly, this study discusses conclusions as well as possible improvement plans to optimize service quality delivered to students.
2. Theoretical background: service quality in higher education
SERVQUAL modelling has been extensively approached in the scientific literature with regard to the assessment of the services delivered by institutions and organizations. This method examines users' perceived quality by contrasting their expectations with the actual results and performance [32][33][34]. In this regard, De Oliveira and Ferreira [35] posit that quality in education services can be defined as a customer satisfaction index; this index can be used to measure satisfaction in regard to any kind of service through a variety of criteria. Multiple adaptations (e.g. [36,37] [e-SERVQUAL]; [38,39] [WEBQUAL]; [40] [IRSQ]; [8] [E-S-QUAL]; [41] [PeSQ]; [13] [SSTQUAL]) have been suggested after the original scale proposed by Parasuraman et al. [5].
In this sense, this research explores the suggestions and findings in [4], which acknowledged SERVQUAL as a universal tool capable of assessing the service quality delivered by service providers. This research adopts the following classical determinants to explore the qualitative status of education services:
(1) Tangible determinants, including the infrastructure and other components of the learning centre as well as supplies, computer equipment and verbal explanations from the teaching staff.
(2) Reliability determinants, indicating the actual capacity to fully meet expectations regarding the proposed and delivered services through content organization, well-defined suggested criteria and preparation of classes.
(3) Responsiveness determinants, such as the intent to help students and the preparation to deliver proper solutions with regard to the students' learning process, providing a healthy environment for student-teacher relationships and also complying with timetables and deadlines.
(4) Assurance determinants, reflecting the knowledge and the attitude of the teaching staff along with their ability to inspire trust and security.
(5) Empathy determinants, such as the ability to communicate with sympathy and provide individualized attention to make classes more compelling, while developing a receptive attitude that helps students achieve adequate comprehension skills and improved participation.
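As an illustration of how the five determinants above are typically scored, the following minimal sketch computes a mean perception-minus-expectation gap per dimension, in line with the contrast between expectations and actual results described earlier in this section. The response values, the `responses` structure and the `gap_scores` helper are hypothetical and are not taken from the survey used in this study.

```python
from statistics import mean

# Hypothetical 1-7 Likert answers, for illustration only (not data from this study).
# Each dimension maps to a list of (expectation, perception) pairs.
responses = {
    "tangibles":      [(6, 5), (7, 6), (5, 5)],
    "reliability":    [(7, 6), (6, 6), (7, 5)],
    "responsiveness": [(6, 6), (6, 7), (5, 6)],
    "assurance":      [(7, 7), (6, 5), (6, 6)],
    "empathy":        [(5, 6), (6, 6), (6, 7)],
}


def gap_scores(data):
    """Return the mean perception-minus-expectation gap for each dimension."""
    return {dim: mean(p - e for e, p in pairs) for dim, pairs in data.items()}


gaps = gap_scores(responses)

# List dimensions from the largest shortfall to the largest surplus.
for dim, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{dim:15s} gap = {gap:+.2f}")
```

A negative gap marks a dimension where perceived performance falls short of expectations; in this toy data the largest shortfall is in reliability.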
In order for the internal quality assurance systems of higher education institutions (HEIs) to achieve their purpose, the following principles (ENQA) should be implemented:
(1) Defining policies and procedures to enforce the standards and qualifications of learning programmes and learning awards, including a systematic review process. Institutions need to adopt a culture of quality improvement in all aspects of their educational product.
(2) Assessing students comprehensively through established criteria, regulations and procedures.
(3) Enforcing the quality assurance of the teaching staff, facilities and resources.
(4) Data processing of information collected through surveys and other sources for an effective management of institutions and their customer services.
(5) Objective and updated information available to the public in regard to available degrees and awards as well as financial data and quality reports.
The relevance of service quality within higher education institutions (HEIs) has been explored in multiple recent studies. In this regard, Foropon et al. [23] suggest four fundamental principles to properly assess service quality:
(1) Exploring the particular nature of higher education.
(2) Comprehensively enforcing and meeting students' expectations and needs.
(3) Assessing the expected performance from this type of education.
(4) Considering the scarcity of research on this matter.
In this regard, Tsinidou et al. [42] analysed the determinants of service quality within the higher education sector and attempted to assess their individual influence on students' perceived quality. Also, Fares et al. [43] explored students' loyalty and found significant determinants such as student satisfaction, service quality and the effect of the brand image on loyalty development. On the other hand, Senthilkumar and Arulraj [44] concluded that quality in education services is achieved through a valuable teaching staff along with exceptional infrastructure and resources, a wide variety of degrees and, lastly, by improving students' employability. In addition, [45] examined quality from the perspective of international students (at Japanese universities) and subsequently developed a performance-based higher education service quality model. Unlike the rest of the research, where the different dimensions of quality of service in the university education sector have been analysed [46][47][48], the novelty of this research lies in the double process of analysis based on student assessments of the SERVQUAL scale, the use of two algorithms and the evaluation of their results by a group of experienced teachers.
After thoroughly assessing prior research, this study aims to develop a tool to measure and assess service quality from the perspective of students and to contrast the results with those obtained from a group of experienced teaching staff to corroborate the findings of this study.

Methodological approach
This section presents the stages taken during the research, which are depicted in Figure 1 and Table 1. In addition, respondents' characteristics and profiles are displayed in Table 2.
The different variables considered in this research to assess service quality with regard to the teaching staff are organized and grouped around the five dimensions previously proposed by Parasuraman et al. (tangibles, reliability, responsiveness, assurance and empathy). The socio-demographic profile of the respondents (gender, age and time spent studying) is also analysed in detail (see Figure 2 and Table 3).

Development of the measurement scales
The survey that this research used for data collection includes the adaptation of some of the most well-known scales (shown in Appendix). In order to validate these scales, a qualitative personal interview and a quantitative test were initially conducted involving professionals in this area of knowledge with the purpose of ensuring the validity of the proposed model. A second pilot test was conducted with a sample of students from the Faculty of Economics and Business Management at the University of Granada (Spain) to validate the measuring tools. Table 4 shows the results of the analysis (Cronbach's alpha) with regard to the reliability of the scales. It is remarkable that removing items was not necessary to improve internal consistency.
Given inputs $x_i \in \mathbb{R}^d$ and outputs $y_i \in \mathbb{R}$, it is desired to obtain a subset of variables for which both cardinality and validation error are minimal, chosen from a Pareto front. The problem of identifying which variables are relevant in order to model a given output is known as variable selection [50,51]. This problem is well known as it has many implications: reducing the curse of dimensionality, improving the interpretability of the model, reducing data sets so that they fit into RAM and lessening computational requirements, among others. This problem has been tackled from two perspectives: filter and wrapper approaches [50,51]. The former perform variable selection in two stages, separating it from the model design. The latter perform the variable selection while setting the parameters which define the models. This research focusses on the filter approach as the wrapper method poses some problems: (1) the selection might not be representative for other regression methods and (2) the number of models that need to be designed makes it too expensive or even impossible to apply.
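The idea of choosing a subset from a Pareto front over cardinality and validation error can be sketched as follows. This is only an illustration of non-dominated filtering under the assumption that both objectives are minimized; the candidate labels and their error values are hypothetical, and `pareto_front` is not a function from this study.

```python
def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    Each candidate is (label, cardinality, validation_error); both
    objectives are minimized. A candidate is dominated when another one
    is at least as good on both objectives and strictly better on one.
    """
    front = []
    for label, size, err in candidates:
        dominated = any(
            (s <= size and e <= err) and (s < size or e < err)
            for _, s, e in candidates
        )
        if not dominated:
            front.append((label, size, err))
    return front


# Hypothetical variable subsets: (label, number of variables, validation error).
candidates = [
    ("A", 2, 0.30),
    ("B", 3, 0.22),
    ("C", 3, 0.28),
    ("D", 5, 0.21),
    ("E", 6, 0.25),
]

front = pareto_front(candidates)
print(front)
```

Here subset C is discarded because B has the same size with a lower error, and E is discarded because B is both smaller and more accurate; the remaining subsets are the trade-offs among which a final choice would be made.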
3.3.2 Normalized mutual information feature selection
The following algorithm is based on the estimation of mutual information (MI), a nonlinear correlation measure derived from information theory [52]. As opposed to other correlation estimation methods, this measure also considers nonlinear relationships. For two sets of variables, X and Y, it can be calculated by:
The normalized mutual information feature selection (NMIFS) algorithm [53] improves upon earlier, well-known MI-based feature selection algorithms such as the minimum-Redundancy Maximum-Relevancy (mRMR) algorithm, resulting in a better identification of the most significant variables by estimating the normalized MI measurement through the maximum entropy of both considered sets of features:
The NMIFS method is an iterative methodology which returns a relevance ranking of the input features with respect to the classification variable, taking into account not only their relevance, but also the redundancy among them. Thus, starting from an empty set of features $S = \emptyset$, the NMIFS algorithm iteratively selects the input feature $f_i$ which maximizes:
where $\#S$ is the cardinality of the current selected set $S$. The optimal number of features to be used in the calculation should usually be estimated through a wrapper approach exploring the learning methodology of the same model in order to assess the appropriate set of features leading to the best results.
3.3.3 Multi-objective optimization using genetic algorithms
In [54], the Hy-index was proposed as a new criterion to identify and select the best subset of variables in a filter approach. However, to determine the subset as presented in this research, all the solutions must be evaluated, and in this particular case, that is not possible. To overcome this problem, a multi-objective optimization genetic algorithm has also been proposed.
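The greedy NMIFS ranking described in Section 3.3.2 can be sketched for discrete variables such as Likert items. This is a minimal sketch under stated assumptions, not the implementation used in this study: MI is computed from empirical entropies as H(X) + H(Y) - H(X, Y), the normalization divides by the maximum of the two entropies as the text states (other formulations divide by the minimum, so the choice is exposed as the `norm` parameter), and the score of a candidate is assumed to be its MI with the target minus its average normalized MI with the features already selected. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Empirical Shannon entropy (in bits) of a discrete sample."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())


def mutual_info(xs, ys):
    """Empirical MI of two discrete samples: H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys, norm=max):
    """MI divided by norm(H(X), H(Y)); norm=max follows the text (assumption)."""
    h = norm(entropy(xs), entropy(ys))
    return mutual_info(xs, ys) / h if h > 0 else 0.0


def nmifs_ranking(features, target):
    """Greedy ranking: relevance minus mean normalized MI with selected features."""
    remaining = dict(features)
    selected = []
    while remaining:
        def score(name):
            relevance = mutual_info(remaining[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                normalized_mi(remaining[name], features[s]) for s in selected
            )
            return relevance - redundancy / len(selected)

        best = max(remaining, key=score)  # ties go to the first feature listed
        selected.append(best)
        del remaining[best]
    return selected


# Toy data: the target is determined by f1 and f3 together; f2 duplicates f1.
features = {
    "f1": [0, 0, 0, 0, 1, 1, 1, 1],
    "f2": [0, 0, 0, 0, 1, 1, 1, 1],
    "f3": [0, 0, 1, 1, 0, 0, 1, 1],
}
target = [0, 0, 1, 1, 2, 2, 3, 3]

ranking = nmifs_ranking(features, target)
print(ranking)
```

In this toy case all three features are equally relevant on their own, but once f1 is selected its duplicate f2 is pushed to the end of the ranking because it adds only redundancy, whereas f3 carries new information.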
Genetic algorithms have been used for many years in optimization problems. Their bio-inspired design has proved successful in many applications that require the exploration of the solution space. Each solution can be encoded in a chromosome (or individual), so solutions evolve in a population, exchanging genes and generating offspring with the idea that two good solutions, when crossed, will probably lead to a better solution.
The MGA (multi-objective genetic algorithm) [55] is a revision of the classical genetic algorithm involving a multi-objective selection operator. This operator is designed to determine two different individuals to cross and evaluate: one individual is selected according to the MI criterion [56] and the other according to its score in the Delta Test [57,58]. In the resulting crossover, the individuals are selected according to the two different criteria; thus, the algorithm performs multi-objective optimization.
The variables which define the rest of the genetic algorithm used in this research are the following:
These values follow the design principles found in the literature [59][60][61][62] and were retained after checking that other configurations did not produce significant improvements in the results.
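The two-criterion selection operator described above can be illustrated with a minimal sketch. It assumes the usual form of the Delta Test (half the mean squared difference between each output and the output of its nearest neighbour in the selected input variables) and uses tournament selection and one-point crossover, which are assumptions rather than the operators of [55]. The data, the `relevance` weights and the `mi_score` stand-in are hypothetical; a real run would plug in an MI estimator instead.

```python
import random


def delta_test(X, y, mask):
    """Delta Test: half the mean squared output difference between each point
    and its nearest neighbour, with distances taken over the variables
    switched on in `mask`. Lower values suggest less noise in the output."""
    cols = [j for j, on in enumerate(mask) if on]
    n = len(X)
    total = 0.0
    for i in range(n):
        nearest = min(
            (k for k in range(n) if k != i),
            key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in cols),
        )
        total += (y[i] - y[nearest]) ** 2
    return total / (2 * n)


def select_parents(population, mi_score, dt_score, rng, k=3):
    """Pick two distinct parents: one by tournament on the MI criterion
    (higher is better), one by tournament on the Delta Test (lower is better)."""
    first = max(rng.sample(population, min(k, len(population))), key=mi_score)
    rest = [ind for ind in population if ind != first]
    second = min(rng.sample(rest, min(k, len(rest))), key=dt_score)
    return first, second


def crossover(a, b, rng):
    """One-point crossover of two binary masks of equal length."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


# Hypothetical data: the output depends on variable 0 only.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.9], [1.1, 0.9]]
y = [0.0, 0.0, 1.0, 1.0]

# Hypothetical stand-in for an MI-based score of a variable subset.
relevance = [0.9, 0.1]


def mi_score(mask):
    return sum(r for r, on in zip(relevance, mask) if on)


def dt_score(mask):
    return delta_test(X, y, mask)


rng = random.Random(0)
population = [(1, 0), (0, 1), (1, 1)]  # each individual is a variable mask
parent_mi, parent_dt = select_parents(population, mi_score, dt_score, rng)
child = crossover(parent_mi, parent_dt, rng)
print(parent_mi, parent_dt, child)
```

Because each parent is chosen under a different criterion, the offspring mix subsets that score well on MI with subsets that score well on the Delta Test, which is how a single population can be steered towards both objectives.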

Research results
A panel composed of ten university teachers was consulted in order to validate the results obtained in this research after applying the techniques explained in the previous sections of this paper. The teaching staff were completely unrelated to the quality evaluation process assessed in this research. Their average age was 46.55, 40% of the teachers were men (60% women) and their seniority (at the Modern Languages Centre) ranged from 10 to 15 years of experience. The survey validation process followed four stages during the first semester of 2014, involving in-depth interviews, methodology evaluation and the assessment of results and feedback. During the first stage of the validation process, in-depth interviews were conducted with the aforementioned experts in order to explain the purpose of this research in detail and present the different variables which were selected from the original SERVQUAL scale. During the second stage, experts were handed the different subsets of variables obtained from each of the analysis techniques proposed in this research. In the third stage, experts assessed said subsets of variables on a Likert (1-7) scale. Results were grouped and statistical conclusions were drawn for each of the proposed research techniques. Finally, experts were personally and individually approached again to discuss the level of agreement regarding the results obtained in this research, supporting their feedback and assessing the validity of the results from the focus group. Results and scores of students, along with the different research techniques employed in this study with regard to the point of view of the teaching staff, are shown in Table 5. As observed in Table 5, students ranked responsiveness as the most significant dimension (average = 6.18), followed by tangibles (average = 6.14), empathy and reliability (average = 6.07 for both) and, lastly, assurance (average = 5.97).
On the other hand, teaching staff valued the Hy-index as the most appropriate algorithm for this research (5.57 points) as opposed to that involving the MI method (4.21 points). A second round of interviews with the teaching staff was conducted in order to assess these results. The rationale behind the different scores for each of the techniques and methods involved in this research is explained as follows:
(1) Regarding the Hy-index method, it only considers variables associated with the level of assurance (averages of 4.96 and 5.68 points for each of the variables assessed within their related dimension) and the level of empathy (average of 6.06 points). In light of these findings, this research concludes that both the skill and comprehension variables are essential in order to evaluate the level of quality of the students. This technique disregards the rest of the proposed dimensions after reviewing the theoretical framework.
(2) Regarding the MI method, it shows a significant difference with respect to the Hy-index method: it swaps the "capacity of the teaching staff" item related to the assurance dimension with the variable "age of the student" (average of 0.9), since this variable was deemed key after performing the algorithm proposed in this research. From the point of view of the teaching staff, this technique is valued and ranked lower, as observed in Table 5. Teaching staff do not consider the students' age variable a significant factor when assessing satisfaction and they rank it accordingly. This finding diverges from the mathematical results obtained through the techniques used in this research.
Once both methods have been assessed from the point of view of the teaching staff, this study concludes that the SERVQUAL scale is a valid tool to properly measure service quality with regard to education services. It is also worth noting that, after exploring the practical application of the research results obtained from the proposed methods, the study could be simplified by focussing on the variables related to the assurance dimension (capacity and ranking of the teaching staff) and the variables associated with empathy (comprehension), since the teaching staff rank the Hy-index method higher than the MI method despite the latter involving socio-demographic variables such as the age of the students. Therefore, this study concludes that the Hy-index method offers results more consistent with the scores provided by the teaching staff, thus validating and reaffirming the research methodology proposed and performed here.

Conclusions, managerial implications and future research
5.1 Discussion of results and managerial implications
The shortcomings and weaknesses revealed in significant reports on the overall state of education in Spain in general, and student satisfaction in particular, influence the decision-making process within the educational community with respect to educational institutions and teaching staff. In this regard, the positioning of an educational institution and its strategy depend on the management's level of awareness with respect to the areas of strength and weakness [63]. Therefore, an educational institution could improve its service quality performance by addressing the areas considered lacking, which it can identify by assessing the perceived quality of the services provided to its main customers, the students [64]. The SERVQUAL method is one of the most widely used techniques to assess perceived service quality for organizations and business companies and has also been extensively tested with regard to educational institutions [63,65,66]. In this regard, the dimensions involved and proposed in the SERVQUAL scale have been modified and adapted multiple times in recent years [67] while keeping, in most cases, the level of reliability and effectiveness suggested by the original authors.
This research proposes an evaluation of the SERVQUAL scale through a double process of measurement and validation. In the first stage of the research, the service quality perceived by a sample of 7,580 students was assessed using an adaptation of the SERVQUAL scale which introduces a multi-objective optimization technique. This analysis revealed the most relevant dimensions from the students' viewpoint according to their own scores. During the second stage of the research, results from students were compared to those of the teaching staff, revealing significant differences in regard to the scores and rankings of the teaching quality.
Students initially scored aspects related to the dimensions associated with the SERVQUAL scale. Two different algorithms, based on the Hy-index and MI methods, were applied to the obtained results in order to determine the relevance of the variables involved in the research. In addition, a panel of teachers also scored and assessed the same dimensions. Results showed that the most significant variables for the focus group of experts with respect to the Hy-index were only those variables associated with the assurance and empathy dimensions. That is, both the knowledge and comprehension of students are central for the teaching staff when assessing the level of quality of the students. On the other hand, the MI method proposed a significant change, swapping the "capacity of the teaching staff" variable of the assurance dimension with the "age of the student" variable. With respect to the focus group of experts, the most significant variables in order to properly assess the educational service quality are the "capacity of the teaching staff", "ranking of the teaching staff" and "facilitating comprehension" variables, as opposed to the subset involving the "ranking of the teaching staff", "facilitating comprehension" and "age of the student" variables. In this regard, the panel of experts rates the Hy-index method more highly, and this research thus concludes that it is the most appropriate method from the point of view of experienced experts and professionals.
This conclusion, which identifies a subset of variables from the original SERVQUAL scale that improve students' perceived service quality, proves key for public educational institutions. It allows them to properly manage and take advantage of the resources and capacities of the teaching staff. Once these variables have been identified, improvements with respect to the students' rankings and scores could be introduced in the future. In light of these findings, this research concludes that the aspects associated with the assurance dimension, such as the capacity and ranking of the teaching staff, and the empathy dimension, such as "comprehension", are the most significant for students as opposed to other variables related to tangible, reliability and responsiveness factors. This research thus also concludes that students prefer a quality education system based on the capacity of the teaching staff and the sympathy and individualized attention provided at the learning centre as opposed to other aspects related to facilities and supplies, content management, readiness of the services and compliance with timetables and deadlines. From this starting point, further research will consider other interesting aspects such as past educational performance. Another interesting conclusion is that the results are homogeneous regardless of the nationality of the students. Although a preliminary ANOVA statistical analysis considering this variable showed that it could affect the output variable, the ranking algorithms finally ranked this variable in position 18 out of 19. One possible reason is the imbalanced data set, which was mostly composed of students from the United States and from Spain.
Lastly, it is worth noting that, following the trends of modern marketing, the target groups of most satisfaction surveys using methodologies conceived on the basis of SERVQUAL are service users. However, a significant avenue for reflection opens up by combining perceived quality from the service users' viewpoint with the perspective of service providers, which can lead to an increased level of satisfaction among end users. In this regard, the difference between what teaching staff deem important and the actual level of satisfaction of end users should serve as the starting point for educational institutions' managers to optimize the services they provide. Therefore, time, effort and resources should be allocated to those factors improving the level of satisfaction for both external and internal users by raising their awareness.
This paper has also contributed to research in the machine learning field by considering two well-known criteria to perform variable selection and optimizing them in a multi-objective way. The results obtained showed that it is worthwhile to keep the level of entropy high while maintaining variables with small variance of the noise in the output. Nonetheless, the approximation error criterion, although more computationally expensive, still remains a valid method to identify the most significant variables.