Modeling of student academic achievement in engineering education using cognitive and non-cognitive factors

Purpose – The retention and success of engineering undergraduates are an increasing concern for higher-education institutions. The study of success determinants is an initial step in any remedial initiative targeted at enhancing student success and preventing premature withdrawals. This study provides a comprehensive approach toward the prediction of student academic performance through the lens of the knowledge, attitudes and behavioral skills (KAB) model. The purpose of this paper is to improve the modeling accuracy of students’ performance by introducing two methodologies based on variable selection and dimensionality reduction. Design/methodology/approach – The performance of the proposed methodologies was evaluated using a real data set of ten critical-to-success factors on both attitude and skill-related behaviors of 320 first-year students. The study used two models. In the first model, exploratory factor analysis is used. The second model uses regression model selection. Ridge regression is used as a second step in each model. The efficiency of each model is discussed in the Results section of this paper. Findings – Both methods yielded small mean-squared errors and hence improved the prediction of student performance. The results show that the quality of both methods is sensitive to the size of the reduced model and to the magnitude of the penalization parameter. Research limitations/implications – First, the survey could have been conducted in two parts; students needed more time than expected to complete it. Second, if the study is to be carried out for second-year students, grades of general engineering courses can be included in the model for better estimation of students’ grade point averages. Third, the study only applies to first-year and second-year students because the factors covered are those that are essential for students’ survival through the first few years of study.
Practical implications – The study proposes that vulnerable students could be identified as early as possible in the academic year. These students could be encouraged to engage more in their learning process. Carrying out such measurement at the beginning of the college year can provide professionals and college administrators with valuable insight into students' perceptions of their own skills and attitudes toward engineering. Originality/value – This study employs the KAB model as a comprehensive approach to the study of success predictors. It also implements two new methodologies to improve the prediction accuracy of student success.


Background and introduction
Universities are among the most important community institutions that are dedicated to the growth and development of their respective nations. The future of our nation is largely dependent on the technological and knowledge advancements offered by the field of engineering. Lin et al. (2008) and Wall (2010) emphasized the influence of the development of engineering and technology on the future of society. The document prepared by Wall (2010) recommends that a full measure of engineering capacity needs to be considered to improve

Uniqueness of this study
This study aims to approach and explore those critical-to-success factors through the lens of the knowledge, attitudes and behavioral skills (KAB) model, which was first introduced to the education field by Schrader and Lawless (2004). In this study, the KAB model is implemented as a comprehensive approach to the study of the factors (both cognitive and non-cognitive) that are critical to the success of undergraduate engineering students. "KAB" refers to the combination of knowledge, attitudes and behavior. In this study, knowledge, or the "cognitive factor," refers to the previous knowledge the student possesses from high school education. The set of non-cognitive factors, under attitudes and behavioral skills, was mostly inspired by Engineering Success, written by Schiavone (2001). Schiavone (2001) referred to these non-cognitive factors as the "study and survival skills" that most engineering students need in order to strive and thrive in the engineering major. Based on both of these studies, this study determined the non-cognitive factors to be examined and designed an instrument, "Building Blocks of Excellence," to provide a quantified measure of student self-ratings for two elements of the KAB model: "attitudes" and "behavioral skills." This study is the first to explore the KAB model in the context of predicting students' academic performance.
This study also aims to examine the efficiency of combining a classical variable-selection technique with ridge regression, a more recent technique. The study used two models. In the first model, exploratory factor analysis (i.e. principal component analysis) is used to reduce the dimension of the factors to be included in the ridge regression. The second model uses the regression best-subset method to select the variables to be used for ridge regression. The combination of these techniques has not yet been explored in the education field. The efficiency of each of these models in terms of the R² value and mean-squared error (MSE) is discussed in the Results section of this study.

Literature review
2.1 Cognitive and non-cognitive factors in the prediction of students' academic outcomes
The persistence and academic success of college students are two different, yet inseparable, aspects affected by various factors and by combinations of factors (Lotkowski et al., 2004). Researchers have confirmed that academic factors cannot be used alone to predict student performance (Astin and Astin, 1992). They have proposed using non-cognitive factors as an alternative to standardized tests such as the scholastic aptitude test (SAT). A combination of high school grade point average (HSGPA) and standardized tests accounts for approximately 25 percent of the variance when predicting the first year's GPA (Robbins et al., 2004). We believe that more of the variance in student academic performance could be explained through other non-cognitive factors.
To become a well-rounded student, students' attitudes and other personality factors are as important as the knowledge dimension of the education process. Besterfield-Sacre et al. (1997) reported that education is an "aggregate of both cognitive (content knowledge and technical skills) and affective (attitudes) processes." Non-cognitive factors are gaining momentum in recent higher-education initiatives. According to a report of the Higher Education Academy, UK (Thomas et al., 2017), "It is the human side of higher education that comes first – finding friends, feeling confident, and above all, feeling a part of your course of study and in the institution – that is the necessary point for academic success" (p. 1). This report confirms that non-cognitive factors, e.g., students' sense of belonging, social connectedness and self-efficacy, are the starting points for success and retention initiatives.
From a different perspective, information gathered about student perceptions of their attitudes and skill-related behaviors would be valuable to higher-education professionals. This information may support higher-education institutions in guiding resource allocation and program development for first-year engineering students (French et al., 2003). Besterfield-Sacre et al. (1999) reported that students' evaluation of their attitudes could provide educators with an important insight that would facilitate their understanding of student success and persistence.
Through a thorough examination of the literature related to predicting student academic performance in STEM majors, three streams of research can be found. Several studies relied on cognitive factors such as high school exam scores (De Winter and Dodou, 2011), HSGPA and admission test scores (Jiang and Freeman, 2011; Andrews, 2014) and scores on standardized tests such as the ACT and the SAT (Kauffmann et al., 2007; French et al., 2005). Though high school grades and other admission tests might be good predictors of first-year GPA, Ackerman et al. (2013) argued that a great deal of individual difference variance remains unaccounted for.
The other stream of research focused only on non-cognitive factors; for example, Walton (2010) found that the sequential construct of learning style and positive self, and the focus constructs of resilience, were predictors of students' persistence. The study by Stankov and Lee (2014) found that measures of motivation and self-efficacy had the highest predictive validity.
The third stream of studies considered both cognitive and non-cognitive factors in their attempts to understand the variance in student academic success and retention. An important study is the meta-analysis by Richardson et al. (2012), which found that performance self-efficacy had the largest correlation with GPA, followed by high school GPA, ACT and grade goal. Gray et al. (2016) examined factors of learning as early indicators of at-risk students. Their study found that factors such as age, prior academic achievement, study efforts, self-efficacy and deep learning approach significantly predicted failing students.
The predictive validity of the combination of both cognitive and non-cognitive factors is generally agreed upon in the general education literature. Several studies from the engineering education field followed the same approach. Some of the early studies include those by Besterfield-Sacre et al. (1998), French et al. (2005), Burtner (2004, 2005), Bernold et al. (2007), Lin et al. (2008) and Veenstra et al. (2009). Non-cognitive factors that have been examined include variables such as emotional, social, psychosocial and study skill factors.
Recent contributions in the engineering field included the study by Jin et al. (2011). Their study found that while high school grades and SAT scores were better predictors of college GPA, affective factors such as motivation and leadership were better predictors of first-year retention. In addition, Whitaker (2014) found that academic determination significantly contributed to students' GPAs above and beyond background academic preparation. The study by Hall et al. (2015) explored the impact of aptitudes and personality traits as well as HSGPA on student retention. Their study used the NEO Five-Factor Inventory and a measure of locus of control to assess personality traits. Hall et al. (2015) found that aptitude measures such as high school GPA, SAT math and the Calculus readiness placement test were significant predictors of student retention. In their study, "conscientiousness" from the big-five factors was the only significant personality factor.
From the literature reviewed, it is evident that different studies approached the prediction of students' success differently and examined different aspects. There is a lack of a comprehensive framework, consistency or agreed-upon structure that can be flexibly used by institutions aiming to study the cognitive and non-cognitive predictors of students' performance. To fill this gap in the literature, this study aims to study the success predictors through the comprehensive lens of the KAB model. To date, the KAB model has not been explored in this context in the field of predictive analytics.

Modeling techniques used for the prediction of students' performance
Several statistical modeling techniques have been used to estimate student success. The majority of such studies have used classical linear regression or other types of regression. These approaches include logistic regression (Lin and Imbrie, 2014), multiple linear regression (Ting and Man, 2001; Virtanen et al., 2013), hierarchical logistic regression (French et al., 2005; Andrews, 2014), hierarchical multiple regression (Whitaker, 2014) and multinomial logistic regression (Hall et al., 2015). Other statistical methods include path analysis (French et al., 2003) and correlational analysis (Bernadin and McKendrick, 2015). Classification methods employed in this area include discriminant analysis (Burtner, 2005) and artificial neural networks (Lin et al., 2008; Huang, 2011). Classification methods can be used to predict categorical classes (discrete or nominal) but may not be used to model continuous-valued outcomes. Classical regression analysis is commonly used for this purpose.
However, multiple regression analysis is not always the best option. Complexity of interpretation and overfitting are two common issues that often arise in high-dimensional data analysis. In this situation, penalization-based regression techniques are preferable to the classical regression techniques. Penalization-based techniques, such as ridge and LASSO regression, penalize the size of the regression coefficients (the sum of squared coefficients in ridge regression and the sum of their absolute values in LASSO) to limit the complexity of the regression model and thereby enhance the accuracy of the prediction.
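For reference, the two penalized objectives can be written in their standard textbook form. The notation below (n observations, p predictors, response y_i, predictors x_ij, coefficients β_j) is assumed for illustration and does not come from the original study; λ ≥ 0 is the penalization parameter.

```latex
\hat{\beta}^{\mathrm{ridge}} = \arg\min_{\beta_0,\,\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
  + \lambda \sum_{j=1}^{p} \beta_j^{2}

\hat{\beta}^{\mathrm{lasso}} = \arg\min_{\beta_0,\,\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Setting λ = 0 recovers ordinary least squares in both cases; larger values of λ shrink the coefficients more strongly.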
According to Chatterjee and Hadi (2015), when multicollinearity exists in a set of predictor variables, the ordinary least squares (OLS) estimates of the regression coefficients tend to be unstable and can lead to inaccurate inferences. Ridge regression, a penalized form of the generalized linear model (GzLM), is an alternative to standard multiple regression when the predictive variables are highly correlated (multicollinearity). Allowing a small bias in the estimated coefficients yields more meaningful estimates. Ridge regression minimizes the error by introducing a penalty on the regression coefficients of the GzLM. This penalty causes the regression coefficients to shrink toward zero (but not exactly to zero). Hoerl and Kennard (1970) demonstrated that some of the undesirable effects of multicollinearity can be reduced by using "ridge" estimates in place of the least squares estimates. The value of the shrinkage parameter, λ, is critical to the quality of the ridge regression model. The value of this parameter can be specified so as to minimize the mean square error of the coefficient estimates of the model. Several methods have been developed over the last two decades for estimating the optimal value of λ. Cross-validation and the ridge trace plot are the most well known among these methods. Moreover, the GzLM is also a method for relaxing the normality assumption in classical regression (Green and Silverman, 1993). Practitioners prefer GzLMs over the OLS and multiple linear analysis.
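The shrinkage behavior described above can be illustrated with a minimal Python sketch. It uses the textbook closed-form ridge estimate on standardized predictors; the simulated data, the function name `ridge_coefficients` and the chosen λ values are assumptions for illustration only, not the software or data used in the study.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate on standardized predictors and a centered response."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    # lam = 0 reduces to ordinary least squares on the standardized predictors.
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)

# Hypothetical data: two nearly collinear predictors.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 1.0 * x2 + rng.normal(scale=0.5, size=n)

beta_ols = ridge_coefficients(X, y, lam=0.0)
beta_ridge = ridge_coefficients(X, y, lam=10.0)
print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

The ridge coefficient vector has a smaller norm than the OLS vector, which is the shrinkage toward zero that the penalty produces.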
Ridge regression was used by Whittaker et al. (2000) to estimate the correlation of markers with a trait of interest. They compared its performance to other techniques commonly used by researchers. Their study found that in all cases ridge regression performed better and produced smaller standard errors of selection response. They concluded that ridge regression is a very stable procedure in the sense that small changes in the data do not produce large changes in the estimated regression coefficients. Erickson (1981) effectively used ridge regression to directly estimate lagged effects in marketing. Huang et al. (2008) proposed a method which combined ridge regression and hierarchical cluster analysis for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. The effectiveness of ridge regression was also confirmed by El-Dereny and Rashwan (2011). They found that when multicollinearity exists, all methods of ridge regression perform better than the OLS method.

Indicators and measures
The survey conducted in this study consisted of one dependent variable (student academic performance) and 11 independent variables: one cognitive variable (high school grade or HSG) and ten non-cognitive variables (five factors under attitudes and five factors under skill-related behaviors). In this study, student academic performance is represented by the end-of-course GPA. The end-of-course GPA has been used as a measure of student academic performance by many researchers in this field. From the literature reviewed, high school grades proved to be one of the most important cognitive factors that significantly predicted college GPA (Geiser and Veronica Santelices, 2007; Hiss and Franks, 2014).

Non-cognitive factors
2.3.2.1 Attitudes
• Motivation: motivation refers to a student's desire and willingness to study engineering. This section is adapted from the Situational Motivation Scale, which represents a self-report measure of situational intrinsic motivation, identified regulation, external regulation and amotivation. This scale was developed by Deci and Ryan (1985, 1991) and was tested by Guay et al. (2000).

• Academic self-regulation: academic self-regulation is also referred to as self-discipline. It was defined by Allen et al. (2008) as the extent to which a student appreciates work and tasks related to college and approaches them conscientiously. Studies by Pintrich and De Groot (1990) and Vrugt and Oort (2008) found that self-regulation is highly correlated with student GPA.

• Commitment to college and degree: a student's commitment to college and degree refers to the student's perception of the extent of their dedication and commitment to completing college and pursuing their degree.
• Scholastic and career self-efficacy: according to the definition provided by Bandura (1994), self-efficacy refers to students' beliefs about their own capabilities to carry out their academic tasks efficiently. Several researchers including Robbins et al. (2004), Richardson et al. (2012) and Saltonstall (2013) found that students' self-efficacy is among the best predictors of student academic performance.
• Conscientiousness: in Le et al. (2005), conscientiousness was defined as "the extent to which a student is self-disciplined, achievement-oriented, responsible, and careful." The components of this section of the survey were adapted from the conscientiousness scale in the Big Five Personality Traits Test. According to several researchers such as Trapmann et al. (2007) and Poropat (2009), among the Big Five Personality Traits, conscientiousness is the only trait that has a significant correlation with academic achievement.

Skill-related behavior
• Social connectedness: social connectedness is also referred to as social involvement. It refers to the extent to which a student participates in social activities or has social connections.
• Communication: Gore (2006) defined this term as "attentive to others' feelings and flexibility in resolving conflicts with others." Le et al. (2005) defined communication as "the ability to exchange information with others." This section is adapted from Duran's (1983) Communicative Adaptability Scale and Rubin and Martin's (1994) Interpersonal Communication Competence Scale.
• Confidence in math and science skills: this term was defined by Sheppard et al. (2010) as a student's perception of his/her proficiency in science, critical thinking, real-world problem-solving and computations. The statements in this section are adapted from the "mathematics self-efficacy and anxiety questionnaire" developed by May (2009).
• Teamwork-oriented skills: this variable measures the extent to which the student prefers to work within a team. A few elements of this section were adapted from the students' attitudinal success instrument proposed by Reid (2009).
• Time-management skills: this factor refers to the students' perceived ability to use and manage their time efficiently. Statements in this section were adapted from the "time-management questionnaire" developed by Britton and Tesser (1991).

Study questions
The following questions guided this study:
(1) How can a comprehensive framework be built to serve as a conceptual model for the prediction of students' academic performance?
(2) To what extent can a group of cognitive and non-cognitive factors explain variability in students' final GPA?
(3) Can using a selection or reduction technique optimize the prediction capacity of ridge regression?
(4) Which reduction or selection technique is more efficient in the optimization of the prediction capacity of ridge regression?

Methodology
An instrument, called the "Building Blocks of Excellence," was developed and administered. The first section includes questions about the student's name, gender, employment status and number of credit hours registered. The second section is titled "attitudes," which represents student perceptions of five factors that are related to their attitudes toward engineering. The third part includes five factors about students' self-report or perceptions toward their own "skill-related behavior." The gained knowledge, which is the remaining component of the KAB approach, is represented by the student's high school final grade. Except for the demographics sections, students select a response based on a Likert scale (strongly agree = 5; agree = 4; neutral = 3; disagree = 2; strongly disagree = 1) for sections two through four. The factors corresponding to each section are shown in Table I.
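As a small illustration of the Likert coding described above, the following Python sketch maps responses to their numeric codes. Averaging the items of a factor into a single score is an assumption made here for illustration; the study does not state how item responses were aggregated, and the example responses are hypothetical.

```python
# Numeric codes for the five-point Likert scale described in the text.
LIKERT = {
    "strongly agree": 5,
    "agree": 4,
    "neutral": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

def factor_score(responses):
    """Average the coded responses to the items of one factor (assumed aggregation)."""
    codes = [LIKERT[r.strip().lower()] for r in responses]
    return sum(codes) / len(codes)

# Hypothetical responses of one student to four items of one factor.
motivation_items = ["Agree", "Strongly agree", "Neutral", "Agree"]
score = factor_score(motivation_items)
print(score)
```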
The study instrument was administered to students in the College of Engineering at Qatar University. The first page of the survey included a letter to the students and a consent form. By signing the consent form, students granted permission to the researcher to obtain their GPA and other information from the Registration Department.
The survey targeted four courses of general requirements: CENG 106 Computer Programming, CENG 107 Engineering Skills and Ethics, GENG 300 Numerical Methods and GENG 200 Probability and Statistics for Engineers. These courses are heavily populated by engineering undergraduates. A total of 322 students answered the survey.
The study examined the correlation between course GPA and students' cognitive and non-cognitive factors. Preliminary analyses were performed using the RStudio and SPSS packages to ensure that the measures met the required assumptions for the proposed statistical tests. Factors were examined for indicators of normality and outliers. Descriptive statistics were then generated to examine means, standard deviations and the minimum and maximum value for each predictive variable. Two methods of variable reduction and selection were performed. Each method was applied before the ridge regression was carried out. The first method uses exploratory factor analysis, whereas the second method uses regression best-subset selection techniques.

Correlation matrix among independent and dependent variables
Correlations of our critical-to-success factors and student performance were established using the bivariate Pearson correlation, which determines the strength of linear relationships between two variables. Table IV summarizes the R² for the critical-to-success factors with student performance or final GPA.
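The bivariate Pearson correlation mentioned above can be computed as in the following minimal Python sketch. The function name `pearson_r` and the six data points are hypothetical and serve only to show the calculation; they are not values from the study.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical high school grades and final GPAs for six students.
hsg = [78, 85, 90, 70, 95, 88]
gpa = [2.6, 3.0, 3.4, 2.4, 3.8, 3.1]

r = pearson_r(hsg, gpa)
r_squared = r ** 2  # share of variance in GPA linearly associated with HSG
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```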
4.3 Implementation of the proposed methodologies
4.3.1 Method I: factor-analysis-based ridge regression analysis. This study applied exploratory factor analysis (FA), as a dimension reduction technique, to reduce the variables used for the ridge regression model and explore the underlying structure of a large set of variables. The groupings or components generated using the FA are used along with another two categorical variables (gender and group) in the ridge regression model.
We tested the assumptions for exploratory FA. According to Kim and Mueller (1978), the first assumption of FA is that the sample size is adequate. This assumption is measured on the basis of the Kaiser-Meyer-Olkin measure of sampling adequacy and the This assumption was checked using Bartlett's test of sphericity and the value of this test was (0.0). The third assumption is the absence of multicollinearity or singularity; we tested this assumption by checking the values of the tolerance and the variance inflation factor (VIF). Tolerance values were 0.79, 0.78, 0.78 and 0.96, and were all greater than 0.1. Values of the VIF (the reciprocal of the tolerance) were 1.259, 1.268, 1.181 and 1.04. Normality is not an assumption of FA. Using the eigenvalue criterion and interpretability, we extracted four components. As shown in Table V, the first four components had eigenvalues greater than one and they explained 65.546 percent of the variance in the data.
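Two of the checks named above, the tolerance/VIF diagnostic and the eigenvalue-greater-than-one criterion, can be sketched in Python as follows. The function names and the simulated six-variable data set are assumptions for illustration; the study itself used SPSS and RStudio, and its exact extraction settings are not reproduced here.

```python
import numpy as np

def tolerance_and_vif(X):
    """Tolerance (1 - R_j^2) and VIF (1 / tolerance) for each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on all remaining columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        tol[j] = (resid @ resid) / np.sum((target - target.mean()) ** 2)
    return tol, 1.0 / tol

def kaiser_components(X):
    """Eigenvalues of the correlation matrix and how many exceed one."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
    explained = eigvals / eigvals.sum()
    return eigvals, explained, int(np.sum(eigvals > 1.0))

# Hypothetical data: six survey factors driven by two latent dimensions.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
X = np.column_stack(
    [latent[:, 0] + 0.6 * rng.normal(size=300) for _ in range(3)]
    + [latent[:, 1] + 0.6 * rng.normal(size=300) for _ in range(3)]
)

tol, vif = tolerance_and_vif(X)
eigvals, explained, n_keep = kaiser_components(X)
print("tolerance:", np.round(tol, 3))
print("VIF:", np.round(vif, 3))
print("components with eigenvalue > 1:", n_keep)
print("variance explained by them:", round(float(explained[:n_keep].sum()), 3))
```

A tolerance above 0.1 (equivalently, a VIF below 10) is the rule of thumb applied in the text.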
On the basis of the pattern matrix shown in Table VI, four groupings were established. We elected to suppress variables with a loading value of less than 0.4. According to the study of Guadagnoli and Velicer (1988), a factor with ten loadings greater than 0.4 is considered more stable for a sample size that is greater than 150. A title was assigned to each of the four components, as shown in Table VII.
In addition to two categorical variables (gender and group), the four components generated by the PCA were used in the ridge regression model, as shown in Table VIII. To specify the optimal value of the tuning parameter λ, we used the ridge trace plot proposed by Hoerl and Kennard (1970). This method is a trade-off between the bias and the variance of the coefficient estimates: the changes in the GzLM coefficients are visualized over a wide range of λ values, and λ is selected at the point where further changes in these coefficients become insignificant. The optimal range of λ was found to be between 0.1 and 0.55. A set of ten values of λ was selected from this range, including the extremes. The extreme values of the optimal range were intentionally included in order to maintain a small bias (see Table VIII). In Table VIII The MSE exhibited the smallest value (0.26) for this parameter. In this case, the final regression model is as follows:

$$
\begin{aligned}
\text{Student's GPA} = {} & 2.95905 + 0.0812297 \times \text{Gender} - 0.255536 \times \text{Student group} \\
& + 0.12479 \times \text{Academic mindset} + 0.00934197 \times \text{Social skills} \\
& - 0.0315418 \times \text{Self-discipline} + 0.310555 \times \text{High school grade}
\end{aligned}
\tag{1}
$$

The R² statistics shown in Table IX indicate that the as-fitted model explains 34.59 percent of the variability in the students' final GPA. The adjusted R² statistic is 31.922 percent.
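The idea of evaluating ridge regression over a grid of ten λ values between 0.1 and 0.55 can be sketched in Python as below. This is an illustration under stated assumptions rather than the procedure used in the study: the data are simulated, the predictors are standardized, the MSE is computed on a simple hold-out split (in-sample MSE can only grow as λ grows), and the scale of λ depends on the software, so these values need not match those in Table VIII.

```python
import numpy as np

def ridge_path(X_train, y_train, X_val, y_val, lambdas):
    """Return (lambda, coefficients, validation MSE) for each candidate lambda."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    Xt, Xv = (X_train - mu) / sd, (X_val - mu) / sd
    y_mean = y_train.mean()
    p = Xt.shape[1]
    path = []
    for lam in lambdas:
        beta = np.linalg.solve(Xt.T @ Xt + lam * np.eye(p), Xt.T @ (y_train - y_mean))
        mse = float(np.mean((y_val - (y_mean + Xv @ beta)) ** 2))
        path.append((float(lam), beta, mse))
    return path

# Hypothetical data with six predictors, two of them strongly correlated.
rng = np.random.default_rng(2)
n, p = 300, 6
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)
y = 3.0 + X @ np.array([0.3, 0.1, 0.0, -0.2, 0.1, 0.3]) + rng.normal(scale=0.5, size=n)

lambdas = np.linspace(0.10, 0.55, 10)  # ten values, both extremes included
path = ridge_path(X[:200], y[:200], X[200:], y[200:], lambdas)
best_lam, best_beta, best_mse = min(path, key=lambda row: row[2])

# Printing the coefficients per lambda is the tabular analogue of a ridge trace.
for lam, beta, mse in path:
    print(f"lambda={lam:.2f}  mse={mse:.4f}  coef={np.round(beta, 3)}")
print("selected lambda:", best_lam)
```

Plotting each coefficient against λ gives the ridge trace; λ is then chosen where the curves flatten.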
The standard error of the estimate shows the standard deviation of the residuals to be 0.51051. The mean absolute error (MAE) of 0.40449 is the average absolute value of the residuals.
4.3.2 Method II: best-subset-based ridge regression analysis. The regression model selection procedure in the Statgraphics software was used to select the independent variables for use in the ridge regression model to predict our single quantitative dependent variable Y (students' final GPA). This procedure considers all possible regressions involving various combinations of the independent variables and compares the resulting models based on the MSE, R², adjusted R² and Mallows' Cp statistic. Table X provides some examples of the regression models provided by the procedure. We selected the first model because it provided the best adjusted R² value and the smallest MSE and Mallows' Cp values.
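An all-possible-regressions search of the kind described above can be sketched in Python as follows. This is a generic illustration, not the Statgraphics procedure itself: the function name, the variable names and the simulated data are hypothetical, and the formulas are the standard ones for MSE, R², adjusted R² and Mallows' Cp.

```python
import itertools
import numpy as np

def best_subsets(X, y, names):
    """Fit OLS on every non-empty subset; report MSE, R2, adjusted R2 and Mallows' Cp."""
    n, p_full = X.shape
    sst = float(np.sum((y - y.mean()) ** 2))

    def sse(cols):
        A = np.column_stack([np.ones(n), X[:, cols]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return float(resid @ resid)

    # Error variance estimated from the full model, as required by Cp.
    sigma2_full = sse(list(range(p_full))) / (n - p_full - 1)
    results = []
    for k in range(1, p_full + 1):
        for cols in itertools.combinations(range(p_full), k):
            s = sse(list(cols))
            r2 = 1.0 - s / sst
            results.append({
                "vars": [names[c] for c in cols],
                "mse": s / (n - k - 1),
                "r2": r2,
                "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
                "cp": s / sigma2_full - n + 2 * (k + 1),
            })
    return sorted(results, key=lambda row: row["adj_r2"], reverse=True)

# Hypothetical predictors; only the first and third truly affect the response.
rng = np.random.default_rng(3)
n = 150
names = ["hsg", "motivation", "self_efficacy", "teamwork", "time_management"]
X = rng.normal(size=(n, len(names)))
y = 3.0 + 0.4 * X[:, 0] + 0.2 * X[:, 2] + rng.normal(scale=0.5, size=n)

results = best_subsets(X, y, names)
best = results[0]
print(best["vars"], round(best["adj_r2"], 4), round(best["cp"], 2))
```

Because the search fits 2^p − 1 models, it is practical only for a modest number of predictors.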
The selected variables in the first model were used in the ridge regression model. (2) The R² statistic indicates that the as-fitted model explains 35.5453 percent of the variability in the students' final GPA. The adjusted R² is 31.5447 percent. The standard error of the estimate indicates the standard deviation of the residuals to be 0.523226. The MAE of 0.402426 is the average absolute value of the residuals. Finally, we use the results provided in

Labels were assigned to each of the four groupings and were saved as regression variables. The study used the four components generated by the PCA in the first ridge regression model. The second model used the regression best-subset procedure to reduce the number of variables to be used with the ridge regression. From the results in Table XIII, we conclude that although the second model provided a higher value of R², the PCA model provided a higher adjusted R², as well as lower MSE and mean absolute percentage error (MAPE) values. Though the differences in the values between the two models do not appear to be very large, some implications are made in the next section on the appropriate use for each model.

Summary of this research
This study examined the potential impact of a combination of critical-to-success factors that were essentially based on the KAB approach. On the basis of the PCA, we generated four components: prior knowledge, academic mindset, social skills and self-discipline.
Academic mindsets were described by Farruggia et al. (2018) as directly and strongly affecting students' academic success. Motivation, as a component of mindset, is associated with college students' success in the studies carried out by French et al. (2003) and Kauffmann et al. (2007). However, Jin et al. (2011) suggested that academic motivation is a better predictor of engineering students' retention. Commitment to college and to completing a degree is also a good predictor of engineering success, in contrast to the study by Veenstra (2008). Confidence in math and science is a strong predictor of student performance, as confirmed by the studies of Lotkowski et al. (2004) and Robbins et al. (2004). Self-efficacy, the fourth element under academic mindset, is considered one of the best predictors of GPA. This relationship was evident in several studies including the meta-analysis by Robbins et al. (2004), Richardson et al. (2012) and Saltonstall (2013). The social skills construct had a moderate effect on students' GPAs, consistent with the study of Lotkowski et al. (2004). The last component, academic self-discipline, was a good predictor. It involves two variables: self-regulation and conscientiousness. Both of these variables have been demonstrated in the literature to strongly correlate with a student's success. The positive relationship between self-regulation and academic achievement was demonstrated by Pintrich and De Groot (1990) and Vrugt and Oort (2008). However, this notion is contrary to what was revealed by the study of Virtanen et al. (2013). Conscientiousness was also found in many studies to significantly affect student success (Trapmann et al., 2007; Poropat, 2009).
We found that a combination of cognitive and non-cognitive factors accounts for approximately 35-40 percent of the variance in students' GPAs. This share is greater than that found by many other studies in both the general education and the engineering education fields. While the non-cognitive factors accounted for approximately 25 percent of the variance in final GPA, cognitive variables contributed only 15 percent of this variance. Thus, we can conclude that non-cognitive factors are more important in predicting student academic performance.
The study used two prediction methods. Each method involved a combination of a classical data reduction/selection technique and the more recent technique of ridge regression. The first method used PCA to reduce the dimension of variables, whereas the We can conclude that through its KAB model, this study successfully explained the considerable variance in student success for engineering undergraduates. Moreover, adding a selection or dimension reduction technique further improved the prediction capacity of ridge regression. In addition, the predictive analytics used improved our understanding of the influence of those cognitive and non-cognitive factors on students' academic achievement.

Limitations of this research and recommendations for future research
The following limitations of the study were noted. First, the survey could have been conducted in two parts, since the students needed more time than expected to complete it. Students took up to 15 min to complete the survey. Other researchers (notably Gratiano and Palm, 2016) employed various techniques to reduce the survey completion time. They proposed a 5 min questionnaire to predict success and retention in first-year engineering students. The survey comprised three open-ended questions, and students were given 10 min to complete the survey. The participants were then interviewed for 30-40 min. However, it should be noted that the researcher conducting the survey reported that the (questionnaire and interview) technique was too lengthy for full participation by the students invited. They also noted that the surveys and interviews were somewhat invasive of student privacy because personal information was collected. Since the surveys and interviews were conducted by the professor, the students might not have felt comfortable providing accurate answers. In our study, the researcher conducting the surveys was a total stranger to the students; the researcher reported that the students were willing to answer all the survey questions and appeared both positive and assured during the surveys.
Second, the study only applies to first- and second-year students because the covered factors are those that are essential for student survival through the first few years of study. Third, if the study is conducted for second-year students, grades of general engineering courses can be included in the model for better estimation of students' GPAs. In addition, decision makers may need to consider the influence of other non-quantitative factors on academic achievement, such as learning preferences and personality styles.

Implications for practice
Several practical implications follow from this study.
5.3.1 A comprehensive tool for systematically gathering information on the student population. The "Building Blocks of Excellence" instrument provides a readily accessible tool for teachers and administrators. College administrators and professors can use it to gain insight into their students' attitudes and perceptions toward engineering at different times of the year. They can use the results to compare differences, measure growth or progress over time and plan for remedial or enhancement programs. Faculty may use the results of such a predictive model to guide curriculum development for courses offered to first-year engineering students (e.g. skill-building seminars).
5.3.2 An efficient predictive modeling technique. This study proposes an actionable, applicable and accurate technique for the prediction of student outcomes. Since the differences in R² and MSE between the two methods are not great, we recommend that either method can be used when there are a large number of variables/predictors.
However, when the predictors are not related to each other and the number of predictors is not more than 20, we can use the best subset selection technique (James et al., 2013). According to Blum et al. (2013), the best subset is conceptually simple but is difficult to manage for a large number of potential summary statistics. If we are more interested in capturing as much variance in the predicted outcome as possible than in finding how the data can be grouped based on their linear combinations, we may use the best subset. Moreover, no sample restrictions exist when using the best subset.
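The exhaustive search behind best subset selection can be illustrated with a minimal sketch. It assumes NumPy, synthetic data and adjusted R² as the selection criterion; the function name `best_subset` and the demonstration variables are hypothetical and are not taken from the study.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, max_size=None):
    """Fit OLS on every predictor subset and return (columns, adjusted R^2) of the best one."""
    n, p = X.shape
    max_size = p if max_size is None else max_size
    tss = float(np.sum((y - y.mean()) ** 2))  # total sum of squares
    best_cols, best_adj = (), -np.inf
    for k in range(1, max_size + 1):
        for cols in combinations(range(p), k):
            # Design matrix: intercept column plus the chosen predictors.
            A = np.column_stack([np.ones(n), X[:, list(cols)]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            adj_r2 = 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))
            if adj_r2 > best_adj:
                best_cols, best_adj = cols, adj_r2
    return best_cols, best_adj


# Synthetic demonstration (assumed data): only columns 0 and 2 drive the outcome.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 2] + 0.1 * rng.normal(size=200)
chosen, chosen_adj_r2 = best_subset(X_demo, y_demo)
```

With p predictors the loop fits 2^p − 1 models, which is why the approach is usually limited to a modest number of predictors.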
On the other hand, when we have a large number of variables, some of which are correlated, this correlation introduces redundancy in the information that can be gathered from the data set. For this reason, we may use PCA to transform the original variables into linear combinations of these variables that are uncorrelated with each other. However, sample restrictions exist for the successful implementation of PCA, which can be checked through the sampling adequacy procedure.
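As a minimal sketch of this transformation, the following assumes NumPy, synthetic correlated predictors and a decomposition of the correlation matrix; the function name `pca_scores` and the demonstration data are hypothetical and do not reproduce the study's data or software.

```python
import numpy as np


def pca_scores(X, n_components):
    """Return component scores and the share of variance each retained component explains."""
    # Standardize so that every predictor contributes on the same scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decomposition of the predictor correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    explained = eigvals[order] / eigvals.sum()
    return Z @ eigvecs[:, order], explained


# Synthetic demonstration (assumed data): four predictors forming two correlated pairs.
rng = np.random.default_rng(0)
base = rng.normal(size=(300, 2))
X_demo = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=300),
    base[:, 1],
    base[:, 1] + 0.1 * rng.normal(size=300),
])
scores, explained = pca_scores(X_demo, n_components=2)
score_corr = np.corrcoef(scores, rowvar=False)
```

In this example the two retained components are uncorrelated by construction and carry most of the variance of the four redundant predictors.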
Ridge regression is a robust technique that is not affected by outliers and, unlike the OLS method, does not require the dependent variable to follow a normal distribution. It is both a simple and reliable technique that can be used as an alternative to standard multiple regression when the predictors are highly correlated. It also provides a more stable and interpretable model.
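The stabilizing effect of the ridge penalty on correlated predictors can be sketched as follows. This is an illustration under stated assumptions (NumPy, synthetic data, the standard closed-form ridge estimator on standardized predictors); the function name `ridge_fit` and the penalty values are hypothetical and do not reproduce the study's fitted model.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge fit; returns (intercept, coefficients) on the original predictor scale."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd          # standardized predictors
    yc = y - y.mean()          # centered response, so the intercept is not penalized
    p = Z.shape[1]
    beta_std = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)
    beta = beta_std / sd       # back-transform to the original scale
    intercept = y.mean() - mu @ beta
    return intercept, beta


# Synthetic demonstration (assumed data): two almost collinear predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
X_demo = np.column_stack([x1, x2])
y_demo = x1 + x2 + 0.5 * rng.normal(size=200)

b0_ols, b_ols = ridge_fit(X_demo, y_demo, lam=0.0)       # lam = 0 reduces to OLS
b0_ridge, b_ridge = ridge_fit(X_demo, y_demo, lam=10.0)  # penalized fit
```

Setting `lam` to zero recovers the OLS solution, while a positive penalty shrinks the coefficient vector, which is what makes the estimates less sensitive to collinearity.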
5.3.3 A whole-institution approach to engineering students' success and retention. According to Thomas et al. (2017), institutions of higher education that want to achieve excellence in learning and teaching, and improve student experiences, persistence and success should "adopt an evidence-informed, whole-institution approach to achieve change" (p. 28). Their report provided the basis for a successful initiative or approach aimed at improving student outcomes.
The most important implication is that establishing an engineering success lab/center which involves staff, professors, graduate students and undergraduate engineering students would be beneficial. This success center would aim to improve students' mindsets along with their social and academic integration. Some of the targeted initiatives that can be offered in this center may include the following.
5.3.3.1 Math helpdesk. An important service that can be offered through this center includes a math helpdesk, where a student could arrange for one-on-one or group tutoring.
5.3.3.2 Targeted workshops and training sessions. On the basis of determinants or critical-to-success factors, planned or targeted workshops on goal setting, study skills, social skills and time-management skills could be organized for students.
5.3.3.3 Internship and job shadowing. One of the best practices that can increase student motivation, commitment and self-efficacy is providing early opportunities for internships and job shadowing. These opportunities can be achieved through collaboration and partnership with the industry, which can help students develop their engineering identities.
5.3.3.4 Increase students' academic and social engagement opportunities. Providing students with opportunities to conduct research alongside faculty members can also greatly increase their academic self-efficacy and integration. By contrast, social integration can be achieved by engaging students in community-service events or programs. Such events can be led and managed by the students themselves. Service events may be included as course requirements and supervised by professors or offered as general volunteering hours. The supervision and support for these programs should be provided by the engineering success center.
5.3.3.5 Extensive support for low-performing students. In collaboration with the academic advising unit, this center should be responsible for the identification of low-performing students. Progress counseling is an essential practice to support low-performing students; progress counseling should include recommendations and strategies to enhance success (e.g. attending a series of skill-building workshops or summer enrichment programs).
Supporting engineering students in their early years of study can create a positive atmosphere for all students to excel. As indicated by Tinto's theory, when students are successfully integrated socially and academically, they are more likely to persist and succeed. The advancement of our society, national security and quality of life is strongly linked to scientific and technological innovation and development. Paying more attention to the field of engineering education is a key component of a prosperous and innovative economy.

Conclusion
This study was based on a well-established theoretical background and validated measures, and was designed to test the predictive capability of a combination of cognitive and non-cognitive factors. Our cognitive and non-cognitive factors were inspired by the KAB approach. We used a student's HSG to represent "knowledge," a number of motivational, attitudinal and self-efficacy factors to represent "attitudes," and a number of skill-related behaviors to represent "behavior." Among these, students' HSGs, academic mindset and academic self-discipline were strongly associated with success in engineering courses. The use of a gender/group diverse sample guaranteed that our predictive model accounts for all students, including female and first-year students, who are the most prone to failure and withdrawal.
The results of this study suggest that vulnerable students can be identified as early as possible in the academic year. These students should be encouraged to engage more in their learning process. As educators, we should take greater responsibility for finding and utilizing all available information and opportunities to enhance the education of our students. Carrying out such a measurement at the beginning of the college year can provide college staff and administrators with valuable insights into students' perception of their own skills and into their attitudes toward engineering. The college academic advisor can use this information to develop individualized plans that guide students on how best to strengthen the factors that are critical to their success and retention. This early support would ensure a smooth transition of engineering undergraduates into the senior levels of study.
2.3.1 Cognitive factors.
2.3.1.1 HSG. A measure of student academic achievement or academic ability from high school.
\[
\begin{aligned}
\text{GPA}_{\text{Final}} = {} & -2.78441 + 0.0936042 \times \text{Gender} - 0.214495 \times \text{Student group} \\
& + 0.055593 \times \text{HSG} + 0.18744 \times \text{Commitment to college and degree} \\
& + 0.190047 \times \text{Social skills} - 0.160221 \times \text{Communication skills} \\
& + 0.232532 \times \text{Confidence in math and science} - 0.113575 \times \text{Teamwork skills} \\
& - 0.210142 \times \text{Time-management skills}
\end{aligned}
\]

Descriptive statistics
General descriptive statistics display an overview of the general demographics of participating students (Table II). The collected sample included a total of 144 (45.1 percent) first-year students and 175 (54.8 percent) second-year students. Male and female students accounted for 51.4 and 48.58 percent of the sample population, respectively. Descriptive statistics were carried out for the 11 predictive variables. Table III displays the mean, standard deviation and minimum and maximum values of each of the predictive variables.
p-value was (0.840). The second assumption is that correlations among factors are reliable.
, the best ridge parameter value is 0.1.
Table XI displays a summary of the regression coefficients corresponding to each value of the ridge parameter lambda (λ) (see Table III for more information on each variable). The best ridge parameters ranged from 0.1 to 0.55. The ridge parameter value of 0.1 clearly provided the best value of mean squared error as well as the highest R² value. A summary of the model results is shown in Table XII. In this case, the fitted ridge regression model is as follows:
Table XIII provides a comparative summary of the two models, from which a tentative conclusion about the efficiency of each model can be drawn. The first model used PCA to reduce the dimensions of the variables into four components. The second used a regression best-subset model. The first model yielded a better adjusted R² value and a lower MSE. The second model yielded a better R² value.
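The search over ridge parameter values can be sketched as a simple grid comparison of prediction error. The text does not state how the MSE for each λ was computed, so this sketch makes several assumptions: NumPy, synthetic data, five-fold cross-validation, a grid from 0.10 to 0.55 in steps of 0.05 and a penalty applied to the correlation-scaled cross-product matrix. All function names are hypothetical.

```python
import numpy as np


def ridge_coefs(Z, yc, lam):
    """Ridge coefficients for standardized predictors Z and centered response yc."""
    n, p = Z.shape
    # Correlation-scaled cross-products, so lam is comparable across sample sizes (assumption).
    return np.linalg.solve(Z.T @ Z / (n - 1) + lam * np.eye(p), Z.T @ yc / (n - 1))


def cv_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of ridge regression under k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        # Standardize with training statistics only, to avoid leaking the held-out fold.
        mu, sd = X[train].mean(axis=0), X[train].std(axis=0, ddof=1)
        y_mean = y[train].mean()
        beta = ridge_coefs((X[train] - mu) / sd, y[train] - y_mean, lam)
        pred = y_mean + ((X[fold] - mu) / sd) @ beta
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errors))


def select_lambda(X, y, grid):
    """Return the grid value with the lowest cross-validated MSE, plus all scores."""
    scores = {float(lam): cv_mse(X, y, lam) for lam in grid}
    return min(scores, key=scores.get), scores


# Synthetic demonstration (assumed data): six correlated predictors, 120 observations.
rng = np.random.default_rng(2)
shared = rng.normal(size=(120, 1))
X_demo = shared + 0.5 * rng.normal(size=(120, 6))
y_demo = X_demo @ np.array([0.5, 0.0, 0.3, 0.0, 0.2, 0.0]) + rng.normal(size=120)

grid = np.round(np.linspace(0.10, 0.55, 10), 2)
best_lam, mse_by_lam = select_lambda(X_demo, y_demo, grid)
```

Which λ wins depends on the data and on the scaling convention, so the value selected here carries no implication for the value reported in the study.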