“Automatic” interpretation of multiple correspondence analysis (MCA) results for nonexpert users, using R programming

Purpose – The purpose of this paper is to create an automatic interpretation of the results of multiple correspondence analysis (MCA) for categorical variables, so that the nonexpert user can immediately and safely interpret the results, which concern the categories of variables that strongly interact and determine the trends of the subject under investigation. Design/methodology/approach – This study is a novel theoretical approach to interpreting the results of the MCA method. The classical interpretation of MCA results is based on three indicators: the projection (F) of the category points of the variables on the factorial axes, the point contribution to axis creation (CTR) and the correlation (COR) of a point with an axis. The synthetic use of the aforementioned indicators is arduous, particularly for nonexpert users, and frequently results in misinterpretations. The current study has achieved a synthesis of the aforementioned indicators, so that the interpretation of the results is based on a single new indicator, just as the well-known method of principal component analysis (PCA) for continuous variables is based on a single index. Findings – Two concepts were proposed in the new theoretical approach: the interpretative axis, corresponding to the classical factorial axis, and the interpretative plane, corresponding to the factorial plane, which, as will be seen, offer clear and safe interpretative results in MCA. Research limitations/implications – In the proposed automatic interpretation of the MCA results, the interpretative axes do not carry the actual projections of the points, as is the case on the original factorial axes. This is of no concern to the simple user, who is only interested in distinguishing the categories of variables that determine the interpretation of the most pronounced trends of the phenomenon being examined.
Practical implications – The results of this research can have positive implications for the dissemination of MCA as a method and its use as an integrated exploratory data analysis approach. Originality/value – Interpreting the MCA results presents difficulties for the nonexpert user and sometimes leads to misinterpretations. The interpretative difficulty persists in the other interpretative proposals for MCA. The proposed method of interpreting the MCA results allows for a clear and accurate interpretation of its results and thus contributes to the dissemination of MCA as an integrated method of categorical data analysis and exploration.


Introduction
Dimension reduction methods seek to reduce the number of dimensions [1,2] in the variable space whilst preserving the most important structure or relationships between the variables, i.e. without significant loss of information [3]. They also have the advantage of handling and visualizing the results of complex and massive amounts of data [4][5][6][7]. Principal Component Analysis (PCA) is a popular method for performing dimension reduction [8] of a set of continuous variables and an effective approach to capturing characteristics [9]. Unlike feature selection [10], it aims to identify those variables that contribute most to the creation of new, composite variables, known as principal components or dominant factorial axes. This is achieved via the diagonalization of a symmetric correlation or covariance matrix [11]. To identify which of the original variables contribute most to the creation of each principal component, it suffices to use the coordinates of the projections [1] of the variable points on each factorial axis, since these coordinates express the correlation coefficients of each variable with the axis.
In this paper we focus on Multiple Correspondence Analysis (MCA), a generalization of PCA for categorical data and a generalization of simple correspondence analysis [12], which is widely used in various scientific fields such as marketing, psychology, health, economics, management and others [13]. The goal of MCA is to describe the associations between the categories of two or more nominal variables in a low-dimensional space containing these categories. When interpreting the results of MCA, users have to identify which column categories contribute most to the definition of the factorial axes. MCA has been used as a first step for reducing predictors in classification problems [2], for obtaining new coordinates for Hierarchical Clustering on Principal Components [14], and even as a method for meta-analyzing literature findings in marketing [15,16]. Proper interpretation of the MCA results can often be a difficult task for nonexpert users, and consequently it may lead to misinterpretations [17]. The purpose of this paper is to provide nonexpert users with an "automatic" and clear interpretation of the most important points of the MCA results, via an alternative visualization scheme based on the construction of the so-called interpretive axes and the corresponding interpretive factorial planes. These proposals narrow the existing research gap and constitute the original contribution of the study. They eliminate the need for the user to examine and evaluate the tabular MCA output and to look up numbers and statistics for which additional calculations are frequently required. The proposed scheme is similar to the one used in the context of PCA, so users familiar with PCA can easily comprehend it. The novelty of this study is the discovery of a geometrical locus of points on the so-called interpretive plane that improves on current alternative approaches to interpreting the most important points of a factorial plane.
The paper is organized as follows. Section 2 presents the basic concepts of MCA. Section 3 reviews the corresponding literature and discusses the research gap and the alternatives that address the problem. The interpretive axis and the interpretive plane for visualizing MCA results are introduced in Section 4. Section 5 demonstrates the proposed approach on a real data set and compares results. Section 6 presents the conclusions, discussion, limitations and implications of this research.

Basic MCA concepts
Let $X$ be an $n \times s$ data matrix, where $n$ objects or individuals are characterized by $s$ nominal variables $X_k$. Given that each variable $X_k$ in the original data matrix has $r_k$ categories, the data matrix is transformed into the so-called indicator or 0-1 matrix, $Z$. Each object in $Z$ receives a value of 1 in a single category of each variable (presence) and a value of 0 (absence) in the other categories of that variable. The indicator matrix is of size $n \times p$, where $p = \sum_{k=1}^{s} r_k$ [2] is the total number of categories of all $s$ nominal variables. The sum of each row of the indicator matrix is $s$, i.e. the total number of variables, while the sum of each column (variable category) is generally different and gives the number of objects that fall in that category. The grand total of the matrix, over rows or over columns, is therefore $n \cdot s$. The values (0 or 1) of each column divided by the column sum are known as column profiles [12]. The set of column profiles is the cloud $N(J)$ of category points [18]. The category profiles are points of the vector space $\mathbb{R}^n$ ($n$ being the number of objects). The sum of column $j$ of $Z$ is denoted $K_{\cdot j}$ and the sum of row $i$ is denoted $K_{i \cdot}$. Dividing $K_{\cdot j}$ by the grand total $n \cdot s$ gives the mass or weight of category $j$, denoted $m_j$.
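The construction of the indicator matrix, the column sums and the masses can be sketched in a few lines. The following is a minimal illustration in plain Python, not the R code used for the paper; the toy data set and the helper name `indicator_matrix` are hypothetical and serve only to make the definitions concrete.

```python
# Toy data: n = 4 objects described by s = 2 nominal variables (hypothetical values).
data = [
    {"A": "a1", "B": "b1"},
    {"A": "a2", "B": "b1"},
    {"A": "a1", "B": "b2"},
    {"A": "a3", "B": "b2"},
]


def indicator_matrix(rows):
    """Return (Z, labels): the 0-1 indicator matrix and its column labels."""
    variables = list(rows[0].keys())
    # One column per (variable, category) pair, in a fixed order.
    labels = [f"{v}_{c}" for v in variables for c in sorted({r[v] for r in rows})]
    Z = []
    for r in rows:
        present = {f"{v}_{r[v]}" for v in variables}
        Z.append([1 if lab in present else 0 for lab in labels])
    return Z, labels


Z, labels = indicator_matrix(data)
n, s, p = len(Z), len(data[0]), len(labels)

row_sums = [sum(row) for row in Z]                        # K_i. (always equal to s)
col_sums = [sum(row[j] for row in Z) for j in range(p)]   # K_.j
masses = [k / (n * s) for k in col_sums]                  # m_j = K_.j / (n * s)
# Column profiles: each column divided by its own sum.
profiles = [[Z[i][j] / col_sums[j] for i in range(n)] for j in range(p)]

print(labels)
print(col_sums, masses)
```

For this toy table every row sums to s = 2, the column sums add up to n·s = 8 and the masses add up to 1.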
The column profile points together with their masses constitute the cloud of column profiles and carry the information of the data. The center of gravity of the column points is easily shown to be the $n$-tuple $g = \left(\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n}\right)$. Another central concept of MCA is inertia [12]. Let $j$ be a category profile point of a variable with column sum $K_{\cdot j}$. The squared distance $d^2_{\chi^2}(j, g)$ of the profile point $j$ from the center of gravity $g$ is defined with the chi-square distance and is given by $d^2_{\chi^2}(j, g) = \frac{n}{K_{\cdot j}} - 1$. The inertia $I_j$ of the point $j$ with respect to the center of gravity $g$ is $I_j = m_j \cdot d^2_{\chi^2}(j, g)$, and it can easily be shown that $I_j = \frac{1}{s}\left(1 - \frac{K_{\cdot j}}{n}\right)$. The total inertia $I$ of all $p$ category profile points is then $I = \frac{p}{s} - 1$ [12]. That is, the inertia of the points depends both on the total number of variable categories, $p$, and on the total number of categorical variables, $s$, in contrast to PCA, where the corresponding inertia is $s$, the number of continuous variables. At the core of MCA is the investigation of the structure (form) of the cloud of points $N(J)$, that is, the determination of the main directions of inertia. These directions are perpendicular to one another, pass through the center $g$ of the cloud and optimally describe the cloud $N(J)$. These axes, known as factorial axes, are obtained via the decomposition of a special variance-covariance matrix and are automatically sorted in descending order of their associated eigenvalues, $\lambda_1 > \lambda_2 > \lambda_3 > \ldots$. The inertia of a point $j$ along a factorial axis $a$ is its mass multiplied by the squared Euclidean distance $d^2(F_a(j), g)$, where $F_a(j)$ is the projection (coordinate) of point $j$ on the factorial axis $a$. The sum of the inertias of all points along the factorial axis $a$, also known as the inertia along the axis or the inertia interpreted by the axis, is equal to the eigenvalue $\lambda_a$ [19].
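The inertia identities can be checked numerically from the definitions alone. The sketch below is plain Python with hypothetical column sums and function names; it computes the chi-square distance of each column profile from the center of gravity by weighting every squared difference by n (the inverse of the row mass 1/n) and then confirms that the inertias add up to p/s − 1.

```python
def chi2_distance_to_centroid(k_j, n):
    """Squared chi-square distance of a column profile from g = (1/n, ..., 1/n).

    The profile has k_j entries equal to 1/k_j and n - k_j entries equal to 0;
    each squared difference is weighted by 1 / (1/n) = n.
    """
    return n * (k_j * (1 / k_j - 1 / n) ** 2 + (n - k_j) * (1 / n) ** 2)


def category_inertias(col_sums, n, s):
    """Inertia of each category: mass times squared chi-square distance."""
    return [(k / (n * s)) * chi2_distance_to_centroid(k, n) for k in col_sums]


# Hypothetical column sums for n = 4 objects, s = 2 variables, p = 5 categories.
col_sums, n, s = [2, 1, 1, 2, 2], 4, 2
p = len(col_sums)

inertias = category_inertias(col_sums, n, s)
total_inertia = sum(inertias)

print(inertias)
print(total_inertia, p / s - 1)
```

Under these assumptions the distance of each profile reduces to n/K.j − 1, which is why rare categories lie far from the center of gravity even though their mass is small.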
For example, the inertia along the first factorial axis, or the inertia interpreted by this axis, equals the largest eigenvalue, $\lambda_1$. In practice, we usually interpret only the first few and most important factorial axes, which results in a loss of information. The number of important factorial axes can be determined using criteria such as the scree plot and the percentage of explained inertia (variance). The points of interpretive interest on each axis are those which contribute most to the total inertia of the axis. The standard MCA output includes the standard visualization obtained from the projection of all the points on the factorial axes and/or the factorial planes (e.g. the map created by the first and second factorial axes). These plots, however, do not allow for a direct interpretation of the most important points. Determining the most important points requires considering their inertia along each axis. Based on the (M)CA literature, one can rely on the following aids to interpretation [20,21] in the form of numerical diagnostics:
(a) the coordinate (projection) of point $j$ on a factorial axis $a$, $F_a(j)$;
(b) the COR index, or squared correlation of point $j$ with axis $a$, $\mathrm{COR}_a(j)$. The COR index expresses the share of the inertia of point $j$ explained by axis $a$, that is, $\mathrm{COR}_a(j) = \frac{F_a^2(j)}{d^2_{\chi^2}(j, g)}$;
(c) the CTR index. The total inertia along an axis, or equivalently the inertia interpreted by axis $a$, is denoted by $\lambda_a$. This total inertia is the sum of the inertias of all points in the direction of the axis $a$. The ratio of the inertia of the point $j$ in the direction of the axis $a$ to the total inertia of the axis $a$ is called the contribution of the point $j$ and is given by $\mathrm{CTR}_a(j) = \frac{m_j \cdot F_a^2(j)}{\lambda_a}$ [22].
The points with the highest CTR values contribute most to the construction of the axis, so it is on these points that the possible interpretation of the axis is based. The combined use of the aforementioned indicators is sufficient for interpretation, but a nonexpert user without this knowledge can easily be misled.
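To make the three diagnostics concrete, the sketch below computes the eigenvalues, CTR and COR from masses and coordinates. It is a plain Python illustration with hypothetical numbers (three points, two axes, assumed to be all the axes of the solution), and it takes COR as the squared coordinate divided by the squared distance of the point from the center of gravity, the usual squared-cosine definition.

```python
# Hypothetical masses and factorial coordinates F_a(j) for three category points.
masses = [0.5, 0.25, 0.25]
F = [(1.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]
n_axes = len(F[0])

# Inertia along axis a: lambda_a = sum_j m_j * F_a(j)^2.
eigenvalues = [sum(m * f[a] ** 2 for m, f in zip(masses, F)) for a in range(n_axes)]

# Squared distance of each point from the center of gravity
# (sum of squared coordinates over all axes, assumed here to be these two).
d2 = [sum(x ** 2 for x in f) for f in F]

# CTR_a(j) = m_j * F_a(j)^2 / lambda_a  (share of the axis inertia due to point j).
ctr = [[m * f[a] ** 2 / eigenvalues[a] for a in range(n_axes)] for m, f in zip(masses, F)]

# COR_a(j) = F_a(j)^2 / d^2(j, g)  (share of the point's inertia explained by axis a).
cor = [[f[a] ** 2 / d for a in range(n_axes)] for f, d in zip(F, d2)]

print(eigenvalues)
print(ctr)
print(cor)
```

CTR sums to 1 over the points of each axis, whereas COR sums to 1 over the axes of each point, which is why the two indices answer different questions and have to be read together.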

Literature review: alternative approaches to visualizing important points of the MCA maps
The problem of interpreting the results of MCA through visualizations has been a subject of interest, as a number of published studies show [23][24][25][26]. Of special interest is the effort to interpret the most important points on a factorial plane [2][24][25][26]. Approaches addressing the correct interpretation of factorial maps (importance [24] or proximity of points [2]) range from symbolic means and point coloring to mathematical transformations. Even though MCA has been implemented in various software and programming languages (e.g. SPSS, SPAD, Python), in this study we focus on R. In R there are packages that perform MCA and visualize its results, and others that produce visualizations aiming to help users interpret the results. FactoMineR [14] is a package that computes the MCA results and can produce the classical factorial maps. Users can also color points according to their contributions, which shifts the problem of distinguishing the most important points onto an additional visual variable. CAinterprTools [23] is an R package that uses additional visual aids (dot plots, scatter plots) to help users. However, the user has to resort to additional graphical and numerical information, which is not a good solution for nonexpert users. The ca package [27] implements the research of [24,25] on asymmetric biplots [26]; through its functions the user can pass parameters that transform the points by multiplying their standard coordinates by their corresponding masses. Consequently, the points in the visualizations carry mass information, which on a single factorial axis is adequate for immediately extracting the most important points. On the factorial plane the points retain and generalize this informative character, but one issue arises.
Because the printed arrows (which the ca package uses to connect each point with the origin of the axes) do not lie on the same axis, their lengths must be measured in order to interpret the most important points of the plane safely. This is a drawback, which our proposed interpretive plane resolves. We also underline the findings of [2], where the authors presented a solution to the problem of interpreting the proximity of points on factorial maps by defining a "tolerance distance".

Methodology: proposal of interpretive axes and interpretive planes
In this section we introduce the notions of the interpretive axis and the interpretive plane. We consider that each factorial axis corresponds to an interpretive axis that combines all the interpretive information of the coordinates $F$, the COR index and the CTR index. This interpretive axis is defined as follows: the coordinate $e_a(j)$ of a point $j$ on the interpretive axis $a$ is given by $e_a(j) = \mathrm{sign}(F_a(j)) \cdot \lambda_a \cdot \mathrm{CTR}_a(j)$, where $\mathrm{sign}(F_a(j))$ denotes the sign of the point's coordinate on factorial axis $a$. Here the product $\lambda_a \cdot \mathrm{CTR}_a(j)$ is the inertia of the point $j$ in the direction of the factorial axis $a$. Since $e_a(j)$ has the same sign as $F_a(j)$, we retain the information about the direction of each point on the axis. The next step is to determine which points of the interpretive axis are the most important; these are also the most important points for the interpretation of the corresponding factorial axis. Important interpretive points $j$ should satisfy the following conditions.
1st condition: $|e_a(j)| > \frac{\lambda_a}{p} \cdot 100$, where $\lambda_a$ is the inertia of axis $a$ and $p$ the total number of categories. That is, important points on the axis are initially those whose inertia contribution is above the average inertia contribution in the direction of the axis. We multiply by 100 since most software packages report the contributions in %.
2nd condition: Since we consider that, for a nonexpert user, it is better for an important point to be assigned to a single factorial axis among the first $k$ factorial axes selected by the user (e.g. the first $k = 5$ factorial axes with the largest percentage of variance), we additionally require that $\mathrm{COR}_a(j) = \max_k \mathrm{COR}_k(j)$, where $a$ is the candidate preferred axis and $k$ ranges over all the selected axes on which point $j$ satisfies the 1st condition. Conditions 1 and 2 are applied first to the 1st factorial axis, then to the 2nd, and so on.
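The two conditions for an interpretive axis can be sketched as a small selection routine. This is a plain Python illustration, not the authors' R implementation; the function names and all numbers are hypothetical, CTR is assumed to be given in percent, and the point-by-point assignment is one reading of applying the conditions axis by axis.

```python
import math


def interpretive_coordinate(f, ctr_pct, eigenvalue):
    """e_a(j) = sign(F_a(j)) * lambda_a * CTR_a(j), with CTR given in percent."""
    return math.copysign(eigenvalue * ctr_pct, f)


def assign_points_to_axes(labels, F, ctr_pct, cor, eigenvalues):
    """Assign each important point to the single axis on which its COR is largest."""
    p = len(labels)
    assigned = {a: [] for a in range(len(eigenvalues))}
    for j, label in enumerate(labels):
        # 1st condition: inertia above the average contribution on that axis.
        candidates = [
            a for a, lam in enumerate(eigenvalues)
            if abs(interpretive_coordinate(F[j][a], ctr_pct[j][a], lam)) > 100 * lam / p
        ]
        if candidates:
            # 2nd condition: keep only the candidate axis with the largest COR.
            best = max(candidates, key=lambda a: cor[j][a])
            assigned[best].append(label)
    return assigned


# Hypothetical output for p = 4 categories on the first two axes.
labels = ["A_1", "A_2", "B_1", "B_2"]
eigenvalues = [0.40, 0.30]
F = [(1.2, 0.1), (-0.9, 0.8), (0.2, -1.1), (-0.3, 0.2)]
ctr_pct = [(45, 2), (35, 40), (5, 50), (15, 8)]
cor = [(0.90, 0.01), (0.40, 0.45), (0.03, 0.85), (0.20, 0.10)]

important = assign_points_to_axes(labels, F, ctr_pct, cor, eigenvalues)
print(important)
```

In this toy case A_2 exceeds the threshold on both axes but has the larger COR on the second one, so it is reported only there, while B_2 is not reported on either axis.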
The second condition addresses cases such as a point that satisfies the 1st condition for both the 1st and the 2nd factorial axis and at the same time has the largest share of its inertia (COR) on the 2nd factorial axis. In this case the point should be interpreted as important on the second factorial axis. Consequently, points that satisfy both conditions, applied in the sequence explained, are considered the most important for the interpretation of the factorial axis. Therefore, the interpretive axis $a$ allows us to evaluate the points with the largest interpretive weight for a factorial axis based on the value of a single index, the interpretive coordinate $e_a(j)$. We can now generalize the notion of the interpretive axis to the interpretive plane, and specifically to the first plane, which is usually of the greatest interest. The first interpretive plane is created by the first and the second interpretive axes. Important interpretive points $j$ on the first interpretive plane, which are the most important points of the first factorial plane, should satisfy the following conditions.
1st condition: $|e_1(j)| + |e_2(j)| > \frac{\lambda_1 + \lambda_2}{p} \cdot 100$, where $\lambda_1, \lambda_2$ are the inertias of axes 1 and 2, respectively. This condition is satisfied at least by all the points that already satisfy condition 1 for both the first and the second interpretive axes.
2nd condition: $\mathrm{COR}_1(j) + \mathrm{COR}_2(j) = \max_{k,l}\left(\mathrm{COR}_k(j) + \mathrm{COR}_l(j)\right)$, where $k, l$ are the axes that form a plane $k \times l$, taken over all pairs of axes among the first $q$ factorial axes selected by the user for which point $j$ satisfies the 1st condition. Points that satisfy the 1st and 2nd conditions are considered the most important points of the first plane.
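The plane version of the two conditions can be sketched in the same way. The code below is a plain Python illustration with hypothetical interpretive coordinates and COR values on q = 3 axes; the function name `first_plane_points` is an assumption, and each candidate plane is tested against its own threshold, which is one reading of the conditions stated above.

```python
from itertools import combinations


def first_plane_points(labels, e, cor, eigenvalues, q):
    """Return the points considered most important on the first plane (axes 1 x 2)."""
    p = len(labels)
    important = []
    for j, label in enumerate(labels):
        # 1st condition, checked for every plane k x l among the first q axes.
        candidates = [
            (k, l) for k, l in combinations(range(q), 2)
            if abs(e[j][k]) + abs(e[j][l]) > 100 * (eigenvalues[k] + eigenvalues[l]) / p
        ]
        if not candidates:
            continue
        # 2nd condition: the first plane must give the largest COR sum.
        best = max(candidates, key=lambda kl: cor[j][kl[0]] + cor[j][kl[1]])
        if best == (0, 1):
            important.append(label)
    return important


# Hypothetical interpretive coordinates e_a(j) and COR values on three axes.
labels = ["A_1", "A_2", "B_1", "B_2"]
eigenvalues = [0.40, 0.30, 0.20]
e = [(18.0, 0.6, 1.0), (-14.0, 12.0, -2.0), (2.0, -15.0, 14.0), (-6.0, 2.4, -3.0)]
cor = [(0.90, 0.05, 0.02), (0.40, 0.45, 0.05), (0.03, 0.50, 0.45), (0.20, 0.10, 0.15)]

plane_12 = first_plane_points(labels, e, cor, eigenvalues, q=3)
print(plane_12)
```

Here B_1 narrowly misses the threshold of the first plane and is better represented on the plane of axes 2 and 3, so it is not listed among the important points of the first plane.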
Now, notice that all the points $j$ on an interpretive plane with $|e_1(j)| + |e_2(j)| = c$, with $c$ a constant, lie on the perimeter of a square KLMN centered at the origin O, with its diagonals lying on the interpretive axes and its semi-diagonal equal to $(OK) = c$. This is shown in Figure 1.
Proof: consider the point $j$ at position A, with coordinates $(e_1(j), e_2(j))$. Its projection on the first interpretive axis (position B) lies at distance $|e_1(j)|$ from the origin O. We then take the point M on the first interpretive axis (Figure 1), on the same side of O as B, so that its distance from the origin O is $c$. Since $|e_1(j)| + |e_2(j)| = c$, the length of the line segment BM is $c - |e_1(j)| = |e_2(j)|$, which is also the length of BA. Therefore, the triangle MBA is an isosceles right triangle, so the segment MA forms a 45° angle with the first interpretive axis and A lies on a side of the square KLMN. It follows that a point $j'$ on the square K'L'M'N' with $(OK') = c' > c$ is more important for interpretation than point $j$ on the first factorial plane. This is shown in Figures 1 and 2. The squares in Figures 1 and 2 are important for the interpretation. More specifically, the squares on the interpretive plane allow the user to compare the contributions of the points directly. For example, points that belong to the same square have the same contribution regardless of their coordinates. Consequently, the points that are important for the interpretation of the first factorial plane lie on or near the most distant squares formed by the points of the first plane. This visualization of the interpretive plane eliminates the need to resort to the values of COR and CTR or to an asymmetric biplot with vector lengths.
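The geometrical locus can also be checked numerically. The sketch below, in plain Python with hypothetical points and helper names, builds the square whose vertices lie on the interpretive axes at distance c from the origin and measures how far a point is from its perimeter; points with |e1| + |e2| = c should be at distance zero.

```python
import math


def square_vertices(c):
    """Vertices of the square with semi-diagonal c and diagonals on the axes."""
    return [(c, 0.0), (0.0, c), (-c, 0.0), (0.0, -c)]


def distance_to_segment(pt, a, b):
    """Euclidean distance from point pt to the line segment ab."""
    (px, py), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_square(pt, c):
    """Distance from pt to the perimeter of the square with semi-diagonal c > 0."""
    v = square_vertices(c)
    return min(distance_to_segment(pt, v[i], v[(i + 1) % 4]) for i in range(4))


# Hypothetical interpretive coordinates (e1, e2); the first four share c = 4.
points = [(3.0, 1.0), (-2.0, 2.0), (0.0, -4.0), (1.5, -2.5), (1.0, 1.0)]
for e1, e2 in points:
    c = abs(e1) + abs(e2)
    print((e1, e2), "c =", c, "distance to square c=4:", distance_to_square((e1, e2), 4.0))
```

The last point has c = 2 and therefore lies strictly inside the square with c = 4, on a smaller concentric square of its own.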

Applications
In this section we illustrate the visualizations presented in the previous section and compare them with existing alternatives using the "wg93" dataset, which can be found in the ca package [27]. More information on the dataset can be found in chapter 2 of [12]. MCA was performed with the FactoMineR package [14], while the libraries "tidyverse" [28], "ggplot2" [29][30][31], "ggrepel" [32], "plotly" [33], "CAinterprTools" [23], "factoextra" [34], "shiny" [35], "DT" [36], "this.path" [37], "soc.ca" [38] and "egg" [39] were also used to produce content found either in the manuscript or in the supplementary material. At the end of the manuscript there is a link to the supplementary material, which reproduces all the visualizations discussed in this paper and provides data tables with numerical evidence for verification. In this section we compare the proposed visualizations with the classical factorial maps, and we also compare the proposed first interpretive plane with what we consider the best alternative among the approaches discussed in the literature section. We encourage readers to download and explore the supplementary material, because it is important for the completeness of the paper.
[Figure: Most important points of interpretive plane with "the square" view visualization]
In Figure 3 we observe that, on the first factorial axis, point A_5 is the most distant point on the right side of the axis, with the maximum F value but not the maximum CTR value among the points of the right side (A_5, F: 1.64, CTR: 7.32, e: 2.11), while point B_5, which has the third largest F value on the right side, is the one with the maximum CTR value (B_5, F: 1.10, CTR: 9.71, e: 2.80). The same occurs between points B_1 and C_1 on the left side of the axis. This can mislead a nonexpert user into an erroneous interpretation of the points contributing most to this axis and an erroneous overall characterization of the factorial axis. Our proposed first interpretive axis, by contrast, orders the points directly by their inertia along the axis, so the most important points of the first factorial axis can be read off without error.
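The reversal between the two rankings can be reproduced from the values quoted for A_5 and B_5 alone. The short Python sketch below uses only those reported numbers; the ratio e/CTR is computed merely as a consistency check on the definition of the interpretive coordinate and is not a value stated in the paper.

```python
# Values reported for the first axis (F, CTR in %, interpretive coordinate e).
points = {
    "A_5": {"F": 1.64, "CTR": 7.32, "e": 2.11},
    "B_5": {"F": 1.10, "CTR": 9.71, "e": 2.80},
}

# Reading the factorial axis: the farthest point is taken as the most important.
farthest_on_factorial_axis = max(points, key=lambda k: abs(points[k]["F"]))

# Reading the interpretive axis: the farthest point has the largest inertia.
farthest_on_interpretive_axis = max(points, key=lambda k: abs(points[k]["e"]))

# Since e = sign(F) * lambda * CTR, the ratio e / CTR should be the same for both points.
implied_lambda = {k: v["e"] / v["CTR"] for k, v in points.items()}

print(farthest_on_factorial_axis, farthest_on_interpretive_axis)
print(implied_lambda)
```

The two ratios agree up to the rounding of the reported values, which is what the definition of the interpretive coordinate leads one to expect.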
In Figure 4 we compare the classic first factorial plane with the proposed first interpretive plane, which implements the geometrical locus of points (squares). On the first factorial plane, some of the most distant points are C_5 and B_1. Here, to make a successful interpretation, a user must manipulate the MCA output and perform additional calculations to extract the inertia of each point on each of the two axes before different points can be compared. Therefore, a nonexpert user could easily be led to an erroneous interpretation, considering C_5 and B_1 as the most important points. On the interpretive plane (Figure 4), however, points C_5 (total inertia on first factorial plane: 3.39) and B_1 (total inertia on first factorial plane: 4.75) are less important than, for example, points C_1 (total inertia on first factorial plane: 5.71) and B_5 (total inertia on first factorial plane: 5.00), which are in fact the two most important points of the first plane. In our visualization, point C_1 is located on the most distant square and point B_5 on the second most distant square, and this observation "automatically" gives the correct interpretation. Readers can make similar comparisons for other points as well.
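The ordering of the squares follows directly from the reported plane inertias. A minimal Python sketch using only the four values quoted above is given below; treating each value as the size c of the square on which the point lies is an assumption based on the definition of the interpretive plane.

```python
# Total inertia on the first factorial plane, as reported for four points.
plane_inertia = {"C_5": 3.39, "B_1": 4.75, "C_1": 5.71, "B_5": 5.00}

# A larger value means a more distant square, hence a more important point.
ranking = sorted(plane_inertia, key=plane_inertia.get, reverse=True)

for rank, label in enumerate(ranking, start=1):
    print(rank, label, plane_inertia[label])
```

The sorted list places C_1 and B_5 ahead of B_1 and C_5, which matches the order of the squares described for Figure 4.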
In Figure 5 we observe two first factorial planes; the one on the right incorporates the asymmetric biplot theory with Greenacre's transformation (contribution biplots). In this transformation, points in standard coordinates are multiplied by the square root of the corresponding masses. This is a different transformation from ours, but we acknowledge that it improves on the basic factorial plane visualization and on the other approaches discussed in the literature section. Compared with ours, this visualization lacks the crucial geometrical locus of points, which eliminates the need for any further calculation to decide on the importance of a point. In Figure 5 (right plot), the lengths of arrows that look similar need to be measured (with some calculation) and then compared. For example, points B_5 and B_1 have visually similar arrow lengths, so it is hard to tell which point contributes more without numerical evidence about those lengths. By contrast, our proposed interpretive plane, which incorporates the squares as a geometrical locus of points, is an improvement over the approach depicted in Figure 5. On our proposed visualization (Figure 4, middle plot), points B_5 and B_1 are easily compared, since B_5 lies on a more distant square than B_1 and is therefore more important than B_1 on the first factorial plane. For a correct interpretation, our proposed visualization requires no additional calculations, only observation.

Conclusions
This paper proposes a new visualization scheme through the introduction of the interpretive coordinate, the interpretive axis and the interpretive plane, which address the problem of finding and interpreting the most important points on MCA factorial maps by nonexpert users. Several scholars have pointed out this research gap and provided corresponding solutions. We presented these solutions and compared them with our proposal, and we concluded that our interpretive plane with the squares is a quicker and overall better way to find and interpret the most important points of a factorial plane. The originality of our work is the discovery of the geometrical locus of points, which provides immediate visual identification of the most important points. In short, the further a point is from the origin of the interpretive axis, the more important it is for the factorial axis, and the more distant the square on whose perimeter a point of the interpretive plane lies, the more important that point is for the factorial plane. This work can have practical implications by disseminating the use of MCA to a wider audience, while it opens a new window for theoretical research on the geometrical relations of the points in factorial maps. Interpretive coordinates, as the outcome of a transformation, cannot be used as input to other analysis methods (e.g. hierarchical clustering), so users must use the original MCA coordinates for such purposes; this can be considered a limitation. Future research involves further theoretical investigation of the geometrical relationships of the points on factorial maps and the provision of Information Technology (IT) tools that will help to educate users about this new visualization scheme.
[Figure 5: First factorial plane (symmetric plot) and first asymmetric biplot with Greenacre's transformation using the ca package]