Gender pay gap in Vietnam: a propensity score matching analysis

Purpose – This paper investigates the extent of, the determinants of and the change in the gender pay gap in Vietnam in the period 2010–2016 in order to provide suggestions for policy adjustment to narrow gender pay inequality more effectively. Design/methodology/approach – This study employs the propensity score matching (PSM) method to examine inequality in pay between female and male earners sharing identical characteristics. The analysis is conducted for both the full sample and various characteristic-based subsamples. This procedure is conducted for 2010 and 2016 separately to identify the change in gap and inequality during this period. Findings – The matching results based on the data sets taken from the Vietnam Household Living Standards Survey (VHLSS) 2010 and 2016 affirm that gender income inequality in Vietnam, although persistent, decreased significantly in 2016 compared to 2010, and was insignificant in many subsamples in 2016. In addition to the observable determinants including educational level, occupation, economic sector and industry, unobservable factors are shown to also play an important role in creating the gender pay gap in Vietnam. Practical implications – The research findings suggest that policies aimed at mitigating gender pay inequality should take into account both observable characteristics and unobservable factors such as unobservable gender differences that affect wages and gender discrimination in pay. Originality/value – This is the first study using a matching technique to investigate the gender wage gap in Vietnam. With up-to-date data, a longer research period and the superiority of the method used in dealing with sample selection bias, the results obtained are more robust, more detailed and more reliable.


Introduction
Labor income not only plays a particularly important role in ensuring the life of laborers and reproducing labor power but also serves as a management tool to improve labor quality and productivity. If labor income is distributed equally among individuals in society, it will be a driving force for sustainable growth.
In fact, gender inequality in earnings is very common. Across the world, women make only 77 cents for every dollar that men earn. Even in jobs that require the same or greater effort and skill, women are still undervalued and underpaid. For women of color, immigrant women and mothers, the pay gap is even larger. The so-called "motherhood penalty" pushes women into the informal economy and into casual and part-time jobs, and this trend is more common and more pronounced in developing countries than in developed ones (UN Women, 2020). There are many causes of this situation; the root cause lies in the traditional views and stereotypical prejudices against women in feudal societies, which limit women's opportunities to access education and training, choose careers and improve their professional qualifications. In addition, women have fewer opportunities than men to access the basic services and resources needed to assert their economic position, such as transportation, markets and funds.
In Vietnam, the current system of laws and policies has created a relatively comprehensive legal framework to ensure gender equality in economic activities, labor supply and employment. Although Vietnam has made remarkable achievements in narrowing gender inequality, the gender pay gap still exists. Statistics indicate that the labor force participation rate of females is increasing and remains high compared to other countries in the region and in the world: it rose from 76.92% in 2010 to 78.63% in 2016 (ILOSTAT, 2020). The average monthly salary of female workers, however, was only 80 percent of that of male workers in 2004; this ratio increased gradually and reached 88.3 percent in 2016 (ILSSA and ILO, 2018).
The issue of gender pay inequality has attracted great attention from researchers all around the world. A common method used to investigate the gender pay gap is the Blinder-Oaxaca (B-O) decomposition method. This method has been improved by Juhn et al. (1993) and Machado and Mata (2001) to investigate the change in pay gap over time and to account for differences in labor market features. However, these methods have certain limitations, including those common to parametric methods and the problem of comparability in the supports, which leads to overestimates of the component of the gap attributable to differences in individuals' characteristics (Nopo, 2004). Nopo (2004) therefore proposed a new non-parametric matching technique to explain the gender pay gap as an alternative to the traditional B-O method. The most important advantage of this method is that it produces more robust and reliable results, since it overcomes the heterogeneity of the samples under investigation and avoids problems commonly faced by parametric methods. To date, the number of international studies on income inequality employing the matching approach remains small.
In Vietnam, there have been some empirical studies on this topic. These studies, whether conducted for the whole economy, by economic sector or by income group, all confirm that gender inequality in pay persists. However, the results of these studies may not be very convincing given the limitations of the methods applied. This paper examines the gender pay inequality of wage earners in Vietnam for the period 2010-2016 using the propensity score matching (PSM) method. This is the first study on the gender pay gap in Vietnam that follows this approach. The results from the matching analysis using the Vietnam Household Living Standards Survey (VHLSS) data sets of 2010 and 2016 allow us to affirm that gender pay inequality persists, though its extent has decreased significantly over time. In addition to the observable determinants including educational qualification level, occupation, economic sector and industry, unobservable factors are shown to also play an important role in creating the gender pay gap.
The following section of the paper explores the related literature. Section 3 presents the matching comparison technique and the data used in the study. Section 4 describes the estimation results. Section 5 discusses the results and concludes.

Literature review
From an economic perspective, wages are determined by the two important forces, labor supply and labor demand, in the labor market. Labor supply is affected by individuals' demographic characteristics and the factors related to human capital, which are likely influenced by discrimination, while labor demand is determined by factors related to firm attributes and discrimination. Based on such an idea of a market, Becker's work (1971) laid the foundation for the economics of discrimination. According to him, discrimination occurs on both the supply and demand sides, and discrimination creates costs to society: the discriminated worker is paid less, and the discriminator incurs greater expense to hire a worker with the same productivity as the discriminated one. Thus, according to market rules, competitiveness will reduce gender discrimination over time. In contrast to taste-based theories of discrimination represented by Becker, Phelps (1972) and Arrow (1973) introduced the theory of statistical discrimination. According to them, due to the problem of imperfect information in the labor market, employers are forced to rely on statistical information on different groups of applicants to infer their productivity and thus to make recruitment decisions. In the absence of assumptions on racial or gender animosity against members of a targeted group, recruitment and employment decisions based on the group average, by accident, create inequality for minority groups.
Various empirical studies all around the world have been done in an attempt to not only measure the extent but also identify the causes and discover the trends in pay gap and inequality. Blau and Kahn (2017) provide a relatively comprehensive literature review on the explanations for gender wage gap. In addition to traditional market-based factors, Blau and Kahn (2017) investigate the impact of psychological attributes or non-cognitive skills on gender wage inequality.
The literature on the gender wage gap has recorded various decomposition methods, ranging from parametric (Blinder, 1973; Oaxaca, 1973; Juhn et al., 1993; Machado and Mata, 2001) and semi-parametric (DiNardo et al., 1996; Donald et al., 2000; Bourguignon et al., 2002) to non-parametric (Rosenbaum and Rubin, 1983; Nopo, 2004) methods. From another angle, Fortin et al. (2011) classify decomposition methods into mean decomposition and beyond the mean decomposition. A representative of the former group is the traditional B-O method, while the latter includes the extended versions of the B-O method.
Some economists argue that as gender inequality in earnings, in essence, means the difference between earnings of male and female workers despite the same characteristics of labor capacity and productivity, it is very important to strictly control the differences in characteristics of the two comparative groups. Thus, the comparison must be restricted to a common support, where there is sufficient overlap in the characteristics of treated and untreated individuals to find adequate matches (Rosenbaum and Rubin, 1983). Nopo (2004) pointed out the limitations of the traditional B-O method in controlling the differences between the two comparative groups and introduced an alternative method, a new matching technique, to explore the determinants of wage disparity between male and female groups in Peru during the period 1986-2000. Applying the PSM technique, Frolich (2007) conducted an analysis on gender wage gap of UK graduates. Meara et al. (2017) investigated the gender wage gap in the USA, applying the extended PSM method in combination with the inverse probability weighted regression adjustment to address the problems of sample selection bias.
Gender pay gap in Vietnam has been studied using different methods, but, so far, there have been no studies using matching techniques. It is noteworthy that regardless of approach, scope, point and span of time, all studies assert that the gender pay gap persists in Vietnam, but the determinants are diverse. Amy (2004) investigates the sectoral gender wage gap in Vietnam during the period 1997-1998 using the decomposition model proposed by Juhn et al. (1993). Pham and Reilly (2007) conducted a mean and quantile regression analysis to explore the gender pay gap for the wage earners in Vietnam over the period 1993-2002. Gian (2014) conducted a comparison analysis of the gender wage gap between Korea and Vietnam using the decomposition method developed by Juhn et al. (1993). Nguyen and Hoang (2018) applied the B-O method to identify the determinants of the gender wage gap in Vietnam for the period 2012-2014. Vu and Yamada (2018) decompose the gender equality in terms of wage distribution in Vietnam during the period 2002-2014 using two methods: one is suggested by Chernozhukov et al. (2013) and the other is proposed by Firpo et al. (2009).

The PSM technique
This study employs the PSM technique to investigate the gender pay gap in Vietnam in the period 2010-2016.
Building on the concept of "propensity scores", first introduced in experimental designs by Rosenbaum and Rubin (1983), Nopo (2004) develops a simple matching procedure, a fully non-parametric method, to decompose the gender wage gap. This procedure helps to select two subsamples of males and females having the same characteristics and thus to construct a counterfactual wage for the common support. By seeking matched samples with "identical" observable characteristics, this technique helps to solve the problems of sample selection bias created by the traditional B-O method and its later versions.
In this study, female earners constitute the treatment group and male earners form the untreated one. The control group is established as a subset of the untreated group by selecting only males that are identical in all other key characteristics to the treated group. As usual, a probit model is developed to identify the key characteristics. By defining an outcome variable (log of hourly pay) and a binary treatment variable, the matching technique seeks to establish whether a statistically significant difference exists in the log of hourly pay between the treated group and the control one, to investigate the inequality in pay between females and males sharing the same characteristics. This procedure is conducted for 2010 and 2016 separately to discover the change in gap and inequality during this period.
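The comparison described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the paper does not say which matching algorithm is used, so one-to-one nearest-neighbour matching with replacement on the propensity score is assumed here, and the function names and the toy numbers are hypothetical.

```python
from bisect import bisect_left


def raw_gap(treated, log_wage):
    """Unmatched gap: mean log pay of females (1) minus mean log pay of all males (0)."""
    females = [y for t, y in zip(treated, log_wage) if t == 1]
    males = [y for t, y in zip(treated, log_wage) if t == 0]
    return sum(females) / len(females) - sum(males) / len(males)


def nearest_neighbor_atet(scores, treated, log_wage):
    """Matched gap (ATET): each female is paired with the male whose
    propensity score is closest (assumed 1-NN matching with replacement)."""
    controls = sorted((s, y) for s, t, y in zip(scores, treated, log_wage) if t == 0)
    control_scores = [s for s, _ in controls]
    diffs = []
    for s, t, y in zip(scores, treated, log_wage):
        if t != 1:
            continue
        i = bisect_left(control_scores, s)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
        j = min(candidates, key=lambda c: abs(control_scores[c] - s))
        diffs.append(y - controls[j][1])
    return sum(diffs) / len(diffs)


# Toy data (hypothetical): three females followed by three males.
scores = [0.20, 0.50, 0.80, 0.21, 0.49, 0.79]
treated = [1, 1, 1, 0, 0, 0]
log_wage = [2.0, 2.5, 3.0, 2.2, 2.8, 3.1]
matched_gap = nearest_neighbor_atet(scores, treated, log_wage)
unmatched_gap = raw_gap(treated, log_wage)
```

A negative value means that females earn less, in log points, than the males they are compared with.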
The main parameters in our study are, therefore, the average treatment effect (ATE) for the population, which represents the pay gap between the treated and the untreated groups, and the average treatment effect for "treated" individuals (ATET), which represents the pay gap between the treated group and the control one. The latter is obtained with the use of the matching method, and the former serves for comparison. Let Y be the log of the hourly pay of an individual and let D indicate whether or not "treatment" is received, where "1" denotes individuals who are "treated" (females) and "0" denotes those who are untreated (males); both parameters are then defined as differences in the expected value of Y between the corresponding groups. According to Nopo (2004), ATET can be considered a component of ATE. By performing further algebraic transformation, the pay gap (ATE) can be decomposed into four elements and expressed in a simple way as Δ = Δ_M + Δ_F + Δ_X + Δ_0, where Δ represents the pay gap (ATE), Δ_M the part that can be explained by differences between two groups of males matched and unmatched to female characteristics, Δ_F the gap explained by the differences in characteristics between two groups of females whose characteristics can be matched and unmatched to male characteristics, Δ_X the gap explained by differences in the distribution of (observable) characteristics of males and females over the common support, which can be eliminated using the matching approach, and Δ_0 the gap that cannot be explained, presumably due to unobservable characteristics or gender discrimination. Δ_0 represents ATET and is of utmost interest to our study. It should also be noted that in this study all females are matched to the corresponding control groups, and therefore Δ_F does not exist (Δ_F = 0).
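For reference, the verbal definitions above can be written compactly. The notation below is a hedged reconstruction from those definitions rather than the authors' own display: Y_1 and Y_0 (the log hourly pay an individual would receive as a female and as a comparable male, respectively) are assumed potential-outcome symbols, and the additive form of the decomposition follows Nopo (2004).

```latex
\begin{align}
\mathrm{ATE}  &= E[\,Y \mid D = 1\,] - E[\,Y \mid D = 0\,] \\
\mathrm{ATET} &= E[\,Y_1 - Y_0 \mid D = 1\,] \\
\Delta        &= \Delta_M + \Delta_F + \Delta_X + \Delta_0,
\qquad \Delta_F = 0 \ \text{when every female is matched}
\end{align}
```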

JED
The PSM procedure is applied to the full sample as well as to various subsamples based on a wide range of key characteristics, selected in order to investigate comprehensively the equality in pay between the two genders. Using the propensity scores, a comparable control group comprising males who are similar in terms of all the chosen key observable characteristics is selected for each treatment group of females under investigation. The goal is to identify whether there exists a statistically significant pay gap, and the magnitude of that gap, between these two essentially similar groups, which is tested using bootstrapped standard errors. A variety of subsamples are employed in order to obtain even finer matching of control and treatment groups (and therefore more accurate comparison) and also to evaluate the influence of each selected characteristic on the gender pay gap. Note that, regarding occupation, since there are ten occupational categories, division of the sample by these ten categories may lead to subsamples of insufficient size. Consequently, only three main groups of occupation (namely, professionals, skilled and unskilled workers) are taken into account.
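The bootstrapped standard error mentioned above can be illustrated as follows. This is a simplified sketch, not the authors' procedure: it resamples already matched female-male pairs, whereas a full bootstrap would normally re-estimate the propensity scores and redo the matching in every replication. The function name, the number of replications and the toy pairs are assumptions.

```python
import random
from statistics import mean, stdev


def bootstrap_se(pairs, n_boot=500, seed=0):
    """Bootstrap standard error of the mean matched gap.

    pairs: list of (female_log_wage, matched_male_log_wage) tuples.
    Simplification: pairs are resampled as they are; the matching step
    itself is not repeated inside each replication.
    """
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        estimates.append(mean(f - m for f, m in sample))
    return stdev(estimates)


# Toy matched pairs (hypothetical log hourly wages).
pairs = [(2.0, 2.3), (2.5, 2.6), (3.0, 3.4), (2.2, 2.2), (2.8, 3.1)]
gap = mean(f - m for f, m in pairs)
se = bootstrap_se(pairs)
t_ratio = gap / se  # a large absolute value suggests a statistically significant gap
```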
To begin with, the propensity scores are obtained for the full sample by running a probit model with the dependent variable being gender, which takes the value of 1 for females and 0 otherwise, and independent variables consisting of the following key common characteristics:
(1) Marital status (1 if married and 0 otherwise).
(4) Dependency ratio (ratio of the number of dependent people (children and the elderly) to household size).
(8) Educational levels (no qualification; primary school; secondary school; high school; and college and higher qualification).
(9) Occupation categories (leaders in all fields and levels; top-level professionals; midlevel professionals; staff (elementary professionals, white-collar technical personnel); skilled workers in personal services, security protection and sales; skilled workers in agriculture, silviculture and aquaculture; skilled handicraftsmen and other manual workers; assemblers and machine operators; unskilled workers; and army force).
For each subsample based on certain selected characteristics, the probit model used is the same as for the full sample except for the fact that the corresponding selected characteristic is no longer included as an independent variable in the regression.
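A minimal sketch of the propensity score step is given below, assuming a probit fitted by Fisher scoring on synthetic data. It uses only the Python standard library; the helper names, the single synthetic covariate and the coefficient values are hypothetical and stand in for the survey covariates listed above.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_probit(X, d, iters=25):
    """Fit P(d = 1 | x) = Phi(x'beta) by Fisher scoring; rows of X include a constant."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, di in zip(X, d):
            z = sum(b * v for b, v in zip(beta, x))
            p = min(max(_STD_NORMAL.cdf(z), 1e-10), 1 - 1e-10)
            phi = _STD_NORMAL.pdf(z)
            g = phi * (di - p) / (p * (1 - p))
            w = phi * phi / (p * (1 - p))
            for i in range(k):
                score[i] += g * x[i]
                for j in range(k):
                    info[i][j] += w * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, _solve(info, score))]
    return beta


def propensity_scores(X, beta):
    """Predicted probability of being female for each row of X."""
    return [_STD_NORMAL.cdf(sum(b * v for b, v in zip(beta, x))) for x in X]


def covariates_for_subsample(all_covariates, split_on):
    """Drop the characteristic that defines the subsample from the probit covariates."""
    return [c for c in all_covariates if c != split_on]


def make_synthetic_sample(n=2000, seed=42):
    """Synthetic data (not VHLSS): constant plus one covariate, true beta = (0.3, 0.8)."""
    rng = random.Random(seed)
    X, d = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        X.append([1.0, x])
        d.append(1 if 0.3 + 0.8 * x + rng.gauss(0, 1) > 0 else 0)
    return X, d
```

The fitted scores would then be passed to whichever matching routine is used; for a subsample split on, say, economic sector, the sector variable is left out of the covariate list before fitting.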

Data
This study examines and compares the gender pay gap in Vietnam between two separate years, 2010 and 2016, using the VHLSS data sets of the corresponding years. Consequently, only employed individuals (wage earners) with available information on incomes are included. We employ the hourly wage of females and males to measure gender pay gap, which accounts for the differences in both the total earnings and the number of hours worked of the two genders. The hourly wage of each individual is obtained by dividing his/her annual earnings from his/her primary job by total working hours spent on that job in the corresponding year. Data on annual income are calculated as the sum of his/her salary and all types of accompanying bonuses, allowances and other benefits from the job during 12 months prior to the survey; the total number of working hours is computed by multiplying the number of his/her working days in 12 months preceding the survey by the average number of hours per working day he/she spends on that job.
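The construction of the outcome variable described above can be sketched as follows. The function and argument names are hypothetical, and the example figures (in thousand VND) are invented for illustration rather than taken from the VHLSS.

```python
import math


def hourly_wage(salary, extras, days_worked, hours_per_day):
    """Annual earnings from the primary job divided by annual hours on that job.

    salary and extras (bonuses, allowances and other benefits) cover the
    12 months before the survey; days_worked and hours_per_day refer to the
    same job and period.
    """
    total_hours = days_worked * hours_per_day
    if total_hours <= 0:
        raise ValueError("total working hours must be positive")
    return (salary + extras) / total_hours


def log_hourly_wage(salary, extras, days_worked, hours_per_day):
    """Outcome variable used in the matching analysis: log of hourly pay."""
    return math.log(hourly_wage(salary, extras, days_worked, hours_per_day))


# Hypothetical worker: 30,000 salary plus 2,000 in extras (thousand VND),
# 250 working days of 8 hours each.
example_wage = hourly_wage(30000, 2000, 250, 8)
```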
The resulting sample in 2010 comprises a total of 7,561 individuals with 3,018 females and 4,543 males (accounting for 39.92% and 60.08% of the total, respectively), while the 2016 sample is considerably larger, with 8,508 individuals and a slightly higher proportion of females (41.58%). This might imply a relative improvement in working opportunities for women between 2010 and 2016.
Regarding labor income, which is the main focus of this paper, the average hourly earnings of employed females were 13.87 thousand VND compared to 15.51 thousand VND for males in 2010. The figures nearly doubled by 2016, with female and male earnings of 27.52 and 30.52 thousand VND, respectively. Accordingly, the gender pay gap, though still significant, narrowed from 11.89% to 10.91% between the two years, indicating an improvement in equality.
Considering potential determinants of the gender pay gap, while marital status, household size and dependency ratio are rather comparable for both groups, the percentage of men who are household heads is far higher than that of women in both samples. This partly reflects the social perception of gender roles in families and may suggest a higher sense of responsibility for males to work and earn more. In fact, the average hourly income of household heads was higher than that of other members of the family, but the gap decreased by almost half, from 16.03% in 2010 to 8.29% in 2016 (see Table 2).
On the contrary, Table 1 shows relatively higher proportions of women who are migrants and from the Kinh group than those for men in both samples. More remarkably, urban rates [...] and 1.38 times higher than rural ones in 2010 and 2016, respectively. Earnings are also shown to increase sharply with education level, with salary at the highest educational level (college or higher degree) being 2.47 and 2.08 times higher than that of the lowest level (no degree) in 2010 and 2016, respectively.
Also, while improvement in education takes place for both groups over time, it is more significant for female workers. In terms of education, economic sector, industry or occupation, the shares of women in the categories that yield higher income tend to be larger. Regarding occupation, the two categories that offer the highest income, leaders and the army force, however, have strikingly low proportions of females, but these also have the smallest numbers of observations in the sample.
Overall, Table 2 indicates that the mean hourly earnings vary greatly across different categories in each criterion, indicating the significant impact of these characteristics on the level of earnings received by individuals. Nonetheless, according to the facts discussed above, women, though having characteristics that are likely to lead to higher wages, are still paid less than men. This implies that these determinants of earnings are insufficient to explain the gender pay gap in Vietnam in the two periods under study and suggests the contribution of other factors, presumably unobserved ones. This judgment is further strengthened in the following section.

Empirical results
Matching results for the full sample and most subsamples in 2010 provide strong evidence of a statistically significant pay gap between women and men who share more or less the same characteristics. In other words, women are shown to be paid significantly less than men in this year, and the differences in earnings arise from other (perhaps unobservable) factors besides the wide range of observable ones that are taken into account.
Nevertheless, inequality declined dramatically in 2016. Though still significant, the overall wage gap between females and their matched male group fell by half, from 28.20% in 2010 to 14.29% in 2016. Regarding the subsamples, the gap is insignificant not only in those of small size but also in many others, most notably the state sector and the service industry. It is also small in magnitude and significant only at low levels in many subsamples, particularly that of professionals.
When having a closer look at the full sample of 2010, we find that females were paid 28.20% less than males sharing similar characteristics (at the significance level of 0.01). This gap is considerable and much larger than the gap for unmatched groups (16.88%) (See Table 3). This means that the males in the control group, who are identical to the treated group of females in terms of characteristics, on average, earn more than the bigger untreated (male) group, which, in turn, implies that, overall, women possess characteristics that would lead to considerably higher earnings but were treated very unequally.
Despite large variation across different subsamples, the estimated gaps between matched groups, when significant, are also higher compared to unmatched ones for all subsamples, except for two groups relying on educational level (the two lowest levels) (See Tables 4 and 5). This further supports the existence of a profound gender pay inequality.
Considering the differences in earnings between females and their matched groups, most subsamples see a lower pay gap compared to the full sample, which shows the importance of controlling the distribution of characteristics of the samples to the accuracy of the results (see Tables 4 and 5). Three out of the four exceptions are industry and construction, non-state and skilled workers, all of which are male-dominated, with males nearly twice as numerous as females. They witness the highest pay gaps of 35.21%, 32.96% and 31.90% (all at significance level of 0.01), respectively. The remaining exception is the Kinh group, which accounts for more than 90% of the whole sample, with a pay gap only slightly higher than the overall one.
The subsamples with the lowest significant gaps and, therefore, highest income equality, on the other hand, are high school level workers, service workers, professionals and workers in the state sector (with pay gaps of 17.46%, 19.46%, 20.03% and 20.50%, respectively, all at significance level of 0.01). These also suggest that educational level, industry, occupation and economic sector offer good explanations for the gender pay gap in 2010. It is also noteworthy that the wage gap is insignificant in the three smallest subsamples, namely, ethnic minorities, agriculture and FDI workers, with very small and even positive gaps for the first two groups.
By the selected key characteristics, the (significant) estimated pay differences are rather comparable between married and unmarried workers but much lower for household heads than other members of the household, for service than for industry and construction, and for state workers than for their non-state counterparts. Rural areas also witness lower pay gap than urban ones. Among five educational categories, the gap estimated is the lowest for the high school level and the highest for those at middle school level. Pay equality also differs greatly among the three major occupations, of which professionals have the highest equality, followed by unskilled workers, while skilled female workers are treated highly unequally.
Compared to 2010, gender pay differences decreased greatly for both the full sample and all subsamples in 2016 (see Tables 6-8). Overall, the estimated gender pay gap between the treatment and control groups for the full sample fell dramatically to 14.29% (at the significance level of 0.01) in 2016, nearly half of that in 2010. This indicates sharp progress in enhancing gender pay equality. However, the gap between matched groups still exceeds that between unmatched ones (9.15%), indicating that unequal treatment at work between males and females, though considerably lessened, persisted in 2016.
Similar to the pattern observed in the results for the 2010 sample, in 2016, the use of the matching method produces higher gaps for many of the subsamples considered, which shows that inequality in earnings persists between females and their male counterparts (see Tables 7 and 8). Exceptions are all the subsamples relying on educational level (except for the high school level), professionals and workers in the non-state sector (and also workers in agriculture and the FDI sector and from the Northern Midland and Mountainous, but the estimated gaps for these are not significant, largely due to insufficient observations). This indicates that, in these subsamples, gender pay differences can indeed be partly explained by the observed characteristics chosen.
It can be seen from Tables 7 and 8 that gender pay inequality varies enormously across different subsamples. The biggest (significant) estimated gender pay gaps were for workers from the Central Highlands (20.18%), the Red River Delta (19.95%) and the Mekong River Delta (18.41%), and those without any educational qualification (19.05%). Many other subsamples in 2016 also have larger gender gaps than the full sample, though most of them made considerable progress compared with six years earlier. These facts suggest the persistence of a profound gender inequality in pay in 2016, with differences within some subgroups even greater than in the entire sample.
On the contrary, the gender pay gap is insignificant in many subsamples, not only those with few observations but also those of large size. They include the subsamples of workers from the Northern Midland and Mountainous, FDI sector, agriculture, state sector, service industry, ethnic minorities and the Southeast, the first two of which witness positive wage gaps (i.e. women are paid more than their male counterparts). The lowest significant gaps, small in magnitude and significant only at low levels, are seen for professionals (2.29%, at the low significance level of 0.10), workers whose highest qualifications are from high school (8.70%) and middle school (9.50%) and workers from the North Central and Central Coast (10.05%) (all at the significance level of 0.05). The estimated gender pay gaps for most of the above-mentioned groups dropped sharply between 2010 and 2016, even though workers in these subgroups are also those experiencing the lowest gender gaps in 2010, except for professionals and the middle school educational level. These two, in fact, are the groups that show the largest significant improvement among all subgroups in 2016, with the wage gap falling from 20.03% to 2.29% and from 27.81% to 9.50%, respectively, between the two years. The high level of income equality observed in these subsamples also indicates that in this sample, gender pay discrepancies can indeed be partly explained by occupation, educational background, region, industry and economic sector.
By each characteristic, pay differences are much lower for professionals than for skilled workers and laborers and even more substantially for service compared to industry and construction and for state compared to non-state, which is in line with the results in 2010. Likewise, by educational level, gender inequality is also lowest for high school-level workers, but highest for those without any qualification. Nonetheless, it is higher for rural than urban workers and far higher for household heads than other members of the household, which is in stark contrast to the earlier results. Besides, the availability of the data on regions in the sample of 2016 allows us to make further comparisons across six economic regions in Vietnam. Particularly, the region witnessing the highest pay gap was Central Highlands, followed by the Red River Delta and Mekong River Delta. In stark contrast, the estimated gaps are insignificant for both the Southeast and the Northern Midland and Mountainous.

Discussion and concluding remarks
Studying the gender pay gap essentially means comparing the pay levels of two groups that are identical in terms of identified characteristics. However, establishing two comparable groups is difficult. Traditional social norms create gender specialization in many areas of life, thereby limiting the role of women and giving priority to men in the labor market. In addition, nature also contributes to gender specialization in the labor market, because health and some gender characteristics make women suitable for only some jobs, and the same is true for men. This fact shows that "not all males are comparable to all females" even in the support. In other words, the distributions of characteristics can be different in the support. Matching techniques are a good tool to control such differences and ensure the comparability of research groups and, therefore, can provide a reliable assessment of gender pay inequality. This study differs from the previous studies on the gender pay gap in Vietnam in that it employs the PSM technique with a focus on the common supports only.
The estimation results based on the data sets of VHLSS 2010 and 2016 indicate that there exists a statistically significant pay gap and inequality between comparable groups of male and female earners, but the extent narrowed by half in this period for the full sample and also remarkably for all subsamples. While the pay disparity is significant for all subsamples but those of small size in 2010, it is either insignificant or significant only at a low significance level for many subsamples in 2016. This further reinforces the sharp progress in enhancing gender pay equality over this period.
We find that differences in pay in both years arose from not only observable but also unobservable factors. Among the observable determinants, the most significant are shown to be educational background, occupation, industry and economic sector. Nevertheless, in both years, the estimated gap for matched groups is considerably higher than the gap for unmatched ones for both the full sample and many of the