Assessment of the validity and reliability of an urban household health expenditure (HHE) questionnaire in Kuala Lumpur, Malaysia

Purpose – Out-of-pocket (OOP) payments continue to be a major method of financing healthcare in many low- and middle-income countries, including Malaysia. Although macro-level data show that OOP payments make up a substantial percentage of national health expenditure, the amount spent on health by households at the grassroots level remains unknown in Malaysia. The purpose of this paper is to assess the validity and reliability of an adapted-for-purpose questionnaire designed to capture urban household health expenditures (HHEs) among Malaysian households.
Design/methodology/approach – This two-part study assessed the content validity of the questionnaire using three experts, and its reliability through a test-retest study among 50 OOP-paying patients followed up at one private primary care clinic in Kuala Lumpur. This study was approved by the Malaysian Research Ethics Committee (NMRR-16-172-29311-IIR).
Findings – The validity of the 83-item questionnaire was high, with an item content validity index of 1.00 and a scale content validity index average score of 1.0 agreed among the evaluating experts. In the test-retest reliability study, the majority of the categorical questionnaire items had almost perfect agreement values (κ = 0.81-1.00). Continuous questionnaire items were also found to be highly reliable, with no significant differences between the test and retest segments and high correlation coefficient values (intra-class correlation coefficient > 0.7).
Originality/value – The HHE questionnaire had excellent content validity and very high test-retest reliability. The results of this study suggest that this questionnaire could be used in Malaysian studies to determine actual urban HHE, which is a first step toward developing universal health coverage for all.


Introduction
Universal health coverage (UHC) has been made one of the global health priorities, as signified by its inclusion as Target 3.8 of the global Sustainable Development Goals [1]. UHC is based on the Primary Health Care principles from the 1978 Alma-Ata conference, whereby all peoples should be provided with equitable access to quality healthcare based on need, without the user facing financial hardships in the process of seeking and receiving care [2]. This has been a struggle for the low- and middle-income countries, where low government expenditure on healthcare alongside low-quality provision of health services has led to citizens largely resorting to using expensive, private services and paying for them via out-of-pocket (OOP) payments [2].
In low- and even lower middle-income households, OOP payments for healthcare cause a great strain on families' monthly household expenditures [3]. This strain has been shown to be significantly associated with financial difficulties. When sudden, calamitous health events such as a stroke or a heart attack occur, they may push an individual or family into huge debt and even render them bankrupt, a situation termed catastrophic health expenditure [4]. Incurring large monthly household health expenditures (HHEs) also often causes households to delay seeking healthcare, leading to further health complications and even poor health outcomes, as seen in chronic disease patients [3,4].
Malaysia is a middle-income country with largely subsidized public healthcare services [5]. Despite the availability of access to healthcare for Malaysians, perceived quality and logistics issues have led to widespread use of private services, especially in private primary care facilities [5]. As Malaysia has been deficient in developing pre-payment mechanisms for health, payments for these private health services are largely made via OOP payments [6]. On a macro level, the evidence is clear that private payments make up a high share of total healthcare expenditure and that this largely comes from OOP payments [5]. On a micro level, however, it is unclear how much the average Malaysian household spends on healthcare every month and how deeply these healthcare expenditures affect its overall monthly expenditure. This pilot study aimed to assess the validity and reliability of an adapted-for-purpose questionnaire designed to capture urban HHEs among Malaysian households, as part of a larger series of studies to develop a health microinsurance scheme in Malaysia [7,8].

Methods
This study was conducted in two parts. The first part aimed to establish the content validity of the developed questionnaire via a sample of experts. The second part of the study was conducted to evaluate the reliability of the content-validated questionnaire via a test-retest design. Although the questionnaire had been used and verified in other settings with a high validity quotient [9-11], the content validity of the adapted version was nevertheless assessed by three experts: a health economist from Chulalongkorn University, Thailand, and two private primary care physicians who worked in private primary care clinics in Kuala Lumpur.
The test-retest portion of the study was set among private primary care clinics in Jalan Ipoh, a socio-demographically diverse suburb in Kuala Lumpur with a population of approximately 268,000 people (2010 census) [12]. One private primary care clinic was purposively selected from the list of private primary care clinics registered at the Ministry of Health, Malaysia, as the base location for conducting this portion of the study [13]. This study received ethical approval from the Malaysian Research Ethics Committee, Ministry of Health, Malaysia (NMRR-16-172-29311-IIR).
The HHE questionnaire used in this study was adapted for use in Malaysia from existing questionnaires being used in Kenya and other settings, previously prepared and based on guidance from WHO manuals [9-11]. The questionnaire consisted of seven sections with 83 questions capturing data from various household members for a range of health- and income-related use and expenditures. Section 1 contained 14 questions on details of the household member answering the questionnaire, and on household demographics for each member of the household, including gender, age, relationship to the head of household, and their education, marital, occupational, relative health condition and chronic disease status, respectively.
The subsequent section, containing nine questions, captured data on the health-seeking behavior of each household member. Five questions were posed to ascertain in detail whether they had suffered from an episode of illness over the past four weeks, and if so, how and where they sought treatment for it. One question aimed to ascertain whether the household member had sought preventive health services, such as immunization or family planning, in the past four weeks. The three remaining questions in this section were about household members being hospitalized or admitted as in-patients over the past 12 months. All the subsections also sought answers on whether treatment had been foregone, and the reason for foregoing this treatment if such a situation arose. Section 3 of the questionnaire elicited details on the utilization of out-patient and other health services by each household member in the past four weeks, if this had been answered in the affirmative in the earlier section. A total of 25 questions were asked covering areas such as: type of illness; type of health services sought; distance of the health provider visited from the home and reasons for choosing them; details of medicine or treatment given; costs of treatment; transport used and its costs; and method of payment and source of funds for payment. One additional question was about costs for routine treatment or other health-related expenses such as the purchase of vitamins.
Section 4, on the other hand, elicited details on in-patient admissions over the past year for each household member, if this had been answered in the affirmative in Section 2. In a similar manner to Section 3, 22 questions were asked on the cause of illness; the health provider used and reasons for making that choice; treatment regimens and costs; methods of transport and their costs; and finally, costs incurred by persons accompanying and caring for the sick household member at the hospital.
Section 5 contained six questions on access to health insurance for the members of the household, with questions seeking to determine the type and depth of health insurance coverage, type of premium payment and amounts, and whether payment was covered by an employer. Section 6 contained three questions on housing conditions, amenities and assets of the household, while Section 7 contained three questions on detailed weekly, monthly and annual household consumption expenditure.
The three experts rated the content validity of each item in the questionnaire separately, without any discussion between them. The scale was scored as follows: 1 = not relevant; 2 = somewhat relevant; 3 = quite relevant; and 4 = highly relevant [14]. A rating of either 3 or 4 meant that the item was considered to be of relevance. For each item, the content validity index for items (I-CVI) was computed by dividing the number of experts giving either a 3 or 4 by the total number of experts [14]. The cut-off level for item acceptability, incorporating the standard error of the proportion with a panel of fewer than five experts, was I-CVI = 1.0 [15]. Thus, total agreement (the number of items that achieved an I-CVI of 1.00 divided by the total number of items to be validated in the questionnaire) was calculated to represent the proportion of questions that experts deemed quite relevant or highly relevant. The validity of the entire questionnaire was determined via the use of a content validity index for scale average (S-CVI/Ave) [14,15]. The S-CVI/Ave, defined as the average proportion of total items judged valid by involved assessors, is the average of the I-CVIs, obtained by summing them and dividing by the number of items [14,15]. An acceptable S-CVI/Ave value according to guidelines is a minimum of 0.90 [14].
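The index calculations described above can be illustrated with a minimal Python sketch. The function names and the example ratings are hypothetical and are not taken from the study; the sketch only assumes the 4-point relevance scale and the definitions of I-CVI, total agreement and S-CVI/Ave given in the text.

```python
from typing import Sequence


def item_cvi(ratings: Sequence[int]) -> float:
    """I-CVI: share of experts rating the item 3 or 4 on the 4-point scale."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


def scale_cvi_ave(items: Sequence[Sequence[int]]) -> float:
    """S-CVI/Ave: mean of the I-CVIs across all items."""
    i_cvis = [item_cvi(r) for r in items]
    return sum(i_cvis) / len(i_cvis)


def total_agreement(items: Sequence[Sequence[int]]) -> float:
    """Share of items on which every expert gave a rating of 3 or 4."""
    full = sum(1 for r in items if item_cvi(r) == 1.0)
    return full / len(items)


# Hypothetical ratings: three items, each rated by three experts.
example = [
    [4, 4, 3],  # all experts rate the item relevant -> I-CVI = 1.0
    [3, 4, 4],  # all experts rate the item relevant -> I-CVI = 1.0
    [4, 2, 4],  # one expert rates it "somewhat relevant" -> I-CVI = 2/3
]

print([round(item_cvi(r), 2) for r in example])
print(round(scale_cvi_ave(example), 2))
print(round(total_agreement(example), 2))
```

With a panel of three experts, a single rating of 1 or 2 drops an item's I-CVI to 0.67, which is why the cut-off of 1.0 amounts to requiring unanimous agreement.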
On completion of content validation, the questionnaire was translated into Malay from its original English by a qualified member of the study team who was certified proficient in both languages. The translated Malay language instrument was then re-validated by the two Malaysian experts detailed above, who were fluent in both Malay and English. They verified the questionnaire and resolved discrepancies. Two other bilingually fluent language experts from the Department of Languages, University Putra Malaysia, then independently back-translated the questionnaire into English to be compared with the original version. These experts compared the different versions and resolved translation differences via discussion.

JHR 32,1
Following content validation and translation, the questionnaire was subjected to a test-retest reliability study. From the selected private primary care clinic, a numbered list of patients who fit the inclusion criteria was compiled. The criteria were: patients living in the study area and patients paying for treatment in the clinic via OOP. A total of 50 patients were then randomly selected from this list using an online random number generator (http://stattrek.com/statistics/random-number-generator.aspx). Consenting patients were interviewed face to face by a study team member to complete a questionnaire on their monthly HHE. Study team members who conducted the interviews underwent a four-hour training session on methods of conducting the interview, which also included two "mock interview" sessions with the study's principal investigator. They were also given a short manual detailing the definitions for the different variables in the questionnaire. A second interview was administered to the same patients in a period beginning 30 days after the earlier interview had been conducted. To ensure reliability of the answers obtained, the second interview was conducted by a different interviewer from the one who conducted the earlier interview.
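The random selection step could be reproduced offline along the following lines. This is only a sketch: the study used an online random number generator, whereas the example below swaps in Python's standard `random` module, and the list size of 200 eligible patients and the seed value are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def select_patients(eligible_ids: Sequence[int],
                    sample_size: int = 50,
                    seed: Optional[int] = None) -> List[int]:
    """Draw a simple random sample, without replacement, from a numbered list."""
    if sample_size > len(eligible_ids):
        raise ValueError("sample size exceeds the number of eligible patients")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(list(eligible_ids), sample_size))


# Hypothetical numbered list of 200 eligible patients (1..200).
eligible = list(range(1, 201))
selected = select_patients(eligible, sample_size=50, seed=2016)
print(len(selected), selected[:5])
```

Recording the seed alongside the numbered list allows the same 50 patients to be re-derived later if the selection ever needs to be audited.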
Descriptive statistics were first used to describe the sample characteristics. Data normality for continuous variables was assessed using the Kolmogorov-Smirnov test. Test-retest reliability was assessed using different methods for the categorical and continuous items. For categorical items, percent agreement was calculated for each question, defined as the number of agreement scores divided by the total number of scores between the results of the first and second interview for that question [15,16]. Test-retest reliability was further evaluated using Cohen's κ coefficient, a statistic measuring inter-rater agreement for categorical items [15,16]. Cohen's weighted κ (κw) was used for items with more than two possible responses [15,16]. The range for κ coefficients was as defined by Landis and Koch: <0 (poor agreement); 0-0.20 (slight agreement); 0.21-0.40 (fair agreement); 0.41-0.60 (moderate agreement); 0.61-0.80 (substantial agreement); 0.81-1.0 (almost perfect agreement) [15,16]. For continuous items, test-retest reliability was analyzed using a dependent t-test between the results of the first and second interview. An intra-class correlation coefficient (ICC), which measured the agreement between single measures on a linear mixed model, was then calculated along with the corresponding 95 percent confidence interval (CI) and determined to be high if above 0.7 [15]. All statistical analyses were carried out using SPSS version 16.0, with significance levels fixed at p < 0.05.
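For the categorical items, percent agreement and the unweighted Cohen's κ can be sketched in a few lines of Python. The study itself used SPSS; the function names and the example responses below are hypothetical, the weighted κ used for multi-category items is not implemented here, and returning 1.0 when expected agreement equals 1 is a convention chosen for this sketch.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(first: Sequence[Hashable],
                      second: Sequence[Hashable]) -> float:
    """Share of respondents giving the same answer in both interviews."""
    if len(first) != len(second) or not first:
        raise ValueError("both interviews need the same, non-zero length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first: Sequence[Hashable],
                second: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(first)
    p_o = percent_agreement(first, second)
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal frequencies of each category.
    p_e = sum(c1[c] * c2[c] for c in c1.keys() | c2.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # assumption: a single identical category in both rounds
    return (p_o - p_e) / (1 - p_e)


def landis_koch(kappa: float) -> str:
    """Label a kappa value using the Landis and Koch bands quoted above."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical yes/no answers from ten respondents at test and retest.
test = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
retest = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohen_kappa(test, retest)
print(percent_agreement(test, retest), round(kappa, 2), landis_koch(kappa))
```

In this example nine of ten answers match (90 percent agreement), yet κ is about 0.78 because roughly 54 percent agreement would be expected by chance alone, which illustrates why the two statistics are reported together.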

Results
The results of this study are described separately for its validity and reliability subsections. The validity of the questionnaire, as assessed by the three content experts, is described in Table I. The I-CVI for individual items' acceptability was 1.0 for all items, which meant that all three experts were in complete agreement, indicating that the individual items were of relevance. The validity of the entire questionnaire was thus also found to be high, with the calculated content validity index score or S-CVI/Ave being 1.0. This high validity meant that at the end of the validity process, no questions were dropped from or added to the questionnaire, as it was deemed to be valid and fit for purpose.
The sample for assessment of reliability consisted of 50 respondents sampled from the chosen private primary care clinic. The characteristics of the sample are described in Table II. Malays made up the largest share of the sample (40 percent), and 66 percent of respondents had completed at least secondary school. Within the sample, 84 percent were within the third income quintile, with 56 percent being employed staff receiving a monthly wage.
Item completion for the test-retest segment of the study was satisfactory, as there was an interviewer who guided the respondents through the questionnaire. For analysis, the items were divided by the type of response, i.e. categorical or continuous in nature, as described in Tables III and IV, respectively. Consistency across the demographic
For the 19 continuous items of the questionnaire, there were no significant differences in mean values between the test and retest segments. ICC values for 16 of the items were high (ICC > 0.7); the three exceptions were the items relating to the health insurance premium amount, key consumables expenditure for the last month and key expenditures for the previous year.

Discussion
This study found that an HHE questionnaire containing 83 items, designed for use in urban Malaysian settings, was of high validity and reliability. A valid, reliable, detailed and specific questionnaire for the estimation of HHEs is crucial, especially in countries with OOP-dominated health financing such as Malaysia. This is because the only other method for estimating these expenditures is via the one or two questions found in the National Household Expenditure Surveys, whose results have been found to vary widely and have proved inconclusive in capturing true estimates of a household's health-related expenditure [17,18].
As affirmed by the three consulted experts, the questionnaire exhibited excellent content validity, indicating the relevance of the various items in determining actual HHE. This further authenticated this questionnaire, which had already been used in other settings and found to be of high validity [10,19]. Concerns have arisen pertaining to the number of experts required for validating a questionnaire, with some authors suggesting that when there are a small number of evaluators, the content validity index can be inflated merely by chance factors [20,21]. However, this shortcoming can be overcome by a high level of agreement between the evaluators in order to encompass all possible variations of rating, i.e. the I-CVI should be 1.00, bearing in mind that a minimum of three evaluators is needed [20,21]. This condition has been satisfied in this study, thus alleviating concerns about whether this validity assessment was in itself valid.
In terms of test-retest reliability, results from assessment of both the categorical and continuous items indicated that this questionnaire had adequate reliability to be used in determining HHE in Malaysia. The high consistency seen across test-retest results of the categorical items reflected the accuracy of these items: their answers were easily selectable from the options available on the questionnaire (such as demographic information), consisted of information that could be easily remembered (such as the name of the health provider and the distance from home to the provider), or concerned habitual behavior (such as preventive care seeking). In comparison, variability in agreement could be seen across items that concerned attitudes, assessment of quality and awareness, which were more dynamic in nature. This pattern of higher agreement values for habitual items and lower values for awareness/attitude items has been seen in other studies as well [22]. Test-retest results of continuous items also showed high agreement across the evaluations, with relatively lower correlation coefficients seen only in items that required distant recall, such as exact monetary values for expenditure over the last month or last year. This was most likely due to recall bias [22]. Despite this, the differences were non-significant, which further supports the reliability of the questionnaire.
Overall, caution should be observed in interpreting the results of this study due to its limitations. The sample for the test-retest reliability part of the study was not a representative sample of the urban Kuala Lumpur population, as it was carried out in only one center, and like all such surveys, it is inherently subject to biases such as recall bias, as outlined above. Despite these limitations, this study is an important one for the following reasons. First, this study assesses the validity and reliability of the first specific HHE questionnaire for Malaysia, finding it fit for purpose. Second, despite the exposure to certain unavoidable biases, the low variability seen in the results overall indicates that this questionnaire is strongly structured, cohesive and consistent.

Conclusion
This questionnaire will form the backbone of larger scale studies which will provide further information on health expenditure patterns of urban Malaysian households.A happy unintended consequence is that its validity and reliability will be further tested as well.
Determining household expenditure patterns is part of the essential building blocks toward developing a sustainable method of health financing which will help improve the health and well-being of all peoples, especially here in Malaysia.
