An adaptable end-to-end maintenance performance diagnostic framework

Purpose – This paper proposes a progressive, multi-level framework for diagnosing maintenance performance: rapid performance health checks of key performance for different equipment groups and end-to-end process diagnostics to further locate potential performance issues. A question-based performance evaluation approach is introduced to support the selection and derivation of case-specific indicators based on diagnostic aspects.
Design/methodology/approach – The case research method is used to develop the proposed framework. The generic parts of the framework are built on existing maintenance performance measurement theories through a literature review. In the case study, empirical maintenance data of 196 emergency shutdown valves (ESDVs) are collected over a two-year period to support the development and validation of the proposed approach.
Findings – To improve processes, companies need a separate performance measurement structure. This paper suggests a hierarchical model in four layers (objective, domain, aspect and performance measurement) to facilitate the selection and derivation of indicators, which could potentially reduce management complexity and help prioritize continuous performance improvement. Examples of new indicators are derived from a case study that includes 196 ESDVs at an offshore oil and gas production plant.
Originality/value – Methodological approaches to deriving various performance indicators have rarely been addressed in the maintenance field. The proposed diagnostic framework provides a structured way to identify and locate process performance issues by creating indicators that can bridge generic evaluation aspects and maintenance data. The framework is highly adaptive as data availability functions are used as inputs to generate indicators instead of passively filtering out non-applicable existing indicators.


Diagnostic framework of maintenance performance
1. Introduction
Productivity and profitability are key factors for companies to remain competitive in today's rapidly changing world. Concomitantly, maintenance is beginning to play a more important role in many production-related industries as the impact of maintenance performance on productivity and profitability has increased (Lado and Singh, 2019; Ismail et al., 2022). Therefore, measuring the performance of maintenance activities has become a crucial part of maintenance management. Evaluating maintenance performance supports asset managers and system owners in gaining knowledge about how the outputs of maintenance processes contribute to the business goal (Parida et al., 2015), which, in turn, drives the continuous improvement of maintenance processes and strategies (Choubey et al., 2021; Márquez, 2007).
The immense importance of maintenance performance tracking has generated increasing interest in the development of a maintenance performance measurement (MPM) framework. According to Parida et al.'s (2015, p. 15) definition, MPM is "the multidisciplinary process of measuring and justifying the value created by maintenance investment, and taking care of the organization's stockholders' requirements viewed strategically from the overall business perspective." As an important and integrated part of performance measurement, an MPM framework links organizational strategy to performance measurements using a list of indicators to set criteria (Kumar et al., 2013). Maintenance performance indicators (MPIs) are the building blocks of an MPM framework. They quantify maintenance performance as a measurable value and provide direct indications of whether the performance of maintenance activities meets the designated objectives. Well-defined MPIs can pave the way to the desired maintenance performance by supporting the identification of performance gaps (Muchiri et al., 2011). However, despite the numerous resources and efforts spent on development, MPM systems often do not have enough influence to trigger decision and process changes (Muchiri et al., 2010). On the one hand, in the vast majority of cases, performance measures are overloaded with technical indicators (Rybin et al., 2020). This increases the difficulty of performance management, leaving maintenance databases and indicators undocumented or unregulated (Parida et al., 2015). The implementation of maintenance performance management is seldom driven by process and demand changes (Wakiru et al., 2022), making existing MPM systems deviate from maintenance objectives and become less effective over time. Therefore, the measurement of these indicators is distributed into individual technical aspects and fails to contribute to an end-to-end view, which shows a complete performance evaluation of major maintenance processes from beginning to end. On the other hand, overall indicators are generally involved in multiple processes, aspects and roles, making it difficult to identify a specific issue and take concrete actions, leading to a loss of focus on continuous improvement (Barberá Martínez et al., 2017; van Horenbeek and Pintelon, 2014). Maintenance data, especially those of poor quality, have a strong influence on the capability of existing performance measures (Ge et al., 2023; Lukens et al., 2019). The misalignment between data availability and data requirements for common performance indicators can decrease the validity of existing MPM systems (Braz et al., 2011). Performance-related analysis becomes inconclusive when input data are not available to support the calculation, whereas potentially useful measures might remain undiscovered even though the input data are ready for use (Agergaard et al., 2021; Villarejo et al., 2017).
Numerous examples of MPM systems that have adopted categorical and hierarchical performance classification methods are found in the literature. In addition, recent studies have explored approaches for selecting performance indicators. However, an MPM framework that comprehensively assesses end-to-end processes and alignment between data availability and performance measures is still missing.

IJQRM
To address the research gaps mentioned above, the research question for this paper is formulated as follows: "How can maintenance performance be measured in a holistic end-to-end view by utilizing available data?" To answer the research question, this study developed a conceptual framework for the structured diagnostics of potential performance issues in an end-to-end maintenance setup. The diagnostic aspect of the framework highlights a progressive performance measurement approach, tracing the resulting performance issues down to specific maintenance processes in the domains of effectiveness, efficiency and compliance. The case study shows that the framework can guide the derivation of MPIs using available data on existing equipment. The proposed framework contributes to the scholarly knowledge of maintenance performance management and has the potential to support efficient performance diagnostics of existing production systems.
The rest of the paper is organized as follows. Section 2 provides a brief overview of the structure of existing MPM frameworks and MPI selection methods. Section 3 describes the case research and data collection methods, and Section 4 introduces the development of a two-step multi-level maintenance performance diagnostic framework. Section 5 demonstrates the derivation of case-specific maintenance performance indicators through case study examples, Section 6 discusses some of the observations from the case study, and Section 7 summarizes the research and provides academic and practical implications, limitations and suggestions for future work.
2. Literature review
2.1 Maintenance performance measurement frameworks
MPM methods and techniques have been investigated extensively in various industries. Previous studies on MPM frameworks categorized performance measures and indicators in different ways to associate them with their corresponding maintenance objectives. Campbell (1995) classifies commonly used maintenance performance measures into three categories: equipment performance, cost performance, and process performance. Tsang et al. (1999) introduce a holistic approach to establish maintenance performance measures using the well-known balanced scorecard, which translates a business unit's strategy around four perspectives: financial, customer, internal process, and learning and growth (Kaplan and Norton, 1996). In recent studies, various versions of the modified balanced scorecard have been developed for maintenance performance-related applications, focusing mainly on the inclusion of non-financial perspectives (Campos et al., 2017; Macián et al., 2019; Sirin et al., 2020; Tanoto et al., 2022), and alignment with objectives and information systems (Campos et al., 2017). The European standard for maintenance key performance indicators provides an organizational model of maintenance function, which is composed of six sub-functions: health-safety-environment, management, people competence, engineering, organization and support, and administration and supply (EN:15341, 2019). For each sub-function, a list of key performance indicators is given. Recent studies have suggested incorporating sustainability measures for performance measurement (Olugu et al., 2022). More categorization methods can be found in the review paper by Parida et al. (2015).
MPIs can also be categorized as leading and lagging indicators in a broader sense. According to Weber and Thomas (2006), leading indicators monitor whether maintenance processes are performed in a way that leads to expected results, whereas lagging indicators monitor the results achieved with maintenance activities. The authors further classify lagging indicators as cost, failure, and downtime measuring, whereas leading indicators are classified as six types of measures, each corresponding to one maintenance process. Muchiri et al. (2010) proposed a similar yet different classification methodology for leading and lagging indicators. In this proposal, work identification, work planning and scheduling, and work execution are the three sub-categories under leading indicators, whereas lagging indicators are further categorized as equipment effectiveness, maintenance cost-effectiveness, and safety-environment indicators.
To better align indicators with specific purposes and/or users of performance measurements, it is also common to formulate indicators at different levels (Kumar et al., 2013; Wireman, 2005). Such multi-criteria hierarchical MPM frameworks have two types of structures. Some studies suggest that the same measuring criteria or perspective can be applied to all hierarchies. Parida and Chattopadhyay (2007) propose a multi-criteria hierarchical MPM framework with three vertical levels: strategic, tactical, and functional. MPIs at lower levels aggregate at the strategic level for each measuring criterion. Galar et al. (2011) present a five-level hierarchical model of an integrated maintenance balanced scorecard, linking the four perspectives to each organizational level and their corresponding objectives. Lai and Man (2018) introduce a phase-hierarchy model for performance indicator classification, integrating the two performance measurement dimensions into a three-by-three matrix. Indicators are classified by hierarchical level and service delivery phase. Meanwhile, proposals for different measuring criteria for each hierarchy have been observed in recent studies. Naji et al. (2019) develop a multi-level, multi-criteria decomposition approach to classify MPIs in terms of six strategic aspects, each of which can be broken down into its own hierarchy. Lundgren et al. (2020) introduce a multi-criteria hierarchical framework to measure the performance of smart maintenance. Based on the anticipated impacts of smart maintenance, the authors adopted three classified sets of performance criteria in relation to firm, plant, and individual levels (Bokrantz et al., 2020). A brief review of how performance indicators are chosen in the maintenance field and industrial sectors is provided in the following section.

2.2 Choice of performance indicators
The issue in performance indicator selection is essentially a multiple criteria decision-making (MCDM) problem. The majority of the identified literature applies quantitative ranking or prioritization methods to select indicators. The analytical hierarchy process (AHP) method, one of the most used MCDM methods in performance indicator selection, is designed to solve complex multiple criteria problems by providing prioritization and incorporation for criteria that are assumed to be independent (Saaty and Sodenkamp, 2010). Elhuni and Ahmad (2017) propose an AHP-based approach to prioritize sustainable production performance indicators in the oil and gas industry, while Nam et al. (2019) utilize AHP to prioritize and select a set of performance indicators for evaluating sanitary sewer systems. As an extension of AHP, the analytic network process (ANP) does not assume the independence of criteria (Saaty, 2004). Van Horenbeek and Pintelon (2014) select MPIs by using the ANP to prioritize maintenance objectives at all organizational levels and derive MPIs that are linked to business-specific maintenance objectives. Another method, the decision-making trial and evaluation laboratory (DEMATEL), is also used to examine the interdependency of criteria in MCDM problems (Alinezhad and Khalili, 2019). Aiello et al. (2021) investigate the degree of internal relations among MPIs and select a representative set to monitor preventive maintenance efficiency using the DEMATEL method. Furthermore, Maduekwe and Oke (2021) compare three DEMATEL-based methods in an MPI prioritization and association case in the food processing industry.
Expert opinions were collected through pairwise comparison questionnaires and used as input for the studies listed above. However, the vagueness and uncertainty of human judgment can introduce inconsistencies between judgment and ranking criteria in real-life problems (Nam et al., 2019). Some researchers have introduced fuzzy logic to overcome these issues. Stefanovic et al. (2017) present a ranking and assessment approach for maintenance process, cost, and equipment indicators. The authors applied fuzzy sets to calculate the weight values for a group of indicators at a Serbian metal processing plant, followed by a genetic algorithm that ranked the indicators. Naji et al. (2019) quantify elementary maintenance performance measurement with fuzzy logic before introducing the AHP for prioritizing and aggregating elementary indicators. In another case involving an oil refinery plant, Maria and Manuela (2017) prioritize MPIs by combining fuzzy logic with the AHP and the technique for order of preference by similarity to ideal solution (TOPSIS) methods. Furthermore, Gonçalves et al. (2015) suggest selecting MPIs by applying the elimination and choice expressing reality (ELECTRE) method, which enables the handling of heterogeneous scales among criteria, thus allowing the candidates to maintain their original concrete verbal meaning.
Based on the reviewed studies, the authors of this paper summarize the process of choosing a set of MPIs for an MPM framework in three sequential steps: identifying potentially relevant indicators, screening indicators according to the organizational context, and prioritizing indicators with high impacts. The number of MPIs listed in the literature is substantial, yet a widely agreed-upon methodology for deriving the indicators is still unavailable (Kumar et al., 2013, 2018; Maria and Manuela, 2017), leaving a key knowledge gap in MPI formulation. Although the reviewed literature focuses strongly on the ranking methodologies of performance indicators based on aspects of performance evaluation, the reasoning for why one indicator is more relevant than another in the context remains implicit. Moreover, it has been observed that the data collected for performance measurement are not adequately used in decision support (Muchiri et al., 2010). On the one hand, the large amount of data collected becomes a problem, as they require more sophisticated methods and algorithms to elicit useful information (Villarejo et al., 2017; Wakiru et al., 2022). On the other hand, the reviewed indicator selection methods take a one-way path from identifying to screening MPIs. Data availability is considered a filter for eliminating non-applicable indicator candidates rather than an input to derive potential indicators. Potential indicator candidates are mostly gathered through literature searches and industrial practices, leaving some of the available and potentially useful data behind. Consequently, the links between selected MPIs and the focus on future performance improvement weaken. Despite the various methods for selecting and prioritizing indicators identified in the literature, there is a lack of research on developing comprehensive measures for evaluating maintenance processes that consider the entire cycle and are less sensitive to data availability. To address these issues, a performance diagnostic framework that creates a mutual connection between performance aspects and maintenance data is proposed.

3. Research method
3.1 Research aim
The literature review above shows research gaps in the systematic derivation of MPIs and the development of generic MPM frameworks to cover the complete maintenance process flow of existing equipment. Furthermore, it is not clear how existing maintenance data can be used to facilitate the formulation of MPIs. To tackle these issues, this study presents the concept of a generic and adaptive maintenance diagnostic framework through case research, thus contributing to the literature on the performance measurement of maintenance activities. Such considerations form the basis of the research question of this paper: How can maintenance performance be measured in a holistic end-to-end view by utilizing available data?

The research presented in this paper is based on a literature study and a case study, following a prescriptive research approach. The proposed maintenance diagnostic framework extends the knowledge in the current MPM-related literature.

3.2 The case study method
In accordance with the concept development presented in the following section, a case study was conducted to validate the proposed approach. This case study was designed following the five-stage research process model (Stuart et al., 2002). This research uses a single case study setup because it enables in-depth observation of phenomena in exploratory investigations and provides an opportunity to access multiple contexts within the case (Barratt et al., 2011; Meredith, 1998). The in-depth single case study allows the authors to develop a comprehensive and adaptive performance diagnostic framework for maintenance work at an existing offshore oil and gas plant. The limitations of a single case study include the risk of misjudging the representativeness of a single event and reduced generalizability (Voss et al., 2002). Introducing other case companies in a study is a common approach to improve generalizability (Eisenhardt, 1991).

3.3 Data collection and analysis
The case study was conducted on 196 emergency shutdown valves (ESDVs) at an offshore production plant of a multi-national company in the oil and gas exploration and production industry. This research primarily uses objective data from the company's database to support concept development and validation. The objective data include quantitative and qualitative historical maintenance data collected over a period of two years, from April 2017 to March 2019. Inputs from maintenance experts were gathered through meetings, semi-structured interviews, and workshops.
This study applies several techniques to ensure validity and reliability in data collection and analysis, so that the research outcomes are rigorous and relevant. Reliability is ensured by applying multiple data collection methods, documenting how the case study was conducted, and developing and maintaining a database for the case study (Ellram et al., 2020). The data collection methods applied in this study are listed in Table 1. Specifically, the maintenance records used in this paper are secondary data from a structured maintenance data model. The records were directly extracted through the case company's secured computerized maintenance management system (CMMS) and contextualized in the data model to allow scoping in selected equipment categories. The records were inspected in the data model on an aggregated level to eliminate possible errors during extraction. Incomplete records were found from the original extractions and kept in order to examine the adaptability of the framework. Outliers, such as empty or rejected work orders, were removed before the analysis. Maintenance work principles and guidelines were gathered from the case company's internal documentation. Other qualitative data collected for this study were stored as tables and documents in digital formats. To ensure construct validity, data triangulation was applied using archival data, workshop inputs and interview data. The analysis of maintenance records was primarily carried out in the business intelligence software QlikView. Key calculations were repeated to confirm that the results are replicable. Maintenance experts in various roles at the case company were invited to review the drafts of the framework through several iterations. To ensure internal validity, the diagnostic framework was developed based on the literature (Birolini, 1994; Campbell, 1995; Muchiri et al., 2011; Nielsen, 1997; Sigsgaard et al., 2020; Weber and Thomas, 2006). Pattern matching and explanation building were performed throughout the concept development following Karlsson (2016). In this research, the case study was conducted at one company due to resource limitations, which has a negative impact on external validity. The external validity issue is mitigated by inviting maintenance experts to review the proposed framework. System owners, as well as maintenance operations and process improvement personnel, were invited throughout the case study to validate the results.

4. Development of maintenance performance diagnostic framework
4.1 Structure of maintenance performance diagnostic framework
Measuring maintenance performance requires time and effort. A detailed investigation of maintenance performance usually requires data from multiple sources. In some cases, data collection, data processing, and text analysis require manual work to update performance results. The lack of automated data for the knowledge process gives rise to the need to conduct an analysis that takes days to months (Parida et al., 2015). To reduce the time and labor costs of regular performance tracking, the proposed maintenance performance diagnostic framework is designed in two parts, as illustrated in Figure 1.
The first part of the framework focuses on the action "detect," namely, detecting the existence of potential performance issues by measuring maintenance outcomes. For equipment types or maintenance strategies with satisfactory maintenance outcomes, additional diagnostics are not necessary; otherwise, an end-to-end investigation at the process level is carried out in the second part. The second part focuses on the action "locate," namely, locating potential maintenance performance issues down to the process level. The proposed multi-criteria framework has a hierarchical structure at the first three levels (diagnostic objectives, diagnostic domains, and diagnostic aspects). A top-down approach is used to properly formulate the performance measurement structure at these levels. Level 1 determines the top-level objective for the performance diagnostics for each part. Level 2 defines the major performance diagnostic viewpoints under level 1 objectives as domains. Level 3 further expands the domains into categorical diagnostic aspects. The performance result for each thematic category from levels 1-3 is represented by a qualitative indicator, which aggregates the corresponding indicators at the lower level. Level 4 (performance measurement) links MPIs to relevant diagnostic aspects at level 3, depending on the data availability of a specific case. The rest of this section explains levels 1-3 for both parts. Case study examples of level 4 are presented in the next section.
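The four-level hierarchy and the "detect, then locate" gating can be sketched as a small data structure. This is an illustrative sketch only: the class names, the boolean encoding of indicator results, and the "all lower-level results must be satisfactory" aggregation rule are assumptions made here, not definitions taken from the framework.

```python
from dataclasses import dataclass, field


@dataclass
class Aspect:
    """Level 3: a diagnostic aspect holding its level-4 indicator results."""
    name: str
    # Hypothetical encoding: indicator name -> True if the result is satisfactory.
    indicators: dict[str, bool] = field(default_factory=dict)

    def satisfactory(self) -> bool:
        # Assumed aggregation rule; an aspect without indicators counts as satisfactory.
        return all(self.indicators.values())


@dataclass
class Domain:
    """Level 2: a diagnostic domain grouping several aspects."""
    name: str
    aspects: list[Aspect]

    def satisfactory(self) -> bool:
        return all(a.satisfactory() for a in self.aspects)


@dataclass
class Part:
    """Level 1: one part of the framework with its top-level objective."""
    objective: str
    domains: list[Domain]

    def satisfactory(self) -> bool:
        return all(d.satisfactory() for d in self.domains)


def needs_process_diagnostics(part1: Part) -> bool:
    """Part 2 ("locate") is triggered only when part 1 ("detect") is unsatisfactory."""
    return not part1.satisfactory()


# Hypothetical part-1 result for one equipment group.
part1 = Part(
    objective="detect",
    domains=[
        Domain("overall equipment performance",
               [Aspect("reliability", {"mtbf_on_target": True}),
                Aspect("availability", {"availability_on_target": False})]),
    ],
)
print(needs_process_diagnostics(part1))
```

A weighted or traffic-light aggregation could replace the boolean rule without changing the structure.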
4.1.1 Part 1: rapid check of performance health status. As mentioned in the previous section, the purpose of designing the diagnostic framework as having two parts is to reduce the time and effort spent on regular performance tracking. The first part of the framework functions as a rapid health check of the recent maintenance outcomes as a whole, which indicates whether detailed diagnostics at the maintenance process level are required. Therefore, the objective of this part is to perform overall performance diagnostics using as few maintenance result (lagging) indicators as possible. Based on Campbell's (1995) and Muchiri et al.'s (2011) classification methods, the proposed framework performs overall performance diagnostics in two domains: overall equipment performance and overall maintenance performance.
Overall equipment performance shows the functionality of equipment procured by maintenance actions. Overall equipment performance can be measured by reliability, availability, maintainability, and safety (RAMS), which consists of a widely used set of lagging indicators (Warsokusumo et al., 2021). Summarizing the definitions of RAMS by Birolini (1994), Gulati (2013), and Warsokusumo et al. (2021), reliability is the probability of an item being able to perform its intended functions in specified periods and conditions, usually measured with the mean time between failures (MTBF) for repairable items and the mean time to failure (MTTF) for non-repairable systems. Maintainability is the ability of an item to be restored or retained in a certain condition, usually measured with the mean time to repair (MTTR). Availability is a function of reliability and maintainability, measured with the degree to which an item can realize its intended function at an unspecified time. In this study, reliability, availability, and maintainability are categorized as diagnostic aspects at level 3. Safety is not within the scope of this study, but can be included if necessary.
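As a rough illustration of how the three level-3 aspects relate numerically, the sketch below computes MTBF, MTTR, and availability for one repairable item. It assumes the common steady-state relation availability = MTBF / (MTBF + MTTR), which is not stated in this paper, and all numbers are invented for illustration rather than taken from the case data.

```python
import math


def mtbf(operating_hours: float, n_failures: int) -> float:
    """Mean time between failures for a repairable item."""
    if n_failures == 0:
        return math.inf  # no failure observed in the window
    return operating_hours / n_failures


def mttr(repair_hours: list[float]) -> float:
    """Mean time to repair, averaged over the recorded repairs."""
    if not repair_hours:
        raise ValueError("at least one repair duration is required")
    return sum(repair_hours) / len(repair_hours)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Assumed steady-state relation: MTBF / (MTBF + MTTR)."""
    if math.isinf(mtbf_hours):
        return 1.0
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical observation window: 1000 operating hours, four repairs.
repairs = [2.0, 4.0, 6.0, 8.0]
item_mtbf = mtbf(1000.0, len(repairs))  # 250.0 hours
item_mttr = mttr(repairs)               # 5.0 hours
print(item_mtbf, item_mttr, round(availability(item_mtbf, item_mttr), 4))
```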
Overall performance diagnostics measure the cost and proactivity of maintenance at the aspect level. Maintenance costs can be measured in direct and indirect ways. Direct maintenance cost indicators are measured in monetary values, such as the total maintenance cost and the cost per unit of product (maintenance intensity). Indirect  (Muchiri et al., 2011). Maintenance proactivity indicates the ability of preventive maintenance work to reduce the need for corrective maintenance. The amount of corrective maintenance should be kept at a reasonable level to avoid disturbance of scheduled maintenance work. Overloaded corrective maintenance work compresses the work capacity for preventive maintenance, which consequently creates backlogs and leads to a higher risk of new equipment failures. The International Organization for Standardization (2016) provides three examples of key performance indicators that are relevant to maintenance proactivity in the petroleum, petrochemical, and natural gas industries: the preventive maintenance work-hours ratio, the corrective maintenance work-hours ratio, and the corrective maintenance workload.
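The two work-hours ratios can be sketched from a list of work orders. The definitions used here (preventive or corrective hours divided by total maintenance hours) are an assumption for illustration and may differ from the exact formulations in the standard; the record layout and the numbers are hypothetical.

```python
def work_hours_ratios(work_orders: list[tuple[str, float]]) -> dict[str, float]:
    """Return assumed PM and CM work-hours ratios.

    Each work order is a hypothetical (type, hours) pair, where type is
    "PM" (preventive) or "CM" (corrective).
    """
    pm_hours = sum(hours for kind, hours in work_orders if kind == "PM")
    cm_hours = sum(hours for kind, hours in work_orders if kind == "CM")
    total = pm_hours + cm_hours
    if total == 0:
        raise ValueError("no maintenance hours recorded")
    return {"pm_ratio": pm_hours / total, "cm_ratio": cm_hours / total}


# Invented example: 8 preventive hours and 2 corrective hours.
orders = [("PM", 6.0), ("CM", 2.0), ("PM", 2.0)]
print(work_hours_ratios(orders))
```

A rising corrective ratio over consecutive periods would be one possible trigger for the process-level diagnostics in part 2.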
Part 1 of the proposed framework provides a rapid, high-level check of performance health on a regular basis. Therefore, it is important to ensure that all data used for performance measurements are easy to retrieve and able to support automated updates.
4.1.2 Part 2: locating performance issues. Well-performed maintenance processes lead to the desired production results. On the flip side, poor maintenance results can indicate a loss of quality in one or more maintenance processes. Process performance diagnostics, as the second part of the diagnostic framework, are applied when unsatisfactory maintenance outcomes are detected in a rapid performance health check. The main objective of this part is to find out which processes are causing the issues and in what way.
The three domains of process performance diagnostics are defined as follows:
(1) Process effectiveness: The effectiveness domain measures the degree to which the maintenance objectives of the corresponding maintenance process are achieved.
(2) Process efficiency: The efficiency domain measures the degree to which maintenance processes are carried out in a highly productive manner.
(3) Process compliance: The compliance domain measures the degree to which actual maintenance processes comply with designated routines, procedures, or guidelines.
Note that effectiveness, efficiency, and compliance are three independent diagnostic domains. The results for one domain do not necessarily lead to certain results in another domain. For instance, a maintenance process performed with poor compliance can still be effective and efficient, which highlights a potential best practice not yet identified in the current guidelines. On the other hand, a maintenance process carried out with good compliance and efficiency is not guaranteed to be effective if the maintenance strategy is not optimal or up to date. Together, the three independent diagnostic domains provide comprehensive coverage of process performance measures.
The process diagnostic aspects are represented by the maintenance management process, which is also referred to as the maintenance work process, maintenance process, or maintenance effort. Nielsen (1997) introduces six basic maintenance work processes for the strategic management of commercial nuclear power stations, and their use has been expanded into various industries, including maintenance for oil and gas production. Based on a cluster of literature, the definitions of the six maintenance management process steps in end-to-end maintenance are as follows (Muchiri et al., 2010; Sigsgaard et al., 2020; Weber and Thomas, 2006):

(1) Identification: Preventive maintenance (PM) and corrective maintenance (CM) jobs start with the identification process. In this step, the need for maintenance actions is identified, and notice is made. For corrective maintenance, the process is triggered by a failure that affects or will affect the intended function of the equipment. Comprehensive information about the failure is gathered and reported for decision support in the following processes. For preventive maintenance, this step identifies the need for proactive maintenance tasks according to the system and equipment.
(2) Prioritization: Maintenance jobs are assessed based on their importance and assigned a priority. Prioritization ensures that highly critical work is planned, scheduled, and executed in a timely manner so that the risk of severe failure consequences can be reduced.
(3) Planning: The planning step determines concrete maintenance tasks according to equipment and failure information. According to these maintenance tasks, resources are estimated and allocated to the jobs in terms of material, personnel competency, and time consumption. The planning step ensures that all necessary resources for execution are considered.
(4) Scheduling: Technically and financially approved jobs are then sent to the scheduling step. This step evaluates the availability of the required resources and schedules jobs for execution according to resource availability and job priority.
(5) Execution: Maintenance tasks required in the jobs are carried out by trained maintenance technicians.
(6) Close-out: The close-out step collects technical and business information from the execution process. The gathered information is documented and used for continuous improvement.
By integrating the maintenance management processes and diagnostic domains as two process evaluation dimensions, a domain-process model is developed, as shown in Figure 2.
Each cell in this model corresponds to a process diagnostic aspect at level 3. These diagnostic aspects are the basis for creating key diagnostic questions and deriving MPIs, which are introduced in the next section.
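The grid structure of the domain-process model can be illustrated with a minimal Python sketch. The six process steps come from the list above; the three domain names are assumptions chosen only for illustration (the actual diagnostic domains are those shown in Figure 2), and the function name `build_domain_process_model` is hypothetical.

```python
from enum import Enum
from itertools import product


class ProcessStep(Enum):
    """The six maintenance management process steps."""
    IDENTIFICATION = 1
    PRIORITIZATION = 2
    PLANNING = 3
    SCHEDULING = 4
    EXECUTION = 5
    CLOSE_OUT = 6


# Assumed domain names, for illustration only; the real level-2
# diagnostic domains are defined by the framework (Figure 2).
DOMAINS = ("compliance", "efficiency", "effectiveness")


def build_domain_process_model(domains=DOMAINS):
    """Return one level-3 diagnostic aspect per (domain, process step) cell."""
    return {
        (domain, step): f"{step.name.lower().replace('_', '-')} {domain}"
        for domain, step in product(domains, ProcessStep)
    }


model = build_domain_process_model()
print(len(model))
print(model[("compliance", ProcessStep.EXECUTION)])
```

Each dictionary key is one cell of the grid, so key diagnostic questions and MPIs can later be attached to a cell without changing the generic structure.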

Deriving maintenance performance indicators
The first three levels of the proposed diagnostic framework are generic and can be utilized for maintenance management in various industries. The fourth level, performance measurement, is case-specific, depending on the context, including data availability, depth of diagnostics, and other considerations. To integrate the context into the derivation process of MPIs, this section introduces a question-based performance measurement approach to support the derivation of MPIs for specific cases. An overview of how the question-based performance measurement approach supports the bridging of diagnostic aspects and maintenance data is shown in Figure 3. The MPIs are derived through three steps.

4.2.1 Step 1: interpret diagnostic aspects by proposing key diagnostic questions. Each performance diagnostic aspect, as shown in Figure 2, is first interpreted with a few case-specific key performance diagnostic questions that determine the degree of completion of the aspect. The questions should point to the performance information that is most relevant to the aspect. Only the most representative questions should be proposed, preferably no more than three for each aspect. The questions should be phrased in plain and easy-to-understand language. Importantly, a proposed question should target the performance of only one aspect when possible. If a proposed question involves multiple aspects, the question should be decomposed to match each aspect or substituted with another measurement method. At the end of this step, a full list of performance diagnostic questions is obtained.
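The two rules of thumb in this step (at most three questions per aspect, one aspect per question) lend themselves to a simple automated review. The following Python sketch is one possible way to do that; the `Question` class, the `review_questions` function, the aspect labels, and the question with id "X.1" are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_QUESTIONS_PER_ASPECT = 3  # "preferably no more than three for each aspect"


@dataclass(frozen=True)
class Question:
    qid: str        # question identifier, e.g. "3.6"
    text: str       # plain-language wording of the question
    aspects: tuple  # diagnostic aspects the question targets


def review_questions(questions):
    """Return (ids of questions spanning several aspects, overcrowded aspects)."""
    per_aspect = defaultdict(list)
    coupled = []
    for q in questions:
        if len(q.aspects) > 1:
            coupled.append(q.qid)
        for aspect in q.aspects:
            per_aspect[aspect].append(q.qid)
    crowded = sorted(
        aspect for aspect, ids in per_aspect.items()
        if len(ids) > MAX_QUESTIONS_PER_ASPECT
    )
    return coupled, crowded


questions = [
    Question("3.6", "How many maintenance jobs are not performed within "
                    "the scheduled time window?", ("execution compliance",)),
    Question("3.1", "Is PM work identified following maintenance strategies?",
             ("identification compliance",)),
    # Hypothetical coupled question, included only to show the check.
    Question("X.1", "How quickly are failures reported and prioritized?",
             ("identification efficiency", "prioritization efficiency")),
]

coupled, crowded = review_questions(questions)
print(coupled, crowded)
```

Questions flagged as coupled are candidates for decomposition or for substitution with another measurement method.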

4.2.2 Step 2: link key diagnostic questions and available data. The second step is to establish links between the key diagnostic questions and available data. In this step, all the data that can potentially be utilized to answer the questions are identified and tagged. The term data refers to all the accessible information related to maintenance performance measurement, which includes, but is not limited to, historical maintenance work records, equipment information, and maintenance strategy documentation. The key diagnostic questions and data usually have a one-to-many relationship; that is, one question is linked to multiple data fields. The questions can be rated in three categories in terms of the effort required to obtain the answers, formulation, and type of associated data, as follows:
(1) Type A questions: Questions that can be answered in a quantitative and automated way. Only simple calculations are required for the corresponding performance measurements, which can be realized with business intelligence software. The results can be easily updated by importing new input data.
(2) Type B questions: Questions that can be answered in an automated way, but the corresponding performance measurement is complex and needs further definition. Dedicated algorithms are most likely required for the calculation, which demands a one-time effort for its design and realization. The results can be updated by importing new input data once the algorithm is applied.
(3) Type C questions: Questions that can be answered only in a qualitative way. The corresponding performance evaluation criteria are not quantifiable. Instead, they must be determined manually. The data input involves non-standardized text fields that cannot be processed with numerical calculations.
It is evident that Type A questions are the most preferable, whereas Type C questions are the least preferable in terms of the time and effort required to conduct performance measurement. Therefore, the balance between comprehensiveness and complexity for each performance measurement should be carefully considered in this step. Some of the questions can be reformulated at this step, depending on their type, the purpose of the performance diagnostic, and the resources available.
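The three question types can be approximated with a simple decision rule over the data fields tagged for a question. The Python sketch below is a deliberate simplification and not part of the proposed framework: it assumes that any non-standardized text field forces a Type C rating and that the need for a dedicated algorithm is judged by the analyst and passed in as a flag. The names `DataField` and `classify_question` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class QuestionType(Enum):
    A = "quantitative, simple calculation"
    B = "automated, dedicated algorithm required"
    C = "qualitative, manual evaluation"


@dataclass(frozen=True)
class DataField:
    name: str
    standardized: bool  # False for free-text fields


def classify_question(fields, needs_dedicated_algorithm):
    """Rate a question by the effort needed to answer it (simplified rule)."""
    if any(not f.standardized for f in fields):
        return QuestionType.C  # free text must be evaluated manually
    if needs_dedicated_algorithm:
        return QuestionType.B  # automatable after a one-time design effort
    return QuestionType.A      # simple calculation on structured fields


# Question 3.6: five date fields, simple calculation.
date_fields = [DataField(n, True) for n in ("SESD", "ESD", "SLFD", "EFD", "COD")]
# Question 3.1: involves non-standardized PM task list text.
text_fields = [DataField("identified PM task list", False)]

print(classify_question(date_fields, needs_dedicated_algorithm=False))
print(classify_question(text_fields, needs_dedicated_algorithm=False))
```

The ordering of the checks reflects the preference stated above: a question is rated Type A only when neither manual work nor a dedicated algorithm is needed.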

4.2.3 Step 3: specify maintenance performance indicators. Maintenance performance indicators are specified in this step based on the key diagnostic questions and their associated maintenance data. A list of existing MPIs from the case organization should be summarized before this step, when applicable. This list should first be checked to see if measurement with an existing indicator can provide the answer to a proposed key diagnostic question. Such indicators, if found, should be directly adopted in the diagnostic framework as a case-specific MPI at level 4 to avoid unnecessary work in deriving new indicators. If none of the existing performance indicators can be used as a response to a key diagnostic question, a new performance measure should be formulated using the data fields tagged for this question in the previous step. When deriving a new performance measure, it is important to use only the most relevant data fields as input and keep the calculation of performance indicators as simple as possible. In some cases, a performance measure cannot be formulated for various reasons, such as a lack of data or overly complex calculations. In such situations, the key diagnostic question should be revised to ease the issue but without compromising the representativeness of its corresponding diagnostic aspect. For diagnostic aspects that cannot yield any performance measures due to data availability, the aspects should be omitted with a note about the most critical data required for calculation.
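The decision logic of this step (adopt an existing indicator, derive a new one, or revise/omit with a note on missing data) can be sketched as follows. This is an illustrative outline under stated assumptions: matching an existing MPI to a question is assumed to have been done by the analyst and supplied as a mapping, and the names `Decision` and `specify_mpi`, as well as the questions "9.8" and "9.9", are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    action: str                      # "adopt", "derive", or "revise_or_omit"
    indicator: Optional[str] = None  # name of the adopted existing MPI
    note: str = ""


def specify_mpi(question_id, existing_mpis, tagged_fields, available_fields):
    """Decide how a key diagnostic question obtains its level-4 MPI.

    existing_mpis maps a question id to an existing MPI judged to answer it.
    """
    # 1. Prefer an existing indicator to avoid unnecessary derivation work.
    if question_id in existing_mpis:
        return Decision("adopt", indicator=existing_mpis[question_id])
    # 2. Otherwise derive a new measure from the tagged data fields.
    missing = [f for f in tagged_fields if f not in available_fields]
    if tagged_fields and not missing:
        return Decision("derive", note="new measure from: " + ", ".join(tagged_fields))
    # 3. No measure possible: revise the question or omit the aspect,
    #    noting the most critical data that is lacking.
    reason = ", ".join(missing) if missing else "no data fields tagged"
    return Decision("revise_or_omit", note="critical data missing: " + reason)


fields_3_6 = ["SESD", "ESD", "SLFD", "EFD", "COD"]
d_derive = specify_mpi("3.6", {}, fields_3_6, set(fields_3_6))
# Hypothetical questions, used only to exercise the other two branches.
d_adopt = specify_mpi("9.8", {"9.8": "hypothetical existing MPI"}, [], set())
d_omit = specify_mpi("9.9", {}, ["torque log"], set())

print(d_derive.action, d_adopt.action, d_omit.action)
```

Whether a question in the third branch is revised or its aspect omitted remains a judgment call; the sketch only records which data would be needed.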

Case study
The maintenance performance diagnostic framework proposed in the previous section is partially applied in a case study. Specifically, the overall performance diagnostics part of the proposed framework is not illustrated in the case study because it shares a hierarchical structure similar to the other part presented in this section. In addition, as introduced in Section 4.1.1, the majority of the MPIs for the overall performance diagnostics have been well defined and applied based on consensus in the literature and many industries. Therefore, this section focuses on the process performance diagnostics and the derivation of MPIs based on the domain-process model (Figure 2).
The case study is conducted on the maintenance activities of ESDVs at an offshore oil and gas extraction and production plant. The maintenance data gathered for this study include historical maintenance records, maintenance work principles, and guidelines. Maintenance records from PM and CM are collected from April 2017 to March 2019 based on the actual start date of the maintenance work. Up to 165 maintenance records are obtained within the two-year period, covering 196 valves in total.
The case study examples presented in this section explain how case-specific MPIs on level 4 are derived. Following the question-based performance measurement approach proposed in Section 4.2, the first step is to interpret the level 3 diagnostic aspects by proposing key performance diagnostic questions. One to three questions are proposed for each aspect of the domain-process model. These questions are reviewed by a group of maintenance experts from the case company and revised in several rounds based on their feedback. Figure 4 shows the final list of 23 key performance diagnostic questions sorted by the process diagnostic domain (level 2) and aspect (level 3). These questions are then linked to the relevant data that are available and categorized as Type A, B, or C (listed in Figure 4) based on the estimated effort to accomplish performance measurement. Considering the extensiveness of the performance measures involved, the rest of this section demonstrates the derivation of case-specific MPIs through two examples, one with numeric inputs and one with text inputs.
Example 1. Deriving MPIs for compliance with the execution time window
Question 3.6, "How many maintenance jobs are not performed within the scheduled time window?", is linked to five data fields that are potentially relevant to the performance measurement, as shown in Table 2. The five data fields, namely scheduled earliest start date (SESD), execution start date (ESD), scheduled latest finish date (SLFD), execution finish date (EFD), and close-out date (COD), are retrieved from historical maintenance records in date format, which can be represented numerically. As no manual work is required to read or analyze the data, question 3.6 is classified as Type A. As none of the existing MPIs can be utilized as a performance measure to answer this question, a new performance measure, the time window compliance rate, is formulated. For each historical maintenance record, a time window compliance status is defined from these date fields. The close-out date is used when the execution finish date is missing. The time window compliance rate for all records is defined as:
$$\text{Time window compliance rate} = \frac{\text{number of time window compliance records}}{\text{number of total records}} \times 100\%$$
The time window compliance rate is the derived MPI for the execution compliance aspect.
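A minimal Python sketch of the time window compliance rate is given below. The per-record compliance rule is an assumption made for illustration (execution starts no earlier than SESD and finishes no later than SLFD), as is the treatment of records with neither EFD nor COD as non-compliant. The class and function names are hypothetical, and the four records are made-up values, not case data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MaintenanceRecord:
    sesd: date                  # scheduled earliest start date
    esd: date                   # execution start date
    slfd: date                  # scheduled latest finish date
    efd: Optional[date] = None  # execution finish date (may be missing)
    cod: Optional[date] = None  # close-out date, fallback for a missing EFD


def is_time_window_compliant(record):
    """Assumed rule: start on or after SESD and finish on or before SLFD."""
    finish = record.efd if record.efd is not None else record.cod
    if finish is None:
        # Assumption: without any finish information, compliance is not credited.
        return False
    return record.sesd <= record.esd and finish <= record.slfd


def time_window_compliance_rate(records):
    """Share of compliant records, in percent."""
    if not records:
        raise ValueError("at least one record is required")
    compliant = sum(is_time_window_compliant(r) for r in records)
    return 100.0 * compliant / len(records)


# Made-up records for illustration only.
records = [
    MaintenanceRecord(date(2018, 1, 1), date(2018, 1, 3), date(2018, 1, 31),
                      efd=date(2018, 1, 10)),   # within the window
    MaintenanceRecord(date(2018, 2, 1), date(2018, 2, 5), date(2018, 2, 28),
                      efd=date(2018, 3, 2)),    # finished late
    MaintenanceRecord(date(2018, 3, 1), date(2018, 3, 2), date(2018, 3, 31),
                      cod=date(2018, 3, 20)),   # EFD missing, COD used
    MaintenanceRecord(date(2018, 4, 10), date(2018, 4, 5), date(2018, 4, 30),
                      efd=date(2018, 4, 12)),   # started too early
]

rate = time_window_compliance_rate(records)
print(rate)
```

Because all inputs are structured date fields, the calculation can be rerun whenever new records are imported, which is what makes the question Type A.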
Example 2. Deriving MPIs for PM identification compliance
Question 3.1, "Is PM work identified following maintenance strategies?", is complicated and classified as Type C, as it requires manual extraction and comparison of non-standardized text fields from different data sources. For each maintenance interval, the PM activities described in the strategies should be reflected in the identified PM task lists from historical maintenance records. The data fields that are potentially related to this question are listed in Tables 3 and 4. The data in Table 3 are manually summarized from maintenance strategy documents. Equipment type and maintenance interval are used to match strategies to corresponding maintenance records. Reference is not critical for evaluating strategy compliance; thus, it is not used in formulating the performance measure. Manual text analysis is conducted on the identified PM task lists extracted from the maintenance records to determine whether the activities defined in the strategies have been listed, as shown in Table 4. For a given individual record, the conclusion is either "compliant" or "unknown." Therefore, the PM strategy compliance rate for all records is defined as follows:
$$\text{PM strategy compliance rate} = \frac{\text{number of PM strategy compliance records}}{\text{total number of records}} \times 100\%$$
The PM strategy compliance rate is the derived MPI and is used to answer question 3.1.
In summary, the case study demonstrates the derivation of case-specific MPIs from the generic aspects in the maintenance diagnostic framework using the question-based performance measurement approach. The generic process diagnostic aspects at level 3 are interpreted with a list of key performance diagnostic questions according to the use case and data availability. Two case study examples are presented to show how process diagnostic MPIs can be created under the guidance of diagnostic questions in real-life cases.
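For Example 2, the text comparison itself is manual, but the final aggregation into the PM strategy compliance rate is a simple calculation. The Python sketch below assumes that the manual review has already produced one conclusion per record, either "compliant" or "unknown"; the function name and the sample labels are hypothetical and do not represent case data.

```python
ALLOWED_CONCLUSIONS = {"compliant", "unknown"}


def pm_strategy_compliance_rate(conclusions):
    """Share of records judged 'compliant' in the manual review, in percent."""
    if not conclusions:
        raise ValueError("at least one record is required")
    unexpected = set(conclusions) - ALLOWED_CONCLUSIONS
    if unexpected:
        raise ValueError(f"unexpected conclusions: {sorted(unexpected)}")
    return 100.0 * conclusions.count("compliant") / len(conclusions)


# Made-up manual review results for illustration only.
manual_conclusions = ["compliant", "unknown", "compliant", "compliant"]
pm_rate = pm_strategy_compliance_rate(manual_conclusions)
print(pm_rate)
```

Rejecting labels other than the two allowed conclusions keeps the manual step and the calculation consistent with the definition above.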

Discussion
The case study illustrates the derivation process of case-specific MPIs from the proposed maintenance performance diagnostic framework. In relation to the research question, "How can maintenance performance be measured in a holistic end-to-end view by utilizing available data?", the proposed framework provides a systematic approach to decompose complex maintenance performance structures while allowing the use of imperfect data. In particular, the framework shows how existing maintenance data, often imperfect and misaligned with common performance measures, can be restructured to formulate new MPIs in the framework. Data availability no longer acts as a passive filter that simply screens out MPIs that do not fit; instead, it becomes an active input to MPI derivation and opens up possibilities for measuring performance that would otherwise be overlooked. In addition, maintenance experts from different disciplines can be guided by the framework to collaborate on creating a complete performance measurement structure, which could benefit the overall understanding of performance status and reduce the complexity of maintenance management.
To identify maintenance processes with performance issues from the entire process flow, the performance measures should ideally be decoupled so that each measure matches only one diagnostic aspect. Although the vast majority of the questions in the case study examples correspond to a single diagnostic aspect, several questions are related to two diagnostic aspects (questions 1.7, 2.1, and 2.2 in Figure 4). A common reason for this coupling is the process design of the industry. For instance, the identification and prioritization processes of CM work are carried out without a clear boundary in the case company. The priority of a failure is determined by the reporting of failure information, making the two processes inseparable in terms of efficiency performance measurement. The undesirable coupling of diagnostic aspects is also evident in the execution process, as the effectiveness of a maintenance job is not solely dependent on its execution but is also strongly related to the planning of the work order, such as the choice of material, personnel, and tasks.
A noticeable observation from the case study is that the performance of the execution process, by its nature, is difficult to measure. The actual actions performed offshore during the execution process are not recorded when operators believe they are carried out correctly, making it difficult to evaluate whether a task is performed as described in guidelines. In such circumstances, the diagnostic aspect can be measured only in indirect and less comprehensive ways, causing an inevitable loss of measurement integrity. This issue raises the dilemma of deriving a measure that compromises its original intention or not having the measure for now and starting to collect the necessary data to support the measure in the future. Raising awareness of such a dilemma is, in fact, a purpose of the proposed framework: What can we afford not to measure, given that not everything can be measured? While there is no simple answer to this question, choices should be carefully made based on the scope, objective, and expected outcome of the performance diagnostics. Justifications should be presented for the performance diagnostic aspects to be omitted completely or partially to increase transparency.

Practical and theoretical implications

7.1 Practical implications
The implications of the proposed framework are valuable for practitioners involved in the implementation of maintenance performance management, particularly for managers of complex production systems in maintenance, operation, data and performance functions.
The framework provides a holistic performance evaluation structure across maintenance processes and evaluation domains, which highlights processes that have potential performance issues. The overall performance diagnostics allow maintenance managers to conduct rapid initial performance screenings of various types of equipment. Key performance results confine the scope of in-depth evaluations that require more effort, resulting in enhanced management efficiency in maintenance performance evaluation. Performance managers can use this framework to identify performance bottlenecks, prioritize improvement efforts, and allocate resources more effectively.
From a decision support perspective, the decoupling of diagnostic domains and maintenance processes aligns performance indicators with specific performance objectives. Undesirable performance can be traced back to concrete diagnostic aspects, providing opportunities for decision-makers to revisit the current maintenance flows and set up plans for implementation. More importantly, the domain-process performance structure also reveals missing elements in the existing performance measurement system. As an old management adage says, "You can't improve what you don't measure". The awareness of unmeasured performance, together with undesirable measured performance, can assist maintenance managers in taking targeted actions towards continuous improvement of maintenance methods.
Other implications of the presented framework lie in adaptability and complexity reduction. MPIs are derived as outcomes of the interplay between key performance diagnostic questions and available data. Data availability is not only considered a filter, but also plays an active role in the derivation process. Therefore, the framework can be applied with lower requirements for data availability and has the potential to be adopted by other production industries using a similar maintenance principle. The non-value-adding maintenance data is also revealed through the derivation, allowing practitioners to focus on collecting relevant data and avoid information overload.

Theoretical implications
This paper produced two main contributions to the literature: (1) an adaptable maintenance diagnostic framework that enables a holistic view of overall and process maintenance performance and (2) an approach for deriving case-specific performance indicators by aligning maintenance objectives and available data.
For the first contribution, the proposed framework was built based on theories of maintenance performance measurement and process management (Birolini, 1994; Campbell, 1995; Muchiri et al., 2011; Nielsen, 1997; Sigsgaard et al., 2020; Weber and Thomas, 2006). Specifically, the proposed framework introduces a twofold hierarchical structure that utilizes overall performance as a rapid screening tool to identify the needs for detailed performance diagnostics, and locates specific performance issues through a comprehensive domain-process performance diagnostic model. The twofold, end-to-end maintenance performance diagnostic framework, to the best of our knowledge, has not been discussed in the existing literature.
For the second contribution, the paper explores an MPI derivation approach through guided formulation of new performance indicators and relocation of existing indicators. The creation of MPIs is bipartite, driven by both targeted maintenance performance diagnostic aspects and maintenance data. The mutual connection between the inputs and outputs of maintenance performance measurement mitigates data availability issues for existing equipment and systems. More importantly, the MPI derivation procedures reveal rationales behind the choices of existing and new maintenance performance measures. The adaptable end-to-end performance measurement structure and the bipartite MPI derivation approach enhance the understanding of maintenance performance measurement and contribute to the theoretical knowledge base in the field.

Conclusion
Many existing MPM systems in the oil and gas industry lack totality in end-to-end maintenance assessments. The misalignment between data availability and data demand for existing MPM systems leads to the loss of validity of current measures, and potentially useful measures are disregarded. To address these issues, this paper introduces a multi-level maintenance performance diagnostic framework that consists of two parts. The first part functions as a rapid check of the overall maintenance performance health status, allowing fast and continuous monitoring of the key performance results across maintenance strategies and equipment groups. Upon detection of an undesirable overall performance, the second part provides comprehensive and in-depth performance diagnostics of the end-to-end maintenance process flow. A question-based performance evaluation approach is proposed to derive case-specific maintenance performance indicators from diagnostic aspects.
This research has several limitations. Regarding practical implementation, the proposed framework has not been fully implemented in the case company, so it is not yet possible to quantify its resource requirement and impact on the empirical application. The implementation might require additional time, expertise and resources. The successful adoption of the framework requires interdepartmental collaboration, where change management needs to be carefully planned. In terms of generalizability, the overall performance diagnostics require more validation as this part of the concept proposal was not covered in the case study due to resource constraints. Only one company was included in the case study, which may limit the generalizability of the findings to other industries or contexts. Considering the complexity of maintenance processes, the maintenance processes are defined as six generic steps in the current study, while the actual maintenance workflows involve many smaller tasks. Regarding scalability, the performance measurement of text data is carried out manually in this study, which limits the application potential on a larger scale.
Based on these limitations, more research is needed to evaluate the benefits of the maintenance diagnostic framework in an empirical setting and test its generalization potential in other companies and industries. In particular, the impact of running progressive performance diagnostics at the overall and process levels could be investigated through a longitudinal study. The proposed framework should be tested in other industries to validate generalizability. Future studies could investigate MPIs at the subprocess level in accordance with maintenance workflows so that non-value-adding tasks and variants can be spotted.

Future studies could also investigate the implementation of automated methods in performance measurement, especially in process mining and artificial intelligence technologies, such as natural language processing (NLP) and unsupervised clustering techniques.
Figure 1. Overview of the maintenance performance diagnostic framework
Figure 2. The domain-process model for formulating key diagnostic questions about diagnostic aspects (level 3)
Figure 3. Bridging performance diagnostic aspects and maintenance data
Figure 4. Full list of key performance diagnostic questions and their corresponding process diagnostic aspects
Source(s): Authors' own creation/work

Table 1. Data collection in the case study