Total risk evaluation framework

Purpose: The purpose of this paper is to generalize the traditional risk evaluation methods and to specify a multi-level risk evaluation framework, in order to prepare customized risk evaluation and to enable the effective integration of the elements of risk evaluation.
Design/methodology/approach: A real case study of an electric motor manufacturing company is presented to illustrate the advantages of this new framework compared to the traditional and fuzzy failure mode and effect analysis (FMEA) approaches.
Findings: The essence of the proposed total risk evaluation framework (TREF) is its flexible approach that enables the effective integration of firms' individual requirements by developing tailor-made organizational risk evaluation.
Originality/value: Increasing product/service complexity has led to increasingly complex yet unique organizational operations; as a result, their risk evaluation is a very challenging task. Distinct structures, characteristics and processes within and between organizations require a flexible yet robust approach to evaluating risks efficiently. Most recent risk evaluation approaches are considered to be inadequate due to the lack of flexibility and an inappropriate structure for addressing the unique organizational demands and contextual factors. To address this challenge effectively, the proposed framework takes a crucial step toward the customization of risk evaluation.


Introduction
To respond to exponentially growing stakeholder and societal demands, companies have had to develop solutions for complex operations, which are highly sensitive to the external and internal organizational environment. To ensure smooth operation, it is necessary to understand the associated hazards and risks as well as their mitigation. Risk evaluation is used in many application areas, and many frameworks and methods have been proposed in practice and in the scientific literature. Conventional risk evaluation approaches, however, ignore the fact that many contemporary organizational and process components or failure effects across hierarchical levels of a system are inherently complex (O'Keeffe et al., 2015; Pasman et al., 2014), and they are not sufficient to address the continuously changing organizational demands. Such situations call for new approaches, suggesting flexible and adaptive risk evaluation methods (Aven, 2016; Reiman et al., 2015) that change to fit the environment and situational factors of the organization. As Kanes et al. (2017) stated, it is important to focus research on the area of flexibility in risk evaluation as a way forward for improving current risk evaluation methodologies. In light of the aforementioned, this paper proposes a multi-level risk evaluation framework, where flexibility is given a more prominent role than in the current state of the art in the application area. It helps us to think differently about risk evaluation as a process that is recursive rather than linear, flexible rather than rigid and pluralist not binary (O'Keeffe et al., 2015). Flexibility in risk evaluation is important in the following areas: rating scale, number of risk factors, risk aggregation and warning system (WS); however, their integration into one framework has not yet been developed in the literature. The lack of flexibility in risk evaluation raises difficulties in the integration, customization and adaptivity of risk management systems.
Various rating scales have been developed for risk evaluation in the literature, which can be divided into two categories, predefined and invariant scales, according to the stage of evaluation. In the case of invariant scales, linguistic scales were mostly used with defined levels (e.g. 3, 4 or 5 levels), and the risk was considered an occurrence of high risk levels (Gauthier et al., 2018). Risk values can also be a result of a pairwise comparison (Merrick et al., 2005), where risk effects are compared by their factors. In these cases, based on the results, risk effects can be ranked and management can treat the riskiest effects. However, in this case, the resulting risk values are very difficult to interpret. In the implementation of predefined scales, experts use previously defined ordinal scales, for example, the traditional 10-point scales (Liu et al., 2013, 2012; Liu and Tsai, 2012), probabilistic scales with values in [0, 1] (Gauthier et al., 2018; Malekitabar et al., 2018) or mixed scales such as the Fine-Kinney method (Kinney and Wiruth, 1976). The proposed framework allows both types of risk value to be used and makes it possible to match the aggregation function to its rating scale in a consistent way.
Methods in the literature define the degree of risk depending on a fixed number of factors. In the failure mode and effect analysis (FMEA) method, the risk value is calculated based on the occurrence, severity and detectability parameters (Fattahi and Khalilzadeh, 2018), whereas the Fine-Kinney (practical risk assessment) method calculates risk depending on the likelihood of occurrence, exposure to the event and possible consequence parameters (Kinney and Wiruth, 1976). Various extensions of the number of risk factors have been introduced in the literature (see e.g. Karasan et al., 2018; Yousefi et al., 2018). However, they are limited to a fixed number of factors. Unlike the literature, we propose a risk evaluation framework that flexibly considers the impacts of an arbitrary number (n ≥ 2) of risk factors to cope with the deficiency mentioned earlier.
IJQRM 37,4
Several methods and analyses (such as MacKenzie, 2014; Azadeh-Fard et al., 2015; Panchal et al., 2019a, 2019c) have been proposed for aggregating risks. Summarizing available information into a single number is a difficult and sensitive issue (MacKenzie, 2014), which requires an analyst to carefully select and utilize constituent elements, factors, weights and algebraic operations (Ni et al., 2010; Azadeh-Fard et al., 2015). The proposed total risk evaluation framework (TREF) works with a risk aggregation protocol (RAP); this is a triplet consisting of factors, weights and aggregation functions. The aggregation can be implemented at different levels such as factor, effect, mode, process, process area and organization. In addition, various domains such as health and safety, quality or environment can be considered in the course of aggregation. This hierarchical aggregation procedure is flexible; items such as the properties of available information, mathematical expression and details required by the users can determine the exact elements and methods that are best suited to the organization.
Warnings also play a vital role in risk evaluation (Khan et al., 2015; Øien et al., 2011). Conventional risk evaluation has the disadvantage of being rigid (Kalantarnia et al., 2009); repeatedly adopting a single index (Zheng et al., 2012) or a list of warning indicators (Øien et al., 2011) to signal warning events fails to capture all meaningful failures. There have been many efforts to develop a WS for risk assessment (Ilangkumaran et al., 2015; Øien et al., 2011; Zheng et al., 2012), but none of them addresses warning events from both factors and levels of aggregation, that is, effect, mode and process, in order to capture comprehensive failure identification. The principal feature of the WS of TREF is its flexibility. TREF extends the literature by presenting a flexible WS considering risk values from the levels of factors, domains, effects, modes, processes and the organization. This feature is a novel tool to specify unique warning rules for each risk factor separately in each level.
In summary, the main contribution of this study to the literature is that it is the first step toward customization of risk evaluation. In other words, TREF is a flexible risk evaluation framework, which can be tailored to the specific needs of companies. TREF is composed of the following:
(1) An arbitrary number (n ≥ 2) and weight of risk factors can be set by domain in risk evaluation.
(2) To obtain a domain-specific risk value, the attributes of different management systems (e.g. health and safety, quality or environment) can be considered in risk evaluation.
(3) The paper develops a flexible multi-level risk aggregation protocol, which can be implemented at different levels, adapted to the organizational needs.
(4) A flexible warning system can specify unique rules for warnings in each level.
The rest of this paper is organized as follows. Section 2 provides a brief review of risk evaluation and its shortcomings. In Section 3, the proposed new risk evaluation framework, the TREF, is presented. Section 4 presents how the proposed TREF can be applied in risk evaluation. A real case serves to demonstrate the applicability and practicality of the proposed TREF in Section 5, including comparisons with the two most commonly used risk evaluation methods: traditional and fuzzy FMEA. Finally, we draw conclusions and make suggestions for future research in Section 6.

Related work
Following Chang and Wen (2010) and Hansson and Aven (2014), risk evaluation is the process of assessing the impact and likelihood of identified risks. The main aim of risk evaluation is to determine the importance of risks and to prioritize them according to their effects on systems, processes, designs and/or services for further attention and action (Klinke and Renn, 2002). In other words, this process determines which risk source warrants a response. The need for this process is based on the fact that organizations, processes and projects face a large number of risks, each with different effects; thus, it may be impractical or even impossible to manage them all because of time and resource constraints. Risk evaluation is used in many application areas, and many frameworks, methods and techniques have been proposed in practice and in the scientific literature. Liu et al. (2013) conducted an analysis of 75 papers on the subject of risk evaluation. They systematically classified the existing literature and concluded that the FMEA approaches introduced in the last decades can be divided into three categories according to their failure mode prioritization methods: multicriteria decision-making, mathematical programming and integrated approaches. In addition, it can be observed from the surveyed literature that the fuzzy rule-based system is the most popular method for prioritizing failure modes (Liu et al., 2013; Panchal and Kumar, 2017; Panchal et al., 2018a). Conventional risk evaluation approaches nevertheless ignore the fact that many contemporary organizational and process components or failure effects across hierarchical levels of a system are inherently complex (O'Keeffe et al., 2015; Pasman et al., 2014), and they are not sufficient to explain all that can go wrong.
Such situations call for new approaches, suggesting the need to develop flexible and adaptive risk evaluation methods (Aven, 2016; Reiman et al., 2015) that change to fit the environmental and situational factors of the organization. As Kanes et al. (2017) stated, it is important to focus on the area of flexible risk evaluation as a way forward for improving current risk evaluation methodologies. O'Keeffe et al. (2015) also emphasized that a risk evaluation process should be recursive rather than linear, flexible rather than rigid and pluralist not binary. This calls for different approaches and methods, and it is a challenge for the risk field to develop suitable frameworks and tools for this purpose (Aven and Zio, 2014; SRA, 2015). As a result of a shift in risk evaluation thinking from traditional and rigid to flexible and adaptive attributes, new risk evaluation methods should be developed where flexibility is one of the most important characteristics. Flexibility in risk evaluation can be implemented in the following areas: scale, number of factors, aggregation and WS.
Various scales have been developed for risk evaluation in the literature; they can be divided into two categories, predefined and invariant scales, according to the stage of evaluation. In the case of invariant scales, in the early stages of risk evaluation, no scale was used; risk evaluation was performed via percentage of occurrence (Etherton and Myers, 1990). Later, linguistic scales were used with 3-5 distinguished levels, and the assessment was made by the evaluation team's top ratings percentage (Gauthier et al., 2018; ISO 12100, 2010). Merrick et al. (2005) use pairwise comparison instead of percentages. After the comparison, we can determine the ranking order of all alternatives and select the best ones from among a set of feasible alternatives. The main challenge of this approach is to interpret the resulting risk values. Indeed, regardless of whether we have compared risky or less risky effects, the results will fluctuate around the same value.
Another approach is to use predefined scales for all factors. Before the evaluations are performed, the appropriate numeric scales are defined in the failure analysis (Liu et al., 2013). Various scoring guidelines exist; for example, Goodman, as cited by Silva et al. (2014), developed the 10-point scales for evaluating the failure modes with respect to each risk factor. Similarly, Lolli et al. (2015) developed an evaluation scale for assessing the three risk factors, as in the widely known FMEA. In some cases, mixed scales can be found, as in Fine-Kinney (Kinney and Wiruth, 1976), where [0.1, 10] is used for likelihood and exposure and [1, 100] is used for consequence. Both approaches can be used in risk evaluation; however, predefined scales, in particular the FMEA method using the product formula, were the most common (Liu et al., 2013).
Methods developed in the literature define the degree of risk depending on a fixed number of factors. In the traditional FMEA method, the risk value is calculated based on the occurrence, severity and detectability parameters (Liu et al., 2013; Fattahi and Khalilzadeh, 2018). The Fine-Kinney method calculates risk depending on the likelihood of occurrence, exposure and consequence parameters (Kinney and Wiruth, 1976). Some extensions of the number of risk factors have been introduced in the literature. Karasan et al. (2018) extend the number of factors, calculating risk based on severity, probability, frequency and detectability values. Ouédraogo et al. (2011) increased the factors to five: risk perception, impact of hazard, research specificities, hazard detectability and probability of occurrence of accident. Maheswaran and Loganathan (2013) proposed four risk factors including severity, occurrence, detection and protection. Yousefi et al. (2018) considered two additional factors, cost and duration of treatment, in addition to severity, occurrence and detection. These methods, however, are limited to a fixed number of risk factors. In addition, during our literature investigation, we found that authors calculate with risk factors as if they were independent (Liu et al., 2013). One of the possible causes of ignoring additional risk factors is that their dependence should be addressed. These issues call for new solutions that can address the dependence of risk factors and an arbitrary number of risk factors.
Several methods and analyses have been proposed for aggregating risk. Traditionally, FMEA uses the risk priority number (RPN) to evaluate the risk of failure. The occurrence factor measures the likelihood that a failure mode occurs. The severity is the expected consequence of failure. The ability to recognize an error before it affects customers is measured by the detection factor. Scales based on guidelines for usage (such as Fine-Kinney and FMEA) and for evaluation/aggregation require different functions, such as additive, average, product, geometric mean or logarithmic (Malekitabar et al., 2018), but the most common is the FMEA method with the product formula (Liu et al., 2013). The multiplication of these factors generates the RPN, and the aggregation is performed solely at the factor level. Detailed procedures for carrying out an FMEA have been documented in Stamatis (2003) and Tay and Lim (2006). The traditional FMEA has proven to be one of the most important early preventive methods (Liu et al., 2013, 2014; Silva et al., 2014), whereas the traditional RPN method has been criticized in the literature (see the summaries in Liu et al. (2013), Lolli et al. (2015) and Malekitabar et al. (2018)). Numerous alternative approaches have been proposed to overcome the shortcomings of traditional FMEA. It can be observed from one of the most recent reviews of FMEA, conducted by Liu et al. (2013), that the fuzzy rule-based system is the most popular method for prioritizing failure modes. The fuzzy rule-based FMEA approach uses linguistic variables to describe the severity, detection and occurrence, and thereby the riskiness, of failures in order to prioritize them (Tay and Lim, 2006; Petrović et al., 2014; Bowles and Peláez, 1995). The most commonly used membership functions are the triangular and trapezoidal ones (Riahi et al., 2012).
An advantage of using fuzzy rule-based FMEA for risk evaluation is that the resulting evaluation becomes qualitative and has the ability to model uncertain and ambiguous information. A disadvantage of fuzzy rule-based FMEA approaches is that they can produce erroneous results if analysts do not have a sufficiently deep understanding of the system. In addition, similarly to traditional FMEA, fuzzy rule-based FMEA aggregates only at the factor level. Other aggregation techniques have also been proposed in the literature, for example, geometric mean (see e.g. Kokangül et al., 2017; Maheswaran and Loganathan, 2013; Wang et al., 2009), median (Karasan et al., 2018) and radial distance (Malekitabar et al., 2018). The weighted geometric mean is also applied in the analytic hierarchy process (AHP) (Braglia and Bevilacqua, 2000) or analytic network process (ANP) (Liu and Tsai, 2012; Torabi et al., 2014; Wang et al., 2018). The AHP/ANP enables the decomposition of elements into a hierarchy and calculates weights for the risk factors. In the AHP, each element in the hierarchy is considered to be independent of all the others. However, ANP does not require independence among elements, so it can also be used as an effective tool in the case of interdependency (Saaty, 2004; Wang et al., 2018).
In addition, the authors emphasize a remarkable shift toward integrated methods for ranking failure modes when aiming at accurate risk evaluation. For instance, fuzzy evidential reasoning is integrated with grey theory (Chang et al., 1999; Liu et al., 2011), fuzzy TOPSIS (technique for order preference by similarity to ideal solution) with fuzzy AHP (Kutlu and Ekmekçioğlu, 2012) and VIKOR (VIsekriterijumska optimizacija i KOmpromisno Resenje) or EDAS (evaluation based on distance from average solution) (Panchal et al., 2019c) with fuzzy logic (Liu et al., 2012; Panchal et al., 2019b; Panchal and Srivastava, 2019) and grey techniques (Panchal and Kumar, 2016; Panchal et al., 2018b; Panchal and Srivastava, 2019). There is a trend toward using more than one method to enhance the efficacy and empirical validity of risk evaluation results (Liu et al., 2013). Recent research (Lolli et al., 2015; Liu et al., 2014) also shows a shift toward integrated methods (e.g. ANP (dos Santos et al., 2015; Zammori and Gabbrielli, 2012) has been combined with other models), so that synergies can be maximized. Liu et al. (2013) and Shaker et al. (2019) conclude that objective and combination weighting methods should be applied in risk evaluation because they evaluate relative importance objectively without decision-makers. However, some doubts remain concerning the applicability of integrated methods to real-life circumstances, for example, the need to add risk factors to the determination of risk priority of failure modes (Liu et al., 2013) and the need to support the aggregation of risk levels from different domains. Considering risk effects in different domains is important because the same source of hazards often causes risks in multiple management areas with different levels of relevance (Pasman et al., 2014). Therefore, the sources of hazards describing the possible risk effects in different management system areas (e.g. ISO 9001 (2015), ISO 14001 (2015) and ISO 45001 (2017), previously OHSAS 18000) should be considered and developed holistically and cohesively (Abad et al., 2014; Asif et al., 2013; Bernardo, 2014; de Oliveira, 2013; Rebelo et al., 2016). Domains such as health and safety, quality or environment can be considered in risk evaluation with different weights. To conclude, priorities and demands can differ by domain, which calls for flexible risk aggregation.
Warnings play a vital role in risk evaluation (Khan et al., 2015; Øien et al., 2011). Conventional risk evaluation has the disadvantage of being rigid (Kalantarnia et al., 2009), repeatedly adopting a single index (Zheng et al., 2012) or a list of warning indicators (Øien et al., 2011) to signal warning events and failing to capture meaningful failures. There have been many efforts to develop the WS of risk evaluation. Ilangkumaran et al. (2015) proposed a hybrid technique (Liu et al., 2015; Panchal et al., 2019a) for assessing work safety in hot environments including a warning rating and safety grade at the risk factor level. Øien et al. (2011) have developed a set of risk indicators that can provide warnings about potential major accidents. Zheng et al. (2012) proposed an early warning rating system for hot and humid environments calculating safety indexes at the factor and subfactor levels. In addition, Xu et al. (2002) suggested two levels of warnings. In the scientific literature, the risk hierarchy is occasionally mixed with risk level; for example, Chen et al. (2012) and Manuele (2005) use the action levels as risk hierarchies, and no real hierarchy levels are used. This summary shows that methods developed in the literature do not address warning events originating from multiple levels, such as factor, effect, mode and process, in order to specify unique warning rules for each risk factor separately in each level.
In conclusion, based on the aforementioned works, the risk evaluation practice of companies is adapted in line with the limits of the existing rigid methods. Therefore, we propose a flexible risk evaluation framework, which can be tailored to the specific needs of companies. We found that risk evaluation mostly included only a fixed number of risk factors from one domain. Our proposed framework makes it possible for users to set an arbitrary number of risk factors, which can be different in different risk domains. Additional advantages of TREF are that it considers the importance of risk factors by domain and can handle the dependence of risk factors. In addition, most methods aggregate risk values at the factor level, and the same is true for signaling warnings. Instead, TREF includes a flexible RAP and a flexible WS. The flexible RAP makes it possible for users to use different risk factors, weights and aggregation functions at different levels, adapted to the needs of companies. The flexible WS can specify unique warnings for each single risk factor separately.

Mathematical background
The TREF includes risk evaluation, risk assessment and schedules of corrective/preventive actions. This paper focuses only on the evaluation of the risk effects based on a novel formula.

Formal description of calculating risk values
Let $f = [f_1, f_2, \ldots, f_n]^T$, $(n \geq 2)$ be the vector of risk factors, and let $w = [w_1, w_2, \ldots, w_n]^T$, $(n \geq 2)$ be the weight vector of risk factors $(w_i \in \mathbb{R}^+)$.
Denote $r = S(f, w)$ as the resulting risk value, where $S$ is a monotone aggregation function. Denote $(f, w, S)$ as the risk aggregation protocol (RAP).
Remark 1. Usually we assume that risk factors $f_i$ and $f_j$, $(i \neq j)$ are independent of each other. However, the proposed RAP does not require their independence.
The proposed RAP can integrate the traditional FMEA, Fuzzy FMEA and the Fine-Kinney risk evaluation methods. RAP generalizes these three types of methods; therefore, they can be considered special cases of the proposed RAP.
(4) $S_4(f, w) = \sqrt{\sum_{i=1}^{n} w_i f_i^2}$ is the weighted radial distance of risk factors.

In the case of $w_i = 1/n$, the aggregation functions $S_1$, $S_3$ and $S_4$ produce the unweighted geometric mean, unweighted median and unweighted radial distance of risk factors.
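The three aggregation functions named above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the normalization of the weights so that they sum to one, and the particular weighted-median rule (averaging the two middle values when the cumulative weight reaches exactly one half) are assumptions made here, since the text does not spell them out.

```python
import math


def _normalize(weights):
    """Scale positive weights so that they sum to one (assumption of this sketch)."""
    total = sum(weights)
    return [w / total for w in weights]


def weighted_geometric_mean(factors, weights):
    """Product of the factors raised to their normalized weights."""
    w = _normalize(weights)
    return math.prod(f ** wi for f, wi in zip(factors, w))


def weighted_median(factors, weights):
    """Value at which the cumulative normalized weight passes one half.

    If the cumulative weight hits one half exactly, the two neighbouring
    values are averaged, which reproduces the ordinary median for equal weights.
    """
    pairs = sorted(zip(factors, _normalize(weights)))
    cumulative = 0.0
    for k, (value, weight) in enumerate(pairs):
        cumulative += weight
        if math.isclose(cumulative, 0.5):
            return (value + pairs[k + 1][0]) / 2
        if cumulative > 0.5:
            return value
    return pairs[-1][0]


def weighted_radial_distance(factors, weights):
    """Square root of the weighted sum of squared factors."""
    w = _normalize(weights)
    return math.sqrt(sum(wi * f ** 2 for f, wi in zip(factors, w)))


# Hypothetical factor values on a 1-10 scale with equal weights.
factors = [7, 3, 9]
weights = [1, 1, 1]
for S in (weighted_geometric_mean, weighted_median, weighted_radial_distance):
    print(S.__name__, round(S(factors, weights), 3))
```

Because every function takes the same `(factors, weights)` pair, any of them can play the role of $S$ in a risk aggregation protocol $(f, w, S)$.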
Remark 2. One of the main advantages of the proposed RAP is that the number of risk factors can be different, for example, for each risk domain and for each risk mode. In the extreme case, TREF allows us to ignore the evaluation of a risk mode if it does not have an effect on a domain. To compare different kinds of resulting risk values that are based on different numbers of risk factors, the choice of an adequate aggregation function is one of the most important topics. According to Calvo et al. (2002) and Beliakov et al. (2008), an aggregation function should produce a symmetric result for symmetric inputs and should be continuous if the inputs are continuous. Malekitabar et al. (2018) and others criticized FMEA because the product as an aggregation function is very (left-side) asymmetric; therefore, the risks are often underestimated. In contrast to Malekitabar et al. (2018), who suggest radial distance as an aggregation function, we suggest a weighted median if the resulting risk values are to be compared or aggregated. The (weighted) median maintains the symmetry as much as possible not only for two factors, which were investigated by Malekitabar et al. (2018), but also for more than two factors (see Figure 1), where two (Figure 1(a-c)) and six (Figure 1(d-f)) factors are aggregated. The factors followed discrete uniform distributions, $f_i \in \{1, 2, \ldots, 10\}$.
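The symmetry argument can be checked numerically for the two-factor case. The sketch below is not a reproduction of Figure 1; it simply enumerates all pairs of factor values on the 1-10 scale and compares the sample skewness of the product with that of the median. Using skewness as the measure of asymmetry is an assumption of this illustration.

```python
from itertools import product
from statistics import mean, median


def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# All 100 equally likely pairs of two factors, each uniform on {1, ..., 10}.
pairs = list(product(range(1, 11), repeat=2))

product_values = [a * b for a, b in pairs]      # product-style aggregation
median_values = [median(pair) for pair in pairs]  # median of the two factors

product_skew = skewness(product_values)
median_skew = skewness(median_values)

print("skewness of product:", round(product_skew, 3))
print("skewness of median: ", round(median_skew, 3))
```

A positive skewness for the product means that most aggregated values are concentrated at the low end of the range, whereas the median of two symmetric factors stays symmetric around the middle of the scale.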
Nevertheless, the (weighted) median and (weighted) geometric mean can be robust to outliers, which is commonly referred to as a good feature of an aggregating function, except for the aggregation of risks, where the high value of a risk factor may require intervention. This problem can only be solved with a multi-level WS, where not only the aggregated risk values but also the risk factors are considered (see Section 3.2).
Remark 3. It is important to note that the proposed risk aggregation protocol does not require existing (predefined) scales (see Section 2). Scale values can be a result of a pairwise comparison (see e.g. Merrick et al., 2005).
Applying the risk aggregation protocol iteratively, the risk values can be specified in a higher hierarchy level.
Denote $TRPN_i^{(N)} = R_i^{(N)} = S(R^{(N-1)}, W^{(N-1)})$ as the total risk priority number $i$ in the hierarchy level $N$.
The proposed framework provides the opportunity to evaluate a failure mode through different types of aspects. The effects of a failure mode can have impacts on different domains such as quality, environment, health and safety as well as others. These effects are usually evaluated separately if companies follow different types of standards (such as ISO 9001, 14001, 45001). However, in addition to the separated evaluations, aggregated risk values can also be important for the company, which can include these aspects. Considering these aspects, a weighted risk value can be calculated regarding the iterative formula.
Similarly to this bottom-up calculation mode, risk values can be specified for all hierarchy levels.
At the individual hierarchy levels, we can say:
(1) The risk factors are aggregated: $TRPN_e^{(1)} = R_e^{(1)} = S(f, w)$, and they specify the risk value for risk effect $e$.
(3) The risk values of risk effects are aggregated: $TRPN_m^{(3)} = R_m^{(3)} = S(R^{(2)}, W^{(2)})$, and they specify the risk value for failure mode $m$.
(4) The risk values of failure modes are aggregated: $TRPN_p^{(4)} = R_p^{(4)} = S(R^{(3)}, W^{(3)})$, and they specify the risk value for process $p$.
(5) The risk values of processes are aggregated: $TRPN_a^{(5)} = R_a^{(5)} = S(R^{(4)}, W^{(4)})$, and they specify the risk value for the process area $a$.
(6) The risk values of process areas are aggregated: $TRPN_M^{(6)} = R_M^{(6)} = S(R^{(5)}, W^{(5)})$, and they specify the risk value for the main process $M$.
(7) The risk values of main processes are aggregated: $TRPN^{(7)} = R^{(7)} = S(R^{(6)}, W^{(6)})$, and they specify the risk value at the organizational level.
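The bottom-up calculation over hierarchy levels can be sketched as a short recursive function. The tree encoding (nested lists of `(weight, child)` pairs with numeric leaves), the function name `trpn`, the choice of the weighted geometric mean as $S$ at every level and all numeric values are assumptions of this illustration; the framework itself allows a different aggregation function and weight vector at each level.

```python
import math


def weighted_geometric_mean(values, weights):
    """Aggregation function S used in this sketch (weights are normalized)."""
    total = sum(weights)
    return math.prod(v ** (w / total) for v, w in zip(values, weights))


def trpn(node, S=weighted_geometric_mean):
    """Aggregate a hierarchy bottom-up.

    A node is either a number (an evaluated risk factor) or a list of
    (weight, child) pairs whose risk values are aggregated with S.
    """
    if isinstance(node, (int, float)):
        return float(node)
    weights = [weight for weight, _ in node]
    values = [trpn(child, S) for _, child in node]
    return S(values, weights)


# Hypothetical example: one process with two failure modes.
effect_a = [(1, 4), (1, 9)]   # two equally weighted risk factors
effect_b = [(1, 2), (1, 8)]
effect_c = [(1, 3), (1, 3)]
mode_1 = [(1, effect_a), (1, effect_b)]
mode_2 = [(1, effect_c)]
process = [(2, mode_1), (1, mode_2)]  # mode_1 weighted twice as heavily

print("effect a:", trpn(effect_a))
print("mode 1:  ", trpn(mode_1))
print("process: ", trpn(process))
```

Each call returns the risk value of one node, so the same function yields the value for an effect, a failure mode or a whole process depending on which node is passed in.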
In the proposed framework, an arbitrary number of factors (but at least two) can be specified for each single risk domain. The compatibility between the TREF and the traditional FMEA or fuzzy FMEA can also be realized if severity, occurrence and detection are considered as risk factors. Similarly, compatibility between TREF and Fine-Kinney can also be realized if the applied risk factors are likelihood of occurrence, exposure and consequence. Beyond the traditional methods, additional risk factors can be specified, for example, control, information and range (see Section 5). The number of risk factors follows the company needs and the nature of the risk domains. The TREF also proposes an extra factor, criticality, to allow the risk evaluation team to specify corrective/preventive actions. Risk factors can be evaluated and aggregated by different domains (e.g. quality, environment, health and safety) that cover different facets of corporate systems. The potential failure modes can be identified, as can their causes and effects (see Figure 3). Organizational level processes (e.g. production, sales) can be divided into subprocesses, sub-subprocesses and so on (Certa et al., 2017). This decomposition may go on to an arbitrary depth in the hierarchy depending on the nature and complexity of organizational processes. Nevertheless, the hierarchy is only one way to use the proposed iterative formula. If an acyclic graph or chain of causes, failure modes and effects is specified (see Figure 3), the risk values for common causes and common effects can also be specified.

The warning system
The WS signals to the risk evaluation team where critical failures are, and this team can see the general conditions of the processes. The WS considers risk values at all levels.
As with the calculation of TRPNs, the specification of the WS follows the bottom-up conception. Corrective/preventive actions are scheduled if a risk factor is not lower than its threshold (W1), and they are also scheduled if the aggregated value is not lower than its threshold (W2). The TREF proposes an extra factor, criticality, to allow the risk evaluation team to specify corrective/preventive actions (W3) even if the aggregated risk value is lower than the specified threshold. If its value is 1, corrective or preventive actions should be specified. If its value is 0, corrective or preventive actions can still be specified, because the risk factors and/or the aggregated risk value can be higher than the thresholds. The criticality factor gives the team additional flexibility to override the evaluation and to specify preventive tasks for events that are not rated as risky but may potentially become risky (e.g. nonquantifiable risks, customer expectations that are difficult to quantify, or even their possible changes) and that should be evaluated independently of the other risk factors.

Formally, corrective/preventive actions can be prescribed in three ways: $\in \mathbb{R}^+$. Denote the intervention function in level $N$ for factor $i$.
The thresholds and the threshold rules can be specified arbitrarily by the company experts. Generally, warning thresholds are specified based on former experience, but standards can also provide a threshold. (In our case study, because the company had to follow more than one standard requirement, the minimum value of the experts' opinions was the threshold.) In addition, the dependence of risk factors can also be addressed by specifying different thresholds for each single risk factor separately.
Definition 4. We can say that a (risk) effect is a failure effect if at least one of the conditions W1-W3 is satisfied.
If TRPNs are calculated for the total process tree (see Figure 2), thresholds should be specified for all levels.
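The three warning rules and Definition 4 can be illustrated with a minimal sketch. The class name, field names and threshold values below are hypothetical and chosen only for illustration; they are not taken from the case study, and the comparison "not lower than" is implemented as `>=`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EffectEvaluation:
    """One evaluated risk effect (field names are illustrative)."""
    factors: Dict[str, float]   # risk factor name -> evaluated value
    aggregated: float           # aggregated risk value (e.g. a TRPN)
    criticality: int = 0        # 1 = team forces an action, 0 = no override


def warnings_for(effect: EffectEvaluation,
                 factor_thresholds: Dict[str, float],
                 aggregate_threshold: float) -> List[str]:
    """Return the warning rules (W1, W2, W3) that fire for one effect."""
    fired = []
    # W1: any single risk factor is not lower than its own threshold.
    if any(value >= factor_thresholds[name]
           for name, value in effect.factors.items()
           if name in factor_thresholds):
        fired.append("W1")
    # W2: the aggregated risk value is not lower than its threshold.
    if effect.aggregated >= aggregate_threshold:
        fired.append("W2")
    # W3: the criticality factor is set to 1 by the evaluation team.
    if effect.criticality == 1:
        fired.append("W3")
    return fired


def is_failure_effect(effect: EffectEvaluation,
                      factor_thresholds: Dict[str, float],
                      aggregate_threshold: float) -> bool:
    """Definition 4: a failure effect satisfies at least one of W1-W3."""
    return bool(warnings_for(effect, factor_thresholds, aggregate_threshold))
```

Because each factor has its own entry in `factor_thresholds`, different thresholds per risk factor can be expressed directly, and factors without a threshold are simply skipped.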

The process of the application of the proposed framework
The proposed framework has four stages (see Figure 2). The first stage (process specification) is detailed in Section 4.1 Phase 1. The second stage (risk evaluation) is detailed in Section 4.1 Phase 2 and Phase 3, while the third stage (risk assessment) is detailed in Section 4.2. The fourth stage (risk reduction and mitigation) is detailed in Section 4.3.
4.1 Process of risk evaluation
4.1.1 Phase 1: logic planning. The process hierarchy (the core processes, their subprocesses and so on; see Figure 2), the process-specific elements and failure modes, and the chain of causes and risk effects based on their domains should be specified before the proposed TREF is used. This process hierarchy helps us to obtain the failure modes. The proposed framework shows us where we should concentrate efforts to correct and improve processes. Instead of the process hierarchy, a (process) graph can also be specified (see, e.g. Figure 3). To apply the proposed recursive risk evaluation process, the only requirement for the logic network is that the graph be acyclic. In addition, within the process hierarchy, further arbitrary acyclic graphs (including the TREF graph; see, e.g. Figure 3) can be specified to better fit changing company needs.
4.1.2 Phase 2: defining factors and scales. At the beginning of Phase 2, there are two options: (1) scales for evaluating risk factors are identified, or (2) invariant scale methods are used (see, e.g. Merrick et al., 2005; Gauthier et al., 2018), for example, performing pairwise comparisons of risks for separate risk factors (see, e.g. Merrick et al., 2005). In the first case, adequate scales for the risk factors should be specified first. When specifying scales, linguistic categories can also be specified, and instead of scale values, membership functions can also be defined (see Example 2). In the case of an invariant scale, there is no need to use a predefined scale; however, after pairwise comparison, the results should be categorized. In all cases, each effect is evaluated through n + 1 risk factors, where the total risk priority number (TRPN) is calculated from these factors (see risk reduction and mitigation in Figure 2). Because flexibility is one of the main benefits of the proposed TREF, only the evaluation of at least two factors (e.g. occurrence, severity) is required; more than three factors can be specified, and the number of evaluated risk factors is flexible.
4.1.3 Phase 3: calculation of risk values and thresholds through the process hierarchy/ process graph. Based on the proposed iterative bottom-up calculation method (see Definition 2), through the process hierarchy or an acyclic process graph, risk values can be calculated for each hierarchy level.
Contrary to traditional FMEA and fuzzy FMEA, TREF allows more than one effect to be assigned to a cause (see Figure 2). Moreover, different failure modes and risk effects may have the same causes (common causes) (see Figure 3). The only restriction is to avoid cycles in the process hierarchy.
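The bottom-up calculation over an acyclic process graph can be sketched as a memoized recursion. This is only an illustration under stated assumptions: the node names, the leaf values and the choice of an unweighted geometric mean as the aggregation function are hypothetical, and the sketch does not reproduce the formula of Definition 2.

```python
from functools import lru_cache
from math import prod
from typing import Dict, List

# Hypothetical acyclic process graph: node -> list of child nodes.
# Leaves are evaluated effects; "E2" is shared by two failure modes.
CHILDREN: Dict[str, List[str]] = {
    "process": ["mode_A", "mode_B"],
    "mode_A": ["E1", "E2"],
    "mode_B": ["E2", "E3"],
}
# Illustrative level-0 risk values of the leaves (not case-study data).
LEAF_VALUES: Dict[str, float] = {"E1": 2.0, "E2": 8.0, "E3": 4.0}


def geometric_mean(values: List[float]) -> float:
    """Unweighted geometric mean of positive values."""
    return prod(values) ** (1.0 / len(values))


@lru_cache(maxsize=None)
def risk_value(node: str) -> float:
    """Aggregate bottom-up; memoization evaluates shared nodes only once.

    The graph must be acyclic, otherwise the recursion never terminates.
    """
    if node in LEAF_VALUES:
        return LEAF_VALUES[node]
    return geometric_mean([risk_value(child) for child in CHILDREN[node]])
```

Calling `risk_value("process")` first resolves the two failure modes and reuses the value of the shared node `E2`, which mirrors how a common effect enters several aggregations.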
On the one hand, weights can be calculated by using the ANP method, which can follow the process hierarchy. Applying weights gives a general view of the process risks, weighted by their importance. On the other hand, using weights is optional. If there is no information about the importance of risk factors, equal weights can be used. Another relevant example of unweighted aggregation uses the maximal value of the risk factors. The maximal value can also produce valuable information about risky processes (see $S_2$ in Example 4). This value reveals the weak links (the worst/most risky processes).
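The aggregation options discussed above can be sketched as follows. The function names are hypothetical, and the exact forms are assumptions: the paper names a (weighted) geometric mean, a maximum and a weighted median among its aggregation functions, but their definitions are not restated in this section, so standard textbook forms are used here.

```python
from math import prod
from typing import List, Optional, Sequence


def _normalized(weights: Optional[Sequence[float]], n: int) -> List[float]:
    """Equal weights when none are given; otherwise scale to sum to 1."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def weighted_geometric_mean(values: Sequence[float],
                            weights: Optional[Sequence[float]] = None) -> float:
    """Product of values raised to their normalized weights."""
    w = _normalized(weights, len(values))
    return prod(v ** wi for v, wi in zip(values, w))


def maximum(values: Sequence[float]) -> float:
    """Unweighted 'weakest link' view: the most risky factor dominates."""
    return max(values)


def weighted_median(values: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Smallest value whose cumulative normalized weight reaches one half."""
    w = _normalized(weights, len(values))
    cumulative = 0.0
    for v, wi in sorted(zip(values, w)):
        cumulative += wi
        if cumulative >= 0.5:
            return v
    return max(values)  # guard against rounding in the cumulative sum
```

Passing `weights=None` gives the equal-weight case described for situations where no importance information is available.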
In addition to calculating risk values or before performing the task, the thresholds must be specified for all levels (see risk assessment in Figure 2).

Monitoring risk values: operating the warning system
While risk values and thresholds should be calculated by the bottom-up iterative formula, the operation of the monitoring system can follow both the bottom-up and the top-down approaches.
4.2.1 Bottom-up approach. At the 0-th hierarchy level, risk factors are evaluated. A warning event occurs if a risk factor is not lower than its threshold (W1) or if the criticality value is set to 1 (W3). For maintenance, this monitoring system shows which risk effect (in which domain) of which process is linked to a failure mode and which factors are not lower than a threshold; therefore, a specific corrective/preventive action must be prescribed to mitigate the value of the risk factor. If a specific corrective/preventive action is not prescribed but the aggregated risk value is not lower than a threshold, general corrective/preventive actions should be prescribed (W2) to mitigate the aggregated risk values. General corrective/preventive actions should contain the set of specific tasks that mitigate the values of risk factors. This bottom-up approach can be extended to the higher hierarchy levels, where general activities at hierarchy level N should contain specific tasks to mitigate risk factors or risk values in the lower hierarchy levels.
4.2.2 Top-down approach. The top-down or managerial approach can be applied if, in addition to the aggregated risk values, the number of failure effects is calculated for all hierarchy levels. If there is a warning event on hierarchy level N, a general corrective/preventive action is specified, which, as in the bottom-up approach, may (but in this case need not) contain a (detailed) corrective/preventive action to mitigate risk factors. The number of failure effects at every level helps management to drill down and specify the set of corrective/preventive actions. The bottom-up approach starts from the lowest hierarchy level: specific corrective/preventive actions are specified to mitigate the risk factors, and general corrective/preventive actions are usually specified as a set of specific corrective/preventive actions. The top-down or managerial approach starts at the top level of the hierarchy. Aggregated risk values give a general view of the risks; however, to reduce the number of failure effects, general corrective/preventive actions should be specified. Nevertheless, these general corrective/preventive actions may (but need not) contain specific corrective/preventive actions. For example, purchasing a new piece of equipment can be a general activity that solves several specific problems.
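The top-down drill-down can be sketched by counting failure effects per subtree and following only the branches where the count is positive. The hierarchy, node names and failure flags below are hypothetical; in practice the flags would come from the warning rules W1-W3.

```python
from typing import Dict, List

# Hypothetical hierarchy: node -> children; leaves are risk effects.
TREE: Dict[str, List[str]] = {
    "company": ["production", "sales"],
    "production": ["effect_1", "effect_2"],
    "sales": ["effect_3"],
}
# Leaves already flagged by the warning system (illustrative flags).
IS_FAILURE_EFFECT: Dict[str, bool] = {
    "effect_1": True, "effect_2": False, "effect_3": False,
}


def count_failure_effects(node: str) -> int:
    """Number of failure effects in the subtree rooted at `node`."""
    if node in IS_FAILURE_EFFECT:
        return int(IS_FAILURE_EFFECT[node])
    return sum(count_failure_effects(child) for child in TREE[node])


def drill_down(node: str) -> List[str]:
    """Top-down view: follow only branches that contain failure effects."""
    if node in IS_FAILURE_EFFECT:
        return [node] if IS_FAILURE_EFFECT[node] else []
    found: List[str] = []
    for child in TREE[node]:
        if count_failure_effects(child) > 0:
            found.extend(drill_down(child))
    return found
```

In this sketch a manager starting at `company` is led directly to `production` and then to `effect_1`, while the `sales` branch is skipped because its count is zero.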

Schedule of corrective/preventive actions
After specifying the set of corrective/preventive actions:
(1) The forecasted effect of corrective/preventive actions should be specified (see, e.g. Bowles, 2003; Carmignani, 2009).
(2) Corrective/preventive actions should be organized as a maintenance project to minimize system shutdowns (see, e.g. Kosztyán, 2018).
The proposed TREF includes the schedule of corrective/preventive actions, which is a kind of flexible, discrete time/cost/quality trade-off problem; a future paper will focus on this scheduling problem. After completing risk mitigation projects, the improved risk effects will be re-evaluated (see the re-evaluation arrow in Figure 2), and if necessary, a new maintenance project will be organized.

Case study background
This section presents risk evaluation with different methods, including traditional FMEA, fuzzy FMEA and TREF, in the area of electric motor manufacturing. This case was chosen because of its substantive significance (Ragin, 1999). We have used a single-case design approach, where the case is selected because it is critical; that is, its conditions allow our method to be tested (Dubé and Paré, 2003; Yin, 2013). Some features of the company and processes must be clarified before the numerical illustration. This Hungarian subsidiary of a multinational corporation operates in the high-technology automotive industry. In the last decade, the market for high-precision drive systems has grown substantially. This company is a global leader in high-quality electric motors that are installed in critical applications such as surgical power tools, race cars and high-precision industrial applications. In so-called high-added-value manufacturing, the reliability of products plays a crucial role in their long lifespans. To improve the reliability of processes, a risk evaluation was conducted. The company has integrated quality management (ISO 9001), environmental management (ISO 14001) and health and safety management (ISO 45001) systems. Company processes are divided into three categories (main processes): support (e.g. IT and process planning, finance, legal, facility and vehicle fleet, environmental protection); management (e.g. personnel development, risk management, management review, operative planning, process evaluation, audits, continuous improvement); and customer order-related processes (e.g. procurement, material management, production planning, production, quality assurance, warehousing). In this study, maintenance activities were selected as illustrative examples of the proposed model in Figure 2. They allow us to present the evaluation of each domain and all risk factors.
Maintenance activities do not occur in separate functional units but are integrated with the core functions of the company.
Maintenance includes the series of actions taken to maintain or restore the functionality of facilities/equipment. Maintenance activities occur in three processes: building engineering in facilities and the vehicle fleet (1.4.01P), means of production maintenance (1.6.01P), and maintenance of inspection tools in quality assurance (4.7.03P). In each case, potential failure modes, their causes and effects (on all three domains, i.e. quality, environment, and health and safety) and the evaluation of risk factors were first identified by the risk evaluation team. Figure 3 shows the logical connections among five failure modes, four identified causes and nine possible effects. The risk evaluation team, including the system manager, the process manager and an academic expert, first identified five potential failure modes. The column marked "Processes" indicates the three maintenance processes: building engineering, means of production maintenance and inspection tool maintenance. The column marked "Causes" indicates the four causes: 045C (inadequate maintenance) and 046C (insufficient technical requirements) are common causes of two failure modes, and the remaining two causes are 018C (devices not registered) and 012C (lack of knowledge). The column marked "Failure modes" indicates the type, that is, 1.4.01P.001M: equipment failure in building engineering; 1.6.01P.001M: equipment failure in means of production maintenance; 1.6.01P.002M: nonplanned maintenance; 4.7.03P.001M: failure to maintain inspection tools; 4.7.03P.002M: improper maintenance requirements for inspection tools. For example, the failure mode equipment failure (1.4.01P.001M) is caused by insufficient technical requirements (046C) and inadequate maintenance (045C), and it affects quality (time loss, 014E(Q)), environment (pollutants released into the environment, 050E(E)) and health and safety (discomfort, 051E(H), and health impairment, 053E(H)).
As can be seen from the identifiers, causes and effects are not assigned to specific processes or failure modes; there is a common database for the whole company. For example, "operator failure" or "mistyping" might occur in many processes and domains. This allows a smaller data set with codes that are easier to memorize.
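The shared, company-wide catalogue of causes and effects can be sketched as a small data structure. The codes and descriptions are those given in the text for failure mode 1.4.01P.001M; the class, field and function names are hypothetical, and the actual database layout used by the company is not described in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Company-wide catalogues, keyed by code (entries taken from the text).
CAUSES: Dict[str, str] = {
    "045C": "inadequate maintenance",
    "046C": "insufficient technical requirements",
    "018C": "devices not registered",
    "012C": "lack of knowledge",
}
EFFECTS: Dict[str, str] = {
    "014E": "time loss",
    "050E": "pollutants released into the environment",
    "051E": "discomfort",
    "053E": "health impairment",
}


@dataclass
class FailureMode:
    """A failure mode that only references catalogue codes."""
    code: str
    description: str
    cause_codes: List[str]
    # (effect code, domain) pairs; domains: Q, E, H.
    effects: List[Tuple[str, str]]


EQUIPMENT_FAILURE = FailureMode(
    code="1.4.01P.001M",
    description="equipment failure in building engineering",
    cause_codes=["046C", "045C"],
    effects=[("014E", "Q"), ("050E", "E"), ("051E", "H"), ("053E", "H")],
)


def modes_with_cause(modes: List[FailureMode], cause_code: str) -> List[str]:
    """Codes of all failure modes that reference a catalogue cause."""
    return [m.code for m in modes if cause_code in m.cause_codes]
```

Because failure modes store only codes, a cause such as 045C is described once and can be referenced by any number of failure modes, which is what makes common causes easy to find.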

Applied methods
To check the applicability of TREF, it was necessary to compare it with the most frequently used risk evaluation methods, traditional FMEA and fuzzy FMEA (Liu et al., 2013). Using traditional FMEA alongside fuzzy FMEA may at first seem illogical because the two are not normally used together. Fuzzy FMEA was developed to help those who are not experts in FMEA by means of linguistic terms. For this test, we developed our fuzzy FMEA method by working backward, as an example to test the usability of the TREF. We used sigmoid and bell/splay functions as membership functions (Johanyák and Kovács, 2004), and calculations were conducted via a weight method. Defuzzification relied on the multiplication of membership functions.
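For readers unfamiliar with the two membership function families mentioned above, a minimal sketch of their common textbook forms follows. The parameterization (slope, width, shape, center) is an assumption; the parameter values used in the case study are not reported in this section.

```python
import math


def sigmoid_membership(x: float, slope: float, center: float) -> float:
    """Sigmoid membership: rises from 0 to 1 around `center`."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))


def bell_membership(x: float, width: float, shape: float, center: float) -> float:
    """Generalized bell membership: 1 at `center`, decaying on both sides."""
    return 1.0 / (1.0 + abs((x - center) / width) ** (2.0 * shape))
```

A sigmoid is a natural fit for the open-ended terms at the two ends of a scale (e.g. "very high"), while a bell function suits the intermediate linguistic terms.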
5.2.1 Implementation of the proposed TREF. For the TREF, we have used three additional risk factors in the case study, namely, control (C), information (I) and range (R), for a total of six factors. The first three are the same as those used in traditional FMEA and fuzzy FMEA: severity (S), occurrence (O) and detectability (D). This shows that the TREF is flexible and can include any number of risk factors ($n \geq 2$). The risk evaluation team agreed on the values of severity, occurrence, detection, control, information and range by using Tables AI-AIII. The next step is to evaluate the importance of each risk factor in all domains to generate their weights. According to ANP, the reciprocal matrix determined through pairwise comparison for the three domains is shown in Table I.
Values in the table were generated according to Saaty (1987, 2004). The CI comes from the matrix of comparisons, RI is the random consistency index and w is the weight. The consistency ratio (CR) can be calculated as follows: $CR = \sum w\,CI / \sum w\,RI$. Weights were calculated using geometric means. The CR was calculated by using the information in Table I. Based on the risk evaluation team's pairwise comparisons, the quality and environment domains are judged to be equally important, while health and safety is considered less important. Table II shows the (0-th level) weights ($W^{(0)}_{i,j}$) of the six risk factors ($i = 1, \ldots, 6$) in the three domains ($j = 1, 2, 3$).
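The weight calculation can be sketched for a single pairwise comparison matrix. This is an illustration under stated assumptions: the function names are hypothetical, the matrices in the usage note are invented examples rather than the values of Table I, and the sketch computes the standard single-matrix ratio CR = CI / RI instead of the weighted aggregate over several matrices given above. The random index values are the commonly cited ones from Saaty.

```python
from math import prod
from typing import List

# Commonly cited random consistency index (RI) values for sizes 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def geometric_mean_weights(matrix: List[List[float]]) -> List[float]:
    """Row geometric means of a reciprocal matrix, normalized to sum to 1."""
    n = len(matrix)
    means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(means)
    return [m / total for m in means]


def consistency_ratio(matrix: List[List[float]]) -> float:
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    w = geometric_mean_weights(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]
```

For instance, a hypothetical matrix `[[1, 1, 2], [1, 1, 2], [0.5, 0.5, 1]]`, in which the first two domains are equally important and the third is half as important, is perfectly consistent: its consistency ratio is zero up to rounding, and the first two weights are equal.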
In the case of the quality domain, detection has the greatest weight, while in the case of the environment and health and safety domains, severity has the greatest weight. Table II also shows that "Range" is the second-most important risk factor in the environment domain.
The effects are evaluated using the method proposed in Section 3. Each effect's TRPN value was obtained by calculating the $S_1$ to $S_4$ risk aggregation functions. Figure 4 shows the TRPN calculations and two kinds of warnings, that is, W1 and W3. For example, the TRPN for the 051E(H) effect of failure mode 1.4.01P.001M, calculated according to the $S_1$ to $S_4$ risk aggregation functions, is shown in Figure 4. To use the proposed TREF as a module in an expert system, different levels of aggregation should be performed. According to risk aggregation function $S_1$, the weighted geometric mean of total risk priority numbers was calculated for process levels, failure modes, common causes and common effects. Since the effect discomfort (051E(H)) was judged to be four times less important than health damage (053E(H)) by the risk evaluation team, we weighted the geometric mean value (the value input into the oval in Figure 4), which is used to calculate $TRPN^{(2)} = 2.426$. Failure mode 1.4.01P.001M has two other effects, 014E(Q) ($j = 1$) and 050E(E) ($j = 2$), which were evaluated from the quality (Q) and environmental (E) points of view (see Table BI in Appendix B). Their TRPN values are listed in Table BI, and $TRPN^{(1)}_{3} = 2.36$ (see Figure 4). This value (the average TRPN for the quality/environment/health and safety effects of failure mode 1.4.01P.001M) represents a general view of failure modes. The weighted average TRPN for failure mode 1.4.01P.001M is reported in Figure 4 and Table BI. These values are lower than a critical value (threshold); however, to detect the number of failure effects, we had to calculate both the maximum values of TRPNs and the number of failure effects (see the results in Figure 4 and Table BI). It is important to note that the proposed multi-level approach detected more (in this case, three) failure effects, which would not have been possible when calculating RPNs for only one aspect.
Moreover, Figure 4 and Table BI show that the traditional RPN, which is based only on the occurrence (O), severity (S) and detection (D) factors, cannot detect the critical range (R) within these effects (014E(Q), 051E(H) and 053E(H)).
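The weighting of the two health and safety effects described above can be written as a weighted geometric mean. The form below is a sketch, not the paper's own equation: the symbols $x_k$ and $w_k$ are introduced here for illustration, and the weights 0.2 and 0.8 are an assumption obtained by reading "four times less important" as a 1:4 ratio normalized to sum to one.

```latex
\bar{x} = \prod_{k} x_k^{\,w_k}, \qquad \sum_{k} w_k = 1,
\qquad \text{e.g. } w_{051E(H)} = \tfrac{1}{5} = 0.2, \quad w_{053E(H)} = \tfrac{4}{5} = 0.8 .
```

Here $x_k$ stands for the TRPN of effect $k$ at the level being aggregated, so the more important effect (053E(H)) pulls the aggregated value toward its own TRPN.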
This case study shows that the TREF is a flexible risk evaluation framework. First, the same source of hazards caused risks in multiple management areas, such as automotive customer, special environmental concerns and data handling of risky processes, and each effect was evaluated by various criteria for the three domains. In addition, TREF can address an arbitrary number of risk factors; we used 6 + 1 risk factors, namely, severity (S), occurrence (O), detection (D), control (C), information (I) and range (R), with criticality as the +1. Finally, different risk factors had different weights in the case of the three domains; for example, "range" was the second-most important risk factor in the environment domain.
5.2.2 Comparison of applied risk evaluation methods. FMEA and fuzzy FMEA are special cases of the proposed TREF, where only three (i.e. (O,S,D)) factors are considered (see Table III). In the case of FMEA, the values are represented as determined by the risk evaluation team, and for fuzzy FMEA, the input values were modified for 5-5 linguistic terms (described by the membership function: τ), which covers the factor range.
Both traditional FMEA and Fuzzy FMEA can be extended considering the importance of the weights of factors. In Section 2, we noted that these weights can be obtained by AHP/ANP or any pairwise comparison methods. All of these methods predict a failure mode if the aggregated risk value is not lower than a threshold W2.
As presented in Section 2, the most frequently used aggregation function is the product ($\prod$); however, other relevant aggregation functions (see, e.g. $S_1$, $S_3$, $S_4$) can be used.
The proposed TREF does not require weights (see Row 3 in Table III), but thresholds for all factors (W1) and direct settings (W3) can also be applied. Figure 5 shows all risk effects on the quality domain of the failure modes in process 4.7.03P (see Figure 3). Figure 5 also shows the thresholds set for the factors, highlighting four risk factors (occurrence, control, information and range) related to four failure effects (015, 016, 020 and 021). TREF does not require all risk factors to be filled in: it allows an arbitrary number of risk factors (see failure effects 017 and 018). Table III shows the detected warnings and predicted failure modes of the hierarchical warning system. FMEA and fuzzy FMEA methods focus only on aggregated values; therefore, W1 and W3 cannot be interpreted by these traditional methods. Nevertheless, if thresholds are specified for the first three factors, one W1 warning event can be detected, and therefore one failure effect (016) can also be predicted by the traditional FMEA and fuzzy FMEA methods. If the three extra factors are considered, five additional W1 warning events can be detected. Therefore, three more failure effects (015, 020, 021) can be predicted (see Figure 5). In addition, based on expert decision, the criticality of effect 019 is set to 1. Therefore, one more warning (W3) is detected, and one more failure effect is specified.
The four applied aggregation functions signaled eight W2 warnings. Two warnings and two additional failure effects (017, 018) were detected by the weighted median ($S_3$), and three warnings and three failure effects (016, 020 and 021) were detected by each of the weighted radial distance ($S_4$) and the unweighted geometric mean ($S_1$).
The six detected W1 warnings indicated specific corrective/preventive actions to mitigate the value of risk factors, while W2 and W3 warnings required general activities.

Advantages of the proposed framework
This empirical case study demonstrates the following advantages. The main benefit of the proposed TREF is its flexibility, which enables different company demands to be considered.
First, the conventional RPN method used by traditional and fuzzy FMEA includes three risk factors, that is, severity (S), occurrence (O) and detection (D). Using TREF, each effect can be evaluated through an arbitrary number ($n \geq 2$) of risk factors, where the TRPN is calculated from these factors. We used 6 + 1 risk factors in the case study, namely, severity (S), occurrence (O), detection (D), control (C), information (I) and range (R), with criticality as the +1. Flexibility is useful, as the traditional RPN, which is based only on occurrence (O), severity (S) and detection (D) factors, cannot detect the critical range (R) within the effects (014E(Q), 051E(H) and 053E(H)) (see Figure 4 and Table BI). Additional flexibility in the TREF occurs because it includes an extra risk factor (criticality) to allow the risk evaluation team to specify corrective/preventive actions even if the TRPN is lower than the specified threshold. As Figure 4 shows, despite average TRPNs ($TRPN^{(2)}_{051E,H}$ and $TRPN^{(2)}_{053E,H}$) that are lower than the specified threshold, 053E(H) is critical. This TREF property allows the risk evaluation team to specify corrective/preventive actions for that case.
Figure 5. The risk evaluation table
Second, failure effects were evaluated by three domains, that is, quality, environment and health and safety. Each effect was evaluated by the different criteria of the three domains. The case study also demonstrated that the three domains had different weights and the different risk factors had different weights in the three domains. We expected that the traditional three FMEA factors (S, O and D) were the most important; however, the range was the second-most important risk factor in the environment domain.
Third, TREF allows for different levels of aggregation. Average TRPNs were calculated for process levels, failure modes, common causes and common effects. By aggregating failure effects, an average TRPN was calculated for the failure modes (e.g. $TRPN_{1P}$). In addition, the proposed TREF allows for another method of aggregation (i.e. horizontal aggregation). In that case, the TRPNs of all maintenance processes, common causes (e.g. $TRPN_{045C}$) and common effects (e.g. $TRPN_{005E(H)}$) were calculated. This property is useful as it allows for flexible (i.e. hierarchical and horizontal) aggregation by TREF users. In addition to these advantages, calculating the maximum values of TRPNs and the maximum values of risk factors provides an evaluation of the failure effects at the higher levels of the hierarchy.
Finally, the WS signaled a number of warning events and failure effects. When the three additional risk factors (range, information and control) were considered, five additional W1 warning events and three additional failure effects (015, 020, 021) were detected. Based on expert decision, an additional W3 warning and failure effect (019) were specified by the criticality factor. In addition, eight W2 warnings were signaled by the aggregation functions. Depending on the aggregation function, two additional failure effects (017, 018) were predicted by the weighted median ($S_3$), and three failure effects (016, 020 and 021) were detected by the weighted radial distance ($S_4$) and the unweighted geometric mean ($S_1$).

Conclusion
In this paper, we developed a flexible multi-level risk evaluation framework that is a crucial step in the customization of risk evaluation. The innovative nature of the proposed TREF lies in its flexibility, which helps to overcome the limitations of current risk evaluation methods, which are static in nature. The proposed method overcomes this weakness by using a flexible RAP and WS and an arbitrary number of risk factors. TREF allows novel multi-level aggregation, both hierarchical and horizontal. In the case of hierarchical aggregation, TRPNs can be calculated at different levels, such as factor, domain, effect, mode, process and organization. In the case of horizontal aggregation, the TRPN for all processes, common causes and common effects can be calculated. Flexible use of the RAP only works together with a flexible WS. Warning events and failure effects can be signaled from each part of the TREF WS, such as W1, W2 and W3. Specific needs can be addressed locally by specifying different thresholds and warning rules for different risk factors. In addition, we conclude that traditional risk factors such as S, O and D are not sufficient for a comprehensive risk evaluation. To overcome the limitation of current methods, TREF makes it possible to evaluate each effect by means of an arbitrary number ($n \geq 2$) of risk factors, including criticality. This makes TREF unique compared to other risk evaluation methods. TREF flexibly sets the number and importance of risk factors, which can differ by domain. The proposed framework was examined by means of a case study of an electric motor manufacturing company.
At the same time, TREF has a number of important managerial implications. By adapting TREF, managers can use different and adequate factors, weights, aggregation functions and warning rules at different levels that are adapted to the actual needs of the company. In this way, the proposed framework helps users to more comprehensively identify critical and risky elements in a more tailor-made manner than existing risk evaluation methods. Thus, TREF supports plant managers in designing a more appropriate risk management policy for the company, which might result in lower losses or increased profits. However, in the course of practical application, managers need to bear in mind the actual context of the company.
Future work in the following three directions is considered. First, the proposed framework can only address acyclic process graphs. However, in real operations, elements may occur that are reciprocally related to each other. Therefore, it would be interesting for future studies to incorporate cyclic structures into the framework. The second limitation of the present study is that the RAP works only together with the WS. Future studies should investigate when it is worthwhile to let the structure of the RAP diverge from that of the WS and, consequently, determine the cases for which such divergence is appropriate and advantageous. Third, in addition to the electric motor manufacturing company, more case studies should be conducted in the future to further verify and reinforce the reliability of the proposed framework.