Sortie-based Aircraft Component Demand Rate to Predict Requirements

Weapon System Sustainment (WSS) costs are growing at an increasing rate despite the vast efforts to reduce them. Researchers have attributed much of the cost increase to inaccurate demand forecasts for weapon system spare parts. In 2011, the forecast to sustain all United States Air Force (USAF) aircraft was 19% accurate and WSS costs per year have continuously increased. The purpose of this study is to explore a parsimonious change to aircraft component forecasting to reduce costly forecast error. This study substitutes flying hours with sorties for the purpose of demand forecasting. Many F-16 and B-52 spare parts are evaluated by employing demand and usage data from the D200 and LIMS-EV. The modified Poisson process modeled in this study indicates error can be decreased for many of the components the USAF invests in. This study resulted in roughly a 15% decrease in forecast error among the F-16 and B-52 platforms. Decision makers can employ the insight gained from the model developed in this study to reduce WSS costs and improve performance.


I. Introduction
Background
Predicting future needs for aircraft spare parts is an important issue within the United States Air Force (USAF). In the USAF's complex multi-echelon, multi-indenture supply repair cycle, an inaccurate demand forecast may result in improper work schedules at the Air Logistics Centers, incorrect peacetime operating stock levels at base supply warehouses, and incorrect stock levels in aircraft deployment readiness kits. The consequences of such inaccuracy include one or more spare parts not being available for an aircraft that the USAF needs to fly. The impact of an unavailable aircraft could include a missed training opportunity for a pilot. More severely, an unavailable aircraft could mean that one of the USAF's mission sets, like personnel recovery or air superiority, is degraded. If demand for any given component is overestimated, too many spare parts are stocked, and other needed parts are not purchased or repaired due to sustainment budget constraints.
The current USAF process employs reliability theory and forecasting techniques to determine future demand. Some critiques of the USAF forecasting method submit that the USAF's process should be updated because it continues to underperform (Eckbreth et al., 2011). This study is a parsimonious effort to improve forecasts and diverges from the critique's recommendation of employing "more sophisticated data analysis". For most spare parts, the USAF currently calculates reliability based on the number of flying hours associated with the end item the spare part belongs to. The following section further clarifies this problem and the cost of inaccurate demand predictions.
Sustainment costs for USAF weapon systems, especially legacy systems, are untenable. Eckbreth et al. (2011) also tied the challenges facing the sustainment enterprise to supply chains that are inefficient due to the inability to accurately predict parts needs. They claim that in 2011 the demand forecast for spare parts was only 19% accurate. Furthermore, USAF expenditures to operate and maintain the active fleet ballooned to $63.7 billion in 2019 dollars. Part of the growth in sustainment expenses was due to the age of the fleet (Gunzinger et al., 2019). Figure 1 shows the upward trend of operation and maintenance costs per aircraft across all fleets. In the 35 years prior to 1997, USAF funding for operations and maintenance increased by $3.4 million per aircraft. However, in the 20 years since, this portion of funding increased by $5.1 million per aircraft. Figure 2 shows the downward trend in new aircraft procurement spending. Considering that prior to 2001 these growth rates were between 1% and 2%, the office finds it alarming that the cost of maintaining and operating an aging fleet continues to grow at a faster rate. With plans to continue to fly legacy systems like the B-52 until 2050, and given the exceedingly high price tag on many parts, it is imperative that supply chain planners continue to adapt and find more accurate calculations for future spare part needs beyond the inadequacies of the legacy D200 forecast method.

Problem Statement
The USAF uses the flying hour program to determine several rates and percentages, including spare part consumption rates. In many cases, the rate at which aircraft spare parts fail and place a demand on the supply system does not show a strong correlation with actual hours flown. This translates to a demand rate calculation that produces an inaccurate future year forecast.

Research Questions
To address this problem and how it is claimed to affect sustainment costs, this study answers the following research questions:
1. Can sortie data be employed to reduce USAF forecast error?
2. How can the D200 process integrate a sortie rate?
3. What methods are available to simulate future requirements based on sorties?

Research Focus
This study assesses the effectiveness of the current flying hour-based spare part demand rate by comparing it to a sortie-based demand rate. The mean absolute percent error (MAPE) of the forecasts produced by the two methods is calculated to show which method performs best.

Methodology
The USAF calculates future demand for spare parts by multiplying the flying hour-based demand rate by the approved number of flying hours allocated for the upcoming year. A correlation analysis is performed to investigate whether demand for F-16 and B-52 spare parts has a stronger linear relationship to flying hours or sorties. Understanding this relationship helps validate a previous study regarding this subject and provides justification for this study to investigate further. This study replaces the demand rate in this process with a sortie-based rate. Then, this study develops various methods to forecast flight allocations based on number of sorties because the USAF does not allocate flying in these terms. To accurately compare demand predictions, a sortie-based rate must be multiplied by a time period expressed in sorties rather than flying hours. The F-16 and the B-52 fleets were chosen to analyze the difference a sortie-based demand rate makes for a large fleet like the F-16 and a small fleet like the B-52. For both fleets, the error from the current system and the proposed model are compared to determine which produces the least amount of error.

Assumptions/Limitations
This study is limited by the data-collection environment. For example, demand data is collected and reported in a complex manner. It is reasonable to assume that this complex process of reporting the number of times a component failed and maintenance activities placed a request for replacement introduces a level of inaccuracy. Furthermore, some observations were not included in the study because their demand data was not available due to various data entry errors. Using secondhand data and eliminating samples in this manner can distort the results of this study. This study employs the USAF's D200 Poisson process forecast with sorties to measure time between demands. This allows for an intelligent baseline for comparison, despite potential limitations of the D200 forecast methodology, such as assuming a constant demand rate.

Implications
There are few studies addressing the time measure of USAF repairable component failures. To the best of the authors' knowledge, there are no studies that analyze spare component demand as a function of sorties and compare the results of a sortie-based driver of demand to the current flying hour-based driver. The USAF does not provide a sortie forecast as an input to the D200 model. Thus, the methods in this study to estimate a sortie forecast are original.

Chapter Overview
The purpose of this chapter is to explore the vast body of knowledge on forecasting. It starts with the structure of forecasting. Then the literature review follows the lineage of time series forecasting techniques as they have grown in complexity and accuracy. Then the literature review differentiates time series forecasting from the techniques that count data like inventory demand require. Furthermore, the literature review examines the Poisson process, where demand arrival is exponentially distributed and the expected number of demands in each time period follows a discrete Poisson distribution. The Air Force manual that governs the D200 forecasting system is examined to illustrate that it follows the Poisson process. Finally, the related aircraft component demand forecasting literature is assessed to ensure originality of this study.

Forecasting
The review of the literature pertaining to this study begins with the concept of predicting future outcomes, or forecasting. The need for forecasting increases with a manager's attempt to minimize dependency on chance by becoming more scientific in dealing with an uncertain environment. Forecasting techniques can be placed into two main categories: quantitative and qualitative. Quantitative forecasting can be employed when there is enough empirical information regarding the past and it can be assumed that the past patterns in the information will continue into the future. If these conditions are not met, qualitative techniques can be employed. If neither condition is met, the topic of interest is unpredictable. Moreover, quantitative forecasting methods can be viewed as a continuum with two extremes. At one extreme there are intuitive or ad hoc methods and at the other extreme there are formal statistical methods. Another dimension for classifying quantitative forecasting methods is the model used. The two main forecasting models are time series and explanatory models. Explanatory models like regression assume that the variable to be predicted has some relationship with one or more independent variables. However, time series models make no effort to explain the factors that may affect the variable that is being predicted. Time series models attempt to find a pattern in the historical data and generalize that pattern into the future (Makridakis et al., 1998). Bowerman et al. (2005) define a time series as a chronological sequence of observations on a particular variable. It is a quantifiable variable over some time measure. The authors explain that the components of a time series are trend, cycle, seasonal variation, and irregular fluctuation. The authors argue that due to the irregular fluctuation, no single best forecasting model exists.
So, the biggest problem with forecasting is fitting an appropriate model to the pattern in the available time series data. The fluctuations are modeled as part of the error in forecasting. So, according to the authors, large forecasting errors can indicate that the irregularity is too great for forecasting or another model or technique could be more appropriate. Before explaining error and the importance of analyzing forecast error, it is important to review the different time series forecasting methods.

Beyond using averages or naïve one-month moving averages to predict future occurrences, Norbert Wiener (1949) began using statistical concepts from communication engineering and cybernetics to make predictions based on the smoothing of stationary time series. Much as early communication devices depended on probability distributions to predict the most likely intended message and provide it to the receiver, Wiener proposed that time series data behaves this way and can be used to make predictions. Brown (1959) based much of his work on Wiener's ideas, using statistical forecasting for inventory control. His work was an early application of smoothing and other advanced tools like Monte Carlo simulation to demand predictions. It may have been unknown to Brown, but Holt (1957, reprinted 2004) documented the idea of smoothing variation or random fluctuations and derived equations to model trend and seasonal fluctuations. Brown's work in 1959 attempted to make the abstract concept and the mathematics more user friendly for an inventory control specialist or manager. Winters' (1960) work added to the time series forecasting body of knowledge by comparing weighted exponential smoothing to traditional methods of the time to show that it can model trend and seasonality, if present, and provide a more accurate forecast.
Additive and multiplicative forms of exponential smoothing were theorized in much of the early work. However, Pegels (1969) formally presented the nine possible models in graphical form and summarized them in one formula that readers can comprehend. Before Pegels' work, Muth (1960) was the first to apply statistical concepts like linear regression to time series and showed that simple exponential smoothing (SES) provided an optimal forecast for what he called a "random walk with noise". Later, Box & Jenkins, (1970)  With research, scholars articulated other methods, like state-space models (Ord et al., 1995). These advanced methods are outside the scope of this study. This section of the literature focused on time series with linear relationships between the variable and time. Time series can also show a non-linear relationship between time and the variable of interest, and the methods for such series are much more complex than the aforementioned linear methods. Furthermore, they are difficult to perform and are outside the scope of this study.
The next section will cover the literature on forecasting count data because aircraft spare parts demand can be described as intermittent count data. Croston (1972) argues that simple exponential smoothing can be inappropriate when forecasting count data like inventory when the intervals between demand are shorter than the period between demand. From a routine stock control perspective, this forecasting situation results in inventory predictions based on these intervals instead of average demand. Doing so assumes that the interarrival time is uniform. He then builds on Box and Jenkins' ideas of non-stationary demand, often a characteristic of inventory demand, and employs a stochastic approach to modeling interarrival times of demand, which reduces error in intermittent non-stationary count data forecasts. His improved system makes separate forecasts for demand size and the arrival interval of demand and eliminates previous models' biases towards regular demand, where there is a demand signal in every interval. His model adjusts for periods without a demand signal. After this work, inventory control forecasters have taken a stochastic approach to forecasting demand. forecast inventory demand. Furthermore, they argue that fitting the demand to a two-parameter distribution provides evidence of improved forecasts. The focus of their study is summarized in Figure 3. Figure 3 can also be used as a model when determining methods to estimate parameters for intermittent demand distributions. The authors build on previous work and empirically show that distributions must be found for the demand size and the demand arrival.
Agreeing with Croston (1972), Syntetos et al. (2011) conclude that the Poisson distribution is a "reasonable" distribution to model the behavior of these items, that it is theoretically expected of slow-moving items like aircraft parts, and that the interarrival time of demand is not uniformly distributed. Before this work was presented, practitioners and researchers were studying demand as random. The work presented by Syntetos et al. (2011) provided additional empirical evidence in support of treating demand as random failures versus component wear-out.

Forecasting Count Data
However, demand can be considered a random event or the result of a wear-out process. Evaluating demand through both lenses is the cornerstone of reliability theory. Ebeling (2004) explains why the Poisson process is used to model demand behavior and make predictions of future failures. According to Ebeling (2004), if a part having constant failure rate λ is immediately repaired or replaced, the number of failures expected over a time period t has a Poisson probability mass function. The Poisson distribution is discrete, and the mean, or predicted number of failures over time, is given by λt.
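As a minimal sketch of the relationship Ebeling describes, the snippet below computes the expected number of failures λt and the Poisson probability of observing exactly k failures. The rate and exposure values are hypothetical and chosen only for illustration; they do not come from the study's data.

```python
import math

def expected_failures(rate: float, exposure: float) -> float:
    """Mean of the Poisson count: lambda * t."""
    return rate * exposure

def poisson_pmf(k: int, rate: float, exposure: float) -> float:
    """P(N = k) for a constant failure rate over the given exposure."""
    mean = expected_failures(rate, exposure)
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Hypothetical values: 0.002 failures per flying hour over 1,500 flying hours.
rate, hours = 0.002, 1500.0
mean = expected_failures(rate, hours)

# Probability that three or fewer failures occur over the exposure.
prob_at_most_3 = sum(poisson_pmf(k, rate, hours) for k in range(4))
```

The exposure t can be measured in any unit, provided the rate λ is expressed per that same unit, which is what allows flying hours to be swapped for sorties.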

USAF Forecasting
To assist USAF managers in consolidating complex resource data and becoming more scientific in dealing with their environment, the USAF uses the Secondary Item Requirements System (SIRS), also referred to as the D200A. Essentially, the USAF calculates each year's spare parts requirement in order to submit a budget to buy total requirements minus the number of parts that are projected to be fixed and returned to service. The USAF calculates each item's current consumption rate λ, which SIRS translates into reliability information called rates and percentages (RAP) or factors (Air Force Materiel Command, 2017). The current consumption rate is multiplied by the next year's projected flying hours to calculate requirements. With demand forecast accuracy at 19% as recently as 2011 (Eckbreth et al., 2011), leaders have stressed the need to improve future demand calculations.

Efforts to Improve Forecasts
More recently, the U.S. Government Accountability Office (2015) found that the DoD was still in the early stages of improving its demand forecasting and remained on the annual report's high-risk list. The USAF and other DoD services are still not where they need to be in terms of more precise demand forecasting methods. This can cause the agencies to overspend on spare parts. GAO maintains a program to concentrate on government operations that it identifies as "high risk" due to the operations' high potential for fraud, waste, abuse, and mismanagement, or their need to address economy, efficiency, or effectiveness challenges.
Even though these problems remain, there has been work done to improve spare part forecasts and increase their accuracy. The same studies agree for the most part that aircraft parts are difficult to forecast. Bachman and O'Malley (1990) credited the difficulty to volatility in item demand rates and the effects of Air Force management decisions. The two researchers continue, suggesting the USAF should pursue improvements in technical forecasting, but that any solution should include the development of stronger management controls to improve the stability of the requirement. Bachman and Kruse (1994) found that for less volatile items like aircraft consumables, demand was not strongly correlated to weapon programs like flying hours or total number of weapons.
This research is significant to this study because it analyzed demand correlating to other programs besides flying hours and challenges the general notion that aircraft components fail at a rate based on the number of flying hours. Sherbrooke (1997) used maintenance removals to simulate demand and found that in many cases when sortie durations are not constant, demand is more closely correlated to number of sorties. He also noted that supply data should be used rather than maintenance removals because parts are often removed but never turned in for repair.

Chapter Summary
Even though this issue is very complex and difficult to solve, the studies in this section claim the USAF has shown a poor track record and has lost credibility in terms of forecasting requirements to make decisions that maximize the nation's return on investment (Eckbreth et al., 2011; Gunzinger et al., 2019). To address this problem, researchers have shown the USAF may have been incorrectly attributing component demand to the number of flying hours rather than analyzing demands on a per-sortie basis. This study will verify Sherbrooke's findings and employ the USAF's Poisson process with sorties as the time measure versus flying hours to determine and validate that sorties are a better predictor of demand than flying hours. Furthermore, this study will develop a methodology that uses supply demand data rather than maintenance data to evaluate requirements on a per-sortie basis to predict demand.

Chapter Overview
This chapter will describe how the D200 works and explain how a model can be built to simulate it with the use of sortie data. To begin, the rationale behind using F-16 and B-52 data and instructions on how to obtain the data are provided. To validate Sherbrooke's (1997) work, the chapter shows how correlations between demand and flying hours, and between demand and sorties, can be calculated and interpreted for both sets of data. Then, to mimic the current USAF Poisson process to predict demand, a sortie-based demand rate λ is calculated. In order to predict future demand, usage must be forecasted in terms of sorties, not flying hours. Since the USAF does not currently provide the D200 usage estimates as a function of sorties, four methods will be developed to simulate future usage as a function of sorties instead of flying hours. This section will explain the rationale and steps to create the four methods and how to apply them to the model to produce four separate forecasts to compare to the D200 forecast. The mean absolute percent error (MAPE) will be used to compare the error of the D200 process and the sortie-based process.

Correlation Analysis
Following the data collection, a correlation analysis was performed to compare each item's demand with how many hours were flown each year. The correlation (CORREL) function in Microsoft Excel produced a Pearson's r value for each item indicating how strongly flying hours correlated with demand for the part throughout the years. This process was repeated with the number of sorties. items and over 52% of the B-52 items, the proportion of demands to sorties may be used to more accurately calculate future demand. It is important to note that correlation does not mean that the number of sorties planned for the next year will predict the demand for each part more accurately. Furthermore, the USAF does not provide the logistics community a forecast or plan for number of sorties. This problem will be addressed later in this chapter with the four proposed methods for forecasting sorties. The next step in the study is to replicate the USAF's D200A calculations for future demand using number of sorties instead of flying hours.
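The comparison described above can be sketched in Python as follows. The study itself used Excel's CORREL function; this snippet computes the same Pearson's r by hand. The yearly demand, flying hour, and sortie values are hypothetical and serve only to show the mechanics.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly history for a single item.
demand = [12, 15, 11, 18, 16]
flying_hours = [4100, 4300, 4200, 4250, 4400]
sorties = [2000, 2500, 1900, 3000, 2700]

r_hours = pearson_r(demand, flying_hours)
r_sorties = pearson_r(demand, sorties)

# Flag which usage measure shows the stronger linear relationship.
better_driver = "sorties" if r_sorties > r_hours else "flying hours"
```

Repeating this per item gives the share of items for which each usage measure is the more strongly correlated one.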

USAF Demand Calculations
The USAF primarily uses an eight-quarter moving average factor method to calculate the next year's demand for each spare part (Defrank, 2017). Using the following equation, the USAF begins by calculating the average demand per flying hour: where T is the time period (Defrank, 2017).
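A minimal sketch of an eight-quarter moving average factor follows, assuming the factor is the total number of demands over the eight quarters divided by the total flying hours over the same quarters. That form is an assumption made for illustration, not a statement of the exact D200 computation, and all quantities below are hypothetical.

```python
def demand_rate(demands, usage):
    """Total demands divided by total usage over the same periods."""
    if len(demands) != len(usage):
        raise ValueError("demands and usage must cover the same periods")
    total_usage = sum(usage)
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return sum(demands) / total_usage

# Hypothetical history for one item over T = 8 quarters.
quarterly_demands = [3, 2, 4, 3, 5, 2, 3, 2]
quarterly_flying_hours = [1000, 1100, 900, 1000, 1200, 950, 1050, 800]

factor_per_hour = demand_rate(quarterly_demands, quarterly_flying_hours)

# Hypothetical approved flying hours for the upcoming year.
next_year_flying_hours = 4200
forecast_demand = factor_per_hour * next_year_flying_hours
```

Because the function only needs a usage series, the same call works unchanged when the usage series is sorties instead of flying hours.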

Average Demand Per Sortie
This study will use the following equation to calculate the factor as the average demand per sortie: The above factor must be multiplied by the number of anticipated future sorties to render a forecast that is comparable to the current method discussed in the previous section. However, the USAF does not provide a forecast for the number of sorties for the D200 to calculate future demand. The next step of this study is to develop a reasonable forecast for the future year's sorties to closely simulate the results of a sortie-based D200 demand forecast.
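One way to write the sortie-based factor and the resulting forecast is shown below, assuming the factor mirrors an eight-quarter moving average with sorties in place of flying hours. The symbols are notation introduced here for illustration: D_t is the number of demands in period t, S_t is the number of sorties in period t, T is the number of periods, and \hat{S}_{\text{next}} is the sortie forecast for the upcoming year.

```latex
\hat{\lambda}_{S} = \frac{\sum_{t=1}^{T} D_t}{\sum_{t=1}^{T} S_t},
\qquad
\hat{D}_{\text{next}} = \hat{\lambda}_{S} \cdot \hat{S}_{\text{next}}
```

Under this form, the only input that the current process does not already supply is \hat{S}_{\text{next}}, which is what the sortie forecast methods are meant to provide.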

Sortie Forecasts Method #1
The first method will provide a baseline for the remaining sortie forecast models.
Calculating a demand forecast using the proposed sortie-based factor and the next year's actual number of sorties obtained from LIMS-EV will demonstrate the accuracy that other models can be compared to. Because the actual number of sorties flown in the next year is not known at the time of the forecast, this method can be thought of as a goal for the remaining methods in this study to measure against.

Sortie Forecasts Method #2
The second method in this study simulates a sortie forecast by converting flying hours to sorties using the average sortie duration. To calculate number of sorties, the average duration rate is divided into the number of flying hours (Air Force Materiel Command, 2017). As the number of sorties is known, this equation can be used to calculate the average sortie duration (ASD). This sortie forecast method calculates the ASD for each observation year. The next year's flying hour forecast is then divided by the current year's ASD to transform this forecast into a sortie forecast. Finally, the product of the sortie forecast using this method and the model's sortie-based rate (λ) is a reasonable demand forecast to compare with the baseline method #1.
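The conversion in this method can be sketched as follows, assuming, as stated above, that the number of sorties equals flying hours divided by the ASD. The flying hours, sortie counts, and demand rate are hypothetical.

```python
def average_sortie_duration(flying_hours: float, sorties: float) -> float:
    """ASD in hours per sortie for one observation year."""
    if sorties <= 0:
        raise ValueError("sorties must be positive")
    return flying_hours / sorties

def sorties_from_hours(flying_hour_forecast: float, asd: float) -> float:
    """Convert a flying hour forecast into an equivalent number of sorties."""
    if asd <= 0:
        raise ValueError("ASD must be positive")
    return flying_hour_forecast / asd

# Hypothetical current-year totals.
asd = average_sortie_duration(flying_hours=9000.0, sorties=6000)

# Hypothetical next-year flying hour forecast converted to sorties.
sortie_forecast = sorties_from_hours(9600.0, asd)

# Hypothetical sortie-based demand rate (demands per sortie).
sortie_rate = 0.004
demand_forecast = sortie_rate * sortie_forecast
```

This approach implicitly assumes that next year's sorties will have roughly the same average duration as the current year's.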

Sortie Forecasts Method #3
The third method this study uses to provide the model with a sortie forecast applies the Holt-Winters forecasting method to the historical number of sorties per year to forecast the next year's number of sorties. The triple exponential smoothing accounts for seasonality and trends in the series and is accomplished by the FORECAST.ETS function in Microsoft Excel. This function will automatically detect any seasonality or trend in the time series data and use the appropriate exponential smoothing formula. The forecasted number of sorties is multiplied by the new sortie-based factor to produce a demand forecast.
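To illustrate the idea of smoothing a sortie history into a one-year-ahead forecast, the sketch below implements Holt's linear-trend (double) exponential smoothing. It is a simplified stand-in and does not reproduce Excel's FORECAST.ETS: there is no seasonal component, and the smoothing parameters alpha and beta are fixed at assumed values rather than fitted. The sortie history and demand rate are hypothetical.

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Holt's linear-trend exponential smoothing, forecast `horizon` steps ahead.

    alpha smooths the level and beta smooths the trend; both are assumed values.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Initialize level and trend from the first two observations.
    level = float(series[1])
    trend = float(series[1] - series[0])
    for value in series[2:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + horizon * trend

# Hypothetical yearly sortie counts for one fleet.
yearly_sorties = [6100, 6000, 5900, 5850, 5700]
sortie_forecast = holt_forecast(yearly_sorties)

# Hypothetical sortie-based demand rate (demands per sortie).
sortie_rate = 0.004
demand_forecast = sortie_rate * sortie_forecast
```

With only one observation per year there is no within-year seasonality to estimate, which is why a trend-only form is used in this sketch.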

Time series Demand Forecast (TDF)
This study also applies the Holt-Winters forecasting algorithm to historical demand. This method ignores the sortie-based demand rate factor and provides a forecast based exclusively on historical demand. This method is simple and helps show whether a demand rate factor adds value when forecasting demand for spare parts.

Mean Absolute Percent Error
This study uses the mean absolute percent error (MAPE) to measure forecast error. The MAPE of the D200 demand forecast and the MAPE of the demand forecast produced by the four models in this study are calculated and then compared to determine the method that will provide the USAF with the best estimate of demand. The MAPE is calculated with the following equation:
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$
where A_t is the actual demand for observation t, F_t is the forecasted demand, and n is the number of observations.
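A short sketch of the MAPE comparison is given below, using the standard definition of MAPE. The actual and forecast values are hypothetical. Because MAPE is undefined when an actual value is zero, this sketch raises an error in that case; how zero-demand observations were handled in the study is not stated, so that choice is an assumption.

```python
def mape(actuals, forecasts):
    """Mean absolute percent error, in percent."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for actual, forecast in zip(actuals, forecasts):
        if actual == 0:
            raise ValueError("MAPE is undefined when an actual value is zero")
        total += abs((actual - forecast) / actual)
    return 100.0 * total / len(actuals)

# Hypothetical actual demand and two competing forecasts for three items.
actual = [10, 20, 40]
flying_hour_forecast = [15, 10, 40]
sortie_forecast = [12, 18, 36]

mape_hours = mape(actual, flying_hour_forecast)
mape_sorties = mape(actual, sortie_forecast)
lower_error_method = "sorties" if mape_sorties < mape_hours else "flying hours"
```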

Chapter Overview
This chapter will focus on answering the main research question. Tables will be presented to show the results of all forecasts in terms of the MAPE. Then tables will be presented to show the robustness of a sortie-based forecast. Finally, a table will show that future research can use QPA factors and percent applications factors to reduce error further and further justify using sorties to forecast spare part demand.

Aggregate F-16 Forecast Comparison
The results of the correlation analysis and the four simulated demand forecasts are designed to give decision makers insight and levers to pull when deciding to use a sortie-based demand forecast rather than a flying hour-based forecast. The analysis of the two data sets resulted in a spreadsheet that can be filtered by NIIN, fiscal year, federal stock group, federal stock class, or by the correlation between demand and sorties and demand and hours flown. When filtered by these categories, the MAPE is recalculated for each category.
To illustrate, Table 1 shows the resulting MAPEs of the F-16 parts filtered by fiscal year 2018.  shows that this study's proposal is not true for all parts and further research is needed to find possible explanations. For example, item 010454508 had a D200 forecast that was 100% accurate. This may be due to the item manager's ability to override the D200 forecast and negotiate their forecast during the SRRB. A future study could be used to explain this anomaly. Regardless of these anomalies, Tables 1 and 2 show that in general sorties should be used in the demand rate to calculate future demand for F-16 items.
Another conclusion that can be made from this study is that the models used are relatively robust and insensitive to some of the limitations. For example, Table 3 demonstrates that not all parts have the same value for flying hours like they do for sorties. The item's flying hours come from D200, which accounts for the item's QPA and its percent application. The item's number of flying hours is the product of total flying hours, QPA, and percent application, where QPA is 1 if there is only one of the items installed on the aircraft, 2 if there are two, etc. Percent application is the percent of each item that is used by the aircraft of study vs. other aircraft. For example, if a third of the USAF's inventory of a given item is allocated to the F-16 but the other two thirds are allocated to two other mission designs (MD), the percent application is 0.33. This explains the differences in flying hours between some items and can be seen in Table 3.
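The item-level flying hour calculation described above can be sketched as a simple product. The fleet total and QPA used here are hypothetical; the 0.33 percent application matches the example in the text.

```python
def item_flying_hours(total_flying_hours: float, qpa: int,
                      percent_application: float) -> float:
    """Flying hours attributed to one item: total hours x QPA x percent application."""
    if qpa < 1:
        raise ValueError("QPA must be at least 1")
    if not 0.0 <= percent_application <= 1.0:
        raise ValueError("percent application must be between 0 and 1")
    return total_flying_hours * qpa * percent_application

# Hypothetical fleet total with two units installed per aircraft (QPA = 2)
# and one third of the item's inventory allocated to this fleet.
hours_for_item = item_flying_hours(150000.0, qpa=2, percent_application=0.33)
```

Two items on the same fleet can therefore carry different flying hour totals even though the fleet flew the same number of sorties.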

B-52 Results
The results from this study are very similar for the B-52 fleet. Table 4 shows the 2018 forecast MAPEs for the B-52 fleet.

Chapter Summary
This chapter answered the main research question. Essentially, the methodology presented in this study shows that sorties can be employed to improve the accuracy of USAF demand forecasts. Aggregate forecasts for the F-16 and B-52 are compared and both fleets are shown to benefit from a sortie-based forecast.

Chapter Overview
The purpose of this chapter is to relate the study results to actionable recommendations for USAF decision makers. This chapter also outlines recommendations for future research to supplement this study and to address its limitations. The results of this study empirically show that there is a possibility for decreased error in USAF spare part forecasting. Although forecasting count data that is intermittent and non-stationary, like spare part demand, is difficult, this study employs a parsimonious method to decrease error and address an area that has historically caused the USAF to lose credibility. This study gives forecasters a tool that will allow them to compare their current prediction system to a sortie-based system. A quick comparison could result in better buying decisions when the SRRB proposes a budget for spare parts. With weapon system sustainment costs growing at an alarming rate, better decisions based on this study could decrease funds being inappropriately allocated and possibly restore some lost credibility. However, to make meaningful change, action must be taken.

Recommendations for Action
First, the USAF should terminate the use of flying hours to predict all demand.
This model allows decision makers to compare how flying hour forecasts performed in the past and can produce a forecast based on both sorties and flying hours. Essentially, the model provides the tool necessary for the USAF to transition from a one-size-fits-all system to a hybrid system. At the individual item level, the hybrid system will allow the forecaster to select the program that has historically shown less forecast error.
Second, if individual item comparison is infeasible, it is recommended to use the sortie-based model proposed in this study for all items. Demand forecasts aggregated from 2011 to 2018 for all parts in the study on the F-16 and B-52 fleets saw less error with the sortie-based model. Even though the sortie-based forecast did not outperform the USAF forecast for every item in these fleets, it did perform better when demand was aggregated in this fashion.
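The two recommendations above can be sketched as a simple selection rule: for each item, pick the usage driver with the lower historical error, and fall back to sorties when no item-level comparison is available. The item names and MAPE values are hypothetical, and the function is an illustration rather than part of any USAF system.

```python
def choose_driver(historical_mape, default="sorties"):
    """Return the driver with the lowest historical MAPE for one item.

    `historical_mape` maps a driver name to that item's historical MAPE.
    When it is empty, the default driver is returned.
    """
    if not historical_mape:
        return default
    return min(historical_mape, key=historical_mape.get)

# Hypothetical historical errors (percent) for three items.
item_errors = {
    "item_a": {"flying_hours": 42.0, "sorties": 31.5},
    "item_b": {"flying_hours": 18.0, "sorties": 27.0},
    "item_c": {},  # no item-level comparison available
}

selection = {item: choose_driver(errors) for item, errors in item_errors.items()}
```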

Recommendations for Future Research
First, future studies or improvements to the model proposed in this study should apply quantity per application (QPA) and percent application to the number of sorties attributed to each item. Applying these factors to the number of sorties USAF aircraft fly allows for a more precise allocation of sorties to each item installed on the aircraft. Future research can employ the model of this study with the more precise sortie allocation and could improve the forecast.
Furthermore, future research should explore the appropriateness of applying the Poisson process to every item. The literature suggests that some aircraft parts may have a failure distribution that differs from the Poisson process (Ebeling, 2004).

Summary
It is the belief of this research that a sortie-based demand rate could be applied to future requirements defined by the number of sorties expected to calculate a more precise demand forecast. This study shows exponential smoothing methods can be applied to historical sortie time series data to meet this requirement. The product of these two consistently outperforms the status quo and should be implemented to more accurately budget for future spare part requirements.