A measure of innovation performance: the Innovation Patent Index

Purpose – Measuring companies' innovation performance is fundamental for enhancing the value and decision-making processes of firms. The purpose of this paper is to present a new measure of innovation performance, called the Innovation Patent Index (IPI), which makes it possible to quantitatively summarize different aspects of firms' innovation. Design/methodology/approach – In order to define the IPI, a secondary source, i.e. patent data, has been used. The five dimensions of the IPI, i.e. efficiency, time, diversification, quality and internationalization, have been defined both by analyzing the literature and by applying three different machine learning algorithms (regularized least squares, deep neural networks and decision trees), considering patent forward citations as a proxy of innovation performance. Findings – Results show that the IPI is a very useful tool that is simple and quick to use. In fact, it is possible to get important results without carrying out time-consuming analyses with primary sources. It is a tool that can be used by managers, businessmen, policymakers, organizations, patent experts and financiers to evaluate and plan future activities, to enhance innovation capability, to find financing and to support and improve innovation. Research limitations/implications – Patent data are not widely used in all sectors. Moreover, the pure number of forward citations is not the only forward-looking indicator suggested by the literature. Originality/value – The demand for a usable innovation performance tool, as well as the lack of tools able to grasp different aspects of innovation, highlights the need to develop new instruments. In fact, although previous studies provide several measures of innovation performance, these are often difficult for managers to use, do not capture different aspects of innovation and are not forward looking.


Introduction
Performance measurement is fundamental for enhancing the value of companies and improving the decision-making process. Thanks to a performance measurement system based on sets of indicators (for details, see Dziallas and Blind, 2019), people can be motivated, learning can be stimulated and coordination and communication can be improved (Govindarajan and Govindarajan, 1993; Schumann et al., 1995; Kerssens-van Drongelen and Bilderbeek, 1999). In the last few years, the innovation process has become more and more complex, expensive and risky (Dziallas and Blind, 2019). The technological and competitive environments are changing dramatically and markets are becoming dynamic and turbulent. The technological knowledge content of products and processes has increased, knowledge emerges from the confluence of separate disciplines and stakeholders, and the life cycle is shortening.
In this context, innovation performance measurement is extremely important for both firms' managers and policymakers. From a managerial point of view, new performance measurement systems are needed to manage innovation efficiently. Indeed, innovation is one of the main drivers of productivity performance (Cainelli et al., 2005; Love and Roper, 2015). In addition, firms that can show themselves to be owners of good innovations may be better able to raise funds from investors. From the policymakers' point of view, timely, accurate, clear and reliable indicators are needed to set directions and rules that stimulate a fruitful innovation environment, to select industrial sectors requiring special attention, to evaluate the proposals of different applicants for innovation projects and to assess the progress of the initiatives funded (Dziallas and Blind, 2019).
So far, several indicators have been proposed to evaluate innovation performance and its impact on company productivity (see for a complete review Dziallas and Blind (2019) and Chiesa et al. (2008)).
The main source for measuring innovation performance is primary data, i.e. interviews with managers or surveys. However, primary sources are based on small sets of data (Kitsios and Kamariotou, 2016) because of difficulties in terms of firm reachability, costs and insufficient data quality, not to mention companies' reluctance to answer sensitive questions about their processes (Walsh, 1994). In addition, when based on a Likert scale [1], they are affected by respondents' subjectivity. Often, in order to avoid subjectivity, dichotomous questions are proposed, but they do not allow the nuances of the phenomena under investigation to be captured; see for example the Community Innovation Survey (CIS) [2]. In fact, binary (yes/no) measures have low bias and are efficient, but they miss some of the complexity involved in the innovation process (De Jong and Vermeulen, 2006).
In order to overcome the above-mentioned problems, other measures based on secondary data, such as patents or publications, have been proposed (Garg and Padhi, 1998; Burrus et al., 2018). Patents provide rich qualitative and quantitative information on technological change (Scherer, 1992); in fact, the criteria behind the patentability of any invention are its utility (industrial application), novelty and non-obviousness (inventive step) (Encaoua et al., 2006). Patents are collected in free databases after being checked and verified by specialists. The availability of online patent databases means that patents are a simple and immediate source for measuring innovation performance, both for academic and industrial researchers. So far, many innovation measures based on patent data have been proposed in the literature (Dziallas and Blind, 2019). However, in contrast to the literature that suggests a multidimensional approach, the innovation measures based on patents usually relate only to a specific patent feature, such as the number of patents issued by a firm or the number of patent forward citations. It is worth remembering that patent forward citations are the citations received by a patent from the time of its issue. Even if the number of patents issued by a firm and the number of patent forward citations are two of the main proxies of innovation performance (Carayannis and Provance, 2008), measures based only on a specific patent feature are restrictive, cannot be used promptly and allow only an ex post evaluation. As the issue for academics, managers and policymakers is to use timely, concrete, flexible, adaptable and measurable indicators to evaluate the innovation process and its impact on the productivity of a company (e.g., Becheikh et al., 2006; Dewangan and Godse, 2014), the existing measures present limitations from both the managerial and the policymaker point of view.
Thus, in this scenario, the aim of the paper is to define a multidimensional, timely and simple-to-use innovation measure based on patent data, which is richer than the number of patents and more timely than the number of forward citations, in order to overcome the limits of existing measures. The main research question of this paper is therefore how to define a new innovation measure, called the Innovation Patent Index (IPI), based on secondary data, i.e., the patent database, that enriches the classical innovation measures by considering not only the number of patents but also other patent information identified using different machine learning algorithms. It is worth remembering that the machine learning approach is a very useful tool because it makes it possible to manage huge amounts of data and to capture the complex nonlinear relationships among the data features. In particular, in this paper, as described in Ponta et al. (2020a, b), three different algorithms, i.e., regularized least squares (RLS), deep neural networks (DNNs) and decision trees (DTs), have been employed to identify the most relevant patent features in predicting innovation performance. Finally, the identified features have been aggregated into dimensions in order to create a user-friendly instrument, i.e. the Innovation Patent Index (IPI), for academics, managers and policymakers. Figure 1 shows the goal of the paper and the process of analysis.
2. Literature review

2.1 Innovation performance and its measures
Innovation is one of the main processes of a firm organization and its management and measurement should be defined as a structured process (Janssen et al., 2011). Innovation is one of the determinants of firm performance and results (Hou et al., 2019) and the promotion of an innovation culture is of fundamental importance (Hanifah et al., 2019). For these reasons, innovation performance and its antecedents have been studied in several different contexts, from multinationals and subsidiaries (Nuruzzaman et al., 2019; Gaur et al., 2019; Bahl et al., 2020) to SMEs (Hanifah et al., 2019; Saunila, 2016). The definition of appropriate innovation measures contributes to the understanding of innovation itself, gives the opportunity to enhance performance and strengthens the innovation culture (Saunila, 2016).
Measurement systems should act as a managerial tool that supports R&D decisions, providing information about the strengths and weaknesses of companies' innovation activities. In addition, they should also act as monitoring and evaluation tools for institutional bodies, which can "see" the behavior of firms or groups of firms. Moreover, a measurement system should be a benchmarking tool, allowing the company to compare its own results with those of other companies or contexts (Lazzarotti et al., 2011).

Figure 1. Research goal and process of analysis

Over the years the measurement of innovation has raised different issues. Disparities between studies have prevented the development of a unique understanding of innovative performance systems and of a common set of indicators at the organization level. A review of more recent studies provides some principles for developing an appropriate measurement system (Dewangan and Godse, 2014). According to these principles, performance measurement systems should also focus on the firm level, should be multidimensional and process based, should meet stakeholders' goals and propose cause-and-effect relationships between measures, and should be timely and easy to implement and use (Dewangan and Godse, 2014; Scalera et al., 2014; Lazzarotti et al., 2011). The measurement of innovation performance has often been analyzed at a project or technological level. The analysis at the firm level has been less examined (Carayannis and Provance, 2008). Project-level studies provide only a limited understanding of the innovation mechanisms; this restricts the control that managers have when making decisions in uncertain and dynamic environments. Multiple or composite measures should be preferred in determining a firm's innovation level, and a multidimensional view of performance should be provided. Composite indicators allow the multiple determinants of innovation performance to be captured. The assumption behind a multiple-composite indicator is that superior innovation occurs when firms maximize all dimensions of the innovation activity (Lazzarotti et al., 2011; Tohumcu and Karasakal, 2010; Ojanen and Vuola, 2006). The advantage of this approach is that it provides a more informative measure of the innovation performance level of a company. Indeed, each indicator has a different meaning. When one specific indicator diverges from the others, it can be analyzed separately as it contributes to the "innovative performance." The indicators should include throughput measures of innovation (Hagedoorn and Cloodt, 2003).
Considering process performance is important in order to focus on continuous improvement and to facilitate the competitive benchmarking and identification of potential problems or inefficiencies. Stakeholder goals should be addressed and merged to take into account those of multiple stakeholders. The performance measurement should be built on the basis of cause and effect linkages between the measures. It should address past and probable future performance by including both leading and lagging indicators. This allows managers to evaluate the value of different activities and relate them to specific results. An effective performance measurement should include indicators that can be measured promptly. In a turbulent environment, managers should take action in advance with respect to their competitors (Liu et al., 2015;Soosay and Chapman, 2006). If the proposed indicators can be evaluated only after a lapse of time or if a long time is needed to collect these data, the information about the performance may no longer be useful. The scheme should be easy for the different firm areas to implement and use.

Innovation performance and its measures with patents
In the literature, many researchers present patents as a source for innovation measurement (Scherer, 1965; Pakes and Griliches, 1980; Griliches, 1990; Jaffe and Palmer, 1997; De Rassenfosse et al., 2013; Acs et al., 2002; Lanjouw and Schankerman, 2004; Nagaoka et al., 2010). Most scientific works investigate the relationship between patent counts and innovation performance, considering patents as an R&D output. In Svensson (2015), the number of patents per capita is used to rank sectors and countries in terms of technology intensity or innovation. De Rassenfosse et al. (2013) present a new methodology to assess innovation performance by counting priority patent applications filed by a country's inventors regardless of the patent office. Hagedoorn and Cloodt (2003) analyze the innovation performance of countries, proposing two ratios: inventive performance and invention productive performance, based on the total patent count and R&D indicators. Dechezleprêtre and Martin (2010) compare both relative levels and trends in clean innovation activity between different countries, taking patent counts as an innovation measure to portray a picture of where the UK stands in terms of climate change mitigation innovation. Thus, patent count is still considered a satisfactory measure of innovation. However, De Rassenfosse et al. (2013) outline some major drawbacks of patent counts: the value distribution of patents is skewed, as many patents do not have industrial applications; many inventions are not patented and many inventions are not patentable when other means of protection, i.e. trade secrecy, lead time on the market or reputation, are more useful; and counting patents has become difficult due to the differences in patent regulations across countries (Papageorgiadis and Sofka, 2020).
The OECD Patent Manual of 1994 also underlines that patent statistics should be interpreted with caution due to the variations in the propensity to patent across time, firms, industries and countries. These limitations have led researchers to consider other indicators along with patent counts, such as forward and backward citations, patent family size, patent renewal, opposition and litigation information, and also to build patent-related indexes (Papageorgiadis and Sofka, 2020; Datta et al., 2015). In particular, patent forward citations appear to be one of the most widely used indicators (Hall et al., 2005). They are the citations received by a given patent from other patents. For that reason, they become available only gradually, a substantial time after the patent is granted. Forward citations are particular in that they grasp not only the technological value of an innovation and its originality, but also its relevance to the market (Aristodemou and Tietze, 2018). Forward citations may be a measure to understand emerging technologies or to evaluate the economic value of a technology and the innovation capacity of a company (Hall et al., 2005). Moreover, Mueller (2015) states that patent information provides more indicators than just the sheer number of patents over time. For example, the origin of the inventor can indicate the excellence of the higher education system when the population of the country is taken into account, and a statistical evaluation of the jurisdictions in which patent family members were granted can give information about the priority markets for the country considered. Finally, Lanjouw and Schankerman (2004) suggest measuring innovation with a composite index of patent quality that uses different patent indicators in order to reduce the measured variance in quality.
The main purpose of this paper is to define an Innovation Patent Index based on secondary data that can overcome the limitations of indicators based on only one feature of patents, such as the number of patents, and that can also capture the value the market gives to the patented technology.

Methodology
In order to build an innovation performance indicator based on secondary data, i.e. the patent database, that can overcome the serious limitations of the existing innovation indicators outlined in Section 1, the relevant patent features were first extracted from all the information included in the patent database. In particular, starting from all the data in the patent database, the 24 patent features reported in Table 1 were identified. As stated in Section 1, the paper aims to build a multidimensional, timely and simple-to-use measure of innovation performance. Thus, starting from the 24 patent features identified, we wanted to build an indicator that considers all this information. To do this, the number of forward citations, i.e. the number of citations received by a patent, was considered as a proxy of innovation performance, whereas the other features were considered as potentially predictive of innovation performance (Narin et al., 1984; Ponta et al., 2020a, b). In this way we are able to create an index that is multidimensional because it draws on all the information in the patent database, timely because it relies on features that are directly available when the patent is issued, and simple because it retains only a reduced number of features. In order to identify which features are predictive of the forward citations, the machine learning approach was chosen. Despite their "dark side" (Siau and Wang, 2020) in terms of ethics, machine learning algorithms make it possible to manage huge amounts of data, such as those embodied in patents, and to discover linear and nonlinear relations (Tu, 1996).

Artificial intelligence approaches
The problem of identifying which patent features are relevant in predicting innovation performance, approximated by the forward citations, can be mapped into a conventional regression framework. The main steps are: first, the identification of the relation R between the input space X, namely the features described in Table 1, and an output space Y ⊆ ℝ, namely the number of forward patent citations (Shalev-Shwartz and Ben-David, 2014); second, the discovery of the most influential factors in the input space for predicting the correct associated element in the output space (Altmann et al., 2010); third, understanding how the influential factors in the input space affect the associated value in the output space. The goal is to identify a model/function M: X → Y that approximates R, through a learning algorithm A_H characterized by a particular set of hyperparameters H. The accuracy of the approximation M in representing the unknown relation R is measured with the Mean Absolute Percentage Error (MAPE) (Cincotti et al., 2014). The interpretability of the model, namely the possibility to understand how it behaves, is addressed by ranking the features based on their effect on the learned model (Guyon and Elisseeff, 2003; Altmann et al., 2010). Three different machine learning algorithms have been implemented to determine the most important features, i.e., RLS, DNNs and DTs. These algorithms are fast to train, powerful in managing nonlinear relations, easy to exploit in practice and interpretable (Lulli et al., 2018). For details on solving the problem described above with the RLS, DNN and DT algorithms, see Ponta et al. (2020a, b).
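The two quantities named above, the MAPE of a model and a ranking of features by their effect on that model, can be illustrated with a minimal Python sketch. This is not the authors' implementation (for that, see Ponta et al., 2020a, b). The function names, the choice of a permutation-based ranking and the decision to skip patents with zero forward citations when computing the percentage error are all assumptions made for illustration.

```python
import random


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Assumption of this sketch: targets equal to zero are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all targets are zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


def permutation_importance(model, X, y, seed=0):
    """Increase in MAPE when one input column is shuffled.

    `model` is any callable mapping a feature row (a list) to a prediction.
    A larger increase suggests the feature matters more to the model.
    """
    rng = random.Random(seed)
    baseline = mape(y, [model(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        # Rebuild the rows with column j replaced by its shuffled values.
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        importances.append(mape(y, [model(r) for r in X_perm]) - baseline)
    return importances


# Hypothetical toy data: the target depends only on the first feature.
X = [[1, 5], [2, 6], [3, 7], [4, 8]]
y = [2, 4, 6, 8]
toy_model = lambda row: 2 * row[0]
print(permutation_importance(toy_model, X, y))
```

In this toy case the second feature is ignored by the model, so shuffling it leaves the error unchanged and its importance is zero.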

Machine learning results
The results of the three machine learning algorithms used to predict forward citations show that the most relevant features identified are the technological classes (4 and 7 digit), based on the IPC, the technological domain and the number of backward citations (Ponta et al., 2020a, b). Moreover, the size of the patent family, representing the geographical extension of the patent, is a valuable feature for estimating the market value of a patent, as suggested by Harhoff et al. (2003). Another important feature is the time-related determinant, i.e. the number of months between the publication dates of the youngest and the oldest patent of the family. This feature provides information on how companies invest in innovation over time and build their capabilities (Dechezleprêtre et al., 2017).
It is worth noting that all three algorithms extracted the same patent features as important in predicting the forward citations, i.e. innovation performance. See Ponta et al. (2020a, b) for further details.

Innovation indicators and Innovation Patent Index
Starting from both the literature and the results described in Section 3.3, an index was defined that is composed of the simplest and most common measure of innovation used in the literature, i.e. the number of patents, and of the patent features identified above. The index is formed of five different indicators:
(a) Efficiency: the normalized number of patents
(b) Internationalization: the number of extensions
(c) Diversification: the number of IPC classes
(d) Quality: the number of backward citations
(e) Time: the number of months between the publication date of the youngest and the oldest patent of the family.
The efficiency indicator makes reference to the literature, whereas the others make reference to the main patent features, empirically derived by the machine learning algorithms, i.e. RLS, DNNs and DTs, as described in Section 3.3. In fact, machine learning is a well-suited tool to extract relevant information from complex, nonlinear and noisy data (Tkáč and Verner, 2016). For this reason, in recent years and thanks to the large availability of data, authors have applied machine learning algorithms to various problems such as financial topics, for example, credit scoring, financial analysis or stock performance prediction, as well as cost monitoring and sales analysis (Tkáč and Verner, 2016). Recently, the use of these tools has been extended to the field of management in order to support business research and managers in their decision-making process. Table 2 summarizes all the patent indicators, derived by the machine learning algorithms or from the literature, that form the IPI.
Before defining each indicator, let us define the main characteristics of patent p:
(1) e_p: number of extensions of patent p;
(2) b_p: number of backward citations of patent p;
(3) c_p: number of classes of patent p;
(4) m_p: number of months between the publication date of the youngest and the oldest patent of the family.

Efficiency
The efficiency indicator is defined as the normalized number of patents. Given a company i, its efficiency E_{i,t} at time t is given by the following formula:

E_{i,t} = n_i(t) / N_i    (1)

where n_i(t) is the number of patents published by firm i at time t and N_i is the total number of employees of firm i. In the case of a region or a country i, the efficiency is evaluated according to Eq. (1), where n_i(t) is the number of patents published in region i at time t and N_i is the total number of employees of region or country i.
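As a small worked illustration, the sketch below computes the efficiency of a firm under the reading that "normalized number of patents" means the number of patents published at time t divided by the number of employees. The function name and the example figures are hypothetical.

```python
def efficiency(n_patents, n_employees):
    """Patents published in the period divided by the number of employees.

    Assumption of this sketch: normalization is by head count, following
    the definitions of n_i(t) and N_i given in the text.
    """
    if n_employees <= 0:
        raise ValueError("the number of employees must be positive")
    return n_patents / n_employees


# Hypothetical firm: 12 patents published in a year, 300 employees.
print(efficiency(12, 300))
```

The same function applies to a region or country when the patent count and the employee count refer to that territory.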

Diversification
The diversification indicator is defined as the number of IPC classes (4 digit) of each patent. Given a company i, the diversification D_{i,t} is given by:

This indicator is important because it has been shown that small incremental innovations do not enhance firms' innovation performance (see, for example, Moaniba et al., 2018), whereas the kind, rather than the number, of technical domains selected by the company is significant.

Quality
The quality indicator is defined as the number of backward citations. Given a company i, the quality indicator is given by:

Backward citations play a main role in predicting IC, so companies must be aware that their capacity to absorb previous knowledge and make use of it will strongly affect future innovation and the ability to be competitive (Harhoff et al., 2003; Hall et al., 2007).

Internationalization
The internationalization indicator is defined as the number of geographical extensions of a patent. Given a company i, the internationalization indicator I_{i,t} is given by:

Time
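The time indicator is defined as the road chosen by the firm to patent, measured as the number of months between the publication date of the youngest and the oldest patent of the family. Given a company i, the time dimension R_{i,t} is given by: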
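The four feature-based indicators are built from the per-patent characteristics e_p, b_p, c_p and m_p. The aggregation formulas are not reproduced in this text, so the following Python sketch simply assumes that each firm-level indicator is the average of the corresponding feature over the patents the firm published in the period. That averaging rule, the `Patent` class and the field names are illustrative assumptions, not the authors' definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Patent:
    extensions: int          # e_p: number of extensions
    backward_citations: int  # b_p: number of backward citations
    ipc_classes: int         # c_p: number of IPC classes
    family_span_months: int  # m_p: months between youngest and oldest family member


def company_indicators(patents):
    """Average each per-patent feature over a firm's patents.

    Assumption of this sketch: a plain arithmetic mean is used as the
    aggregation rule for all four indicators.
    """
    if not patents:
        raise ValueError("at least one patent is required")
    return {
        "internationalization": mean(p.extensions for p in patents),
        "quality": mean(p.backward_citations for p in patents),
        "diversification": mean(p.ipc_classes for p in patents),
        "time": mean(p.family_span_months for p in patents),
    }


# Hypothetical portfolio of two patents.
portfolio = [Patent(2, 10, 3, 12), Patent(4, 20, 1, 24)]
print(company_indicators(portfolio))
```

A sum or another aggregation rule could be substituted by replacing `mean` in one place.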

Innovation Patent Index
Given the five previously defined indicators, an index that summarizes the five values is defined. The Innovation Patent Index (IPI) is defined by five dimensions, which are the five indicators just described: efficiency, time, diversification, quality and internationalization, as shown in Figure 2. The IPI is evaluated as a weighted average of the five normalized indicators, in formula:

where α_1, α_2, α_3, α_4 and α_5 are real numbers. So far, the α coefficients are positive and all equal.
Thus, IPI is a real number that ranges between 0 and 10. Moreover, it is important to underline that in order to evaluate the IPI for a firm or a region it is necessary to define the period of observation T, i.e. the number of years.
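To make the weighted-average construction concrete, the Python sketch below computes an IPI-like score for a set of firms or territories. The text states that the indicators are normalized, that the weights are equal and that the index ranges between 0 and 10, but it does not specify the normalization. The min-max rescaling of each indicator to [0, 10] across the compared entities is therefore an assumption of this sketch, as are the function names and the example data.

```python
DIMENSIONS = ("efficiency", "time", "diversification", "quality", "internationalization")


def min_max_scale(values, upper=10.0):
    """Rescale a list of values linearly to the interval [0, upper]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All entities are identical on this dimension; assign zero by convention.
        return [0.0 for _ in values]
    return [upper * (v - lo) / (hi - lo) for v in values]


def ipi_scores(entities, weights=None):
    """Weighted average of the five normalized indicators for each entity.

    `entities` maps a name to a dict with one raw value per dimension.
    With `weights=None` all five coefficients are equal, as in the text.
    """
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    names = list(entities)
    scaled = {
        d: dict(zip(names, min_max_scale([entities[n][d] for n in names])))
        for d in DIMENSIONS
    }
    total = sum(weights[d] for d in DIMENSIONS)
    return {
        n: sum(weights[d] * scaled[d][n] for d in DIMENSIONS) / total
        for n in names
    }


# Hypothetical raw indicator values for three territories.
raw = {
    "A": {d: 2.0 for d in DIMENSIONS},
    "B": {d: 0.0 for d in DIMENSIONS},
    "C": {d: 1.0 for d in DIMENSIONS},
}
print(ipi_scores(raw))
```

Because the rescaling is relative to the entities being compared, scores computed this way are only comparable within the same set of entities and the same observation period T.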

Application to real world
In order to show how the IPI, defined in Section 4, can be used, this section describes its application in two cases. In case (a) the IPI was employed to evaluate the innovation performance of the Lombardy region and of its provinces, whereas in case (b) the IPI was used to measure the innovation performance of the firms belonging to the AMAPLAST [4] association. AMAPLAST is the association of the most important Italian firms working in the plastic materials sector. The data source used in these analyses is the Orbit Intelligence database [5]. All the patents published by inventors with residence in Lombardy, by firms with registered office in Lombardy, or by firms belonging to the AMAPLAST association were collected over the period 2000 to 2017. For each patent, the list of all considered features is shown in Table 1. Figure 3 shows the IPI measure in case (a), the Lombardy region in the period 2012-2017. The different colors on the map represent the classification. The province in dark blue is the one with the largest IPI, and thus the most innovative province, whereas the province in light blue is the one with the smallest IPI. The intermediate shades of blue represent the IPI classification of the other provinces.

IPI and Lombardy region (case (a))
As said in Section 3.3, the IPI is defined by five indicators, so it is possible to investigate the results shown in Figure 3 in more detail and to give managers information about the reasons for the classification. Thus, an analysis of the five IPI dimensions was performed. Figure 4 shows the IPI dimensions in the period after the economic crisis for three Lombardy provinces: Varese, Milan and Lodi. The same analysis was performed for all the provinces and the results are shown in the Appendix. The green line represents the province and the pink line the region. It is possible to observe that provinces ranked as most innovative by the IPI have very different IPI indicators. For example, the province of Milan has a very high efficiency score, whereas Lodi, a very small province of Lombardy, has very high quality and time scores. The province of Varese, which is ranked in the upper half of the classification, is higher than the other provinces only in terms of diversification. Moreover, it is possible to investigate each dimension in more detail by considering its evolution during the time period of analysis. Figure 5 shows the five dimensions of the IPI for the provinces of Varese, Milan and Lodi for each year, and thus their evolution over time. The analysis of all the provinces of the Lombardy region is presented in the Appendix. In particular, it is possible to observe that, in the time range considered, diversification is the most important dimension in the IPI evaluation in Figure 5a, efficiency in Figure 5b, and time and quality in Figure 5c. The analysis of each dimension of the IPI for each year gives further information about the innovation trend of the companies belonging to the same province.
For companies this information is very valuable because it indicates what to improve in order to increase innovation performance. Moreover, it is possible to investigate the results of the period 2012-2017 (post crisis) by comparing different time ranges. Figure 6 shows the IPI dimensions of the Lombardy region and of the three provinces. Figure 6a shows that all the indicators except quality decrease after the crisis. This means that after the crisis there are fewer patents but with better quality. During the crisis, the time spent in developing the patent, i.e. the resources spent on it, seems longer than in the period before the crisis.
In addition, the IPI dimensions of each province are evaluated in different periods. Figures 6b, 6c and 6d show the IPI dimensions in different time periods for the provinces of Varese, Milan and Lodi, respectively. Each province has its own peculiarities, but for the provinces too the IPI dimensions after the crisis are smaller than those before the crisis. Summarizing the results of case (a), Tables 3-5 report for each province the values of the indicators and the IPI in the three periods, before, during and after the crisis, respectively. For example, we can notice how the territory of Milan clearly stands out in terms of efficiency, while other territories such as Como, smaller than Milan, grow in terms of diversification and time.

IPI and AMAPLAST (case (b))
The IPI was used to measure the innovation performance of firms belonging to the AMAPLAST [6] association. For privacy reasons the paper does not report the innovation performance of each firm, but the behavior of the five IPI indicators for small, medium and large firms. Small means that firms have fewer than 50 employees, medium means that firms have between 51 and 250 employees and large means that firms have more than 251 employees. Figure 7a shows a radar plot of the five indicators composing the IPI for all the companies that patent and belong to the AMAPLAST organization. It is worth remarking that only the 67 firms that patented in the period of observation are plotted in this graph. In general, companies perform differently when the five indicators are considered separately. There are no companies where all the indicators are very high. For example, one company may have a very high efficiency indicator but a very low diversification indicator. This further shows the importance and completeness of the IPI measure, which is able to identify the areas of weakness on which companies must work even if the total value of the index is high. Moreover, Figure 7b shows the radar plot for AMAPLAST firms divided into small, medium and large companies. It is worth noting that the IPI also gives information about sets of companies. For example, Figure 7b shows that all three sets (small, medium and large companies) consider all five indicators in the same way, but in more detail it is possible to observe that the set of large firms, in black, has a high quality indicator compared to the sets of small and medium firms but a low efficiency indicator. The information for managers here is that large firms are more interested in having few patents of high quality, whereas for small firms, even if they give importance to the quality dimension, efficiency, i.e. the number of patents, is more important.
Finally, these results show how the IPI can be applied to measure the innovation performance of firms or territories, giving information at both company and territory level. From a managerial point of view, the IPI gives each firm or territory not only a rank but also information about its innovation performance profile. In fact, with the information included in the IPI dimension indicators it is possible to identify the strengths and weaknesses of innovation performance.

Discussion
The study is an attempt to find an innovation performance measure that overcomes the limits of previous studies. Over the years, researchers have sought to define indicators that measure a company's innovation process, which is one of the most important determinants of firm performance (Janssen et al., 2011; Hou et al., 2019). Nevertheless, studies are still seeking a complete and comprehensive understanding of innovation performance systems and an appropriate set of indicators. The definition of appropriate innovation measures helps to enhance innovation performance, to spread the innovation culture, to benchmark companies against each other and to provide information to institutional bodies, which can monitor and take decisions according to the performance of firms or groups of firms. As suggested by Lanjouw and Schankerman (2004), the study proposes an index, i.e. the IPI. It is a novel approach that follows the main guidelines provided in the literature (Dewangan and Godse, 2014) and is applicable in different contexts (Nuruzzaman et al., 2019; Gaur et al., 2019; Bahl et al., 2020; Hanifah et al., 2019; Saunila, 2016). The IPI is a composite indicator (Harhoff et al., 2003; Lazzarotti et al., 2011) that uses multiple pieces of information set out in patents. Moreover, it makes multiple indicators promptly available. Most previous innovation performance systems use primary sources to address the multiple-perspective requirement. However, it is often difficult to collect this kind of information. By contrast, secondary-source data, such as patent data, are generally widely available, but they are not immediately accessible, nor do they provide a multiple perspective. Most previous indicators used to build a performance measurement system were based on the patent count or the number of patent forward citations (Dziallas and Blind, 2019).
However, the patent count alone does not give managers a multiple view. On the other hand, the use of forward citations limits the prompt use of the innovation performance measurement, because forward citations become available only after some time. By identifying the predictors of patent forward citations, which can be combined with the count of a firm's patents, the IPI becomes both a multiple-composite and a prompt system of performance measurement. Moreover, it addresses the guidelines provided in the literature, offering scholars a possible convergence on the wide range of perspectives on measurement, specifically using patent data. In particular, the IPI can be used at several different levels. It is not only a project-level or technology-level measure but, as shown in the application, it allows evaluations at project, firm and regional levels (Scalera et al., 2014; Gaur et al., 2019). Moreover, it provides a benchmark against other actors or contexts (Liu et al., 2015; Soosay and Chapman, 2006). The indicators should include throughput measures of innovation (Hagedoorn and Cloodt, 2003). The IPI makes it possible to address different perspectives and multiple stakeholders' goals, for example by including learning aspects (measured with backward citations) and marketing aspects (measured with the internationalization citations). The scheme is also easy to implement and use because it only requires the extraction of patent information and the calculation of the indicators. In addition, the use of public, secondary data, i.e. patent data, also avoids the ethical issues raised by the use of artificial intelligence and machine learning approaches (Siau and Wang, 2020; Makarius et al., 2020). In fact, challenges concerning data security, privacy, transparency, autonomy, intentionality, responsibility and human and social rights do not arise.
Even if the IPI is based on the results of machine learning analyses, privacy issues do not emerge because the IPI does not use personal data. Transparency, autonomy, intentionality and responsibility concerns do not arise because the IPI is easily interpretable and is not used to take decisions that are outside the control of managers. Finally, human and social rights are respected because the use of the IPI should not influence, for example, ethical standards, but only provide managers with a decision tool that helps them to decide firms' R&D investments better and to enhance firms' innovation performance.

Theoretical implications
The study aims to contribute to the innovation performance measurement literature. In particular, it provides a step forward in the stream of innovation literature that seeks to measure R&D performance using secondary data. It thus takes advantage of secondary data, opening up the several uses, measures and analyses that patents allow for measuring innovation performance. From a methodological standpoint, it shows the usefulness of applying machine learning approaches in the innovation field for decision-making purposes. Indeed, machine learning approaches and artificial intelligence are still barely used in innovation fields, in particular for measuring and forecasting performance and taking decisions based on the results. In addition, the IPI is an instrument that can be used not only at firm level but also at regional level. For this reason, it could also be a useful instrument in macroeconomic and regional innovation studies, and it can contribute to that stream of literature. Finally, the IPI also shows a new way to combine current and future data, thanks to highly accurate forecasting techniques such as machine learning ones.

Practical implications
For managers and those in business, the IPI may provide several implications and suggestions. As mentioned above, the IPI can be calculated at firm level, providing an indication not only of the performance of a single project but, more generally, of the innovation area. In this way, R&D managers may take decisions on the full portfolio and not only "correct" a single project. In addition, the information included in the IPI provides a nuanced overview of R&D, suggesting in which areas to invest promptly (Park and LiPuma, 2020). The dimensions make it possible to identify the strengths and weaknesses of a company's innovation performance. The IPI makes it possible to position the company with respect to competitors and suggests the dimensions or areas that managers should improve in order to increase innovation performance. Having a multiple perspective, with indicators that refer to different areas and skills of the firm, may foster the culture of innovation, and all departments could be involved in enhancing the innovation process of the organization (Hanifah et al., 2019). This is even truer considering the usability and flexibility of the instrument. That flexibility may be of particular importance for improving responsiveness and achieving better performance (Kumar and Singh, 2019; Kindström et al., 2013).
For policy-makers, it is worth noting that the IPI can be applied in different contexts, such as companies that belong to the same geographic area, industrial sector or technological area, or that have the same size, the same governance, etc. This shows the high flexibility of the tool and gives policy-makers the possibility to quickly compare different sectors and territories and take decisions as a result (Kumar and Singh, 2019; Kindström et al., 2013). For financiers, the IPI may support their decisions, suggesting which companies perform better and what their strengths and weaknesses are.
For patent experts, IPI provides some suggestions on how to write the patent. It indicates the most important features to look at in order to increase the number of forward citations and consequently the technology performance.
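As an illustration of how a manager might read a firm's profile against its competitors, the following minimal Python sketch compares each of the five dimensions with the peer median and lists the dimensions where the firm lags most. It rests on assumptions not made in the paper: the peer-median benchmark, the function names and all indicator values are hypothetical.

```python
from statistics import median

# The five IPI dimensions named in the paper.
DIMENSIONS = ("efficiency", "time", "diversification", "quality", "internationalization")


def dimension_gaps(firm, peers):
    """Return, per dimension, the firm's indicator minus the peer median.

    Assumption: the peer median is used as the benchmark; the paper does
    not prescribe a specific benchmarking rule.
    """
    return {dim: firm[dim] - median(p[dim] for p in peers) for dim in DIMENSIONS}


def weakest_dimensions(firm, peers, n=1):
    """List the n dimensions with the most negative gap to the peers."""
    gaps = dimension_gaps(firm, peers)
    return sorted(gaps, key=gaps.get)[:n]


# Hypothetical indicator values on an assumed 0-1 scale; not data from the paper.
firm = {"efficiency": 0.75, "time": 0.5, "diversification": 0.25,
        "quality": 0.5, "internationalization": 0.5}
peers = [
    {"efficiency": 0.5, "time": 0.5, "diversification": 0.5,
     "quality": 0.75, "internationalization": 0.25},
    {"efficiency": 0.25, "time": 0.5, "diversification": 0.75,
     "quality": 0.5, "internationalization": 0.5},
    {"efficiency": 0.5, "time": 0.25, "diversification": 0.5,
     "quality": 0.5, "internationalization": 0.5},
]

gaps = dimension_gaps(firm, peers)
weakest = weakest_dimensions(firm, peers, n=1)
print(gaps)
print(weakest)
```

In this made-up case the firm is ahead of its peers on efficiency but behind on diversification, mirroring the kind of uneven profile discussed for the AMAPLAST firms.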

Conclusion, limitations and future research
The paper has presented a new measure of Innovation Performance, called the Innovation Patent Index (IPI), based on secondary data, i.e. the patent database. The IPI is defined by means of five indicators that represent the five most relevant dimensions: efficiency, time, diversification, quality and internationalization. These dimensions have been identified from the literature and by applying three different machine learning algorithms, i.e. regularized least squares (RLS), deep neural networks (DNNs) and decision trees (DTs), to the patent database.
The IPI has been applied to a territory and an industrial sector. Results have shown that the IPI is an immediate measurement tool. In fact, it is able to overcome the limits of measures based on primary data and, through an analysis of the indicators, it gives information about the reasons for a good or poor innovation performance. Compared with existing indicators, which are generally not widely used because of their complexity or their limited ability to predict, the IPI is a simple and at the same time forward-looking tool for innovation performance measurement. Moreover, the IPI overcomes the limits of other widely used measures based on secondary data, such as patent counts and forward citations (Park and Park, 2006; Aristodemou and Tietze, 2018). In fact, the number of patents does not seem to be fully representative of innovation performance, and the use of other indicators provides a better understanding of it. Forward citations are considered a better indicator than patent counts (Hall et al., 2005); however, they can be known only after a long period of time (Brinn et al., 2003). The IPI, on the other hand, merges the patent count with forward citation predictors, i.e. the other patent features already included in the patent when issued, overcoming the limitations of the other measures. In fact, it enriches the patent count measure and overcomes the time issues of the forward citation measure. The IPI gives each firm or territory information not only about the level of innovation performance (the rank) but also about the innovation performance profile, which can be used by managers, businessmen, policymakers, organizations, patent experts and financiers to evaluate and plan future activities, to enhance innovation performance and capability, to find financing and to support and improve innovation.
This work is not without limitations. Although most companies apply extensively for patents, in some sectors patents are not widely used. Thus, a specific innovation index could be developed for the sectors that do not apply for patents on a large scale. In addition, to obtain a forward-looking perspective, the IPI is based only on the pure number of forward citations. In further studies, other patent features could be used, for example forward citation indexes (Aristodemou and Tietze, 2018).

Results for all Lombardy provinces
This appendix shows the figures for the provinces of the Lombardy region not shown in the paper, i.e. Bergamo, Brescia, Monza Brianza, Como, Pavia, Mantova, Cremona, Lecco and Sondrio. Figure A1 shows the IPI dimensions for these provinces in the period after the crisis. Figure A2 shows, year by year, the five dimensions of the IPI for the same provinces. Figure A3 shows the IPI dimensions of the same provinces evaluated over different time ranges.