An IoT-based and cloud-assisted AI-driven monitoring platform for smart manufacturing: design architecture and experimental validation

Purpose – This work proposes a novel Internet of Things (IoT)-based and cloud-assisted monitoring architecture for smart manufacturing systems, able to evaluate their overall status and detect anomalies occurring in production. A novel artificial intelligence (AI)-based technique, able to identify the specific anomalous event and the related risk classification for possible intervention, is hence proposed.
Design/methodology/approach – The proposed solution is a five-layer scalable and modular platform in an Industry 5.0 perspective, where the crucial layer is the Cloud Cyber one. This embeds a novel anomaly detection solution, designed by leveraging control charts, autoencoders (AE), long short-term memory (LSTM) and a Fuzzy Inference System (FIS). The proper combination of these methods allows not only detecting product defects but also recognizing their causalities.
Findings – The proposed architecture, experimentally validated on a manufacturing system involved in the production of a solar thermal high-vacuum flat panel, provides human operators with information about anomalous events, where they occur, and crucial information about their risk levels.


Introduction
It is well known that Industry 4.0 has completely reshaped the manufacturing sector by integrating, within the production environment, different technologies such as artificial intelligence (AI), Internet of Things (IoT), cloud computing (CC) and cyber-physical systems (CPSs). By integrating these technological pillars into an organized framework, Industry 4.0 is considered a technology-driven paradigm shift that aims at higher productivity through the better utilization of resources (Jafari et al., 2022).
Conversely, Industry 5.0 recognizes the power of industry to achieve societal goals beyond jobs and growth, to become a resilient provider of prosperity by making production respect the boundaries of the planet and placing the well-being of the industry worker at the center of the production process (Xu et al., 2021). The main difference between the two industrial paradigms lies in the active role of human experts, who coexist and work together with efficient, intelligent and accurate machines in order to obtain more resource-efficient and customer-preferred manufacturing solutions than those of Industry 4.0, which, instead, promotes mass production. The main pillars of Industry 4.0 and Industry 5.0, as well as the main differences between them, are detailed in Table 1.
Since the core value of Industry 5.0 is the human-centric vision, this framework is now gaining momentum (Maddikunta et al., 2021). Specifically, the concept of the human-cyber-physical system (HCPS), along with the so-called Operator 5.0, is emerging. Operator 5.0 is a smart and skilled operator able to use creativity, ingenuity and innovation, aided by information and technology. This emerging view helps overcome obstacles to the development of novel and cost-effective solutions that make manufacturing operations sustainable in the long term and ensure workforce well-being in the face of difficult and/or unexpected conditions (Mourtzis et al., 2022). In doing so, human centricity is one of the crucial aspects of Industry 5.0, which aims at intertwining machines and humans in a synergistic collaboration to increase productivity in the manufacturing industry, while safeguarding workers' fundamental rights (Müller 2020). This cooperation, together with predictive analytics and operational intelligence, allows creating fully automated production processes via the real-time processing of the measured data and via highly equipped human specialists connected (also remotely) to the overall networked systems. Besides the human-centric approach, the notion of resilience has also been introduced in this brand-new industrial revolution and refers to robustness with respect to disruptions/emergency production issues (Xu et al., 2021). In this perspective, on the one hand, in order to move toward more efficient and robust manufacturing systems, simulation-based technologies constitute a focal point of digital manufacturing solutions, since they allow the experimentation and validation of different products, processes and manufacturing system configurations (Mourtzis 2020). On the other hand, to ensure a robust and resilient smart production environment, the real-time monitoring of the production process becomes crucial not only in each time interval and for each operation involved in the multistage processes, but also in maintenance and optimal scheduling activities.
Table 1. Industry 4.0 vs Industry 5.0
Industry 4.0 | Industry 5.0
[...] | Merging AI, IoT, CC and CPS with critical, cognitive human thinking
iv) A fully automated production process | iv) A fully automated production process in collaboration with human experts
v) Active engagement of robots in large scale production | v) Exploitation of robots/machines for repetitive tasks and humans for critical thinking ones
vi) No skilled jobs | vi) More skilled jobs
vii) No attention on environmental issues | vii) Sustainable production with greener solutions
 | ix) Novel features: human centricity, sustainability and resilience
JMTM 34,4
The performance of a successful monitoring operation can be guaranteed due to the widespread diffusion of different smart sensors, which ensure a multi-scale information flow, thus creating knowledge about the main production parameters within the entire manufacturing process.
Moreover, this information can be properly processed in order to capture possible anomalies occurring in the multistage manufacturing systems. Accordingly, it is also possible to send alerts when critical/emergency events occur, along with their risk level and suggestions about possible human interventions.
In this perspective, the objective of this paper is to introduce a new IoT-based cloud-assisted monitoring architecture for smart manufacturing systems, able to check the production status at any time and, hence, understand whether anomalous events occur. As a result, it offers manufacturers the possibility of counteracting abnormal situations in a timely and proper manner. More specifically, the proposed architecture consists of five modular layers, where the central one, i.e. the cyber module, represents its core. Indeed, it deals with data collection, elaboration and processing operations, as well as anomaly detection and classification, by embedding into the platform a novel AI-driven algorithm which exploits both statistical and AI-based tools. The human-automation symbiosis, i.e. the cooperation between machines and human operators in the production environment, properly designed so as to enable their coexistence in a more efficient fashion, is the main feature of the proposed architecture, which places the proposed solution fully within the Industry 5.0 framework. Moreover, the proposed architecture also moves toward repair and prevention strategies, according to the sustainability objectives imposed by the Industry 5.0 paradigm and the zero defect manufacturing (ZDM) concept, via the possible avoidance of product waste.
The structure of the paper is as follows. Section 2 presents an overview of the related works. In Section 3, a detailed description of an IoT-based and cloud-assisted multistage manufacturing system is provided. A comprehensive explanation of the proposed IoT-based and cloud-assisted monitoring architecture is given in Section 4, where the novel AI-driven solution proposed for anomaly detection and classification is detailed. The experimental validation of the proposed architecture, carried out for a real case study of thermal high-vacuum flat panel (HVFP) production, is disclosed in Section 5. Finally, concluding remarks are drawn in Section 6.

Related works
This work represents one of the first attempts toward a unified framework for monitoring platforms in smart manufacturing systems from an Industry 5.0 perspective. Within this context, the technical literature is split into two main research lines: (1) the conceptualization of IoT-based platforms for smart manufacturing, without addressing the problems related to their implementation, and (2) the design of AI-based solutions to address defect detection and prediction in manufacturing systems, which restrict their results to the performance evaluation of the exploited tools without finding causalities in the evolution of smart manufacturing processes. Accordingly, in the sequel, the main recent real-time monitoring platforms are investigated and, then, the specific tools enabling defect detection and prediction in smart industrial environments are presented.
An IOT based smart monitoring platform
2.1 Toward the conceptualization of an IoT-based platform
Industry 4.0 is leading manufacturing enterprises to the new generation of cyber-physical systems (CPSs) and network-enabled smart manufacturing. Internet of Things (IoT), big data analytics, CC and AI tools are recognized as the main enabling technologies within the Industry 4.0 paradigm which, by linking interconnected "things" like sensors, actuators, controllers, robots and machines, allow coping with system complexity while improving the performance of the entire production system, as well as its flexibility and production rate (Yang et al., 2019). Although many efforts have been made in this direction, how to build an integrated smart manufacturing platform, able to emulate the production environment and guarantee interaction/cooperation between physical and cyber spaces, is still an open issue that only a few works try to solve. For example, an advanced manufacturing cloud of things (AMCoT) platform for enhancing and assuring yields is proposed in the study by Lin et al. (2017) within the context of a bumping process of a semiconductor company in Taiwan, where IoT, CC, big data analytics, CPSs and prediction technologies represent its core value. Moreover, a new cloud-assisted self-organized architecture (CASOA) is designed in the study by Tang et al. (2017) where, based on both a distributed agent-based modeling approach and the cloud, communication and negotiation among the different network entities are ensured, thus enabling dynamic reconfiguration of the production process and its flexibility, but without providing a discussion about the production status. Manufacturing object virtualization, data processing and data-driven decision-making are embedded within the platform designed in the study by Woo et al. (2018) which, by again exploiting the agent framework, creates decision-making models based on real and historical data. Again, the digital twin (DT) tool is exploited in the study by Leng et al. (2021) to build an architecture able to ensure remote semi-physical commissioning. Herein, the authors carry out its validation for the case study of a smartphone assembly line, thus showing a reduction of commissioning iteration times.
Although the above-mentioned works represent first attempts toward practical solutions that could bring benefits to modern industries, other works in the technical literature limit their analysis to the conceptualization of the main pillars required to ensure smartness in manufacturing processes (see e.g. Zheng et al., 2018 and references therein). However, they are still far from the Industry 5.0 paradigm. Indeed, the very recent Industry 5.0 concept recognizes the power of industries in achieving, besides the digitalization goals, environmental objectives and resilience w.r.t. emergency manufacturing situations, while acknowledging workers' centrality during each production stage (Maddikunta et al., 2021). Specifically, this leads to the fifth industrial revolution, which aims at changing manufacturing systems via the combination of computation and digitalization skills with manufacturers' expertise, while the machines in the workplace become smarter and more connected. One of the first attempts to show the benefits deriving from the collaboration between robots and the human brain can be found in the study by Nahavandi (2019), where the advantages of merging the physical world with workers' intelligence and information and communication technologies (ICTs) in improving process efficiency are clearly explained. Moreover, AI tools and, in particular, machine learning (ML) are widely used by researchers in the manufacturing industry field in order to successfully perform different operations, such as scheduling, monitoring, quality assessment and failure detection (Nassehi et al., 2022). In addition, as pointed out in the study by Javaid and Haleem (2020), ML and AI can be used to analyze manufacturing process data, while human critical thinking allows achieving higher accuracy and faster industrial automation. Finally, Sherburne (2020) highlights the benefits of the potential use of Industry 5.0 in the textile industry via qualitative research. In this direction, this work lays the basis for a unified monitoring platform which combines digitalization and AI tools with the additional strengths of Industry 5.0.

Detection and prediction algorithms via artificial intelligence
Detection strategies are related to the early detection of defects, anomalies and faults by classifying them on the basis of the parameters causing undesirable effects (Caiazzo et al., 2022). In this context, deep learning (DL) techniques have recently been exploited for fault detection and for discovering all intrinsic linear/nonlinear relationships among the main manufacturing parameters. For example, Wang et al. (2020a, b) introduce an extended deep belief network (EDBN) which allows fully exploiting useful information from raw data; the raw data hence become additional inputs, together with other hidden features, to each extended restricted Boltzmann machine (ERBM). Again, to extract quality-relevant features from raw data, an autoencoder (AE) model, named stacked quality-driven autoencoder (SQAE), is introduced in the study by Yuan et al. (2020); it takes both classical input and quality variables in order to capture quality-relevant features and neglect irrelevant ones. Most data-driven detection methods exploit convolutional neural networks (CNNs) (Dong et al., 2019a, b), which are data-based online fault diagnosis methods with a good trade-off between accuracy and training period length, due to a significant reduction of the training time w.r.t. other DL methods (see e.g. Xu et al., 2020). However, these methods need a large amount of data to be processed. When the available data are limited, a helpful tool could be the DT concept, consisting of a virtual model of the physical system in the cloud, connected to the physical system itself for information exchange purposes (Chakraborty et al., 2021). However, the training and testing data from the DT present the same feature distribution, which may not be realistic due to multiple loading conditions, working environments and fault severities (Han et al., 2020). To overcome these limitations, Han et al. (2020) propose a deep transfer network (DTN) with joint distribution adaptation (JDA). Other examples in this direction can be found in the studies by Tabernik et al. (2020), He et al. (2019), Chen et al. (2020) and Dong et al. (2019a, b) for surface-quality control of industrial products. Moreover, other examples of DL methods for detection in the specific additive manufacturing field are provided by Okaro et al. (2019) and Ravindranath et al. (2020). To achieve, instead, combined fault detection and fault diagnosis of rare events in multivariate time-series data, Park et al. (2019) combine an AE for rare event detection and a long short-term memory (LSTM) network for the identification of the fault type. This technique achieves a good trade-off between the strong low-dimensional nonlinear representations of the AE for rare event detection and the strong time-series learning ability of the LSTM for fault diagnosis. Since abnormal samples are often of insufficient size in real industrial environments, Jiang et al. (2019) and Wu et al. (2020) propose a Gaussian Bayes AE LSTM. Other approaches, different from DL, are also exploited in the technical literature, such as the support vector machine (SVM), e.g. in visual inspection (Zhou et al., 2019), but with difficulties in treating multistage manufacturing processes, where multiple parameters have to be considered (Du et al., 2015). Therefore, from the literature review about defect detection, it can be observed that the most commonly exploited tools are the CNN and the LSTM.
Besides detection, prediction strategies, aiming at forecasting the quality of each part of the product before its production (Psarommatis et al., 2020; May and Kiritsis, 2019), are also widely investigated. In this context, future health conditions and the remaining useful life (RUL) of both equipment and products are also investigated in order to schedule optimal maintenance actions, hence performing timely preventive replacements and preventing unexpected failures while minimizing total maintenance costs (Tian 2012; Liu et al., 2019; Petrillo et al., 2020). Indeed, quality in a manufacturing process implies that the product performance characteristics and the process itself are designed to meet specific objectives (García et al., 2019). In this framework, a joint-loss CNN architecture is proposed in the study by Liu et al. (2020) to deal with fault recognition and RUL prediction in parallel by sharing the parameters and partial networks. Some extensions of CNN approaches, combined with the Takagi-Sugeno-Kang (TSK) fuzzy model, are also suggested by Bhowmik et al. (2019) for guaranteeing the surface roughness requirements of products. Furthermore, it is worth noting that quality assurance is crucial for batch processes in manufacturing and chemical industries due to their modeling difficulties and prediction problems. Along this line, Wang et al. (2020a, b) combine LSTM and a stacked autoencoder (SAE) to extract quality-relevant features by capturing nonlinear relationships during the training of the networks. With the aim of improving quality monitoring and prediction accuracy, a generative neural network model is proposed in the study by Wang et al. (2019) for automatically predicting work-in-progress (WIP) quality, while the extracted features are reformed as time series. The latter are fed to a multilayer perceptron for product quality prediction and, finally, the outputs are decoded into a forecast quality measurement. According to the literature review about prediction strategies, the majority of the articles concern unsupervised pretrained networks (UPNs), which are recognized as the most efficient approach to extract the main features from data and neglect the redundant ones.

Contribution of the present work
The main contribution of this work lies in the fact that it is one of the first attempts at leading modern smart manufacturing systems toward the Industry 5.0 concept. The proposed IoT-based cloud-assisted monitoring platform fits the following Industry 5.0 core values (Xu et al., 2021): (1) human centricity, where human needs and interests become the heart of the production system, hence shifting the view of workers from costs to investments, (2) resilience, which refers to the need to create a higher level of robustness in the industrial environment, hence counteracting disruptions and supporting critical emergency situations and (3) sustainability, with the aim of reducing waste and environmental impact, thus reaching better resource efficiency and effectiveness (Xu et al., 2021). Indeed, although Lin et al. (2017), Tang et al. (2017) and Woo et al. (2018) represent first attempts toward the conceptualization of a smart manufacturing platform able to mimic the production environment, they are still far from including the above-mentioned Industry 5.0 principles, since no discussion about human centricity, resilience and sustainability issues is provided. Conversely, the proposed monitoring platform represents a unified framework combining human strengths, IoT technology on machines and cloud-based solutions with AI to detect causalities in complex dynamic systems. As a consequence, since workers play a crucial role in the final stage of the detection algorithm according to the human-centricity vision, the proposed monitoring architecture is causality based, i.e. its results not only provide the health status of the production but also identify and localize the anomalous variable. This is in contrast, to the best of the authors' knowledge, with the aforementioned related works which, instead, propose correlation-based anomaly detection solutions, i.e. they provide information about production defects without specifying their causes or where the faults occur. Specifically, although AI techniques such as the EDBN (Wang et al., 2020a, b), the AE (Yuan et al., 2020; Jiang et al., 2019), the CNN (Dong et al., 2019a, b) and the LSTM (Park et al., 2019; Wu et al., 2020) are able to capture all the linear/nonlinear intrinsic relations between input parameters, it is difficult to recognize their impacts on the results. To overcome this issue, the proposed platform is endowed with fuzzy logic tools which, designed with the aid of human expertise (according to the human-automation symbiosis principle of Industry 5.0), allow clearly identifying both the correlations and the causalities of production defects.
Furthermore, the monitoring platform also moves toward repair and prevention solutions, which represent two strategies not widely addressed in the ZDM field. Indeed, repairing defective parts is often a difficult and costly process, while the prevention strategy is a complex process requiring multiple inputs from different sources in order to be effective. Therefore, the proposed platform tries to solve some of the crucial challenges pointed out in very recent survey works about ZDM (Caiazzo et al., 2022; Psarommatis et al., 2020). Specifically, it guarantees (1) adaptive quality prediction, since the platform provides a quality prediction which reflects the current state of the industrial process, (2) data collection management, since the technical literature provides no procedure for data collection, management and elaboration in a unified framework, and offering one removes a barrier to the implementation of ZDM strategies in industrial realities and (3) repair as a sustainable solution, since the platform results allow avoiding resource waste with the aid of human operators who, remotely connected, can promptly know the health status of the process and act properly. Hence, according to the Industry 5.0 paradigm, sustainability and respect for planetary boundaries are taken into account.
Most notably, the effectiveness of the proposed architecture is experimentally evaluated on the manufacturing process of a solar thermal HVFP made by a company in Italy. Experimental results prove the efficiency of the proposed solution not only in recognizing the nature of possible anomalies, but also in localizing them and understanding their causality, as well as their risk levels. Indeed, human operators, remotely connected to the cloud, are supported by an abnormal panel risk (APR) which provides the anomaly risk level and, hence, supports them in the decision-making process by suggesting whether interventions are needed.

Problem statement: smart monitoring in manufacturing system
Consider a multistage manufacturing system consisting of N stations, each associated with one of the stages of product manufacture, as reported in Figure 1. The latter provides a high-level representation of a general cyber-physical production system (CPPS) which, by exploiting smart devices, IoT technologies and CC, is characterized by the following three main features: (1) smartness, since each single entity is able to acquire information from the surrounding environment and to act autonomously, (2) connectivity, since there exist connection links allowing cooperation and collaboration among the manufacturing entities and (3) responsiveness toward internal/external changes (Monostori et al., 2016). At the physical manufacturing level, machines within the smart stations are equipped with smart sensors, smart meters, actuators and controllers, which represent the basic technology for collecting and controlling production data in real time (Kang et al., 2016). They can connect to each other via communication technologies, such as Wi-Fi and Bluetooth, so as to make production elements smarter and self-adaptive (Qu et al., 2019). Interconnection among physical entities, cloud and people is enabled by low-energy and high-efficiency communication networks, such as Wireless Sensor Networks (WSNs), Internet Protocol version 6 (IPv6), Wi-Fi, wireless personal area network (WPAN), W-Mesh, WLAN, Wireless Wide Area Network (WWAN), 4G/5G, Narrowband Internet of Things (NB-IoT), Bluetooth, Zigbee, radio frequency identification devices (RFIDs) and Global Positioning System (GPS) (Qu et al., 2019). Finally, the data from physical manufacturing entities are transmitted to the cloud-based data center to be further analyzed. Specifically, the cloud infrastructure is responsible for data collection, integration, storage, analysis, visualization and application. By leveraging cloud-based high-performance computing, big data analytics enables users to accelerate computationally expensive tasks while also reducing costs (Tao et al., 2018). Moreover, it is worth noting that emerging technologies for data storage and processing, such as fog computing (FC) and edge computing (EC), which are able to significantly reduce bandwidth requirements, latency and service downtime, can also be included in the schematic representation of Figure 1.
Hence, in a multistage production environment, each station i (i = 1, ..., N) is equipped with smart devices able to measure all the variables involved in the current process stage and endowed with communication capabilities to share the acquired information with both the other stations j (with j = 1, ..., N, i ≠ j) and a cloud-assisted upper layer. The latter gathers all the information coming from the N stations, processes the data and converts them into a user-friendly format so as to help operators, possibly connected remotely, stay aware of the real-time status of the manufacturing system. This supports humans in the decision-making process and in monitoring the efficiency of the manufacturing system. In this perspective, the design of an innovative smart solution, able to guarantee a preventive maintenance strategy and the enhancement of in-process quality control by reducing/eliminating the need for post-process quality inspection, is desirable. Monitoring plays a crucial role in ensuring product quality and letting all the manufacturing facilities run more efficiently. Indeed, thanks to the real-time knowledge about the status of all the process stations, the manufacturing process can be rearranged to counteract the specific product defects occurring in the ith station.
The anomaly detection algorithm should run on the cloud-assisted layer and, on the basis of the data stored there, has to recognize abnormal behaviors in each stage i of the manufacturing system. Furthermore, it has to classify the risk level of the problem that has occurred. Accordingly, a proper alert notification has to be sent to the human operators, remotely connected to the cloud-assisted upper layer, in order to aid them in making timely adjustments. However, the identification of production anomalies is not a trivial task, since a large variety of factors may cause anomalous events, which can be revealed by certain patterns captured in data time series. By analyzing the real-time data combined with time series, it is possible to highlight the overall correlations among the different data that impact the whole process, so as to characterize and localize the specific anomaly, as well as its gravity. This supports the decision-making process via different kinds of alert notifications sent to the human operators who, based on the revealed gravity of the anomaly and their work experience, can decide whether interventions are needed.
In this operative framework, the aim of this work is to propose a new IoT-based and cloud-assisted monitoring architecture able to evaluate the status of the overall multistage manufacturing system and detect anomalies occurring in production. More specifically, w.r.t. the latter problem, a novel AI-based detection algorithm is proposed. Based on the data sensing and communication capabilities of each ith station, the novel algorithm is able to catch abnormal behavior in each production stage and to capture data correlations so as to identify the specific anomalous event and the related risk classification for possible interventions.

IoT-based and cloud-assisted monitoring architecture
To solve the problem stated in Section 3, the smart monitoring architecture reported in Figure 2 is proposed. This is a scalable and modular platform consisting of five interconnected layers, namely, (1) physical layer, (2) transmission layer, (3) cloud cyber layer, (4) real-time monitoring layer and (5) smart decision-making layer. The proposed architecture is modular due to its hierarchical structure, which is based on different modules/layers. Each of them deals with a specific task and can be redesigned independently of the others, based on technological development needs and/or requirements, without modifying the overall structure in Figure 2. Moreover, the architecture is scalable since it can be easily reused or replicated for different production environments, regardless of the number of production parameters to be monitored.
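The modularity property described above can be illustrated with a minimal sketch in which each of the five layers is a replaceable function and the overall chain is fixed. All function names, payload fields, readings and limits below are hypothetical and chosen only for illustration; the cloud cyber step is a simple placeholder and not the detection algorithm proposed in this work.

```python
from typing import Callable, Dict

Layer = Callable[[dict], dict]

# Fixed order of the five layers named in the text.
LAYER_ORDER = ["physical", "transmission", "cloud_cyber", "monitoring", "decision"]


def physical(payload: dict) -> dict:
    # Hypothetical sensor readings acquired at one station.
    return {**payload, "readings": {"temperature": 21.5, "pressure": 1.2}}


def transmission(payload: dict) -> dict:
    # Tag the payload with the (assumed) channel used to reach the cloud.
    return {**payload, "channel": "wifi"}


def cloud_cyber(payload: dict) -> dict:
    # Placeholder analysis: flag readings above hypothetical upper limits.
    limits = payload["limits"]
    abnormal = sorted(k for k, v in payload["readings"].items() if v > limits[k])
    return {**payload, "abnormal": abnormal}


def monitoring(payload: dict) -> dict:
    # Summarize the analysis result for display to the operator.
    return {**payload, "status": "ALERT" if payload["abnormal"] else "OK"}


def decision(payload: dict) -> dict:
    # The final choice stays with the operator; this only suggests an action.
    action = "inspect station" if payload["status"] == "ALERT" else "none"
    return {**payload, "suggested_action": action}


DEFAULT_LAYERS: Dict[str, Layer] = {
    "physical": physical,
    "transmission": transmission,
    "cloud_cyber": cloud_cyber,
    "monitoring": monitoring,
    "decision": decision,
}


def run_pipeline(layers: Dict[str, Layer], payload: dict) -> dict:
    # Apply the layers bottom-up; the chain itself never changes.
    for name in LAYER_ORDER:
        payload = layers[name](payload)
    return payload


def hot_physical(payload: dict) -> dict:
    # Alternative physical layer, swapped in without touching the other layers.
    return {**payload, "readings": {"temperature": 35.0, "pressure": 1.2}}


request = {"station": 1, "limits": {"temperature": 30.0, "pressure": 2.0}}
nominal = run_pipeline(DEFAULT_LAYERS, request)
swapped = run_pipeline({**DEFAULT_LAYERS, "physical": hot_physical}, request)
print(nominal["status"], swapped["status"], swapped["abnormal"])
```

Replacing a single dictionary entry is enough to redesign one layer, which mirrors the claim that each module can evolve independently while the five-layer structure is preserved.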
The physical layer is characterized by a network of distributed smart devices able to sense and monitor in real time the main features of the production quality, as well as the crucial parameters of machining processes. Specifically, the physical layer could be composed of RFID tags, attached to the products to be processed, and different sensors, such as accelerometers, dynamometers, thermocouples, pressure sensors, cameras and so on, hence allowing the continuous monitoring of the health status of both equipment and products.
Thanks to the recent advances in IoT and, in general, in ICT (Shahbazi and Byun, 2021), the transmission layer aims at sharing the data acquired by the distributed smart devices network with the central cloud-assisted data elaboration unit. The different data communication technologies, including Ethernet, Wi-Fi, 4G/5G networks, RS-232 and Bluetooth, ensure real-time information transmission in the different data formats originated by the heterogeneous distributed smart entities.
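Since the transmission layer has to carry different data formats originated by heterogeneous devices, one possible way of handing them to the cloud is to decode every message into a single common record. The following sketch assumes two hypothetical message formats (JSON and CSV) and hypothetical field names; it is an illustration of the idea and not the protocol adopted in the platform.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Reading:
    """Common record produced for every device, whatever its native format."""
    station: int
    variable: str
    value: float
    timestamp: float


def from_json(message: str) -> List[Reading]:
    # Assumed JSON layout: {"station": 1, "ts": 10.0, "values": {"name": value}}
    doc = json.loads(message)
    return [
        Reading(int(doc["station"]), name, float(value), float(doc["ts"]))
        for name, value in doc["values"].items()
    ]


def from_csv(message: str) -> List[Reading]:
    # Assumed CSV layout with header: station,variable,value,ts
    rows = csv.DictReader(io.StringIO(message))
    return [
        Reading(int(r["station"]), r["variable"], float(r["value"]), float(r["ts"]))
        for r in rows
    ]


# One decoder per supported format; new formats are added by registering here.
DECODERS: Dict[str, Callable[[str], List[Reading]]] = {
    "json": from_json,
    "csv": from_csv,
}


def normalize(fmt: str, message: str) -> List[Reading]:
    return DECODERS[fmt](message)


json_msg = '{"station": 1, "ts": 10.0, "values": {"temperature": 21.5}}'
csv_msg = "station,variable,value,ts\n2,pressure,1.2,11.0\n"
readings = normalize("json", json_msg) + normalize("csv", csv_msg)
print(readings)
```

With this design, the cloud cyber layer only ever sees `Reading` records, so adding a device with a new native format requires one extra decoder and no change downstream.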
Data collection, storage and processing operations are performed at the cloud cyber layer. Therefore, this layer aims not only at collecting and storing the shared information coming from the two underlying layers, but also at creating new knowledge from a large amount of heterogeneous data. Manufacturing information can be classified into structured data (e.g. digits, symbols, tables), semi-structured data (e.g. trees, graphs, XML documents) and unstructured data (e.g. logs, audio, videos and images) (Zheng et al., 2018). Then, through CC, data storage can be guaranteed in a highly cost-effective, energy-efficient and flexible fashion (Qi and Tao, 2019) in order to process manufacturing information for predictive maintenance and product quality assurance purposes. First of all, however, these data have to be carefully preprocessed so as to put them into a suitable form for the next elaboration phase by discarding redundant, misleading, duplicate and inconsistent information. The preprocessing operations could involve the following steps (Alasadi and Bhaya, 2017): data cleaning, in order to eliminate garbage data; data integration; and data reduction, for converting the massive volume of data into ordered, meaningful and simplified forms by means of feature or case selection. Once these operations are completed, the resulting dataset is fed to the proposed novel detection and classification algorithm, which provides real-time monitoring of the manufacturing system and, when needed, notifies the occurrence of anomalous events. This cloud-assisted algorithm, designed via AI-based techniques, aims at capturing the deviations of each manufacturing system parameter from its nominal trend, hence classifying the acquired data as normal or abnormal. This classification allows defining latent variables/indices, involving different correlated parameters, which are properly processed in order to return accurate information about the anomaly occurring in the specific stage of the manufacturing system along with its risk assessment. Indeed, the algorithm provides a risk scale, with different gravity levels, which suggests to the human operators, remotely connected to the cloud via the real-time monitoring layer, whether some interventions are needed. The proposed AI algorithm is discussed in the next section.
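The three preprocessing steps listed above (cleaning, integration, reduction) can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the field names (`t`, `temp`, `press`, `unit_id`) and the helper functions are hypothetical and chosen only to make the steps concrete with the Python standard library.

```python
def clean(records):
    """Data cleaning: drop records with missing values and exact duplicates."""
    seen, kept = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if None in rec.values() or key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept

def integrate(*sources):
    """Data integration: merge records sharing the same timestamp `t`."""
    merged = {}
    for source in sources:
        for rec in source:
            merged.setdefault(rec["t"], {}).update(rec)
    return [merged[t] for t in sorted(merged)]

def reduce_features(records, keep):
    """Data reduction: keep only the selected features."""
    return [{k: rec[k] for k in keep if k in rec} for rec in records]

# Hypothetical readings coming from two heterogeneous smart devices.
temperature = [
    {"t": 1, "temp": 60.5},
    {"t": 1, "temp": 60.5},   # duplicate
    {"t": 2, "temp": None},   # garbage reading
    {"t": 3, "temp": 61.0},
]
pressure = [
    {"t": 1, "press": 2e-5, "unit_id": "A"},
    {"t": 3, "press": 3e-5, "unit_id": "A"},
]

dataset = reduce_features(
    integrate(clean(temperature), clean(pressure)),
    keep=("t", "temp", "press"),
)
```

In this toy run the duplicate and the missing reading are discarded, the two sources are joined on the timestamp and the `unit_id` column is dropped by feature selection.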
The real-time monitoring layer, instead, allows the real-time visualization of the manufacturing system, i.e. the monitoring data and the results of the anomaly detection algorithm, to the dedicated human operators, remotely connected. Visualization is performed via user-friendly graphical means (Mittal et al., 2019), such as charts, diagrams, graphs and alert notification messages, which indicate the gravity, the type and the position of anomalies.
This fourth layer represents the bridge between the field-level manufacturing devices and the high-level smart decision-making layer, which is entrusted to human operators and/or a business management system (e.g. enterprise resource planning (ERP)). Indeed, based on the processed information about the manufacturing system and the product health status, rational decisions or interventions on the system can be undertaken. Note that a rational and cognitive decision process would be impractical if based on a large amount of raw, unprocessed data.
It is worth noting that the CPS modeling approach is the key enabler of a smart manufacturing system. As shown in Figure 2, this representation allows building bidirectional interactions (highlighted via the bidirectional arrows) among the layers involved in the resulting smart monitoring architecture (Ding et al., 2019). This interaction and interoperability firstly guarantee the elaboration, the processing and, hence, the creation of new knowledge which, related to the actual status of the production, is sent to manufacturers via a proper Human-Machine Interface (HMI) (from the bottom to the top) displaying the machine status, the machine progress and alarm information. Then, smart operators (according to the Operator 5.0 concept) are able to start a smart decision-making process in order to support efficient production control with timely and proactive responses to anomalies within a flexible and robust production environment (from the top to the bottom). More details about the integration of remote control activities within the smart monitoring platform are provided in Section 4.2. Moreover, regarding the communication protocols in the smart manufacturing field, it is worth noting that industrial communication networks have evolved through several stages, ranging from dedicated Fieldbus networks, such as PROFIBUS and Modbus, to modern Ethernet-based networks, such as EtherNet and EtherCAT, hence allowing easier communication at a higher level. More recently, due to IoT and wireless sensor network (WSN) applications, new communication standards have emerged, such as Institute of Electrical and Electronics Engineers (IEEE) 802.11, IEEE 802.15.1 and IEEE 802.15.4 (Lu et al., 2020). For more details about the latest developments in industrial communication, interested readers may refer to Wollschlaeger et al. (2017).
The problem of selecting the most suitable cyber architecture is beyond the scope of this work. However, it is worth noting that the wide adoption of IoT devices has introduced new challenges for the current CC paradigm (Dustdar et al., 2019). New cloud-based architectures, such as EC and FC, are emerging thanks to their faster response time. Indeed, EC and FC would allow tackling some challenges of CC, such as the low efficiency in analyzing a large amount of data in a short time and the resulting negative impact on the Quality of Service (QoS). However, although EC and FC improve energy savings and resource utilization, there still exist some open issues in their implementation (Zietsch et al., 2020; Laroui et al., 2021).

Cloud-assisted anomaly detection and classification algorithm
The proposed cloud-assisted anomaly detection and classification algorithm leverages the combination of control charts, AE with LSTM layers (AE-LSTM), latent indices/variables and the fuzzy inference system (FIS). The algorithm consists of two main phases, i.e. detection of possible anomalies and their classification. Its inputs are the multistage process parameters, not necessarily labeled, which are categorized into normally distributed and not normally distributed. On the basis of the type of parameter, the solution processes the data according to two different techniques, i.e. AE-LSTM for not normally distributed parameters and control charts otherwise. Then, the algorithm returns information about possible deviations from the nominal parameter trends. The values of these deviations allow classifying the measured parameters as normal or abnormal and, then, defining latent variables/indices which maintain the same information content while reducing dimension and computational complexity. Finally, these latent variables/indices become the inputs to the FIS which, leveraging properly defined fuzzy sets and fuzzy rules, returns the anomaly risk assessment jointly with the localization of the anomalous events. A flow chart of the proposed algorithm is reported in Figure 3, where all the main decision steps involved are highlighted, while Figure 4 discloses its functioning scheme along with the exploited tools. More specifically, Figure 4 points out how the $N$ acquired manufacturing system parameters, involving both not normally distributed (i.e. the blue line) and normally distributed (i.e. the red line) ones, are processed via the two possible anomaly detection tools, hence obtaining the deviations vector $d_x = [d_{x_1}, d_{x_2}, \ldots, d_{x_N}]$, useful to construct the latent variables vector $y = [y_1, y_2, \ldots, y_M]$, with $M \leq N$. This latter vector represents the input to the FIS, which elaborates the anomaly risk related to the detected anomalous parameter.
In the following, all the AI-based steps involved in the construction of additional knowledge about the manufacturing system are detailed.
4.1.1 Input identification. This first step consists in identifying the type of the parameters to be monitored, i.e. not normally distributed or normally distributed. To this aim, a proper statistical analysis is carried out by verifying the distribution of the historical time series of each parameter in order to choose the most suitable anomaly detection method. For this task, the following tools are exploited: graphs, such as histograms, boxplots or quantile plots; descriptive indices, such as the asymmetry and the kurtosis, returning zero in case of normal distribution (... and Bera, 2000); tests of normality, such as the Shapiro-Wilk test, which is preferable for small samples (Royston, 1992), or the Kolmogorov-Smirnov test, used for larger samples (Dimitrova et al., 2020). Once the data distinction is made, it is possible to select the control chart detection method for normally distributed parameters and the AE-LSTM for the not normally distributed ones.
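As an illustration of the descriptive indices mentioned above, the following sketch computes the sample skewness and the excess kurtosis of a historical series and routes the parameter to one of the two detectors. It uses only the standard library; the tolerance `tol`, the function names and the synthetic series are assumptions made for the example, and a formal normality test (Shapiro-Wilk or Kolmogorov-Smirnov) would normally complement this check.

```python
import random

def skewness_and_excess_kurtosis(samples):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def choose_detector(samples, tol=0.5):
    """Pick the detection method; `tol` is a hypothetical tolerance."""
    skew, kurt = skewness_and_excess_kurtosis(samples)
    if abs(skew) < tol and abs(kurt) < tol:
        return "control_chart"   # indices close to zero: treat as normal
    return "ae_lstm"             # otherwise: not normally distributed

# Synthetic historical series used only for demonstration.
rng = random.Random(0)
gaussian_like = [rng.gauss(10.0, 1.0) for _ in range(5000)]
skewed = [rng.expovariate(1.0) for _ in range(5000)]

detector_a = choose_detector(gaussian_like)
detector_b = choose_detector(skewed)
```

Both indices are close to zero for the Gaussian-like series, while the exponential series shows marked asymmetry and heavy tails and is therefore sent to the AE-LSTM branch.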
Algorithm 1. The AE-LSTM algorithm for thresholds' definition via modified sliding window training (SWT)
The AE-LSTM is an artificial neural network (ANN) built on LSTM layers (Hochreiter and Schmidhuber, 1997). All its hyper-parameters depend on the appraised case study and have to be properly selected in order to obtain the most suitable configuration, i.e. the one providing the highest accuracy in the reconstruction phase. The first input to the ANN is the sliding window, which divides the time series into shorter subseries to be analyzed and whose dimension depends on the correlation among the time instants of each subsequence (Suzuki et al., 2014). This correlation can be discovered via the correlogram of the available time series. As best practice, it is suggested to use a time window large enough to include the time instants which present higher correlation.
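The window selection described above can be sketched as follows. The autocorrelation function is the quantity plotted in a correlogram; the rule used to turn it into a window size (keep all leading lags whose autocorrelation is at least `min_corr`) and the value of `min_corr` are assumptions made for illustration, not the authors' criterion.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

def pick_window_size(series, max_lag, min_corr=0.5):
    """Hypothetical rule: cover every leading lag with correlation >= min_corr."""
    size = 1
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < min_corr:
            break
        size = lag + 1
    return size

def sliding_windows(series, size, step=1):
    """Split a time series into overlapping subsequences of length `size`."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, step)]

ramp = list(range(100))              # strongly correlated toy series
window_size = pick_window_size(ramp, max_lag=5)
windows = sliding_windows(list(range(10)), 5)
```

A slowly varying series keeps a high autocorrelation over several lags and therefore gets a longer window, whereas a rapidly alternating one collapses to a window of a single sample.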

Algorithm 2. The AE-LSTM detection algorithm
The training phase is devoted to the evaluation of the mean absolute error (MAE) committed in the reconstruction of each subsequence (window) of the training set. The maximum value of the MAE found during training is then set as the threshold for recognizing whether the different time series of the monitored parameters are anomalous or not.
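The thresholding rule can be sketched in a few lines. The function `reconstruct` below is only a placeholder standing in for the trained AE-LSTM (it returns the window mean at every step), and the clipped difference between the MAE and the threshold is an assumed reading of the deviation; the toy windows are invented for the example.

```python
def mae(window, reconstruction):
    """Mean absolute error between a window and its reconstruction."""
    return sum(abs(a - b) for a, b in zip(window, reconstruction)) / len(window)

def reconstruct(window):
    """Placeholder for the trained AE-LSTM, NOT a real autoencoder."""
    mean = sum(window) / len(window)
    return [mean] * len(window)

def fit_threshold(training_windows):
    """Threshold = maximum reconstruction MAE over the training windows."""
    return max(mae(w, reconstruct(w)) for w in training_windows)

def detect(window, threshold):
    """Return (is_abnormal, deviation); deviation is MAE minus threshold, clipped at 0."""
    error = mae(window, reconstruct(window))
    deviation = max(0.0, error - threshold)
    return deviation > 0.0, deviation

training_windows = [[1.0, 1.2, 0.8, 1.0, 1.0], [1.1, 0.9, 1.0, 1.0, 1.0]]
T_k = fit_threshold(training_windows)

normal_flag, normal_dev = detect([1.0, 1.05, 0.95, 1.0, 1.0], T_k)
abnormal_flag, abnormal_dev = detect([1.0, 1.0, 1.0, 1.0, 3.0], T_k)
```

Because the threshold is the largest error seen on in-control data, a new window is flagged only when the model reconstructs it worse than any training window.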
Algorithm 1 summarizes all the steps which allow the computation of the thresholds for each not normally distributed monitoring parameter, i.e. $T_k$ $\forall k$. Then, when a new time series is given as input to the AE-LSTM, the latter analyzes each of its subsequences via the computation of the MAE for the reconstructed signals. Accordingly, the MAE is compared with the threshold found during the training phase and the algorithm notifies whether some anomalous events occur. This allows classifying each sample of the monitoring parameters as normal or abnormal. The procedure is presented in Algorithm 2.
4.1.3 Control chart detection. As mentioned before, this method allows analyzing the normally distributed parameters of the manufacturing system. For the construction of the control chart limits for each appraised parameter, i.e. the upper control limit (UCL), the center line (CL) and the lower control limit (LCL), the random sampling technique is exploited (Montgomery, 2020). The limits are computed by considering different observations of the reference time series. These samples are plotted in the related control chart in order to determine whether the considered parameter meets the prescribed distribution or is out of control. In the latter case, the samples are marked as outliers and, hence, excluded. Then, the control chart limits are re-computed via an iterative procedure ending when all the parameter data are in control and the final limits are given. Once the UCL, CL and LCL are derived for the appraised parameter, when the real-time observation of a normally distributed parameter is given as input to the control chart, it returns the deviation of the parameter from the UCL or LCL. More specifically, if the parameter $x_k$ ($\forall k = 1, \ldots, N$) is in control, i.e. within the UCL and LCL limits, the value of the deviation is $d_{x_k} = 0$ and the parameter $x_k$ is classified as normal. Conversely, if the parameter is out of control, then $d_{x_k}$ is its deviation from the violated limit (UCL or LCL) and the parameter $x_k$ is classified as abnormal.
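A minimal sketch of the iterative limit computation follows, assuming an Xbar-S chart with subgroups of ten samples, as adopted later in the case study. `A3_N10 = 0.975` is the standard tabulated factor for that subgroup size; the synthetic reference data, the function names and the signed form of the deviation are assumptions made for the example.

```python
import random
import statistics

A3_N10 = 0.975  # tabulated Xbar-S factor for subgroups of size 10

def xbar_s_limits(subgroups, a3=A3_N10):
    """Return (LCL, CL, UCL) of the Xbar chart from subgroup means and std devs."""
    means = [statistics.mean(g) for g in subgroups]
    s_bar = statistics.mean([statistics.stdev(g) for g in subgroups])
    cl = statistics.mean(means)
    return cl - a3 * s_bar, cl, cl + a3 * s_bar

def iterate_limits(subgroups):
    """Drop out-of-control subgroups and recompute until all are in control."""
    kept = list(subgroups)
    while True:
        lcl, cl, ucl = xbar_s_limits(kept)
        in_control = [g for g in kept if lcl <= statistics.mean(g) <= ucl]
        if len(in_control) == len(kept) or not in_control:
            return lcl, cl, ucl
        kept = in_control

def deviation(sample_mean, lcl, ucl):
    """Zero when in control, otherwise signed distance from the violated limit."""
    if sample_mean > ucl:
        return sample_mean - ucl
    if sample_mean < lcl:
        return sample_mean - lcl
    return 0.0

# Synthetic reference series: 20 in-control subgroups plus one shifted subgroup.
rng = random.Random(1)
reference = [[rng.gauss(50.0, 1.0) for _ in range(10)] for _ in range(20)]
reference.append([rng.gauss(58.0, 1.0) for _ in range(10)])

LCL, CL, UCL = iterate_limits(reference)
```

The shifted subgroup inflates the first estimate of the center line, is then excluded as an outlier, and the limits are recomputed on the remaining in-control data.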
4.1.4 Identification of indices/latent variables. The input at this elaboration stage is the deviation vector $d_x$, whose dimension equals the number of manufacturing parameters to be monitored. Since a large number of parameters can characterize a multistage manufacturing process, the dimension of the vector $d_x$ increases with the number of parameters $x_k$. As a consequence, this can increase the computational burden required by the subsequent fuzzy inference system. To avoid this issue, a vector of indices/latent variables with lower dimension, i.e. $y = [y_1, y_2, \ldots, y_M]$ with $M < N$, is constructed. The fundamental aspect to be considered when constructing the indices/latent variables is that they should remain interpretable so as to be usable in the next elaboration step. To this aim, the data correlation matrix and/or the scatter plots are used. Accordingly, it is possible to merge parameters which are homogeneous and correlated.
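One possible way of merging correlated parameters is sketched below. The greedy grouping rule, the correlation threshold `min_corr`, the aggregation by largest-magnitude deviation and the parameter names `T_LAMP_1` and `T_LAMP_2` are all assumptions for illustration; in practice the grouping is decided with the process experts by inspecting the correlation matrix and scatter plots.

```python
def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def group_correlated(history, min_corr=0.8):
    """Greedy grouping: join the first group whose first member is correlated."""
    groups = []
    for name in history:
        for group in groups:
            if abs(pearson(history[name], history[group[0]])) >= min_corr:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

def latent_variables(deviations, groups):
    """One index per group: the deviation with the largest magnitude."""
    return [max((deviations[name] for name in g), key=abs) for g in groups]

# Hypothetical historical series for N = 3 parameters.
history = {
    "T_LAMP_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "T_LAMP_2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "P_CART": [5.0, 1.0, 4.0, 2.0, 3.0],
}
groups = group_correlated(history)
d_x = {"T_LAMP_1": 0.0, "T_LAMP_2": 1.5, "P_CART": -0.2}
y = latent_variables(d_x, groups)   # M = 2 indices from N = 3 deviations
```

Each resulting index still maps to a named group of physical parameters, which keeps it interpretable for the fuzzy rules that follow.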
4.1.5 Fuzzy inference system. The FIS is here exploited to capture the specific product anomalies from different parameters and to understand how they affect the final process outputs. More specifically, the FIS determines, on the basis of the latent variables/indices, the anomaly occurring in the production along with its gravity level. The proposed FIS is of the Mamdani type (Pourjavad and Mayorga, 2019) and the fuzzy rules can be properly designed, depending on the specific appraised manufacturing system, with the aid of process experts. Indeed, the latter are able to provide a correct judgment about the production results. The first phase of the FIS consists in identifying the linguistic variables, the related terms and the universes of reference, which represent the antecedents and the consequents in the inference system. In this case, the antecedents are the latent indices/variables previously identified, while the consequents are the anomaly risks on the final outputs of the process.
Regarding the construction of the fuzzy sets, they strongly depend on the specific case study and are derived by leveraging the expertise of the human operators/manufacturing engineers about the impact of parameter variations on the final outputs of the process. In addition, this information is combined with that describing the values assumed by the deviation vector in correspondence of parameter variations. This operation is performed by running the AE-LSTM with these parameter variations as its inputs. The fuzzy rules of IF-THEN type, covering the space of the possible combinations among the antecedents, are also strongly related to the specific use case and are derived with the aid of the experts of the manufacturing process and by exploiting, where needed, the correlation matrix and/or scatter plots of the appraised parameters.

Finally, thanks to the defuzzification operation, the FIS returns a database containing, for each time instant, information about the risk level of the possible anomalous outputs and about the parameters causing anomalies, along with the related values.
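The Mamdani inference chain (fuzzification, min implication, max aggregation, defuzzification) can be sketched with a single antecedent. The consequent sets reuse the four-point risk parameters listed for the abnormal panel risk in Table 7, here interpreted as trapezoids; the antecedent sets `Y_SETS`, the three rules and the Mean of Maximum defuzzifier on a sampled universe are assumptions for the example, written in plain Python rather than with the skfuzzy library used in the case study.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Consequent (risk) sets on the universe [0, 10].
APR_SETS = {"low": (0, 0, 2, 4), "moderate": (2, 4, 6, 8), "high": (6, 8, 10, 10)}
# Hypothetical antecedent sets for one normalized latent index y in [0, 1].
Y_SETS = {"small": (0, 0, 0.2, 0.4), "medium": (0.2, 0.4, 0.6, 0.8),
          "large": (0.6, 0.8, 1, 1)}
# Hypothetical rules: IF y is <term> THEN risk is <label>.
RULES = {"small": "low", "medium": "moderate", "large": "high"}

def infer_risk(y, steps=1000):
    """Mamdani inference with Mean of Maximum defuzzification."""
    universe = [10.0 * i / steps for i in range(steps + 1)]
    strengths = {RULES[term]: trapmf(y, *params) for term, params in Y_SETS.items()}
    aggregated = [
        max(min(s, trapmf(u, *APR_SETS[label])) for label, s in strengths.items())
        for u in universe
    ]
    peak = max(aggregated)
    if peak == 0.0:
        return None  # no rule fired
    maxima = [u for u, m in zip(universe, aggregated) if m >= peak - 1e-9]
    return sum(maxima) / len(maxima)

risk_small = infer_risk(0.1)
risk_medium = infer_risk(0.5)
risk_large = infer_risk(0.9)
```

With a single fully activated rule, the Mean of Maximum output is the midpoint of the plateau of the corresponding risk set, so small, medium and large deviations map to the centers of the low, moderate and high risk plateaus.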

Smart monitoring for remote control
The IoT-based cloud-assisted monitoring platform presented herein brings many benefits to the smart manufacturing system from the Industry 5.0 point of view. Indeed, given the availability on the cloud of real-time data from field-level devices, human operators and business managers are able to remotely visualize the real-time status of each production parameter via smartphones, tablets and/or other connected devices. Accordingly, they can properly act, only when necessary, on the specific manufacturing stage causing the possible anomalous events. This is especially relevant in emergency situations, such as the COVID-19 pandemic, where factories are forced to operate with a reduced number of human operators. Therefore, the proposed smart monitoring architecture provides a first attempt at the remote control of smart manufacturing systems and paves the way to the design of more sophisticated adaptive remote control approaches, which lead to a more flexible production environment where proactive decisions can be timely taken without any human intervention. Within this framework, a possible conceptual architecture can be realized by integrating a distributed control layer, able to autonomously take decisions on manufacturing processes, on the top level of the architecture in Figure 2. This can be done by exploiting a network of smart control devices which, based on the monitoring results provided by the platform, further elaborates this information and provides, for example, new set points to be imposed on each smart station controller. This makes each production unit more responsive to abnormal events.

Smart monitoring for problem processing
Besides the real-time monitoring of the health status of each production entity and control operations, other key operations have to be covered in modern smart manufacturing systems, such as production planning and maintenance scheduling tasks. Although the latter are not the focus of this work, the proposed architecture could involve, as a further development, an additional module for problem processing, which aims at finding recommended possible solutions and estimating their effectiveness, while evaluating the resulting impacts on other manufacturing activities. In so doing, this module is responsible for the maintenance scheduling and production planning operations. Advanced analytics (AA) can be exploited to solve the above-mentioned issues, hence improving the performance of smart manufacturing systems in the information age (Vater et al., 2019). Specifically, among the different AA techniques, prescriptive analytics could be useful to determine the priority of decisions/actions to be executed in order to achieve desired results, by answering the following question: What do I have to do to achieve a desired goal? Through the introduction of the problem processing module, Operator 5.0 can exploit predictive and prescriptive analytics suggestions so as to take smarter decisions and prescribe future activities within a near-, short-, medium- and long-term horizon, thus enabling the vision of the so-called smart resilient manufacturing system (Mourtzis et al., 2022). This results in an agile and flexible/reconfigurable system able to react to and recover from disruptions by adjusting its functioning prior to, during or after operational changes and disturbances, hence sustaining the required operations under both expected and unexpected conditions.

Case study: solar thermal high-vacuum flat panels production
In order to verify the effectiveness and the reliability of the proposed monitoring platform, an experimental validation on a real manufacturing system, involving the production of solar thermal high-vacuum flat panels (HVFPs) with an operative temperature in the range [60; 200] °C, is carried out. The manufacturing system is composed of six stages, namely, (1) absorber pipe assembly preparation, (2) framed glass preparation, (3) molded bottom preparation, (4) panel assembly, (5) panel sealing and (6) panel conditioning. Stages 1 to 3 lead to the creation of the three main panel components (see Figure 5(b)), while the remaining three guarantee the realization of the innovative panel according to the specific requirements. More specifically, Stage 4 allows assembling the different components produced in the previous three phases, while Stage 5 focuses on cleaning the panel. Finally, Stage 6 aims at realizing the required ultra-high-vacuum panel via a multi-zone oven (see Figure 5(a)). Since this stage is the most crucial to ensure high quality of the final product, according to the manufacturers' requirements and availability, the experimental validation analysis is restricted to this stage. The main subject of the panel conditioning phase is the cart transporting from one to four panels toward the ten sub-zones of the oven (see Figure 5(c)-(d)). The cart is composed of four floors (i.e. 0, 1, 2 and 3), where all the floors are equipped with four lamps ensuring that the required panel conditioning temperature range is reached, while Floors 0 and 1 are also equipped with two heaters for further increasing the temperature when necessary. Moreover, a proper pump, mounted on the cart itself, is exploited to create the high-vacuum state inside the panels. The cart moves toward the different oven sub-zones thanks to a control system based on a Siemens S7-1200 Programmable Logic Controller (PLC), while smart sensors allow the measurement of temperature, pressure and current in the ten production phases. By leveraging Microsoft Azure IoT Hub, supporting the Message Queue Telemetry Transport (MQTT) communication protocol, these measurements are then sent to the Microsoft Windows-based Azure Cloud with DBMS Microsoft Structured Query Language (SQL) Server 2019 for their proper elaboration, according to the proposed IoT-based cloud-assisted anomaly detection and classification algorithm. Note that the MQTT protocol is considered the most suitable connection protocol for machine-to-machine and IoT applications since it provides near-real-time data transmission (El Attaoui et al., 2020). Finally, it is worth noting that the anomaly detection and classification algorithm, implemented in the Python 3.7 environment, runs on a virtual machine in Azure Cloud with the following specifications: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60 GHz 2.10 GHz processor, 16 GB RAM and an internal storage of 250 GB. The exploited Python libraries are numpy, pandas, random, glob, math, seaborn, matplotlib, tensorflow, keras, statsmodels, skfuzzy and itertools.

Tailoring the cloud-assisted anomaly detection and classification algorithm
In this section, the cloud-assisted anomaly detection and classification algorithm is tailored to the HVFP case study. The process input identification, described in Section 4.1.1, allows defining the parameters to be monitored along with the specification of their distribution, as in Table 2.
The selected structure for the AE-LSTM is reported in Table 3, while all its hyperparameters are defined in Table 4. For the training and validation of the ANN, 25,000 samples of offline in-control and out-of-control parameters are collected, 90% of which are exploited for training, while the remaining 10% are used for the validation phase. The effectiveness of the training phase (see Algorithm 1) is disclosed via exemplary results, related to the pressure of the cart in Sub-zone 3 of the oven, in Figure 6, where it is possible to appreciate the behavior of the loss function over the epochs (see Figure 6(a)) and the distribution of the samples' MAE (see Figure 6(b)). The correctness of the AE-LSTM prediction is corroborated in the testing phase by the following performance indices: accuracy equal to 93.5% and recall equal to 88.77%. After this training phase, according to Algorithm 2, it is possible to derive the different anomaly thresholds for each time series of each not normally distributed parameter $x_{k,p}$, $p = \{1, \ldots, M\}$, $k = \{1, \ldots, N\}$, of each sub-zone of the oven to be monitored. An example of the thresholds $T_k$ found, referred to Sub-zone 3, is given in Table 5. Next, the designed ANN is validated by considering the real-time monitoring of new time series referred to the appraised monitoring parameters. Exemplary results, again referred to Sub-zone 3 for the sake of brevity, are reported in Figure 7, where it is possible to appreciate the occurrence of an anomaly for the monitoring parameter P_CART in the range [17:07:30; 17:07:47] and at 17:07:55. Conversely, the normally distributed monitoring parameters are analyzed via the random sampling technique with time series of ten samples and exploiting the Xbar-S chart (Smith, 1994). The derived control chart limits, referred to Sub-zone 3 of the oven, are reported in Table 6. An example of the effectiveness of anomaly detection via the control chart for the monitoring parameter C_HS is reported in Figure 8, where it is possible to appreciate the occurrence of an anomaly at 17:07:27 and at 17:07:28 (see Figure 8(a)), while the other variables are in control (see Figure 8(b)). The result of the anomaly detection stage provides the deviation errors $d_{x_1}, d_{x_2}, \ldots, d_{x_N}$ for the $N = 14$ monitored parameters. Note that each deviation $d_{x_i}$ ($i = 1, \ldots, 14$) represents the difference between the actual value and the related threshold $T_k$ for the kth not normally distributed parameter, while, for normally distributed parameters, it is equal to the control chart deviation defined in Section 4.1.3. Starting from the deviations vector, it is possible to define the indices/latent variables (according to Section 4.1.4) by exploiting the correlation matrix and the scatter plot reported in Figure 9.
This analysis allows grouping correlated parameters and, hence, leads to the definition of the latent variables. Now, the FIS is exploited for identifying the abnormal panel risk (APR) for the detected anomalous events. Within this phase, the antecedents are defined as the resulting latent variables and the consequent as the APR. FIS features, derived with the help of process experts, are reported in Table 7, while in Figure 10 the membership function of the APR is disclosed. Moreover, 48 fuzzy rules are derived according to the IF-THEN-ELSE format, while the defuzzification process is carried out via the Mean of Maximum (MoM) method. The final output of the FIS is a new database containing, for each time instant, the value of possible anomalies, their deviations from normal trends, the value of the related latent indices/variables and the APR. Exemplary final elaboration results, related to the aforementioned manufacturing status, are reported in Figure 11. Herein it is possible to observe that, when there is a combined anomaly in Lamps 1 (see Figure 8) and in the cart (see Figure 7), the monitoring architecture notifies the anomaly occurring in the final manufacturing system output, as well as its relative gravity risk, i.e. "low" 1.5. Then, this final result is reported to the external users connected to the platform, suggesting that no interventions are necessary yet.
The latency between the occurrence of an anomalous event and user-alarming, i.e. $D$, strongly depends on two main algorithm features, i.e. the sliding window size $L_{sw}$ and the sampling time of the monitoring platform $T$. For HVFP production, the sampling time is selected as $T = 1$ [min], while the sliding window size is reported in Table 4. Therefore $D = L_{sw} \times T = 5$ [min]. Note that, although the latency is of the order of minutes, it has to be related to the specific manufacturing life-cycle process. Indeed, since the production cycle time of one HVFP is about 9 hours, the latency $D = 5$ [min] is reasonable due to the poor significance of parameter variations over smaller time intervals. Finally, results show that 95.46% of the normal samples are correctly identified as normal, while the remaining 4.54% are false abnormal data. However, in this latter case, it is worth remarking that false abnormal samples lead to a low APR thanks to the integration of the FIS with human expertise, which strongly improves the global performance, thus suggesting to external users that no interventions are needed. This highlights the crucial role of human operators in the final stage of the proposed detection mechanism in the Industry 5.0 perspective.

Concluding remarks
In this work, a novel IoT-based, cloud-assisted and AI-driven smart monitoring architecture is proposed to evaluate the overall status of a multistage manufacturing process and detect possible anomalies. To this latter aim, a novel AI-based strategy is suggested. By exploiting statistical tools, it is able to identify and localize abnormal production events thanks to the combination of the proposed technique with human expertise. Experimental results, carried out on a real manufacturing system, disclose the ability of the proposed solution to capture the products' health state while maintaining some core values, i.e. the human-centric aspect, resilience and product sustainability. This highlights how the proposed architecture moves toward the new concept of Industry 5.0 for manufacturing systems, while also solving some open challenges in the ZDM context.

Future directions
Although smart monitoring is a key operation in modern smart manufacturing systems, as disclosed via the experimental results in Section 5, other crucial activities are required in order to enhance productivity and optimize the whole production process. Based on the considerations in Sections 4.2-4.3, the integration of both distributed control and problem-processing modules within the proposed IoT-based and cloud-assisted monitoring architecture could be addressed in the near future. Specifically, first of all, the introduction of a distributed control layer on top of the proposed architecture, based on networked control system (NCS) theory, could be helpful to reach faster and more proactive responsiveness to abnormal events by exploiting previous production information in order to compensate for deviations. In this perspective, the aim could be the generation of deviation compensations for subsequent production steps on the basis of collected data, while, at the same time, adaptive remote control would be enabled by exploiting cloud-computing services and/or their extensions, such as fog/edge computing. Secondly, a problem-processing layer could be effective in supporting the decision-making process by suggesting recommended actions to smart operators along with the evaluation of their potential impact on the whole manufacturing system. Here, the objective would be the exploitation of advanced analytics and, in particular, prescriptive analytics to determine the optimal sequence of decisions/actions for the resilient Operator 5.0, thus minimizing human judgment, which suffers from subjectivity, in the decision-making phase. In so doing, the huge amount of data from heterogeneous data sources could be exploited in order to build adaptive prescriptive analytics models, which are able to dynamically adapt their behaviour as soon as new data are acquired. However, real-time prescriptive analytics is still in its infancy, since most of the existing works on the topic deal with offline approaches. Hence, the development of real-time and sensor-driven information systems for prescriptive analytics, as well as of recursive algorithms, will have to be studied for future extensions of the platform, which could also be able to process data with time-varying characteristics and, thus, to solve large-scale problems.

Figure 1. IoT-based and cloud-assisted multistage smart manufacturing system
Figure 2. Smart monitoring architecture
Figure 3. Flow chart of the proposed cloud-assisted anomaly detection and classification algorithm
Figure 4. Functioning scheme of the proposed cloud-assisted anomaly detection and classification algorithm
Figure 5. Panel conditioning process: (a) production line, (b) solar thermal HVFP schematization, (c) frontal cart view with heaters at Floors 0 and 1 and temperature sensors and (d) back cart view with temperature sensors and vacuum tube system
Figure 6. Exemplary training process for the P_CART variable: (a) trend of loss functions and (b) distribution of samples MAE
Figure 9. Correlation matrix and scatter plot for the definition of the vector y
Figure 10. Membership functions of the APR

Table 3. AE-LSTM structure
Table 4. AE-LSTM hyperparameters
Table 5.
Table 7. FIS features (only the entries below survive in the source; the preceding rows and the label of the first term are missing)
(term missing): (5 × 10^-7, 5 × 10^-6, 2 × 10^-5, 3 × 10^-5)
High: (2 × 10^-5, 3 × 10^-5, 1100, 1100)
APR (consequent), Low: (0, 0, 2, 4)
APR (consequent), Moderate: (2, 4, 6, 8)
APR (consequent), High: (6, 8, 10, 10)