Model discovery and replay fitness validation using inductive mining techniques in medical training of CVC surgery

Medical training is a foundation on which better health care quality is built. Newly graduated doctors require a good knowledge of practical competencies, which underlines the importance of medical training activities. We therefore propose a methodology to discover a process model that identifies the sequence of medical training activities carried out in the installation of a Central Venous Catheter (CVC) with the ultrasound technique. A dataset of twenty medical video recordings was composed of events in the CVC installation. To develop the process model, the process mining technique infrequent Inductive Miner (iIM) was adopted with a noise threshold of 0.3. The resulting process model combines parallel and sequential events. In addition, process conformance was validated with a replay fitness of about 61.1%, which provided evidence that four activities did not fit the process model correctly. The present study can assist upcoming doctors involved in CVC surgery by providing continuous training and feedback for better patient care.


Introduction
Process mining is a recent research discipline that supports the analysis of business processes based on event log data. It has been used in healthcare for the discovery of process models, and especially for conformance checking and the analysis of social networks. It focuses on extracting knowledge from the data generated and stored in IT systems [1]. Previously, a number of methods have been used to analyze hospital processes and Evidence-Based Medicine [2,3].
Hospitals now use software systems to record patient data. These systems interact with the real world, but by default they do not expose how they do so to the outside world. A process model can therefore be used to organize and explain how software systems interact with the real world [4]. Process mining is an ideal method for learning these interactions from event data recorded in hospital systems [5].
In medical training procedures in particular, process mining has been used in various case studies with good results [6,7]. Moreover, these algorithms can help doctors in the treatment process and thereby support timely medication [8,9,10]. In addition, process model discovery and performance calculations offer better chances of gaining an advantage from the information stored in hospital systems [8]. Using process mining techniques in healthcare processes not only improves process understanding but can also improve service quality and patient feedback [9].
From cardiologists to health specialists, radiologists to anesthesiologists, and nurses to radiologic experts, there is a high demand for training sessions for medical students. The medical literature has discussed at length the significance of compassion and of doctors' communication style in therapeutic practice. Some evidence points to a decline in health care during the long clinical periods of medical school, continuing through the entire traineeship. Surgeons are especially exposed to this problem because of limited workplace experience and the absence of patient sessions. Ultimately, this creates new problems for practicing doctors [11].
In this study, we aim to show the importance of process mining techniques in assessing the sequential activities involved in surgical training for Central Venous Catheter (CVC) installation. The process mining tool ProM [12] with the infrequent Inductive Miner (iIM) was adopted for process validation. This study also addresses conformance calculations, which offer an easy way to find the reasons why the data deviate from the model. To the best of our knowledge, this is the first study that uses conventional process mining techniques in medical training practices involving CVC installation.

Dataset
The medical training dataset from the Conformance Checking Challenge 2019 (CCC 2019) [13] was adopted for this study. The dataset describes a medical training process in which medical students of the Pontifical Catholic University of Chile learn CVC installation with ultrasound [14]. The procedure consists of inserting a catheter (tube) into a central vein to deliver fluids or medication to the patient, among other uses.

Observing event logs from data
The starting point of any process mining analysis is data. The data were selected in the XES (eXtensible Event Stream) format [15]. As discussed, the data were measured and collected by medical students. The adopted dataset is labeled with 20 instances of 29 individual activities. The instances come from ten separate video recordings of CVC installation, each conducted twice. Each recording was checked and verified by the corresponding tutor at the university.

Mapping of CSV to XES
Process mining works with event logs, in which each event corresponds to a single activity in a particular instance. Such event data may be available from a hospital database (for example, patient data), spreadsheets, or comma-separated value (CSV) files. However, because many process mining tools adopt the XES standard format, the CSV file had to be converted to XES (Figure 1).
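The CSV-to-XES mapping described above can be illustrated with a minimal sketch that uses only the Python standard library. The column names (case_id, activity, timestamp) and the sample rows are assumptions for illustration and are not taken from the CCC 2019 files. The output is a bare XES skeleton without extension declarations, so a real tool may expect more metadata.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical column names; the headers of the real CSV file may differ.
CASE_COL, ACTIVITY_COL, TIME_COL = "case_id", "activity", "timestamp"


def csv_to_xes(csv_text: str) -> str:
    """Group CSV rows by case and serialize them as a minimal XES log."""
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        traces[row[CASE_COL]].append(row)

    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, rows in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        # Assumes all timestamps share one ISO 8601 format, so that
        # sorting them as strings is the same as sorting them by time.
        for row in sorted(rows, key=lambda r: r[TIME_COL]):
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string",
                          {"key": "concept:name", "value": row[ACTIVITY_COL]})
            ET.SubElement(event, "date",
                          {"key": "time:timestamp", "value": row[TIME_COL]})
    return ET.tostring(log, encoding="unicode")


# Invented sample rows, for illustration only.
sample = """case_id,activity,timestamp
1,Hand washing,2019-01-01T10:01:00
1,Get in sterile clothes,2019-01-01T10:00:00
2,Anesthetize,2019-01-02T09:00:00
"""
xes = csv_to_xes(sample)
```

Each case becomes one trace element and each row one event element, which is the structure that XES-based tools such as ProM read.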

Infrequent inductive miner (iIM)
Several tools and algorithms are available for process mining, covering model generation, tables, and data analysis. ProM is an open-source tool that provides many methods for process model development. ProM version 6.9 coupled with iIM was employed to ensure model soundness [16]. The Inductive Miner is a discovery algorithm that produces sound process models (i.e., Petri nets). Petri nets are useful for model visualization in process mining [17]. Finally, the model was validated with a noise threshold of 0.3.
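To give an intuition for the noise threshold, the sketch below illustrates one filtering idea behind the infrequent Inductive Miner: a directly-follows edge is kept only if it occurs at least threshold times as often as the strongest outgoing edge of its source activity. This is a simplified illustration and not the ProM implementation, which applies further filters and then discovers a process tree. The function names and example traces are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def filter_infrequent(counts, noise_threshold=0.3):
    """Keep an edge only if its count reaches noise_threshold times the
    count of the strongest outgoing edge of the same source activity."""
    strongest = {}
    for (a, _), n in counts.items():
        strongest[a] = max(strongest.get(a, 0), n)
    return {edge: n for edge, n in counts.items()
            if n >= noise_threshold * strongest[edge[0]]}


# Invented traces: eight follow the usual order, two skip "blood return".
traces = ([["puncture", "blood return", "drop probe"]] * 8
          + [["puncture", "drop probe"]] * 2)
kept = filter_infrequent(directly_follows(traces), noise_threshold=0.3)
```

In this example the edge from puncture to drop probe occurs twice, which is below 0.3 × 8 = 2.4, so it is treated as infrequent behavior and removed before the model is built.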

Replay log fitness for conformance checking
Replay log fitness characterizes the model by indicating whether it can replay the observed behavior [16]. In simple terms, process conformance describes the gap between the model fitted to the traces (a trace is a series of activities) and the intended model (the one in mind). Alignments are the fundamental technique for calculating replay fitness and checking conformance, and they help to find where the data deviate from the model. Once the Petri net model is generated, it can be combined with the log file to estimate the replay fitness.
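The alignment idea can be sketched for the simplest case of a purely sequential model with unit move costs. Under these assumptions, the optimal alignment pairs the longest common subsequence of trace and model as synchronous moves, and every remaining event is a log move or a model move. The function names are hypothetical, and the models in this study also contain parallel branches, which require the full alignment search performed by tools such as ProM.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two activity sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def replay_fitness(trace, model_sequence):
    """Unit-cost alignment fitness of one trace against a sequential model."""
    sync = lcs_length(trace, model_sequence)
    log_moves = len(trace) - sync             # events the model cannot replay
    model_moves = len(model_sequence) - sync  # model steps missing in the trace
    worst_case = len(trace) + len(model_sequence)
    if worst_case == 0:
        return 1.0
    return 1 - (log_moves + model_moves) / worst_case


# Invented example: the trace skips one activity required by the model.
model = ["puncture", "blood return", "drop probe", "remove syringe"]
trace = ["puncture", "drop probe", "remove syringe"]
fitness = replay_fitness(trace, model)  # one model move out of a worst case of 7
```

A trace that matches the model exactly has fitness 1, and each skipped or extra activity lowers the value.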

Dataset outcomes
Following the data mapping, the event distribution over the total activities in CVC installation was observed (Figure 2), and the resource occurrence shares (high to low) were estimated (Table 1).

Mining of CVC schemas
In health care, process mining has been applied to control flow, execution, conformance, and reliability [18]. Accordingly, after log extraction, different Petri net models were generated by adjusting the noise value. Ultimately, the model covering the primary activities was generated with a noise threshold of 0.3 (Figure 3).
In the first section of the process, a parallel flow of event classes can be observed. The process starts with three parallel events: get in sterile clothes, anesthetize, and Doppler identification. This means that there is no strict rule on which activity comes first. Once the doctor is in sterile clothes, the next steps are hand washing and placing the patient in the right position. After that, the process follows two flows. One starts with clean puncture area, followed by drop puncture area and implant preparation, indicating the end of the process. The other flow starts with puncture, followed by blood return and dropping the probe. After these steps, the rest of the flow consists of the sequential events of syringe removal, guidewire installation, trocar removal, and wire checking in the short and long axis.

Conformance checking
The preliminary outcome of the conformance check can be observed in Figure 4, where synchronous activity flow is shown with green edges and asynchronous activity flow with red edges. The trace fitness was 61.1%; the remaining 38.9% of model events did not fit the given traces and were performed asynchronously. In total, four activities were not performed according to the clinical practice plan, which affected the overall model performance. For example, the activities of anesthesia, trocar removal, and short-axis wire check were performed synchronously 17 times, and a model move was observed twice (Table 2). Therefore, in 2 out of 19 instances, these activities were not found in the traces. Similarly, the long-axis wire check activity was not observed a single time in the traces.
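The per-activity counts quoted above reduce to a simple ratio. As a small sketch, the share of synchronous moves for one activity could be computed as follows. The function name is hypothetical, and the counts 17 and 2 are the ones quoted in the text for Table 2.

```python
def synchronous_share(sync_moves, model_moves, log_moves=0):
    """Fraction of alignment moves for one activity that are synchronous."""
    total = sync_moves + model_moves + log_moves
    return sync_moves / total if total else 0.0


# Counts quoted in the text: 17 synchronous moves and 2 model moves.
share = synchronous_share(sync_moves=17, model_moves=2)  # 17 / 19
```

With these counts the share is 17/19, roughly 0.89. An activity that never appears synchronously in the traces has a share of 0.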

Discussion
The present study aimed to discover a process model and the causes of data deviation between activities in medical training for CVC installation. We adopted the iIM technique to develop a Petri net model that highlights the medical activity sequences. In addition, the conformance check used to calculate the model fitness was explained. Synchronous behavior was observed in 61.1% of the event logs and asynchronous behavior in 38.9%. Four primary activities affecting the model performance were observed (Table 2). The results reveal that immediate steps to improve log fitness are essential. A study on generating enhanced structural medical processes [19] helped to gain a better understanding of various hospital models. In real clinical practice, the workforce is set up to manage CVC placement under elective and planned conditions. With advanced computing methods now involved in healthcare, central venous access should be a fast, lifesaving measure, and a specialist should perform the CVC placement [20,21]. Our findings can contribute to understanding the internal process behavior of medical training, enabling doctors to finish the surgical process on time.

Model discovery and replay fitness validation
To the best of our knowledge, this is the first study on the conformance checking challenge for CVC installation. The study in [14] explained the use of process mining strategies to obtain feedback on different aspects of the surgery process. Its main limitation was that it did not disclose the involvement of the primary activities and their contribution to the conformance check. The current research avoids this limitation by including the primary activities of CVC surgery. We observed three parallel flows in the first stage of the operation. The doctor may choose an individual flow according to the activity plan. After that, the process continues with a synchronous flow of activities until the end of the surgery. This study is in line with [22], which proposed a methodology for detecting process patterns in structured learning practices.
Another limitation is the low number of process instances, which might hamper the generalization of the outcomes to the overall population of medical students. However, we believe that this is an initial step toward building process model schemas for this particular medical training program. Another important issue was the low replay fitness value (61.1%). According to Ref. [23], if a model has a low replay fitness value, model soundness (i.e., the ability of the process model to reach the end state without any error) will be strongly affected. These issues could be addressed in future studies to validate the use of process mining in the real world.

Conclusions
In health care, designing high-level learning practices could be essential for achieving better results in surgeries. Supplementing primary training for medical trainees could enable them to develop specific surgical methods in a short time. This paper mainly discussed the use of process mining for the medical task of CVC installation. An inductive mining technique was employed to find activity sequences during surgery. The resulting model gives a structural view of the process as a combination of sequential and parallel events. The present study is limited by low model fitness values, which could be addressed in future studies by involving advanced process mining techniques.