WOW !! MUCH LOVE ! SO WORLD PEACE !
Fond bitcoin pour l'amélioration du site: 1memzGeKS7CB3ECNkzSn2qHwxU6NZoJ8o
  Dogecoin (tips/pourboires): DCLoo9Dd4qECqpMLurdgGnaoqbftj16Nvp


Home | Publier un mémoire | Une page au hasard

 > 

Comparison of process mining techniques application to flexible and unstructured processes

( Télécharger le fichier original )
par HANANE ARIOUAT
Université Paris est créteil - Master 2 2014
  

sommaire suivant

Bitcoin is a swarm of cyber hornets serving the goddess of wisdom, feeding on the fire of truth, exponentially growing ever smarter, faster, and stronger behind a wall of encrypted energy

September 2013/2014.

FACULTY OF SCIENCE AND TECHNOLOGY
UNIVERSITY PARIS XII

SECOND MASTER SCIENCE FOR ENGINEERING

SPECIALTY:

COMPLEX SYSTEMS, INFORMATION TECHNOLOGY AND CONTROL.

Theme

COMPARISON OF PROCESS MINING TECHNIQUES
APPLICATION TO FLEXIBLE AND UNSTRUCTURED
PROCESSES

Presented by:

M$ ARIOUAT Hanane

Supervisor: Pr BARKAOUI kamal
Co-Supervisor: Pr JACKY Akoka

1

Abstract

Process Mining refers to the extraction of process models from event logs. There are many process mining algorithms with different theoretical foundations and aims, raising the question of how to choose the best for a particular situation. A framework is proposed for objectively comparing algorithms for process discovery against a known ground truth, with an implementation using existing tools. Real-life processes tend to be less structured and more flexible. Traditional process mining algorithms have problems dealing with such unstructured processes and generate spaghetti-like process models that are hard to comprehend. To address these problems, we use an approach where the event log is clustered iteratively such that each of the resulting clusters corresponds to a coherent set of cases that can be adequately represented by a process model. In this thesis we present an approach to trace clustering. We rely on the traces profiles, earlier proposed, and adjust it to our purpose by considering only the activities in a case. We propose a new distance measure by using the Logical operator XOR and a new technique to calculate new cluster centers by using the logical operator AND. We have implemented and experimented our approach using real life event logs. Using our, provide homogeneous subsets, and for each subset a better process model is created using the ProM framework. We evaluate the goodness of the formed clusters using established fitness and comprehensibility metrics defined in the context of process mining. The proposed approach is able to generate clusters such that the process models mined from the clustered traces show a high degree of fitness and comprehensibility when compared to contemporary approaches [32].

i

Contents

Table of Content i

List of Figures vii

List of Tables ix

Introduction 1

1 Business Process and Process Mining 4

1.1 Introduction 4

1.2 Business Processes 4

1.2.1 Definitions 5

1.2.2 Business Process Management 6

1.2.3 BPM life cycle 6

1.2.4 Business processes modeling languages 7

1.2.4.1 Petri Nets 8

1.2.4.2 Workflow Nets 10

1.2.4.3 Causal Nets (C-Nets) 11

1.2.4.4 BPMN 12

1.3 Process Mining 14

1.3.1 Event logs 15

1.3.2 Log filtering 16

1.3.3 Process Mining Perspectives 17

1.3.4 Process Mining as Control-Flow Discovery 18

1.3.4.1 The á-algorithm 20

1.3.4.2 The á+-algorithm 23

1.3.4.3 The Heuristics algorithm 24

1.3.4.4 The Genetic algorithm 25

1.4 Evaluation metrics of Business Processes 27

1.4.1 Performance of a Process Mining Algorithm 27

1.4.2 Evaluating the discovered process 27

1.4.2.1 Model-to-log Metrics 29

1.4.2.2 Model-to-model Metrics 30

1.5 Conclusion 31

2 Process Mining Tools 32

2.1 Introduction 32

2.2 ProM Architecture 32

2.3 Log Filters 34

2.3.1 Adding artificial 'start' and 'end' events 35

2.3.2 Duplicate Task filter 37

2.3.3 Remove attributes with empty value 37

2.3.4 Enhanced Event Log filter 38

2.3.5 Time based log filter 39

2.4 Mining Tools 39

2.4.1 á-algorithm plug-in 39

2.4.2 Tshinghua-á-algorithm plug-in 40

2.4.3 Heuristics miner plug-in 40

2.4.4 Genetic algorithm plug-in 41

2.4.5 Social Network miner plug-in 42

2.4.6 Organizational Miner plug-in 43

2.4.7 Staff Assignment Miner plug-in 43

2.4.8 Decision Miner plug-in 43

 
 

2.4.9 Fuzzy Miner plug-in

43

 

2.5

Analysis Tools

44

 
 

2.5.1 Conformance checker

44

 
 

2.5.2 Woflan plug-in ( A Petri-net-based Workflow Analyzer)

46

 
 

2.5.3 Performance analysis

46

 
 

2.5.4 LTL checker

47

 

2.6

Conversion Plug-ins

48

 
 

2.6.1 BPMN to Petri-Net

48

 
 

2.6.2 Petri Net to Yawl Model

49

 
 

2.6.3 Petri Net into WF-Net

49

 
 

2.6.4 YAWL model into EPC

50

 

2.7

Conclusion

51

3

Comparison and analysis of process mining algorithms

52

 

3.1

Introduction

52

 

3.2

Problem Description

52

 

3.3

Comparison and analysis

53

 
 

3.3.1 Log description

53

 
 

3.3.2 Experimentation

55

 
 

3.3.2.1 Fitness

55

 
 

3.3.2.2 Precision and Generality

57

 
 

3.3.2.3 Precision and Recall

61

 

3.4

Conclusion

63

4

Proposition

64

 

4.1

Introduction

64

 

4.2

Proposition description

64

 
 

4.2.1 Trace Profiles

66

 
 

4.2.2 XOR Operator

67

 
 

4.2.3 AND Operator

67

 
 

4.2.4 Clustering Algorithm

68

4.2.4.1 Distance Measure 69

4.2.4.2 Calculate new clusters 69

4.3 Running Examples 69

4.4 Discussion 72

4.5 Conclusion 73

Conclusion and Future Works 74

Bibiography 76

v

List of Figures

1.1 Different aspects of a business process [54] 6

1.2 BPM life cycle [15] 7

1.3 A marked Petri net. 9

1.4 Petri Nets in two different states. 10
1.5 Some basic workflow templates that can be modeled using Petri Net notation. 11

1.6 A causal net example. 12

1.7 Example of some basic components, used to model a business process using

BPMN notation. 13

1.8 A Petri Net and a BPMN model that exhibit similar set of traces, annotated

with some control-flow patterns that exist in both models. 14

1.9 From event logs to models via process mining 15

1.10 Some mining results for the process perspective (a) and organizational(b,c). 18

1.11 Illustration of the main control-flow patterns. 20

1.12 Four process where different dimensions are pointed out (inspired by Fig. 2 of [28]). The (a) model represents the original process, that generates the log of Table 2.2; in this case all the dimensions are correctly highlighted; (b) is a model with a low fitness; (c) has low precision and (d) has low

generalization and structure 29

2.1 Overview of the ProM Framework. (adapted from [41]) 33

2.2 Log filter Screenshot (In ProM) 34

2.3 Process model using the Fuzzy Miner on a non filtered event log. 35

2.4 Process model after filtering the event log 36

2.5 Process model using the heuristic miner before filtering. 36

2.6 Process model using the heuristic miner after filtering. 37

2.7 Duplicate Task filter 37

2.8 Removing attributes with empty value 38

2.9 Enhanced Event Log filter 38

2.10 Time based log filter 39

2.11 alpha miner 40

2.12 Tshinghua alpha miner. 40

2.13 Process Model of the example log 41

2.14 Process Model of the example log with genetic miner 42

2.15 Descovering social networks by ProM. 42

2.16 Model view shows places in the model where problems occurred during the

log replay. 45

2.17 Analysis of the precision of a model allows to detect overgeneral parts. . 45

2.18 Structural analysis detects duplicate task that list alternative behavior and

redundant tasks. 46

2.19 The correctness of a model using Woflan tool. 46

2.20 Scheenshot of the analysis plug-in Performance Analysis with Petri net. . 47

2.21 Scheenshot of the analysis plug-in LTL Checker Plugin (Example raining)

[41] 47

2.22 BPMN example 48

2.23 Petri net translation of the BPMN in Figure 2.22 49

2.24 The mined review process model converted to a YAWL model 49

3.1 The processes within Rabobank ICT 54

3.2 Fitness against the number of traces before filtering 56

3.3 Fitness against the number of traces after filtering 57

3.4 Mined model by Alpha algorithm 59

3.5 Mined model by Genetic algorithm 59

3.6 Mined model by Heuristic miner 60

3.7 Mined model by Alpha algorithm case 2 60

vii

List of Tables

3.8 Mined model by Genetic algorithm case 2 61

3.9 Mined model by Heuristic miner case 2 61

3.10 Average Structural precision and Behavioral precision values 62

3.11 Average Structural recall and Behavioral recall values 63

4.1 Our clustering approach overview 66

viii

List of Tables

1.1

An event log (audit trail).

16

1.2

Example of log traces, generated from the executions of the process pre-

 
 

sented in (Figure 1.12 (a))

28

2.1

Example of an event log with 3 process instances, for the process of patient care in a hospital (A: Register patient, B: Contact family doctor, C: Treat

 
 

patient, D: Give prescription to the patient, E: Discharge patient)

41

3.1

Extract of the event log

54

3.2

A fragment of the second event log: each line corresponds to an event. . . .

55

3.3

Conformance checker between the first example and its mined models . . .

60

3.4

Conformance checker between IM0047050 case and its mined models. . . .

61

4.1

Interpretation of XOR results

67

4.2

Interpretation of AND results

68

4.3

Example process logs (A: Receive a item and repair request, B: Check the item, C: Check the warranty, D: Notify the customer, E: Repair the item, F: Test the repaired product, G: Issue payment, H: Send the cancellation

 
 

letter, I: Return the item)

70

4.4

traces profiles for the example log from 4.3

70

4.5

Metrics values

70

4.6

Metrics values

71

4.7

Event logs form [39]

71

4.8

Metrics values for the second event log

71

ix

List of Tables

4.9 Metrics values 71

4.10 Metrics values 72

1

sommaire suivant






Bitcoin is a swarm of cyber hornets serving the goddess of wisdom, feeding on the fire of truth, exponentially growing ever smarter, faster, and stronger behind a wall of encrypted energy








"L'ignorant affirme, le savant doute, le sage réfléchit"   Aristote