Effective decision support systems (DSS) are quickly becoming key to businesses gaining a competitive advantage, and the effectiveness of these systems depends on the ability to construct, maintain, and extract information from data warehouses. While many still perceive data warehousing as a subd
Intelligent data warehousing: from data preparation to data mining
By Zhengxin Chen
- Publisher: CRC Press
- Year: 2002
- Language: English
- Pages: 228
- Category: Library
Table of Contents
Intelligent Data Warehousing: From Data Preparation to Data Mining
About the author
Contents
Part I
1.1 Why this book is needed
1.3 Why intelligent data warehousing
1.4 Organization of the book
1.5 How to use this book
References
2.2 Data warehousing and enterprise intelligence
2.3.1 Prehistory
2.3.2 Stage 1: early 1990s
2.4 Basic elements of data warehousing
2.5.1 World Wide Web and e-commerce
2.5.2 Data Webhouse
2.5.3 Ontologies and semantic Web
2.6.1 Artificial intelligence as construction of intelligent agents
2.6.2 State space search and knowledge representation
2.6.4 Symbol-based machine learning
2.6.5 Genetic algorithms
2.7.1 Integration of database and knowledge-based systems
2.7.2 The role of AI in warehousing
2.7.3 Java and agent technology
2.8.1 What can be analyzed using intelligent data analysis
2.8.2.1 Background
2.8.2.3 Mining Web data
2.8.2.4 Other issues of Web mining
2.8.3.2 Clickstream data mart
2.9 The future of data warehouses
2.10 Summary
References
3.2.1 Data modeling
3.2.2 Relational data model
3.2.3 Integrity constraints
3.2.5 Basics of query processing
3.2.6 Basics of transaction processing
3.3.1 Basics of deductive databases
3.3.3 Distributed and parallel databases
3.3.4 Motivations of data warehousing: a technical examination
3.4.1 Operational systems and warehouse data
3.4.2 Data warehouse components
3.4.3 Data warehouse design
3.5.1 Why data marts
3.5.2 Types of data marts
3.5.4 Networked data marts
3.6 Metadata
3.7.1 Materialized views
3.7.3 Indexing using metadata
3.8.1 Measuring data warehouse performance
3.8.2 Performance and warehousing activities
3.9.1 Basics of OLAP
3.9.2 Relationship between data warehousing and OLAP
3.10 Summary
References
Part II
4.2 Schema and data integration
4.3 Data pumping
4.4 Middleware
4.5 Data quality
4.6.1 General aspects of data cleansing
4.6.2.1 Domain relevance
4.6.2.3 Multi-pass sorted neighborhood duplicate detection method
4.6.2.5 Union-find algorithms
4.6.2.6 K-way sorting method
4.7 Dealing with data inconsistency in multidatabase systems
4.8 Data reduction
4.9.1 Overview
4.9.2 Preparing the data
4.9.2.2 Data cleaning
4.9.2.5 SQL query examples
4.9.4 Resulting data
4.10 Web log file preparation
References
5.2.1 Entity-relationship (ER) modeling
5.2.2 Dimension modeling
5.3.1 An example
5.3.2 Steps in using ER model for warehousing conceptual modeling
5.3.3 Research work on conceptual modeling
5.4.1 Physical design
5.4.2 Using functional dependencies
5.4.4 Metadata management
5.4.6 Using data warehouse tools
5.4.7 User behavior modeling for warehouse design
5.4.9 Prototyping data warehouses
5.5 Data cubes
References
6.1 Overview
6.2.1 Materialization of data cubes
6.2.2.1 Hierarchies in lattice
6.2.2.2 Composite lattices for multiple, hierarchical dimensions
6.2.2.3 The cost analysis
6.3 Using a simple optimization algorithm to select views
6.4.1 Preliminaries of aggregation functions
6.4.2.1 Calculating SUM on data cube using PRE_SUM cube
6.4.2.2 Calculating COUNT
6.5 Case study: view selection for a human service data warehouse
6.5.1 Overview of the case study
6.5.3 Data model
6.5.5 Development of the OR view graph
6.5.6 Implementation
6.5.6.2 Solution encoding, reproduction, and mutation
6.5.6.3 Description of the fitness function
6.5.6.5 Penalty function
6.5.7.1 Small OR view graph
6.5.7.2 Results from a small view graph
6.5.7.3 Complete OR view graph
6.5.7.4 Results from a complete view graph
6.5.8 Summary of the case study
References
7.1 Overview
7.2.2 View selection problem
7.2.3 View data lineage problem
7.3.2 Using full and partial information for view maintenance
7.3.3 Using incremental techniques
7.3.4 Using auxiliary data and auxiliary views
7.3.6 Incremental maintenance of materialized views with duplicates
7.3.8 Views and queries
7.3.9 Unified view selection and maintenance
7.4.1 Immediate and deferred view maintenance
7.4.2 Dealing with anomalies in view maintenance
7.4.3 Concurrent updates in distributed environment
7.5.1 Integrity constraints
7.5.2 Active databases
7.6.2 Warehouse evolution
7.6.3 From static to dynamic warehouse design
7.6.4 View redefinition and adaptation
7.7.3 Online updates
7.8 Data cubes
7.9.1 Materialized views and deductive databases
7.9.2 Materialized views in object-oriented databases
7.11.1 Temporal view self-maintenance
7.11.3 Materialized views in Oracle
References
Part III
8.1 Overview
8.2.1 Categories of data mining
8.2.2 Association rule mining
8.2.3 Data classification and characterization
8.3.2 Implementing the Apriori algorithm
8.3.3 Graphical user interface
8.3.4 Analysis
8.4.1 Basics of rough set theory
8.4.2.2 Source data used in the case study
8.4.2.3 Applying RSDA on sample data
8.5 Recent progress of data mining
8.5.4 Dynamics of data mining
References
9.2 Integration of OLAP and data mining
9.3 Influential association rules
9.4 Significance of influential association rules
9.5 Reviews of algorithms for discovery of conventional association rules
9.6.1 The IARM algorithm
9.6.3 Categorization and support counting of a numeric measure
9.6.5 Pruning and refining candidate influential association rules
9.6.6 Problems of the IARM algorithm
9.7.1 Basic idea of bitmap indexing
9.7.2 Bitmap indexing in data warehouses
9.8 Mining influential association rules using bitmap indexing (IARMBM)
References
SIMILAR VOLUMES
Data mining (if you haven't heard of it before) is the "Automated Extraction of Hidden Predictive Information from Databases." This book gives step-by-step instructions for the entire data modeling process, with special emphasis on the business knowledge necessary for effective re
It reflects a real-time environment and promotes planning, managing, designing, implementing, supporting, maintaining, and analyzing data warehouses in organizations. It also covers various mining techniques as well as issues in the practical use of data mining tools. The book is designed for th
...first comprehensive guide to provide practical, step-by-step directions for designing and delivering data-warehousing and mining applications in a telecommunications environment.