Guide to Big Data Applications

โœ Scribed by Srinivasan, S(Editor)


Publisher: Springer
Year: 2017;2018
Tongue: English
Leaves: 567
Series: Studies in Big Data 26
Category: Library

✦ Synopsis


This handbook brings together a variety of approaches to the uses of big data in multiple fields, primarily science, medicine, and business. In this single resource, researchers from around the world and from a variety of fields share their findings and experience. The book is intended to help spur further innovation in big data. The research is presented so that readers, regardless of their field of study, can learn how applications have proven successful and how similar applications could be used in their own field. Contributions come from researchers in fields such as physics, biology, energy, healthcare, and business. The contributors also discuss important topics such as fraud detection, privacy implications, legal perspectives, and the ethical handling of big data.

✦ Table of Contents


Foreword......Page 7
Preface......Page 9
Acknowledgements......Page 12
Contents......Page 13
List of Reviewers......Page 15
Part I General......Page 16
1.1 Introduction......Page 17
1.1.3 Better Customer Relationships......Page 18
1.2 From Value Disciplines to Digital Disciplines......Page 19
1.2.2 Solution Leadership......Page 20
1.2.4 Accelerated Innovation......Page 21
1.3.1 Real-Time Process and Resource Optimization......Page 22
1.3.3 Digital-Physical Substitution and Fusion......Page 24
1.3.4 Exhaust-Data Monetization......Page 25
1.3.6 Beyond Business......Page 26
1.4.1 Digital-Physical Mirroring......Page 27
1.4.3 Product/Service Usage Optimization......Page 28
1.4.6 Long-Term Product Improvement......Page 29
1.4.9 Transformations......Page 30
1.5 Collective Intimacy......Page 31
1.5.3 Recommendations......Page 33
1.5.5 Beyond Business......Page 34
1.6 Accelerated Innovation......Page 35
1.6.2 Contest Economics......Page 36
1.6.4 Beyond Business......Page 37
1.7 Integrated Disciplines......Page 38
References......Page 39
2.1 Introduction......Page 42
2.2 Information Privacy Defined......Page 43
2.2.1 Is It Personally Identifiable Information?......Page 44
2.3 Big Data: Understanding the Challenges to Privacy......Page 46
2.3.1 Big Data: The Antithesis of Data Minimization......Page 48
2.3.2 Predictive Analysis: Correlation Versus Causation......Page 49
2.3.3 Lack of Transparency/Accountability......Page 50
2.4 Privacy by Design and the 7 Foundational Principles......Page 51
2.4.1 The 7 Foundational Principles......Page 53
2.5.1 Being Proactive About Privacy Through Prevention......Page 54
2.5.2 Data Minimization as the Default Through De-identification......Page 55
2.5.3 Embedding Privacy at the Design Stage......Page 56
2.5.4 Aspire for Positive-Sum Without Diminishing Functionality......Page 57
2.6 Conclusion......Page 58
References......Page 59
3.1 Introduction......Page 62
3.2.1.1 Server/Client Architecture......Page 66
3.2.1.2 Decentralized Architecture......Page 68
3.2.2.2 Alternating Direction Method of Multipliers......Page 69
3.3.1 Applications Based on the Newton-Raphson Method......Page 71
3.3.2 Applications Based on ADMM......Page 75
3.3.2.1 Regression......Page 76
3.3.2.2 Classification......Page 77
3.3.2.3 Convergence and Robustness for Decentralized Data Analysis......Page 78
3.4.1 Regression......Page 79
3.4.2 Classification......Page 81
3.4.3 Evaluation......Page 85
3.5 Asynchronous Optimization......Page 86
3.5.1 Asynchronous Optimization Based on Fixed-Point Algorithms......Page 87
3.5.3 Asynchronous Alternating Direction Method of Multipliers......Page 88
3.6 Discussion and Conclusion......Page 89
References......Page 91
4.1 Introduction......Page 96
4.2.1 Background......Page 98
4.2.2 Neural Network Language Model......Page 99
4.2.2.1 Restricted Boltzmann Machine (RBM)......Page 102
4.2.2.2 Recurrent and Recursive Neural Network......Page 103
4.2.2.3 Convolutional Neural Network (CNN)......Page 106
4.2.2.4 Hierarchical Neural Language Model (HNLM)......Page 107
4.2.3 Sparse Coding Approach......Page 108
4.2.4 Evaluations of Word Embedding......Page 110
4.3 Word Embedding Applications......Page 111
4.4 Conclusion and Future Work......Page 113
References......Page 114
Part II Applications in Science......Page 118
5.1 Introduction......Page 119
5.2.1 Sample Applications......Page 121
5.3.1 RapidMiner......Page 123
5.4.1 Background......Page 124
5.4.2 Dataset......Page 125
5.4.3 Data Analysis......Page 126
5.4.4 Summary......Page 127
5.5 Case Study II: Big Data in Environmental Microbiology......Page 129
5.5.2 Genome Dataset......Page 130
5.5.3 Analysis of Big Data......Page 131
5.6 Discussion and Future Work......Page 132
References......Page 134
6.1 Introduction......Page 137
6.2.1.2 Ingestion......Page 138
6.2.1.4 Batch Event Processing......Page 139
6.3 High-Performance and Big Data Deployment Types......Page 140
6.3.2 Cloud-Based Hardware Deployment......Page 141
6.3.5 Summary......Page 142
6.4.1.1 Data Ingestion Stacks......Page 143
6.4.1.4 Indexing and Querying Engine......Page 145
6.4.2.1 Data Ingestion Cluster......Page 146
6.4.2.3 Data Stores......Page 148
6.4.2.5 Batch-Processing Frameworks......Page 149
6.4.2.6 Interoperability Between Frameworks......Page 150
6.4.2.7 Summary......Page 151
6.4.4.1 Software Defined Infrastructure (SDI)......Page 152
6.4.4.3 Intelligent Software for Performance Management......Page 154
6.5 Designing Data Pipelines for High Performance......Page 155
6.6 Conclusions......Page 156
References......Page 157
7.1 Introduction......Page 160
7.2 Improve Classification Accuracy of Bayesian Inversion Through Big Data Learning......Page 161
7.2.1 Bayesian Inversion and Measurement Errors......Page 162
7.2.2 Bayesian Graphical Model......Page 163
7.2.3 Statistical Inference for the Gaussian Mixture Model......Page 166
7.2.3.1 Distributed Markov Chain Monte Carlo for Big Data......Page 168
7.2.4 Tests from Synthetic Well Integrity Logging Data......Page 169
7.3 Proactive Geosteering and Formation Evaluation......Page 171
7.3.1.1 The Deterministic Inversion Method......Page 173
7.3.1.2 The Statistical Inversion Method......Page 175
7.3.2 Hamiltonian Monte Carlo and MapReduce......Page 177
7.3.3.1 Tests from Synthetic Data......Page 178
7.3.3.2 Test from Field Data......Page 180
References......Page 182
8.1 Introduction......Page 185
8.2 The Value of Big Data for the Petroleum Industry......Page 186
8.2.1.1 SaaS (Software as a Service)......Page 187
8.2.1.3 Reduced IT Support......Page 188
8.2.2 Collaboration......Page 189
8.2.2.1 Ideal Ecosystem......Page 190
8.2.2.2 Collaboration Enhancement by Universal Cloud-based Database Organization......Page 191
8.2.2.4 Collaboration Enhancement by Low Cost Subscription Model......Page 192
8.2.3.1 Workflow Variety......Page 193
8.3.1 Big Data......Page 194
8.3.3 Browser......Page 195
8.4.1 Past Steps......Page 196
8.4.2.1 What Is GeoFit?......Page 197
8.4.2.2 Why Use a Universal Cloud-Based Hybrid Database?......Page 198
8.4.2.3 Workflow Engine......Page 200
8.4.2.5 Viewing......Page 201
8.4.3.1 Project Structure......Page 203
8.5 Future of Tools, Example......Page 204
8.6.1.1 Planning Storage......Page 206
8.6.1.2 Planning CPU Capacity......Page 207
8.6.1.3 Planning Memory Requirements......Page 208
8.6.2 Eventual Consistency......Page 209
8.6.3 Fault Tolerance......Page 210
8.7 Big Data is the future of O&G......Page 212
References......Page 213
9.1 Introduction......Page 215
9.1.1 Organization of the Chapter and Summary of Results......Page 218
9.2 A Brief Review of the Statistics of Friendship Paradoxes: What are Strong Paradoxes, and Why Should We Measure Them?......Page 219
9.2.2 What Does Feld's Argument Imply?......Page 220
9.2.4 Beyond Random-Wiring Assumptions: Why Weak and Strong Friendship Paradoxes are Ubiquitous in Undirected Networks......Page 222
9.2.6 Strong Degree-Based Paradoxes in Directed Networks and Strong Generalized Paradoxes are Nontrivial......Page 223
9.3.1 Definition of the Network and Core Questions......Page 224
9.3.2 All Four Degree-Based Paradoxes Occur in the Quora Follow Network......Page 225
9.3.3 Anatomy of a Strong Degree-Based Paradox in Directed Networks......Page 227
9.3.4 Summary and Implications......Page 229
9.4.1 What are Upvotes and Downvotes?......Page 230
9.4.2 The Downvoting Network and the Core Questions......Page 231
9.4.3 The Downvoting Paradox is Absent in the Full Downvoting Network......Page 232
9.4.4 The Downvoting Paradox Occurs When The Downvotee and Downvoter are Active Contributors......Page 235
9.4.5 Does a "Content-Contribution Paradox" Explain the Downvoting Paradox?......Page 238
9.4.6 Summary and Implications......Page 241
9.5.1 Content Dynamics in the Follow Network......Page 242
9.5.2 Core Questions and Methodology......Page 245
9.5.3 Demonstration of the Existence of the Paradox......Page 247
9.5.4 Can We Measure the Impact of the Paradox?......Page 249
9.5.5 Summary and Implications......Page 250
9.6 Conclusion......Page 251
References......Page 253
10.1 Context and Motivation......Page 255
10.2.2 Deduplication Level......Page 256
10.2.4 Client- vs Server-Side Deduplication......Page 257
10.4 Secure Image Deduplication Through Image Compression......Page 258
10.6 Proposed Image Deduplication Scheme......Page 259
10.6.1 Image Compression......Page 260
10.6.2 Partial Encryption of the Compressed Image......Page 261
10.6.3 Image Hashing from the Compressed Image......Page 262
10.6.4.1 Experimental Settings......Page 263
10.6.4.2 Deduplication Analysis......Page 264
10.6.4.3 Performance Analysis......Page 267
10.6.5 Security Analysis......Page 268
10.7 Secure Video Deduplication Scheme in Cloud Storage Environment Using H.264 Compression......Page 269
10.9 Proposed Video Deduplication Scheme......Page 270
10.9.1 H.264 Video Compression......Page 271
10.9.2 Signature Generation from the Compressed Videos......Page 272
10.9.3 Selective Encryption of the Compressed Videos......Page 273
10.9.4 Experimental Results......Page 275
10.9.5 Security Analysis......Page 278
10.10 Chapter Summary......Page 279
References......Page 280
11.1 Introduction......Page 282
11.2 Searchable Encryption Models......Page 283
11.3 Text......Page 284
11.3.1.1 Encrypted Indexes......Page 285
11.3.1.2 Bloom Filters......Page 287
11.3.2 Private/Public Search Scheme: Cloud Document Storage Extended......Page 288
11.3.3 Public/Private Search Scheme: Email Filtering System......Page 289
11.3.3.1 Identity-Based Encryption (IBE)......Page 290
11.3.3.2 An IBE-Based Secure Email Filtering System......Page 291
11.3.4 Public/Public Search Scheme: Delegated Investigation of Secured Audit Logs......Page 292
11.3.4.1 Symmetric Scheme......Page 293
11.3.4.2 Asymmetric Scheme......Page 294
11.4.1 Order Preserving Encryption......Page 295
11.5.1 Keyword Based Media Search......Page 296
11.5.2 Content Based Media Search......Page 297
11.5.2.1 Media Search Using OPE......Page 298
11.5.2.2 Homomorphic Encryption......Page 299
11.6 Other Applications......Page 300
11.7 Conclusions......Page 301
References......Page 302
12.1.1 Motivations of Big-Data Enabled Structural Health Monitoring......Page 303
12.1.2 Overview of the Proposed MS-SHM-Hadoop......Page 305
12.1.3 Organization of This Chapter......Page 306
12.2.1 Infrastructure of MS-SHM-Hadoop......Page 307
12.2.2 Flowchart of the MS-SHM-Hadoop......Page 309
12.3 Acquisition of Sensory Data and Integration of Structure-Related Data......Page 310
12.4.1 The Features Used in Nationwide Civil Infrastructure Survey......Page 312
12.4.2.1 Overview of Machine-Learning-Enabled Life-Expectancy Estimation......Page 315
12.4.2.2 Weibull Linear Regression Model......Page 316
12.4.2.5 Deep Learning Models......Page 317
12.4.3.3 Dimensionality Reduction......Page 319
12.5 Global Structural Integrity Analysis......Page 320
12.5.2.1 Data Query for the Measured Global Characteristics of the Targeted Civil Infrastructure......Page 322
12.5.2.3 Assessment of the Integrity Level of Civil Infrastructure......Page 324
12.6.1 Deep-Learning-Enabled Component Reliability Analysis......Page 325
12.6.2 Probe Prolongation Strategies via Simulating Crack Initialization and Growth......Page 327
12.7 Civil Infrastructure's Reliability Analysis Based on Bayesian Network......Page 328
References......Page 329
Part III Applications in Medicine......Page 334
13.1 Introduction......Page 335
13.2.1 Epilepsy......Page 338
13.2.2 Parallel Computing......Page 341
13.2.3 Challenges......Page 342
13.2.4 Current State of Art......Page 343
13.3 Nonlinear Dynamical Systems with Chaos......Page 344
13.4 Lyapunov Exponents......Page 349
13.4.1 Numerical Computation of Lyapunov Exponents......Page 350
13.4.1.1 The Wolf Algorithm......Page 351
13.4.1.2 The Rosenstein and Kantz Algorithm......Page 353
13.5 Rapid Prototyping HPCmatlab Platform......Page 354
13.5.1 Bigdata and Matlab......Page 355
13.5.2 Parallel Computing and HPCmatlab......Page 356
13.5.2.1 Parallel Computation of Lyapunov Exponents......Page 357
13.5.2.2 Epilepsy as a Big Data Problem......Page 359
13.6 Case Study: Epileptic Seizure Prediction and Control......Page 360
13.7 Future Research Opportunities and Conclusions......Page 367
Appendix 1: Electrical Stimulation and Experiment Setup......Page 369
Appendix 2: Preparation of Animals......Page 370
References......Page 371
14.1 Introduction......Page 376
14.2 Capture Reliably......Page 379
14.2.2 Data Sparsity......Page 380
14.2.3 Feature Selection......Page 381
14.2.4 State-of-the-Art and Novel Algorithms......Page 383
14.2.5 Physiological Precision and Stratified Medicine......Page 384
14.2.6 The Green Button: A Case Study on Capture Reliably Principle of the CAPE Roadmap......Page 385
14.3.1 Networks Medicine......Page 386
14.3.3 Agent Based Models......Page 388
14.3.4 Prevalence of Symptoms on a Single Indian Healthcare Day on a Nationwide Scale (POSEIDON): A Case Study on Approach Systemically Principle of the CAPE Roadmap......Page 389
14.4 Phenotype Deeply......Page 391
14.4.2 Heart Rate Variability: A Case Study on Phenotype Deeply Principle of the CAPE Roadmap......Page 392
14.5 Enable Decisions......Page 397
14.5.1 Causal Modeling Through Bayesian Networks......Page 398
14.5.4 SAFE-ICU Initiative: A Full Spectrum Case Study in Biomedical Data-Science......Page 399
References......Page 401
15.1 Introduction......Page 405
15.2.1 Data Description and Preparation......Page 406
15.2.2 Data Preparation Process for Sequence Analysis......Page 408
15.3 Results......Page 409
15.3.1 Top 20 Diseases in TUD and Non-TUD Patients......Page 411
15.3.2 Top 20 Diseases in TUD Patients and Corresponding Prevalence in Non-TUD Patients......Page 413
15.3.3 Top 15 Comorbidities in TUD Patients Across Two Hospital Visits (Second Iteration) and Corresponding Prevalence in Non-TUD Patients......Page 414
15.3.4 Comorbidities in TUD Patients Across Three Hospital Visits (Third Iteration) and Comparison with Non-TUD Patients......Page 415
15.4 Discussion and Concluding Remarks......Page 416
References......Page 417
16 The Impact of Big Data on the Physician......Page 418
16.1.1 Defining Quality Care......Page 419
16.1.2 Choosing the Best Doctor......Page 420
16.1.3 Choosing the Best Hospital......Page 421
16.1.5 What is mHealth?......Page 423
16.1.6 mHealth from the Provider Side......Page 424
16.1.7 mHealth from the Patient Side......Page 425
16.1.9 Accessibility......Page 428
16.1.10 Privacy and Security......Page 429
16.1.11 Regulation and Liability......Page 430
16.1.12 Patient Education and Partnering with Patients......Page 431
16.1.13 Developing Online Health Communities Through Social Media: Creating Data that Fuels Research......Page 432
16.1.15 Beyond the Package Insert: Iodine.com......Page 434
16.1.16 Data Inspires the Advent of New Models of Medical Care Delivery......Page 436
16.2.1.1 Imagine the Following Scenario......Page 437
16.2.2 Probabilistic Systems......Page 439
16.2.4 Data Driven Approaches......Page 440
16.2.5 Challenges and Areas of Exploration......Page 442
16.2.6 Using Big Data to Improve Treatment Options......Page 443
References......Page 447
Part IV Applications in Business......Page 452
17.1 Introduction......Page 453
17.2.1 The Nature of Bank Information Activities......Page 454
17.2.1.1 The Corporate Culture of Banks......Page 457
17.2.2 Analytical Information Needs in Banks......Page 458
17.3.1 Tradeoffs of Big Data Analytics......Page 459
17.3.2 Data Collection and Integration......Page 460
17.3.3 Data Quality Challenges......Page 462
17.3.4 Discovery and Detection......Page 463
17.3.4.1 Risks......Page 465
17.3.4.2 Customers......Page 466
17.3.5 Big Data Approaches in Banking Supervision......Page 468
17.4 Big Data Analysis Methods and Instruments in Banking......Page 471
17.4.1 Big Data Processing Methods: Data Mining, Text Mining, Machine Learning, Visualization......Page 472
17.4.1.1 Visualization......Page 474
17.4.2 Analytical Platforms......Page 475
17.4.3 Examples of Fraud Detection......Page 478
17.5 Managerial Implications......Page 484
References......Page 486
18.1 Introduction......Page 489
18.2 Multi-Touch Attribution (MTA)......Page 490
18.3 Granular Audience Targeting......Page 492
18.4 Forecasting......Page 495
18.5 Predictive Analytics......Page 498
18.6 Content Marketing......Page 500
18.7 Weaving Big Data in Applications......Page 502
18.8 Summary......Page 503
References......Page 504
19.2.1 The Rise of Social Media Data......Page 505
19.2.2 The Quick Service Restaurant Sector......Page 506
19.3 Analysis of Numeric and Text Reviews in Yelp......Page 507
19.3.1 Description of Numeric Ratings......Page 508
19.3.2 Comparison Between Non-Franchise and Franchise Locations......Page 509
19.3.4 Analysis of All U.S. Locations......Page 513
19.3.5 Description of Text Reviews......Page 514
19.3.6 Caution Related to Analysis of Reviews......Page 516
19.4 Guide to Using R to Extract Yelp Data......Page 517
References......Page 523
Author Biographies......Page 525
Index......Page 555


📜 SIMILAR VOLUMES


Guide to Big Data Applications
โœ Srinivasan S. (ed.) ๐Ÿ“‚ Library ๐Ÿ“… 2018 ๐Ÿ› Springer ๐ŸŒ English

<p>This handbook brings together a variety of approaches to the uses of big data in multiple fields, primarily science, medicine, and business. This single resource features contributions from researchers around the world from a variety of fields, where they share their findings and experience. This

Guide to Big Data Applications
โœ S. Srinivasan ๐Ÿ“‚ Library ๐Ÿ“… 2017 ๐Ÿ› Springer ๐ŸŒ English

This handbook brings together a variety of approaches to the uses of big data in multiple fields, primarily science, medicine, and business. This single resource features contributions from researchers around the world from a variety of fields, where they share their findings and experience. This bo

Guide to Big Data Applications (ed.)
โœ S. Srinivasan ๐Ÿ“‚ Library ๐Ÿ“… 2017 ๐Ÿ› Springer ๐ŸŒ English

<p>This handbook brings together a variety of approaches to the uses of big data in multiple fields, primarily science, medicine, and business. This single resource features contributions from researchers around the world from a variety of fields, where they share their findings and experience. This

Analytics in a Big Data World: The Essen
โœ Bart Baesens [Bart Baesens] ๐Ÿ“‚ Library ๐Ÿ“… 2014 ๐Ÿ› John Wiley & Sons ๐ŸŒ English

<span><span><p><b>The guide to targeting and leveraging business opportunities using big data &amp; analytics</b></p><p>By leveraging big data &amp; analytics, businesses create the potential to better understand, manage, and strategically exploiting the complex dynamics of customer behavior. <em>An

Analytics in a Big Data World: The Essen
โœ Bart Baesens ๐Ÿ“‚ Library ๐Ÿ“… 2014 ๐Ÿ› Wiley ๐ŸŒ English

The guide to targeting and leveraging business opportunities using big data & analytics <p>By leveraging big data & analytics, businesses create the potential to better understand, manage, and strategically exploiting the complex dynamics of customer behavior. Analytics in a Big Data World reveals

Analytics in a Big Data World. The Esse
โœ Bart Baesens ๐Ÿ“‚ Library ๐Ÿ“… 2014 ๐Ÿ› Wiley ๐ŸŒ English

The guide to targeting and leveraging business opportunities using big data & analytics<br>By leveraging big data & analytics, businesses create the potential to better understand, manage, and strategically exploiting the complex dynamics of customer behavior. Analytics in a Big Data World reveals h