𝔖 Scriptorium
✩   LIBER   ✩

Hypothesis Generation and Interpretation: Design Principles and Patterns for Big Data Applications (Studies in Big Data, 139)

✍ Scribed by Hiroshi Ishikawa


Publisher: Springer
Year: 2024
Tongue: English
Leaves: 380
Category: Library

✩ Synopsis


This book focuses in detail on data science and data analysis and emphasizes the importance of data engineering and data management in the design of big data applications. The author uses patterns discovered in a collection of big data applications to provide design principles for hypothesis generation, integrating big data processing and management, machine learning and data mining techniques.

The book proposes and explains innovative principles for interpreting hypotheses by integrating micro-explanations (those based on the explanation of analytical models and individual decisions within them) with macro-explanations (those based on applied processes and model generation). Practical case studies are used to demonstrate how hypothesis-generation and -interpretation technologies work. These are based on “social infrastructure” applications like in-bound tourism, disaster management, lunar and planetary exploration, and treatment of infectious diseases.

The novel methods and technologies proposed in Hypothesis Generation and Interpretation are supported by the incorporation of historical perspectives on science and an emphasis on the origin and development of the ideas behind their design principles and patterns.

Academic investigators and practitioners who work on the further development and application of hypothesis generation and interpretation in big data computing will find this book of considerable interest, as will readers with backgrounds in data science and engineering or in the study of problem solving and scientific methods, and those who employ these ideas in fields like machine learning.

✩ Table of Contents


Preface
Contents
1 Basic Concept
1.1 Big Data
1.1.1 Big Data in the 5G Era
1.1.2 Characteristics of Big Data
1.1.3 Society 5.0
1.1.4 5G
1.2 Key Concepts of Big Data Analysis
1.2.1 Interaction Between Real-World Data and Social Data
1.2.2 Universal Key
1.2.3 Ishikawa Concept
1.2.4 Single Event Data and Single Data Source
1.2.5 Process Flow of Big Data Analysis
1.3 Big Data’s Vagueness Revisited
1.3.1 Issues
1.3.2 Integrated Data Model Approach
1.4 Hypothesis
1.4.1 What Is Hypothesis?
1.4.2 Hypothesis in Data Analysis
1.4.3 Hypothesis Generation
1.4.4 Hypothesis Interpretation
1.5 Design Principle and Design Pattern
1.6 Notes on the Cloud
1.7 Big Data Applications
1.7.1 EBPM
1.7.2 Users of Big Data Applications
1.8 Design Principles and Design Patterns for Efficient Big Data Processing
1.8.1 Use of Tree Structures
1.8.2 Reuse of Results of Subproblems
1.8.3 Use of Locality
1.8.4 Data Reduction
1.8.5 Online Processing
1.8.6 Parallel Processing
1.8.7 Function and Problem Transformation
1.9 Structure of This Book
References
2 Hypothesis
2.1 What Is Hypothesis?
2.1.1 Definition and Properties of Hypothesis
2.1.2 Life Cycle of Hypothesis
2.1.3 Relationship of Hypothesis with Theory and Model
2.1.4 Hypothesis and Data
2.2 Research Questions as Hints for Hypothesis Generation
2.3 Data Visualization
2.3.1 Low-Dimensional Data
2.3.2 High-Dimensional Data
2.3.3 Tree and Graph Structures
2.3.4 Time and Space
2.3.5 Statistical Summary
2.4 Reasoning
2.4.1 Philosophy of Science and Hypothetico-Deductive Method
2.4.2 Deductive Reasoning
2.4.3 Inductive Reasoning
2.4.4 Generalization and Specialization
2.4.5 Plausible Reasoning
2.5 Problem Solving
2.5.1 Problem Solving of Pólya
2.5.2 Execution Means for Problem Solving
2.5.3 Examples of Problem Solving
2.5.4 Unconscious Work
References
3 Science and Hypothesis
3.1 Kepler Solving Problems
3.1.1 Brahe’s Data
3.1.2 Obtaining Orbit Data from Observation Data (Task 1)
3.1.3 Deriving Kepler’s First Law (Task 2)
3.2 Galileo Conducting Experiments
3.2.1 Galileo’s Law of Free Fall
3.2.2 Thought Experiments
3.2.3 Galileo’s Law of Inertia
3.2.4 Galileo’s Principle of Relativity
3.3 Newton Seeking After Universality
3.3.1 Reasoning Rules
3.3.2 Three Laws of Motion
3.3.3 The Universal Law of Gravitation
3.4 Darwin Observing Nature
3.4.1 Theory of Evolution
3.4.2 Population Growth Model
3.4.3 Fibonacci Sequence Revisited
3.4.4 Logistic Model
References
4 Regression
4.1 Basics of Regression
4.1.1 Ceres Orbit Prediction
4.1.2 Method of Least Squares
4.1.3 From Regression to Orthogonal Regression to Principal Component Analysis
4.1.4 Nonlinear Regression
4.1.5 From Regression to Sparse Modeling
4.2 From Regression to Correlation to Causality
4.2.1 Genetics and Statistics
4.2.2 Galton
4.2.3 Karl Pearson
4.2.4 Neyman and Gosset
4.2.5 Wright
4.2.6 Spearman
4.2.7 Nightingale
4.2.8 Mendel
4.2.9 Hardy–Weinberg Equilibrium
4.2.10 Fisher
References
5 Machine Learning and Integrated Approach
5.1 Clustering
5.1.1 Definition and Brief History of Clustering
5.1.2 Clustering Based on Partitioning
5.1.3 Hierarchical Clustering
5.1.4 Evaluation of Clustering Results
5.1.5 Advanced Clustering
5.2 Association Rule Mining
5.2.1 Applications
5.2.2 Basic Concept
5.2.3 Overview of Apriori Algorithm
5.2.4 Generation of Association Rule
5.3 Artificial Neural Network and Deep Learning
5.3.1 Cross-Entropy and Gradient Descent
5.3.2 Biological Neurons
5.3.3 Artificial Neural Network
5.3.4 Classification
5.3.5 Deep Learning
5.4 Integrated Hypothesis Generation
5.5 Data Structures
5.5.1 Hierarchy
5.5.2 Graph and Network
5.5.3 Digital Ecosystem
References
6 Hypothesis Generation by Difference
6.1 Difference-Based Method for Hypothesis Generation
6.1.1 Classification of Difference-Based Methods
6.1.2 Difference Operations
6.2 Difference in Time
6.2.1 Analysis of Time Series Data
6.2.2 Time Difference: Case of Discovery of Satisfactory Spot
6.2.3 Time Difference: Case of Tankan of BOJ
6.2.4 Difference in Differences: Case of Effect of New Drug
6.2.5 Time Series Model: Smoothing and Filtering
6.2.6 Multiple Moving Averages: Case of Estimating Best Time to View Cherry Blossoms
6.2.7 Exponential Smoothing: Case of Detecting Local Trending Spots
6.2.8 Nested Moving Averages: Case of El Niño–Southern Oscillation
6.2.9 Time Series Forecasting
6.2.10 MQ-RNN
6.2.11 Difference Equation
6.3 Differences in Space
6.3.1 Image with Time Difference
6.3.2 Difference Analysis of Medical Images
6.3.3 Difference Analysis of Topographic Data
6.3.4 Difference in Lunar Surface Images: Case of Discovery of Newly Created Lunar Craters
6.3.5 Image Processing
6.4 Differences in Conceptual Space
6.4.1 Case of Creating the Essential Meaning of Concept
6.4.2 Case of International Cuisine Notation by Analogy
6.5 Difference Between Hypotheses
6.5.1 Case of Discovery of Candidate Installation Sites for Free Wi-Fi Access Point
6.5.2 Case of Analyzing Influence of Weather on Tourist Behavior
6.5.3 GWAS
References
7 Methods for Integrated Hypothesis Generation
7.1 Overview of Integrated Hypothesis Generation Methods
7.1.1 Hypothesis Join
7.1.2 Hypothesis Intersection
7.1.3 Hypothetical Union
7.1.4 Ensemble Learning
7.2 Hypothesis Join: Case of Detection of High-Risk Paths During Evacuation
7.2.1 Background
7.2.2 Proposed System
7.2.3 Experiments and Considerations
7.3 Hypothesis Intersection: Case of Detection of Abnormal Automobile Vibration
7.3.1 Background
7.3.2 Proposed Method
7.3.3 Experiments
7.3.4 Considerations
7.4 Hypothesis Intersection: Case of Identification of Central Peak Crater
7.4.1 Introduction
7.4.2 Proposed Method
7.4.3 Experiments
References
8 Interpretation
8.1 Necessity to Interpret and Explain Hypothesis
8.2 Explanation in the Philosophy of Science
8.2.1 Deductive Nomological Model of Explanation
8.2.2 Statistical Relevance Model of Explanation
8.2.3 Causal Mechanical Model of Explanation
8.2.4 Unificationist Model of Explanation
8.2.5 Counterfactual Explanation
8.3 Subjects and Types of Explanation
8.3.1 Subjects of Explanation
8.3.2 Types of Explanation
8.4 Subjects of Explanation Explained
8.4.1 Data Management
8.4.2 Data Analysis
8.5 Model-Dependent Methods for Explanation
8.5.1 How to Generate Data (HD)
8.5.2 How to Generate Hypothesis (HH)
8.5.3 What Features of Hypothesis (WF)
8.5.4 What Reason for Hypothesis (WR)
8.6 Model-Independent Methods of Explanation
8.6.1 LIME
8.6.2 Kernel SHAP
8.6.3 Counterfactual Explanation
8.7 Reference Architecture for Explanation Management
8.8 Overview of Case Studies
8.9 Case of Discovery of Candidate Installation Sites for Free Wi-Fi Access Point
8.9.1 Overview
8.9.2 Two Hypotheses
8.9.3 Explanation of Integrated Hypothesis
8.9.4 Experiments and Considerations
8.10 Case of Classification of Deep Moonquakes
8.10.1 Overview
8.10.2 Features for Analysis
8.10.3 Balanced Random Forest
8.10.4 Experimental Settings
8.10.5 Experimental Results
8.10.6 Considerations
8.11 Case of Identification of Central Peak Crater
8.11.1 Overview
8.11.2 Integrated Hypothesis
8.11.3 Explanation of Results
8.12 Case of Exploring Basic Components of Scientific Events
8.12.1 Overview
8.12.2 Data Set
8.12.3 Network Configuration and Algorithms
8.12.4 Visualization of Judgment Evidence by Grad-CAM
8.12.5 Experiments to Confirm Important Features
8.12.6 Seeking Basic Factors
References
Index


📜 SIMILAR VOLUMES


Towards the Integration of IoT, Cloud an
✍ Vinay Rishiwal (editor), Pramod Kumar (editor), Anuradha Tomar (editor), Priyan 📂 Library 📅 2023 🏛 Springer 🌐 English

This book discusses integration of internet of things (IoT), cloud computing, and big data. It presents a unique platform where IoT, cloud computing, and big data are fused together and can be foreseen as a perfect solution to many applications. Usually, IoT, cloud computing, and big data a

Data Analytics and Computational Intelli
✍ Gilberto Rivera (editor), Laura Cruz-Reyes (editor), Bernabé Dorronsoro (editor) 📂 Library 📅 2023 🏛 Springer 🌐 English

In the age of transformative artificial intelligence (AI), which has the potential to revolutionize our lives, this book provides a comprehensive exploration of successful research and applications in AI and data analytics. Covering innovative approaches, advanced algorit

Big Data and Blockchain for Service Oper
✍ Ali Emrouznejad (editor), Vincent Charles (editor) 📂 Library 📅 2022 🏛 Springer 🌐 English

This book aims to provide the necessary background to work with big data blockchain by introducing some novel applications in service operations for both academics and interested practitioners, and to benefit society, industry, academia, and government. Presenting applications in a variety

Modeling and Processing for Next-Generat
✍ Fatos Xhafa, Leonard Barolli, Admir Barolli, Petraq Papajorgji (eds.) 📂 Library 📅 2015 🏛 Springer International Publishing 🌐 English

This book covers the latest advances in Big Data technologies and provides the readers with a comprehensive review of the state-of-the-art in Big Data processing, analysis, analytics, and other related topics. It presents new models, algorithms, software solutions and methodologies, covering t

Big Data Management: Data Governance Pri
✍ Peter Ghavami 📂 Library 📅 2020 🏛 De Gruyter 🌐 English

Data analytics is core to business and decision making. The rapid increase in data volume, velocity and variety offers both opportunities and challenges. While open source solutions to store big data, like Hadoop, offer platforms for exploring value and insight from big da