𝔖 Scriptorium
✦   LIBER   ✦


Informatics and Machine Learning: From Martingales to Metaheuristics

✍ Scribed by Stephen Winters-Hilt


Publisher: Wiley
Year: 2022
Tongue: English
Leaves: 585
Edition: 1
Category: Library


✦ Synopsis


Informatics and Machine Learning

Discover a thorough exploration of how to use computational, algorithmic, statistical, and informatics methods to analyze digital data

Informatics and Machine Learning: From Martingales to Metaheuristics delivers an interdisciplinary presentation on how to analyze any data captured in digital form. The book describes how readers can conduct analyses of text, general sequential data, experimental observations over time, stock market and econometric histories, or symbolic data such as genomes. It contains large amounts of sample code to demonstrate the concepts contained within and to assist with various levels of project work.

The book offers a complete presentation of the mathematical underpinnings of a wide variety of forms of data analysis and provides extensive examples of programming implementations. It is based on two decades' worth of the distinguished author’s teaching and industry experience.

  • A thorough introduction to probabilistic reasoning and bioinformatics, including Python shell scripting to obtain data counts, frequencies, probabilities, and anomalous statistics, for use with Bayes’ rule (see the first sketch after this list)
  • An exploration of information entropy and statistical measures, including Shannon entropy, relative entropy, maximum entropy (maxent), and mutual information (second sketch below)
  • A practical discussion of ad hoc, ab initio, and bootstrap signal acquisition methods, with examples from genome analytics and signal analytics (third sketch below)
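
A minimal sketch of the first bullet's pipeline, assuming nothing beyond the Python standard library. This is illustrative code in the spirit of Chapter 2, not the book's own samples; the toy sequence, prior, and likelihood values are invented:

```python
# Counts -> frequencies -> probabilities on a symbolic stream, then a toy
# application of Bayes' rule. Not the book's sample code.
from collections import Counter

sequence = "ATGCGATACCATGATG"  # invented genomic-style symbol stream

counts = Counter(sequence)                          # enumeration / counting
total = sum(counts.values())
probs = {s: n / total for s, n in counts.items()}   # frequencies as probabilities
print("empirical probabilities:", probs)

# Bayes' rule: P(H|D) = P(D|H) P(H) / P(D), with P(D) expanded over the
# two hypotheses (here: "coding region" vs. "non-coding region").
p_h = 0.10          # prior P(coding) -- assumed for illustration
p_d_h = 0.30        # P(data | coding) -- assumed
p_d_not_h = 0.20    # P(data | non-coding) -- assumed
p_d = p_d_h * p_h + p_d_not_h * (1 - p_h)
print("posterior P(coding | data):", round(p_d_h * p_h / p_d, 3))
```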
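
A companion sketch for the second bullet: Shannon entropy, relative entropy against a uniform (maxent) reference, and mutual information computed from a toy joint distribution. All distributions here are invented for illustration:

```python
# Information measures on small discrete distributions (Chapter 3 topics).
from math import log2

p = {"A": 0.4, "T": 0.3, "G": 0.2, "C": 0.1}       # invented distribution
q = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}   # uniform maxent reference

shannon = -sum(px * log2(px) for px in p.values())           # H(p)
rel_ent = sum(px * log2(px / q[x]) for x, px in p.items())   # D(p||q) >= 0
print(f"H(p) = {shannon:.3f} bits, D(p||q) = {rel_ent:.3f} bits")

# Mutual information as I(X;Y) = D( p(x,y) || p(x)p(y) ), on a toy joint table.
joint = {("A", 0): 0.3, ("A", 1): 0.1, ("B", 0): 0.2, ("B", 1): 0.4}
px, py = {}, {}
for (x, y), pxy in joint.items():        # marginalize the joint distribution
    px[x] = px.get(x, 0.0) + pxy
    py[y] = py.get(y, 0.0) + pxy
mi = sum(pxy * log2(pxy / (px[x] * py[y])) for (x, y), pxy in joint.items())
print(f"I(X;Y) = {mi:.3f} bits")
```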
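
And for the third bullet, a heavily simplified take on time-domain signal acquisition in the spirit of the book's tFSA spike detector (Chapter 4): a boxcar (moving-average) baseline with a fixed deviation cutoff. The trace, window length, and cutoff are all invented; this is a sketch, not the book's tFSA implementation:

```python
# Threshold-based spike acquisition over a 1-D signal trace.
signal = [0.1, 0.0, 0.2, 0.1, 3.0, 0.1, 0.0, -2.5, 0.1, 0.2]  # invented trace
W = 4          # boxcar window length
CUTOFF = 1.0   # deviation from baseline required to call a spike

spikes = []
for i in range(len(signal)):
    window = signal[max(0, i - W):i] or [signal[i]]  # trailing boxcar window
    baseline = sum(window) / len(window)             # recomputed for clarity; a
                                                     # running sum gives the O(L)
                                                     # scan of Section 4.5.1
    if abs(signal[i] - baseline) > CUTOFF:
        spikes.append(i)

print("spike indices:", spikes)  # -> [4, 7]
```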

Perfect for undergraduate and graduate students in machine learning and data analytics programs, Informatics and Machine Learning: From Martingales to Metaheuristics will also earn a place in the libraries of mathematicians, engineers, computer scientists, and life scientists with an interest in those subjects.

✦ Table of Contents


Cover
Title Page
Copyright Page
Contents
Chapter 1 Introduction
1.1 Data Science: Statistics, Probability, Calculus, Python (or Perl), and Linux
1.2 Informatics and Data Analytics
1.3 FSA-Based Signal Acquisition and Bioinformatics
1.4 Feature Extraction and Language Analytics
1.5 Feature Extraction and Gene Structure Identification
1.5.1 HMMs for Analysis of Information Encoding Molecules
1.5.2 HMMs for Cheminformatics and Generic Signal Analysis
1.6 Theoretical Foundations for Learning
1.7 Classification and Clustering
1.8 Search
1.9 Stochastic Sequential Analysis (SSA) Protocol (Deep Learning Without NNs)
1.9.1 Stochastic Carrier Wave (SCW) Analysis–Nanoscope Signal Analysis
1.9.2 Nanoscope Cheminformatics–A Case Study for "Device Smartening"
1.10 Deep Learning Using Neural Nets
1.11 Mathematical Specifics and Computational Implementations
Chapter 2 Probabilistic Reasoning and Bioinformatics
2.1 Python Shell Scripting
2.1.1 Sample Size Complications
2.2 Counting, the Enumeration Problem, and Statistics
2.3 From Counts to Frequencies to Probabilities
2.4 Identifying Emergent/Convergent Statistics and Anomalous Statistics
2.5 Statistics, Conditional Probability, and Bayes' Rule
2.5.1 The Calculus of Conditional Probabilities: The Cox Derivation
2.5.2 Bayes' Rule
2.5.3 Estimation Based on Maximal Conditional Probabilities
2.6 Emergent Distributions and Series
2.6.1 The Law of Large Numbers (LLN)
2.6.2 Distributions
2.6.3 Series
2.7 Exercises
Chapter 3 Information Entropy and Statistical Measures
3.1 Shannon Entropy, Relative Entropy, Maxent, Mutual Information
3.1.1 The Khinchin Derivation
3.1.2 Maximum Entropy Principle
3.1.3 Relative Entropy and Its Uniqueness
3.1.4 Mutual Information
3.1.5 Information Measures Recap
3.2 Codon Discovery from Mutual Information Anomaly
3.3 ORF Discovery from Long-Tail Distribution Anomaly
3.3.1 Ab Initio Learning with smORFs, Holistic Modeling, and Bootstrap Learning
3.4 Sequential Processes and Markov Models
3.4.1 Markov Chains
3.5 Exercises
Chapter 4 Ad Hoc, Ab Initio, and Bootstrap Signal Acquisition Methods
4.1 Signal Acquisition, or Scanning, at Linear Order Time-Complexity
4.2 Genome Analytics: The Gene-Finder
4.3 Objective Performance Evaluation: Sensitivity and Specificity
4.4 Signal Analytics: The Time-Domain Finite State Automaton (tFSA)
4.4.1 tFSA Spike Detector
4.4.2 tFSA-Based Channel Signal Acquisition Methods with Stable Baseline
4.4.3 tFSA-Based Channel Signal Acquisition Methods Without Stable Baseline
4.5 Signal Statistics (Fast): Mean, Variance, and Boxcar Filter
4.5.1 Efficient Implementations for Statistical Tools (O(L))
4.6 Signal Spectrum: Nyquist Criterion, Gabor Limit, Power Spectrum
4.6.1 Nyquist Sampling Theorem
4.6.2 Fourier Transforms, and Other Classic Transforms
4.6.3 Power Spectral Density
4.6.4 Power-Spectrum-Based Feature Extraction
4.6.5 Cross-Power Spectral Density
4.6.6 AM/FM/PM Communications Protocol
4.7 Exercises
Chapter 5 Text Analytics
5.1 Words
5.1.1 Text Acquisition: Text Scraping and Associative Memory
5.1.2 Word Frequency Analysis: Machiavelli's Polysemy on Fortuna and Virtu
5.1.3 Word Frequency Analysis: Coleridge's Hidden Polysemy on Logos
5.1.4 Sentiment Analysis
5.2 Phrases–Short (Three Words)
5.2.1 Shakespearean Insult Generation–Phrase Generation
5.3 Phrases–Long (A Line or Sentence)
5.3.1 Iambic Phrase Analysis: Shakespeare
5.3.2 Natural Language Processing
5.3.3 Sentence and Story Generation: Tarot
5.4 Exercises
Chapter 6 Analysis of Sequential Data Using HMMs
6.1 Hidden Markov Models (HMMs)
6.1.1 Background and Role in Stochastic Sequential Analysis (SSA)
6.1.2 When to Use a Hidden Markov Model (HMM)?
6.1.3 Hidden Markov Models (HMMs)–Standard Formulation and Terms
6.2 Graphical Models for Markov Models and Hidden Markov Models
6.2.1 Hidden Markov Models
6.2.2 Viterbi Path
6.2.3 Forward and Backward Probabilities
6.2.4 HMM: Maximum Likelihood Discrimination
6.2.5 Expectation/Maximization (Baum–Welch)
6.3 Standard HMM Weaknesses and Their GHMM Fixes
6.4 Generalized HMMs (GHMMs – "Gems"): Minor Viterbi Variants
6.4.1 The Generic HMM
6.4.2 pMM/SVM
6.4.3 EM and Feature Extraction via EVA Projection
6.4.4 Feature Extraction via Data Absorption (a.k.a. Emission Inversion)
6.4.5 Modified AdaBoost for Feature Selection and Data Fusion
6.5 HMM Implementation for Viterbi (in C and Perl)
6.6 Exercises
Chapter 7 Generalized HMMs (GHMMs): Major Viterbi Variants
7.1 GHMMs: Maximal Clique for Viterbi and Baum–Welch
7.2 GHMMs: Full Duration Model
7.2.1 HMM with Duration (HMMD)
7.2.2 Hidden Semi-Markov Models (HSMM) with Side-Information
7.2.3 HMM with Binned Duration (HMMBD)
7.3 GHMMs: Linear Memory Baum–Welch Algorithm
7.4 GHMMs: Distributable Viterbi and Baum–Welch Algorithms
7.4.1 Distributed HMM Processing via "Viterbi-Overlap-Chunking" with GPU Speedup
7.4.2 Relative Entropy and Viterbi Scoring
7.5 Martingales and the Feasibility of Statistical Learning (Further Details in Appendix)
7.6 Exercises
Chapter 8 Neuromanifolds and the Uniqueness of Relative Entropy
8.1 Overview
8.2 Review of Differential Geometry
8.2.1 Differential Topology – Natural Manifold
8.2.2 Differential Geometry – Natural Geometric Structures
8.3 Amari's Dually Flat Formulation
8.3.1 Generalization of Pythagorean Theorem
8.3.2 Projection Theorem and Relation Between Divergence and Link Formalism
8.4 Neuromanifolds
8.5 Exercises
Chapter 9 Neural Net Learning and Loss Bounds Analysis
9.1 Brief Introduction to Neural Nets (NNs)
9.1.1 Single Neuron Discriminator
9.1.2 Neural Net with Back-Propagation
9.2 Variational Learning Formalism and Use in Loss Bounds Analysis
9.2.1 Variational Basis for Update Rule
9.2.2 Review and Generalization of GD Loss Bounds Analysis
9.2.3 Review of the EG Loss Bounds Analysis
9.3 The "sinh⁻¹(ω)" Link Algorithm (SA)
9.3.1 Motivation for the "sinh⁻¹(ω)" Link Algorithm (SA)
9.3.2 Relation of sinh Link Algorithm to the Binary Exponentiated Gradient Algorithm
9.4 The Loss Bounds Analysis for sinh⁻¹(ω)
9.4.1 Loss Bounds Analysis Using the Taylor Series Approach
9.4.2 Loss Bounds Analysis Using Taylor Series for the sinh Link (SA) Algorithm
9.5 Exercises
Chapter 10 Classification and Clustering
10.1 The SVM Classifier–An Overview
10.2 Introduction to Classification and Clustering
10.2.1 Sum of Squared Error (SSE) Scoring
10.2.2 K-Means Clustering (Unsupervised Learning)
10.2.3 k-Nearest Neighbors Classification (Supervised Learning)
10.2.4 The Perceptron Recap (See Chapter for Details)
10.3 Lagrangian Optimization and Structural Risk Minimization (SRM)
10.3.1 Decision Boundary and SRM Construction Using Lagrangian
10.3.2 The Theory of Classification
10.3.3 The Mathematics of the Feasibility of Learning
10.3.4 Lagrangian Optimization
10.3.5 The Support Vector Machine (SVM)–Lagrangian with SRM
10.3.6 Kernel Construction Using Polarization
10.3.7 SVM Binary Classifier Derivation
10.4 SVM Binary Classifier Implementation
10.4.1 Sequential Minimal Optimization (SMO)
10.4.2 Alpha-Selection Variants
10.4.3 Chunking on Large Datasets: O(N²) ➔ n·O(N²/n²) = O(N²)/n
10.4.4 Support Vector Reduction (SVR)
10.4.5 Code Examples (in OO Perl)
10.5 Kernel Selection and Tuning Metaheuristics
10.5.1 The "Stability" Kernels
10.5.2 Derivation of "Stability" Kernels
10.5.3 Entropic and Gaussian Kernels Relate to Unique, Minimally Structured, Information Divergence and Geometric Distance …
10.5.4 Automated Kernel Selection and Tuning
10.6 SVM Multiclass from Decision Tree with SVM Binary Classifiers
10.7 SVM Multiclass Classifier Derivation (Multiple Decision Surface)
10.7.1 Decomposition Method to Solve the Dual
10.7.2 SVM Speedup via Differentiating BSVs and SVs
10.8 SVM Clustering
10.8.1 SVM-External Clustering
10.8.2 Single-Convergence SVM-Clustering: Comparative Analysis
10.8.3 Stabilized, Single-Convergence Initialized, SVM-External Clustering
10.8.4 Stabilized, Multiple-Convergence, SVM-External Clustering
10.8.5 SVM-External Clustering–Algorithmic Variants
10.9 Exercises
Chapter 11 Search Metaheuristics
11.1 Trajectory-Based Search Metaheuristics
11.1.1 Optimal-Fitness Configuration Trajectories – Fitness Function Known and Sufficiently Regular
11.1.2 Optimal-Fitness Configuration Trajectories – Fitness Function Not Known
11.1.3 Fitness Configuration Trajectories with Nonoptimal Updates
11.2 Population-Based Search Metaheuristics
11.2.1 Population with Evolution
11.2.2 Population with Group Interaction – Swarm Intelligence
11.2.3 Population with Indirect Interaction via Artifact
11.3 Exercises
Chapter 12 Stochastic Sequential Analysis (SSA)
12.1 HMM and FSA-Based Methods for Signal Acquisition and Feature Extraction
12.2 The Stochastic Sequential Analysis (SSA) Protocol
12.2.1 (Stage 1) Primitive Feature Identification
12.2.2 (Stage 2) Feature Identification and Feature Selection
12.2.3 (Stage 3) Classification
12.2.4 (Stage 4) Clustering
12.2.5 (All Stages) Database/Data-Warehouse System Specification
12.2.6 (All Stages) Server-Based Data Analysis System Specification
12.3 Channel Current Cheminformatics (CCC) Implementation of the Stochastic Sequential Analysis (SSA) Protocol
12.4 SCW for Detector Sensitivity Boosting
12.4.1 NTD with Multiple Channels (or High Noise)
12.4.2 Stochastic Carrier Wave
12.5 SSA for Deep Learning
12.6 Exercises
Chapter 13 Deep Learning Tools–TensorFlow
13.1 Neural Nets Review
13.1.1 Summary of Single Neuron Discriminator
13.1.2 Summary of Neural Net Discriminator and Back-Propagation
13.2 TensorFlow from Google
13.2.1 Installation/Setup
13.2.2 Example: Character Recognition
13.2.3 Example: Language Translation
13.2.4 TensorBoard and the TensorFlow Profiler
13.2.5 Tensor Cores
13.3 Exercises
Chapter 14 Nanopore Detection–A Case Study
14.1 Standard Apparatus
14.1.1 Standard Operational and Physiological Buffer Conditions
14.1.2 α-Hemolysin Channel Stability–Introduction of Chaotropes
14.2 Controlling Nanopore Noise Sources and Choice of Aperture
14.3 Length Resolution of Individual DNA Hairpins
14.4 Detection of Single Nucleotide Differences (Large Changes in Structure)
14.5 Blockade Mechanism for 9bphp
14.6 Conformational Kinetics on Model Biomolecules
14.7 Channel Current Cheminformatics
14.7.1 Power Spectra and Standard EE Signal Analysis
14.7.2 Channel Current Cheminformatics for Single-Biomolecule/Mixture Identifications
14.7.3 Channel Current Cheminformatics: Feature Extraction by HMM
14.7.4 Bandwidth Limitations
14.8 Channel-Based Detection Mechanisms
14.8.1 Partitioning and Translocation-Based ND Biosensing Methods
14.8.2 Transduction Versus Translation
14.8.3 Single-Molecule Versus Ensemble
14.8.4 Biosensing with High Sensitivity in Presence of Interference
14.8.5 Nanopore Transduction Detection Methods
14.9 The NTD Nanoscope
14.9.1 Nanopore Transduction Detection (NTD)
14.9.2 NTD: A Versatile Platform for Biosensing
14.9.3 NTD Platform
14.9.4 NTD Operation
14.9.5 Driven Modulations
14.9.6 Driven Modulations with Multichannel Augmentation
14.10 NTD Biosensing Methods
14.10.1 Model Biosensor Based on Streptavidin and Biotin
14.10.2 Model System Based on DNA Annealing
14.10.3 Y-Aptamer with Use of Chaotropes to Improve Signal Resolution
14.10.4 Pathogen Detection, miRNA Detection, and miRNA Haplotyping
14.10.5 SNP Detection
14.10.6 Aptamer-Based Detection
14.10.7 Antibody-Based Detection
14.11 Exercises
Appendix A Python and Perl System Programming in Linux
A.1 Getting Linux and Python in a Flash (Drive)
A.2 Linux and the Command Shell
A.3 Perl Review: I/O, Primitives, String Handling, Regex
Appendix B Physics
B.1 The Calculus of Variations
Appendix C Math
C.1 Martingales
Martingale Definition
Induced Martingales with Markov Chains
In HMM Learning Have Sequences of Likelihood Ratios, Which Is a Martingale, Proof
Supermartingales and Submartingales
Martingale Convergence Theorems
"Maximal" Inequalities for Martingales
Mean-Square Convergence Theorem for Martingales
Martingales w.r.t. σ-Field Formalism
Backwards Martingale Definition (w.r.t. Sigma Sub-fields)
Backwards Martingale Convergence Theorem
Strong Law of Large Numbers Proof
Stationary Processes
Strong Ergodic Theorem
Asymptotic Equipartition Property (AEP)
De Finetti's Theorem
C.2 Hoeffding Inequality
Hoeffding Lemma Proof
Hoeffding Inequality Proof (for Further Details, See [104])
Chernoff Bounding Technique
References
Index
EULA


📜 SIMILAR VOLUMES


Machine Learning and Metaheuristics: Met
✍ Uma N. Dulhare; Essam Halim Houssein 📂 Library 📅 2023 🏛 Springer Nature Singapore 🌐 English

This book takes a balanced approach between theoretical understanding and real-time applications. All the topics include real-world problems that show how to explore, build, evaluate, and optimize machine learning models in fusion with metaheuristic algorithms. Optimization algorithms are classified into…

Metaheuristics for Machine Learning - Al
✍ Kanak Kalita; Narayanan Ganesh; S. Balamurugan 📂 Library 📅 2024 🏛 WILEY 🌐 English

The field of metaheuristic optimization algorithms is experiencing rapid growth, both in academic research and industrial applications. These nature-inspired algorithms, which draw on phenomena like evolution, swarm behavior, and neural systems, have shown remarkable efficiency in solving complex op…

Metaheuristics for Machine Learning: New
✍ Mansour Eddaly, Bassem Jarboui, Patrick Siarry 📂 Library 📅 2023 🏛 Springer 🌐 English

Using metaheuristics to enhance machine learning techniques has become trendy and has achieved major successes in both supervised (classification and regression) and unsupervised (clustering and rule mining) problems. Furthermore, automatically generating programs via metaheuristics, as a form of ev…
