Introduction to Semi-Supervised Learning
Written by Xiaojin Zhu and Andrew B. Goldberg
- Publisher: Springer
- Year: 2009
- Language: English
- Pages: 122
- Series: Synthesis Lectures on Artificial Intelligence and Machine Learning
- Edition: 1
- Category: Library
Synopsis
Semi-supervised learning is a learning paradigm concerned with the study of how computers and natural systems such as humans learn in the presence of both labeled and unlabeled data. Traditionally, learning has been studied either in the unsupervised paradigm (e.g., clustering, outlier detection) where all the data are unlabeled, or in the supervised paradigm (e.g., classification, regression) where all the data are labeled. The goal of semi-supervised learning is to understand how combining labeled and unlabeled data may change the learning behavior, and design algorithms that take advantage of such a combination. Semi-supervised learning is of great interest in machine learning and data mining because it can use readily available unlabeled data to improve supervised learning tasks when the labeled data are scarce or expensive. Semi-supervised learning also shows potential as a quantitative tool to understand human category learning, where most of the input is self-evidently unlabeled. In this introductory book, we present some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi-supervised support vector machines. For each model, we discuss its basic mathematical formulation. The success of semi-supervised learning depends critically on some underlying assumptions. We emphasize the assumptions made by each model and give counterexamples when appropriate to demonstrate the limitations of the different models. In addition, we discuss semi-supervised learning for cognitive psychology. Finally, we give a computational learning theoretic perspective on semi-supervised learning, and we conclude the book with a brief discussion of open questions in the field. 
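Self-training, the first model named above, can be illustrated with a short sketch: a classifier is trained on the few labeled points, then repeatedly labels the unlabeled points it is most confident about and retrains on the enlarged set. The data, classifier choice, and confidence threshold below are illustrative assumptions, not the book's exact formulation.

```python
# Minimal self-training sketch (illustrative; the synthetic data,
# logistic-regression base learner, and 0.95 confidence threshold
# are assumptions, not the authors' formulation).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two well-separated Gaussian clusters; only 4 of 100 points are labeled.
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y_true = np.array([0] * 50 + [1] * 50)
y = y_true.copy()                       # working labels (pseudo-labels go here)
labeled = np.zeros(100, dtype=bool)
labeled[[0, 1, 50, 51]] = True          # the scarce labeled data

clf = LogisticRegression()
for _ in range(10):                     # repeat: train, then pseudo-label
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[~labeled])
    confident = proba.max(axis=1) > 0.95
    if not confident.any():
        break
    idx = np.where(~labeled)[0][confident]
    y[idx] = clf.predict(X[idx])        # trust the model's own predictions
    labeled[idx] = True

acc = (clf.predict(X) == y_true).mean()
print(f"accuracy on all points: {acc:.2f}")
```

When the model's confident predictions are correct (as on these separable clusters), self-training propagates the few labels through the unlabeled data; when they are wrong, errors reinforce themselves, which is exactly the kind of assumption-dependence the book emphasizes for each model.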
Table of Contents: Introduction to Statistical Machine Learning / Overview of Semi-Supervised Learning / Mixture Models and EM / Co-Training / Graph-Based Semi-Supervised Learning / Semi-Supervised Support Vector Machines / Human Semi-Supervised Learning / Theory and Outlook
Table of Contents
Cover
Copyright Page
Title Page
Dedication
Contents
Preface
Introduction to Statistical Machine Learning
The Data
Unsupervised Learning
Supervised Learning
Overview of Semi-Supervised Learning
Learning from Both Labeled and Unlabeled Data
How is Semi-Supervised Learning Possible?
Inductive vs. Transductive Semi-Supervised Learning
Caveats
Self-Training Models
Mixture Models and EM
Mixture Models for Supervised Classification
Mixture Models for Semi-Supervised Classification
Optimization with the EM Algorithm
The Assumptions of Mixture Models
Other Issues in Generative Models
Cluster-then-Label Methods
Co-Training
Two Views of an Instance
Co-Training
The Assumptions of Co-Training
Multiview Learning
Graph-Based Semi-Supervised Learning
Unlabeled Data as Stepping Stones
The Graph
Mincut
Harmonic Function
Manifold Regularization
The Assumption of Graph-Based Methods
Semi-Supervised Support Vector Machines
Support Vector Machines
Semi-Supervised Support Vector Machines
Entropy Regularization
The Assumption of S3VMs and Entropy Regularization
Human Semi-Supervised Learning
From Machine Learning to Cognitive Science
Study One: Humans Learn from Unlabeled Test Data
Study Two: Presence of Human Semi-Supervised Learning in a Simple Task
Study Three: Absence of Human Semi-Supervised Learning in a Complex Task
Discussions
Theory and Outlook
A Simple PAC Bound for Supervised Learning
A Simple PAC Bound for Semi-Supervised Learning
Future Directions of Semi-Supervised Learning
Basic Mathematical Reference
Semi-Supervised Learning Software
Symbols
Biography
Index
SIMILAR VOLUMES
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of a
A comprehensive review of an area of machine learning that deals with the use of unlabeled data in classification problems: state-of-the-art algorithms, a taxonomy of the field, applications, benchmark experiments, and directions for future research.
While labeled data is expensive to prepare, ever-increasing amounts of unlabeled data are becoming widely available. In order to adapt to this phenomenon, several semi-supervised learning (SSL) algorithms, which learn from labeled as well as unlabeled data, have been developed. In a separate line