Accelerated Optimization for Machine Learning: First-Order Algorithms

✍ By Zhouchen Lin, Huan Li, and Cong Fang


Publisher
Springer
Year
2020
Language
English
Pages
286
Category
Library


✦ Synopsis


This book on optimization includes forewords by Michael I. Jordan, Zongben Xu, and Zhi-Quan Luo. Machine learning relies heavily on optimization to fit its models, and first-order optimization algorithms are the mainstream approach; accelerating them is therefore crucial for the efficiency of machine learning. Written by leading experts in the field, this book provides a comprehensive introduction to, and state-of-the-art review of, accelerated first-order optimization algorithms for machine learning. It discusses a variety of methods: deterministic and stochastic algorithms, synchronous and asynchronous variants, and algorithms for unconstrained and constrained problems, both convex and non-convex. Offering a rich blend of ideas, theories, and proofs, the book is up to date and self-contained. It is an excellent reference for practitioners seeking faster optimization algorithms, as well as for graduate students and researchers who want to grasp the frontiers of optimization in machine learning in a short time.
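To give a flavor of the acceleration the book studies, here is a minimal NumPy sketch (a toy example of this listing, not code from the book) comparing plain gradient descent with Nesterov's accelerated gradient method on a least-squares problem; the random problem data, the 1/L step size, and the t-sequence momentum schedule are all assumptions of the sketch, though they are standard textbook choices.

```python
import numpy as np

# Toy comparison: gradient descent vs. Nesterov's accelerated gradient
# method on f(x) = 0.5 * ||A x - b||^2 (smooth and convex).

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)

L = np.linalg.eigvalsh(A.T @ A).max()        # Lipschitz constant of grad f
f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)

def gradient_descent(steps=200):
    x = np.zeros(20)
    for _ in range(steps):
        x = x - grad(x) / L                  # plain gradient step
    return x

def nesterov_agd(steps=200):
    x = y = np.zeros(20)
    t = 1.0
    for _ in range(steps):
        x_next = y - grad(y) / L             # gradient step at the extrapolated point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + ((t - 1) / t_next) * (x_next - x)   # momentum extrapolation
        x, t = x_next, t_next
    return x

x_star = np.linalg.lstsq(A, b, rcond=None)[0]
gap_gd = f(gradient_descent()) - f(x_star)   # suboptimality of gradient descent
gap_agd = f(nesterov_agd()) - f(x_star)      # suboptimality of accelerated method
print(gap_gd, gap_agd)
```

The only change relative to gradient descent is the extrapolation (momentum) step, yet it improves the worst-case rate on smooth convex problems from O(1/k) to O(1/k²), which is exactly the kind of speedup the book analyzes.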

✦ Table of Contents


Foreword by Michael I. Jordan
Foreword by Zongben Xu
Foreword by Zhi-Quan Luo
Preface
References
Acknowledgements
Contents
About the Authors
Acronyms
1 Introduction
1.1 Examples of Optimization Problems in Machine Learning
1.2 First-Order Algorithms
1.3 Sketch of Representative Works on Accelerated Algorithms
1.4 About the Book
References
2 Accelerated Algorithms for Unconstrained Convex Optimization
2.1 Accelerated Gradient Method for Smooth Optimization
2.2 Extension to Composite Optimization
2.2.1 Nesterov's First Scheme
2.2.2 Nesterov's Second Scheme
2.2.2.1 A Primal-Dual Perspective
2.2.3 Nesterov's Third Scheme
2.3 Inexact Proximal and Gradient Computing
2.3.1 Inexact Accelerated Gradient Descent
2.3.2 Inexact Accelerated Proximal Point Method
2.4 Restart
2.5 Smoothing for Nonsmooth Optimization
2.6 Higher Order Accelerated Method
2.7 Explanation: A Variational Perspective
2.7.1 Discretization
References
3 Accelerated Algorithms for Constrained Convex Optimization
3.1 Some Facts for the Case of Linear Equality Constraint
3.2 Accelerated Penalty Method
3.2.1 Generally Convex Objectives
3.2.2 Strongly Convex Objectives
3.3 Accelerated Lagrange Multiplier Method
3.3.1 Recovering the Primal Solution
3.3.2 Accelerated Augmented Lagrange Multiplier Method
3.4 Alternating Direction Method of Multiplier and Its Non-ergodic Accelerated Variant
3.4.1 Generally Convex and Nonsmooth Case
3.4.2 Strongly Convex and Nonsmooth Case
3.4.3 Generally Convex and Smooth Case
3.4.4 Strongly Convex and Smooth Case
3.4.5 Non-ergodic Convergence Rate
3.4.5.1 Original ADMM
3.4.5.2 ADMM with Extrapolation and Increasing Penalty Parameter
3.5 Primal-Dual Method
3.5.1 Case 1: μg = μh = 0
3.5.2 Case 2: μg > 0, μh = 0
3.5.3 Case 3: μg = 0, μh > 0
3.5.4 Case 4: μg > 0, μh > 0
3.6 Faster Frank–Wolfe Algorithm
References
4 Accelerated Algorithms for Nonconvex Optimization
4.1 Proximal Gradient with Momentum
4.1.1 Convergence Theorem
4.1.2 Another Method: Monotone APG
4.2 AGD Achieves Critical Points Quickly
4.2.1 AGD as a Convexity Monitor
4.2.2 Negative Curvature Descent
4.2.3 Accelerating Nonconvex Optimization
4.3 AGD Escapes Saddle Points Quickly
4.3.1 Almost Convex Case
4.3.2 Very Nonconvex Case
4.3.3 AGD for Nonconvex Problems
4.3.3.1 Locally Almost Convex → Globally Almost Convex
4.3.3.2 Outer Iterations
4.3.3.3 Inner Iterations
References
5 Accelerated Stochastic Algorithms
5.1 The Individually Convex Case
5.1.1 Accelerated Stochastic Coordinate Descent
5.1.2 Background for Variance Reduction Methods
5.1.3 Accelerated Stochastic Variance Reduction Method
5.1.4 Black-Box Acceleration
5.2 The Individually Nonconvex Case
5.3 The Nonconvex Case
5.3.1 SPIDER
5.3.2 Momentum Acceleration
5.4 Constrained Problem
5.5 The Infinite Case
References
6 Accelerated Parallel Algorithms
6.1 Accelerated Asynchronous Algorithms
6.1.1 Asynchronous Accelerated Gradient Descent
6.1.2 Asynchronous Accelerated Stochastic Coordinate Descent
6.2 Accelerated Distributed Algorithms
6.2.1 Centralized Topology
6.2.1.1 Large Mini-Batch Algorithms
6.2.1.2 Dual Communication-Efficient Methods
6.2.2 Decentralized Topology
References
7 Conclusions
References
A Mathematical Preliminaries
A.1 Notations
A.2 Algebra and Probability
A.3 Convex Analysis
A.4 Nonconvex Analysis
References
Index


📜 SIMILAR VOLUMES


First-Order and Stochastic Optimization
✍ Guanghui Lan 📂 Library 📅 2020 🏛 Springer Nature 🌐 English

This book covers not only foundational material but also the most recent progress made during the past few years in the area of machine learning algorithms. In spite of the intensive research and development in this area, there does not exist a systematic treatment to introduce the fundamental…

Optimization Algorithms for Distributed
✍ Gauri Joshi 📂 Library 📅 2022 🏛 Springer 🌐 English

This book discusses state-of-the-art stochastic optimization algorithms for distributed machine learning and analyzes their convergence speed. The book first introduces stochastic gradient descent (SGD) and its distributed version, synchronous SGD, where the task of computing gradients is divided…
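The synchronous SGD mentioned in this blurb can be sketched in a few lines (a toy simulation written for this listing, not code from the book: the workers run sequentially here, and the quadratic loss, data split, and learning rate are all assumptions):

```python
import numpy as np

# Toy sketch of synchronous data-parallel SGD: the data is split across
# simulated "workers", each computes the gradient of its own shard, and
# the server averages the shard gradients before taking one step.

rng = np.random.default_rng(1)
X = rng.standard_normal((400, 10))
w_true = rng.standard_normal(10)
y = X @ w_true                         # noiseless linear model for simplicity

num_workers = 4
shards = np.array_split(np.arange(400), num_workers)

def shard_grad(w, idx):
    Xi, yi = X[idx], y[idx]
    return Xi.T @ (Xi @ w - yi) / len(idx)   # least-squares gradient on one shard

w = np.zeros(10)
lr = 0.1
for _ in range(300):
    grads = [shard_grad(w, idx) for idx in shards]  # computed in parallel in practice
    w -= lr * np.mean(grads, axis=0)                # synchronous averaging step

print(np.linalg.norm(w - w_true))
```

Because every worker's gradient is averaged at each step, this iteration is mathematically equivalent to full-batch gradient descent on the pooled data; the distributed versions analyzed in such books differ in how they batch, stagger, or compress this communication.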

Optimization and Machine Learning: Optim
✍ Rachid Chelouah, Patrick Siarry 📂 Library 📅 2022 🏛 Wiley-ISTE 🌐 English

Machine learning and optimization techniques are revolutionizing our world. Other types of information technology have not progressed as rapidly in recent years, in terms of real impact. The aim of this book is to present some of the innovative techniques in the field of optimization and machine learning…
