๐”– Scriptorium
โœฆ   LIBER   โœฆ

๐Ÿ“

Stochastic recursive algorithms for optimization : simultaneous perturbation methods

โœ Scribed by S. Bhatnagar, H.L. Prasad, and L.A. Prashanth


Publisher
Springer
Year
2013
Tongue
English
Leaves
309
Series
Lecture Notes in Control and Information Sciences, 434
Category
Library

โฌ‡  Acquire This Volume

No coin nor oath required. For personal study only.

โœฆ Table of Contents


Title
Preface
Contents
Part I Introduction to Stochastic Recursive
Algorithms
Introduction
Introduction
Overview of the Remaining Chapters
Concluding Remarks
References
Deterministic Algorithms for Local Search
Introduction
Deterministic Algorithms for Local Search
References
Stochastic Approximation Algorithms
Introduction
The Robbins-Monro Algorithm
Convergence of the Robbins-Monro Algorithm
Multi-timescale Stochastic Approximation
Convergence of the Multi-timescale Algorithm
Concluding Remarks
References
Part II
Gradient Estimation Schemes
Kiefer-Wolfowitz Algorithm
Introduction
The Basic Algorithm
Extension to Multi-dimensional Parameter
Variants of the Kiefer-Wolfowitz Algorithm
Fixed Perturbation Parameter
One-Sided Variants
Concluding Remarks
References
Gradient Schemes with Simultaneous
Perturbation Stochastic Approximation
Introduction
The Basic SPSA Algorithm
Gradient Estimate Using Simultaneous Perturbation
The Algorithm
Convergence Analysis
Variants of the Basic SPSA Algorithm
One-Measurement SPSA Algorithm
One-Sided SPSA Algorithm
Fixed Perturbation Parameter
General Remarks on SPSA Algorithms
SPSA Algorithms with Deterministic Perturbations
Properties of Deterministic Perturbation Sequences
Hadamard Matrix Based Construction
Two-Sided SPSA with Hadamard Matrix Perturbations
One-Sided SPSA with Hadamard Matrix Perturbations
One-Measurement SPSA with Hadamard Matrix Perturbations
SPSA Algorithms for Long-Run Average Cost Objective
The Framework
The Two-Simulation SPSA Algorithm
Assumptions
Convergence Analysis
Projected SPSA Algorithm
Concluding Remarks
References
Smoothed Functional Gradient Schemes
Introduction
Gaussian Based SF Algorithm
Gradient Estimation via Smoothing
The Basic Gaussian SF Algorithm
Convergence Analysis of Gaussian SF Algorithm
Two-Measurement Gaussian SF Algorithm
General Conditions for a Candidate Smoothing Function
Cauchy Variant of the SF Algorithm
Gradient Estimate
Cauchy SF Algorithm
SF Algorithms for the Long-Run Average Cost Objective
The G-SF1 Algorithm
The G-SF2 Algorithm
Projected SF Algorithms
Concluding Remarks
References
Part III
Hessian Estimation Schemes
Newton-Based Simultaneous Perturbation
Stochastic Approximation
Introduction
The Framework
Newton SPSA Algorithms
Four-Simulation Newton SPSA (N-SPSA4)
Three-Simulation Newton SPSA (N-SPSA3)
Two-Simulation Newton SPSA (N-SPSA2)
One-Simulation Newton SPSA (N-SPSA1)
Woodbury's Identity Based Newton SPSA Algorithms
Convergence Analysis
Assumptions
Convergence Analysis of N-SPSA4
Convergence Analysis of N-SPSA3
Convergence Analysis of N-SPSA2
Convergence Analysis of N-SPSA1
Convergence Analysis of W-SPSA Algorithms
Concluding Remarks
References
Newton-Based Smoothed Functional Algorithms
Introduction
The Hessian Estimates
One-Simulation Hessian SF Estimate
Two-Simulation Hessian SF Estimate
The Newton SF Algorithms
The One-Simulation Newton SF Algorithm (N-SF1)
The Two-Simulation Newton SF Algorithm (N-SF2)
Convergence Analysis of Newton SF Algorithms
Convergence of N-SF1
Convergence of N-SF2
Concluding Remarks
References
Part IV
Variations to the Basic Scheme
Discrete Parameter Optimization
Introduction
The Framework
The Deterministic Projection Operator
The Random Projection Operator
A Generalized Projection Operator
Regular Projection Operator to
Basic Results for the Generalized Projection Operator Case
The Algorithms
The SPSA Algorithm
The SFA Algorithm
Convergence Analysis
Concluding Remarks
References
Algorithms for Constrained Optimization
Introduction
The Framework
Algorithms
Constrained Gradient-Based SPSA Algorithm (CG-SPSA)
Constrained Newton-Based SPSA Algorithm (CN-SPSA)
Constrained Gradient-Based SF Algorithm (CG-SF)
Constrained Newton-Based SF Algorithm (CN-SF)
A Sketch of the Convergence
Concluding Remarks
References
Reinforcement Learning
Introduction
Markov Decision Processes
Numerical Procedures for MDPs
Numerical Procedures for Discounted Cost MDPs
Numerical Procedures for Long-Run Average Cost MDPs
Reinforcement Learning Algorithms for Look-up Table Case
An Actor-Critic Algorithm for Infinite Horizon Discounted Cost MDPs
The Q-Learning Algorithm and a Simultaneous Perturbation Variant for Infinite Horizon Discounted Cost MDPs
Actor-Critic Algorithms for Long-Run Average Cost MDPs
Reinforcement Learning Algorithms with Function Approximation
Temporal Difference (TD) Learning with Discounted Cost
An Actor-Critic Algorithm with a Temporal Difference Critic for Discounted Cost MDPs
Function Approximation Based Q-Learning Algorithm and a Simultaneous Perturbation Variant for Infinite Horizon Discounted Cost MDPs
Concluding Remarks
References
Part V
Applications
Service Systems
Introduction
Service System Framework
Problem Formulation
Solution Methodology
First Order Methods
SASOC-SPSA
SASOC-SF-N
SASOC-SF-C
Second Order Methods
SASOC-H
SASOC-W
Notes on Convergence
Summary of Experiments
Concluding Remarks
References
Road Traffic Control
Introduction
Q-Learning for Traffic Light Control
Traffic Control Problem as an MDP
The TLC Algorithm
Summary of Experimental Results
Threshold Tuning Using SPSA
The Threshold Tuning Algorithm
Traffic Light Control with Threshold Tuning
Summary of Experimental Results
Concluding Remarks
References
Communication Networks
Introduction
The Random Early Detection (RED) Scheme for the Internet
Introduction to RED Flow Control
The Framework
The B-RED and P-RED Stochastic Approximation Algorithms
Summary of Experimental Results
Optimal Policies for the Retransmission Probabilities in Slotted Aloha
Introduction to the Slotted Aloha Multiaccess Communication Protocol
The SDE Framework
The Algorithm
Summary of Experimental Results
Dynamic Multi-layered Pricing Schemes for the Internet
Introduction to Dynamic Pricing Schemes
The Pricing Framework
The Price Feed-Back Policies and the Algorithms
Summary of Experimental Results
Concluding Remarks
References
Part VI
Appendix
Convergence Notions for a Sequence of Random Vectors
Reference
Martingales
References
Ordinary Differential Equations
References
The Borkar-Meyn Theorem for Stability and Convergence
References
The Kushner-Clark Theorem for Convergence
References
Index


๐Ÿ“œ SIMILAR VOLUMES


Stochastic Recursive Algorithms for Opti
โœ S. Bhatnagar, H.L. Prasad, L.A. Prashanth (auth.) ๐Ÿ“‚ Library ๐Ÿ“… 2013 ๐Ÿ› Springer-Verlag London ๐ŸŒ English

<p>Stochastic Recursive Algorithms for Optimization presents algorithms for constrained and unconstrained optimization and for reinforcement learning. Efficient perturbation approaches form a thread unifying all the algorithms considered. Simultaneous perturbation stochastic approximation and smooth

The Stochastic Perturbation Method for C
โœ Marcin Kaminski(auth.) ๐Ÿ“‚ Library ๐Ÿ“… 2013 ๐ŸŒ English

Probabilistic analysis is increasing in popularity and importance within engineering and the applied sciences. However, the stochastic perturbation technique is a fairly recent development and therefore remains as yet unknown to many students, researchers and engineers. Fields in which the methodolo

The stochastic perturbation method for c
โœ Kaminski M. ๐Ÿ“‚ Library ๐Ÿ“… 2013 ๐Ÿ› Wiley ๐ŸŒ English

Probabilistic analysis is increasing in popularity and importance within engineering and the applied sciences. However, the stochastic perturbation technique is a fairly recent development and therefore remains as yet unknown to many students, researchers and engineers. Fields in which the methodolo

Stochastic Simulation Optimization For D
โœ Chun-Hung Chen; Qing-shan Jia; Loo Hay Lee ๐Ÿ“‚ Library ๐Ÿ“… 2013 ๐Ÿ› World Scientific Publishing Company ๐ŸŒ English

Discrete event systems (DES) have become pervasive in our daily lives. Examples include (but are not restricted to) manufacturing and supply chains, transportation, healthcare, call centers, and financial engineering. However, due to their complexities that often involve millions or even billions of

Convex Optimization Algorithms (for Algo
โœ Dimitri P. Bertsekas ๐Ÿ“‚ Library ๐Ÿ“… 2015 ๐Ÿ› Athena Scientific ๐ŸŒ English

This book, developed through class instruction at MIT over the last 15 years, provides an accessible, concise, and intuitive presentation of algorithms for solving convex optimization problems. It relies on rigorous mathematical analysis, but also aims at an intuitive exposition that makes use of vi

Stochastic Optimization Methods
โœ Univ. Professor Dr. sc. math. Kurt Marti (auth.) ๐Ÿ“‚ Library ๐Ÿ“… 2005 ๐Ÿ› Springer Berlin Heidelberg ๐ŸŒ English

<P>Optimization problems arising in practice involve random parameters. For the computation of robust optimal solutions, i.e., optimal solutions being insensitive with respect to random parameter variations, deterministic substitute problems are needed. Based on the distribution of the random data,