Optimization and Games for Controllable Markov Chains: Numerical Methods with Application to Finance and Engineering (Studies in Systems, Decision and Control, 504)

✍ Scribed by Julio B. Clempner, Alexander Poznyak


Publisher
Springer
Year
2024
Tongue
English
Leaves
340
Edition
1
Category
Library


✦ Synopsis


This book considers a class of ergodic finite controllable Markov chains. The main idea behind the method described in this book is to embed the original discrete optimization problems (or game models) into the space of randomized formulations, where the variables represent the distributions (mixed strategies or preferences) over the original discrete (pure) strategies in use. The following assumptions are made: a finite state space, a finite action space, continuity of the transition probabilities and rewards in the actions, and an accessibility (ergodicity) requirement. Under these hypotheses an optimal policy exists, and it can always be taken to be stationary: it is either simple (i.e., nonrandomized stationary) or a mixture of two nonrandomized policies, which is equivalent to randomly selecting one of two simple policies at each epoch by tossing a biased coin. As a bonus, the optimization procedure only has to solve the time-average dynamic programming equation repeatedly, making it computationally feasible to choose the optimal course of action under a global constraint. In the ergodic case the state distributions generated by the corresponding transition equations converge exponentially fast to their stationary (final) values. This makes it possible to employ all widely used optimization methods (such as gradient-like procedures, the extraproximal method, Lagrange multipliers, and Tikhonov regularization), together with the related numerical techniques.
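The exponential convergence mentioned above can be seen in a short numerical sketch. The transition matrix and numbers below are illustrative assumptions, not taken from the book: iterating the transition equation on any ergodic chain drives the state distribution to its unique stationary value at a geometric rate.

```python
import numpy as np

# Hypothetical ergodic transition matrix (rows sum to 1); the values are
# illustrative only, not from the book.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

mu = np.array([1.0, 0.0])   # arbitrary initial state distribution
for _ in range(50):
    mu = mu @ P             # transition equation: mu_{t+1} = mu_t P

# The iterates converge geometrically (here at rate 0.6, the second
# eigenvalue of P) to the stationary distribution, approximately [0.75, 0.25].
print(mu)
```

The gap to stationarity shrinks like the second-largest eigenvalue modulus raised to the iteration count, which is the "exponentially quick" convergence the synopsis refers to.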
In the book we tackle a range of problems and theoretical Markov models, including controllable and ergodic Markov chains, multi-objective Pareto-front solutions, partially observable Markov chains, continuous-time Markov chains, Nash and Stackelberg equilibria, Lyapunov-like functions in Markov chains, best-reply strategies, Bayesian incentive-compatible mechanisms, Bayesian partially observable Markov games, bargaining solutions in the Nash and Kalai-Smorodinsky formulations, the multi-traffic signal-control synchronization problem, Rubinstein's non-cooperative bargaining solutions, and the transfer pricing problem as bargaining.
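As a sketch of the randomized reformulation described in the synopsis, the average-reward problem for a small controllable chain can be written as a linear program over an occupation measure c(s, a): maximize expected reward subject to stationarity (balance) and normalization constraints, then recover a stationary mixed strategy by conditioning. The chain and rewards below are hypothetical, chosen only to illustrate the formulation, and the solver choice (SciPy's `linprog`) is ours, not the book's.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action controllable chain (illustrative numbers).
# P[a, s, s'] = transition probability s -> s' under action a.
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.6, 0.4]]])
R = np.array([[1.0, 0.0],   # r(s, a): reward in state s under action a
              [0.0, 2.0]])
nS, nA = R.shape
n = nS * nA                  # one variable c(s, a) per state-action pair

# Maximize sum_{s,a} r(s,a) c(s,a); linprog minimizes, so negate.
obj = -R.flatten()

# Balance: sum_a c(s', a) = sum_{s,a} P[a, s, s'] c(s, a) for every s',
# plus the normalization sum_{s,a} c(s, a) = 1.
A_eq = np.zeros((nS + 1, n))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = (s == sp) - P[a, s, sp]
A_eq[nS, :] = 1.0
b_eq = np.zeros(nS + 1)
b_eq[nS] = 1.0

res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
c = res.x.reshape(nS, nA)

# Recover the stationary mixed strategy: pi(a | s) = c(s, a) / sum_a c(s, a).
policy = c / c.sum(axis=1, keepdims=True)
```

Because the constraints are linear in c, the wider toolbox named in the synopsis (gradient-like procedures, regularization, Lagrange multipliers) applies to the same formulation; the plain LP above is simply its most direct member.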

✦ Table of Contents


Preface
Acknowledgements
Contents
1 Controllable Markov Chains
1.1 Finite Markov Chains
1.1.1 General Properties
1.1.2 Ergodic Markov Chains
1.1.3 Transition Equation and Invariant State Distribution
1.2 Controllable Markov Chain
1.2.1 Control Policy
1.2.2 Main Definition
1.3 Cost Functions for Markov Chains
1.3.1 Average Cost Function
1.3.2 Auxiliary c-Variables
1.4 Markov Decision Process as a Linear Programming Problem
1.4.1 Linear Programming Formulation
1.4.2 Realization of Stationary Strategies in the State Space
1.4.3 Numerical Example
References
2 Multiobjective Control
2.1 Introduction
2.1.1 Pareto Set
2.1.2 Parametrization of All Pareto Points
2.1.3 Utopia Point
2.2 Regularized Penalty Function Optimization Method (RPFOM)
2.2.1 Poly-Linear Optimization Problem Formulation
2.2.2 Penalty Functions Approach
2.2.3 Expected Property of RPFA
2.2.4 The Main Result on the Extremal Points of the Penalty Functions
2.2.5 Recurrent Algorithm
2.2.6 Special Selection of the Parameters
2.2.7 Numerical Example: Pareto Front Calculation
2.3 Portfolio Optimization Problem
2.3.1 Problem Formulation
2.3.2 Markowitz Function
2.3.3 Numerical Example: A Rational Investor
2.4 Appendix
References
3 Partially Observable Markov Chains
3.1 Introduction
3.1.1 Brief Review
3.2 Partially Observable Markov Chains
3.3 Formulation of the Problem
3.4 Description in the c-Variables
3.5 Computation of the Estimated Value by Measurable Realization …
3.5.1 Adaptive Algorithm
3.6 Numerical Example: A Partially Observable Markowitz Portfolio
References
4 Continuous-Time Markov Chains
4.1 Introduction
4.1.1 Related Work
4.2 Continuous-Time Markov Chains
4.3 Programming Solver for CTMDP
4.3.1 The c-Variable Method
4.3.2 Linear Programming Solver
4.4 Chemical Reaction Markov Models
4.4.1 Example 1. Formation of the Amidogen Radical
4.4.2 Example 2. Proton Transfer, Hydration and Tautomeric Reaction of Anthocyanin Pigments
References
5 Nash and Stackelberg Equilibrium
5.1 Optimization and Equilibrium
5.2 ε-Nash Equilibrium and Tanaka's Function
5.2.1 Individual Cost Function
5.2.2 Regularized Lagrange Function
5.2.3 Tanaka's Function
5.2.4 ASG Continuous-Time Algorithm
5.3 Extraproximal Method
5.3.1 Proximal Format
5.4 Numerical Example: Strong Nash Equilibrium in Pareto Front
5.4.1 Euler Approach
5.4.2 Numerical Data
5.5 The Stackelberg-Nash Equilibrium Concept
5.5.1 Specific Features
5.5.2 Individual Aims and Tanaka's Representation
5.5.3 Extraproximal Procedure
5.6 Convergence Analysis
5.6.1 Auxiliary Results
5.6.2 Main Convergence Theorem
5.7 Application Example: Four Supermarkets Chain
References
6 Best-Reply Strategies in Repeated Games
6.1 Introduction
6.2 Preliminaries
6.2.1 Controllable Markov Decision Process
6.2.2 Game Description
6.3 Problem Formulation
6.3.1 The State-Value Function
6.3.2 The Recursive Matrix Form
6.4 Construction of a Lyapunov-Like Function
6.4.1 Recurrent Form for the Cost Function
6.4.2 The Lyapunov Function Design
6.5 Examples
6.5.1 Example 1 (Banks Marketing Planning as Prisoner's Dilemma)
6.5.2 Example 2 (Duel Game)
References
7 Mechanism Design
7.1 Introduction
7.1.1 Brief Review
7.2 Description of the Model
7.3 Mechanism and Equilibrium
7.4 Reinforcement Learning Approach
7.5 Risk-Averse Agents Strategies in Contracting Problem
References
8 Joint Observer and Mechanism Design
8.1 Introduction
8.1.1 Brief Problem Analysis
8.1.2 Related Work
8.2 Markov Games
8.3 Problem Formulation
8.3.1 Initial Problem
8.3.2 Auxiliary Problem
8.4 Relation of Solutions for Initial and Auxiliary Problems
8.4.1 Ergodicity Condition
8.5 Reinforcement Learning Approach
8.5.1 Iterative Procedure
8.5.2 Learning Algorithm
8.6 Nash Equilibrium as a Solution of a Max-Min Problem
8.7 Application: Patrolling
8.7.1 Description of the Patrolling Problem
8.7.2 Solver
8.7.3 Random Walk Problem Formulation
8.7.4 Resulting Values
References
9 Bargaining Games or How to Negotiate
9.1 Introduction
9.1.1 Brief Review
9.1.2 Related Work
9.1.3 Nash Versus Kalai-Smorodinsky
9.2 Motivation
9.3 Preliminaries
9.4 The Nash Bargaining Model
9.5 The Kalai-Smorodinsky Bargaining Model
9.5.1 The n-Person Kalai-Smorodinsky Solution
9.6 The Bargaining Solver
9.6.1 The Nash Bargaining Solver
9.6.2 Kalai-Smorodinsky Solver
9.6.3 The Extraproximal Solver Method
9.7 The Model for the Disagreement Point
9.7.1 The Extraproximal Solver Method
9.8 Numerical Illustration
9.8.1 Computing the Disagreement Point
9.8.2 Computing the Nash Bargaining Solution
9.8.3 Computing the Kalai-Smorodinsky Bargaining Solution
References
10 Multi-traffic Signal-Control Synchronization
10.1 Introduction
10.1.1 Brief Review
10.1.2 Related Work on Traffic Control
10.2 Preliminaries
10.3 Nash Equilibrium
10.3.1 The Regularized Lagrange Principle Application
10.3.2 The Proximal Format
10.3.3 The Extraproximal Method
10.4 Traffic-Signal-Control Problem Formulation
10.4.1 Transition Matrix
10.4.2 Ergodicity
10.4.3 Cost Function
10.5 Gradient Solver
10.6 Application Example
References
11 Non-cooperative Bargaining with Unsophisticated Agents
11.1 Introduction
11.1.1 Related Literature
11.2 The Rubinstein's Alternating-Offers Model
11.3 Bargaining with Unsophisticated Players
11.4 An Extension to Continuous-Time Markov Chains
11.4.1 Solution Method
11.4.2 The Pareto Optimal Solution of the Bargaining Problem
11.4.3 The Non-cooperative Bargaining Solution
11.5 Numeric Simulations
11.5.1 Division of a Fixed Resource
11.6 Extensions
11.6.1 Bargaining Under Different Discounting
11.6.2 Bargaining with Collusive Behavior
11.7 Appendix: Proofs
11.7.1 The Non-cooperative Bargaining Game
11.7.2 Formulation of the Problem
11.7.3 Convergence Analysis
11.7.4 Convergence Conditions of δ and α
References
12 Transfer Pricing as Bargaining
12.1 Introduction
12.1.1 Transfer Pricing Process
12.1.2 Brief Review
12.2 Preliminaries
12.2.1 Nash's Bargaining
12.2.2 Continuous-Time Bargaining
12.3 Transfer Pricing
12.4 The Transfer Pricing Nash Bargaining Solution
12.5 Transfer Price Bargaining Solver with Additional Constraints
12.5.1 Numerical Example for Nash's Bargaining Transfer Pricing
12.6 Continuous-Time Transfer Pricing
12.6.1 Revenue of a Passenger Between Members of an Airline Alliance
12.7 Extensions
12.7.1 Bargaining Under Different Discounting
12.7.2 Bargaining with Collusive Behavior
References
Index


πŸ“œ SIMILAR VOLUMES



Modeling, Control and Optimization of Wa
✍ Thomas Rauschenbach (ed.) πŸ“‚ Library πŸ“… 2015 πŸ› Springer 🌐 English

Contributors: Thomas Bernard, Albrecht Gnauck, Marco Jacobi, Divas Karimanzira, Oliver Krol, Torsten Pfützenreuter, Buren Scharaw, Thomas Westerhoff. This book provides essential background knowledge on the development of model-based real-world solutions in the field of control and decision mak

Advanced Simulation-Based Methods for Op
✍ Denis Belomestny, John Schoenmakers (auth.) πŸ“‚ Library πŸ“… 2018 πŸ› Palgrave Macmillan UK 🌐 English

This is an advanced guide to optimal stopping and control, focusing on advanced Monte Carlo simulation and its application to finance. Written for quantitative finance practitioners and researchers in academia, the book looks at the classical simulation based algorithms before introducing some of

Advanced Simulation-Based Methods for Op
✍ Denis Belomestny; John Schoenmakers πŸ“‚ Library πŸ“… 2018 πŸ› Springer 🌐 English

This is an advanced guide to optimal stopping and control, focusing on advanced Monte Carlo simulation and its application to finance. Written for quantitative finance practitioners and researchers in academia, the book looks at the classical simulation based algorithms before introducing some of th

Towards Analytical Techniques for System
✍ Griselda Acosta, Eric Smith, Vladik Kreinovich πŸ“‚ Library πŸ“… 2020 πŸ› Springer 🌐 English

This book is intended for specialists in systems engineering interested in new, general techniques and for students and practitioners interested in using these techniques for solving specific practical problems. For many real-world, complex systems, it is possible to create easy-to-compute exp