Computer Vision is a rapidly growing field of research investigating computational and algorithmic issues associated with image acquisition, processing, and understanding. It serves tasks like manipulation, recognition, mobility, and communication in diverse application areas such as manufacturin
Foundations of Computer Vision
Scribed by Antonio Torralba; Phillip Isola; William T. Freeman
- Publisher: MIT Press
- Year: 2024
- Tongue: English
- Leaves: 240
- Category: Library
Synopsis
An accessible, authoritative, and up-to-date computer vision textbook offering a comprehensive introduction to the foundations of the field that incorporates the latest deep learning advances.
Machine learning has revolutionized computer vision, but the methods of today have deep roots in the history of the field. Providing a much-needed modern treatment, this accessible and up-to-date textbook comprehensively introduces the foundations of computer vision while incorporating the latest deep learning advances. Taking a holistic approach that goes beyond machine learning, it addresses fundamental issues in the task of vision and the relationship of machine vision to human perception. Foundations of Computer Vision covers topics not standard in other texts, including transformers, diffusion models, statistical image models, issues of fairness and ethics, and the research process. To emphasize intuitive learning, concepts are presented in short, lucid chapters alongside extensive illustrations, questions, and examples. Written by leaders in the field and honed by a decade of classroom experience, this engaging and highly teachable book offers an essential next-generation view of computer vision.
- Up-to-date treatment integrates classic computer vision and deep learning
- Accessible approach emphasizes fundamentals and assumes little background knowledge
- Student-friendly presentation features extensive examples and images
- Proven in the classroom
- Instructor resources include slides, solutions, and source code
Table of Contents
Cover
Contents
Preface
Notation
1: The Challenge of Vision
I: FOUNDATIONS
2: A Simple Vision System
3: Looking at Images
4: Computer Vision and Society
II: IMAGE FORMATION
5: Imaging
6: Lenses
7: Cameras as Linear Systems
8: Color
III: FOUNDATIONS OF LEARNING
9: Introduction to Learning
10: Gradient-Based Learning Algorithms
11: The Problem of Generalization
12: Neural Networks
13: Neural Networks as Distribution Transformers
14: Backpropagation
IV: FOUNDATIONS OF IMAGE PROCESSING
15: Linear Image Filtering
16: Fourier Analysis
V: LINEAR FILTERS
17: Blur Filters
18: Image Derivatives
19: Temporal Filters
VI: SAMPLING AND MULTISCALE IMAGE REPRESENTATIONS
20: Image Sampling and Aliasing
21: Downsampling and Upsampling Images
22: Filter Banks
23: Image Pyramids
VII: NEURAL ARCHITECTURES FOR VISION
24: Convolutional Neural Nets
25: Recurrent Neural Nets
26: Transformers
VIII: PROBABILISTIC MODELS OF IMAGES
27: Statistical Image Models
28: Textures
29: Probabilistic Graphical Models
IX: GENERATIVE IMAGE MODELS AND REPRESENTATION LEARNING
30: Representation Learning
31: Perceptual Grouping
32: Generative Models
33: Generative Modeling Meets Representation Learning
34: Conditional Generative Models
X: CHALLENGES IN LEARNING-BASED VISION
35: Data Bias and Shift
36: Training for Robustness and Generality
37: Transfer Learning and Adaptation
XI: UNDERSTANDING GEOMETRY
38: Representing Images and Geometry
39: Camera Modeling and Calibration
40: Stereo Vision
41: Homographies
42: Single View Metrology
43: Learning to Estimate Depth from a Single Image
44: Multiview Geometry and Structure from Motion
45: Radiance Fields
XII: UNDERSTANDING MOTION
46: Motion Estimation
47: 3D Motion and Its 2D Projection
48: Optical Flow Estimation
49: Learning to Estimate Motion
XIII: UNDERSTANDING VISION WITH LANGUAGE
50: Object Recognition
51: Vision and Language
XIV: ON RESEARCH, WRITING AND SPEAKING
52: How to Do Research
53: How to Write Papers
54: How to Give Talks
XV: CLOSING REMARKS
55: A Simple Vision System - Revisited
Bibliography
Index
SIMILAR VOLUMES
Few developments have influenced the field of computer vision in the last decade more than the introduction of statistical machine learning techniques. Particularly kernel-based classifiers, such as the support vector machine, have become indispensable tools, providing a unified framework for solvin
The remarkable progress in computer vision over the last few years is, by and large, attributed to deep learning, fueled by the availability of huge sets of labeled data, and paired with the explosive growth of the GPU paradigm. While subscribing to this view, this work criticizes the suppo
This book introduces the fundamentals of computer vision (CV), with a focus on extracting useful information from digital images and videos. Including a wealth of methods used in detecting and classifying image objects and their shapes, it is the first book to apply a trio of tools (computational