Multimodal Processing and Interaction: Audio, Video, Text
Scribed by Petros Maragos, Patrick Gros, Athanassios Katsamanis, George Papandreou (auth.), Petros Maragos, Alexandros Potamianos, Patrick Gros (eds.)
- Publisher: Springer US
- Year: 2008
- Tongue: English
- Leaves: 379
- Series: Multimedia Systems and Applications 33
- Edition: 1
- Category: Library
Synopsis
Multimodal Processing and Interaction: Audio, Video and Text presents high-quality, state-of-the-art research ideas and results from theoretic, algorithmic and application viewpoints. This edited volume contains both state-of-the-art reviews and original contributions by leading experts in the scientific and technological field of multimedia. It grew out of a four-year collaboration among research groups participating in the European Network of Excellence on Multimedia Understanding, Semantics, Computation and Learning (MUSCLE).
Multimodal Processing and Interaction: Audio, Video and Text covers a broad spectrum of novel perspectives, analytic tools, algorithms, design practices and applications in multimedia science and engineering with emphasis on multimodal integration and modality fusion. This volume also contains contributions in the area of interaction with multimedia, especially multimodal interfaces for accessing multimedia content.
Multimodal Processing and Interaction: Audio, Video and Text is designed for a professional audience composed of practitioners and researchers in industry and academia. This book is suitable for advanced-level students in computer science and engineering as well.
Table of Contents
Front Matter
Front Matter
Cross-Modal Integration for Performance Improving in Multimedia: A Review
Human-Computer Interfaces to Multimedia Content: A Review
Front Matter
Stochastic Models for Multimodal Video Analysis
Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition
Action Recognition in Multimedia Streams
Surveillance Using Both Video and Audio
Movie Analysis with Emphasis to Dialogue and Action Scene Detection
Audiovisual Attention Modeling and Salient Event Detection
Toward the Integration of Natural Language Processing and Automatic Speech Recognition: Using Morpho-Syntax and Pragmatics for Transcription
Front Matter
Interactive Image Retrieval Using a Hybrid Visual and Conceptual Content Representation
Multimodal Analysis of Text and Audio Features for Music Information Retrieval
Intelligent Search for Image Information on the Web through Text and Link Structure Analysis
Front Matter
Design Principles for Multimodal Spoken Dialogue Systems
Eye Tracking: A New Interface for Visual Exploration
User Interaction for Mobile Devices
Back Matter
Subjects
Multimedia Information Systems; Computer Imaging, Vision, Pattern Recognition and Graphics; Biometrics; Information Storage and Retrieval; Artificial Intelligence (incl. Robotics); Computer Communication Networks
SIMILAR VOLUMES
Traditional Pattern Recognition (PR) and Computer Vision (CV) technologies have mainly focused on full automation, even though full automation often proves elusive or unnatural in many applications, where the technology is expected to assist rather than replace the human agents. However, not all
This book presents an interactive multimodal approach for efficient transcription of handwritten text images. This approach, rather than full automation, assists the expert in the recognition and transcription process. Until now, handwritten text recognition (HTR) systems are far from being perfect a
The Real-time Transport Protocol (RTP) provides a framework for delivery of audio and video across IP networks with unprecedented quality and reliability. In RTP: Audio and Video for the Internet, Colin Perkins, a leader of the RTP standardization process in the IETF, offers readers detailed technic
Multimodality is a fast-growing interdisciplinary approach that aims to analyze the interplay of multiple modes such as gaze, gesture or spoken language that are utilized in interaction, and to examine the multimodal production and consumption of communicated messages. This Reader provides a compreh