Large Language Model-Based Solutions: How to Deliver Value with Cost-Effective Generative AI Applications
Scribed by Shreyas Subramanian
- Publisher: WILEY
- Year: 2024
- Tongue: English
- Leaves: 224
- Category: Library
Synopsis
Learn to build cost-effective apps using Large Language Models
In Large Language Model-Based Solutions: How to Deliver Value with Cost-Effective Generative AI Applications, Shreyas Subramanian, Principal Data Scientist at Amazon Web Services, delivers a practical guide for developers and data scientists who wish to build and deploy cost-effective large language model (LLM)-based solutions. The book covers a wide range of key topics, including how to select a model, pre- and post-processing of data, prompt engineering, and instruction fine-tuning.
The author also covers techniques for optimizing inference, such as model quantization and pruning, and surveys affordable architectures for typical generative AI (GenAI) applications, including search systems, agent assists, and autonomous agents.
Table of Contents
Cover
Table of Contents
Title Page
Introduction
GenAI APPLICATIONS AND LARGE LANGUAGE MODELS
IMPORTANCE OF COST OPTIMIZATION
MICRO CASE STUDIES
WHO IS THIS BOOK FOR?
SUMMARY
1 Introduction
OVERVIEW OF GenAI APPLICATIONS AND LARGE LANGUAGE MODELS
PATHS TO PRODUCTIONIZING GenAI APPLICATIONS
THE IMPORTANCE OF COST OPTIMIZATION
SUMMARY
2 Tuning Techniques for Cost Optimization
FINE-TUNING AND CUSTOMIZABILITY
PARAMETER-EFFICIENT FINE-TUNING METHODS
COST AND PERFORMANCE IMPLICATIONS OF PEFT METHODS
SUMMARY
3 Inference Techniques for Cost Optimization
INTRODUCTION TO INFERENCE TECHNIQUES
PROMPT ENGINEERING
CACHING WITH VECTOR STORES
CHAINS FOR LONG DOCUMENTS
SUMMARIZATION
BATCH PROMPTING FOR EFFICIENT INFERENCE
MODEL OPTIMIZATION METHODS
PARAMETER-EFFICIENT FINE-TUNING METHODS
COST AND PERFORMANCE IMPLICATIONS
SUMMARY
REFERENCES
4 Model Selection and Alternatives
INTRODUCTION TO MODEL SELECTION
MOTIVATING EXAMPLE: THE TALE OF TWO MODELS
THE ROLE OF COMPACT AND NIMBLE MODELS
EXAMPLES OF SUCCESSFUL SMALLER MODELS
DOMAIN-SPECIFIC MODELS
THE POWER OF PROMPTING WITH GENERAL-PURPOSE MODELS
SUMMARY
5 Infrastructure and Deployment Tuning Strategies
INTRODUCTION TO TUNING STRATEGIES
HARDWARE UTILIZATION AND BATCH TUNING
INFERENCE ACCELERATION TOOLS
MONITORING AND OBSERVABILITY
SUMMARY
CONCLUSION
BALANCING PERFORMANCE AND COST
FUTURE TRENDS IN GenAI APPLICATIONS
SUMMARY
INDEX
Copyright
Dedication
ABOUT THE AUTHOR
ABOUT THE TECHNICAL EDITOR
End User License Agreement
SIMILAR VOLUMES
Learn to build cost-effective apps using Large Language Models. In Large Language Model-Based Solutions: How to Deliver Value with Cost-Effective Generative AI Applications, Principal Data Scientist at Amazon Web Services, Shreyas Subramanian, del
Learn to unleash the power of AI creativity. KEY FEATURES: Understand the core concepts related to generative AI; different types of generative models and their applications; learn how to design generative AI neural networks using Python and TensorFlow. DESCRIPTION: This book researches the intri
This book is a guide to productionizing AI solutions using best-of-breed cloud services with workarounds to lower costs. Supplemented with step-by-step instructions covering data import through wrangling to partitioning and modeling through to inference and deployment, and augmented with pl