Managing Cloud Native Data on Kubernetes: Architecting Cloud Native Data Services Using Open Source Technology

✍ Scribed by Jeff Carpenter, Patrick McFadin

Publisher: O'Reilly Media
Year: 2023
Language: English
Pages: 331
Edition: 1
Category: Library


✦ Synopsis

Is Kubernetes ready for stateful workloads? This open source system has become the primary platform for deploying and managing cloud native applications. But because it was originally designed for stateless workloads, working with data on Kubernetes has been challenging. If you want to avoid the inefficiencies and duplicative costs of having separate infrastructure for applications and data, this practical guide can help.

Using Kubernetes as your platform, you'll learn open source technologies that are designed and built for the cloud. Authors Jeff Carpenter and Patrick McFadin provide case studies to help you explore new use cases and avoid the pitfalls others have faced. You'll get an insider's view of what's coming from innovators who are creating next-generation architectures and infrastructure.

With this book, you will:
• Learn how to use basic Kubernetes resources to compose data infrastructure
• Automate the deployment and operations of data infrastructure on Kubernetes using tools like Helm and operators
• Evaluate and select data infrastructure technologies for use in your applications
• Integrate data infrastructure technologies into your overall stack
• Explore emerging technologies that will enhance your Kubernetes-based applications in the future

✦ Table of Contents

Cover
Copyright
Table of Contents
Foreword
Preface
Why We Wrote This Book
Who Is This Book For?
How to Read This Book
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. Introduction to Cloud Native Data Infrastructure: Persistence, Streaming, and Batch Analytics
Infrastructure Types
What Is Cloud Native Data?
More Infrastructure, More Problems
Kubernetes Leading the Way
Managing Compute on Kubernetes
Managing Network on Kubernetes
Managing Storage on Kubernetes
Cloud Native Data Components
Looking Forward
Getting Ready for the Revolution
Adopt an SRE Mindset
Embrace Distributed Computing
Principles of Cloud Native Data Infrastructure
Summary
Chapter 2. Managing Data Storage on Kubernetes
Docker, Containers, and State
Managing State in Docker
Bind Mounts
Volumes
Tmpfs Mounts
Volume Drivers
Kubernetes Resources for Data Storage
Pods and Volumes
PersistentVolumes
PersistentVolumeClaims
StorageClasses
Kubernetes Storage Architecture
Flexvolume
Container Storage Interface
Container Attached Storage
Container Object Storage Interface
Summary
Chapter 3. Databases on Kubernetes the Hard Way
The Hard Way
Prerequisites for Running Data Infrastructure on Kubernetes
Running MySQL on Kubernetes
ReplicaSets
Deployments
Services
Accessing MySQL
Running Apache Cassandra on Kubernetes
StatefulSets
Accessing Cassandra
Summary
Chapter 4. Automating Database Deployment on Kubernetes with Helm
Deploying Applications with Helm Charts
Using Helm to Deploy MySQL
How Helm Works
Labels
ServiceAccounts
Secrets
ConfigMaps
Updating Helm Charts
Uninstalling Helm Charts
Using Helm to Deploy Apache Cassandra
Affinity and Anti-Affinity
Helm, CI/CD, and Operations
Summary
Chapter 5. Automating Database Management on Kubernetes with Operators
Extending the Kubernetes Control Plane
Extending Kubernetes Clients
Extending Kubernetes Control Plane Components
Extending Kubernetes Worker Node Components
The Operator Pattern
Controllers
Custom Resources
Operators
Managing MySQL in Kubernetes Using the Vitess Operator
Vitess Overview
PlanetScale Vitess Operator
A Growing Ecosystem of Operators
Choosing Operators
Building Operators
Summary
Chapter 6. Integrating Data Infrastructure in a Kubernetes Stack
K8ssandra: Production-Ready Cassandra on Kubernetes
K8ssandra Architecture
Installing the K8ssandra Operator
Creating a K8ssandraCluster
Managing Cassandra in Kubernetes with Cass Operator
Enabling Developer Productivity with Stargate APIs
Unified Monitoring Infrastructure with Prometheus and Grafana
Performing Repairs with Cassandra Reaper
Backing Up and Restoring Data with Cassandra Medusa
Creating a Backup
Restoring from Backup
Deploying Multicluster Applications in Kubernetes
Summary
Chapter 7. The Kubernetes Native Database
Why a Kubernetes Native Approach Is Needed
Hybrid Data Access at Scale with TiDB
TiDB Architecture
Deploying TiDB in Kubernetes
Serverless Cassandra with DataStax Astra DB
What to Look for in a Kubernetes Native Database
Basic Requirements
The Future of Kubernetes Native
Summary
Chapter 8. Streaming Data on Kubernetes
Introduction to Streaming
Types of Delivery
Delivery Guarantees
Feature Scope
The Role of Streaming in Kubernetes
Streaming on Kubernetes with Apache Pulsar
Preparing Your Environment
Securing Communications by Default with cert-manager
Using Helm to Deploy Apache Pulsar
Stream Analytics with Apache Flink
Deploying Apache Flink on Kubernetes
Summary
Chapter 9. Data Analytics on Kubernetes
Introduction to Analytics
Deploying Analytic Workloads in Kubernetes
Introduction to Apache Spark
Deploying Apache Spark in Kubernetes
Build Your Custom Container
Submit and Run Your Application
Kubernetes Operator for Apache Spark
Alternative Schedulers for Kubernetes
Apache YuniKorn
Volcano
Analytic Engines for Kubernetes
Dask
Ray
Summary
Chapter 10. Machine Learning and Other Emerging Use Cases
The Cloud Native AI/ML Stack
AI/ML Definitions
Defining an AI/ML Stack
Real-Time Model Serving with KServe
Full Lifecycle Feature Management with Feast
Vector Similarity Search with Milvus
Efficient Data Movement with Apache Arrow
Versioned Object Storage with lakeFS
Summary
Chapter 11. Migrating Data Workloads to Kubernetes
The Vision: Application-Aware Platforms
Charting Your Path to Success
People
Technology
Process
The Future of Cloud Native Data
Summary
Index
About the Authors
Colophon

✦ Subjects

Machine Learning; Analytics; Databases; Docker; Kubernetes; Persistence; Helm; Data Management; Streaming

📜 SIMILAR VOLUMES

Managing Cloud Native Data on Kubernetes

📁 Managing Cloud Native Data on Kubernetes: Architecting Cloud Native Data Services Using Open Source Technology

✍ Jeff Carpenter 📂 Library 📅 2023 🏛 O'Reilly Media 🌐 English

Is Kubernetes ready for stateful workloads? This open source system has become the primary platform for deploying and managing cloud native applications. But because it was originally designed for stateless workloads, working with data on Kubernetes has been challenging. If you want to avoid the

Managing Cloud Native Data on Kubernetes

📁 Managing Cloud Native Data on Kubernetes: Architecting Cloud Native Data Services Using Open Source Technology (Final)

✍ Jeff Carpenter and Patrick McFadin 📂 Library 📅 2023 🏛 O'Reilly Media, Inc. 🌐 English

Managing Cloud Native Data on Kubernetes

📁 Managing Cloud Native Data on Kubernetes

✍ Jeff Carpenter; Patrick McFadin 📂 Library 📅 2023 🏛 O'Reilly Media, Inc. 🌐 English

Architecting Cloud-Native Serverless Sol

📁 Architecting Cloud-Native Serverless Solutions: Design, build, and operate serverless solutions on cloud and open source platforms

✍ Safeer CM 📂 Library 📅 2023 🏛 Packt Publishing 🌐 English

Get up and running with serverless workloads across AWS, Azure, GCP, Kubernetes, and virtual machines with real-life examples and best practices for design, development, and security of serverless applications. Purchase of the print or Kindle book includes a free PDF eBook.
