๐”– Scriptorium
โœฆ   LIBER   โœฆ

๐Ÿ“

Kafka Streams in Action, Second Edition

โœ Scribed by Bill Bejeck;


Publisher: Simon & Schuster
Year: 2024
Tongue: English
Leaves: 506
Edition: 2
Category: Library


✦ Synopsis


Kafka Streams in Action, Second Edition teaches you how to create event streaming applications on the amazing Apache Kafka platform. This thoroughly revised new edition now covers a wider range of streaming architectures and includes data integration with Kafka Connect. As you go, you'll explore real-world examples that introduce components and brokers, schema management, and the other essentials. Along the way, you'll pick up practical techniques for blending Kafka with Spring, low-level control of processors and state stores, storing event data with ksqlDB, and testing streaming applications.

✦ Table of Contents


copyright
contents
Praise for the first edition
Kafka Streams in Action
foreword
preface
acknowledgments
about this book
about the author
about the cover illustration
Part 1
1 Welcome to the Kafka event streaming platform
1.1 Event streaming
1.2 What is an event?
1.3 An event stream example
1.4 Introducing the Apache Kafka event streaming platform
1.4.1 Kafka brokers
1.4.2 Schema Registry
1.4.3 Producer and consumer clients
1.4.4 Kafka Connect
1.4.5 Kafka Streams
1.4.6 ksqlDB
1.5 A concrete example of applying the Kafka event streaming platform
2 Kafka brokers
2.1 Introducing Kafka brokers
2.2 Produce requests
2.3 Fetch requests
2.4 Topics and partitions
2.4.1 Offsets
2.4.2 Determining the correct number of partitions
2.5 Sending your first messages
2.5.1 Creating a topic
2.5.2 Producing records on the command line
2.5.3 Consuming records from the command line
2.5.4 Partitions in action
2.6 Segments
2.6.1 Data retention
2.6.2 Compacted topics
2.6.3 Topic partition directory contents
2.7 Tiered storage
2.8 Cluster metadata
2.9 Leaders and followers
2.9.1 Replication
2.10 Checking for a healthy broker
2.10.1 Request handler idle percentage
2.10.2 Network handler idle percentage
2.10.3 Underreplicated partitions
Part 2
3 Schema Registry
3.1 Objects
3.2 What is a schema, and why do you need one?
3.2.1 What is Schema Registry?
3.2.2 Getting Schema Registry
3.2.3 Architecture
3.2.4 Communication: Using Schema Registry's REST API
3.2.5 Registering a schema
3.2.6 Plugins and serialization platform tools
3.2.7 Uploading a schema file
3.2.8 Generating code from schemas
3.2.9 End-to-end example
3.3 Subject name strategies
3.3.1 TopicNameStrategy
3.3.2 RecordNameStrategy
3.3.3 TopicRecordNameStrategy
3.4 Schema compatibility
3.4.1 Backward compatibility
3.4.2 Forward compatibility
3.4.3 Full compatibility
3.4.4 No compatibility
3.5 Schema references
3.6 Schema references and multiple events per topic
3.7 Schema Registry (de)serializers
3.7.1 Avro serializers and deserializers
3.7.2 Protobuf
3.7.3 JSON Schema
3.8 Serialization without Schema Registry
4 Kafka clients
4.1 Introducing Kafka clients
4.2 Producing records with the KafkaProducer
4.2.1 Producer configurations
4.2.2 Kafka delivery semantics
4.2.3 Partition assignment
4.2.4 Writing a custom partitioner
4.2.5 Specifying a custom partitioner
4.2.6 Timestamps
4.3 Consuming records with the KafkaConsumer
4.3.1 The poll interval
4.3.2 The group.id configuration
4.3.3 Applying partition assignment strategies
4.3.4 Static membership
4.3.5 Committing offsets
4.4 Exactly-once delivery in Kafka
4.4.1 The idempotent producer
4.4.2 Transactional producer
4.4.3 Consumers in transactions
4.4.4 Producers and consumers within a transaction
4.5 Using the Admin API for programmatic topic management
4.6 Handling multiple event types in a single topic
4.6.1 Producing multiple event types
4.6.2 Consuming multiple event types
5 Kafka Connect
5.1 An introduction to Kafka Connect
5.2 Integrating external applications into Kafka
5.3 Getting started with Kafka Connect
5.4 Applying Single Message Transforms
5.5 Adding a sink connector
5.6 Building and deploying your own connector
5.6.1 Implementing a connector
5.6.2 Making your connector dynamic with a monitoring thread
5.6.3 Creating a custom transformation
Part 3
6 Developing Kafka Streams
6.1 A look at Kafka Streams
6.2 Kafka Streams DSL
6.3 Hello World for Kafka Streams
6.3.1 Creating the topology for the Yelling app
6.3.2 Kafka Streams configuration
6.3.3 Serde creation
6.4 Masking credit card numbers and tracking purchase rewards in a retail sales setting
6.4.1 Building the source node and the masking processor
6.4.2 Adding the purchase-patterns processor
6.4.3 Building the rewards processor
6.4.4 Using Serdes to encapsulate serializers and deserializers in Kafka Streams
6.4.5 Kafka Streams and Schema Registry
6.5 Interactive development
6.6 Choosing which events to process
6.6.1 Filtering purchases
6.6.2 Splitting/branching the stream
6.6.3 Naming topology nodes
6.6.4 Dynamic routing of messages
7 Streams and state
7.1 Stateful vs. stateless
7.2 Adding stateful operations to Kafka Streams
7.2.1 Group-by details
7.2.2 Aggregation vs. reducing
7.2.3 Repartitioning the data
7.2.4 Proactive repartitioning
7.2.5 Repartitioning to increase the number of tasks
7.2.6 Using Kafka Streams optimizations
7.3 Stream-stream joins
7.3.1 Implementing a stream-stream join
7.3.2 Join internals
7.3.3 ValueJoiner
7.3.4 JoinWindows
7.3.5 Co-partitioning
7.3.6 StreamJoined
7.3.7 Other join options
7.3.8 Outer joins
7.3.9 Left-outer join
7.4 State stores in Kafka Streams
7.4.1 Changelog topics restoring state stores
7.4.2 Standby tasks
7.4.3 Assigning state stores in Kafka Streams
7.4.4 State stores' location on the filesystem
7.4.5 Naming stateful operations
7.4.6 Specifying a store type
7.4.7 Configuring changelog topics
8 The KTable API
8.1 KTable: The update stream
8.1.1 Updates to records or the changelog
8.1.2 KStream and KTable API in action
8.2 KTables are stateful
8.3 The KTable API
8.4 KTable aggregations
8.5 GlobalKTable
8.6 Table joins
8.6.1 Stream–table join details
8.6.2 Versioned KTables
8.6.3 Stream–global table join details
8.6.4 Table–table join details
9 Windowing and timestamps
9.1 Understanding the role of windows and the different types
9.1.1 Hopping windows
9.1.2 Tumbling windows
9.1.3 Session windows
9.1.4 Sliding windows
9.1.5 Window time alignment
9.1.6 Retrieving window results for analysis
9.2 Handling out-of-order data with grace (literally)
9.3 Final windowed results
9.3.1 Strict buffering
9.3.2 Eager buffering
9.4 Timestamps in Kafka Streams
9.5 The TimestampExtractor
9.5.1 WallclockTimestampExtractor: the System.currentTimeMillis() method
9.5.2 Custom TimestampExtractor
9.5.3 Specifying a TimestampExtractor
9.6 Stream time
10 The Processor API
10.1 Working with sources, processors, and sinks to create a topology
10.1.1 Adding a source node
10.1.2 Adding a processor node
10.1.3 Adding a sink node
10.2 Digging deeper into the Processor API with a stock analysis processor
10.2.1 The stock-performance processor application
10.2.2 Punctuation semantics
10.2.3 The process() method
10.2.4 The punctuator execution
10.3 Data-driven aggregation
10.4 Integrating the Processor API and the Kafka Streams API
11 ksqlDB
11.1 Understanding ksqlDB
11.2 More about streaming queries
11.3 Persistent vs. push vs. pull queries
11.4 Creating Streams and Tables
11.5 Schema Registry integration
11.6 ksqlDB advanced features
12 Spring Kafka
12.1 Introducing Spring
12.2 Using Spring to build Kafka-enabled applications
12.2.1 Spring Kafka application components
12.2.2 Enhanced application requirements
12.3 Spring Kafka Streams
13 Kafka Streams Interactive Queries
13.1 Kafka Streams and information sharing
13.2 Learning about Interactive Queries
13.2.1 Building an Interactive Queries app with Spring Boot
14 Testing
14.1 Understanding the difference between unit and integration testing
14.1.1 Testing Kafka producers and consumers
14.1.2 Creating tests for Kafka Streams operators
14.1.3 Writing tests for a Kafka Streams topology
14.1.4 Testing more complex Kafka Streams applications
14.1.5 Developing effective integration tests
appendix A Schema compatibility workshop
A.1 Backward compatibility
A.2 Forward compatibility
A.3 Full compatibility
appendix B Confluent resources
B.1 Confluent Cloud
B.2 Confluent command-line interface
B.3 Confluent local
appendix C Working with Avro, Protobuf, and JSON Schema
C.1 Apache Avro
C.1.1 Default and alias
C.1.2 Union
C.2 Protocol Buffers
C.2.1 Complex messages
C.2.2 Importing
C.2.3 Oneof type
C.2.4 Code generation
C.2.5 Specific and dynamic types
C.3 JSON Schema
C.3.1 Nested objects
C.3.2 JSON references
C.3.3 JSON Schema Registry schema references
C.3.4 JSON Schema code generation
C.3.5 Specific and generic types
appendix D Understanding Kafka Streams architecture
D.1 High-level view
D.2 Consumer and producer clients in Kafka Streams
D.3 Assigning, distributing, and processing events
D.4 Threads in Kafka Streams: StreamThread
D.5 Processing records


📜 SIMILAR VOLUMES


Kafka Streams in Action, Second Edition
โœ Bill Bejeck ๐Ÿ“‚ Library ๐Ÿ“… 2023 ๐Ÿ› Manning Publications ๐ŸŒ English

Everything you need to implement stream processing on Apache Kafka® using Kafka Streams and the ksqlDB event streaming database. This totally revised new edition of Kafka Streams in Action has been expanded to cover more of the Kafka platform used for building event-based applications. You'll als…

Kafka Streams in Action: Real-time apps
โœ Bill Bejeck ๐Ÿ“‚ Library ๐Ÿ“… 2018 ๐Ÿ› Manning Publications ๐ŸŒ English

Kafka Streams is a library designed to allow for easy stream processing of data flowing into a Kafka cluster. Stream processing has become one of the biggest needs for companies over the last few years as quick data insight becomes more and more important, but current solutions can be complex and lar…

Pega Stream Events In Action: Real-time
โœ Nikhil Garge ๐Ÿ“‚ Library ๐Ÿ“… 2021 ๐Ÿ› Independently Published ๐ŸŒ English

Streaming events to Kafka is common in today's information technology world, as data flows in and out of systems in industries like banking, healthcare, CRM, and sales. Key aspects of information technology include data analytics, data cleansing, and real-time data monitoring. This bo…

Kafka Streams in Action: Event-driven ap
โœ Bill Bejeck ๐Ÿ“‚ Library ๐Ÿ“… 2024 ๐Ÿ› Manning ๐ŸŒ English

Everything you need to implement stream processing on Apache Kafka using Kafka Streams and the ksqlDB event streaming database. Kafka Streams in Action, Second Edition guides you through setting up and maintaining your stream processing with Kafka. Inside, you'll find comprehensive coverage of…