Streams for Apache Kafka
What is Streams for Apache Kafka?
Streams for Apache Kafka is a self-managed distribution of Apache Kafka designed to simplify installation, configuration, and management on Red Hat OpenShift.
It is based on the open source Strimzi project (a CNCF incubating project) and provides:
- Container images for Apache Kafka
- Operators to manage clusters, topics, and users
- HTTP Bridge for Apache Kafka
- Console UI (StreamsHub)
Apache Kafka — Fundamentals
Apache Kafka is a distributed data streaming system built on a publish-subscribe model.
Key characteristics:
- Horizontal scalability
- Fault tolerance
- Immutable data (append-only log)
- Open source (Apache License 2.0)
Use cases:
- Real-time recommendations
- IoT applications
- Data gathering for AI
- Change Data Capture (CDC)
Strimzi Operators
The Cluster Operator watches the Kafka custom resource (CR) and reconciles the running cluster toward the desired state it declares. The Topic and User Operators run as containers inside the Entity Operator pod.
KRaft deployment (without ZooKeeper)
Starting with Streams for Apache Kafka 3.x, cluster metadata is stored inside Kafka itself using KRaft, which removes the ZooKeeper dependency.

    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: cdc-cluster
      namespace: kafka-cdc
    spec:
      kafka:
        version: "4.0.0"
        # With KRaft, node count and storage are normally declared in KafkaNodePool
        # resources; recent Strimzi releases may ignore replicas and storage set here.
        replicas: 3
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
          default.replication.factor: 3
          min.insync.replicas: 2
        storage:
          type: persistent-claim
          size: 10Gi
          deleteClaim: false
        metricsConfig:
          type: jmxPrometheusExporter
          valueFrom:
            configMapKeyRef:
              name: kafka-metrics-config
              key: kafka-metrics-config.yml
      entityOperator:
        topicOperator: {}
        userOperator: {}
      kafkaExporter:
        topicRegex: ".*"
        groupRegex: ".*"
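In KRaft mode, Strimzi expects the Kafka nodes themselves to be described by one or more KafkaNodePool resources that reference the Kafka resource by label. The following is a minimal sketch, not taken from the source: the pool name `dual-role` is hypothetical, and the replica count and storage values simply mirror the Kafka resource above. Older Strimzi releases may also require annotations on the Kafka resource to enable node pools and KRaft, so check the documentation for your version.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role                      # hypothetical pool name
  namespace: kafka-cdc
  labels:
    strimzi.io/cluster: cdc-cluster    # binds the pool to the Kafka resource
spec:
  replicas: 3
  roles:                               # combined mode: each node is controller and broker
    - controller
    - broker
  storage:
    type: persistent-claim
    size: 10Gi
    deleteClaim: false
```

Combined roles keep a small demo cluster compact. Larger clusters usually separate controllers from brokers.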
How it Works
KRaft consensus (without ZooKeeper)
KRaft (Kafka Raft) became production-ready in Kafka 3.3, and Kafka 4.0 removed ZooKeeper entirely. Cluster metadata is now managed inside Kafka itself:
- Each node has a role: controller (metadata management), broker (data storage and client serving), or both in small clusters (combined mode).
- Controllers form a quorum using Raft. One controller is elected as the active controller, and it is the only one that writes metadata.
- When a topic is created or a partition is reassigned, the active controller writes the change to an internal log (`__cluster_metadata`), which is replicated to the other controllers.
- Brokers obtain metadata by fetching this log from the active controller and replaying it locally. The controller no longer pushes metadata updates to each broker, as it did in ZooKeeper mode.
- If the active controller fails, the remaining controllers elect a new leader via Raft. Failover is fast because the standby controllers already hold the replicated metadata log and do not need to reload state, as a new controller had to do with ZooKeeper.
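The controller and broker roles described above map onto the `roles` field of Strimzi's KafkaNodePool resource. As an alternative to combined mode, the sketch below separates the two roles into dedicated pools. It is an illustration under assumptions: the pool names, replica counts, and storage sizes are hypothetical and not taken from the source.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controllers                    # hypothetical pool name
  namespace: kafka-cdc
  labels:
    strimzi.io/cluster: cdc-cluster
spec:
  replicas: 3                          # odd number so the Raft quorum can form a majority
  roles:
    - controller
  storage:
    type: persistent-claim
    size: 5Gi                          # illustrative; metadata needs far less space than data
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: brokers                        # hypothetical pool name
  namespace: kafka-cdc
  labels:
    strimzi.io/cluster: cdc-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: persistent-claim
    size: 10Gi
```

With three controllers the quorum tolerates the loss of one controller, since a Raft majority (two of three) remains available.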
Message lifecycle
- A producer sends a batch of records to the leader broker of the target partition.
- The leader writes the batch to its local log (append-only on disk, sequential I/O).
- Followers (replicas) fetch the batch from the leader and write it to their own logs.
- With `acks=all`, the leader sends an ACK to the producer once every replica currently in the in-sync replica set (ISR) has the batch. `min.insync.replicas` is the minimum ISR size (including the leader) required to accept the write at all; if the ISR is smaller, the write is rejected.
- The leader assigns offsets sequentially when it appends the batch. Within a partition, offsets are immutable and monotonically increasing.
- A consumer that belongs to a consumer group calls `poll()`, which by default fetches records from the leader of each assigned partition, starting from the group's last committed offset for that partition.
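The replication settings that drive this lifecycle can be declared per topic through a KafkaTopic resource. The sketch below is illustrative: the topic name comes from the produce example in this document, while the partition count is an assumed value.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: cdc.public.customers
  namespace: kafka-cdc
  labels:
    strimzi.io/cluster: cdc-cluster    # tells the Topic Operator which cluster owns the topic
spec:
  partitions: 3                        # illustrative value
  replicas: 3                          # one leader plus two followers per partition
  config:
    min.insync.replicas: 2             # acks=all writes need at least 2 in-sync replicas
```

With three replicas and `min.insync.replicas: 2`, a producer using `acks=all` can keep writing while one replica is unavailable. Losing a second replica causes writes to be rejected rather than silently under-replicated.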
Strimzi reconciliation loop
The Strimzi Cluster Operator behaves like a classic Kubernetes controller:
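- It watches changes to CRs such as `Kafka`, `KafkaConnect`, and `KafkaBridge`. `KafkaTopic` and `KafkaUser` CRs are watched by the Topic and User Operators instead.
- It computes the difference between the desired state (CR) and the current state (StrimziPodSets, ConfigMaps, Secrets). Older Strimzi releases used StatefulSets instead of StrimziPodSets.
- It applies changes in a rolling fashion, restarting pods one at a time so that the brokers are never all stopped at once.
- The Topic and User Operators run as containers in the Entity Operator pod. They reconcile their respective CRs against Kafka itself through the Kafka Admin API rather than against Kubernetes workloads. The User Operator also stores the generated credentials in a Secret.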
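Because the operator continuously drives the cluster toward the declared state, manual changes to managed resources are normally reverted on the next reconciliation. Strimzi provides an annotation to suspend this loop for a single resource, which can help during troubleshooting. A minimal sketch of the metadata fragment, shown here on the `cdc-cluster` resource from this document:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: cdc-cluster
  namespace: kafka-cdc
  annotations:
    # While set to "true", the Cluster Operator skips reconciling this resource
    strimzi.io/pause-reconciliation: "true"
# spec omitted here: it stays unchanged from the existing Kafka resource
```

Removing the annotation, or setting it to "false", resumes reconciliation.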
Kafka Bridge (HTTP REST)
The Kafka Bridge lets you produce and consume messages over HTTP REST, which is convenient for demos and testing without a native Kafka client:

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaBridge
    metadata:
      name: cdc-bridge
      namespace: kafka-cdc
    spec:
      replicas: 1
      bootstrapServers: cdc-cluster-kafka-bootstrap:9092
      http:
        port: 8080

Example (produce a message):

    curl -X POST https://kafka-bridge-kafka-cdc.apps.<domain>/topics/cdc.public.customers \
      -H "Content-Type: application/vnd.kafka.json.v2+json" \
      -d '{"records":[{"value":{"first_name":"Test","last_name":"User","email":"test@demo.io"}}]}'
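The curl example calls an HTTPS URL outside the cluster, while the KafkaBridge resource only serves plain HTTP on port 8080. On OpenShift, one way to close that gap is a Route with edge TLS termination. The sketch below is an assumption, not taken from the source: the Route name `kafka-bridge` is inferred from the hostname in the curl command, and the Service name follows Strimzi's usual `<bridge-name>-bridge-service` convention, which you should verify in your cluster.

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: kafka-bridge                   # assumed; default host becomes kafka-bridge-kafka-cdc.apps.<domain>
  namespace: kafka-cdc
spec:
  to:
    kind: Service
    name: cdc-bridge-bridge-service    # assumed Service name created for the cdc-bridge resource
  port:
    targetPort: 8080                   # the HTTP port declared in the KafkaBridge spec
  tls:
    termination: edge                  # TLS ends at the router; traffic to the bridge is plain HTTP
```

The bridge applies no authentication of its own in this configuration, so an exposed Route like this is best limited to demo environments.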
Security: Authentication and encryption
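The Kafka cluster exposes two listeners: plain (9092, no TLS) and tls (9093, with TLS). Production clients should use the TLS listener with SCRAM-SHA-512 authentication. For the examples below to work, the tls listener must also declare `authentication` with `type: scram-sha-512`, and the Kafka resource must enable `authorization` with `type: simple`. The Kafka resource shown earlier does not include these settings.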
KafkaUser with SCRAM-SHA-512
Strimzi manages credentials automatically when you create a KafkaUser resource. The operator generates a Secret with the password:

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaUser
    metadata:
      name: cdc-user
      namespace: kafka-cdc
      labels:
        strimzi.io/cluster: cdc-cluster
    spec:
      authentication:
        type: scram-sha-512
      authorization:
        type: simple
        acls:
          - resource:
              type: topic
              name: cdc
              patternType: prefix
            operations: [Read, Write, Describe]
          - resource:
              type: topic
              name: dlq
              patternType: prefix
            operations: [Read, Write, Describe]
          - resource:
              type: group
              name: cdc-connect-cluster
              patternType: literal
            operations: [Read]
          - resource:
              type: group
              name: camel-cdc-consumer
              patternType: literal
            operations: [Read]
ACLs restrict access to topics with the cdc and dlq prefixes only, and to the pipeline’s specific consumer groups.
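For orientation, the Secret generated for a SCRAM-SHA-512 user has roughly the following shape. This is a sketch with placeholder values, not output copied from a cluster. The `password` key is the one referenced by the client configurations in this document. The `sasl.jaas.config` key is what recent Strimzi versions add, so treat it as an assumption for your version.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cdc-user                       # same name as the KafkaUser resource
  namespace: kafka-cdc
type: Opaque
data:
  password: <base64-encoded generated password>
  sasl.jaas.config: <base64-encoded JAAS configuration string>
```

Clients never need to set the password themselves. They reference this Secret, and the User Operator keeps it in sync with the credentials stored in Kafka.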
Client configuration — KafkaConnect
KafkaConnect is configured to use the TLS listener (9093) with SCRAM-SHA-512:

    spec:
      bootstrapServers: cdc-cluster-kafka-bootstrap:9093
      authentication:
        type: scram-sha-512
        username: cdc-user
        passwordSecret:
          secretName: cdc-user
          password: password
      tls:
        trustedCertificates:
          - secretName: cdc-cluster-cluster-ca-cert
            certificate: ca.crt
Strimzi automatically generates the cdc-cluster-cluster-ca-cert secret with the cluster CA certificate.
Client configuration — Apache Camel
Camel uses SASL_SSL properties in the kafka component parameters:

    parameters:
      brokers: cdc-cluster-kafka-bootstrap.kafka-cdc.svc:9093
      securityProtocol: SASL_SSL
      saslMechanism: SCRAM-SHA-512
      saslJaasConfig: "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"${env:KAFKA_USER}\" password=\"${env:KAFKA_PASSWORD}\";"
      sslTruststoreLocation: /etc/kafka-certs/ca.crt
      sslTruststoreType: PEM
Credentials are injected as environment variables from the Secret generated by Strimzi.
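One way to wire this up is shown in the following pod template fragment. It is a sketch under assumptions: the container name and image are hypothetical, and the Camel application is assumed to run in the `kafka-cdc` namespace so that it can reference both Secrets directly. The environment variable names and the mount path match the Camel parameters above.

```yaml
# Fragment of a Deployment pod template (spec.template.spec); not a complete resource
containers:
  - name: camel-cdc-consumer           # hypothetical container name
    image: <your-camel-image>          # placeholder
    env:
      - name: KAFKA_USER
        value: cdc-user
      - name: KAFKA_PASSWORD
        valueFrom:
          secretKeyRef:
            name: cdc-user             # Secret generated for the KafkaUser
            key: password
    volumeMounts:
      - name: kafka-ca
        mountPath: /etc/kafka-certs    # matches sslTruststoreLocation
        readOnly: true
volumes:
  - name: kafka-ca
    secret:
      secretName: cdc-cluster-cluster-ca-cert
      items:
        - key: ca.crt
          path: ca.crt
```

Mounting only the `ca.crt` key keeps the truststore path stable at `/etc/kafka-certs/ca.crt`.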
Streams for Apache Kafka ecosystem
| Component | Role in the ecosystem |
|---|---|
| Kafka Core | Brokers, topics, partitions, replication |
| Kafka Connect | Connector framework (source/sink) |
| Kafka Bridge | HTTP REST proxy to produce/consume |
| Apicurio Registry | Schema Registry (Avro, JSON Schema, Protobuf) |
| Debezium | CDC connectors for PostgreSQL, MySQL, MongoDB, etc. |
| Streams Console | Web UI to monitor clusters, topics, and consumer groups |
| Kafka Exporter | Exports consumer group lag metrics to Prometheus |
| MirrorMaker 2 | Cross-cluster replication |
| Kroxylicious | Kafka proxy: encryption at rest, schema validation |
Official Documentation
- Red Hat Streams for Apache Kafka: installation, configuration, and operations guide
- Strimzi Documentation: upstream project, operators, and CRDs
- Strimzi Blog: news, KRaft migration, and best practices
- Apache Kafka Documentation: official Kafka project reference
- Kafka Bridge: HTTP REST proxy for Kafka