ZooKeeper Architecture in KubeBlocks

This page describes how KubeBlocks deploys an Apache ZooKeeper ensemble on Kubernetes — covering the resource hierarchy, pod internals, the ZAB consensus protocol, and traffic routing.

[Figure: ZooKeeper traffic routing in KubeBlocks. Clients send write/coordination traffic to zk-cluster-zookeeper:2181, a ClusterIP service whose roleSelector (kubeblocks.io/role=leader) keeps its Endpoints pointed at the current leader, and read traffic to zk-cluster-zookeeper-readable:2181, which has no roleSelector and spreads connections across all pods. The ensemble runs as three pods (zookeeper-0 as leader; zookeeper-1 and zookeeper-2 as followers), each exposing ports 2181 (client), 2888 (quorum), 3888 (election), and 8080 (admin) and mounting a data PVC at /bitnami/zookeeper/data. Under the ZAB protocol, all writes pass through the leader, are broadcast to followers, and commit on majority acknowledgment. A headless service provides stable pod DNS for quorum traffic, leader election, and operator probes; it is not a client endpoint.]

Resource Hierarchy

KubeBlocks models a ZooKeeper ensemble as a hierarchy of Kubernetes custom resources:

Cluster  →  Component  →  InstanceSet  →  Pod × N
| Resource | Role |
| --- | --- |
| Cluster | User-facing declaration; specifies the number of ensemble members, storage size, and resources |
| Component | Generated automatically; references a ComponentDefinition that describes container specs, lifecycle actions, and services |
| InstanceSet | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities and role awareness |
| Pod | Actual running ZooKeeper server; each pod gets a unique ordinal (its myid), a stable DNS name, and its own PVCs |

A ZooKeeper ensemble should have an odd number of members (3, 5, or 7): an even-sized ensemble tolerates no more failures than the next smaller odd size, so the extra member only adds overhead. KubeBlocks assigns each pod a unique myid, derived from its ordinal, which persists across restarts.
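
A minimal Cluster manifest for a three-member ensemble might look like the sketch below. All names here (zk-cluster, the demo namespace) are placeholders, and the exact apiVersion, clusterDef value, and field names depend on your KubeBlocks and addon versions, so treat this as illustrative rather than copy-paste ready.

```bash
# Sketch of a 3-member ZooKeeper ensemble; apiVersion and clusterDef are
# assumptions that vary across KubeBlocks releases.
kubectl apply -f - <<EOF
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: zk-cluster
  namespace: demo
spec:
  clusterDef: zookeeper
  terminationPolicy: Delete
  componentSpecs:
    - name: zookeeper
      replicas: 3            # odd member count preserves a voting quorum
      resources:
        limits:
          cpu: "1"
          memory: 1Gi
      volumeClaimTemplates:
        - name: data         # snapshots and the data tree
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
        - name: snapshot-log # transaction logs
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 5Gi
EOF
```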

Containers Inside Each Pod

Every ZooKeeper pod runs one container:

| Container | Ports | Purpose |
| --- | --- | --- |
| zookeeper | 2181 (client), 2888 (quorum/follower), 3888 (leader election), 7000 (metrics), 8080 (admin) | ZooKeeper server participating in the ZAB consensus protocol and serving client requests; exposes Prometheus metrics natively on port 7000; the roleProbe runs /kubeblocks/scripts/roleprobe.sh inside this container |

Each pod mounts two PVCs:

  • data PVC → /bitnami/zookeeper/data — snapshot files and the ZooKeeper data tree
  • snapshot-log PVC → /bitnami/zookeeper/log — transaction log files
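
To see the per-pod claims, list the pods and PVCs by cluster label. The cluster name zk-cluster and namespace demo are placeholders, and the app.kubernetes.io/instance label is the convention KubeBlocks applies to managed objects; verify it against your own resources.

```bash
# List ensemble pods and their PVCs (names and label are placeholders/assumptions)
kubectl get pods,pvc -n demo -l app.kubernetes.io/instance=zk-cluster
```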

Node Roles

| Role | Description |
| --- | --- |
| Leader | Coordinates all write transactions; broadcasts proposals to followers and observers; elected via the ZAB leader election protocol |
| Follower | Participates in voting for write quorum; serves client read requests locally; forwards writes to the leader |
| Observer | Non-voting member that replicates state from the leader and serves read requests; used to scale read throughput without affecting write quorum. Observers are not part of the default addon deployment: they require explicit configuration in zoo.cfg and are not provisioned automatically by KubeBlocks |
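
Because the roleProbe writes each member's role to the kubeblocks.io/role pod label, the current leader and followers can be read straight from kubectl. Names below are the same placeholders used throughout this page.

```bash
# Show each member's current role as reported by the roleProbe
kubectl get pods -n demo -l app.kubernetes.io/instance=zk-cluster \
  -L kubeblocks.io/role
```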

High Availability via ZAB Protocol

ZooKeeper provides HA through the ZooKeeper Atomic Broadcast (ZAB) protocol, which guarantees total order of updates and crash-recovery:

| ZAB Phase | Description |
| --- | --- |
| Leader election | On startup or after leader failure, servers exchange votes using the FastLeaderElection algorithm; the server with the most up-to-date transaction log wins, with ties broken by the highest server ID |
| Synchronization | The new leader synchronizes followers to bring them up to date before resuming normal operation |
| Broadcast | All write requests go through the leader; the leader sends a proposal to all followers, and a write commits once a quorum acknowledges it |
| Quorum | ⌊N/2⌋ + 1 servers must be available for writes to succeed; reads can be served by any server |

A 3-member ensemble tolerates 1 failure; a 5-member ensemble tolerates 2 failures.
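
One way to confirm quorum health is ZooKeeper's AdminServer on port 8080, which the services table below also exposes. The sketch assumes the placeholder pod name zk-cluster-zookeeper-0 and that the AdminServer is enabled with its default command set.

```bash
# Forward the admin port from one member (pod name is a placeholder)
kubectl port-forward -n demo pod/zk-cluster-zookeeper-0 8080:8080 &

# The stat command reports the server state (leader/follower) and basic stats
curl -s http://localhost:8080/commands/stat
```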

Leader Election Process

When the leader becomes unavailable:

  1. All remaining servers detect the missing heartbeat and enter leader election mode
  2. Each server votes for the candidate with the most up-to-date zxid (the last committed transaction ID), breaking ties by the highest myid
  3. The server that collects a quorum of votes becomes the new leader
  4. The new leader synchronizes followers before resuming write operations
  5. Leader election typically completes in 200 ms to 2 seconds under normal network conditions
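
Because the leader-only service selects on kubeblocks.io/role=leader, an election is visible from outside ZooKeeper as an Endpoints switch. Watching it during a test is a quick way to observe election latency; service and namespace names are placeholders.

```bash
# Watch the leader service's endpoints follow the newly elected leader
kubectl get endpoints zk-cluster-zookeeper -n demo -w
```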

Traffic Routing

KubeBlocks creates three services for each ZooKeeper ensemble:

| Service | Type | Ports | Selector |
| --- | --- | --- | --- |
| {cluster}-zookeeper | ClusterIP | 2181 (client), 2888 (quorum), 8080 (admin) | kubeblocks.io/role=leader |
| {cluster}-zookeeper-readable | ClusterIP | 2181 (client), 2888 (quorum), 8080 (admin) | all pods (no roleSelector) |
| {cluster}-zookeeper-headless | Headless | 2181, 2888, 3888 | all pods |
  • Write / coordination traffic: connect to {cluster}-zookeeper:2181 — the Endpoints always point at the current leader. Write requests go directly to the leader without any forwarding overhead.
  • Read-heavy workloads: connect to {cluster}-zookeeper-readable:2181 — distributes client connections across all ensemble members (leader and followers). Followers serve read requests locally; write requests are still forwarded to the leader by the follower that receives them.
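
From a pod inside the cluster (for example, via kubectl exec into a member, where zkCli.sh ships with the server), a client can target either service. The host names below use the placeholder cluster zk-cluster in namespace demo.

```bash
# Write/coordination path: always lands on the current leader
zkCli.sh -server zk-cluster-zookeeper.demo.svc.cluster.local:2181

# Read-heavy path: connections spread across leader and followers
zkCli.sh -server zk-cluster-zookeeper-readable.demo.svc.cluster.local:2181
```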

Port 7000 (Prometheus metrics) is exposed on each pod but is not included in any ClusterIP service. To scrape metrics, connect directly to a pod on port 7000, or configure a PodMonitor that targets port 7000 on the pods themselves (a PodMonitor selects pods directly, so it does not need a service to expose the port).
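
A PodMonitor sketch follows, assuming the Prometheus Operator CRDs are installed and that the addon names the container port (assumed here to be metrics); if the port is unnamed, use targetPort with the number 7000 instead. The selector label is the assumed KubeBlocks convention noted earlier.

```bash
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: zk-cluster-metrics
  namespace: demo
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: zk-cluster   # assumed KubeBlocks pod label
  podMetricsEndpoints:
    - port: metrics    # assumed name of the 7000 container port; adjust to your pod spec
      path: /metrics
EOF
```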

Quorum and leader election traffic (ports 2888 and 3888) uses the headless service, where each ensemble member is individually addressable by its stable pod DNS name:

{pod-name}.{cluster}-zookeeper-headless.{namespace}.svc.cluster.local

The zoo.cfg configuration file references all peer addresses using these stable DNS names, ensuring correct cluster membership after pod restarts or rolling updates.
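
You can confirm this by reading the rendered config inside a pod. The pod name, container name, and config path below are assumptions based on the Bitnami-derived image the addon uses; adjust them to your deployment.

```bash
# Inspect the peer list rendered into zoo.cfg (pod name and path are assumptions)
kubectl exec -n demo zk-cluster-zookeeper-0 -c zookeeper -- \
  grep '^server\.' /opt/bitnami/zookeeper/conf/zoo.cfg

# Expected shape, one line per member (illustrative):
# server.1=zk-cluster-zookeeper-0.zk-cluster-zookeeper-headless.demo.svc.cluster.local:2888:3888
```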

Automatic Failover

When a ZooKeeper ensemble member fails:

  1. Member goes offline — peers detect the missing heartbeat within the session timeout (default 2× tick time)
  2. Leader election (if the lost member was the leader) — surviving members elect a new leader in milliseconds to seconds
  3. Write continuity — as long as a quorum remains available, all write and read operations continue normally
  4. Pod recovery — when the failed pod restarts, it reads its myid from the PVC, contacts the leader, and syncs any missed transactions before rejoining the ensemble
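
You can exercise this failover path in a test cluster by deleting the current leader pod and watching the role labels converge; all names are the placeholders used throughout this page.

```bash
# Simulate leader failure (test clusters only); pod name is a placeholder
kubectl delete pod -n demo zk-cluster-zookeeper-0

# Watch a surviving member take over the leader role
kubectl get pods -n demo -l app.kubernetes.io/instance=zk-cluster \
  -L kubeblocks.io/role -w
```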

System Accounts

KubeBlocks manages the following ZooKeeper system account. The password is auto-generated and stored in a Secret named {cluster}-{component}-account-admin (replace {cluster} and {component} with your Cluster metadata.name and the ZooKeeper component name, typically zookeeper).

| Account | Role | Purpose |
| --- | --- | --- |
| admin | Admin | Administrator user when ZooKeeper authentication is enabled (ZOO_ENABLE_AUTH=yes); credentials are injected into pods as ZK_ADMIN_USER and ZK_ADMIN_PASSWORD for authenticated client and administrative access |
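
To fetch the generated credentials, read the Secret directly. The key name password follows the usual KubeBlocks system-account layout, but that is an assumption; verify the keys with kubectl describe secret first.

```bash
# Retrieve the auto-generated admin password (Secret key name assumed)
kubectl get secret -n demo zk-cluster-zookeeper-account-admin \
  -o jsonpath='{.data.password}' | base64 -d
```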
