
  1. Single-Node Topology
    1. Components
    2. Containers inside the pod
    3. Traffic routing
    4. High availability
  2. MDIT Topology
    1. Components
    2. Containers inside each pod
    3. Traffic routing
    4. High availability
  3. Multi-Node Topology
    1. Components
    2. Containers inside each pod
    3. Traffic routing
    4. High availability
  4. Full Separation (m-d-i-t) Topology
    1. Components
    2. Containers inside each pod
    3. Traffic routing
    4. High availability
  5. Common Pod Internals
  6. System Accounts

Elasticsearch Architecture in KubeBlocks

KubeBlocks supports four distinct Elasticsearch deployment topologies:

| Topology | `spec.topology` | Components | Use Case |
|---|---|---|---|
| Single Node | `single-node` | mdit (1 replica) | Development and local testing |
| MDIT | `mdit` | mdit (N replicas) | Medium workloads: scalable, no dedicated masters |
| Multi-Node | `multi-node` (default) | master (3) + dit (N) | Production HA: dedicated master plus data/ingest/transform nodes |
| Full Separation | `m-d-i-t` | m + d + i + t | Large scale: each role component sized and scaled independently |

Each topology uses the same KubeBlocks resource hierarchy: Cluster → Component → InstanceSet → Pod × N.
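This hierarchy is driven from a single Cluster manifest. The sketch below shows how a topology from the table is selected; the resource shape follows the KubeBlocks Cluster API, but exact field names vary between KubeBlocks releases, so treat it as an illustration rather than a copy-paste manifest:

```yaml
# Hypothetical manifest: selects the single-node topology from the table above.
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: es-cluster
spec:
  clusterDef: elasticsearch     # Elasticsearch ClusterDefinition from the addon
  topology: single-node         # one of: single-node | mdit | multi-node | m-d-i-t
  componentSpecs:
    - name: mdit                # the single all-roles component
      replicas: 1
      volumeClaimTemplates:
        - name: data            # backs /usr/share/elasticsearch/data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 20Gi
```

Switching topologies means changing `spec.topology` and listing the component names that topology defines (`mdit`, or `master`/`dit`, or `m`/`d`/`i`/`t`).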


Single-Node Topology

A single Elasticsearch pod running all roles simultaneously — master, data, ingest, and transform. The entire cluster state, data, and pipeline processing lives on one node. Designed purely for development and testing: there is no failover, no shard replication, and no HA.

[Diagram: a single all-roles pod, `mdit-0` (master · data · ingest · transform), reached by clients through the `es-cluster-mdit-http` ClusterIP service on :9200. The pod runs three containers, elasticsearch (:9200/:9300), es-agent (:8080), and exporter (:9114), and mounts a 20Gi data PVC. No HA: a single pod failure makes the cluster unavailable; recommended for development only.]

Components

| Component | Replicas | Roles |
|---|---|---|
| mdit | 1 | master · data · ingest · transform |

Containers inside the pod

| Container | Port | Purpose |
|---|---|---|
| elasticsearch | 9200 (HTTP), 9300 (transport) | Elasticsearch engine, all roles |
| es-agent | 8080 | Lifecycle operations and configuration management |
| exporter | 9114 | Prometheus metrics (elasticsearch-exporter) |

Traffic routing

| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-mdit-http | ClusterIP | 9200 | Client REST API (single pod) |
| {cluster}-mdit-agent | ClusterIP | 8080 | es-agent sidecar |
| {cluster}-mdit-headless | Headless | 9200, 9300 | Pod DNS; operator probes |

High availability

None. Pod failure means the cluster is unavailable until Kubernetes restarts the pod. Do not use in production.


MDIT Topology

Multiple Elasticsearch pods each running all four roles (master, data, ingest, transform). Unlike single-node, you can scale out replicas for higher throughput and capacity. All pods participate in master election — any pod can become master. There are no dedicated master nodes.

[Diagram: three all-roles pods, `mdit-0` (elected master), `mdit-1`, and `mdit-2`, each running elasticsearch (:9200/:9300), es-agent (:8080), and exporter (:9114) with a 30Gi data PVC. The `es-cluster-mdit-http` ClusterIP service load-balances client REST traffic across all pods (no roleSelector), so any pod can serve requests. Master election runs across all pods over :9300 transport and requires a quorum; pods added via a HorizontalScaling OpsRequest join the election quorum.]

Components

| Component | Replicas | Roles |
|---|---|---|
| mdit | N (configurable) | master · data · ingest · transform |

Containers inside each pod

| Container | Port | Purpose |
|---|---|---|
| elasticsearch | 9200, 9300 | Elasticsearch engine, all roles |
| es-agent | 8080 | Lifecycle operations and configuration management |
| exporter | 9114 | Prometheus metrics |

Traffic routing

| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-mdit-http | ClusterIP | 9200 | Client REST API, load-balanced across all pods |
| {cluster}-mdit-agent | ClusterIP | 8080 | es-agent sidecar |
| {cluster}-mdit-headless | Headless | 9200, 9300 | Per-pod DNS for inter-node transport and operator probes |

High availability

If the elected master fails, the surviving pods hold a new election; with N=3 pods, the cluster tolerates one pod failure. Each pod stores data on its own PVC, and shard replication provides data redundancy across pods. For stronger master stability, prefer the multi-node topology.
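Scaling the mdit component out can be expressed as a HorizontalScaling OpsRequest. The shape below follows the KubeBlocks OpsRequest API as a sketch; field names differ between KubeBlocks versions, so verify against the Horizontal Scaling guide before use:

```yaml
# Hypothetical OpsRequest: adds one all-roles pod to the mdit component.
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: es-cluster-scale-out
spec:
  clusterName: es-cluster
  type: HorizontalScaling
  horizontalScaling:
    - componentName: mdit
      scaleOut:
        replicaChanges: 1   # the new pod joins the master-election quorum on :9300
```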


Multi-Node Topology

The recommended production topology. Separates cluster management from data operations: three dedicated master nodes handle cluster state and shard allocation, while the combined DIT (data + ingest + transform) nodes handle indexing, search, and pipeline processing. Client traffic goes only to DIT nodes.
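A hedged sketch of a multi-node Cluster manifest follows (field names per the KubeBlocks Cluster API and subject to version differences); the component names and sizes mirror this topology's defaults:

```yaml
# Hypothetical manifest: 3 dedicated masters plus 2 combined DIT pods.
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: es-cluster
spec:
  clusterDef: elasticsearch
  topology: multi-node
  componentSpecs:
    - name: master
      replicas: 3            # fixed at 3 for quorum safety
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
    - name: dit
      replicas: 2            # scale out for indexing/search throughput
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 30Gi
```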

[Diagram: two component groups. Master component: `master-0` (elected), `master-1`, and `master-2` (master-eligible), each running elasticsearch (role: master), es-agent (:8080), and exporter (:9114) with a 10Gi data PVC; quorum election tolerates one failure. DIT component (data + ingest + transform): `dit-0`, `dit-1`, and so on, each with a 30Gi data PVC, scalable via a HorizontalScaling OpsRequest; primary and replica shards are distributed across the DIT pods. Client REST traffic enters only through the `es-cluster-dit-http` ClusterIP service on :9200 and lands on DIT pods. Headless services for both components carry :9300 inter-node transport (cluster state, shard allocation, replication) and operator probes.]

Components

| Component | Replicas | Roles | Purpose |
|---|---|---|---|
| master | 3 | master only | Cluster state management, shard allocation, index mappings |
| dit | N (configurable) | data · ingest · transform | Indexing, search, pipeline processing |

Containers inside each pod

Both master and dit pods:

| Container | Port | Purpose |
|---|---|---|
| elasticsearch | 9200 (dit only), 9300 | Elasticsearch engine; master pods do not expose :9200 to clients |
| es-agent | 8080 | Lifecycle operations and configuration management |
| exporter | 9114 | Prometheus metrics |

Traffic routing

| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-dit-http | ClusterIP | 9200 | Client REST API; routes to DIT pods |
| {cluster}-master-headless | Headless | 9300 | Inter-node transport for the master component |
| {cluster}-dit-headless | Headless | 9200, 9300 | Per-pod DNS for the DIT component |

Master pods are only reachable via the headless service on :9300 — they do not serve client REST traffic.

High availability

| Mechanism | Description |
|---|---|
| Master quorum | 3 master-eligible nodes; tolerates 1 failure without losing cluster state |
| Shard replication | Primary and replica shards distributed across DIT pods; a replica is promoted when a data node fails |
| Split-brain prevention | Dedicated master nodes never also hold data, so master election cannot be confused by data node failures |
| Rolling upgrades | KubeBlocks upgrades DIT nodes first and master nodes last, maintaining quorum throughout |

Full Separation (m-d-i-t) Topology

Fully separated role components: each of the four Elasticsearch roles runs as its own independent KubeBlocks Component with its own replica count, resource limits, and PVC size. Best for large-scale deployments where fine-grained resource tuning and independent scaling per role is critical.
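Per-component sizing is the point of this topology, so each entry in `componentSpecs` carries its own replicas, resources, and PVC size. The sketch below is illustrative (field names per the KubeBlocks Cluster API, and the resource numbers are placeholder assumptions, not recommendations):

```yaml
# Hypothetical manifest: four independently sized role components.
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: es-cluster
spec:
  clusterDef: elasticsearch
  topology: m-d-i-t
  componentSpecs:
    - name: m              # masters: low CPU/memory, metadata only
      replicas: 3
      resources:
        limits: { cpu: "1", memory: 2Gi }
    - name: d              # data: large PVCs plus memory for JVM heap
      replicas: 2
      resources:
        limits: { cpu: "4", memory: 16Gi }
      volumeClaimTemplates:
        - name: data
          spec:
            resources: { requests: { storage: 100Gi } }
    - name: i              # ingest: CPU for pipeline transforms
      replicas: 2
      resources:
        limits: { cpu: "4", memory: 8Gi }
    - name: t              # transform: medium CPU for aggregation jobs
      replicas: 1
      resources:
        limits: { cpu: "2", memory: 4Gi }
```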

[Diagram: four independent components joined by :9300 transport (cluster state, shard allocation, replication, transform coordination). Master (m): `m-0` elected plus `m-1`/`m-2` eligible, master role only with es-agent (:8080) and 10Gi PVCs; quorum tolerates one failure and stays fixed at 3 for safety. Data (d): `d-0`, `d-1`, and so on, data role with exporter (:9114) and 100Gi+ PVCs, scaled out for search throughput and storage. Ingest (i): `i-0`, `i-1`, and so on, ingest role with exporter and 20Gi PVCs, scaled out for pipeline ingestion rate. Transform (t): `t-0` and so on, transform role with 20Gi PVCs, scaled out for concurrent transform jobs. Search and query traffic enters via `es-cluster-d-http` (ClusterIP :9200, data nodes); bulk pipeline traffic enters via `es-cluster-i-http` (ClusterIP :9200, ingest nodes). Each component has its own CPU/memory limits and PVC size.]

Components

| Component | Replicas | Role | Primary Resource Driver |
|---|---|---|---|
| m | 3 | master | Low CPU/memory; manages metadata only |
| d | N | data | High storage (large PVCs) plus memory for JVM heap |
| i | N | ingest | High CPU for pipeline transforms |
| t | N | transform | Medium CPU for continuous aggregation jobs |

Containers inside each pod

All components include elasticsearch (role-specific config) and es-agent (:8080). Data and ingest pods also include exporter (:9114) for Prometheus metrics.

Traffic routing

| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-d-http | ClusterIP | 9200 | Search traffic → data nodes |
| {cluster}-i-http | ClusterIP | 9200 | Ingest traffic → ingest nodes (pipeline processing) |
| {cluster}-m-headless | Headless | 9300 | Inter-node transport for the master component |
| {cluster}-d-headless | Headless | 9200, 9300 | Per-pod DNS for the data component |
| {cluster}-i-headless | Headless | 9200, 9300 | Per-pod DNS for the ingest component |
| {cluster}-t-headless | Headless | 9200, 9300 | Per-pod DNS for the transform component |

High availability

Same mechanisms as multi-node (master quorum + shard replication), with the additional benefit that ingest pipeline failures and transform job failures are fully isolated from search and indexing workloads. Each component can be scaled independently without affecting other roles.
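Independent scaling of a single role can also be vertical. As a hedged sketch of resizing only the ingest component (field names per the KubeBlocks OpsRequest API; they may differ in your version, so check the Vertical Scaling guide):

```yaml
# Hypothetical OpsRequest: raises CPU/memory for the ingest component only;
# data, master, and transform pods are untouched.
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: es-cluster-resize-ingest
spec:
  clusterName: es-cluster
  type: VerticalScaling
  verticalScaling:
    - componentName: i
      requests: { cpu: "4", memory: 8Gi }
      limits: { cpu: "4", memory: 8Gi }
```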


Common Pod Internals

All Elasticsearch pods include three init containers on startup:

| Init Container | Purpose |
|---|---|
| prepare-plugins | Stages plugin files from a plugin image into a shared volume |
| install-plugins | Installs plugins and prepares the filesystem layout |
| install-es-agent | Copies the es-agent binary into the container's local bin path |

Each pod mounts its own PVC for the Elasticsearch data directory (/usr/share/elasticsearch/data), providing independent persistent storage per node.
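When a data directory fills up, the per-pod PVC can be grown in place with a VolumeExpansion OpsRequest (the `data` claim name matches the mount described above; field names per the KubeBlocks OpsRequest API as an assumption, see the Volume Expansion guide in the sidebar):

```yaml
# Hypothetical OpsRequest: grows the "data" PVC of each dit pod to 50Gi.
# Requires a StorageClass that allows volume expansion.
apiVersion: operations.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  name: es-cluster-expand-data
spec:
  clusterName: es-cluster
  type: VolumeExpansion
  volumeExpansion:
    - componentName: dit
      volumeClaimTemplates:
        - name: data
          storage: 50Gi
```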

System Accounts

KubeBlocks automatically provisions the following Elasticsearch accounts. Credentials are stored in Secrets named {cluster}-{component}-account-{name}.

| Account | Role | Purpose |
|---|---|---|
| elastic | Superuser | Built-in Elasticsearch superuser; used for cluster setup, index management, and security configuration |
| kibana_system | Monitor / manage index | Built-in account used by Kibana to communicate with Elasticsearch |
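Assuming the Secret naming pattern above, the password for an account can be read back with kubectl. The `password` data key is an assumption; inspect the Secret with `kubectl describe` to confirm its actual keys:

```shell
# Hypothetical lookup: the "elastic" account of the "mdit" component
# in a cluster named "es-cluster". Adjust names to your deployment.
kubectl get secret es-cluster-mdit-account-elastic \
  -o jsonpath='{.data.password}' | base64 -d
```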

© 2026 KUBEBLOCKS INC