
  1. Replica Set Architecture
    1. Resource Hierarchy
    2. Containers Inside Each Pod
    3. High Availability
    4. Automatic Failover
    5. Traffic Routing
  2. Sharding Architecture
    1. Resource Hierarchy
    2. Component Details
    3. How Query Routing Works
    4. Traffic Routing
    5. Automatic Failover
  3. System Accounts

MongoDB Architecture in KubeBlocks

KubeBlocks supports two distinct MongoDB deployment architectures:

| Architecture | Topology | Use Case |
|---|---|---|
| Replica Set | Primary + secondaries, oplog replication | Single-dataset HA; datasets that fit on one node |
| Sharding | Mongos routers + config servers + data shards | Horizontal scaling; datasets too large for a single replica set; high write throughput |

Replica Set Architecture

A MongoDB replica set maintains multiple copies of the same dataset across pods. One pod acts as the primary (accepts all writes); the others are secondaries that replicate the primary's oplog and can serve reads.

[Architecture diagram] Applications send read/write traffic to mongo-cluster-mongodb-mongodb:27017 (roleSelector: primary) and read-only traffic to mongo-cluster-mongodb-mongodb-ro:27017 (roleSelector: secondary). The read/write service selects kubeblocks.io/role=primary, and its endpoints switch automatically when the primary changes; the read-only service selects kubeblocks.io/role=secondary and distributes reads across replicas. Behind the services run three pods (mongodb-0 PRIMARY; mongodb-1 and mongodb-2 SECONDARY), each with a mongodb container (mongod, :27017), a mongodb-exporter container (:9216 metrics), an init-syncer init container (copies syncerctl to /tools), and its own 20Gi PVC. The replica set replicates the oplog from primary to secondaries with w:majority write concern. A headless service provides stable pod DNS for internal use (replication, HA heartbeats, operator probes); it is not a client endpoint.

Resource Hierarchy

Cluster  →  Component  →  InstanceSet  →  Pod × N
| Resource | Role |
|---|---|
| Cluster | User-facing declaration — specifies topology, replica count, storage size, and resources |
| Component | Generated automatically; references a ComponentDefinition describing container specs, lifecycle actions, and services |
| InstanceSet | KubeBlocks custom workload (replaces StatefulSet); manages pods with stable identities and role awareness |
| Pod | Actual running MongoDB instance; each pod gets a unique ordinal and its own PVC |
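The whole hierarchy is driven from a single Cluster object. A minimal sketch of a replica set Cluster is shown below; the API group/version, the `clusterDef` value, and the `replicaset` topology name are assumptions based on recent KubeBlocks releases and the installed MongoDB addon, so check your addon's documentation before applying anything like it:

```yaml
# Illustrative sketch of a replica set Cluster declaration (field names assumed)
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: mongo-cluster
spec:
  clusterDef: mongodb          # provided by the MongoDB addon
  topology: replicaset         # addon-defined topology name (assumed)
  componentSpecs:
    - name: mongodb
      replicas: 3              # one primary + two secondaries
      resources:
        limits: { cpu: "1", memory: 1Gi }
      volumeClaimTemplates:
        - name: data           # becomes the per-pod PVCs data-0/1/2
          spec:
            accessModes: [ReadWriteOnce]
            resources:
              requests:
                storage: 20Gi
```

From this one declaration, KubeBlocks generates the Component, the InstanceSet, and the three role-labeled pods.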

Containers Inside Each Pod

Each replica set pod runs three containers (plus three init containers on startup: init-syncer copies /bin/syncer and /bin/syncerctl to /tools; init-kubectl copies the kubectl binary to dataMountPath/tmp/bin; init-pbm-agent copies pbm, pbm-agent, and pbm-agent-entrypoint to /tools):

| Container | Port | Purpose |
|---|---|---|
| mongodb | 27017, 3601 (HA replication) | MongoDB database engine; participates in replica set replication and election; roleProbe runs /tools/syncerctl getrole inside this container |
| mongodb-backup-agent | — | Percona Backup for MongoDB (PBM) agent; coordinates cluster-wide consistent backups |
| exporter | 9216 | Prometheus metrics exporter |

Each pod mounts its own PVC for the MongoDB data directory (default /data/mongodb, set by dataMountPath in chart values).

High Availability

MongoDB replica sets use oplog-based replication and a majority-vote (Raft-like) election protocol:

| Concept | Description |
|---|---|
| Primary | Receives all write operations; records changes to the oplog |
| Secondary | Replicates the primary's oplog; can serve reads when readPreference is configured |
| Election | When the primary fails, secondaries vote; the candidate with the most up-to-date oplog and a majority of votes wins |
| Write concern | w:majority ensures a write is durable on a quorum before acknowledging |

A 3-member replica set tolerates the failure of one member: the remaining two still form a voting majority, so a new primary can be elected and writes continue.

Automatic Failover

  1. Primary pod crashes or becomes unreachable — secondaries stop receiving heartbeat pings
  2. Election timeout — after approximately 10 seconds (electionTimeoutMillis), one secondary calls for an election
  3. Majority vote — the candidate with the most up-to-date oplog and a majority of votes wins and becomes the new primary
  4. KubeBlocks roleProbe detects the change — syncerctl getrole returns primary for the new pod → kubeblocks.io/role=primary label is applied
  5. Service endpoints switch — the {cluster}-mongodb-mongodb ClusterIP service automatically routes writes to the new primary

Failover typically completes within 10–30 seconds.
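Step 4 is visible directly on the pods: after failover, the operator rewrites the role label, and that label is what the services select on. A sketch of the relevant pod metadata after mongodb-1 wins an election (pod name and label value follow the conventions described above; this is illustrative, not something you edit by hand):

```yaml
# Illustrative pod metadata after failover
metadata:
  name: mongo-cluster-mongodb-1
  labels:
    kubeblocks.io/role: primary   # flipped from "secondary" by the roleProbe result
```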

Traffic Routing

| Service | Type | Port | Selector |
|---|---|---|---|
| {cluster}-mongodb-mongodb | ClusterIP | 27017 | kubeblocks.io/role=primary |
| {cluster}-mongodb-mongodb-ro | ClusterIP | 27017 | kubeblocks.io/role=secondary |
| {cluster}-mongodb | ClusterIP | 27017 | all pods (no roleSelector) |
| {cluster}-mongodb-headless | Headless | 27017 | all pods |
  • Write traffic: connect to {cluster}-mongodb-mongodb:27017 (roleSelector: primary — always routes to the current primary)
  • Read-only traffic: connect to {cluster}-mongodb-mongodb-ro:27017 (roleSelector: secondary)
  • {cluster}-mongodb (two-segment name) routes to all pods; it has no roleSelector and is not a write-only endpoint
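The role-based routing in the table is just an ordinary Kubernetes Service whose selector includes the role label. An illustrative sketch of what the operator generates for the read/write endpoint (the instance label is an assumption; the operator owns this object, so you never create it yourself):

```yaml
# Illustrative sketch of the operator-generated read/write service
apiVersion: v1
kind: Service
metadata:
  name: mongo-cluster-mongodb-mongodb
spec:
  type: ClusterIP
  ports:
    - port: 27017
      targetPort: 27017
  selector:
    app.kubernetes.io/instance: mongo-cluster   # assumed standard instance label
    kubeblocks.io/role: primary                 # endpoints follow the current primary
```

Because the selector matches on the role label rather than a pod name, the service's endpoints switch to the new primary as soon as the label moves, with no client-side reconfiguration.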

Sharding Architecture

MongoDB Sharding distributes data across multiple independent replica sets (shards) using a shard key. A layer of stateless mongos routers sits in front, and a config server replica set (CSRS) stores the chunk routing metadata.

[Architecture diagram] Applications connect with a standard MongoDB connection string to the mongos routers, either via per-pod services ({cluster}-mongos-mongos-0:27017, {cluster}-mongos-mongos-1:27017, …; one ClusterIP per pod, podService: true) or via {cluster}-mongos-headless for DNS-based discovery. The mongos pods (mongos-0/1/2) are stateless query routers with no PVC, each running mongos on :27017 and an exporter on :9216; they read the chunk routing metadata from the config servers and forward each query to the owning shard's primary. The config servers form a 3-member CSRS replica set (config-0 PRIMARY, config-1/2 SECONDARY; oplog replication with w:majority writes) that stores the chunk map, shard membership, and cluster metadata; each config pod runs mongodb on :27017, an exporter on :9216, and an init-syncer init container (syncerctl → /tools). The data shards (shard-0, shard-1, shard-2, each owning a chunk range, and each a KubeBlocks Sharding Component) are independent 3-member replica sets with the same container layout and a 20Gi PVC per pod; each shard replicates independently via its own oplog, and failover is independent per shard.

Resource Hierarchy

The sharding topology uses both Component (for mongos and config-server) and Sharding (for data shards):

Cluster  →  Component (mongos)         →  InstanceSet  →  Pod × N
         →  Component (config-server)  →  InstanceSet  →  Pod × 3
         →  Sharding  (shard)          →  Shard × N    →  InstanceSet  →  Pod × replicas
| Resource | Role |
|---|---|
| Cluster | Specifies topology sharding; declares mongos, config-server, and shard specs |
| Component (mongos) | Stateless query routers; requires config-server to be reachable before routing |
| Component (config-server) | 3-node replica set (CSRS) storing chunk map and shard membership |
| Sharding | KubeBlocks sharding spec; manages N identical shard Components |
| Shard | An independent replica set owning a range of chunks; each shard fails over independently |
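A sketch of a sharding Cluster declaration under these assumptions (v1 Cluster API, an addon-defined `sharding` topology name, and a `shardings` section with a per-shard template; exact field names vary across KubeBlocks and addon versions, so treat this as illustrative):

```yaml
# Illustrative sketch of a sharding Cluster declaration (field names assumed)
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: mongo-cluster
spec:
  clusterDef: mongodb
  topology: sharding               # addon-defined topology name (assumed)
  componentSpecs:
    - name: mongos                 # stateless routers, no volumeClaimTemplates
      replicas: 3
    - name: config-server          # 3-node CSRS
      replicas: 3
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes: [ReadWriteOnce]
            resources: { requests: { storage: 20Gi } }
  shardings:
    - name: shard                  # one Sharding spec stamps out N shard Components
      shards: 3
      template:                    # replica set template applied to every shard
        replicas: 3
        volumeClaimTemplates:
          - name: data
            spec:
              accessModes: [ReadWriteOnce]
              resources: { requests: { storage: 20Gi } }
```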

Component Details

Mongos pods (stateless — no PVC; each pod also runs an init-kubectl init container that copies the kubectl binary into the container's tools path for use by lifecycle scripts):

| Container | Port | Purpose |
|---|---|---|
| mongos | 27017 | MongoDB query router — reads chunk map from CSRS and forwards queries to the correct shard |
| exporter | 9216 | Prometheus metrics exporter |

Config server pods (3-node CSRS replica set; same three init containers as replica set pods — init-syncer, init-kubectl, init-pbm-agent):

| Container | Port | Purpose |
|---|---|---|
| mongodb | 27017, 3601 (HA replication) | Config server mongod — stores chunk routing metadata; must use w:majority for all config writes; roleProbe runs /tools/syncerctl getrole |
| mongodb-backup-agent | — | Percona Backup for MongoDB (PBM) agent |
| exporter | 9216 | Prometheus metrics exporter |

Shard pods (each shard = independent replica set; same three init containers — init-syncer, init-kubectl, init-pbm-agent):

| Container | Port | Purpose |
|---|---|---|
| mongodb | 27017, 3601 (HA replication) | Data shard mongod — stores documents assigned to this shard's chunk range; roleProbe runs /tools/syncerctl getrole |
| mongodb-backup-agent | — | Percona Backup for MongoDB (PBM) agent |
| exporter | 9216 | Prometheus metrics exporter |

Each shard pod mounts its own PVC for its data directory. At least 3 shards are recommended for balanced distribution (not enforced by the addon).

How Query Routing Works

  1. Client connects to any mongos on port 27017 — no cluster-aware driver required
  2. Mongos reads the chunk map from the CSRS to determine which shard owns the target key range
  3. Mongos forwards the query to the primary of the target shard
  4. For scatter-gather queries (no shard key filter), mongos fans out to all shard primaries and merges results
  5. On chunk migrations or shard additions, mongos automatically discovers the updated routing table

Traffic Routing

| Service | Type | Port | Notes |
|---|---|---|---|
| {cluster}-mongos-mongos-<ordinal> | ClusterIP (per-pod) | 27017 | One service per mongos pod (podService: true); use all pod addresses as URI seed list |
| {cluster}-mongos-headless | Headless | 27017 | DNS-based discovery of all mongos pods |
| {cluster}-mongos-internal | ClusterIP | 27018 | Intra-cluster use only; not for application traffic |

Clients connect through mongos using a MongoDB URI seed list, e.g.:

mongodb://{cluster}-mongos-mongos-0:27017,{cluster}-mongos-mongos-1:27017/

Or use {cluster}-mongos-headless for DNS-based discovery. Direct shard or config server access is not intended for application traffic.
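The per-pod services in the table above come from a per-pod service declaration on the mongos component. An illustrative sketch of what that declaration might look like (the exact field layout and nesting depend on the addon and API version; only podService: true is taken from the source table):

```yaml
# Illustrative: per-pod service declaration on the mongos component (assumed layout)
services:
  - name: mongos
    podService: true        # one ClusterIP service per mongos pod
    spec:
      ports:
        - port: 27017
          targetPort: 27017
```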

Automatic Failover

Each component fails over independently:

  • Shard primary fails → that shard's replica set elects a new primary (≈10 s); mongos retries on the new primary automatically
  • Config server primary fails → CSRS elects a new primary; chunk map reads resume; mongos does not lose routing data (it caches the chunk map locally)
  • Mongos pod fails → clients reconnect to another mongos pod; mongos is stateless so no data is lost

System Accounts

KubeBlocks automatically manages the following MongoDB system accounts. Passwords are stored in Secrets named {cluster}-{component}-account-{name}.

| Account | Role | Purpose |
|---|---|---|
| root | Superuser | Default administrative account used for cluster initialization and management |
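Following the naming convention above, the root credentials for a cluster named mongo-cluster would live in a Secret shaped roughly like this (illustrative; the username/password key names follow the common KubeBlocks convention and are an assumption):

```yaml
# Illustrative: generated system-account Secret (key names assumed)
apiVersion: v1
kind: Secret
metadata:
  name: mongo-cluster-mongodb-account-root
type: Opaque
stringData:
  username: root
  password: "<generated>"   # random password generated by the operator
```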
