High Availability

High availability for mirrord Operator in Enterprise Tier

Starting from chart version 1.40.1, the mirrord Operator is highly available by default. mirrord sessions should survive transient failures in the cluster, including failures of the nodes where the mirrord Operator pods are running. Whether a given session survives depends on the advanced mfT features it uses (see the Advanced features section).

High availability is implemented in a lightweight fashion: session state is saved in the cluster's built-in etcd database (as Custom Resources), so no persistent volume is required. At startup, the operator restores and resumes saved sessions.
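The save-then-restore flow described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the operator's actual implementation: `SessionStore` stands in for the cluster's etcd database, and the `Operator` class, its methods, the session IDs, and the target names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionStore:
    """Hypothetical stand-in for etcd: it outlives any single operator process."""
    records: Dict[str, dict] = field(default_factory=dict)


class Operator:
    """Hypothetical operator process that keeps sessions in memory."""

    def __init__(self, store: SessionStore) -> None:
        self.store = store
        # At startup, restore every saved session into working memory.
        self.active = {sid: dict(state) for sid, state in store.records.items()}

    def start_session(self, session_id: str, target: str) -> None:
        state = {"target": target}
        # Persist a copy first, so the session can survive losing this process.
        self.store.records[session_id] = dict(state)
        self.active[session_id] = state

    def end_session(self, session_id: str) -> None:
        # A finished session is removed from the store and is not restored later.
        self.store.records.pop(session_id, None)
        self.active.pop(session_id, None)


store = SessionStore()
first = Operator(store)
first.start_session("session-a", "deployment/checkout")
first.start_session("session-b", "deployment/cart")
first.end_session("session-b")

del first  # simulate the operator pod failing; the store is untouched

second = Operator(store)  # a replacement operator starts up
restored = sorted(second.active)
print(restored)
```

Because the only durable state lives in the store, the replacement process needs nothing from the failed one. That is the property that makes a persistent volume unnecessary in this model.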

This feature is available to users on the Enterprise pricing plan.

Multiple replicas

By default, the mirrord Operator workload runs with a single replica. Starting from chart version 1.40.3, this can be configured with the operator.replicas setting in the chart.

Regardless of the configured scale, only one replica acts as the leader and serves mirrord sessions. Any additional replicas wait in standby mode, ready to acquire leadership and resume work if the current leader fails. Running the operator with multiple replicas allows for smoother leadership transitions, since the new leader candidates are already running.
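The leader and standby behavior can be illustrated with a toy simulation. This is a sketch under assumptions: the source does not describe the operator's actual election mechanism, so the `Lease` class below is a hypothetical shared lock, and the replica names and the count of three are illustrative only.

```python
from typing import List, Optional


class Replica:
    """Hypothetical operator replica; `alive` models pod health."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True


class Lease:
    """Hypothetical shared leadership lock with at most one holder at a time."""

    def __init__(self) -> None:
        self.holder: Optional[Replica] = None

    def try_acquire(self, candidate: Replica) -> bool:
        # The lease can be taken only when it is free or its holder has failed.
        if self.holder is None or not self.holder.alive:
            self.holder = candidate
        return self.holder is candidate


def elect(replicas: List[Replica], lease: Lease) -> Optional[Replica]:
    """Let each live replica try the lease in turn and return the leader."""
    for replica in replicas:
        if replica.alive and lease.try_acquire(replica):
            return replica
    return None


# Three replicas, as if operator.replicas were set to 3 (illustrative value).
replicas = [Replica(f"operator-{i}") for i in range(3)]
lease = Lease()

leader_before = elect(replicas, lease)
leader_before.alive = False  # simulate the leader's pod failing
leader_after = elect(replicas, lease)

print(leader_before.name, "->", leader_after.name)
```

In this model the standby replicas already exist when the leader fails, so taking over is just a matter of acquiring the lease. With a single replica, a new pod would first have to be scheduled and started.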

Advanced features

Not all advanced mfT features have been migrated to highly available sessions yet. A session that uses a feature that has not been migrated is forcefully terminated if the operator pod fails.

The table below summarizes the current state of development.

Feature           HA    Chart version
SQS splitting           1.40.3
Kafka splitting
Copy target
DB branching
