
The Roadmap for Making Khepri the Default Metadata Store in RabbitMQ


Khepri, the new Raft-based RabbitMQ metadata store, became fully supported with RabbitMQ 4.0. Starting with the next release series, RabbitMQ 4.2, we consider Khepri to be mature enough to become the default metadata store, especially given its substantial data safety and recovery improvements over Mnesia.

We have performed a number of benchmarks, showing significant performance improvements in many metadata operations.

Khepri Feature Flag now stable

The khepri_db feature flag has been promoted to Stable, so it is now enabled by the command rabbitmqctl enable_feature_flag all, which should be run after every successful version upgrade.
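As a rough illustration of that post-upgrade step, the sketch below wraps the two rabbitmqctl commands in a small Python helper. The helper names and the dry-run default are assumptions made for this example, not part of RabbitMQ. It also assumes rabbitmqctl is on the PATH of the node where it runs.

```python
import shutil
import subprocess

RABBITMQCTL = "rabbitmqctl"  # assumption: available on PATH on a cluster node


def post_upgrade_commands():
    """Return the commands to run once every node has been upgraded."""
    return [
        [RABBITMQCTL, "list_feature_flags"],          # inspect current flag states
        [RABBITMQCTL, "enable_feature_flag", "all"],  # enable all Stable flags
    ]


def run_post_upgrade(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in post_upgrade_commands():
        print("$ " + " ".join(argv))
        if dry_run:
            continue
        if shutil.which(RABBITMQCTL) is None:
            raise RuntimeError("rabbitmqctl not found on PATH")
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical usage: show what would be run without touching the cluster.
    run_post_upgrade(dry_run=True)
```

Running with dry_run=True only prints the commands, which makes the helper safe to try outside a RabbitMQ node.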

Starting with version 4.2, we strongly recommend that all RabbitMQ clusters adopt Khepri by enabling the khepri_db feature flag. Enabling this feature flag will likely become mandatory for upgrades from 4.2 to later versions.

While the final decision depends on the community feedback, we expect that starting with RabbitMQ 4.3, the khepri_db feature flag will graduate to be Required.

Feature Flag Subsystem

The RabbitMQ feature flag subsystem was recently improved by introducing a new category of feature flags known as Soft Required. If a feature flag is Soft Required starting from version N, it is automatically enabled once all RabbitMQ nodes are upgraded to version N of RabbitMQ. This is a change from the previous behavior of Required, where a feature flag that became required in version N of RabbitMQ must be enabled before upgrading to version N.

It remains best practice to enable feature flags as soon as they become Stable, generally immediately after a successful upgrade by running the command rabbitmqctl enable_feature_flag all. Nonetheless, we view the introduction of Soft Required feature flags as an improvement in user experience, as any required feature flags not already enabled will be automatically enabled when required.
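The difference between Required and Soft Required can be sketched as a small model. This is a simplified illustration of the behavior described above, not RabbitMQ's implementation: the class and function names are hypothetical, and the failed upgrade in the Required case is modeled as a Python exception.

```python
from dataclasses import dataclass
from enum import Enum


class Stability(Enum):
    STABLE = "stable"
    SOFT_REQUIRED = "soft_required"
    REQUIRED = "required"


@dataclass
class FeatureFlag:
    name: str
    enabled: bool


def outcome_after_upgrade(flag: FeatureFlag, stability_in_n: Stability) -> FeatureFlag:
    """Model the flag's state once all nodes have been upgraded to version N."""
    if flag.enabled:
        return flag  # already enabled: nothing changes in any category
    if stability_in_n is Stability.REQUIRED:
        # Previous behavior: the flag had to be enabled before the upgrade.
        raise RuntimeError(f"{flag.name} must be enabled before upgrading")
    if stability_in_n is Stability.SOFT_REQUIRED:
        # New behavior: the flag is enabled automatically after the upgrade.
        return FeatureFlag(flag.name, enabled=True)
    return flag  # Stable: stays disabled until an operator enables it


if __name__ == "__main__":
    # "example_flag" is a made-up name used only for illustration.
    flag = FeatureFlag("example_flag", enabled=False)
    print(outcome_after_upgrade(flag, Stability.SOFT_REQUIRED))
```

In this model, a flag that was enabled ahead of time behaves the same in every category, which is why enabling flags right after each upgrade remains the simplest path.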

Khepri Performance Improvements

The benchmarks below were performed on a 3-node cluster running on Kubernetes.

1,000 queues, each with 100 bindings

| benchmark | mnesia | khepri |
|---|---|---|
| import | 446 s | 51 s |
| re-import | 16 s | 46 s |
| stop_app | 1.6 s | 1.7 s |
| start_app | 22 s | 4.3 s |
| rolling cluster restart | 108 s | 67 s |

mnesia to khepri migration: 12.7 s

1,000 Vhosts

| benchmark | mnesia | khepri |
|---|---|---|
| import | 284 s | 21 s |
| re-import | 2.2 s | 2.2 s |
| stop_app | 2.6 s | 2.4 s |
| start_app | 419 s | 16 s |
| rolling cluster restart | 1447 s | 106 s |

mnesia to khepri migration: 5.5 s

100,000 Classic Queues

| benchmark | mnesia | khepri |
|---|---|---|
| import | 76 s | 76 s |
| re-import | 5.4 s | 5.3 s |
| stop_app | 13 s | 6 s |
| start_app | 26 s | 40 s |
| rolling cluster restart | 185 s | 307 s |

mnesia to khepri migration: 9.7 s

10,000 Quorum Queues

| benchmark | mnesia | khepri |
|---|---|---|
| import | 49 s | 46 s |
| re-import | 1.9 s | 1.8 s |
| stop_app | 1.9 s | 1.7 s |
| start_app | 44 s | 44 s |
| rolling cluster restart | 285 s | 267 s |

mnesia to khepri migration: 4.7 s

1,000 Streams

| benchmark | mnesia | khepri |
|---|---|---|
| import | 3.5 s | 1.2 s |
| re-import | 1.6 s | 1.2 s |
| stop_app | 1.9 s | 1.2 s |
| start_app | 2.5 s | 2.3 s |
| rolling cluster restart | 56 s | 55 s |

mnesia to khepri migration: 5 s
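To make the figures easier to compare, the sketch below computes the ratio of Mnesia time to Khepri time for each benchmark. The numbers are copied from the results in this post. The data layout and function names are assumptions made for this example.

```python
# (mnesia_seconds, khepri_seconds), copied from the benchmark results in this post
BENCHMARKS = {
    "1000 queues x 100 bindings": {
        "import": (446, 51), "re-import": (16, 46), "stop_app": (1.6, 1.7),
        "start_app": (22, 4.3), "rolling cluster restart": (108, 67),
    },
    "1000 vhosts": {
        "import": (284, 21), "re-import": (2.2, 2.2), "stop_app": (2.6, 2.4),
        "start_app": (419, 16), "rolling cluster restart": (1447, 106),
    },
    "100,000 classic queues": {
        "import": (76, 76), "re-import": (5.4, 5.3), "stop_app": (13, 6),
        "start_app": (26, 40), "rolling cluster restart": (185, 307),
    },
    "10,000 quorum queues": {
        "import": (49, 46), "re-import": (1.9, 1.8), "stop_app": (1.9, 1.7),
        "start_app": (44, 44), "rolling cluster restart": (285, 267),
    },
    "1,000 streams": {
        "import": (3.5, 1.2), "re-import": (1.6, 1.2), "stop_app": (1.9, 1.2),
        "start_app": (2.5, 2.3), "rolling cluster restart": (56, 55),
    },
}


def speedup(mnesia_s, khepri_s):
    """Ratio of Mnesia time to Khepri time; below 1 means Khepri was slower."""
    return mnesia_s / khepri_s


def report(benchmarks=BENCHMARKS):
    """Build one header line per scenario plus one line per benchmark."""
    lines = []
    for scenario, rows in benchmarks.items():
        lines.append(scenario)
        for name, (mnesia_s, khepri_s) in rows.items():
            lines.append(f"  {name:<24} {speedup(mnesia_s, khepri_s):5.1f}x")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

By this measure the largest gains are in the 1,000 vhosts scenario, while start_app and the rolling cluster restart with 100,000 classic queues, and re-import with 1,000 queues and 100 bindings each, come out below 1, meaning Khepri was slower in those runs.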