Open Source · Kubernetes-Native · AI-Powered

Upgrade Fearlessly.
Validate Everything.
Lose Nothing.

NebulaCB is the complete Couchbase management platform. Orchestrate rolling upgrades, validate XDCR replication integrity, monitor multi-cluster health, and get AI-powered root cause analysis — all from a mission-control cockpit dashboard.

Zero Data Loss Validation · Kubernetes-Aware Upgrades · Bidirectional XDCR · Local AI with Ollama
🚀
Rolling Upgrades
Helm-based orchestration with pause, resume, abort, and rollback. Track node-by-node progress in real time.
🔍
Data Integrity
SHA-256 hash validation, sequence gap detection, and continuous doc-count monitoring across clusters.
🤖
AI Analysis
Local Ollama integration. Ask AI about cluster health, get root cause analysis, and auto-train on your logs.
🌍
Multi-Region
Manage clusters across regions with cross-region XDCR, automatic failover, and region-aware monitoring.

Everything You Need to Manage Couchbase

From rolling upgrades to AI-powered troubleshooting, NebulaCB covers every aspect of Couchbase cluster lifecycle management.

🛰️
Mission Control Cockpit
A NASA-style mission-control view that puts every signal you need on one screen: a status pill, a 4-tile top strip (source / target / XDCR flow / load + alerts), a full-width upgrade timeline, and a 3-column bottom row (live logs / data integrity / controls).
  • Glowing health tiles with pulsing animations
  • Phase rail with 6-stage upgrade tracking
  • Live event stream from XDCR + alerts + node state
  • Tail-style log panel with severity + source filters
  • Side-by-side data integrity proof + control buttons
⚙️
Upgrade Orchestrator
Automate Couchbase rolling upgrades via the Kubernetes Operator. NebulaCB patches the CouchbaseCluster CR image, monitors pod rollover, and tracks rebalance completion.
  • Helm-based rolling upgrade via CouchbaseCluster CR
  • Pre-check validation (all nodes ready)
  • Node-by-node progress tracking (10s poll)
  • Downgrade button to roll back to previous version
  • Abort to stop tracking mid-upgrade
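The pre-check and CR patch described above can be sketched in a few lines of Go. This is a minimal illustration, not NebulaCB's actual code: the helper names `allReady` and `imagePatch`, the node names, and the image tag `couchbase/server:7.6.0` are assumptions, and `spec.image` is assumed to be the field the Operator watches for a version change.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// allReady is the pre-check: every known node must report ready.
func allReady(ready map[string]bool) bool {
	for _, ok := range ready {
		if !ok {
			return false
		}
	}
	return len(ready) > 0
}

// imagePatch builds a JSON merge patch that sets spec.image on a
// CouchbaseCluster resource. Hypothetical helper for illustration only.
func imagePatch(ready map[string]bool, image string) ([]byte, error) {
	if !allReady(ready) {
		return nil, errors.New("pre-check failed: not all nodes are ready")
	}
	return json.Marshal(map[string]any{
		"spec": map[string]any{"image": image},
	})
}

func main() {
	// Node names and image tag are made up for the example.
	nodes := map[string]bool{"cb-local-0000": true, "cb-local-0001": true}
	patch, err := imagePatch(nodes, "couchbase/server:7.6.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

The printed patch has the shape that `kubectl patch couchbasecluster ... --type merge -p '<patch>'` accepts, so it can be tried by hand before wiring it into automation.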
🔁
XDCR Replication Management
Monitor cross-datacenter replication in real time. Track replication lag, pipeline restarts, topology changes, and GOXDCR delay with a 5-minute countdown timer.
  • Bidirectional XDCR monitoring (source ↔ target)
  • Pipeline pause / resume / restart / stop controls
  • GOXDCR delay detection with countdown timer
  • Topology change tracking during upgrades
  • Integrated XDCR troubleshooting modal
Data Integrity Validation
Continuous proof that zero data loss occurs during upgrades. Compare source and target clusters with hash verification and sequence gap detection.
  • Document count timeline with delta convergence chart
  • SHA-256 hash sampling for content verification
  • Sequence gap detection for ordered streams
  • Full audit on demand (all keys compared)
  • Real-time convergence status (ZERO LOSS CONFIRMED)
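To make the two checks above concrete, here is a minimal sketch of hash comparison over a sample of documents and of gap detection over observed sequence numbers. The function names and the in-memory maps are assumptions for illustration; a real validator would read documents through the SDK rather than from maps.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// mismatchedKeys compares sampled documents from two clusters and returns the
// keys that are missing on the target or whose SHA-256 digests differ.
func mismatchedKeys(source, target map[string][]byte) []string {
	var bad []string
	for key, src := range source {
		dst, ok := target[key]
		if !ok || sha256.Sum256(src) != sha256.Sum256(dst) {
			bad = append(bad, key)
		}
	}
	sort.Strings(bad)
	return bad
}

// sequenceGaps returns the sequence numbers missing between the smallest and
// largest values observed. Input order does not matter.
func sequenceGaps(seen []uint64) []uint64 {
	sorted := append([]uint64(nil), seen...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	var gaps []uint64
	for i := 1; i < len(sorted); i++ {
		for s := sorted[i-1] + 1; s < sorted[i]; s++ {
			gaps = append(gaps, s)
		}
	}
	return gaps
}

func main() {
	source := map[string][]byte{
		"a": []byte(`{"v":1}`), "b": []byte(`{"v":2}`), "c": []byte(`{"v":3}`),
	}
	target := map[string][]byte{
		"a": []byte(`{"v":1}`), "b": []byte(`{"v":9}`),
	}
	fmt.Println(mismatchedKeys(source, target))        // prints [b c]
	fmt.Println(sequenceGaps([]uint64{1, 2, 5, 3, 8})) // prints [4 6 7]
}
```

Hash sampling catches content drift that a matching document count would hide, while gap detection catches writes that never arrived at all.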
Storm Load Generator
Simulate production traffic during upgrades. Generate writes, reads, and deletes with configurable rates, burst patterns, and hot-key distributions.
  • Configurable writes/reads per second and doc sizes
  • Burst mode with multiplier and interval
  • Hot key percentage for realistic access patterns
  • Standalone xdcr-loadtest script for dual-cluster writes
  • Real-time latency P50/P95/P99 tracking
🛡️
HA & Automatic Failover
Configure automatic failover with health checks, timeout thresholds, and recovery modes. Supports manual and graceful failover between clusters.
  • Auto-failover with configurable timeout
  • Manual and graceful failover triggers
  • Failover history and event timeline
  • Cross-region failover support
  • Preserve data mode for safe recovery
💾
Backup & Restore (EE + CE)
Works on Couchbase Enterprise and Community Edition. EE clusters use cbbackupmgr; CE clusters fall back to a parallel SDK JSONL export — no license required.
  • Auto-detect engine: cbbackupmgr → SDK JSONL
  • CE mode: 16-worker KV fetch pool, millions of docs
  • Restore modal: pick backup from list, target any cluster
  • Live progress: docs / bytes while running
  • JSONL + metadata.json on disk, easy to inspect
  • Cron-based scheduling, retention, compression (EE)
📦
Data Migration
Migrate data between clusters with parallel workers, batch processing, and optional transformation rules. Validates integrity after migration.
  • Parallel worker pool (configurable)
  • Batch processing with retry logic
  • Transform rules (rename, convert, filter)
  • Post-migration validation
  • Progress tracking with ETA
☸️
Kubernetes Operator Integration
Deep integration with the Couchbase Autonomous Operator. Auto-discovers pods, manages NodePort exposure, and patches CRDs for upgrades.
  • Auto port-forwarding for k8s clusters
  • Direct NodePort access with kv_port config
  • CouchbaseCluster CR patching for upgrades
  • Pod discovery and health monitoring
  • Helm chart for deploying NebulaCB itself
📜
K8s Observability Suite
Four new enterprise tabs that bring the rest of the cluster lifecycle into NebulaCB: pod logs, Kubernetes events, Operator state, and an opinionated runbook library — all without leaving the dashboard.
  • Pod Logs — live tail across namespaces
  • Events — real-time K8s event stream with filters
  • Operator — CouchbaseCluster CR + Operator health
  • Runbooks — built-in remediation playbooks
  • Force-reconnect button for SDK pool recovery
📦
System Packages & systemd
First-class native install. Ships as .deb for Ubuntu/Debian, .rpm for CentOS/RHEL/Rocky/Alma/Fedora/openSUSE/SLES, plus a distro-aware shell installer. Runs under a hardened systemd unit on port 8899.
  • nfpm-built .deb and .rpm with pre/post hooks
  • install.sh detects Ubuntu/Debian/CentOS/SUSE
  • Hardened systemd unit (NoNewPrivileges, ProtectSystem)
  • Dedicated nebulacb system user, /etc/nebulacb config
  • Make targets: package-deb, package-rpm, install-local

Mission Control Dashboard

A cockpit-style interface with real-time WebSocket updates, live cluster health, XDCR flow visualization, and one-click controls.

📊
Cluster Health & Metrics
Node status, CPU/memory, ops/sec, doc counts, version, edition, rebalance state per cluster.
🔁
XDCR Replication Flow
Visual pipeline: source → target with lag, restarts, topology changes, mutation queue, GOXDCR delay.
Data Loss Proof Panel
Doc count timeline, delta convergence chart, hash sampling results, monitoring duration, zero-loss verdict.
🎮
Control Panel
18 action buttons: load, upgrade, downgrade, XDCR, audit, AI analyze, backup, failover, chaos injection.

Tab Navigation — 10 Workspaces

Cockpit is the new default. The legacy Dashboard tab stays for parity, and four new enterprise tabs bring K8s observability inside NebulaCB.

🛰️
Cockpit
Mission-control grid (default)
Dashboard
Legacy panel-stack view
🤖
Ask AI
Chat with AI about cluster issues
🔍
RCA
Root cause analysis reports
📚
Knowledge
12+ built-in troubleshooting guides
📊
Insights
History of all AI analyses
📜
Pod Logs
Live K8s pod log tail
Events
Real-time Kubernetes events
☸️
Operator
CouchbaseCluster CR & operator state
📋
Runbooks
Opinionated remediation playbooks

Mission Control Panel — 18 Commands

Every operator action is one click away. All commands hit the backend via /api/v1/command and stream live state back through WebSocket.

Category    Command            What it does
Load        start_load         Start the Storm generator against configured clusters
            pause_load         Pause writers without tearing down workers
            resume_load        Resume paused writers without re-initialising
            stop_load          Stop generation and flush stats
Upgrade     start_upgrade      Patch CouchbaseCluster CR and track pod-by-pod rollout
            abort_upgrade      Stop tracking the in-flight upgrade
            downgrade          Roll back to the previous image via Operator rolling restart
XDCR        pause_xdcr         Pause replication pipeline
            resume_xdcr        Resume after pause
            stop_xdcr          Stop and remove the replication
            restart_xdcr       Recreate the pipeline (useful after topology change)
            xdcr_troubleshoot  Open diagnostics modal with delay history + live state
Validation  run_audit          Full source-vs-target comparison (hash + sequence + key diff)
Chaos       inject_failure     Inject XDCR partition or node failure for resilience testing
AI          ai_analyze         Trigger on-demand AI root cause analysis
Backup      start_backup       Start a cluster backup (EE cbbackupmgr or CE SDK JSONL fallback)
            start_restore      Restore from a previous backup; the modal lists backups and target cluster
HA          manual_failover    Promote target, mark source failed (with confirmation modal)
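For scripting these commands outside the dashboard, a small Go helper can build the request. The endpoint path, the `{"action": ...}` body, and basic auth follow the curl examples elsewhere on this page; the function name `newCommandRequest` is hypothetical, and the response format is not documented here, so the sketch stops at constructing the request.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newCommandRequest builds a POST to <baseURL>/api/v1/command carrying
// {"action": action} with HTTP basic auth. Hypothetical helper.
func newCommandRequest(baseURL, user, pass, action string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"action": action})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/command", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, pass)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Default credentials and port as shown in the install instructions.
	req, err := newCommandRequest("http://localhost:8899", "admin", "nebulacb", "pause_xdcr")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL) // prints POST http://localhost:8899/api/v1/command
	// To send it against a running server: resp, err := http.DefaultClient.Do(req)
}
```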

AI-Powered Analysis with Ollama

Run AI locally with Ollama — no cloud API keys needed. NebulaCB learns from your cluster logs and metrics to provide context-aware recommendations.

💬
Ask AI
Chat with NebulaCB AI about any cluster issue. It has full context of your cluster state, XDCR status, alerts, and metrics. Ask questions in natural language and get actionable answers.
> "Why is XDCR replication lag increasing?"
> "Is the cluster ready for an upgrade?"
> "Analyze performance bottlenecks"
> "What's the best backup strategy?"
🔍
Root Cause Analysis
Trigger AI-powered RCA for specific categories: XDCR issues, upgrade failures, performance problems, failover events, backup errors, or data integrity concerns. Get structured reports with evidence chains and remediation steps.
Result: severity, root cause, evidence chain,
remediation steps with risk levels + commands
📚
Knowledge Base
Built-in library of 12+ common Couchbase issues covering XDCR lag, pipeline restarts, stuck rebalances, auto-failover, backup failures, memory pressure, disk queues, and Kubernetes NodePort configuration. Searchable and filterable.
Categories: XDCR, Upgrade, Failover, Backup,
Performance, Data Integrity, Configuration
🧠
Ollama Integration
With Ollama, AI runs 100% locally and no data leaves your network. Supports llama3, llama4, and any Ollama-compatible model. Anthropic Claude and OpenAI are also supported as cloud providers.
config.json:
"ai": {
  "enabled": true,
  "provider": "ollama",
  "model": "llama3",
  "api_endpoint": "http://127.0.0.1:11434"
}
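A quick way to sanity-check this block before starting the server is to parse it and validate the fields. The struct below mirrors only the four keys shown above; the type names, the `loadAI` helper, and the validation rules are assumptions, not NebulaCB's actual config loader.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// AIConfig mirrors the "ai" keys shown in the config.json fragment.
type AIConfig struct {
	Enabled     bool   `json:"enabled"`
	Provider    string `json:"provider"`
	Model       string `json:"model"`
	APIEndpoint string `json:"api_endpoint"`
}

// Config holds only the part of config.json this sketch cares about.
type Config struct {
	AI AIConfig `json:"ai"`
}

// loadAI parses config.json bytes and applies a few illustrative checks.
func loadAI(data []byte) (AIConfig, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return AIConfig{}, err
	}
	ai := cfg.AI
	if !ai.Enabled {
		return ai, nil
	}
	if ai.Provider == "" || ai.Model == "" {
		return ai, errors.New("ai.provider and ai.model are required when ai.enabled is true")
	}
	if ai.Provider == "ollama" {
		u, err := url.Parse(ai.APIEndpoint)
		if err != nil || u.Host == "" {
			return ai, errors.New("ai.api_endpoint must be a URL with a host for the ollama provider")
		}
	}
	return ai, nil
}

func main() {
	data := []byte(`{"ai": {"enabled": true, "provider": "ollama",
		"model": "llama3", "api_endpoint": "http://127.0.0.1:11434"}}`)
	ai, err := loadAI(data)
	fmt.Println(ai.Provider, ai.Model, err) // prints ollama llama3 <nil>
}
```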

How It Works

From setup to production-grade upgrade validation in four steps.

Deploy Clusters
Set up source and target Couchbase clusters via Docker Compose, k3s with the Couchbase Operator, or existing infrastructure. Configure XDCR replication between them.
Configure NebulaCB
Edit config.json with cluster addresses, credentials, and NodePort KV ports. Enable AI with Ollama. Start the server with make run.
Generate Load & Upgrade
Start the Storm generator or xdcr-loadtest script to simulate production traffic. Trigger a rolling upgrade from the dashboard. Monitor XDCR and data integrity in real time.
Validate & Report
Run a full data audit after upgrade. Use AI to analyze any issues. Generate a comprehensive report with upgrade timeline, XDCR gap analysis, and zero-loss proof.

Installation

Four ways to get started with NebulaCB.

Build from Source (Go 1.24+)

# Clone and build
git clone https://github.com/bwalia/nebulacb.git
cd nebulacb
make build

# Configure your clusters
vim config.json

# Start the server
make run

# Open dashboard
open http://localhost:8899
# Login: admin / nebulacb

Run the XDCR Load Test

# Send random writes to both clusters during upgrades
go run ./cmd/xdcr-loadtest/ -rate 100 -duration 30m

# Custom: 70% to source, 30% to target, larger docs
go run ./cmd/xdcr-loadtest/ -rate 500 -ratio 0.7 -doc-min 1024 -doc-max 8192

Enable AI with Ollama

# Install Ollama (macOS)
brew install ollama

# Pull a model
ollama pull llama3

# Update config.json
"ai": {
  "enabled": true,
  "provider": "ollama",
  "model": "llama3",
  "api_endpoint": "http://127.0.0.1:11434"
}

Native Install (Ubuntu / Debian / CentOS / RHEL / Rocky / Alma / Fedora / openSUSE / SLES)

# Build distributable .deb and .rpm with nfpm
make package       # builds both .deb and .rpm into ./dist/
make package-deb   # Ubuntu / Debian only
make package-rpm   # CentOS / RHEL / Rocky / Alma / Fedora / openSUSE / SLES

# Install on a Debian-family host
sudo dpkg -i dist/nebulacb_1.0.0_amd64.deb

# Install on an RPM-family host
sudo rpm -i dist/nebulacb-1.0.0-1.x86_64.rpm
# or, with dependency resolution:
sudo dnf install dist/nebulacb-1.0.0-1.x86_64.rpm
sudo zypper install dist/nebulacb-1.0.0-1.x86_64.rpm

Local Install via Shell Script (no package manager)

# Builds binary + UI, then installs system-wide
make install-local

# Or supply your own config to seed /etc/nebulacb/config.json
make install-local SOURCE_CONFIG=/path/to/config.json

# Service runs as user 'nebulacb' on port 8899
sudo systemctl status nebulacb
sudo journalctl -u nebulacb -f
curl http://localhost:8899/api/v1/health

# Uninstall (preserves /etc/nebulacb)
make uninstall-local

# Full purge (removes config, data, logs, user)
make uninstall-local ARGS=--purge

What Gets Installed

/usr/local/bin/nebulacb                # Static Go binary (~20 MB)
/usr/local/share/nebulacb/web/...      # React UI build
/etc/nebulacb/config.json              # Editable config (0640 root:nebulacb)
/etc/systemd/system/nebulacb.service   # Hardened systemd unit
/var/lib/nebulacb/                     # State directory
/var/log/nebulacb/                     # Log directory (journal also works)

# Hardening flags in the unit:
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ProtectKernelTunables=true
ProtectKernelModules=true

After install, edit /etc/nebulacb/config.json to point at your Couchbase clusters and run sudo systemctl restart nebulacb. The dashboard is served at http://<host>:8899.

Docker Compose (includes two Couchbase clusters)

# Start everything: NebulaCB + Couchbase 7.2.2 + Couchbase 7.6.0
docker-compose up -d

# Open dashboard at http://localhost:8080
# Source cluster: http://localhost:8091
# Target cluster: http://localhost:9091

# Tear down
docker-compose down -v

After starting, initialize both Couchbase clusters, create the test bucket, and set up XDCR replication. See the README for step-by-step instructions.

Deploy to Kubernetes with Helm

# Install NebulaCB
helm install nebulacb deploy/helm/nebulacb \
  -n nebulacb --create-namespace

# Upgrade
helm upgrade nebulacb deploy/helm/nebulacb -n nebulacb

# Uninstall
helm uninstall nebulacb -n nebulacb

Expose Couchbase Clusters via NodePort

# Patch Couchbase Operator for external access
kubectl patch couchbasecluster cb-local -n couchbase --type merge \
  -p '{"spec":{"networking":{"exposedFeatures":["client","admin"]}}}'

# Set kv_port in config.json to skip port-forwarding
"source": {
  "host": "192.168.1.193:32451",
  "kv_port": 32419,
  ...
}

Try It Live — Smoke Test After Install

Every endpoint below works against the default install. Use these to verify your deployment in under 60 seconds.

📡
Health & Dashboard
Check the server is up, all clusters are connected, and the dashboard API returns live cluster state.
# Public health probe (no auth)
curl http://localhost:8899/api/v1/health

# Full dashboard state (auth required)
curl -u admin:nebulacb \
  http://localhost:8899/api/v1/dashboard

# Get a session token for the UI
curl -X POST http://localhost:8899/api/v1/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"admin","password":"nebulacb"}'
🔍
Data Integrity Audit
Run a full source-vs-target comparison. Exercises the gocb SDK, REST topology discovery, and the validator module end-to-end.
# Kick off a full audit
curl -u admin:nebulacb \
  -X POST http://localhost:8899/api/v1/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"run_audit"}'

# Same thing via the CLI
bin/nebulacb-cli run-audit
bin/nebulacb-cli status
🔁
XDCR Diagnostics
Get live pipeline state, topology-change history, GOXDCR delay windows, and every diagnostic check the troubleshoot modal runs.
curl -u admin:nebulacb \
  http://localhost:8899/api/v1/xdcr/diagnostics | jq

# Restart the pipeline
curl -u admin:nebulacb \
  -X POST http://localhost:8899/api/v1/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"restart_xdcr"}'
Dual-Cluster Load Test
Generate random writes against both clusters to stress XDCR during an upgrade. Prints per-cluster throughput every 5 seconds.
# 100 writes/sec, 50/50 split, 5 min
go run ./cmd/xdcr-loadtest/ \
  -rate 100 -duration 5m

# Or trigger the in-process Storm generator
curl -u admin:nebulacb \
  -X POST http://localhost:8899/api/v1/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"start_load"}'
🤖
AI Tabs (Ollama required)
Run AI locally. The Knowledge Base tab always works, but Ask AI, RCA, and Insights require a running Ollama instance.
# One-time setup
curl -fsSL https://ollama.com/install.sh | sh
ollama serve &
ollama pull llama3

# Trigger an AI analysis
curl -u admin:nebulacb \
  -X POST http://localhost:8899/api/v1/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"ai_analyze"}'
☸️
K8s Observability Tabs
The Pod Logs, Events, and Operator tabs read the cluster through the kubeconfig path set in /etc/nebulacb/config.json. Make sure the nebulacb system user can read the kubeconfig file itself.
# Place kubeconfig where the service can read it
sudo install -m 0640 -o root -g nebulacb \
  ~/.kube/config /etc/nebulacb/kubeconfig.yaml

# Point config.json at it and restart
sudo sed -i 's|"kubeconfig": ".*"|"kubeconfig": "/etc/nebulacb/kubeconfig.yaml"|' \
  /etc/nebulacb/config.json
sudo systemctl restart nebulacb

Architecture

NebulaCB runs on your machine and connects to Couchbase clusters via REST API and the gocb SDK.

                             React Dashboard (:8899)
                                    |
                            WebSocket + REST API
                                    |
                          NebulaCB Go Server
                    /    |    |    |    |    |    \
              Storm  XDCR  Validator  Orchestrator  Monitor  AI   Failover
                |      |      |           |           |      |      |
             ClientPool (gocb SDK + REST + NodePort connections)
              /                                                   \
   Couchbase Source                                      Couchbase Target
   (k8s / docker / native)                              (k8s / docker / native)
              \_____________________ XDCR _____________________/
                              (bidirectional)

   Local AI: Ollama (llama3) at 127.0.0.1:11434
   Metrics: Prometheus endpoint at :9090/metrics

CLI Commands

Command                     Description
nebulacb-cli status         Full dashboard status (clusters, upgrade, XDCR, load, integrity, alerts)
nebulacb-cli start-load     Start the Storm load generator
nebulacb-cli stop-load      Stop load generation
nebulacb-cli start-upgrade  Trigger rolling upgrade
nebulacb-cli abort-upgrade  Stop tracking the upgrade
nebulacb-cli restart-xdcr   Restart XDCR pipeline
nebulacb-cli run-audit      Run full data integrity audit
nebulacb-cli report         Generate post-upgrade report

Project Structure

nebulacb/
  cmd/
    nebulacb/            # Main server
    cli/                 # CLI client
    xdcr-loadtest/       # Dual-cluster load test
  internal/
    ai/                  # AI analyzer (Ollama/Claude/OpenAI)
    api/                 # HTTP + WebSocket server
    storm/               # Load generator
    xdcr/                # XDCR replication engine
    validator/           # Data integrity validation
    orchestrator/        # Upgrade + downgrade
    failover/            # HA & failover
    backup/              # Backup & restore
    migration/           # Data migration
    monitor/             # Multi-cluster polling
    region/              # Multi-region management
  pkg/
    couchbase/           # SDK client + connection pool
    kubernetes/          # K8s client + port-forward
  web/nebulacb-ui/       # React dashboard
  deploy/helm/nebulacb/  # Helm chart

Ready to Upgrade Fearlessly?

Start managing your Couchbase clusters with confidence. Zero data loss guaranteed.
