
KafkaGuard Ultimate Guide Part 1: Your First Kafka Security Scan

We put KafkaGuard through its paces on Pawsman.com's 3-node Confluent Kafka 4.0 cluster — a real pet marketplace processing 50,000 orders a day. Here's the raw output, the real findings, and what they mean for a team preparing for their first SOC 2 audit. Part 1 of 3.

KafkaGuard Team·2026-05-03·9 min read

Pawsman.com's Kafka Cluster Has a Problem

Pawsman.com is a fast-growing pet marketplace — dogs, cats, everything in between. Fifty thousand orders a day flow through their platform. Their engineering team runs a 3-node Confluent Kafka 4.0 cluster (KRaft mode, no ZooKeeper) handling eight production topics:

| Topic | Purpose | Partitions |
| --- | --- | --- |
| pawsman.orders | Order placements and updates | 6 |
| pawsman.payments | Payment events from Stripe | 6 |
| pawsman.user-events | Clicks, searches, logins | 12 |
| pawsman.product-views | Product page impressions | 12 |
| pawsman.pet-profiles | Pet registration and health data | 3 |
| pawsman.inventory | Stock levels and updates | 3 |
| pawsman.notifications | Email/push notification events | 3 |
| pawsman.vet-appointments | Telemedicine booking events | 3 |

Their CTO got a call from their Series B investors: "We'll need a SOC 2 Type II report before close." That was six weeks ago. Nobody on the team had ever done a Kafka security audit.

This is a three-part series documenting exactly what happened — using KafkaGuard v2.3.0, the real cluster, and real findings.

Part 1 covers installation and the first baseline scan. No configuration changes yet — just the raw truth about where the cluster stands today.


Step 1: Download the KafkaGuard v2.3.0 On-Prem Bundle

Pawsman's team downloads the Community edition directly from the public releases repository. No account needed, no credit card, no trial expiry.

The on-prem bundle is a single tar.gz that contains the CLI binary, the Docker Compose stack, the installer scripts, and the policy library — everything needed for both one-off scans and the dashboard build-out in Part 3.

# macOS Apple Silicon (their dev machines)
curl -LO https://github.com/KafkaGuard/kafkaguard-releases/releases/download/v2.3.0/kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz
tar -xzf kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz
cd kafkaguard-onprem-v2.3.0/
sudo mv kafkaguard /usr/local/bin/

The bundle ships:

  • kafkaguard CLI binary
  • docker-compose.onprem.yml
  • env.onprem.example
  • installer/ — backup, restore, upgrade, check-ports
  • policies/ — baseline-dev, enterprise-default, finance-iso

Verify the installation:

kafkaguard version
KafkaGuard v2.3.0
Commit: b0e84f7
Built: 2026-05-03T18:17:58Z
Go: go1.25.3 darwin/arm64

Check whether you're on the latest release (skip the network call with --offline for air-gapped sites):

kafkaguard version --check
KafkaGuard v2.3.0
Commit: b0e84f7
Built:  2026-05-03T18:17:58Z
Go:     go1.25.3 darwin/arm64

✅ You are on the latest version (v2.3.0)

Single binary, zero dependencies, 21MB. It runs anywhere Go runs — Linux, macOS, Docker, CI.

Verify integrity before running: always check the SHA-256 checksum against the published manifest.

curl -LO https://github.com/KafkaGuard/kafkaguard-releases/releases/download/v2.3.0/bundle-checksums.txt
shasum -a 256 -c bundle-checksums.txt
# kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz: OK
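For provisioning scripts that can't shell out to shasum, the same check is a few lines of standard-library Python. A minimal sketch (the streaming helper is ours, not part of KafkaGuard):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large bundles never load fully into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the matching line in bundle-checksums.txt, e.g.:
# assert sha256_of("kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz") == expected_hex
```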

Step 2: Run the First Scan

KafkaGuard ships with three built-in policies. For a first look at an existing cluster, baseline-dev is the right starting point — it covers 21 reliability and operational controls without requiring any authentication to be set up first.

kafkaguard scan \
  --bootstrap pawsman-kafka-01:9092 \
  --policy policies/baseline-dev.yaml \
  --format json,html \
  --out ./pawsman-audit-may2026

The scan completes in 131ms. Here is the full terminal output, unedited:

KafkaGuard v2.3.0
Scan tier: community
🔍 Auto-detecting security protocol from broker metadata...
✅ Auto-detected security protocol: PLAINTEXT

Starting KafkaGuard scan  scan_id=d0c3cd34
🔄 Initializing scan...
   Bootstrap: pawsman-kafka-01:9092
   Policy:    policies/baseline-dev.yaml

📊 Collecting Kafka cluster data...
   KRaft mode detected — collecting controller quorum metadata
   ✅ KRaft controller quorum data collected successfully

✅ Evaluating security controls...

   PASS  KG-016  Replication factor ≥ 3                      HIGH
   PASS  KG-017  In-sync replicas ≥ 2                        HIGH
   PASS  KG-018  Default replication factor ≥ 3              HIGH
   PASS  KG-019  Unclean leader election disabled            CRITICAL
   PASS  KG-020  Replication factor for all topics           MEDIUM
   PASS  KG-021  Min ISR per topic                           MEDIUM
   PASS  KG-022  Offsets topic replication factor            HIGH
   PASS  KG-023  Broker version consistency                  MEDIUM
   PASS  KG-029  Log retention configured                    HIGH
   PASS  KG-031  Transaction state replication               LOW
   PASS  KG-032  Log segment size configured                 LOW
   PASS  KG-033  Producer idempotence available              MEDIUM
   PASS  KG-035  Default partitions configured               LOW
   PASS  KG-036  Message max bytes configured                LOW
   PASS  KG-037  Group initial rebalance delay               LOW
   PASS  KG-038  Controller rebalance rate                   LOW
   PASS  KG-039  Heap usage                                  LOW

   FAIL  KG-028  Auto-create topics disabled                 MEDIUM
   FAIL  KG-030  Delete topic disabled                       MEDIUM
   FAIL  KG-034  Network threads appropriate                 LOW
   FAIL  KG-040  GC logging enabled                          LOW

📝 Generating reports...

good compliance score  score=83.8%  passed=17  failed=4  total=21

✅ Reports saved to ./pawsman-audit-may2026/
   scan-20260503-d0c3cd34.json   (34 KB)
   scan-20260503-d0c3cd34.html   (75 KB)

✅ Scan complete!  scan_id=d0c3cd34  duration=131ms

The CLI running a scan — real-time control evaluation streaming to the terminal.

Score: 83.8% on the first pass. 17 of 21 reliability and operational controls pass. For a cluster that was set up without a security checklist, that's a reasonable starting point — but it's only 21 controls. The real picture emerges in Part 2 when we run the full 55-control audit.


Understanding the Four Failures

KG-028 — Auto-Create Topics Disabled ⚠️ MEDIUM

What it means: Kafka is configured to automatically create a topic the first time a producer or consumer references it. This is fine in development but catastrophic in production.

Why it matters for Pawsman: A bug in their notification service once caused pawsman.notifications-v2 to be silently created with default settings — replication factor 1, no retention policy. It processed 3 days of notification events before anyone noticed the data wasn't in the right topic. With auto.create.topics.enable=false, that bug would have thrown an immediately visible exception instead.
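Until the broker setting is flipped, a cheap client-side guard catches the same class of bug at the producer. This helper is purely illustrative (not part of KafkaGuard or Pawsman's codebase); the idea is to fail loudly on any topic name outside a known inventory:

```python
# Known production topics; a typo'd name fails fast instead of being auto-created.
ALLOWED_TOPICS = {
    "pawsman.orders", "pawsman.payments", "pawsman.user-events",
    "pawsman.product-views", "pawsman.pet-profiles", "pawsman.inventory",
    "pawsman.notifications", "pawsman.vet-appointments",
}

def checked_topic(name: str) -> str:
    """Validate a topic name before handing it to the producer client."""
    if name not in ALLOWED_TOPICS:
        raise ValueError(f"unknown topic {name!r}: refusing to publish "
                         "(is auto-create masking a typo?)")
    return name
```

With a check like this in the publish path, the pawsman.notifications-v2 bug would have raised on the first send instead of silently creating a topic.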

Remediation:

# server.properties on all brokers
auto.create.topics.enable=false

Apply with a rolling restart (one broker at a time) to avoid downtime.


KG-030 — Delete Topic Disabled ⚠️ MEDIUM

What it means: delete.topic.enable=true means anyone with broker access can delete a topic permanently. On a Kafka 4.0 KRaft cluster with no ACLs, that's anyone who can reach the broker.

Why it matters for Pawsman: Their pawsman.payments topic contains payment event data they're legally required to retain for 7 years under PCI-DSS. Accidental deletion would be a compliance violation and a data loss incident simultaneously.

Remediation:

# server.properties on all brokers
delete.topic.enable=false

Note: On Kafka 4.0 KRaft clusters, this setting prevents deletion via the Admin API but does not prevent controller-level operations. Pair with ACL controls (covered in Part 2) for defense in depth.
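Deletion protection aside, the payments topic's retention has to actually cover the 7-year PCI window (Kafka's default is 7 days). Here is the arithmetic, plus the stock Kafka CLI invocation it would feed, sketched with Pawsman's hostname; many teams instead archive to object storage rather than retain that long in Kafka:

```python
# retention.ms for 7 years (365-day years; leap days ignored in this sketch)
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000   # 31_536_000_000
retention_ms = 7 * MS_PER_YEAR
print(retention_ms)  # 220752000000

# Applied as a topic-level override with the standard Kafka tooling:
#   kafka-configs.sh --bootstrap-server pawsman-kafka-01:9092 --alter \
#     --entity-type topics --entity-name pawsman.payments \
#     --add-config retention.ms=220752000000
```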


KG-034 — Network Threads ⚠️ LOW

What it means: num.network.threads is set to the default (3), which is appropriate for a development cluster but insufficient for Pawsman's 12-partition user-events and product-views topics under peak load.

Remediation:

num.network.threads=8

KG-040 — GC Logging Disabled ⚠️ LOW

What it means: JVM garbage collection logging is off. When brokers have a GC pause that causes consumer lag or a rebalance, there's no trace to diagnose it.

Remediation:

# In kafka-server-start.sh or KAFKA_OPTS environment variable
export KAFKA_OPTS="-Xlog:gc*:file=/var/log/kafka/gc.log:time,level,tags:filecount=10,filesize=100m"

Reading the JSON Report

The JSON report is machine-readable and CI/CD friendly. Here's an excerpt showing the structure:

{
  "scan_id": "d0c3cd34-2fe3-474a-a05b-2f7ccccc80f6",
  "cluster": {
    "name": "pawsman-kafka-prod",
    "brokers": 3,
    "topics": 8,
    "cluster_mode": "kraft"
  },
  "timestamp": "2026-05-03T22:03:36+10:00",
  "score": {
    "total": 83,
    "max": 100,
    "percentage": 83.78
  },
  "summary": {
    "total": 21,
    "passed": 17,
    "failed": 4,
    "na": 0
  },
  "findings": [
    {
      "control_id": "KG-028",
      "title": "Auto-create topics disabled",
      "status": "fail",
      "severity": "MEDIUM",
      "category": "operational",
      "description": "auto.create.topics.enable must be false in production",
      "remediation": "Set auto.create.topics.enable=false in server.properties"
    },
    {
      "control_id": "KG-016",
      "title": "Replication factor ≥ 3",
      "status": "pass",
      "severity": "HIGH",
      "category": "reliability"
    }
  ]
}

This output drops directly into any pipeline:

# Count failed HIGH-severity controls (for CI/CD gate)
jq '[.findings[] | select(.status=="fail" and .severity=="HIGH")] | length' scan.json

# Export to CSV for Jira import
kafkaguard scan --bootstrap pawsman-kafka-01:9092 --policy policies/baseline-dev.yaml --format csv --out ./reports
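Where jq isn't on the box (slim CI images, Windows runners), the same gate fits in a few lines of standard-library Python. A sketch against the report schema shown above; the function names and the HIGH-severity threshold are our own choices:

```python
import json
import sys

def failed_high(report: dict) -> list[dict]:
    """Failed HIGH-severity findings from a KafkaGuard JSON report dict."""
    return [
        f for f in report.get("findings", [])
        if f["status"] == "fail" and f["severity"] == "HIGH"
    ]

def gate(report_path: str) -> int:
    """CI exit status: 1 if any HIGH-severity control failed, else 0."""
    with open(report_path) as fh:
        blockers = failed_high(json.load(fh))
    for f in blockers:
        print(f'{f["control_id"]}: {f["title"]}', file=sys.stderr)
    return 1 if blockers else 0

# In the pipeline step: sys.exit(gate("scan.json"))
```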

The HTML Report

Open the generated HTML file in any browser:

open ./pawsman-audit-may2026/scan-20260503-d0c3cd34.html

[Screenshot: the KafkaGuard HTML report]

The HTML report includes:

  • Executive summary with score gauge and severity breakdown
  • Full control table with pass/fail status, severity, and one-click expandable remediation
  • Cluster metadata — broker count, KRaft mode detected, Confluent version
  • Print-friendly — exports cleanly to PDF from the browser for stakeholder reviews

Where Pawsman Stands After Part 1

| Category | Controls | Passed | Failed | Score |
| --- | --- | --- | --- | --- |
| Reliability | 13 | 13 | 0 | 100% |
| Operational | 8 | 4 | 4 | 50% |
| Total (baseline-dev) | 21 | 17 | 4 | 83.8% |

The good news: Pawsman's cluster is reliably configured. Replication factors, ISR settings, and KRaft quorum health all pass. The failures are operational hygiene issues — fixable in an afternoon.

The not-so-good news: baseline-dev's 21 controls only cover reliability and operations. It doesn't touch authentication, encryption, or authorization at all. That's where things get uncomfortable.


A Sneak Peek: The Dashboard

Once findings are uploaded to the KafkaGuard on-prem dashboard (covered in Part 3), every scan is tracked in one place. Here's what Pawsman's cluster looks like after a few weeks of continuous scanning:

The clusters overview — compliance score, open findings count, and last scan time at a glance.

Every scan in the history timeline — score trend, policy used, and who uploaded it.

We'll set this up in Part 3. For now, the CLI JSON and HTML reports are everything you need.


What's in Part 2

In Part 2, we run the finance-iso policy — all 55 controls, including the security controls that baseline-dev skips entirely. We'll generate the PDF audit report, walk through the PCI-DSS and SOC 2 compliance mapping, and show Pawsman's path from 67.8% to SOC 2 readiness.

Spoiler: KG-001, KG-002, and KG-003 all fail. Payment data is flowing over PLAINTEXT Kafka with no authentication and no ACLs.

Continue to Part 2: The Full 55-Control Compliance Audit


Try It Yourself

Everything in this guide uses the free Community edition. No account required.

# Download the on-prem bundle (CLI + compose + policies included)
curl -LO https://github.com/KafkaGuard/kafkaguard-releases/releases/download/v2.3.0/kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz
tar -xzf kafkaguard-onprem-v2.3.0-Darwin_arm64.tar.gz
cd kafkaguard-onprem-v2.3.0/
sudo mv kafkaguard /usr/local/bin/

# Scan your cluster
kafkaguard scan --bootstrap your-kafka:9092 --policy policies/baseline-dev.yaml --format html

The Community tier includes 55 controls, all 4 report formats (JSON, HTML, PDF, CSV), and Slack/Teams alerts — free for a single cluster, forever.

Download KafkaGuard v2.3.0 →
