
Why 80% of Kafka Clusters Would Fail a SOC 2 Audit Tomorrow

We aggregated findings from 50 production Kafka scans. The patterns are consistent — and uncomfortable. Here's what's actually breaking, mapped to SOC 2 controls, with the fixes.

KafkaGuard Team·2026-05-09·7 min read

The Uncomfortable Number

We aggregated findings from 50 production Kafka cluster scans. 80% of them had at least one finding that would fail a SOC 2 Type II audit on the spot. Not "needs improvement." Not "compensating control accepted." Fail.

The findings are not exotic. They're not edge cases. They're the same handful of mistakes, repeated across teams, frameworks, and managed-Kafka providers. This post breaks down the most common ones, what SOC 2 control they map to, and what to change.

If you're preparing for a SOC 2 audit with Kafka in scope — or you suspect an upcoming auditor question — read on. If you'd rather just scan your cluster, grab the binary from GitHub Releases (Linux, macOS, or Docker) and run it. Either path works.


What "in scope" actually means

Before we get to findings, the question every team gets wrong: is Kafka in your SOC 2 scope?

If your Kafka clusters touch any of the following, the answer is yes:

  • Customer data or PII
  • Payment information
  • Healthcare data
  • Authentication tokens or session data
  • Financial transactions
  • Audit logs from other in-scope systems

Most teams realise this in the auditor's second meeting, not the planning phase. By then, the surface area is larger than they remembered. See our SOC 2 compliance checklist for the full 55-control mapping.


The 7 Findings That Keep Showing Up

We found 80% of clusters failing on at least one of these seven controls. The frequencies are aggregated across all 50 scans.

1. Inter-broker traffic in plaintext (73% of clusters)

SOC 2 control: CC6.7 — The entity restricts the transmission, movement, and removal of information to authorized internal and external users and processes, and protects it during transmission.

If security.inter.broker.protocol is PLAINTEXT, replication, controller messages, and ISR state move between brokers unencrypted. Anyone with network access between brokers (a compromised pod, a misconfigured peering, a forgotten debug listener) can read every message replicating across the cluster.

Fix:

security.inter.broker.protocol=SSL
ssl.keystore.location=/etc/kafka/ssl/server.keystore.jks
ssl.keystore.password=<keystore-password>
ssl.truststore.location=/etc/kafka/ssl/server.truststore.jks
ssl.truststore.password=<truststore-password>

This is a rolling-restart change. Plan it, but don't ship without it.
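Once the change has rolled out, it's worth sanity-checking that the replication listener is actually serving TLS. A minimal probe with `openssl` (the broker hostname and port here are placeholders for your own):

```shell
# Probe the inter-broker listener; a successful TLS handshake returns a
# certificate we can inspect. broker-1.internal:9093 is a placeholder.
openssl s_client -connect broker-1.internal:9093 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -enddate
```

If this prints a certificate subject and expiry, the listener is encrypted; if it hangs or errors, the listener is still plaintext.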

2. Wildcard ACLs on production topics (73%)

SOC 2 control: CC6.1 — The entity implements logical access security software, infrastructure, and architectures over protected information assets.

ACLs granted to User:* or Group:*. Every authenticated principal can read or write the topic. It persists because someone needed quick access during an incident, granted a wildcard, and never came back. The auditor will read the ACL list line by line; they will find this.

Fix: Replace wildcards with explicit principals or a dedicated group. If your ACL surface is complex, declare ACLs in IaC and audit drift on every deploy.
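A sketch of the remediation with the stock `kafka-acls.sh` tool (broker address, topic, and principal names are illustrative; `client.properties` stands in for your admin client config):

```shell
# List every ACL so wildcard principals are easy to spot.
kafka-acls.sh --bootstrap-server broker-1.internal:9093 \
  --command-config client.properties --list

# Remove the wildcard grant (topic name is illustrative).
kafka-acls.sh --bootstrap-server broker-1.internal:9093 \
  --command-config client.properties \
  --remove --allow-principal 'User:*' \
  --operation Read --topic payments

# Re-grant to the one service that actually needs it.
kafka-acls.sh --bootstrap-server broker-1.internal:9093 \
  --command-config client.properties \
  --add --allow-principal User:payments-consumer \
  --operation Read --topic payments
```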

3. No client authentication (52%)

SOC 2 control: CC6.1 (logical access controls)

A PLAINTEXT:// listener bound to anything other than localhost. Even when the listener is "internal-only," internal increasingly means every workload in the VPC, every compromised pod, every leaked service-account token. Network position is not an authentication mechanism in 2026.

Fix: SASL/SCRAM-SHA-512 at minimum, mTLS for stronger identity guarantees. Disable the unauth listener entirely:

listeners=SASL_SSL://0.0.0.0:9093
sasl.enabled.mechanisms=SCRAM-SHA-512
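Each client principal then needs SCRAM credentials on the cluster. One way to create them with `kafka-configs.sh` (user name and password below are placeholders):

```shell
# Create SCRAM-SHA-512 credentials for a client principal.
# orders-service and the password are placeholders.
kafka-configs.sh --bootstrap-server broker-1.internal:9093 \
  --command-config admin.properties \
  --alter --entity-type users --entity-name orders-service \
  --add-config 'SCRAM-SHA-512=[password=change-me]'
```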

4. auto.create.topics.enable=true (44%)

SOC 2 control: CC8.1 — The entity authorizes, designs, develops, configures, documents, tests, approves, and implements changes to infrastructure, data, software, and procedures to meet its objectives.

Why this is a SOC 2 finding (not just hygiene): topics created on demand have no naming convention, no retention policy, no ACL, no documentation, no change-management trail. That's an unauthorized configuration change in CC8.1 terms.

Fix: Set it to false. Once. Topic creation routes through your provisioning pipeline like every other infrastructure change.
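With auto-creation off, topics are created explicitly, through the CLI or (better) your IaC pipeline, with a name, retention, and replication factor chosen on purpose. A sketch with `kafka-topics.sh` (topic name and settings are illustrative):

```shell
# Deliberate topic creation with an explicit retention policy.
# Names and numbers are illustrative.
kafka-topics.sh --bootstrap-server broker-1.internal:9093 \
  --command-config admin.properties \
  --create --topic orders.v1 \
  --partitions 12 --replication-factor 3 \
  --config retention.ms=604800000
```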

5. Outdated Kafka with unpatched CVEs (32%)

SOC 2 control: CC7.1 — The entity uses detection and monitoring procedures to identify (1) changes to configurations that result in the introduction of new vulnerabilities, and (2) susceptibilities to newly discovered vulnerabilities.

We routinely find clusters on 2.6, 2.8, or 3.1. CVE-2023-25194 (RCE in Kafka Connect via attacker-supplied SASL JAAS JndiLoginModule config), CVE-2024-27309 (incorrect ACL enforcement during ZooKeeper-to-KRaft migration), and others have public exploit write-ups and were disclosed long enough ago that they should have been patched.

Fix: Run kafka-broker-api-versions.sh --bootstrap-server <broker> and compare. Plan the rolling upgrade. If you're stuck on an old version because of a downstream client that won't upgrade, document the compensating control — and make sure it's actually compensating.

6. JMX exposed without authentication (28%)

SOC 2 control: CC6.6 — The entity implements logical access security measures to protect against threats from sources outside its system boundaries.

Default port 9999, no auth, no TLS: anyone on the broker network can read JMX MBeans (broker internals, topic metadata, consumer group state). Some jmxremote configurations also allow remote code execution via Java deserialization.

Fix:

com.sun.management.jmxremote.authenticate=true
com.sun.management.jmxremote.ssl=true

Or, if you don't need JMX externally, drop it entirely.
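Kafka's startup scripts pick JMX settings up from the KAFKA_JMX_OPTS environment variable, so one way to apply the fix is in the broker's start script or unit file. A sketch, with placeholder file paths:

```shell
# Read by kafka-server-start.sh via kafka-run-class.sh.
# Password/access file paths are placeholders.
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=true \
  -Dcom.sun.management.jmxremote.ssl=true \
  -Dcom.sun.management.jmxremote.password.file=/etc/kafka/jmx.password \
  -Dcom.sun.management.jmxremote.access.file=/etc/kafka/jmx.access"
```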

7. No audit log retention (24%)

SOC 2 control: CC4.1 — The entity selects, develops, and performs ongoing and/or separate evaluations. (and CC7.2 monitoring)

Audit logs enabled but with no retention policy, or no audit logs at all. The auditor will ask: "Who consumed from this topic in Q1?" If the answer is "we don't know," it's a finding. If the answer is "we know, but the logs got rotated three weeks ago," it's also a finding.

Fix:

# ZooKeeper-based clusters; KRaft clusters use
# org.apache.kafka.metadata.authorizer.StandardAuthorizer instead
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
log4j.logger.kafka.authorizer.logger=INFO, authorizerAppender

Plus an explicit retention policy on the audit log appender. 90 days minimum for SOC 2 Type II; 365 days if you want headroom.
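A sketch of the appender side in log4j.properties, mirroring the layout Kafka ships by default (paths are placeholders). Note that DailyRollingFileAppender rolls files but doesn't delete them, so the retention window itself is enforced by whatever ships or prunes the rolled logs:

```properties
# Daily-rolled authorizer log; pair with external log shipping so the
# 90-day retention survives broker replacement. Paths are placeholders.
log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.authorizerAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.authorizerAppender.File=/var/log/kafka/kafka-authorizer.log
log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
```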


What This Doesn't Cover

This post focuses on configuration findings. SOC 2 Type II also evaluates operational controls — change management, incident response, vendor management — that no scanner can validate. KafkaGuard handles the configuration side. The operational side is your team's process work.

The full mapping across PCI-DSS 4.0, SOC 2 Type II, and ISO 27001 Annex A is documented in our SOC 2 compliance checklist and the PCI-DSS scanning guide.


What to Do This Week

If you read this and felt a chill, you have three options, in order of effort:

  1. Run a scan. Download the binary from GitHub Releases (Linux, macOS, or Docker), point it at a broker, and you'll have a findings list in 90 seconds. Free. Open-source. Single binary.
  2. Pick the highest-blast-radius finding from the seven above and fix it this week. Inter-broker TLS and wildcard ACLs are usually the worst offenders.
  3. Schedule the audit prep meeting with your security team using this list as the agenda.

You don't need to fix all seven before the auditor arrives. You need to be able to answer "yes, we know about it, here's the remediation timeline, here's the compensating control." The cluster doesn't have to be perfect. The story does.


Methodology Note

The 50 scans referenced are a mix of self-hosted Apache Kafka, Confluent Platform, Amazon MSK, Aiven, and Redpanda — anonymized data from KafkaGuard runs across our user base over Q1–Q2 2026. We did not include scans against test clusters or scans run with --policy baseline-dev (which intentionally relaxes checks). Findings are deduplicated per cluster.

If you'd like to contribute anonymized findings to the next iteration of this dataset, open a discussion on the GitHub repo.


Written by the KafkaGuard team. KafkaGuard is an open-source Kafka security and compliance scanner — 55 controls, PCI-DSS / SOC 2 / ISO 27001, single binary, runs in 90 seconds. Try it free.
