CSPM vs. Reality: What Cloud Security Tools Promise and What They Actually Deliver

Anthony G. Tellez · 11 min read
Cloud Security · CSPM · AWS · Security · DevSecOps · Infrastructure · CNAPP · Prisma Cloud · Wiz · Lacework · Kubernetes · IaC · 2023

Through the first half of 2023 I ran competitive bake-offs for a living. As a Cloud Security Architect at Palo Alto Networks working on Prisma Cloud, I was in enterprise environments almost every week running proof-of-concept evaluations against Wiz, Lacework, and Sysdig across AWS, Azure, and GCP, sometimes all three in the same account. Fortune 500 companies, financial services, healthcare, tech companies with microservices spread across hundreds of accounts. I watched a lot of demos go well and a lot of production deployments go sideways.

This post is what I actually learned from that work. Not the positioning deck. The field reality.

What CSPM Tools Promise

The pitch is compelling. Deploy an agentless scanner, connect your cloud accounts, and within hours you have unified visibility across every resource, continuous compliance against CIS, SOC 2, PCI-DSS, and a real-time map of every misconfiguration, exposed bucket, and overprivileged IAM role in your environment.

The core value proposition has three pillars: unified visibility (one pane of glass across multi-cloud), continuous compliance (automated evidence collection for auditors), and drift detection (alert when infrastructure diverges from a known-good state).
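Of the three pillars, drift detection is the easiest to make concrete. A minimal sketch, assuming a simple key-value view of resource configuration (real CSPM tools pull live state from cloud provider APIs and evaluate it against policy, not a hardcoded dict):

```python
# Sketch of drift detection: compare a resource's live configuration against a
# known-good baseline and report every setting that diverged. The S3-style
# settings below are illustrative, not any vendor's actual schema.

def detect_drift(baseline: dict, live: dict) -> dict:
    """Return {setting: (expected, actual)} for each diverging setting."""
    drift = {}
    for key, expected in baseline.items():
        actual = live.get(key)
        if actual != expected:
            drift[key] = (expected, actual)
    return drift

baseline = {"encryption": "aws:kms", "versioning": "Enabled", "public_access_block": True}
live     = {"encryption": "aws:kms", "versioning": "Suspended", "public_access_block": True}

print(detect_drift(baseline, live))
# {'versioning': ('Enabled', 'Suspended')}
```

The hard part in practice is not the diff; it is deciding what counts as the known-good state when the baseline itself is spread across Terraform, ClickOps changes, and tribal knowledge.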

In a greenfield environment with a mature security team and clean infrastructure, this works largely as advertised. The problem is that almost nobody has a greenfield environment with clean infrastructure.

The Competitive Landscape: An Honest Assessment

I ran enough bake-offs to have real opinions here.

Wiz consistently won on time-to-value. Their agentless architecture using cloud-provider snapshots is genuinely differentiated. You connect an AWS account, give it read permissions on your snapshots, and within a few hours you have vulnerability data on every running workload without touching a single VM. For a CISO trying to show board-level risk reduction in 30 days, that is a powerful story. Wiz also built one of the better attack path visualizations in the space: the ability to show a chain from an exposed S3 bucket through a compromised EC2 instance to a database with sensitive data is the kind of thing that gets budget approved. Where Wiz was weakest in the bake-offs I ran: runtime protection depth and anything requiring agent-based telemetry. Agentless gives you a snapshot view, not a continuous one.
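The attack path idea reduces to graph reachability: which chains lead from an internet-exposed node to sensitive data. A toy illustration, with hypothetical resource names and edges (this is not Wiz's actual graph model):

```python
# Toy attack-path analysis: model resources as a directed graph and enumerate
# chains from an internet-facing entry point to sensitive data.

edges = {
    "internet": ["s3-public-bucket", "ec2-web"],
    "s3-public-bucket": ["ec2-web"],   # e.g. a deploy key leaked in the bucket
    "ec2-web": ["rds-customers"],      # instance role can reach the database
    "rds-customers": [],
}
SENSITIVE = {"rds-customers"}

def attack_paths(node, path=()):
    path = path + (node,)
    if node in SENSITIVE:
        yield path
    for nxt in edges.get(node, []):
        if nxt not in path:            # avoid cycles
            yield from attack_paths(nxt, path)

for p in attack_paths("internet"):
    print(" -> ".join(p))
```

Each printed chain is exactly the board-friendly story described above: exposed bucket, compromised instance, sensitive database.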

Lacework was the behavioral analytics story. Their Polygraph technology builds a learned model of normal activity for your environment and alerts on deviations, which is genuinely interesting, particularly for detecting compromised credentials and insider threats. In environments where the security team cared more about detecting active compromise than enumerating misconfigurations, Lacework often looked stronger. The weakness was that their behavioral models need time to learn your environment, and in competitive POCs on a 30-day timeline, the models were still noisy. Their multi-cloud compliance coverage was also thinner than Prisma Cloud's at the time.

Prisma Cloud wins on breadth. If you want a single platform that covers CSPM, cloud workload protection, container security, serverless, IaC scanning, UEBA, and native SOAR integration, Prisma Cloud is the most complete CNAPP platform on the market. In enterprises that had already standardized on Palo Alto for network security or XSOAR for response, the integration story was compelling. Where we lost deals: when the prospect had a simple, well-scoped problem (just "give me visibility into misconfigurations across my three AWS accounts"), Prisma Cloud's complexity and licensing structure looked like overkill next to Wiz's clean onboarding experience.

The Findings Avalanche Problem

Here is the thing the demos never show. You connect a mature enterprise cloud environment to any of these tools and within 48 hours you have 50,000 findings. Maybe 200,000. I have seen environments where the initial scan produced over a million individual policy violations.

Some of those findings are critical. Most are not. A subset are accurate. And a meaningful percentage are false positives or context-dependent findings that require understanding of how the organization actually operates.

This is the alert fatigue problem at a posture management scale, and it is worse than network security alert fatigue in one important way: CSPM findings are attached to real infrastructure that someone owns, and that someone has a day job that is not remediating cloud security findings. When you dump 50,000 findings on an engineering team, the predictable outcome is that they tune you out.

The tools that handled this best had two things: effective risk prioritization that could surface the genuinely exploitable, high-severity issues first, and a workflow layer that made it tractable for developers to understand and fix findings in their own context. Wiz's attack path scoring was good at the first. Nobody solved the second one particularly well.

The operational reality for most enterprises is that you deploy a CSPM tool, do an initial triage sprint to suppress or accept findings you are not going to fix, tune the policies to your environment's baseline, and then try to operationalize a sustainable weekly or sprint-based remediation workflow. That tuning process takes months, not days.
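The triage sprint above amounts to two operations: suppress what you have risk-accepted, then rank what remains so exploitable high-severity issues surface first. A minimal sketch; the field names and scoring weights are illustrative assumptions, not any vendor's scoring model:

```python
# Sketch of finding triage: drop risk-accepted policies, then sort by a
# severity score that is doubled for internet-exposed resources.

SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}

findings = [
    {"id": "F1", "severity": "critical", "internet_exposed": True,  "policy": "s3-public"},
    {"id": "F2", "severity": "low",      "internet_exposed": False, "policy": "no-versioning"},
    {"id": "F3", "severity": "high",     "internet_exposed": False, "policy": "iam-wildcard"},
]
accepted_policies = {"no-versioning"}  # risk-accepted during the triage sprint

def prioritize(findings):
    live = [f for f in findings if f["policy"] not in accepted_policies]
    return sorted(
        live,
        key=lambda f: SEVERITY_WEIGHT[f["severity"]] * (2 if f["internet_exposed"] else 1),
        reverse=True,
    )

print([f["id"] for f in prioritize(findings)])  # ['F1', 'F3']
```

The months-long part is not this logic; it is the organizational negotiation over which policies land in the accepted set.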

Container and Kubernetes Security: Why Image Scanning Is Not Enough

Image scanning is the part of container security that vendors love to demo because it produces clear, auditable output: here are the CVEs in your container images, here are the layers they came from, here is the CVSS score. This is useful. It is also insufficient.

By the time a container is running in production, the image vulnerability data tells you what was true at build time. What matters in a running cluster is runtime behavior: what system calls is the container making, what network connections has it established, is it trying to write to the filesystem in unexpected locations, is it spawning child processes it should not be.

Runtime protection requires an agent. The debate in the container security space was around the mechanism: kernel module, eBPF, or sidecar injection. eBPF-based approaches provided rich syscall-level telemetry with lower overhead than kernel modules. Sidecars were operationally simpler to deploy in some environments but gave you less visibility.
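Whatever the collection mechanism, the detection logic boils down to comparing observed behavior against a learned baseline. A toy version with hardcoded syscall windows (a real agent would stream this telemetry from eBPF and use a far richer model than set membership):

```python
# Toy runtime behavioral detection: learn the syscalls a container normally
# makes, then flag anything outside that baseline in a live window.

baseline_window = ["read", "write", "openat", "recvfrom", "sendto"]
live_window     = ["read", "write", "openat", "execve", "connect"]

learned = set(baseline_window)
anomalies = [sc for sc in live_window if sc not in learned]

print(anomalies)  # ['execve', 'connect'] -- a process spawn and a new connection
```

This is also why build-time CVE data cannot substitute for runtime signal: an `execve` from a container that has never spawned a process is interesting regardless of what the image scan said.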

The Kubernetes admission controller piece is where shift-left theory meets enforcement reality. Admission controllers can block non-compliant workloads at deploy time: images that have not been scanned, images with critical CVEs, pods that request privileged mode. In theory this is the enforcement layer that makes your security posture real rather than aspirational. In practice I watched multiple enterprises back off strict enforcement after their first production incident where a legitimate workload got blocked because a dependency had a CVE that had not been patched yet in the upstream image. The organizational pressure to switch admission controllers from block mode to audit mode and leave them there is enormous.
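The block-versus-audit tension is easiest to see in the decision logic itself. A sketch of an admission check over a simplified pod descriptor; the field names are illustrative, not a real ValidatingWebhook payload:

```python
# Sketch of admission-controller decision logic: block on unscanned images,
# critical CVEs, or privileged mode -- unless running in audit mode, where
# violations are recorded but the workload is admitted anyway.

def admit(pod: dict, mode: str = "block") -> tuple[bool, list[str]]:
    violations = []
    if not pod.get("image_scanned"):
        violations.append("image has not been scanned")
    if pod.get("critical_cves", 0) > 0:
        violations.append(f"image has {pod['critical_cves']} critical CVEs")
    if pod.get("privileged"):
        violations.append("pod requests privileged mode")
    allowed = (mode == "audit") or not violations
    return allowed, violations

pod = {"image_scanned": True, "critical_cves": 1, "privileged": False}
print(admit(pod, mode="block"))  # (False, ['image has 1 critical CVEs'])
print(admit(pod, mode="audit"))  # (True, ['image has 1 critical CVEs'])
```

That one-line difference between the two modes is exactly where the organizational pressure lands after the first blocked legitimate deploy.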

IaC Scanning: Shift-Left in Practice

Shift-left is one of those ideas that is unambiguously correct in principle and extremely difficult to operationalize without creating developer friction. The idea is straightforward: scan your Terraform, CloudFormation, or Helm charts before they get deployed, fail the CI/CD pipeline on security policy violations, and eliminate entire classes of misconfigurations before they reach production.

The reality is that the tools that are good at catching security issues, such as overly permissive IAM policies, unencrypted storage, and open security groups, produce findings that are often technically correct but contextually complicated. A Terraform module that creates an S3 bucket without versioning enabled will fail a CIS benchmark check, but if that bucket is a temporary build artifact store, the risk context is completely different than if it is production data.

What worked: using IaC scanning as a gate on high-risk categories only, specifically public exposure, no encryption for sensitive data, and admin wildcards in IAM. What did not work: treating IaC scanning as a comprehensive compliance gate. Developer tolerance for pipeline failures caused by security tooling is low, and if you burn that trust on low-severity findings, they will find ways around your checks.
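A minimal sketch of that gate as a pipeline step: fail the build only for the blocking categories, surface everything else as warnings. The category and rule names are illustrative assumptions, not any scanner's actual taxonomy:

```python
# Sketch of "gate on high-risk categories only": partition IaC scan results
# into pipeline-blocking findings and non-blocking warnings.

BLOCKING = {"public-exposure", "unencrypted-sensitive-data", "iam-admin-wildcard"}

scan_results = [
    {"rule": "s3-bucket-public-read", "category": "public-exposure"},
    {"rule": "s3-no-versioning",      "category": "cis-hardening"},
    {"rule": "sg-open-to-world",      "category": "public-exposure"},
]

def gate(results):
    blockers = [r for r in results if r["category"] in BLOCKING]
    warnings = [r for r in results if r["category"] not in BLOCKING]
    return blockers, warnings

blockers, warnings = gate(scan_results)
print("fail pipeline:", bool(blockers), [r["rule"] for r in blockers])
```

The versioning finding still gets reported, but it cannot burn developer trust by failing a build.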

The teams I saw do this successfully had two things: a clear, published policy on exactly which findings would block a deployment, and a mechanism for developers to request exceptions with a ticket. Friction you cannot override is friction that gets worked around. Friction with a legitimate escape valve tends to be accepted.

CSPM and SIEM Integration: How Enterprises Actually Consume This Data

Most enterprises already have a SIEM: Splunk, Chronicle, Microsoft Sentinel. The question is not whether to forward CSPM findings to the SIEM; it is how to make that data actionable in the context of everything else the SOC is already handling.

The pattern I saw work: forward only high-and-critical findings to the SIEM. Keep medium-and-below in the CSPM tool where the cloud security team manages them. Create correlation rules in the SIEM that join CSPM findings with CloudTrail, VPC Flow Logs, or Azure Activity Logs to look for active exploitation of known misconfigured resources. A CSPM finding that an IAM role has excessive S3 permissions is interesting context. That same IAM role showing up in a GetObject call to a sensitive bucket from an unexpected IP address at 2am is an incident.
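The context-versus-incident distinction can be expressed as a join. A sketch of that correlation rule in plain Python; the event fields loosely mirror CloudTrail's shape, and the role, bucket, and CIDR values are hypothetical:

```python
# Sketch of a SIEM correlation rule: escalate when an IAM role the CSPM tool
# flagged for excessive S3 permissions actually reads a sensitive bucket from
# outside known corporate ranges.

flagged_roles = {"role/build-runner"}     # CSPM finding: excessive S3 permissions
sensitive_buckets = {"customer-exports"}
known_prefixes = ("10.",)                 # corporate ranges, heavily simplified

cloudtrail = [
    {"eventName": "GetObject", "role": "role/build-runner",
     "bucket": "customer-exports", "sourceIP": "203.0.113.7"},
    {"eventName": "GetObject", "role": "role/build-runner",
     "bucket": "customer-exports", "sourceIP": "10.0.4.12"},
]

incidents = [e for e in cloudtrail
             if e["eventName"] == "GetObject"
             and e["role"] in flagged_roles
             and e["bucket"] in sensitive_buckets
             and not e["sourceIP"].startswith(known_prefixes)]

print(incidents)  # only the 203.0.113.7 access escalates
```

The in-range access stays as context; the out-of-range one becomes the 2am incident.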

The other integration pattern that added real value: ticketing. Jira integration that automatically creates tickets for new high-severity findings, routes them to the appropriate team based on cloud resource tagging, and tracks remediation SLAs. CSPM tools that had this built in or had solid webhook support for building it were meaningfully easier to operationalize than tools that expected the security team to manage remediation entirely within the CSPM UI.
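The routing piece is simple enough to sketch: map the resource's owner tag to a team's project and attach a remediation SLA by severity. Project keys, tag names, and SLA values here are hypothetical, not any Jira instance's actual configuration:

```python
# Sketch of tag-based ticket routing for new high-severity findings.

TEAM_PROJECTS = {"payments": "PAY", "platform": "PLAT"}
SLA_DAYS = {"critical": 7, "high": 30}

def route(finding: dict) -> dict:
    team = finding.get("tags", {}).get("owner", "platform")  # fallback owner
    return {
        "project": TEAM_PROJECTS.get(team, "PLAT"),
        "summary": f"[{finding['severity'].upper()}] {finding['title']}",
        "sla_days": SLA_DAYS.get(finding["severity"]),
    }

finding = {"severity": "high", "title": "RDS instance publicly accessible",
           "tags": {"owner": "payments"}}
print(route(finding))
# {'project': 'PAY', 'summary': '[HIGH] RDS instance publicly accessible', 'sla_days': 30}
```

Note the dependency this creates: routing is only as good as your resource tagging, which is itself a finding category in most CSPM tools.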

Build vs. Buy: When CNAPP Consolidation Makes Sense

The consolidation narrative, which promises to replace five point tools with a single CNAPP platform, has real appeal for enterprises managing vendor complexity. It also has real costs that do not show up on the comparison spreadsheet.

Consolidation makes sense when you are standardizing from scratch or doing a major platform refresh, the integration between modules genuinely reduces analyst work, your security team is small enough that managing fewer tools matters more than having best-of-breed capabilities in each category, and the licensing economics work at your scale.

It does not make sense when you have deep existing investments in specific tools your team knows well, the integrated platform has a weak module in an area that is critical to your threat model, or your organization's procurement process makes switching costs higher than any efficiency gains.

The honest answer I gave prospects in bake-offs: if you are evaluating cloud security tools for the first time and want to move fast, Wiz will get you to value faster. If you want the most complete coverage and you are already in the Palo Alto ecosystem, Prisma Cloud is the right long-term bet. If your threat model centers on detecting active compromise rather than enumerating misconfigurations, Lacework's behavioral analytics are worth the evaluation time.

Advice for Security Teams Evaluating Cloud Security Tools

Define your threat model before you start evaluating tools. The features that matter for passing a SOC 2 audit are different from those needed after three cloud breaches. Tools that market to both audiences often excel at neither.

Run your POC on your actual production environment, not a sandbox. Alert volume, false positive rate, and integration complexity look very different in real infrastructure than in a clean demo environment. I have seen POCs that looked great in a test account completely fall apart in a production environment with 400 AWS accounts and a decade of accumulated infrastructure debt.

Involve the engineering teams who will be doing remediation. A CSPM tool that security loves and engineering ignores is not a security tool; it is a reporting tool. The workflows that make findings actionable for developers matter as much as the accuracy of the findings themselves.

Set realistic expectations for the tuning timeline. Budget three to six months before your CSPM deployment is producing a signal-to-noise ratio that you can operationalize sustainably. The initial scan results are a starting point, not a finished product.

Think carefully about the agent question. Agentless tools have better time-to-value and less operational overhead. Agent-based tools have deeper runtime visibility. This is a real tradeoff, not a marketing distinction. Your choice should reflect what you actually need to see.

Cloud security tooling has matured substantially. The gap between what these platforms can do and what most enterprises actually get out of them is largely an implementation and process problem, not a technology problem. The teams I saw getting real value from their CSPM investments had invested as much in the operational workflows around the tool as in the tool itself.


Written in 2023 based on field experience as a Cloud Security Architect at Palo Alto Networks, Prisma Cloud.