June 19, 2026

Platform Engineering Tools: The 2026 Stack, Categorized

Most "top platform engineering tools" articles read like vendor roundups: twenty names ranked, each with a feature list pulled from a marketing page, no opinions, no real trade-offs. You close the tab knowing what exists and nothing about what to pick. The harder question, and the one platform teams actually ask, is which categories of tools you need, what each category solves, and where the architecture decisions sit between them.

This guide covers platform engineering tools by category: internal developer portals, infrastructure orchestration, GitOps and continuous deployment, Kubernetes and container orchestration, secrets and identity, observability and SLO management, and the architecture-decision layer most platform stacks leave underbuilt. It's written for platform engineering leads, staff engineers building internal developer platforms, and CTOs trying to figure out where the gaps in their current stack are.

What Are Platform Engineering Tools?

Platform engineering tools are the technologies a dedicated platform team uses to build, run, and maintain an internal developer platform (IDP). The job of the stack is to absorb the cognitive load product engineering teams used to carry, including spinning up environments, wiring up CI/CD, managing infrastructure, handling secrets, and instrumenting observability. The tools enable the discipline; the discipline produces the platform; the platform serves the developer.

The category boundary matters because "platform engineering tool" is sometimes used to mean every piece of software an engineering organization touches. That definition is so broad it's useless. The working definition is narrower. The category covers tools whose primary user is the platform team (and, through the platform, the internal developer). Application monitoring tools used by product teams aren't platform engineering tools; the observability layer the platform team operates is. Cloud provider consoles aren't platform engineering tools; the IaC layer the platform team builds on top of them is.

The stack splits roughly into seven categories. Most platform teams in 2026 end up with one tool from each, sometimes two when the category is broad enough to warrant a primary plus a specialist. The categories are listed below in roughly the order teams adopt them.

Internal Developer Portals (IDPs)

The internal developer portal (also abbreviated IDP, like the platform it fronts) is the user-facing surface of the platform. It's where developers find services, spin up environments, scaffold new applications, see ownership, and (if the platform team has invested) execute golden paths without needing to know what's underneath. The category has consolidated since 2022 into a small number of recognized tools.

Backstage

Backstage is the open-source IDP framework originally built by Spotify and donated to the Cloud Native Computing Foundation, where it sits in the Incubating tier. The Software Catalog, Software Templates, and TechDocs are the core building blocks; everything else is a plugin. Backstage gets adopted because it's the most extensible, has the largest plugin ecosystem, and doesn't lock the team into a vendor. The cost is a real engineering investment: a Backstage deployment that does meaningful work for a 100-engineer organization usually requires a quarter or two of focused build-out, plus ongoing plugin maintenance as the platform evolves.

Port

Port is a SaaS IDP with a software catalog, scorecards, and a self-service action layer. Teams adopt Port when they want IDP capability without committing to a Backstage build-out. The trade-off is the standard SaaS one: lower upfront cost, faster time to first useful behavior, ongoing vendor commitment, less flexibility at the edges. For organizations with limited platform engineering bandwidth, Port is often the pragmatic choice.

Cortex

Cortex sits in a similar space to Port: SaaS IDP with a catalog, scorecards (Cortex's flagship feature), and integrations. The scorecard angle is the differentiator. Teams that want quantitative measurement of production readiness across services tend to gravitate to Cortex; teams that want the broader IDP surface tend to pick Port or Backstage.
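
To make the scorecard idea concrete, here is a minimal sketch of rule-based production-readiness scoring. It is not Cortex's (or any vendor's) actual data model or rule language; the `Service` fields, the `RULES` list, and the `score` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Service:
    # Hypothetical catalog fields, chosen only for illustration.
    name: str
    owner: Optional[str]
    has_runbook: bool
    has_slo: bool
    on_call_rotation: bool


# Each rule is a (label, predicate) pair. Real products define their own rule language.
RULES: list[tuple[str, Callable[[Service], bool]]] = [
    ("has an owner", lambda s: s.owner is not None),
    ("has a runbook", lambda s: s.has_runbook),
    ("has an SLO", lambda s: s.has_slo),
    ("has on-call coverage", lambda s: s.on_call_rotation),
]


def score(service: Service) -> tuple[float, list[str]]:
    """Return the fraction of rules passed and the labels of the rules that failed."""
    failed = [label for label, check in RULES if not check(service)]
    passed = len(RULES) - len(failed)
    return passed / len(RULES), failed


checkout = Service("checkout", owner="payments", has_runbook=True,
                   has_slo=False, on_call_rotation=True)
ratio, gaps = score(checkout)
print(f"{checkout.name}: {ratio:.0%} ready, missing: {gaps}")
# prints: checkout: 75% ready, missing: ['has an SLO']
```

The useful property is that the output is a number plus a list of named gaps, which is what lets a platform team track readiness across services over time rather than argue about it case by case.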

OpsLevel

OpsLevel competes in the same SaaS IDP category, focusing on service ownership and maturity tracking. The honest framing is that Port, Cortex, and OpsLevel solve overlapping problems with different emphases, and the right choice usually comes down to which one's data model matches how the platform team actually thinks about the catalog.

The IDP category is the most visible piece of the stack, but it isn't the foundation. The foundation is the orchestration, GitOps, and observability layers underneath. An IDP without those is a portal pointing at nothing.

Infrastructure Orchestration and IaC

Infrastructure as code is where the platform team's discipline meets the cloud provider's reality. The tools in this category translate human-readable configuration into provisioned cloud resources, with varying degrees of opinion about how to do it.

Terraform / OpenTofu

Terraform remains the de facto standard for declarative infrastructure provisioning, with a multi-cloud provider ecosystem and a state-management model that the rest of the category measures itself against. OpenTofu is the Linux Foundation-backed fork of Terraform, created in 2023 following HashiCorp's switch to the Business Source License (BSL). It aims to preserve Terraform-compatible workflows and configurations. Most platform teams in 2026 are either sticking with Terraform under the current license, migrating to OpenTofu, or running both side by side while they figure it out. The category is well-understood; the licensing question isn't.

Pulumi

Pulumi takes a different approach. Instead of HCL, you write infrastructure in a general-purpose language such as TypeScript, Python, Go, or C#. The pitch is that infrastructure code lives in the same language as application code, enabling teams to reuse abstractions and tooling. The trade-off is that Pulumi's mental model assumes engineers are comfortable with general-purpose programming, whereas Terraform's HCL assumes the audience is platform and infrastructure specialists.

Crossplane

Crossplane is a Kubernetes-native control plane for managing cloud infrastructure through the Kubernetes API. The pitch is that if Kubernetes is already the platform's substrate, Crossplane lets infrastructure changes flow through the same reconciliation loops as workload changes. The category fit is real for Kubernetes-heavy platforms; the learning curve is steeper than Terraform's.

Spacelift, env0, Scalr

This sub-category is "Terraform/OpenTofu management as a service" and covers policy enforcement, drift detection, plan-and-apply workflows, and multi-environment orchestration. Useful once the team is running Terraform at a meaningful scale and wants to stop building the management layer themselves.

GitOps and Continuous Deployment

GitOps treats Git as the source of truth for what should be running, and uses controllers to reconcile the running state against the Git state. The category is well-established for Kubernetes workloads; the choices are mostly between two implementations.
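
The reconciliation idea is easier to see in code than in prose. The sketch below is a toy model, not how Argo CD or Flux is implemented: state is a plain dictionary, `reconcile` is a hypothetical function, and pruning is unconditional here, whereas real controllers make it configurable.

```python
def reconcile(desired: dict[str, dict], live: dict[str, dict]) -> list[str]:
    """Mutate `live` until it matches `desired`; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            live[name] = dict(spec)
            actions.append(f"create {name}")
        elif live[name] != spec:
            live[name] = dict(spec)
            actions.append(f"update {name}")
    # Anything running that Git no longer declares gets removed.
    for name in sorted(set(live) - set(desired)):
        del live[name]
        actions.append(f"prune {name}")
    return actions


# Hypothetical workloads: what Git declares vs. what the cluster is running.
git_state = {
    "api": {"image": "api:1.4", "replicas": 3},
    "worker": {"image": "worker:2.0", "replicas": 1},
}
cluster_state = {
    "api": {"image": "api:1.3", "replicas": 3},
    "legacy-cron": {"image": "cron:0.9", "replicas": 1},
}

first_pass = reconcile(git_state, cluster_state)
second_pass = reconcile(git_state, cluster_state)
print(first_pass)   # ['update api', 'create worker', 'prune legacy-cron']
print(second_pass)  # []
```

The empty second pass is the point: reconciliation is idempotent, so a controller can run it on a loop and only act when Git and the cluster disagree.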

Argo CD

Argo CD is the most widely adopted Kubernetes GitOps controller and part of the CNCF Graduated Argo project family. It watches a Git repository for declared application state and continuously reconciles the live cluster against it. The UI is the strongest in the category, the multi-cluster story is mature, and the ApplicationSets pattern handles the cross-cluster case. Argo CD is usually the default choice for new GitOps deployments.

Flux

Flux is the other major Kubernetes GitOps controller, originally from Weaveworks (which shut down in 2024) and now a CNCF Graduated project. The architectural choices are different (more modular, more composable), and the operational footprint is smaller. Teams that pick Flux usually want the smaller surface area; teams that pick Argo CD usually want the UI and the operational maturity.

GitHub Actions/GitLab CI/CircleCI

The category labeled "CI" is upstream of the GitOps controllers, but the boundary blurs. GitHub Actions has consolidated as the de facto CI for repositories already in GitHub, with the Actions marketplace as the extensibility layer. The platform team's job is usually to provide curated, opinionated workflow templates that product teams consume rather than rebuild.

Kubernetes and Container Orchestration

Kubernetes itself is upstream of most of the rest of the stack. The platform team's relationship to it usually breaks down into running the clusters (or paying a managed service to do so), exposing them safely to product teams, and providing the abstractions that keep developers from having to learn the raw API.

Kubernetes

The default substrate for many platform teams beyond a certain scale, though not a universal requirement. Teams with simpler workloads or strong PaaS commitments may not need to expose Kubernetes directly. The decision is whether to run on a managed control plane (EKS, GKE, AKS), with everything above it owned by the platform team, or to adopt a managed offering higher up the stack. Most platform teams in 2026 are on a managed control plane with platform-built abstractions on top.

KubeVela

KubeVela is an open-source application delivery platform built on the Open Application Model (OAM). The pitch is a higher-level abstraction over raw Kubernetes manifests, so application engineers can express what they want without having to learn every CRD. Adoption is concentrated in organizations that want a clean separation between application configuration and platform configuration.

Humanitec

Humanitec is a commercial internal-developer-platform-as-a-service that handles the platform orchestration layer (workload definitions, environment provisioning, dynamic configuration). Teams adopt Humanitec for the same reason they adopt Port: less Backstage build-out, more time on product work. The vendor commitment is real, as it is with any SaaS platform.

Secrets Management and Identity

Secrets management is the boring category that turns critical the moment a leaked credential lands on GitHub. The category has matured.

HashiCorp Vault

Vault is the long-standing reference implementation, covering dynamic secrets, encryption-as-a-service, and identity-based access. HashiCorp moved Vault to a BSL license in 2023, so the Community/Enterprise distinction now matters when teams choose it; OpenBao is the actively maintained open-source fork for teams that need a true OSS license.
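
"Dynamic secrets" means credentials minted on demand with a built-in expiry, rather than long-lived values stored and copied around. The following is a minimal sketch of that idea only; it does not use or imitate Vault's API, and `Lease`, `issue_db_credential`, and `is_valid` are hypothetical names.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    username: str
    password: str
    expires_at: float  # seconds, on whatever clock the caller passes in


def issue_db_credential(role: str, ttl_seconds: int, now: float) -> Lease:
    """Mint a unique, short-lived credential for `role` (illustrative only)."""
    suffix = secrets.token_hex(4)
    return Lease(
        username=f"{role}-{suffix}",
        password=secrets.token_urlsafe(24),
        expires_at=now + ttl_seconds,
    )


def is_valid(lease: Lease, now: float) -> bool:
    """A lease stops being valid the moment its TTL has elapsed."""
    return now < lease.expires_at


lease = issue_db_credential("readonly", ttl_seconds=3600, now=0.0)
print(is_valid(lease, now=1800.0))  # True: halfway through the TTL
print(is_valid(lease, now=3600.0))  # False: the TTL has elapsed
```

Because every caller gets its own credential and every credential expires, a leaked value has a bounded lifetime and can be traced to the consumer that requested it.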

Infisical

Infisical is the newer open-source competitor, with a SaaS option and a model that targets the secrets-in-code problem directly. The category is converging on similar primitives; the differentiator is usually integration ergonomics for the platform team's existing stack.

Doppler

Doppler is the SaaS-first competitor, with an emphasis on environment-variable workflows and developer-facing UX. Teams pick Doppler when they want the ergonomic improvement without standing up Vault.

SOPS, age, sealed-secrets

The lightweight end of the category: file-based encryption of secrets at rest in Git, with decryption at deploy time or inside the cluster. Used when the team's needs are narrow (encrypt the values in the YAML, don't build a whole secrets service) or when the GitOps controller's secret-handling story is already in place.

Observability and SLO Management

Observability for the platform itself is a different job than observability for product workloads. The platform team needs to know whether the platform is meeting its own SLOs; product teams need to know whether their services are.
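
Whether the platform is "meeting its own SLOs" usually comes down to error-budget arithmetic. Here is a small worked example, assuming a request-based SLO; the `error_budget` helper and the numbers (a 99.9% target over one million requests to a platform API) are made up for illustration.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict[str, float]:
    """Compute how much of the error budget a service has used in a window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The budget is the share of requests the SLO allows to fail.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "allowed_failures": allowed_failures,
        "budget_consumed": failed_requests / allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
    }


budget = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(
    f"{budget['allowed_failures']:.0f} allowed, "
    f"{budget['budget_consumed']:.0%} consumed, "
    f"{budget['remaining_failures']:.0f} remaining"
)
# prints: 1000 allowed, 40% consumed, 600 remaining
```

The same calculation applies to product services; what differs for the platform team is the subject of the SLO, for example environment provisioning or deploy-pipeline requests rather than customer traffic.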

Prometheus and Grafana

The open-source default. Prometheus (a CNCF Graduated project) for metrics, Grafana for visualization, and Alertmanager for alerting. Most teams running Kubernetes end up here, sometimes with Thanos or Cortex (the CNCF metrics project, not the IDP vendor covered earlier) for long-term retention.

Datadog

The commercial alternative, with a broader scope across logs, traces, metrics, APM, and security, at a cost that escalates with volume. Teams adopt Datadog when the operational burden of running their own observability stack exceeds the cost of the SaaS plan.

OpenTelemetry

OpenTelemetry isn't a single tool. It is the open standard for emitting traces, metrics, and logs. Adoption is now broad enough that "instrument with OpenTelemetry, send to whichever backend the platform team has standardized on" is the default pattern.

Architecture Decision Tools: What's Missing from Most Platform Stacks

Six categories down, the platform stack is genuinely capable: developers can ship to managed infrastructure via golden paths, with observability and secrets handled, atop a working GitOps loop. What's still usually missing is the layer above. The platform team's architecture decisions (which observability stack, which secrets pattern, which Kubernetes-vs-not boundary, which Backstage-vs-Port choice) live in slide decks, Slack threads, and somebody's head. Six months later, when the team has to explain why the platform looks the way it does, the answer is reconstruction.

Two categories of tools overlap with this gap.

Architecture Documentation and Catalog Tools

Service catalogs, enterprise architecture tools, and architecture documentation systems sit at the edge of the platform engineering stack. They tell you what services exist and how they connect, which is necessary but not sufficient. The platform team's architecture decisions are about which patterns and tools the platform standardizes on, which is upstream of the catalog.

Architecture-Aware Decision Systems

A newer, smaller category that treats architectural decisions as first-class artifacts tied to live system state. Catio operates here. The category isn't a replacement for any of the six above; it sits above them, so that the platform team can capture decisions like "we standardize on Argo CD over Flux because we have already invested in its UI and the ApplicationSets pattern fits our multi-cluster topology," and revisit those decisions when the system or team changes. For platform teams running multi-cloud or multi-region systems, the architecture-decision layer is what keeps the platform stack from quietly drifting away from its own intentions. Tools like Archie, Catio's conversational architecture copilot, let the platform team query the decision corpus the same way they'd query a service catalog, without needing to open the actual ADR files.

This layer also matters because the platform engineering stack is itself a system architecture that the team is responsible for. The IDP, the IaC layer, the GitOps controllers, the observability stack: these aren't just "tools we picked." They're an architecture the team has committed to maintaining, and the architecture decisions behind them deserve the same treatment as any other system-level decision. Most platform teams don't think of it this way until the second or third major rework, at which point the cost of the missing decision history shows up as institutional amnesia.

For the broader category around architecture-level tooling, see our roundup of software architecture tools.

Common Mistakes When Choosing Platform Engineering Tools

Beyond the individual category choices, a small number of failure patterns recur across platform teams.

Picking the IDP Before the Underneath Works

The most common pattern. The team spins up Backstage, integrates it with a service catalog, and discovers that the IaC layer is inconsistent, the GitOps loop is half-wired, and the secrets pattern still has manual steps. The IDP makes those gaps visible without fixing them, and the team ends up with a portal that points at incomplete infrastructure. The fix is to sequence: the foundational layers (IaC, GitOps, observability) before the visible layer (IDP).

Treating Tool Choice as Permanent

Every category in this stack has at least two viable choices. Locking in one and treating it as immutable produces a platform that drifts away from the team's actual needs. The discipline is to revisit category choices when the team's scale, technology mix, or organizational shape changes meaningfully. That said, most teams overcorrect and swap tools too often; the right cadence for a review is annual, not quarterly.

Counting Tools Instead of Outcomes

A platform team's success isn't measured by how many tools it operates. It's measured by developer productivity, deployment frequency, time-to-first-deploy for new services, and the cognitive load product teams report. Tool stacks that grow without those metrics improving are usually overcorrecting; the right move is consolidation, not addition.

How to Choose Platform Engineering Tools

The decision sequence that survives contact with execution looks like this:

Start with the platform's job. What problem is the platform team solving, and what cognitive load is it trying to absorb? The tool choices follow from the answer.
Inventory what the team already has. Most platform teams don't get to start from scratch. The existing CI, IaC, and observability stack are constraints, not blank pages.
Pick the IDP last, not first. The IDP is the user-facing layer, but it's only useful if the underneath is solid. Most failed Backstage rollouts started with the portal and then discovered that the underlying platform was incomplete.
Buy the SaaS when the team's bandwidth is the binding constraint. Build on open source when the team has the engineering capacity and long-term control matters more than short-term cost.
Write down the decisions explicitly. Every category choice is an architecture decision that the team is committing the platform to. Without somewhere to land those decisions, the platform is one staff change away from being indefensible.
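
As one way to "write down the decisions explicitly," here is a minimal sketch of a decision record as a data structure. It is not Catio's format or any ADR standard; `DecisionRecord`, its fields, and the dates are hypothetical, and the 365-day default simply mirrors the annual review cadence suggested earlier.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    category: str
    choice: str
    alternatives: list[str]
    rationale: str
    decided_on: date
    # Conditions that should trigger a fresh look regardless of age.
    revisit_when: list[str] = field(default_factory=list)

    def is_due_for_review(self, today: date, max_age_days: int = 365) -> bool:
        """True once the decision is at least `max_age_days` old."""
        return (today - self.decided_on).days >= max_age_days


gitops = DecisionRecord(
    category="GitOps",
    choice="Argo CD",
    alternatives=["Flux"],
    rationale="ApplicationSets fit our multi-cluster topology.",
    decided_on=date(2025, 6, 1),
    revisit_when=["cluster count changes significantly", "team reorganizes"],
)
print(gitops.is_due_for_review(today=date(2026, 1, 1)))   # False: about seven months old
print(gitops.is_due_for_review(today=date(2026, 6, 19)))  # True: more than a year old
```

Even a structure this small records the alternatives considered and the conditions for revisiting, which are the two things that are hardest to reconstruct from Slack threads later.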

The framing connects to the broader question of how platform engineering relates to DevOps, which we covered separately in Platform Engineering vs. DevOps. The tools sit inside that organizational question; they don't answer it.

Closing the Loop

Platform engineering tools are easy to enumerate and hard to choose well. The choices that hold up are the ones connected to the platform's actual job and the team's actual capacity, with the architecture decisions behind them captured somewhere the next person can find.

If the platform stack has reached the point where the team's architecture decisions are starting to outrun the wiki page they live on, see how Catio treats those decisions as a first-class layer tied to the running system, so the platform stays connected to the design the team committed to rather than drifting away from it.

Frequently Asked Questions

What is the difference between DevOps tools and platform engineering tools?

DevOps tools are the technologies any engineering team uses to ship software: CI/CD, IaC, observability, and deployment automation. Platform engineering tools are a subset of those (plus internal developer portals) that a dedicated platform team uses to build an internal product for the rest of the engineering organization. The category overlap is real; the distinction is who the user is.

Are platform engineering tools the same as internal developer platforms (IDPs)?

No. An internal developer platform is the product the platform team builds; the developer portal (Backstage, Port, Cortex) is one tool in the stack. The full platform engineering tool stack also includes IaC, GitOps, observability, secrets, and the underlying orchestration layer.

Do you need Backstage for platform engineering?

No. Backstage is the most widely adopted IDP framework, but many platform engineering programs run on Port, Cortex, OpsLevel, Humanitec, or a custom internal portal. The choice depends on whether the team has the engineering capacity for a Backstage build-out and whether the data model aligns with how the team thinks about the catalog.

What are the open-source platform engineering tools?

The major open-source pieces of the stack: Backstage (IDP), OpenTofu (IaC; Terraform itself has been source-available under the BSL since 2023), Crossplane (Kubernetes-native control plane), Argo CD and Flux (GitOps), KubeVela (application delivery on OAM), OpenBao (the open-source fork of Vault; Vault Community is likewise BSL-licensed), Infisical, SOPS, Prometheus and Grafana (observability), and OpenTelemetry (instrumentation standard). Most platform stacks combine open-source and commercial tooling depending on the team's capacity.

