Every Azure workload that calls another Azure service has to prove who it is, and the choice of how it proves that is where most credential leaks begin. The managed identity vs service principal decision looks like a small implementation detail, a line in a deployment script, until a secret scanner flags a client secret hardcoded in a pipeline variable or a six-month-old certificate expires at two in the morning and takes a production integration down with it. The two options answer the same question, namely how a non-human caller authenticates to Microsoft Entra ID, but they answer it with opposite assumptions about who holds the credential and who is responsible when it leaks or lapses. One asks you to create, store, protect, and rotate a secret forever. The other hands the entire credential problem to the platform and gives you nothing to leak.

That difference is not cosmetic. It changes your threat model, your operational burden, your audit story, and the blast radius of a mistake. This guide works through the comparison the way an engineer actually faces it: what each object is, where the credential lives, how the lifecycle plays out over the months after deployment, and a decision rule you can apply at design time rather than discovering at incident time. The goal is not to recite feature lists from the portal. The goal is to leave you able to look at any workload and reason your way to the right choice, defaulting to the option that removes the secret unless something concrete forces the other.
The no-secret-by-default rule
Here is the claim this entire article defends, stated plainly so you can carry it out of here and apply it without rereading the rest. Call it the no-secret-by-default rule: a managed identity removes the stored credential entirely, so it is the correct default for any workload that runs inside Azure, and a service principal with a manually held credential is the fallback you reach for only when the workload cannot use a managed identity at all.
The reasoning behind the rule is short. A credential you do not store cannot be stolen, cannot be committed to a repository, cannot expire unnoticed, and cannot show up in a breach report. A managed identity gives you exactly that: a principal in your tenant with no client secret and no certificate for you to handle, because the platform mints and rotates the underlying credential for you and your code never sees it. A service principal, by contrast, is an identity that carries a credential you generate and keep, and from the moment you create it you own a lifecycle: store it somewhere safe, hand it to the workload securely, watch its expiry, rotate it before it lapses, and revoke it if it leaks. The secret is the liability, and the rule is built around eliminating it.
The phrase “by default” carries real weight. It does not say service principals are wrong or that you should never create one. It says the burden of proof sits on the service principal. When you choose a managed identity, you choose it because it is the safe baseline. When you choose a service principal, you should be able to name the specific reason a managed identity will not work for that workload, and almost always that reason is location: the caller lives somewhere Azure cannot vouch for it directly. We will spend a good part of this article naming those reasons precisely, because a rule is only useful if you also know its exceptions.
What does “secretless” actually mean here?
Secretless means your code, your config, and your repository hold no credential for the principal. The platform issues and rotates the credential under the hood, and your workload retrieves short-lived access tokens at runtime through a local endpoint. There is no client secret to store, no certificate to renew, and nothing in source control that an attacker could lift and replay.
That definition matters because the word “secretless” gets used loosely. It does not mean there is no cryptography happening, and it does not mean no token ever crosses a wire. Tokens still flow, and they are still bearer credentials for their short lifetime. What secretless removes is the long-lived, human-held credential, the one you would otherwise paste into a key vault, an environment variable, or a pipeline setting and then have to guard for the life of the workload. That long-lived artifact is the thing that leaks, the thing that expires, and the thing a managed identity simply does not have.
What a service principal is
To compare the two fairly you have to be precise about what each object is, because the names confuse people and the portal does not always make the relationship obvious. Start with the service principal, because it is the older and more general concept, and because understanding it makes the managed identity easier to place.
In Microsoft Entra ID, an application has two related objects. The application registration, sometimes called the app registration, is the global definition of the application: its identifier, its reply URLs, the permissions it requests, the credentials it is allowed to present. Think of it as the blueprint. The service principal is the local instance of that application inside a specific tenant, the concrete identity that actually gets role assignments and actually authenticates. When you grant a role to an application so it can read a storage account, you are granting that role to the service principal, the in-tenant representation. The registration is the template; the service principal is the running identity that the template produces in your directory.
A service principal authenticates by presenting a credential it controls. That credential is one of two kinds. The first is a client secret, which is a string, effectively a password for the application, generated in the portal or by the CLI and shown to you exactly once. The second is a certificate, where the application holds a private key and Entra ID holds the matching public key, and the application proves possession of the private key during the token request. Certificates are the stronger of the two because the private key can stay in a protected store and never travels, but both share the defining trait that makes a service principal a service principal: you, the human, are responsible for creating the credential, delivering it to the workload, and managing it for the rest of its life.
How does a service principal prove who it is?
It presents a credential it holds to the Entra ID token endpoint and receives an access token in return. With a client secret, it sends the application identifier and the secret string. With a certificate, it signs a token request with its private key and Entra ID validates the signature against the registered public key. Either way, the workload must possess and protect that credential.
The implication is the whole story of this comparison. Possession of the credential is what authenticates the caller, so possession of the credential is also what an attacker needs. If a client secret leaks into a log, a screenshot, a committed config file, or a shared environment dump, whoever has it can request tokens as that application until you notice and rotate. Certificates raise the bar because the private key is harder to exfiltrate from a hardware-backed or vault-backed store, but the principle holds: the security of a service principal is the security of the credential you are holding on its behalf.
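To make the client-secret case concrete, here is a minimal sketch that assembles, but does not send, the form body of an OAuth 2.0 client-credentials request to the Entra ID v2.0 token endpoint. The tenant ID, client ID, and secret are placeholders, and `build_secret_token_request` is a hypothetical helper name rather than part of any SDK.

```python
from urllib.parse import urlencode

# Placeholder values for illustration only; none of these are real.
TENANT_ID = "00000000-0000-0000-0000-000000000000"
CLIENT_ID = "11111111-1111-1111-1111-111111111111"


def build_secret_token_request(tenant_id, client_id, client_secret, scope):
    """Return (url, form_body) for a client-credentials token request.

    Nothing is sent here. The point is that the secret string itself
    travels in the request, so whoever holds it can make this call.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body


url, body = build_secret_token_request(
    TENANT_ID, CLIENT_ID, "placeholder-secret", "https://storage.azure.com/.default"
)
print(url)
print(body)
```

Everything an attacker needs to repeat this request is in the body, which is why the storage and handling of that one string dominates the rest of the comparison.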
What a managed identity is
A managed identity is Microsoft Entra ID’s answer to the question the service principal leaves open: what if the platform held the credential instead of you? A managed identity is, underneath, still a service principal in your tenant, an identity that gets role assignments and requests tokens. The difference is who owns the credential and how the workload gets a token. With a managed identity, Azure provisions the underlying credential, rotates it automatically on a schedule you never see, and never exposes it to you or your code. Your workload does not authenticate by presenting a stored secret. It asks a local endpoint for a token, and the platform, having already established that the calling resource is who it claims to be, hands one back.
That local endpoint is the mechanical heart of the design. A virtual machine, an App Service app, a Function, a container, or another supported resource that has a managed identity assigned can reach a metadata endpoint available only from inside that resource. The endpoint is not reachable from the public internet; it answers only to code running on the resource itself. When your code requests a token, the platform uses the credential it manages internally to obtain an access token from Entra ID and returns it to your code. Your code then uses that token to call the target service. At no point does your code hold or even see the long-lived credential, because there is no long-lived credential for it to hold. The trust comes from the resource’s own identity, established by the platform, rather than from a secret you carry.
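As a rough illustration for the virtual machine case, the sketch below builds the request a workload would make to the Azure Instance Metadata Service (IMDS) token endpoint. It only constructs the request and never sends it, so it runs anywhere. `build_imds_token_request` is a hypothetical helper name, and other hosts such as App Service expose a different local endpoint, so treat this as the VM flavor only.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Link-local address of the Instance Metadata Service on Azure VMs.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(resource, client_id=None):
    """Build (but do not send) a managed identity token request for a VM.

    `client_id` is only needed to select a specific user-assigned identity.
    Note what is absent: no secret, no certificate, no stored string.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id is not None:
        params["client_id"] = client_id
    # IMDS requires the `Metadata: true` header on every request.
    return Request(f"{IMDS_TOKEN_URL}?{urlencode(params)}", headers={"Metadata": "true"})


req = build_imds_token_request("https://storage.azure.com/")
print(req.full_url)
print(req.get_header("Metadata"))
```

In application code you would normally let an identity library make this call for you; the raw request is shown only to make the absence of a credential visible.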
This is why a managed identity is the cleaner default for anything running in Azure. The thing that makes a service principal risky, the held credential, is gone. You did not generate it, you do not store it, you cannot leak it, and you never have to rotate it. The reasoning chain that started with the no-secret-by-default rule lands here: remove the secret, and you remove the largest single class of identity incidents in the cloud.
Is a managed identity just a managed service principal?
Yes, and holding that fact straight clears up most of the confusion. A managed identity is a service principal whose credential lifecycle Azure owns instead of you. It still appears in your tenant as an enterprise application, still receives role assignments the same way, and still gets its tokens from the same Entra ID token service. The only thing that changed is who creates and rotates the credential, and whether your code ever touches it.
Because the two share the same underlying shape, everything you know about granting access to a service principal applies directly to a managed identity. You assign it a role at a scope, you grant it API permissions where needed, and it shows up in sign-in logs as a workload identity. The mental model is not “two unrelated things” but “one identity model, two credential strategies,” and the strategy is the entire decision.
How the token actually reaches your code
The two approaches feel different to operate, but at the level of the wire they converge on the same destination: a short-lived access token that the workload presents to the target service. Understanding how each path arrives at that token clarifies why one path carries a stored string and the other does not, and it removes the sense that the platform-issued route is magic you cannot reason about.
Take the stored-string route first. Your code reads a string from configuration, an application setting, a vault reference, or an environment variable, and sends it to the Entra ID token endpoint along with the application identifier. The endpoint checks the string against what it has on record, and if they match it returns an access token scoped to the resource you asked for. Your code attaches that token to its outbound request, and the target service validates the token and serves the call. The token itself is short-lived, so your code repeats the request when it nears expiry, presenting the same stored string each time. The whole flow depends on that string being present, correct, unexpired, and known only to the legitimate caller.
The platform-issued route removes the first step. Your code does not read a string from anywhere. Instead it calls a local endpoint that is reachable only from inside the hosting resource, an address on a non-routable range that the public internet cannot touch. The platform, having already established at provisioning time that the calling resource is who it claims to be, uses the credential it maintains internally to obtain a token from Entra ID, and returns that token to your code. From there the flow is identical: your code attaches the token to its outbound request, the target validates it, and the call proceeds. The token is short-lived in exactly the same way, and your code re-requests it from the local endpoint when it nears expiry. The only structural difference is that the first route began by reading a stored string and the second began by asking a trusted local endpoint, and that single difference is the entire comparison expressed mechanically.
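The "re-request when it nears expiry" behavior is the same on both routes, and a small cache makes that visible. The sketch below is one possible shape, not how any particular SDK implements it: `TokenCache`, `AccessToken`, and the 300-second refresh margin are all assumptions chosen for illustration, and the fetcher is a fake so no network is involved.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class AccessToken:
    value: str
    expires_on: float  # Unix timestamp


class TokenCache:
    """Reuse a token until it nears expiry, then fetch a fresh one.

    `fetch` is whatever obtains a token: a call presenting a stored string,
    or a call to the local endpoint. The cache does not care which.
    """

    def __init__(self, fetch: Callable[[], AccessToken],
                 refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._token = None

    def get(self) -> AccessToken:
        now = self._clock()
        if self._token is None or self._token.expires_on - now <= self._margin:
            self._token = self._fetch()
        return self._token


# Fake fetcher and fake clock so the example is deterministic.
now = [1_000.0]
calls = []


def fake_fetch() -> AccessToken:
    calls.append(now[0])
    return AccessToken(value=f"token-{len(calls)}", expires_on=now[0] + 3600)


cache = TokenCache(fake_fetch, refresh_margin=300, clock=lambda: now[0])
first = cache.get()
now[0] += 600    # well inside the lifetime: the cached token is reused
second = cache.get()
now[0] += 2800   # only 200 seconds of lifetime left: a refresh is triggered
third = cache.get()
print(first.value, second.value, third.value)
```

Swapping the fetcher is the only change needed to move from one route to the other, which is the mechanical version of "one identity model, two credential strategies."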
Why can an attacker on the public internet not call the local token endpoint?
Because the endpoint answers only to code running on the hosting resource itself; it sits on a non-routable address that the public network cannot reach, and the platform attests to the resource at provisioning time rather than checking an inbound caller. An attacker would first have to compromise the host, at which point any credential strategy is equally exposed.
That property is what makes the platform-issued route safe without a stored string. The trust does not come from presenting a secret that anyone could replay; it comes from the request originating on a resource the platform already vouches for, through a channel nothing outside that resource can use. An attacker cannot phone the local endpoint from elsewhere and ask for a token, because there is no path from elsewhere to that endpoint, and there is no string to lift that would let them impersonate the request from another location. The token still has to be protected once issued, since it is a bearer token for its short lifetime, but the long-lived artifact that the stored-string route depends on never enters the picture.
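The non-routable claim can be checked with nothing more than the standard library. This sketch uses the IMDS address found on Azure virtual machines as the example; other hosting platforms use different local addresses, so the specific IP is an assumption about the VM case.

```python
import ipaddress

# 169.254.169.254 is the IMDS address used on Azure virtual machines.
imds = ipaddress.ip_address("169.254.169.254")

print(imds.is_link_local)  # inside the 169.254.0.0/16 link-local block
print(imds.is_global)      # whether the address is publicly routable
print(imds in ipaddress.ip_network("169.254.0.0/16"))
```

Link-local addresses are never forwarded by routers, which is the networking fact behind "no path from elsewhere to that endpoint."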
The InsightCrunch identity decision table
The fastest way to use the no-secret-by-default rule in practice is to start from where the workload runs, because location is the single signal that decides the choice in the large majority of cases. The table below is the reference artifact for this article. Read it as a decision tool: find the row that matches where your caller lives, and the table tells you which identity to reach for and the factor that decides it.
| Workload location | Recommended identity | Deciding factor |
|---|---|---|
| Azure-hosted resource that supports managed identity (VM, App Service, Function, Container Apps, AKS workload, Logic App, and similar) | Managed identity | The platform can vouch for the resource directly, so no credential needs to exist outside Azure. |
| Azure-hosted resource needing the same principal across several resources or a stable principal that survives resource recreation | User-assigned managed identity | Still secretless, but the principal is decoupled from any single resource lifecycle. |
| External CI/CD system (GitHub Actions, an external build server) deploying into Azure | Federated identity credential on an app registration | The external platform issues its own token, which Entra ID trusts directly, so no secret crosses the boundary. |
| External or on-premises workload that can present a managed certificate held in a protected store | Service principal with a certificate credential | The caller lives outside Azure and cannot use a managed identity, and a certificate keeps the private key non-exportable. |
| External or third-party system that supports only a shared secret | Service principal with a client secret, secret stored in a vault and rotated | A managed identity is not reachable from outside Azure and federation is unavailable, so a held secret is the only path. |
| Cross-tenant access where the caller is an application in one tenant acting in another | Service principal (multitenant app), with federation or certificate where possible | The identity must be presentable across the tenant boundary, which a managed identity cannot do. |
The first three rows of that table are managed-identity or federation rows, and only the bottom three fall through to a held credential, with the last of those still preferring federation or a certificate where possible. That distribution is the point. When you actually enumerate the cases, the workloads that genuinely require a stored secret are the minority, and they are recognizable by a single trait: the caller is somewhere Azure cannot directly attest to. Everything inside the Azure boundary should be reaching for a managed identity, and everything at the edge should be reaching for federation before it reaches for a secret. The held secret is the last resort, not the starting point.
How do I read this table for my own workload?
Ask one question first: does this caller run inside Azure on a resource that supports managed identity? If yes, you are done, use a managed identity. If no, ask whether the external platform can issue a token Entra ID will trust through federation. If yes, use a federated credential. Only if both answers are no do you fall through to a held certificate or secret.
That ordering is deliberate, because it walks you down the table from least liability to most. Each step you take downward adds something you now have to protect: first nothing, then a trust relationship you configure once, then a certificate’s private key, and finally a raw secret string. Engineers get into trouble by starting at the bottom of that ladder out of habit, generating a client secret because that is the example they copied, when the workload qualified for a rung much higher up where there was nothing to guard at all.
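The ladder can be written down as a function, which is a handy way to check that the questions are asked in the right order. This is a sketch of the rule as stated here, with hypothetical names (`choose_identity`, `Identity`) and three boolean inputs that you would answer by hand for each workload.

```python
from enum import Enum


class Identity(Enum):
    MANAGED_IDENTITY = "managed identity"
    FEDERATED_CREDENTIAL = "federated identity credential"
    CERTIFICATE = "service principal with certificate"
    CLIENT_SECRET = "service principal with client secret (vaulted, rotated)"


def choose_identity(runs_in_azure_with_mi_support: bool,
                    external_issuer_can_federate: bool,
                    can_hold_certificate: bool) -> Identity:
    """Walk the ladder from least liability to most."""
    if runs_in_azure_with_mi_support:
        return Identity.MANAGED_IDENTITY        # nothing to guard
    if external_issuer_can_federate:
        return Identity.FEDERATED_CREDENTIAL    # a trust relationship, no secret
    if can_hold_certificate:
        return Identity.CERTIFICATE             # a private key to protect
    return Identity.CLIENT_SECRET               # a raw string: last resort


print(choose_identity(True, False, False).value)
print(choose_identity(False, True, True).value)
print(choose_identity(False, False, False).value)
```

Note that the second call returns federation even though a certificate is also possible, because the rule stops at the first rung that works.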
The secret lifecycle, where the real burden lives
The comparison becomes concrete the moment you stop thinking about the day you create the principal and start thinking about the eighteen months after. A service principal with a held credential is not a one-time setup. It is an ongoing obligation, and that obligation is where the operational cost and most of the risk accumulate. Walk the lifecycle of a client secret and the asymmetry with a managed identity becomes obvious.
On day zero you generate the secret. Entra ID shows it to you once, and you copy it somewhere. Already you have made a decision that matters: where does it go? If it lands in a pipeline variable in plaintext, in an environment file, or in a config committed to a repository, you have created a leak waiting to happen. The disciplined answer is a key vault, which means you now have a vault reference to manage, access policies or role assignments on the vault to maintain, and a retrieval path the workload must follow at startup. None of that work existed before you chose a held secret, and all of it exists to protect a string.
Then the secret has an expiry. Whatever lifetime you set, the clock starts. Somewhere between creation and that expiry you have to rotate: generate a new secret, distribute it to every consumer, confirm they picked it up, and remove the old one without an outage. If you forget, the secret lapses and the workload starts failing authentication, usually at an inconvenient moment, with an error that points at a token problem rather than at the calendar. One of the most common production sign-in incidents is not a clever attack. It is a secret or certificate that quietly expired because the rotation reminder was missed or the person who owned it left the team.
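The calendar problem is easy to express in code, which is also a reminder that someone has to run that code on a schedule. The sketch below classifies a credential by how close it is to expiry. The 30-day rotation window, the function name, and the dates are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def rotation_status(expires_on: datetime, now: datetime,
                    rotate_within: timedelta = timedelta(days=30)) -> str:
    """Classify a held credential as 'expired', 'rotate-now', or 'ok'."""
    remaining = expires_on - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= rotate_within:
        return "rotate-now"
    return "ok"


# Fixed, made-up dates so the example is deterministic.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(rotation_status(datetime(2025, 6, 1, tzinfo=timezone.utc), now))
print(rotation_status(datetime(2025, 1, 20, tzinfo=timezone.utc), now))
print(rotation_status(datetime(2024, 12, 31, tzinfo=timezone.utc), now))
```

A check like this has to exist, be scheduled, and be owned by someone for every held credential, which is exactly the standing obligation described above.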
A managed identity has none of this. There is no day-zero copy decision, because nothing is shown to you. There is no vault entry to protect, because there is no string to store. There is no expiry on your calendar, because the platform rotates the underlying credential on its own schedule and your code never depends on knowing when. There is no rotation runbook, no leaked-secret revocation drill, and no after-hours failure traced back to a missed renewal. The entire lifecycle that a held credential imposes simply does not apply, which is the operational half of why the secretless option is the default.
What actually happens when a credential expires?
Authentication starts failing. The workload requests a token using a credential Entra ID no longer accepts, the token endpoint rejects it, and downstream calls return authorization or token-acquisition errors. With a service principal secret or certificate, this is a real and recurring failure mode tied to a date you set. With a managed identity, there is no expiry you own, so this class of outage does not occur.
When it does happen on a service principal, the diagnosis often goes sideways because the error surfaces far from its cause. You see a failed call to storage or a vault, you chase the storage permissions, and only later do you realize the credential behind the call lapsed. This is exactly the kind of failure the troubleshooting guides in this series exist for. If you are staring at a token failure right now, it is worth working through the dedicated walkthrough on how to diagnose and fix managed identity token failures and its sibling on service principal authentication errors. Both trace the symptom back to its real root rather than the permission you were tempted to blame.
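One way to avoid chasing the wrong cause is to look at the error text before touching permissions. The sketch below is a deliberately small heuristic: the two `AADSTS` codes and the two authorization markers are a non-exhaustive sample, and the mapping should be treated as an assumption to verify against the current error reference rather than as a complete classifier.

```python
# Entra ID error codes that point at the credential itself, not at permissions.
CREDENTIAL_ERROR_CODES = {
    "AADSTS7000222": "client secret has expired",
    "AADSTS7000215": "client secret is invalid",
}

# Substrings suggesting the token was fine but the principal lacks a role.
PERMISSION_MARKERS = ("AuthorizationFailed", "AuthorizationPermissionMismatch")


def triage(error_text: str) -> str:
    """Return a first guess at where to look, based only on the error text."""
    for code, meaning in CREDENTIAL_ERROR_CODES.items():
        if code in error_text:
            return f"credential: {meaning}"
    if any(marker in error_text for marker in PERMISSION_MARKERS):
        return "permission: check role assignments at the target scope"
    return "unknown: inspect the full error and the sign-in logs"


print(triage("AADSTS7000222: The provided client secret keys for app are expired."))
print(triage("AuthorizationPermissionMismatch: This request is not authorized."))
print(triage("Connection reset by peer"))
```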
The operational overhead, counted in hours rather than theory
The case for the platform-issued route is usually made in security terms, but the operational case is just as strong and often more persuasive to the people who keep the lights on. A stored string is not free even when nothing goes wrong with it. It consumes attention on a recurring basis, and that attention is the hidden cost that the secretless route deletes outright.
Account for the toil on the stored-string side honestly. Someone has to decide where the string lives and stand up a vault to hold it. Someone has to wire the workload to fetch it at startup and handle the case where the fetch fails. Someone has to set an expiry and put a reminder on a calendar, then actually perform the rotation when the reminder fires: generate the replacement, push it to every consumer, confirm uptake, and retire the old one without an outage. Someone has to respond when a scanner finds a copy of it in a pipeline log or a committed file, which means revoking and reissuing on short notice. And someone has to do all of this again for the next workload, and the one after that, because the work does not amortize across an organization; each stored string is its own small standing obligation. Multiply that by the number of integrations a real-world estate of workloads accumulates, and the rotation calendar alone becomes a meaningful slice of an operations team’s week.
The platform-issued route deletes that entire ledger. There is no vault entry to stand up for the workload’s own authentication, no fetch-at-startup path to harden, no expiry to track, no rotation to perform, no scanner finding to chase, and nothing to revoke and reissue under time pressure. The platform rotates what it holds on its own cadence, invisibly, and the workload keeps obtaining tokens without anyone touching anything. The savings compound with scale, because the secretless route’s per-workload overhead is close to zero where the stored-string route’s overhead is paid again for every workload that exists. When a team migrates a fleet off stored strings, the win they report most often is not a hypothetical breach avoided. It is the rotation runbook they deleted and the after-hours pages that stopped.
Does the secretless route cost anything to run?
The platform-issued route carries no per-token charge for the authentication itself, and a user-assigned identity is a lightweight resource with negligible cost. The real saving is in operational time: no rotation runbook, no expiry tracking, no scanner-finding response, and no per-workload vault entry for the workload’s own login. The overhead that the stored-string route pays repeatedly drops to roughly nothing.
That asymmetry is why the migration usually pays for itself quickly even ignoring the security improvement. The one-time engineering cost of switching a workload to the platform-issued route is bounded and predictable, while the recurring cost of maintaining a stored string is open-ended and grows with every integration added. Teams that frame the decision purely as a security choice sometimes hesitate because the current setup works today. Framing it as an operations choice, deleting a standing chore rather than guarding against a possible incident, tends to move the decision faster, because the toil being removed is visible on this week’s task board rather than hypothetical.
A migration playbook from a stored string to a platform-issued token
Most teams do not start clean. They start with a service principal and a stored client secret that has been working for a year, and the question is how to get off it safely. The migration is mechanical when done in the right order, and the order matters because doing it wrong produces an outage at the moment you remove the old credential. Here is the sequence that keeps both paths alive until the new one is proven.
Begin by enabling a platform-issued principal on the Azure-hosted resource that runs the workload. For a single resource that should own the principal, a system-assigned one is the simplest choice; for a fleet that should share it or for an identity that must outlive the resource, create a user-assigned one and attach it. At this stage you have added an identity but changed nothing about how the workload authenticates, so there is no risk yet. Next, grant the new principal the same roles at the same scopes that the existing service principal holds. This is the step engineers most often get wrong, because they enable the principal and then wonder why calls fail, when the cause is simply that the new principal has no role assignments yet. Mirror the existing grants precisely, and verify them before touching code.
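The "mirror the existing grants" step reduces to a set difference, and checking it explicitly is cheap. In this sketch the grant lists are typed in by hand; in practice you would export them from your own tooling. The `Grant` type, the helper name, and the scope strings are hypothetical, while the two role names are Azure built-in roles used as examples.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    role: str
    scope: str


def missing_grants(old_principal, new_principal):
    """Return grants the old principal holds that the new one still lacks."""
    return sorted(set(old_principal) - set(new_principal))


# Placeholder scopes; real ones would be full Azure resource IDs.
old = [
    Grant("Storage Blob Data Reader", "/placeholder/storage-account"),
    Grant("Key Vault Secrets User", "/placeholder/key-vault"),
]
new = [
    Grant("Storage Blob Data Reader", "/placeholder/storage-account"),
]

for grant in missing_grants(old, new):
    print(f"still missing: {grant.role} at {grant.scope}")
```

An empty result is the signal that it is safe to move on to the code change.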
With access in place, update the application to request tokens from the local endpoint instead of reading and presenting the stored string. Most platform libraries expose a credential type that tries the local endpoint automatically, so the code change is usually small and localized to where the client is constructed. Deploy that change to a non-production environment first and confirm through the sign-in logs that the workload is authenticating as the new principal and that its downstream calls succeed. The logs are your proof: if you see the new principal acquiring tokens and the storage or vault calls returning normally, the new path is live. Roll the change to production and watch the same signals.
Only after the new path is confirmed everywhere do you remove the old credential. Search the configuration, the vault, and the pipeline history for every reference to the stored string, because the failure mode at this stage is a single overlooked code path that still presents it. When you are confident nothing remains that uses it, delete the secret, and once the workload has run cleanly without it, retire the service principal and its registration if nothing else depends on them. Done in this order (enable, grant, verify, switch, confirm, then remove), the two paths overlap for a short window and the cutover carries no outage. The detailed setup mechanics for both flavors of platform-issued principal, with working role assignments, are covered in the companion walkthrough on how to set up managed identities the right way, which pairs naturally with this playbook.
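Because the order is the whole point of the playbook, it can help to encode it so that a step cannot be completed early. The sketch below is a hypothetical checklist guard, not a tool that exists in Azure; the step names simply follow the six-step order of the playbook.

```python
STEPS = ("enable", "grant", "verify", "switch", "confirm", "remove")


class MigrationChecklist:
    """Hypothetical guard that only lets steps complete in the documented order."""

    def __init__(self):
        self.completed = []

    def complete(self, step: str) -> None:
        if len(self.completed) == len(STEPS):
            raise RuntimeError("migration already finished")
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"cannot '{step}' yet; next step is '{expected}'")
        self.completed.append(step)


checklist = MigrationChecklist()
checklist.complete("enable")
checklist.complete("grant")
try:
    checklist.complete("remove")  # skipping verify, switch, and confirm
except RuntimeError as err:
    print(err)
```

The rejected call is the outage this ordering is designed to prevent: deleting the old credential while something may still depend on it.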
What is the most common mistake during this migration?
Removing the old stored string before the new path is proven everywhere. A single overlooked code path that still presents the old credential will fail the moment the credential is deleted, and the error surfaces at a downstream call rather than at the login, so it gets misdiagnosed as a permission problem. Always confirm the new path in the sign-in logs across every environment before you retire anything.
The second most common mistake is forgetting the role assignments. Engineers enable the new principal, switch the code, and hit an authorization failure, then conclude the platform-issued route does not work, when the real cause is that they never granted the new principal the roles the old one held. The fix is to mirror the grants and verify them as a distinct step before the code switch, so that when authentication does succeed there is nothing left to surprise you on the authorization side.
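For the first mistake, a simple scan for leftover references catches the overlooked code path before the deletion does. This sketch only covers configuration files on disk, not the vault or pipeline history, and both the file suffixes and the setting name `AZURE_CLIENT_SECRET` are assumptions about how the old credential was wired in.

```python
import tempfile
from pathlib import Path


def find_references(root, needles, suffixes=(".json", ".yml", ".yaml", ".config")):
    """Return (relative_path, line_number) for each line mentioning a needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if any(needle in line for needle in needles):
                hits.append((str(path.relative_to(root)), lineno))
    return hits


# Demo against a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "appsettings.json").write_text(
        '{\n  "AZURE_CLIENT_SECRET": "placeholder"\n}\n'
    )
    Path(tmp, "notes.txt").write_text("AZURE_CLIENT_SECRET appears, but .txt is skipped\n")
    hits = find_references(tmp, ["AZURE_CLIENT_SECRET"])
    print(hits)
```

Any hit is a reason to hold off on retiring the credential until that path has been switched or removed.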
System-assigned versus user-assigned, the choice inside the choice
Once you have decided on a managed identity, a second decision appears, and it trips people up because both options are secretless and both look identical from the workload’s perspective. Azure offers two flavors of managed identity, and choosing between them is about the principal’s lifecycle and how widely you want to share it, not about credentials at all.
A system-assigned managed identity is created on, and tied to, a single resource. Enable it on a virtual machine and the platform creates an identity bound to that machine. Its lifecycle is the resource’s lifecycle: it comes into existence with the resource and is deleted when the resource is deleted. That tight coupling is its strength and its limit. The strength is simplicity. There is exactly one identity, it belongs to one thing, and when that thing goes away the principal goes with it, leaving no stray identity behind. Role assignments that referenced the deleted principal are not removed automatically, though, so clean them up as part of decommissioning the resource. The limit is that it cannot be shared. Each resource that wants a system-assigned identity gets its own, distinct one, so you cannot grant a role once and have ten machines inherit it.
A user-assigned managed identity is a standalone resource in its own right. You create it deliberately, it has its own lifecycle independent of any workload, and you can assign it to many resources at once. Grant it a role at a scope, and every resource carrying that principal now has that access without a separate grant. This is what you want when a fleet of resources should share one identity, when an identity must survive the recreation of the resource that uses it, or when you want to set up access ahead of provisioning the workload that will consume it. It is still secretless, still rotated by the platform, still nothing for you to store. What you have traded is the automatic cleanup of the system-assigned model for the flexibility of a shared, durable principal that you now own and must remove yourself when it is no longer needed.
The decision rule is short. Reach for system-assigned when the principal belongs to exactly one resource and should die with it. Reach for user-assigned when more than one resource needs the same principal, when the principal must outlive a given resource, or when you provision access in infrastructure code before the consuming resource exists. Both honor the no-secret-by-default rule. This is a lifecycle and sharing decision layered on top of the secretless choice, not a security trade-off between them.
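Written as code, the rule is a single condition. The function and parameter names below are hypothetical, chosen to mirror the three triggers named in the rule.

```python
def choose_managed_identity_type(shared_across_resources: bool,
                                 must_outlive_resource: bool,
                                 granted_before_resource_exists: bool) -> str:
    """Apply the lifecycle-and-sharing rule; both outcomes are secretless."""
    if shared_across_resources or must_outlive_resource or granted_before_resource_exists:
        return "user-assigned"
    return "system-assigned"


# A single VM whose identity should die with it.
print(choose_managed_identity_type(False, False, False))
# A scale set whose instances should share one role assignment.
print(choose_managed_identity_type(True, False, False))
```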
When should I pick user-assigned over system-assigned?
Pick user-assigned when the principal needs to be shared across resources, survive a resource being recreated, or be granted access before the workload exists. A common trigger is a scale set or a group of functions that should all read from the same vault under one role assignment rather than maintaining a separate grant per instance.
There is also a stability argument that infrastructure-as-code teams care about. A system-assigned identity’s object identifier changes when the resource is recreated, which means any role assignment or vault policy that referenced the old identifier must be reapplied. A user-assigned identity keeps a stable identifier across redeployments of the workloads that use it, so your access grants stay valid even as the compute underneath churns. If you manage access declaratively, that stability removes a whole category of drift. For teams setting this up the first time, the companion walkthrough on how to set up managed identities the right way covers both flavors with working assignments and is the natural next step once you have settled the comparison here.
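The drift described above can be reproduced with a toy model. Nothing here talks to Azure: `Directory` is a hypothetical stand-in for role assignments keyed by a principal's object identifier, and fresh UUIDs simulate the new identifier a recreated resource receives.

```python
import uuid


class Directory:
    """Toy stand-in for role assignments keyed by principal object ID."""

    def __init__(self):
        self.assignments = {}  # principal_id -> set of (role, scope)

    def grant(self, principal_id, role, scope):
        self.assignments.setdefault(principal_id, set()).add((role, scope))

    def has_access(self, principal_id, role, scope):
        return (role, scope) in self.assignments.get(principal_id, set())


directory = Directory()
role, scope = "Key Vault Secrets User", "/placeholder/key-vault"

# System-assigned: the identity is recreated along with the resource.
system_id = str(uuid.uuid4())
directory.grant(system_id, role, scope)
system_id_after_recreate = str(uuid.uuid4())  # new resource, new object ID
print(directory.has_access(system_id_after_recreate, role, scope))

# User-assigned: the identity is its own resource and keeps its ID.
user_assigned_id = str(uuid.uuid4())
directory.grant(user_assigned_id, role, scope)
print(directory.has_access(user_assigned_id, role, scope))
```

The first check fails until the grant is reapplied to the new identifier, while the second keeps working across redeployments of the compute.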
Federated identity, the secretless option at the edge
The decision table had a row that deserves its own treatment, because it is the piece most engineers do not know exists and the piece that most often turns a held secret into no secret at all. When the caller lives outside Azure, the reflex is to create a service principal and a client secret and hand that secret to the external system. Federated identity credentials make that reflex unnecessary in a large and growing set of cases.
The idea behind federation is to let an external identity provider that you already trust vouch for the caller, so Entra ID never has to hold a credential for it. Instead of issuing a secret, you configure a federated credential on the app registration that establishes a trust relationship with an external token issuer. The external platform, an external CI/CD system or another cloud’s identity service, mints a short-lived token asserting the principal of the running workload. The external workload presents that token to Entra ID, which validates it against the trust relationship you configured and, if it matches, issues an Azure access token in exchange. No secret was created, none was stored on the external side, and none can leak, because the credential is the external platform’s own short-lived token rather than a long-lived string you minted.
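The matching step in that exchange can be sketched in a few lines. This is a simplified model, not Entra ID's implementation: it assumes the trust relationship is described by an issuer, a subject, and a set of accepted audiences, and it compares those against the `iss`, `sub`, and `aud` claims of the external token. Signature and expiry validation, which a real exchange also performs, are deliberately left out. The class name, function name, and all example values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedCredential:
    """Simplified model of the trust configured on the app registration."""
    issuer: str
    subject: str
    audiences: tuple


def claims_match_trust(claims: dict, trust: FederatedCredential) -> bool:
    """True when the external token's claims match the configured trust.

    Only the claim comparison is modeled; verifying the token's
    signature and lifetime is out of scope for this sketch.
    """
    aud = claims.get("aud")
    token_audiences = [aud] if isinstance(aud, str) else list(aud or [])
    return (
        claims.get("iss") == trust.issuer
        and claims.get("sub") == trust.subject
        and any(a in trust.audiences for a in token_audiences)
    )


# Illustrative values only; a real issuer and subject come from the external platform.
trust = FederatedCredential(
    issuer="https://ci.example.com",
    subject="repo:example-org/example-repo:ref:refs/heads/main",
    audiences=("api://AzureADTokenExchange",),
)
token_claims = {
    "iss": "https://ci.example.com",
    "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    "aud": "api://AzureADTokenExchange",
}
print(claims_match_trust(token_claims, trust))
```

Nothing in `trust` is secret. Anyone can read it without gaining the ability to authenticate, which is the property the paragraph above describes.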
This is the same secretless principle as a managed identity, extended past the Azure boundary. A managed identity works because the Azure platform can attest to a resource it runs. Federation works because an external platform you trust can attest to a workload it runs, and Entra ID accepts that attestation. The practical effect is that the classic pattern of storing deployment credentials in a CI system, the single most common place a long-lived Azure secret ends up, can be replaced with a trust relationship that holds nothing to steal. When guidance says that a service principal or federation makes sense for external workloads, federation is the half you should try first, because it keeps the secretless property while still serving a caller Azure cannot host.
Does federation replace the service principal entirely?
Not entirely. Federation still uses an app registration and its service principal as the principal that receives roles; what it replaces is the held credential on that principal. You keep the principal, you grant it access the usual way, and you swap the client secret for a trust relationship with an external issuer, so the principal exists without a secret to guard.
The cases federation does not cover are the ones where there is no external issuer Entra ID can be configured to trust, or where the consuming system simply has no concept of presenting a federated token and only accepts a static credential. Those are real, and they are exactly the rows at the bottom of the decision table where you fall through to a certificate or a client secret. The skill is recognizing that the falling-through is the exception, not the rule, and reaching for federation whenever the external platform supports it before you accept the burden of a stored secret.
The security posture, compared honestly
It is tempting to flatten this into a slogan, “managed identity good, service principal bad,” but a useful comparison has to be more honest than that. Both approaches can be secure, and both can be insecure, depending on how they are operated. What changes between them is the shape of the attack surface and the number of ways an operator can get it wrong. The secretless option wins not because the other is broken but because it removes the most error-prone parts of the job.
Consider the attack surface of a service principal with a client secret. The secret can be read from wherever it is stored, intercepted in transit if delivery is sloppy, committed to source control, captured in a log or a crash dump, or simply over-shared among engineers who all need it to test. Each of those is a path to the credential, and the credential is the whole game. A certificate narrows several of those paths, because a private key in a protected store does not travel and is far harder to lift than a string, but it does not close the human-process paths: the certificate still expires, still has to be renewed, and still has to be delivered to the workload somehow. The service principal’s posture is therefore only as strong as the weakest link in a chain of human-operated steps that must hold for the entire life of the credential.
A managed identity removes that chain. There is no secret to read, intercept, commit, log, or over-share, because there is no secret. The remaining attack surface is the token the workload obtains at runtime, which is short-lived and scoped, and the local endpoint that issues it, which is reachable only from the resource itself. An attacker who fully compromises the host can certainly request tokens as that principal, but that is true of any approach once the host is owned: a held secret on the same host would be equally exposed, and it would also be exposed through all the additional paths that require no host compromise at all. The net is a smaller, simpler attack surface with fewer operator-controlled failure points, which is precisely what you want from a security control.
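To make "the local endpoint" concrete, here is a minimal sketch of the request a workload on an Azure virtual machine would send to the instance metadata endpoint. It only builds the request and does not send it, so it runs anywhere. The helper name is hypothetical, and the sketch assumes the VM flavor of the endpoint; App Service and Functions expose a different local endpoint through environment variables, and in practice an Azure Identity client library usually makes this call for you.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Link-local address, reachable only from inside the VM itself.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(resource: str,
                             client_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a managed identity token request.

    resource:  the audience the token is for, e.g. a vault or storage URI.
    client_id: set only to select a specific user-assigned identity.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id:
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # The Metadata header is required; note that no credential is attached.
    return urllib.request.Request(url, headers={"Metadata": "true"})


request = build_imds_token_request("https://vault.azure.net")
print(request.full_url)
print(request.header_items())
```

The request carries no secret of any kind. What authorizes it is where it originates, which is why there is nothing for the workload to store.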
Which option is more secure, and why?
A managed identity is the stronger default because it eliminates the long-lived stored credential, which is the asset attackers most often capture and operators most often mishandle. A service principal can be operated securely, especially with a certificate in a protected store, but its security depends on a chain of storage, delivery, and rotation steps holding for the credential’s whole life.
The honest qualifier is that a managed identity is not a license to ignore the rest of the posture. It removes the credential problem, not the authorization problem. An over-privileged managed identity granted a broad role at a wide scope is a serious exposure regardless of having no secret, because anything that can obtain its token can use all of that access. The secretless property and the least-privilege property are separate disciplines, and the next section is about not letting the first lull you into neglecting the second. The broader framing of why identity is the control plane to harden first is covered in the series treatment of Zero Trust architecture in Azure, which places this decision inside the larger principle of verifying every identity explicitly.
Least privilege, applied to a secretless principal
Removing the secret solves the credential half of identity security and leaves the authorization half untouched, so a managed identity deserves the same least-privilege discipline you would apply to any principal. The temptation, having eliminated the thing you used to worry about, is to grant the principal a broad role at a broad scope and move on. That is a mistake, because the token a managed identity obtains carries exactly the access you assigned, and anything running on the host can obtain that token. A secretless principal with owner rights on a subscription is a larger exposure than a service principal that holds a secret but only a tightly scoped role, because the secretless part does nothing to limit what the access can do once acquired.
The concrete discipline is to scope the role assignment to the narrowest resource the workload needs and to choose the least-privileged role that still works. If a function only reads blobs from one container, it should hold a data-plane read role scoped to that container, not a contributor role on the resource group. If an app only retrieves two secrets from one vault, it should have a secrets-user role on that vault, not a management-plane role that lets it change the vault’s configuration. The principle is the same one the series applies everywhere: derive the grant from what the workload does, not from what would be convenient to avoid a second permission error later. The fact that the identity has no secret does not widen the appropriate scope by a single resource.
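A review of that discipline can be partly mechanized, because Azure resource identifiers are hierarchical paths and a role assignment at a parent scope applies to everything beneath it. The sketch below is an assumption-laden illustration, not an Azure tool: the function names, the short list of "broad" roles, and the placeholder subscription and resource names are all hypothetical, and the example assumes the workload needs exactly one blob container.

```python
# Management-plane roles that are rarely right for a workload identity.
BROAD_ROLES = {"Owner", "Contributor", "User Access Administrator"}


def scope_covers(assignment_scope: str, resource_id: str) -> bool:
    """True when a grant at assignment_scope applies to resource_id."""
    a = assignment_scope.rstrip("/").lower()
    r = resource_id.rstrip("/").lower()
    return r == a or r.startswith(a + "/")


def review_assignment(role: str, assignment_scope: str, needed_resource_id: str) -> list:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    if role in BROAD_ROLES:
        findings.append(f"role '{role}' is a broad management-plane role")
    if not scope_covers(assignment_scope, needed_resource_id):
        findings.append("scope does not cover the resource the workload needs")
    elif assignment_scope.rstrip("/").lower() != needed_resource_id.rstrip("/").lower():
        findings.append("scope is wider than the resource the workload needs")
    return findings


# Placeholder identifiers for illustration.
rg = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-app"
container = (rg + "/providers/Microsoft.Storage/storageAccounts/stapp"
             "/blobServices/default/containers/reports")

print(review_assignment("Storage Blob Data Reader", container, container))
print(review_assignment("Contributor", rg, container))
```

The second call produces two findings, one for the role and one for the scope, which mirrors the two separate choices the paragraph asks you to make.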
There is a useful mental check here. Imagine the host is compromised. With a secret-bearing service principal, the attacker gets the secret and everything it can reach. With a managed identity, the attacker gets tokens for everything the identity can reach. In both cases the damage is bounded by the role assignments, not by the credential type. The credential type decides how likely the compromise is to happen through the credential itself; the role assignments decide how bad it is once any compromise occurs. Treating these as two separate dials, and turning both toward safety, is what a defensible posture looks like. Removing the secret is necessary, but scoping the access is what bounds the blast radius.
Does a managed identity reduce how carefully I must scope roles?
No. The secretless property lowers the chance of credential theft, but it does nothing to limit what the identity can do once a token is obtained. You still scope the role assignment to the smallest resource and choose the least-privileged role that works, exactly as you would for any service principal, because the token carries the full access you granted.
In practice the cleanest pattern is one principal per workload with a purpose-named user-assigned identity or a per-resource system-assigned identity, each granted only the specific data-plane roles its job requires. Shared, broadly scoped identities are convenient and dangerous in the same way shared secrets are: they spread access wider than any single workload needs and make it hard to reason about who can do what. The secretless model removes the credential sprawl problem but reintroduces an access sprawl problem if you let one identity accumulate roles for many unrelated jobs.
Real-world scenarios and the deciding factor in each
The rule and the table are abstractions. Here are the recurring shapes these decisions take in practice, each described as a pattern with the factor that settles it, so you can match your situation to one and act.
The first and most common is an Azure-hosted application reading configuration from a vault or writing to storage. A web app on App Service needs two secrets and a storage container. The instinct of a team that learned on tutorials is to register an app, generate a client secret, store it as an app setting, and read it at startup. The deciding factor is location: the app runs inside Azure on a resource that supports managed identity, so there is no reason for any secret to exist. Enable a managed identity on the app, grant it the data-plane roles on the vault and the storage account, and the app setting that held the secret disappears along with the whole class of leak it represented. This is the textbook case for the no-secret-by-default rule, and it covers a surprising share of real workloads.
The second is an external build system deploying into Azure. A continuous integration job runs on a platform outside Azure and needs to create or update resources. The reflex is a service principal with a client secret stored as a pipeline secret, which is also the single most common place long-lived Azure credentials leak. The deciding factor is that the external platform can issue its own short-lived token, so federation applies. Configure a federated credential on the app registration that trusts the build platform’s issuer, and the pipeline authenticates with its own token instead of a stored secret. The principal still exists and still holds its roles; what vanishes is the secret in the pipeline settings.
The third is a cross-tenant integration, where an application in one tenant must act in another. Here a managed identity does not fit, because a managed identity belongs to the tenant and the resource that hosts it and cannot present itself across the tenant boundary as a multitenant application. The deciding factor is the boundary itself. A multitenant app registration with a service principal in each tenant is the right shape, and you prefer a certificate or federation over a client secret where the consuming side supports it. This is a legitimate service-principal case, recognizable because the requirement is precisely the thing a managed identity cannot do.
The fourth is rotation pain pushing a migration. A team has lived with a service principal secret for a year, has been burned by an expiry once, and wants out. The deciding factor is that the workload turns out to run inside Azure all along, on a resource that supports managed identity, and the secret was only ever there out of habit. The migration is to enable a managed identity, grant it the same roles the principal held, switch the code to request a token from the local endpoint, and then delete the secret and the registration. The payoff is the elimination of the entire rotation runbook, which is usually the reason the migration was requested in the first place.
The fifth is the single-resource versus shared-identity decision covered earlier, seen in the wild. One function reading one vault wants a system-assigned identity that dies with it. A scale set of identical workers that should all read the same vault under one grant wants a user-assigned identity shared across the fleet. The deciding factor is whether the identity belongs to one resource or to many, and the answer changes nothing about the secretless property, only about the lifecycle and the sharing.
The sixth is a security review flagging stored secrets. An audit or a secret scanner finds client secrets in pipeline variables, app settings, or committed config. The deciding factor for each finding is the same triage you would run from the table: does the workload run in Azure, in which case it should be a managed identity, or does it run at the edge with a federatable issuer, in which case it should be federated, with a held credential surviving only where neither applies and even then moved into a vault with rotation. The review is not asking you to rotate the secrets faster. It is asking you to remove the ones that should never have existed.
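The triage that runs through these scenarios can be sketched as a single function. This is one reading of the rule as the scenarios describe it, with hypothetical field names; it is not an official decision procedure, and real cases may need facts the sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical facts about a caller, gathered at design or audit time."""
    runs_in_azure: bool
    host_supports_managed_identity: bool = True
    crosses_tenant_boundary: bool = False
    external_issuer_available: bool = False
    accepts_certificate: bool = True


def choose_credential(w: Workload) -> str:
    # Secretless first: the platform can attest to the workload.
    if (w.runs_in_azure and w.host_supports_managed_identity
            and not w.crosses_tenant_boundary):
        return "managed identity"
    # Still secretless: a trusted external issuer vouches for the caller.
    if w.external_issuer_available:
        return "service principal with federated credential"
    # Held credential, strongest form first.
    if w.accepts_certificate:
        return "service principal with certificate in a protected store"
    return "service principal with client secret in a vault, with rotation"


print(choose_credential(Workload(runs_in_azure=True)))
print(choose_credential(Workload(runs_in_azure=False, external_issuer_available=True)))
print(choose_credential(Workload(runs_in_azure=True, crosses_tenant_boundary=True)))
```

The order of the checks is the point: a held credential is reached only after both secretless options have been ruled out.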
Why do so many teams end up with a stored secret they did not need?
Usually because the example they followed used a client secret, and the workload happened to run inside Azure where a managed identity would have worked. The secret was copied from a tutorial, not chosen from the workload’s actual constraints. The decision table reverses that habit by starting from where the workload runs rather than from the first authentication snippet found online.
The second contributor is permission errors during early development. A managed identity that has not yet been granted the right role fails with an authorization error, and a tired engineer reaches for a secret-bearing service principal because the failure feels like the managed identity not working rather than a missing role assignment. The fix is to recognize that the authorization failure would happen to a service principal too; it is a missing grant, not a flaw in the secretless approach. Granting the role is the correct response, not abandoning the better credential model.
Reading the errors each approach produces
Whichever path you choose, things fail, and the two paths fail in recognizably different ways. Knowing the signature of each failure shortens the diagnosis, because the symptom often surfaces far from the cause and the instinct is to chase the wrong layer.
The stored-string path fails most often at the boundary of time. A secret or a certificate reaches its expiry, the token endpoint stops accepting it, and the workload’s calls start returning authorization or token-acquisition errors. The diagnosis goes sideways because the error appears on the downstream call, a storage operation or a vault read, so the first reflex is to inspect the permissions on storage or the vault. The permissions are fine; the credential behind the call lapsed. The tell is timing: the calls worked yesterday and fail today with nothing in the access configuration having changed, which points at the calendar rather than at the role assignment. When you see a clean authorization setup suddenly producing token failures, check the expiry of the credential before you touch anything else.
The platform-issued path cannot fail from expiry, because there is nothing you own that expires, so its failures cluster around two other causes. The first is a missing or wrong role assignment: the identity authenticates fine, obtains a token, and then the downstream call is refused because the identity holds no role at that scope. The token acquisition succeeded and the authorization failed, which is a different shape from the stored-string expiry failure and points you straight at the role grant. The second is the identity not being attached, or the wrong user-assigned identity being attached, so the local endpoint either returns nothing or returns a token for the wrong principal. The tell here is that the workload behaves as if it has no identity at all, which sends you to the resource configuration to confirm the attachment rather than to the role assignments.
Two dedicated walkthroughs in this series trace these failures from symptom to root rather than to the layer that first looks guilty. If you are chasing a platform-issued token problem, the guide on how to diagnose and fix managed identity token failures separates the attachment cause from the authorization cause and gives you the confirming check for each. If you are chasing a stored-string failure, the companion on service principal authentication errors walks the expiry, the wrong-credential, and the wrong-tenant cases. Reaching for the right one of those depends first on recognizing which path you are even on, which the failure signature tells you.
How do I tell a credential-expiry failure from a role failure?
Look at where the failure occurs in the token flow. If the workload cannot acquire a token at all and the credential has an expiry date that has passed, it is a credential-expiry failure on the stored-string path. If the workload acquires a token successfully but the downstream call is refused, it is an authorization failure, meaning a missing or wrong role assignment, which can happen on either path. Token acquisition versus token use is the dividing line.
That distinction saves real time because the two failures live in different places. An acquisition failure sends you to the credential, its expiry, its tenant, and whether it is even present. An authorization failure sends you to the role assignments and their scopes. Mixing them up is the classic time sink: inspecting storage permissions when the credential expired, or rotating a credential when the role was simply never granted. Naming which side of the token flow broke is the first move in either diagnosis.
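The dividing line between acquisition and use can be written as a small triage helper. This is a sketch of the reasoning only; the function name, parameters, and return labels are hypothetical, and it assumes you have already observed whether a token was obtained and whether the downstream call was refused. Passing `None` for the expiry stands for the platform-issued path, where you own nothing that expires.

```python
from datetime import datetime, timezone
from typing import Optional


def classify_failure(token_acquired: bool,
                     downstream_refused: bool,
                     credential_expires_on: Optional[datetime],
                     now: datetime) -> str:
    """Name the layer to inspect first, based on where the token flow broke."""
    if not token_acquired:
        if credential_expires_on is None:
            # Managed identity: nothing expires, so suspect the attachment.
            return "check identity attachment"
        if credential_expires_on <= now:
            return "check credential expiry"
        # Held credential that has not expired: wrong value, tenant, or missing.
        return "check credential value and tenant"
    if downstream_refused:
        # Token was issued, so authentication worked; the grant is missing.
        return "check role assignments and scopes"
    return "no failure"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
expired = datetime(2024, 5, 31, tzinfo=timezone.utc)
print(classify_failure(False, False, expired, now))
print(classify_failure(True, True, None, now))
```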
Certificates versus client secrets when a held credential is unavoidable
Sometimes the decision genuinely falls through to a held credential, because the caller lives outside Azure and no federated issuer is available. When that happens the choice is not over, because a held credential comes in two strengths and the weaker one is the one people reach for by reflex. Choosing the stronger one is the difference between a manageable exposure and a string that authenticates anyone who finds it.
A client secret is a plain string. The token endpoint accepts it from anyone who presents it, which means its entire security rests on the string never being seen by the wrong party. That is a hard property to guarantee over a credential’s lifetime, because a string can be copied from a vault, captured in a log, pasted into a chat while debugging, committed to a repository, or shared among engineers who all need it to test. Every one of those is a path to replay, and none of them requires compromising a host. The string is also easy to misplace operationally: it gets duplicated across consumers, and then rotation becomes a coordination exercise across all of them at once.
A certificate raises the bar by changing what the caller has to possess. Instead of a string anyone can replay, the caller holds a private key and proves possession of it by signing the token request, while the directory holds only the matching public key. The private key can live in a protected store, a key vault or a hardware-backed module, from which it cannot be exported, so there is no string to copy and nothing that travels during authentication. An attacker cannot replay a certificate by reading a log, because what authenticated was a signature produced by a key they do not have. The certificate still expires and still has to be renewed, so the lifecycle burden does not vanish, but the replay surface shrinks to the question of whether the private key can be extracted from its store, which a properly protected store makes hard.
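The difference shows up in what actually travels to the token endpoint. The sketch below builds the form fields of an OAuth 2.0 client-credentials request in both styles, without sending anything. The function names and placeholder values are hypothetical, and the signed assertion is taken as an input because producing it requires the private key, which in this model never leaves its protected store.

```python
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def secret_request_form(client_id: str, scope: str, client_secret: str) -> dict:
    """Client-secret style: the long-lived string itself is sent."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_secret": client_secret,
    }


def certificate_request_form(client_id: str, scope: str, signed_assertion: str) -> dict:
    """Certificate style: only a short-lived signed assertion is sent.

    signed_assertion is a JWT signed with the private key by whatever
    holds that key; the key itself is not part of the request.
    """
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_assertion_type": JWT_BEARER,
        "client_assertion": signed_assertion,
    }


# Placeholder values for illustration only.
with_secret = secret_request_form("app-id", "https://vault.azure.net/.default", "placeholder-secret")
with_cert = certificate_request_form("app-id", "https://vault.azure.net/.default", "header.payload.signature")
print(sorted(with_secret))
print(sorted(with_cert))
```

A log line that captures the first form has captured a reusable credential. One that captures the second has captured an assertion that expires shortly, not the key that produced it.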
The practical rule for any held credential you keep is therefore to prefer a certificate wherever the consuming side supports it, store the private key in a protected, non-exportable store, and treat a client secret as the option of last resort used only where nothing else is accepted, and even then kept in a vault with a rotation policy rather than left in plaintext. Reaching for a client secret because it is the easier thing to paste into a configuration file is exactly how the long-lived-string problem spreads, and the small extra effort of a certificate buys a materially smaller attack surface for the credentials you cannot avoid holding.
Should I always pick a certificate over a client secret?
Prefer a certificate whenever the consuming system supports it, because the private key can stay in a non-exportable store and never travels, which removes the replay paths a plain string exposes. Use a client secret only when the other side accepts nothing else, and then store it in a vault with an expiry and a rotation policy rather than placing it in plaintext config.
The one caveat is that certificates carry their own renewal discipline. A certificate that nobody renews fails exactly like an expired secret, so the stronger credential does not remove the calendar obligation, it only shrinks the leak surface. If you adopt certificates, adopt the renewal tracking that goes with them, because an unmanaged certificate is merely a different way to arrive at the same expiry outage that catches unmanaged secrets.
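Renewal tracking does not need to be elaborate to be useful. As a minimal sketch, assuming you can export an inventory of held credentials as name and expiry pairs (the inventory format, names, and 30-day window here are illustrative choices, not a recommendation from any Azure tool):

```python
from datetime import datetime, timedelta, timezone


def credentials_needing_renewal(credentials, now, warn_within=timedelta(days=30)):
    """Return (name, expires_on) pairs that are expired or expire soon.

    credentials: iterable of (name, expires_on) with timezone-aware datetimes.
    Results are sorted so the most urgent entry comes first.
    """
    due = [(name, expires_on) for name, expires_on in credentials
           if expires_on - now <= warn_within]
    return sorted(due, key=lambda item: item[1])


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
inventory = [
    ("partner-api-cert", datetime(2024, 6, 20, tzinfo=timezone.utc)),
    ("legacy-export-secret", datetime(2024, 5, 15, tzinfo=timezone.utc)),
    ("reporting-cert", datetime(2025, 1, 10, tzinfo=timezone.utc)),
]
for name, expires_on in credentials_needing_renewal(inventory, now):
    print(name, expires_on.date())
```

Run on a schedule, a check like this turns the calendar obligation into an alert that arrives before the outage rather than a surprise that arrives with it.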
The complications worth engaging
Two patterns keep teams reaching for a service principal when a managed identity would serve them better, and a fair comparison has to meet both head on rather than waving them away.
The first is the long-lived secret used for an Azure-hosted workload that could have used a managed identity. This is not a deliberate choice so much as a default that survived because nobody questioned it. The workload runs in Azure, the platform could attest to it, and yet it authenticates with a stored secret that someone has to protect and rotate. The counter to this is simply to apply the rule: if the caller runs inside Azure on a supporting resource, the secret is unnecessary, and its presence is a liability you are carrying for no benefit. The migration is mechanical and the payoff is the removal of an entire maintenance and risk surface. The only real obstacle is inertia, the sense that the secret works today so why change it, which holds right up until the day it expires or leaks.
The second is the belief that a service principal is inherently more flexible, and therefore the safer general-purpose choice. There is a grain of truth that gets overextended. A service principal is more portable, in the narrow sense that its credential can be carried anywhere, which is exactly why it is the right tool for callers outside Azure. But portability is not the same as flexibility for an Azure-hosted workload, and for that workload the service principal’s portability buys you nothing while costing you the whole secret lifecycle. The flexibility you actually want inside Azure, the ability to grant access, scope it tightly, share an identity across resources, and have it rotate itself, is fully present in managed identities, with user-assigned identities covering the sharing and durability cases. So the honest statement is that a service principal is more flexible only in the dimension that matters when you have left Azure, and inside Azure that flexibility is a cost rather than a benefit.
A related overreach is treating the client secret as the normal credential and the certificate as an advanced option. For any service principal you do end up keeping, the certificate is the stronger credential because the private key can live in a protected, non-exportable store, and you should prefer it wherever the consuming side supports it. Reaching for a client secret because it is the easier thing to paste into a config is how the long-lived-string problem propagates. If the choice has genuinely fallen through to a held credential, make it the strongest held credential available, and move it into a vault with a rotation policy rather than leaving it in plaintext anywhere.
Is a service principal ever the right default?
For a workload that runs outside Azure and cannot use federation, yes, a service principal is the correct choice, not a fallback. The “default” in the no-secret-by-default rule is scoped to Azure-hosted workloads. Once the caller is outside the Azure boundary and no federated issuer is available, a service principal with the strongest credential it can hold is exactly the right tool.
What is never right is choosing a service principal for an Azure-hosted workload out of habit, copying a client-secret example for a caller that could have been secretless, or assuming that because service principals are general they are therefore the safe universal answer. The generality is real, but generality is not the same as appropriateness. The appropriate choice is the one the workload’s location forces, and for most workloads that location is inside Azure where the secretless principal is both available and preferable.
Verifying and auditing the posture
A decision you cannot verify is a decision you cannot trust, so the final discipline is making the chosen posture observable and repeatable. For a managed identity, verification has two parts: confirm the identity is assigned and confirm it holds only the roles it needs. The first is a check that the resource actually has the managed identity enabled and, for user-assigned, that the right identity is attached. The second is a review of the role assignments scoped to that principal, looking for anything broader than the workload requires. Because the identity appears as a workload identity in sign-in logs, you can also confirm in the logs that it is authenticating as expected and from the resources you expect, which is your evidence that the secretless path is live and the old secret path is dead.
For a service principal you keep, verification adds the credential dimension. You confirm where the secret or certificate lives, that it is in a vault rather than in plaintext, that an expiry and a rotation policy exist, and that the credential is not duplicated across consumers in ways that make rotation a coordination problem. You also confirm that no copy lingers in source control or pipeline history, which is where a secret scanner earns its place. The point of the audit is not a one-time pass but a repeatable assertion: every workload identity is the type its location calls for, every held credential is justified and protected, and every role assignment is scoped to need.
Making this repeatable means expressing it as code. Define the managed identity, the role assignment, and the scope in your infrastructure templates so the posture is reproducible and reviewable in a pull request rather than clicked together in the portal and forgotten. A user-assigned identity with a stable identifier and a declared role assignment is far easier to audit than a system-assigned identity whose identifier changes whenever its resource is recreated, which is one more reason infrastructure-as-code teams lean toward user-assigned for anything that matters. When the posture lives in code, the audit becomes reading the code, and drift becomes a diff.
How do I prove a workload is fully secretless?
Check three things: the resource has a managed identity enabled, no client secret or connection string for that workload exists in config, vault, or pipeline settings, and the sign-in logs show the workload identity acquiring tokens. If all three hold, the workload authenticates through the platform with nothing stored, which is the definition of secretless.
The negative check matters as much as the positive one. It is common to enable a managed identity, switch most calls to it, and leave a stray secret behind that one code path still uses, so the workload is half-migrated and the leak surface is still present. Searching the config, the vault, and the pipeline for any remaining credential tied to that workload, and confirming the logs show only managed-identity sign-ins, is how you prove the migration is complete rather than merely begun. A half-removed secret is still a secret.
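The negative check can be started with a simple pass over a workload's settings. The sketch below is a heuristic only, with hypothetical patterns and setting names, and it is no substitute for a real secret scanner: it flags setting names that look like held credentials and values that look like key-bearing connection strings.

```python
import re

# Setting names that usually indicate a held credential.
SUSPECT_KEY = re.compile(
    r"(client_?secret|password|connection_?string|account_?key)", re.IGNORECASE)
# Value fragments typical of key-bearing connection strings.
SUSPECT_VALUE = re.compile(r"(AccountKey=|SharedAccessKey=)", re.IGNORECASE)


def find_leftover_credentials(settings: dict) -> list:
    """Return the names of settings that look like stored credentials."""
    hits = []
    for key, value in settings.items():
        if SUSPECT_KEY.search(key) or SUSPECT_VALUE.search(str(value)):
            hits.append(key)
    return sorted(hits)


# Placeholder settings for a half-migrated workload.
app_settings = {
    "AZURE_CLIENT_ID": "00000000-0000-0000-0000-000000000000",
    "AZURE_CLIENT_SECRET": "placeholder",
    "STORAGE_URL": "https://example.blob.core.windows.net",
    "QUEUE": ("Endpoint=sb://example.servicebus.windows.net/;"
              "SharedAccessKeyName=send;SharedAccessKey=placeholder"),
}
print(find_leftover_credentials(app_settings))
```

An empty result is necessary but not sufficient. The sign-in logs still have to show that only the managed identity is authenticating.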
How this choice scales across an organization
A single workload’s decision is easy once you have the rule. The harder problem is keeping the rule applied across hundreds of workloads owned by dozens of teams, because the default that wins at scale is whatever is easiest to copy, and if the easiest thing to copy is a stored string, that is what proliferates. Turning the no-secret-by-default rule into an organizational habit is mostly about making the secretless path the path of least resistance.
The first lever is templates. If the infrastructure modules and starter projects your teams clone already wire a platform-issued principal and a scoped role assignment, then the default a team inherits is the safe one, and they have to go out of their way to introduce a stored string. If instead the example everyone copies registers an application and generates a secret, the leak surface grows by one with every new project, no matter how clearly the policy is written elsewhere. Documentation does not win against a copied example; only a better example does. So the highest-leverage move is to make the reference implementation secretless and let imitation do the rest.
The second lever is detection. A scanner that flags stored strings in repositories, pipeline settings, and configuration turns the policy into something observable rather than aspirational. The point of the scan is not to punish but to surface the workloads that took the stored-string path so they can be triaged against the decision table: the Azure-hosted ones move to a platform-issued principal, the external ones with a federatable issuer move to federation, and the genuine held-credential cases are documented as such and moved into a vault with rotation. Over time the scan’s job shifts from finding accidents to confirming that the only stored strings left are the ones that have a recorded reason to exist.
The third lever is the audit posture described earlier, expressed at fleet scale. When access is declared in infrastructure code, an organization can review who holds what across the estate by reading the code rather than clicking through the portal one resource at a time. A stable, user-assigned identity with a declared, scoped role assignment is far easier to reason about across hundreds of workloads than a sprawl of per-resource identities whose identifiers change whenever a resource is recreated and whose grants were clicked together by hand. The governance story and the engineering story converge here: the choices that are easiest to operate at scale are the same ones that are easiest to verify, and both point toward the secretless default with scoped, declared access. This is the same principle the series develops in its treatment of Zero Trust architecture in Azure, where verifying every workload explicitly and granting it only what it needs is the organizing idea rather than a single control.
How do I roll this out without rewriting everything at once?
Start with the templates so new workloads are secretless by default, then run a scanner to inventory the stored strings that already exist, and triage them against the decision table in priority order rather than all at once. Migrate the highest-risk and easiest-to-move workloads first, document the genuine held-credential exceptions, and let the safe default propagate through copied examples over time.
The rollout that fails is the one that tries to convert every workload in a single sweep and stalls on the hard external cases. The rollout that works treats the estate as a backlog: fix the default so the problem stops growing, then burn down the existing inventory in order of risk and ease. Most Azure-hosted workloads move quickly because the migration is mechanical, which clears the bulk of the leak surface early, leaving the genuine external and cross-tenant cases to be handled deliberately as the documented exceptions they are.
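The backlog ordering can be pictured as a small sort over the scanner inventory. The sketch below is illustrative only: the `Finding` fields, the 1-to-5 `risk` and `effort` scores, and the workload names are hypothetical and do not come from any real scanner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    workload: str
    hosted_in_azure: bool  # can the platform vouch for it directly?
    risk: int              # hypothetical score, 5 = most exposed
    effort: int            # hypothetical score, 1 = easiest to migrate


def burn_down_order(findings):
    """Highest risk first; among equal risk, easiest migration first."""
    return sorted(findings, key=lambda f: (-f.risk, f.effort))


inventory = [
    Finding("internal-dashboard", hosted_in_azure=True, risk=2, effort=1),
    Finding("onprem-batch", hosted_in_azure=False, risk=5, effort=4),
    Finding("billing-api", hosted_in_azure=True, risk=5, effort=1),
]

for finding in burn_down_order(inventory):
    print(finding.workload)
```

With these made-up scores the Azure-hosted, high-risk workload comes first and the hard external case waits its turn, which is the shape of rollout the paragraph above describes.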
Where to practice the comparison hands-on
Reading the rule is one thing; building both paths and watching the difference is what makes it stick. The most direct way to internalize this is to stand up a small Azure-hosted workload, wire it once with a service principal and a client secret and once with a managed identity, and compare what you had to store, protect, and rotate in each case. You can run the hands-on Azure labs and command library on VaultBook to work through exactly that exercise with tested commands for enabling both kinds of managed identity, configuring a federated credential, and assigning scoped roles, so the decision table stops being abstract and becomes a set of steps you have actually performed. Building the secretless path with your own hands is the fastest way to stop reaching for a client secret out of habit.
The verdict
The managed identity vs service principal decision resolves to a single rule that holds across almost every case you will meet: default to the secretless option for anything that runs inside Azure, and fall through to a held credential only when the caller lives somewhere Azure cannot vouch for it directly, preferring federation and then a certificate before you accept a raw client secret. A managed identity removes the long-lived credential entirely, which deletes the largest class of identity incidents, eliminates the rotation and storage burden, and shrinks the attack surface to a short-lived, scoped token issued from an endpoint only the resource can reach. A service principal exists for the genuine cases the managed identity cannot serve, the external and cross-tenant workloads, and for those it is the right and necessary tool, best operated with a certificate in a protected store and never with a casually stored secret.
Using service principals is not the mistake to avoid. The mistake is using them where a managed identity would have worked, carrying a secret you never needed because it was the example you copied or the habit you formed. Start every workload at the top of the decision table, ask where it runs before you ask how it authenticates, and let the location choose the credential strategy. Do that and the secret you do not store becomes the secret that cannot leak, the credential that cannot expire on you, and the line in a breach report that never gets written. That is the whole wager of the secretless default, and it is the right one to make.
Frequently Asked Questions
Q: What is the core difference between a managed identity and a service principal?
Both are workload identities in Microsoft Entra ID that receive role assignments and request access tokens, so structurally a managed identity is a kind of service principal. The difference is who owns the credential. A service principal authenticates with a credential you generate and hold, either a client secret or a certificate, and you are responsible for storing, delivering, and rotating it. A managed identity has no credential you ever see; the platform provisions and rotates it internally and your code obtains tokens from a local endpoint that only the hosting resource can reach. The practical consequence is that a managed identity has nothing for you to leak, expire, or rotate, while a service principal carries a long-lived credential you must protect for its entire life. That single distinction drives every other difference in operational burden, attack surface, and where each one fits.
Q: When should I use a managed identity instead of a service principal?
Use a managed identity for any workload that runs inside Azure on a resource that supports it, which includes virtual machines, App Service apps, Functions, Container Apps, AKS workloads, Logic Apps, and many more. The deciding factor is location: if the caller runs on an Azure resource the platform can attest to, there is no reason for a stored credential to exist, so the managed identity is the correct default. Reach for a service principal only when the caller lives outside Azure and cannot use a managed identity, and even then prefer a federated identity credential over a stored secret where the external platform supports it. The short version is to start every workload assuming a managed identity, then move off that default only when you can name the concrete reason it will not work, which is almost always that the workload is not hosted in Azure.
Q: Does a managed identity have no secret at all?
From your perspective, correct, there is no secret you create, store, or manage. There is cryptography happening underneath, and the platform does maintain a credential it uses to obtain tokens on the resource’s behalf, but that credential is provisioned and rotated by Azure and is never exposed to you or your code. Your workload never presents a stored secret; it requests a token from a local metadata endpoint reachable only from the resource itself, and the platform returns a short-lived access token. So while tokens still exist and still flow, the long-lived, human-held credential, the thing that leaks into repositories and logs and expires unnoticed, simply does not exist for a managed identity. That is the precise sense in which it is secretless, and it is the property that removes the largest class of credential incidents.
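To make the "local metadata endpoint" concrete, here is a minimal sketch of what the token request looks like on an Azure virtual machine, where the endpoint is the Instance Metadata Service at a link-local address. It only builds the request and does not send it, so it runs anywhere. Other hosts such as App Service expose their own local endpoint, and in practice an SDK credential class usually makes this call for you, so treat the helper name `build_token_request` as illustrative.

```python
import urllib.parse
import urllib.request

# Link-local address, reachable only from inside the VM itself.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource, client_id=None):
    """Build (but do not send) a managed-identity token request for a VM."""
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id is not None:
        # Selects a specific user-assigned identity when several are attached.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # The Metadata header is required; no secret appears anywhere in the call.
    return urllib.request.Request(url, headers={"Metadata": "true"})


req = build_token_request("https://storage.azure.com/")
print(req.full_url)
print(req.headers)
```

The request carries a target resource and a fixed header, and nothing else. There is no client secret or certificate for the workload to hold, which is the property the answer above describes.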
Q: Which option avoids credential rotation entirely?
A managed identity avoids rotation entirely because you never hold the credential. The platform rotates the underlying credential on its own schedule, your code never depends on knowing when, and there is no expiry on your calendar and no runbook to execute. A service principal, by contrast, always carries a rotation obligation: whether you use a client secret or a certificate, it has a lifetime, and you must generate a replacement, distribute it to every consumer, and retire the old one before it lapses or after it leaks. One of the most common production identity outages is a service principal credential that quietly expired because a rotation reminder was missed. If eliminating that recurring chore is your goal, and the workload runs in Azure, migrating to a managed identity removes the rotation work completely rather than just making it easier.
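For the held credentials that remain, the rotation obligation usually ends up as a recurring check like the one sketched here. The credential names, dates, and the 30-day warning window are assumptions chosen for illustration. A managed identity simply never appears in such an inventory.

```python
from datetime import date, timedelta


def expiring_soon(credentials, today, warn_days=30):
    """Return names of held credentials that lapse within warn_days of today."""
    cutoff = today + timedelta(days=warn_days)
    return sorted(name for name, expires in credentials.items() if expires <= cutoff)


# Hypothetical inventory: credential name -> expiry date.
held = {
    "sp-reporting-secret": date(2025, 3, 10),
    "sp-partner-cert": date(2025, 9, 1),
}

print(expiring_soon(held, today=date(2025, 3, 1)))
```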
Q: What is the difference between system-assigned and user-assigned managed identities?
A system-assigned managed identity is created on and bound to a single resource; it shares that resource’s lifecycle, coming into existence with it and being deleted with it, and it cannot be shared with other resources. A user-assigned managed identity is a standalone resource with its own lifecycle that you can attach to many resources at once, grant a role to once, and have all attached resources inherit. Choose system-assigned when the identity belongs to exactly one resource and should be cleaned up automatically when that resource is removed. Choose user-assigned when several resources need the same principal, when the identity must survive the recreation of the resource that uses it, or when you provision access in infrastructure code before the consuming workload exists. Both are fully secretless and rotated by the platform; the choice is about lifecycle and sharing, not about credentials or security strength.
Q: When does a service principal or federated identity make more sense?
A service principal makes sense when the caller runs outside Azure, where a managed identity is not reachable, and in cross-tenant scenarios where an application in one tenant must act in another, which a managed identity cannot do because it is bound to its own tenant and resource. Within those external cases, a federated identity credential makes more sense than a stored secret whenever the external platform can issue its own short-lived token that Entra ID can be configured to trust, such as an external CI/CD system deploying into Azure. Federation keeps the secretless property across the Azure boundary by trusting the external issuer instead of holding a credential. You fall through to a held certificate or client secret only when the caller is external and no federated issuer is available, and even then a certificate in a protected store is preferable to a client secret.
Q: Is a managed identity more secure than a service principal?
A managed identity is the stronger default because it eliminates the long-lived stored credential, which is the asset attackers most often capture and operators most often mishandle through leaks, accidental commits, over-sharing, or missed rotation. Its remaining attack surface is a short-lived, scoped token issued from an endpoint only the hosting resource can reach, which is a smaller and simpler surface with fewer operator-controlled failure points. A service principal can be operated securely, especially with a certificate in a protected store, but its security depends on a chain of storage, delivery, and rotation steps holding for the credential’s entire life, and any weak link exposes the credential. The qualifier is that secretless does not mean unbounded safety: an over-privileged managed identity is still a serious exposure, because its token carries whatever access you granted. Removing the secret and scoping the roles are two separate disciplines you must apply together.
Q: Can I migrate from a service principal to a managed identity without downtime?
Usually yes, with a brief overlap rather than a hard cutover. Enable the managed identity on the Azure-hosted resource first, grant it the same roles at the same scopes the service principal currently holds, and verify those assignments before touching the code. Then update the application to request tokens from the local managed-identity endpoint instead of presenting the stored secret, deploy, and confirm through the sign-in logs that the workload is authenticating as the managed identity. Only after you have confirmed the new path works for every code path do you remove the client secret and retire the service principal. The risk is leaving a stray code path still using the old secret, so search the config, vault, and pipeline for any remaining credential and confirm the logs show only managed-identity sign-ins before you delete anything. Done in that order, the two paths coexist briefly and the switch carries no outage.
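The "same roles at the same scopes" check before cutover is a set comparison at heart. The sketch below assumes you have already exported each principal's grants as `(role, scope)` pairs. The scope strings are shortened placeholders, not real resource IDs.

```python
def missing_grants(service_principal_grants, managed_identity_grants):
    """Grants the old principal holds that the new identity does not yet have."""
    return set(service_principal_grants) - set(managed_identity_grants)


# Hypothetical exports of (role, scope) pairs for each principal.
sp_grants = {
    ("Storage Blob Data Reader", "rg-app/storage/appdata"),
    ("Key Vault Secrets User", "rg-app/vault/app-kv"),
}
mi_grants = {
    ("Storage Blob Data Reader", "rg-app/storage/appdata"),
}

gap = missing_grants(sp_grants, mi_grants)
safe_to_cut_over = not gap  # only switch the code once nothing is missing
print(gap, safe_to_cut_over)
```

An empty gap is necessary but not sufficient: the sign-in log check described above is still what confirms that every code path has actually moved.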
Q: Does using a managed identity mean I can skip least-privilege role design?
No, and treating it that way is a common and serious mistake. The secretless property lowers the chance that the credential itself is stolen, but it does nothing to limit what the identity can do once a token is obtained, and anything running on the host can obtain that token. A managed identity granted a broad role at a wide scope is a larger exposure than a tightly scoped service principal, because the broad access is available to anything that compromises the host. Scope the role assignment to the smallest resource the workload genuinely needs and choose the least-privileged role that still works, exactly as you would for any principal. The clean pattern is one purpose-named identity per workload, granted only the specific data-plane roles its job requires, so that removing the credential problem does not quietly create an access-sprawl problem in its place.
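A review for over-privileged identities can start from a rule as simple as the one below. Which roles count as "broad" and which scopes count as "wide" are policy choices. The set and the threshold here are assumptions for illustration, and the identity names and subscription ID are made up.

```python
# Assumed policy: these roles are treated as too broad for a workload identity.
BROAD_ROLES = {"Owner", "Contributor", "User Access Administrator"}


def is_wide_scope(scope):
    """True for subscription-level or management-group-level scopes."""
    parts = [p for p in scope.split("/") if p]
    # "/subscriptions/<id>" has two segments; management-group scopes
    # begin with "/providers/Microsoft.Management/...".
    return len(parts) <= 2 or parts[0] == "providers"


def flag_broad(assignments):
    """Return identities holding a broad role or a grant at a wide scope."""
    return [
        identity
        for identity, role, scope in assignments
        if role in BROAD_ROLES or is_wide_scope(scope)
    ]


assignments = [
    ("id-orders-api", "Storage Blob Data Reader",
     "/subscriptions/0000/resourceGroups/rg-orders/providers/"
     "Microsoft.Storage/storageAccounts/orders"),
    ("id-legacy-job", "Contributor", "/subscriptions/0000"),
]

print(flag_broad(assignments))
```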
Q: How do I prove a workload is genuinely secretless?
Verify three things together. First, confirm the hosting resource actually has a managed identity enabled, and for user-assigned, that the correct identity is attached. Second, confirm that no client secret, connection string, or other long-lived credential for that workload exists anywhere in its config, in a vault, or in pipeline settings, including stale entries from a partial migration. Third, check the sign-in logs and confirm that the managed identity is the principal acquiring tokens and that the requests originate from the resources you expect. The negative check is the one teams skip: it is common to migrate most calls to a managed identity and leave one code path still using an old secret, so the workload is only half secretless and the leak surface persists. A workload is genuinely secretless only when the platform issues all its tokens and nothing stored remains, so searching for leftover credentials is as important as confirming the identity exists.
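The three checks only count when they pass together, which a small predicate makes explicit. The `WorkloadEvidence` shape is hypothetical: it assumes you have already gathered the identity status, the results of a credential search, and the principal types seen in the sign-in logs by whatever means your environment provides.

```python
from dataclasses import dataclass, field


@dataclass
class WorkloadEvidence:
    identity_enabled: bool
    # Leftover credentials found in config, vault, or pipeline settings.
    stored_credentials: list = field(default_factory=list)
    # Kinds of principal observed for this workload in the sign-in logs.
    sign_in_principals: set = field(default_factory=set)


def is_secretless(evidence):
    """All three checks must hold; the negative check is not optional."""
    return (
        evidence.identity_enabled
        and not evidence.stored_credentials
        and evidence.sign_in_principals == {"managed_identity"}
    )


half_migrated = WorkloadEvidence(
    identity_enabled=True,
    stored_credentials=["pipeline:CLIENT_SECRET"],
    sign_in_principals={"managed_identity", "service_principal"},
)
clean = WorkloadEvidence(
    identity_enabled=True,
    sign_in_principals={"managed_identity"},
)

print(is_secretless(half_migrated), is_secretless(clean))
```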
Q: Why do so many Azure-hosted workloads end up with a stored secret they never needed?
The usual cause is copied examples. A team follows a tutorial that registers an application and generates a client secret, and they carry that pattern into a workload that runs inside Azure where a managed identity would have worked with nothing to store. The secret was chosen by habit, not by the workload’s actual constraints. A second cause is permission errors during development: a managed identity that has not yet been granted the right role fails with an authorization error, and the failure gets misread as the managed identity not working, prompting a fallback to a secret-bearing principal. The fix in both cases is to start from where the workload runs rather than from the first authentication snippet found online, and to recognize that an authorization failure is a missing role assignment, which would affect a service principal equally, not a flaw in the secretless approach.
Q: What credential should a service principal use if I do have to keep one?
Prefer a certificate over a client secret wherever the consuming side supports it. A certificate lets the private key live in a protected, non-exportable store such as a key vault or a hardware-backed store, so the key never travels and is far harder to exfiltrate than a string, while Entra ID validates the signature against the registered public key. A client secret is a plain string that authenticates anyone who holds it, which makes it trivially replayable if it leaks into a log, a screenshot, or a committed file. Whichever you use, store it in a vault rather than in plaintext config or pipeline variables, set an expiry, and maintain a rotation policy so it does not lapse unnoticed. The general principle is that if the decision has genuinely fallen through to a held credential, make it the strongest held credential available and protect it accordingly, rather than defaulting to the easiest thing to paste into a configuration file.
Q: Can a managed identity be used for workloads outside Azure?
Not directly, and this is the boundary that defines the whole comparison. A managed identity works because the Azure platform can directly attest to a resource it hosts and issue tokens to it through a local endpoint reachable only from that resource. A workload running on-premises or in another cloud has no such endpoint and no platform attestation that Entra ID will accept on that basis, so it cannot obtain managed-identity tokens unless the machine is first brought under Azure management, as Azure Arc-enabled servers are. For those callers you use a service principal, and where the external platform can issue a trusted token you use a federated identity credential to keep the secretless property without a managed identity. The rule of thumb is that a managed identity stops at the Azure boundary; inside it, the managed identity is the default, and at or beyond the edge you move to federation first and a held credential only as a last resort.
Q: How does federated identity differ from a managed identity?
Both are secretless, but they attest to the caller from different sides of the Azure boundary. A managed identity relies on the Azure platform vouching for a resource it hosts, so it works only for Azure-hosted workloads. A federated identity credential relies on an external identity provider that you configure Entra ID to trust, so an external workload presents a short-lived token its own platform issued, Entra ID validates it against the trust relationship, and it returns an Azure access token in exchange. No long-lived secret is created or stored on either side. Federation still uses an app registration and its service principal as the identity that holds the role assignments; what it replaces is the held credential on that principal. The simplest way to keep them straight is that a managed identity is the secretless option inside Azure and federation is the secretless option at the edge, both avoiding a stored credential by trusting a platform to attest to the caller.
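The trust relationship behind a federated identity credential comes down to matching a few claims in the external token against what you configured. The sketch below is a simplified model of that matching, not Entra ID's implementation: it skips the signature and expiry validation a real exchange performs, and the organization and repository names are hypothetical.

```python
def claims_match(trust, claims):
    """True when the external token's claims match the configured trust.

    Signature and expiry validation, which a real exchange also performs,
    are deliberately omitted from this sketch.
    """
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == trust["issuer"]
        and claims.get("sub") == trust["subject"]
        and trust["audience"] in audiences
    )


# Example trust for an external CI/CD issuer; repo names are made up.
trust = {
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:example-org/example-repo:ref:refs/heads/main",
    "audience": "api://AzureADTokenExchange",
}

main_branch = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    "aud": "api://AzureADTokenExchange",
}
feature_branch = dict(main_branch, sub="repo:example-org/example-repo:ref:refs/heads/feature")

print(claims_match(trust, main_branch), claims_match(trust, feature_branch))
```

Nothing in the trust is confidential. It names who may ask rather than holding a secret they must present, which is why federation stays secretless across the boundary.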
Q: Does each managed identity show up in Entra ID, and how do I manage access for it?
Yes. A managed identity appears in your tenant as an enterprise application, the same way any service principal does, and you manage its access exactly as you would for any workload identity. You grant it Azure roles at a chosen scope for management-plane or data-plane access, and where it needs to call an API that uses application permissions you grant those on the relevant resource. It will appear in sign-in logs as a workload identity, which is how you audit that it is authenticating and from where. Because a managed identity is structurally a managed service principal, none of the access-management mechanics are new: role assignments, scopes, and data-plane roles all behave the same. The only difference from a conventional service principal is that you never configure or rotate a credential for it, so the management surface is purely about authorization rather than about authorization plus credential upkeep.
Q: Is a user-assigned managed identity better for infrastructure-as-code?
Often, yes, because it gives you a stable identifier and a lifecycle you control declaratively. A system-assigned identity’s object identifier changes when its resource is recreated, so any role assignment or vault policy referencing the old identifier breaks and must be reapplied after a redeploy. A user-assigned identity keeps a stable identifier across redeployments of the workloads that consume it, so your declared role assignments stay valid even as the compute underneath churns. You can also create the identity and grant its access before the consuming resource exists, which fits a pipeline that provisions access and compute in separate steps. The trade-off is that a user-assigned identity does not clean itself up automatically the way a system-assigned one does, so you must remove it deliberately when it is no longer needed. For anything managed as code and likely to be redeployed, the stability usually outweighs the manual cleanup.
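The identifier churn is easy to see in a toy model. This sketch stands in for principal IDs with random UUIDs and for role assignments with a plain set. It models only the lifecycle difference described above and calls no Azure API.

```python
import uuid


def new_system_assigned_principal():
    """Each (re)creation of the resource yields a fresh principal id."""
    return str(uuid.uuid4())


# Created once, outside the lifecycle of the compute that uses it.
user_assigned_id = str(uuid.uuid4())

# Grants are keyed by principal id, as role assignments are.
first_deploy_id = new_system_assigned_principal()
granted = {first_deploy_id, user_assigned_id}

# The compute resource is deleted and recreated on redeploy.
second_deploy_id = new_system_assigned_principal()

print("system-assigned grant still applies:", second_deploy_id in granted)
print("user-assigned grant still applies:", user_assigned_id in granted)
```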
Q: What happens to my application if I just delete a service principal’s secret?
Authentication stops for any code path still presenting that secret. The application will request a token using a credential Entra ID no longer accepts, the token endpoint rejects it, and downstream calls fail with authentication or token-acquisition errors. This is exactly why a migration must enable and verify the replacement path, whether a managed identity or a federated credential, before the secret is removed, and why you search for every remaining use of the secret first. The failure is often misdiagnosed because it surfaces at the downstream call, a storage or vault operation, rather than at the credential, so engineers chase the wrong permission. If you see token failures after removing a credential, the cause is almost always a code path that was still using it, and the fix is to confirm the new authentication path is wired everywhere before the old credential is retired.
Q: How do I decide quickly between all these options under time pressure?
Walk the decision table top to bottom and stop at the first match. Ask whether the caller runs inside Azure on a resource that supports managed identity; if yes, use a managed identity and choose system-assigned for a single resource or user-assigned for a shared or durable identity. If the caller is external, ask whether its platform can issue a token Entra ID can trust through federation; if yes, configure a federated credential. Only if both answers are no do you fall through to a held credential, preferring a certificate over a client secret and storing it in a vault with a rotation policy. The ordering is deliberate because each step downward adds something you must protect, from nothing, to a trust relationship, to a private key, to a raw string. Starting at the top and stopping early keeps you on the option with the least liability that still serves the workload, which under time pressure is exactly the discipline that prevents an unnecessary secret.
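The walk described above can be written as a first-match function. The parameter names and returned labels are this sketch's own wording rather than terms from any SDK. The order of the checks is the substance.

```python
def choose_credential(
    runs_in_azure,
    supports_managed_identity=True,
    shared_or_durable=False,
    federatable_issuer=False,
    certificate_supported=True,
):
    """Walk the decision table top to bottom and stop at the first match."""
    # 1. Azure-hosted on a supporting resource: nothing to store.
    if runs_in_azure and supports_managed_identity:
        if shared_or_durable:
            return "user-assigned managed identity"
        return "system-assigned managed identity"
    # 2. External, but its platform can issue a token Entra ID can trust.
    if federatable_issuer:
        return "federated identity credential"
    # 3. Held credential as the last resort, strongest form first.
    if certificate_supported:
        return "service principal with certificate in a vault"
    return "service principal with client secret in a vault"


print(choose_credential(runs_in_azure=True))
print(choose_credential(runs_in_azure=True, shared_or_durable=True))
print(choose_credential(runs_in_azure=False, federatable_issuer=True))
print(choose_credential(runs_in_azure=False))
```

Each step further down the function returns an option with more to protect, so an early return is the outcome you want.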