Microsoft Entra ID Authentication Explained

Almost every Azure incident that reaches a security review eventually traces back to a sentence that begins “the app was authenticated, so we assumed.” A user signed in, a token was present, a call went through, and somewhere along that path a decision was made on the strength of the wrong proof. An ID token was forwarded to an API as though it were a key. An access token was inspected for the user’s name and trusted as evidence of who was on the other end. A refresh token sat in a log file long enough for someone to find it. None of these are exotic attacks. They are the ordinary result of treating authentication as a single binary event rather than as a pipeline that produces several different artifacts, each meant for a single purpose.

Microsoft Entra ID authentication is that pipeline. When a user opens an application, or when a background service wakes up to call an API, a sequence of steps runs that establishes who the caller is, decides whether the sign-in is allowed under the current policy, and then hands back a set of tokens that say different things to different parties. Understanding authentication in Entra ID means understanding that sequence well enough to place any sign-in problem, any token rejection, and any unexpected prompt on a specific point in the path. The reader who can do that stops guessing. They can look at a failed call and say, with confidence, that the failure is an audience mismatch and not a permissions problem, or that the prompt is Conditional Access asking for a second factor and not a broken credential.

This article lays the pipeline out end to end. It covers the sign-in sequence Entra ID runs, the OAuth 2.0 and OpenID Connect protocols underneath it, the three token types and what each one is actually for, single sign-on and federation, the authentication methods and multifactor options available, and how Conditional Access evaluates a sign-in after the primary credential checks out. The goal is not to memorize a flow diagram. It is to build a model of authentication detailed enough to reason from, so that when an app or a user proves identity, you know exactly what was proven, to whom, and what was authorized as a result. For the broader identity model that sits around this pipeline, the directory deep dive at Microsoft Entra ID explained is the companion piece; this article zooms into the act of signing in.

The id-proves-who, access-authorizes rule

The single most useful claim in all of Entra ID authentication is small enough to write on an index card and load-bearing enough to resolve a large fraction of token confusion. State it plainly: an ID token proves who the user is to the application that requested the sign-in, while an access token authorizes a call to a protected API. The ID token is identity. The access token is authorization. They are produced by the same sign-in, they often arrive together, and they look superficially similar because both are JSON Web Tokens, yet they are addressed to different audiences and carry different meaning. Most token confusion is, at root, using one where the other belongs.

Call this the id-proves-who, access-authorizes rule. It is worth naming because naming it makes it a thing you can hold up against a problem. A developer who sends an ID token to a downstream API and gets a rejection is not facing a mysterious bug. They are violating the rule: the API does not accept ID tokens because an ID token was never meant to authorize an API call. A developer who reads the user’s display name out of an access token and treats it as proof of identity is also violating the rule, in the other direction, and may be trusting a claim that the API was never supposed to rely on for identity. The rule cuts both ways, and almost every token mistake is a version of one of those two directions.

The rule also explains why the pipeline produces more than one token at all. If a single token could both prove identity and authorize every API in the world, it would be a universal credential, and a universal credential is a single point of catastrophic failure. Entra ID instead issues a narrow ID token scoped to the application that asked, and separate access tokens each scoped to a specific resource and set of permissions. The separation is the security property. The cost of the separation is that you have to keep the tokens straight, which is precisely the discipline this article is trying to build.

What does an ID token actually prove?

An ID token proves that a particular user authenticated to Entra ID and that the application requesting sign-in is the intended recipient. It carries claims such as the user’s object identifier, the issuer, the audience (the application’s client ID), and the time the authentication happened. It is meant to be consumed by the application that requested it, never forwarded to an API as a credential.

The ID token is the answer to the question the application asked when it started the sign-in: who is this person, and can I trust that Entra ID checked them? It is not a bearer credential for calling other services. The audience claim, written as aud, names the client ID of the application that initiated the flow. When an API receives a token, it validates the audience against its own identifier; an ID token’s audience is the client app, not the API, so the API correctly rejects it. That rejection is the rule working as designed, not a failure to fix by loosening validation.

The InsightCrunch authentication pipeline map

The central artifact of this article is a map of the pipeline from raw credential to issued tokens, naming each decision point and each artifact so any authentication problem can be located on it. Read it top to bottom as the path a sign-in actually takes. The left column is the stage, the middle column is what happens, and the right column names the artifact or decision that stage produces and where a failure at that stage shows up.

Stage	What happens	Artifact or decision produced
1. Request	The application redirects the user to the Entra ID authorize endpoint, or a daemon calls the token endpoint directly, naming the client ID, the requested scopes, and the redirect URI	An authorization request; a malformed one fails here with an invalid request or redirect URI error
2. Primary authentication	Entra ID collects and verifies the first credential: password, passwordless sign-in, certificate, or a federated assertion from another identity provider	A verified primary credential; a wrong password or failed federation fails here
3. Conditional Access evaluation	After the primary credential checks out, Entra ID evaluates Conditional Access policies against signals such as user, device, location, application, and risk	A grant decision: allow, block, or require a control such as MFA or a compliant device
4. Additional controls	If a policy requires it, Entra ID challenges the user to satisfy the control, most commonly with a second authentication factor	A satisfied control, recorded on the session; a failed or abandoned MFA prompt fails here
5. Authorization code or assertion	For interactive flows, Entra ID returns a short-lived authorization code to the redirect URI; the app then exchanges it at the token endpoint	An authorization code; a mismatched redirect URI or reused code fails the exchange
6. Token issuance	The token endpoint issues the tokens the flow requested: an ID token, an access token scoped to the requested resource, and usually a refresh token	The ID token, access token, and refresh token; a missing or wrong scope shows up here
7. Resource call	The application sends the access token as a bearer token to the protected API, which validates the signature, issuer, audience, and scopes	An authorized API call; an audience or scope mismatch fails at the API, not at Entra ID
8. Silent renewal	When the access token expires, the app uses the refresh token to obtain a new access token without prompting the user again	A fresh access token; a revoked or expired refresh token fails and forces interactive sign-in

The value of the map is diagnostic. When something goes wrong, the symptom tells you the stage. A redirect URI error is stage 1 or stage 5. A wrong password is stage 2. An unexpected MFA prompt is stage 3 producing a control that stage 4 then satisfies. A token that Entra ID issued happily but the API rejects is stage 7, almost always an audience or scope problem, and emphatically not a sign-in failure. A user who was working fine an hour ago and now gets bounced to a login page has hit stage 8, a refresh token that no longer works. Keep this map in mind for the rest of the article; every section that follows is a detailed pass over one or two of its rows.

Where do most authentication problems actually sit?

Most authentication problems sit at stage 3, stage 6, or stage 7 of the pipeline map. Stage 3 is where Conditional Access turns an otherwise valid sign-in into a blocked or challenged one. Stage 6 is where a missing or incorrect scope produces a token that lacks the permission the API needs. Stage 7 is where audience mismatches surface, because the token is genuine and correctly signed yet names the wrong recipient.

These three stages account for the bulk of real incidents because they are the points where two parties with different expectations meet. Stage 3 is policy meeting credential. Stage 6 is the app’s scope request meeting the resource’s permission model. Stage 7 is the issued token meeting the API’s validation. Errors at stages 1, 2, and 5 tend to be obvious and self-correcting, because the user or developer sees an immediate, specific failure. Errors at 3, 6, and 7 are the ones that produce the “but it was authenticated” confusion, because authentication succeeded and the failure is downstream.

The sign-in pipeline begins the moment an application decides it needs to know who the user is or needs a token to call an API on the user’s behalf. For an interactive application, this means redirecting the browser to the Entra ID authorize endpoint with a set of query parameters that describe the request. The most important of these are the client ID that identifies the registered application, the response type that says what the app expects back, the redirect URI where Entra ID should return the result, the scopes that name the permissions being requested, and a state value the app uses to correlate the response with its own request. The relationship between the application and its registration is the subject of Microsoft Entra app registrations explained, and the registration is what makes the client ID and redirect URI meaningful to Entra ID in the first place.
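The request described above can be sketched as a URL builder. This is a minimal illustration, assuming the v2.0 authorize endpoint of the Microsoft identity platform; the tenant name, client ID, and redirect URI are placeholders invented for the example, not values from a real registration.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# v2.0 authorize endpoint; the tenant segment can be a tenant ID or a domain name.
AUTHORIZE_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/authorize"


def build_authorize_url(tenant, client_id, redirect_uri, scopes):
    """Return (url, state) for an authorization code request."""
    state = secrets.token_urlsafe(16)  # lets the app match the response to this request
    params = {
        "client_id": client_id,        # identifies the registered application
        "response_type": "code",       # ask for an authorization code, not tokens
        "redirect_uri": redirect_uri,  # must match a URI on the registration
        "scope": " ".join(scopes),     # space-separated permission names
        "state": state,
    }
    return AUTHORIZE_URL.format(tenant=tenant) + "?" + urlencode(params), state


url, state = build_authorize_url(
    tenant="contoso.onmicrosoft.com",                          # hypothetical tenant
    client_id="00000000-0000-0000-0000-000000000000",          # placeholder client ID
    redirect_uri="https://app.contoso.example/auth/callback",  # hypothetical redirect URI
    scopes=["openid", "profile", "User.Read"],
)
query = parse_qs(urlparse(url).query)
```

The application keeps the returned state value and compares it with the one that comes back on the redirect; a mismatch means the response does not belong to the request it started.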

When the authorize endpoint receives this request, it does not immediately ask for a password. It first checks whether the user already has a session with Entra ID. If a valid session cookie is present, the user has already proven their identity recently, and Entra ID can skip the credential prompt entirely. This is the mechanism that makes single sign-on feel instant to the user, and it is also the reason a user who just signed in to one application is not asked to sign in again when they open a second. The session is the shared state that single sign-on rides on, and the absence of a session is what triggers the visible credential prompt.

If there is no usable session, Entra ID moves to primary authentication and collects the first credential. What it collects depends on what the tenant and the user have configured. The classic case is a username and password, validated against the directory or, for federated domains, against an external identity provider. Increasingly the first credential is passwordless: a FIDO2 security key, Windows Hello for Business, or a phone sign-in through the Authenticator app. From the pipeline’s point of view the credential type does not change the shape of the flow. Stage 2 either produces a verified primary credential or it does not, and the rest of the pipeline runs the same way regardless of how the first factor was proven.

Why does Conditional Access run after the password, not before?

Conditional Access runs after primary authentication because its decisions depend on knowing who the user is. A policy that says “require MFA for members of the finance group” cannot evaluate group membership until the user has proven identity. Conditional Access is therefore part of sign-in, layered on top of a verified credential, deciding whether that verified identity may proceed and under what additional conditions.

This ordering is the single most important thing to understand about how Conditional Access fits the pipeline, and it is covered in full in the Conditional Access deep dive. The credential check answers “is this the user they claim to be.” Conditional Access answers “given that it is the user, on this device, from this location, requesting this application, with this risk level, should the sign-in proceed and what extra proof do we want first.” Because the second question depends entirely on the answer to the first, the primary credential necessarily comes first, whether it is a password or a passwordless method. A failed primary credential never reaches Conditional Access at all; the sign-in is already over. Only a verified identity is interesting enough to evaluate against policy.

Once Conditional Access has produced its decision, the pipeline either stops, in the case of a block, or continues, possibly after satisfying an additional control. The most common control is multifactor authentication, where Entra ID prompts the user to approve a notification, enter a code, or present a second key. Other controls include requiring a managed or compliant device, requiring an approved client application, or forcing a password change for a risky account. Each control either succeeds, allowing the pipeline to continue, or fails, ending the sign-in with a specific reason recorded in the sign-in logs. After all required controls are satisfied, the session records what was proven, so that later sign-ins within the session lifetime do not have to repeat the same controls.

For interactive flows, the pipeline now returns an authorization code to the application’s redirect URI. This code is short-lived and single-use; it is not a token and cannot call anything. The application takes the code and makes a back-channel call to the token endpoint, presenting the code along with proof that it is the application that started the flow. For a public client such as a single-page application or a mobile app, that proof is a PKCE code verifier that matches the challenge sent at the start. For a confidential client such as a web server, that proof is the application’s own secret or certificate. The token endpoint validates the code and the proof, and only then issues tokens. This two-step structure, an authorization code followed by a token exchange, is what keeps tokens off the front channel where the browser and its history could leak them.

What does the token endpoint return at the end of a flow?

The token endpoint returns the tokens the flow requested, which for an interactive sign-in is typically an ID token, an access token scoped to the requested resource, and a refresh token. The response also includes metadata such as the access token’s lifetime and the scopes that were actually granted, which may be narrower than the scopes that were requested if the user or admin consented to less.

The exact set of tokens depends on what the application asked for. An app that only wants to know who the user is, with no API call to make, can request just an ID token. An app that needs to call Microsoft Graph and a custom API needs an access token for each resource, obtained either in the initial exchange or through later silent requests using the refresh token. The granted scopes in the response matter as much as the token itself; an app that requested broad permissions but received narrow ones because consent was limited will hold an access token that the API accepts but that lacks the permission for the specific operation, which surfaces as an authorization failure at the API rather than a token failure at issuance.
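One way to catch under-granted consent early is to compare the scopes the app asked for with the scope field of the token response. The sketch below is illustrative only: the response dictionary is a made-up example, and comparing scopes by their short name (the part after the last slash) is a simplifying assumption that would conflate same-named scopes on different resources.

```python
def short_name(scope):
    """Reduce 'https://graph.microsoft.com/User.Read' and 'User.Read' to the same key."""
    return scope.rsplit("/", 1)[-1].lower()


def missing_scopes(requested, token_response):
    """Return the requested scopes that the token response did not grant."""
    granted = {short_name(s) for s in token_response.get("scope", "").split()}
    return [s for s in requested if short_name(s) not in granted]


# Hypothetical token response: only part of what was requested was granted.
response = {"token_type": "Bearer", "expires_in": 3600, "scope": "openid profile User.Read"}
missing = missing_scopes(["User.Read", "Mail.Send"], response)
```

Checking this at issuance turns a later authorization failure at the API into an immediate, explainable gap between requested and granted permissions.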

OAuth 2.0 and OpenID Connect beneath the pipeline

The pipeline described above is not a Microsoft invention layered on a proprietary protocol. It is an implementation of two open standards: OAuth 2.0 and OpenID Connect, usually abbreviated OIDC. Entra ID speaks these protocols, and understanding the division of labor between them removes a great deal of the mystery from the token types and the flows. The relationship between the two, and the specific grant flows Azure supports, is the subject of the dedicated protocol article at OAuth 2.0 and OIDC in Azure explained; here the goal is to see how they underpin authentication specifically.

OAuth 2.0 is an authorization framework. Its purpose is to let an application obtain limited access to a resource on behalf of a user, without the application ever handling the user’s password. The tokens OAuth defines, access tokens and refresh tokens, are about authorization: what the bearer is allowed to do, for how long, and against which resource. OAuth deliberately says nothing about the identity of the user. It was designed so that a photo-printing service could obtain permission to read your photos from a storage service, and for that use case the printing service does not need to know who you are, only that it has been granted read access to a specific set of photos.

OpenID Connect is a thin identity layer built on top of OAuth 2.0. It adds the one thing OAuth deliberately omits: a standard way to prove who the user is. OIDC introduces the ID token, defines a userinfo endpoint, and standardizes a set of identity claims so that any OIDC-compliant application can ask “who signed in” and get a consistent, verifiable answer. The crucial design point is that OIDC does not replace OAuth; it extends it. When Entra ID runs an authorization code flow that returns both an ID token and an access token, it is running OAuth for the authorization part and OIDC for the identity part in the same exchange. This is exactly the id-proves-who, access-authorizes rule expressed at the protocol level: OIDC produces the identity proof, OAuth produces the authorization grant.

In a single sign-in, OIDC handles proving who the user is and OAuth handles authorizing what the application may do. The ID token is the OIDC artifact, addressed to the application as proof of identity. The access token is the OAuth artifact, addressed to a resource as proof of authorization. One flow produces both because most applications need to know who the user is and also act on their behalf.

The practical upshot is that you should stop thinking of “logging in” as one thing. A typical sign-in is doing two jobs at once. The OIDC job answers the application’s question about identity and ends when the app has validated the ID token and established a session for the user. The OAuth job answers the resource’s question about authorization and continues for as long as the application needs to call APIs, renewing access tokens silently through the refresh token. The two jobs share the same initial user interaction precisely because making the user authenticate once and then satisfying both identity and authorization needs from that single act is what makes the experience usable. Separating the two jobs in your mental model is what makes the token behavior predictable.

The scopes an application requests are how it tells Entra ID which of these jobs it wants done and how much authorization it needs. The OIDC scopes openid, profile, and email request an ID token and the basic identity claims. Adding offline_access asks for a refresh token as well. Resource scopes such as User.Read for Microsoft Graph request authorization to a specific operation on a specific resource and drive the contents of the access token. A request that includes openid plus User.Read is asking for both an identity proof and a Graph access token in one flow, which is the everyday case for a line-of-business application that signs the user in and then reads their profile.

The grant flows that shape the pipeline

The pipeline map drawn earlier is the interactive case, where a human sits in front of a browser and proves identity. That is the most common shape, but it is not the only one, and the shape of the flow changes with the kind of application asking. OAuth defines several grant flows, and each one is a different route through the same endpoints, chosen to fit how the requesting application can prove it is what it claims to be. Choosing the right flow is part of authentication design, because the flow determines what is collected, what is returned, and where the security boundaries sit.

The authorization code flow with PKCE is the modern route for any application where a user is present and a browser is involved, whether that is a server-rendered web app, a single-page app, or a native mobile app. The flow sends the user to the authorize endpoint, returns a short-lived code to the redirect URI, and then exchanges that code for tokens at the token endpoint. PKCE, which stands for Proof Key for Code Exchange, binds the code to the specific client instance that started the flow by sending a hashed challenge up front and the matching verifier at exchange time. This binding is what lets a public client, one that cannot keep a secret because its code runs on the user’s device, prove that the code being redeemed belongs to the same session that requested it. Without PKCE a stolen authorization code could be redeemed by an attacker; with it the code is useless without the verifier that never left the original client.
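The challenge and verifier pairing can be sketched with the standard library. This assumes the S256 method from RFC 7636, where the challenge is the base64url-encoded SHA-256 hash of the verifier with padding removed; the function names are invented for the illustration.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier):
    """S256 challenge: base64url(SHA-256(verifier)) with '=' padding stripped."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def new_pkce_pair():
    """Create a random verifier (kept by the client) and its challenge (sent up front)."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters, inside the 43-128 limit
    return verifier, challenge_for(verifier)


def verifier_matches(verifier, stored_challenge):
    """Conceptually what the token endpoint checks when the code is redeemed."""
    return secrets.compare_digest(challenge_for(verifier), stored_challenge)


verifier, challenge = new_pkce_pair()
```

The client sends only the challenge with the authorize request and holds the verifier until the code exchange, so an intercepted code alone is not enough to obtain tokens.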

The client credentials flow is the route for an application that authenticates as itself, with no user present at all. A nightly job that reads from a storage account, a service that calls Microsoft Graph to provision accounts, or any daemon that runs unattended uses this flow. There is no authorize step and no user interaction, because there is no user. The application proves its own identity directly to the token endpoint using its secret or, far better, a certificate, and receives an access token scoped to the resource it needs. The defining characteristic is that the resulting access token represents the application itself rather than a user acting through the application, which has significant consequences for what the token is permitted to do and how its permissions are granted.
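As a rough sketch, the token request a daemon sends looks like the form body below. It assumes the v2.0 token endpoint and the /.default scope convention for application permissions; the tenant, client ID, and secret are placeholders, and a client secret is used only for brevity where a certificate-based client assertion would be preferable.

```python
from urllib.parse import parse_qs, urlencode

TOKEN_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"


def client_credentials_request(tenant, client_id, client_secret, resource):
    """Return (url, form_body) for an app-only token request. No user is involved."""
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,  # placeholder; prefer a certificate in practice
        # App-only requests name the resource and ask for its pre-consented permissions.
        "scope": resource.rstrip("/") + "/.default",
    }
    return TOKEN_URL.format(tenant=tenant), urlencode(body)


url, body = client_credentials_request(
    tenant="contoso.onmicrosoft.com",                  # hypothetical tenant
    client_id="00000000-0000-0000-0000-000000000000",  # placeholder client ID
    client_secret="<secret-from-a-vault>",             # placeholder, never hard-code
    resource="https://graph.microsoft.com",
)
fields = parse_qs(body)
```

There is no redirect and no authorization code here; the body is posted straight to the token endpoint and the response carries only an access token.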

The device code flow exists for input-constrained devices and command-line tools where typing a password into the device itself is awkward or impossible. A smart television, a CLI on a remote server, or an Internet-of-things device starts the flow, receives a short code and a URL, and displays them. The user goes to that URL on a phone or laptop, signs in normally, and enters the code, and the original device polls the token endpoint until the sign-in completes and tokens become available. The flow moves the credential entry to a device with a real keyboard and browser while still delivering tokens to the constrained device.
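The polling half of this flow can be sketched as a loop. The sketch assumes the error codes defined for the device authorization grant in RFC 8628 (authorization_pending and slow_down); poll_once is a hypothetical callable standing in for the HTTP call to the token endpoint, and the replies here are simulated rather than real.

```python
import time


def poll_for_tokens(poll_once, interval=5, expires_in=900, sleep=time.sleep):
    """Poll until the user finishes signing in on the other device.

    poll_once() is assumed to return the decoded JSON body of one token request.
    """
    waited = 0
    while waited < expires_in:
        response = poll_once()
        error = response.get("error")
        if error is None:
            return response  # tokens are available
        if error == "slow_down":
            interval += 5    # the server asked for a longer polling interval
        elif error != "authorization_pending":
            raise RuntimeError(f"device code flow failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before sign-in completed")


# Simulated endpoint: two 'pending' replies, then a token response.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "authorization_pending"},
    {"access_token": "<opaque>", "token_type": "Bearer"},
])
tokens = poll_for_tokens(lambda: next(replies), sleep=lambda seconds: None)
```

Any error other than the two polling signals, such as the user declining the request, ends the loop immediately rather than waiting out the code's lifetime.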

The on-behalf-of flow handles the chained case, where a middle-tier API receives an access token from a client and needs to call a further downstream API as the same user. The middle tier presents the incoming access token to the token endpoint and requests a new access token for the downstream resource, preserving the user’s identity through the chain. This is how a frontend can call an API that in turn calls Microsoft Graph without the frontend ever holding a Graph token, and it keeps the user’s delegated permissions intact across every hop rather than collapsing them into an application identity at the boundary.
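The exchange the middle tier performs can be sketched as the form fields it posts to the token endpoint. This assumes the parameter names the Microsoft identity platform uses for on-behalf-of (the jwt-bearer grant type with requested_token_use set to on_behalf_of); the client ID, secret, and incoming token are placeholders.

```python
def on_behalf_of_request(client_id, client_secret, incoming_access_token, downstream_scope):
    """Form fields a middle-tier API posts to swap the caller's token for a downstream one."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,              # the middle tier's own registration
        "client_secret": client_secret,      # placeholder; a certificate is preferable
        "assertion": incoming_access_token,  # the token the client sent to this API
        "scope": downstream_scope,           # what the middle tier needs downstream
        "requested_token_use": "on_behalf_of",
    }


fields = on_behalf_of_request(
    client_id="00000000-0000-0000-0000-000000000000",  # placeholder client ID
    client_secret="<secret-from-a-vault>",             # placeholder
    incoming_access_token="<access token received from the frontend>",
    downstream_scope="https://graph.microsoft.com/User.Read",
)
```

The incoming token must be addressed to the middle tier itself, which is the audience rule applied one hop further along the chain.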

Which grant flow should an application use?

The flow follows from the application type. An interactive app with a user present uses the authorization code flow with PKCE. A daemon with no user uses client credentials. An input-constrained device uses the device code flow. A middle-tier API calling a further API as the user uses on-behalf-of. The deprecated implicit flow should not be used in new applications; the authorization code flow with PKCE replaced it.

Picking a flow is therefore reasoning about the client, not copying a sample that happened to work. The questions are concrete: is a user present, can the application keep a secret, is there a browser, and does the call need to act as the user or as the application. Answering those questions selects the flow, and the selected flow determines the rest of the pipeline’s behavior. The deeper treatment of each flow, including why implicit was deprecated and how PKCE supersedes it, lives in OAuth 2.0 and OIDC in Azure explained; for authentication purposes the point is that the flow is a design decision driven by the nature of the requesting application, and choosing wrongly produces an application that either cannot prove itself or holds more authority than it should.
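Those questions can be written down as a small decision function. This is a deliberately simplified sketch with invented parameter names; it leaves out whether the client can keep a secret, on the assumption that this affects how the client proves itself at the token endpoint rather than which flow it runs.

```python
def choose_flow(user_present, has_browser, acting_for_upstream_user=False):
    """Map the questions about the client onto a grant flow."""
    if acting_for_upstream_user:
        # A middle-tier API that received a user's token and must call further downstream.
        return "on-behalf-of"
    if not user_present:
        # Unattended daemon or service: it authenticates as itself.
        return "client credentials"
    if not has_browser:
        # A user exists but cannot sign in comfortably on this device.
        return "device code"
    return "authorization code with PKCE"


flows = {
    "web app": choose_flow(user_present=True, has_browser=True),
    "nightly job": choose_flow(user_present=False, has_browser=False),
    "remote CLI": choose_flow(user_present=True, has_browser=False),
    "middle-tier API": choose_flow(True, False, acting_for_upstream_user=True),
}
```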

The token types: ID, access, and refresh

The three token types are where the abstract rule becomes concrete, and where careful attention pays off most. All three are issued by the same pipeline, two of them are JSON Web Tokens that can be decoded and inspected, and each has a distinct audience, lifetime, and purpose. Getting these straight is the difference between reasoning about a token problem and guessing at it.

An ID token is an OIDC artifact whose audience is the client application. It exists to prove the user’s identity to that application. Its claims include the subject identifier, the issuer, the audience, the authentication time, and a nonce that ties it to the specific request to prevent replay. The application validates the ID token’s signature against Entra ID’s published keys, checks the issuer and audience, verifies the nonce, and then trusts the identity claims. After that validation, the ID token has done its job. It is not stored to call APIs, not forwarded to backends, and not refreshed in the way an access token is. It is a one-time proof consumed at sign-in.

An access token is an OAuth artifact whose audience is a protected resource, not the client application. It exists to authorize a call to that resource. Its claims include the audience naming the resource, the scopes or roles granting specific permissions, the subject, the issuer, and an expiry. The resource, not the client, validates the access token: it checks the signature, the issuer, that the audience matches its own identifier, and that the scopes cover the requested operation. The client treats the access token as opaque; it should not parse it for identity or business logic, because the token’s format and contents are a contract between Entra ID and the resource, not something the client is entitled to depend on.

A refresh token is the third artifact, and it behaves differently from the other two. It is not a JWT to be validated by a resource; it is an opaque credential the application presents back to Entra ID’s token endpoint to obtain new access tokens without prompting the user again. Refresh tokens have longer lifetimes than access tokens and are bound to the client, the user, and often the session. They are sensitive precisely because they can mint new access tokens, which is why they must be stored securely, never logged, and are subject to revocation when a user’s password changes, a session is terminated, or a risk event triggers reauthentication.

Why does an API reject a token that Entra ID issued without complaint?

An API rejects a valid token almost always because of an audience mismatch: the token’s aud claim names a different recipient than the API expects. Entra ID issued the token correctly for the audience the application requested, but the application sent it to the wrong resource. The token is genuine and unexpired; it is simply addressed to someone else, and the API is right to refuse it.

This is the most common token problem in practice, and the pipeline map places it precisely at stage 7. The fix is never to weaken the API’s validation. The fix is to request an access token for the correct resource, so that its audience matches. If the application sent an ID token to the API, the audience is the client ID and the resource correctly refuses it; the application needs an access token instead. If the application sent an access token meant for a different API, it needs to acquire a token for the right resource, which usually means requesting the right scope. In both cases the diagnosis is the same once you read the aud claim, and reading that claim should be the first move whenever an API rejects a token that issuance produced cleanly. VaultBook’s hands-on Azure labs and command library include a sign-in tracing workflow where you can run a flow end to end and decode each token to inspect exactly this claim, which makes audience mismatches obvious rather than mysterious.
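The two diagnoses above can be expressed as a check over already-decoded claims. This is a triage sketch only, with an invented function name and placeholder identifiers; it reads claims for diagnosis and is no substitute for the API's real validation.

```python
def diagnose_audience(claims, api_audiences, client_id):
    """Explain an API rejection by reading the aud claim of the rejected token."""
    aud = claims.get("aud")
    if aud in api_audiences:
        return "audience matches; check scopes, roles, issuer, or expiry instead"
    if aud == client_id:
        return "ID token sent to an API; acquire an access token for the API"
    return f"token addressed to another resource ({aud}); request a scope for this API"


API_AUDIENCES = {"api://contoso-orders-api"}        # identifiers this API accepts
CLIENT_ID = "00000000-0000-0000-0000-000000000000"  # placeholder client ID

id_token_case = diagnose_audience({"aud": CLIENT_ID}, API_AUDIENCES, CLIENT_ID)
wrong_api_case = diagnose_audience(
    {"aud": "https://graph.microsoft.com"}, API_AUDIENCES, CLIENT_ID
)
```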

How do refresh tokens keep a user signed in without prompting?

Refresh tokens keep a session alive by silently exchanging themselves for new access tokens at the token endpoint. When an access token nears expiry, the application presents the refresh token, and Entra ID issues a fresh access token, and usually a new refresh token, without any user interaction. The user stays signed in until the refresh token itself expires or is revoked.

This silent renewal is what makes a web or mobile application feel like it stays logged in for days even though the access tokens inside it live only an hour or so. Short access token lifetimes limit the damage if a token leaks, while the refresh token absorbs the burden of keeping the session going. The trade-off is that the refresh token becomes the high-value secret in the system. Its revocation is the lever Entra ID pulls when something goes wrong: a password reset, an administrator revoking sessions, or a Conditional Access policy detecting risk all invalidate refresh tokens, which forces the next silent renewal to fail and sends the user back through interactive sign-in where the current policy gets a fresh chance to evaluate them.
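The renewal loop can be sketched in three small pieces: deciding when to renew, building the refresh request, and handling the answer. The session dictionary, the five-minute margin, and the function names are assumptions made for the illustration; in practice an authentication library such as MSAL normally handles this.

```python
import time


def needs_renewal(expires_at, now=None, margin=300):
    """True once the access token is within `margin` seconds of expiry."""
    now = time.time() if now is None else now
    return now >= expires_at - margin


def refresh_request(client_id, refresh_token, scope):
    """Form fields for the refresh_token grant at the token endpoint."""
    return {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "refresh_token": refresh_token,
        "scope": scope,
    }


def apply_refresh_response(session, response, now=None):
    """Update the session in place; return False when interactive sign-in is needed."""
    if "error" in response:  # for example invalid_grant after a revocation
        return False
    now = time.time() if now is None else now
    session["access_token"] = response["access_token"]
    session["expires_at"] = now + int(response["expires_in"])
    # Keep the old refresh token only if no replacement was returned.
    session["refresh_token"] = response.get("refresh_token", session["refresh_token"])
    return True


session = {"access_token": "<old>", "refresh_token": "<rt1>", "expires_at": 1000}
renewed = apply_refresh_response(
    session,
    {"access_token": "<new>", "expires_in": 3600, "refresh_token": "<rt2>"},
    now=1000,
)
```

A False result is the stage 8 failure from the pipeline map: the application stops retrying silently and sends the user back through interactive sign-in.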

The example code below shows what reading a token’s claims looks like in practice, decoding the payload of a JWT to inspect its audience, issuer, and scopes. This is a diagnostic step, not something an application does to an access token it merely passes along.

# Decode the payload of a JWT to inspect its claims (diagnostic only)
# The token has three dot-separated parts: header.payload.signature
# Placeholder value: paste a complete token here, the truncated one below will not decode
TOKEN="eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJhdWQiOiJhcGk..."

# Extract the payload (second segment), fix base64url padding, and pretty-print
echo "$TOKEN" | cut -d '.' -f2 | \
  sed 's/-/+/g; s/_/\//g' | \
  awk '{ l=length($0)%4; if(l==2) print $0"=="; else if(l==3) print $0"="; else print $0 }' | \
  base64 -d 2>/dev/null | python3 -m json.tool

# Acquire an access token for a specific resource with the Azure CLI
# The --resource value sets the audience; this is how you control aud
az account get-access-token --resource https://graph.microsoft.com

# Acquire a token for a custom API by its application ID URI
az account get-access-token --resource api://contoso-orders-api

How a resource validates an access token

When a protected API receives an access token as a bearer credential, it does not simply trust the bytes. It performs a sequence of validation checks, and understanding that sequence is what lets you reason about why a particular call was accepted or refused. Every check corresponds to a property the token claims, and a failure at any one of them produces a rejection with a specific cause. This is stage 7 of the pipeline map seen from the resource’s side, and it is where a large share of “the token is right but the call fails” confusion gets resolved.

The first check is the signature. An Entra ID access token is signed with a key that Entra ID holds privately and whose public half it publishes at a well-known metadata endpoint as a JSON Web Key Set. The resource fetches those public keys, identifies the right one by the key identifier in the token header, and verifies that the signature over the header and payload is valid. A valid signature proves two things at once: that Entra ID issued the token and that nothing has altered it in transit. A signature failure means the token was forged, tampered with, or signed by a key the resource does not recognize, and the resource must refuse it. Because signing keys roll over on a schedule, a resource that caches the key set too aggressively and never refreshes can begin rejecting perfectly good tokens after a key rotation, which is a subtle failure that looks like a token problem but is in fact a stale-key problem on the validating side.
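The stale-key failure can be avoided by reloading the key set when a token names a key identifier the cache has not seen. The sketch below shows only that lookup logic; fetch_jwks is a hypothetical callable standing in for the HTTP request to the published key set, and the signature verification itself is left to a JWT library.

```python
class KeyCache:
    """Cache signing keys by kid and reload once when an unknown kid appears."""

    def __init__(self, fetch_jwks):
        self._fetch = fetch_jwks  # assumed to return {"keys": [{"kid": ...}, ...]}
        self._keys = {}

    def _reload(self):
        self._keys = {key["kid"]: key for key in self._fetch()["keys"]}

    def get(self, kid):
        if kid not in self._keys:
            self._reload()          # unknown kid: the keys may have rotated
        return self._keys.get(kid)  # None means no such key; reject the token


# Simulated key rotation: the published set changes after the cache was filled.
published = {"keys": [{"kid": "old-key", "kty": "RSA"}]}
cache = KeyCache(lambda: published)
first = cache.get("old-key")
published["keys"] = [{"kid": "new-key", "kty": "RSA"}]
second = cache.get("new-key")
```

A production validator would also limit how often an unknown kid can trigger a reload, so that tokens with random key identifiers cannot force constant refetching.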

The second check is the issuer. The token carries an iss claim naming the authority that issued it, and the resource confirms that this matches the issuer it expects for its tenant. This prevents a token from a different tenant or a different authority from being accepted, which matters enormously for multitenant applications that must be deliberate about which issuers they trust. The third check is the audience, the aud claim, which the resource compares against its own identifier. This is the check that catches the audience mismatches discussed throughout this article: a token addressed to a different resource fails here even though its signature and issuer are perfectly valid.

The fourth check is validity in time. The token carries a not-before claim and an expiry claim, and the resource confirms that the current time falls inside that window, usually with a small allowance for clock skew. A token presented before it is valid or after it has expired is refused, which is the ordinary and expected behavior that makes short token lifetimes a security feature rather than a bug. The fifth check is authorization itself: the resource reads the scopes in delegated tokens or the roles in application tokens and confirms that the permission needed for the requested operation is present. A token can pass every prior check and still be refused here if it lacks the specific permission, which is the under-granted-consent case where issuance succeeded but the granted scope was narrower than the operation requires.
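The checks that follow signature verification can be sketched as one function. This is a minimal illustration, not the logic of any particular middleware: it assumes the signature has already been verified, that delegated permissions arrive as a space-separated `scp` string and application permissions as a `roles` list, and that a five-minute clock skew is acceptable. The function and exception names are hypothetical.

```python
import time
from typing import Any, Dict, Optional


class TokenRejected(Exception):
    """Raised with the name of the check that failed."""


def validate_claims(
    claims: Dict[str, Any],
    expected_issuer: str,
    expected_audience: str,
    required_permission: str,
    now: Optional[float] = None,
    skew_seconds: int = 300,  # assumed allowance for clock skew
) -> None:
    now = time.time() if now is None else now

    # Second check: the issuer must be the authority this resource trusts.
    if claims.get("iss") != expected_issuer:
        raise TokenRejected("issuer")

    # Third check: the token must be addressed to this resource.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        raise TokenRejected("audience")

    # Fourth check: the current time must fall inside the validity window.
    if "nbf" in claims and now + skew_seconds < claims["nbf"]:
        raise TokenRejected("not yet valid")
    if "exp" not in claims or now - skew_seconds >= claims["exp"]:
        raise TokenRejected("expired")

    # Fifth check: the permission for this operation must be present.
    scopes = claims.get("scp", "").split()
    roles = claims.get("roles", [])
    if required_permission not in scopes and required_permission not in roles:
        raise TokenRejected("permission")
```

Because each failure carries the name of the check that produced it, a rejection points directly at one of the causes described above rather than at "the token" in general.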

Why does a token rejection sometimes mean the validating side is misconfigured?

Because token validation is performed by the resource, not by Entra ID, a rejection can originate in the resource’s own configuration. A stale cached key set after a signing-key rotation, an issuer value that does not account for the tenant, or an audience identifier that does not match the registration all cause valid tokens to be refused. The token is fine; the validator’s expectations are wrong.

This is why diagnosing a stage 7 rejection means inspecting both the token and the validator. Decode the token to confirm its signature key identifier, issuer, audience, and scopes, then confirm that the resource is fetching current keys, expecting the right issuer, and identifying itself with the audience the token names. When the token’s claims are correct and the call still fails, the problem has crossed from the token to the validating code, and that is a different fix entirely. Holding both sides of the validation in view is what separates a precise diagnosis from a round of guesswork that blames the token for the validator’s stale configuration.

The anatomy of the claims inside a JWT

Both the ID token and the access token are JSON Web Tokens, and reading their internal structure is what turns the abstract rules into something you can inspect directly. A JWT has three parts separated by dots: a header, a payload, and a signature. The header names the signing algorithm and the key identifier that selects which published key was used. The payload is a set of claims, each a name and value asserting something about the subject, the issuer, or the grant. The signature, computed over the header and payload, is what the resource verifies to trust the rest. The first two parts are base64url-encoded JSON and can be decoded and read by anyone; only the signature requires the key to verify.
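The three-part structure can be read with nothing more than the standard library. The sketch below splits a token on its dots and decodes the first two segments; it deliberately does not verify the signature, so it is suitable for inspection only and never as a basis for trust. The function names are illustrative.

```python
import base64
import json
from typing import Any, Dict, Tuple


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding, so restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_unverified(token: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (header, payload) without checking the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    payload = json.loads(_b64url_decode(payload_b64))
    return header, payload
```

Calling `decode_unverified(token)` on a token copied from a request yields the header, where the key identifier lives, and the payload, where the claims discussed in the rest of this section live.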

The claims worth knowing by name are few and they recur everywhere. The subject claim identifies the principal the credential is about. The object identifier and tenant identifier pin that principal to a specific directory object in a specific tenant, which is more stable than a username and is what application code should key on. The issuer names the authority that minted the credential, and the audience names the intended recipient, the two claims whose mismatch causes most refusals. The scope claim lists delegated permissions in a user-context credential, while a roles claim lists application permissions in an app-context one, and the presence of one versus the other is how you classify the credential’s kind. The issued-at, not-before, and expiry claims fix the validity window, and a nonce ties an identity proof to the specific request that asked for it.

An authentication-methods claim, when present, records how the user proved identity, which is how downstream logic and policy can know whether a phishing-resistant method was used rather than merely that some authentication happened. Reading these claims is not an advanced skill reserved for incidents; it is the everyday way to confirm that a credential is the kind you expect, addressed to the recipient you expect, carrying the permissions you expect, and still inside its validity window. The discipline of decoding and reading claims first, before forming theories, is the single habit that most reliably shortens an authentication investigation.

Which claims should I check first when a credential looks wrong?

Check the audience and the issuer first, because their mismatches cause most refusals, then the scope or roles claim to confirm the permission is present, then the expiry to confirm the credential is still valid. Those four claims, the recipient, the authority, the permission, and the time window, account for the overwhelming majority of cases where a genuine credential is nonetheless refused.

Reading them in that order is efficient because it follows the resource’s own validation sequence and front-loads the checks most likely to fail. The audience tells you whether the credential was even meant for this recipient. The issuer tells you whether it came from the authority the resource trusts. The permission claim tells you whether the granted authority covers the operation. The validity window tells you whether the credential is current. Working through those four resolves nearly every “valid credential, refused call” case without ever needing to reach for the signature, the registration, or the policy layer, and on the rare occasion they all check out, you have learned something precise: the problem has moved to the validator, not the credential.

Single sign-on and federation

Single sign-on is the property that a user who has authenticated once does not have to authenticate again for the next application within the session. It is not a separate protocol bolted on; it is a consequence of the session the pipeline establishes at stage 4. When the first sign-in completes, Entra ID sets a session artifact, and when the user navigates to a second application that also redirects to Entra ID, the authorize endpoint finds that session and can issue tokens for the second app without prompting for credentials again. The user experiences a near-instant sign-in to the second app. Under the hood the pipeline ran again, but it short-circuited at the session check rather than collecting a credential.

This is why single sign-on and authentication are so tightly linked. The session is the shared state, and everything that affects the session affects single sign-on. A short session lifetime means more frequent prompts. A Conditional Access sign-in frequency control can force reauthentication even when a session exists, which is how an administrator requires, for example, that access to a sensitive application always involves a fresh sign-in. The session also carries what was already proven, so that a second application protected by an MFA policy can be satisfied by the MFA the user already completed during the session, rather than prompting again. The session, in other words, is the memory that makes a series of separate sign-ins feel like one continuous authenticated experience.

Federation is the mechanism by which Entra ID trusts another identity provider to perform the primary authentication. In a federated configuration, when a user from a federated domain signs in, Entra ID does not collect the password itself. It redirects the user to the external identity provider, which authenticates the user by whatever means it uses, and returns a signed assertion vouching for the user’s identity. Entra ID validates that assertion against the trust it has established with the external provider and then continues its own pipeline: Conditional Access still evaluates, Entra ID still issues its own tokens. Federation changes only stage 2, the primary authentication, by delegating it. The rest of the pipeline is unchanged, which is why a federated user is still subject to Entra ID Conditional Access and still receives Entra ID tokens.

A sign-in is federated when the user’s domain is configured for federation, pointing primary authentication at an external identity provider. Entra ID redirects the user there, the external provider authenticates them and returns a signed assertion, and Entra ID validates that assertion in place of collecting a credential itself. This is common where an organization keeps an existing on-premises or third-party identity provider authoritative for passwords.

Federation matters for troubleshooting because it splits the credential check across two systems. When a federated user cannot sign in, the failure might be at the external provider, which Entra ID never sees the inside of, or it might be in the trust between the two, such as an expired token-signing certificate or a clock skew that makes the assertion appear invalid. The pipeline map still applies, but stage 2 is now a remote call, and the sign-in logs will show a federation-related result rather than a wrong-password result. Distinguishing “the external provider rejected the user” from “Entra ID rejected the assertion” is the core diagnostic skill for federated environments, and it follows directly from understanding that federation replaces only the primary authentication stage.

Application-only versus delegated authentication

A distinction that cuts across everything above, and that resolves a recurring class of confusion, is whether a call is made on behalf of a user or by an application acting as itself. Entra ID supports both, they produce access tokens that look similar but behave very differently, and conflating them leads to permissions that either fail to grant the access an app needs or grant far more than it should. This is the delegated-versus-application distinction, and it is foundational to reasoning about what an authenticated call is actually allowed to do.

Delegated authentication is the case where a user signs in and the application acts on that user’s behalf. The access token carries the user’s identity in its subject claim and carries delegated permissions, the scopes the user consented to, in its scope claim. The effective permission of such a call is the intersection of what the application was granted and what the signed-in user is actually allowed to do. A delegated call to read mail can only read the mail the user themselves can read, because the user’s own authorization bounds it. This is the everyday line-of-business case: a user opens an app, the app calls an API as that user, and the user’s permissions constrain the result.

Application-only authentication is the case where no user is present and the application authenticates as itself through the client credentials flow. The resulting access token carries the application’s own identity, and its permissions are application permissions, also called app roles, granted to the application by an administrator rather than consented to by a user. There is no user to bound the call, so an application permission to read mail can read all mail in the tenant, which is exactly why application permissions require administrator consent and demand careful scoping. The power of an application identity is also its danger: a leaked credential for an application with broad application permissions is a tenant-wide exposure, not a single user’s worth of access.

This is where managed identities and service principals enter, and where the choice between them becomes a security decision. A service principal is the local representation of an application in a tenant, and it can authenticate with a secret or a certificate. A managed identity is a special kind of service principal whose credential Entra ID and the Azure platform manage automatically, so there is no secret for a developer to store, rotate, or leak. For a workload running in Azure that needs to authenticate as an application, a managed identity removes the most dangerous part of application authentication, the standing secret, by making the platform responsible for the credential. The broader trade-off between a managed identity and a manually managed service principal, including when each fits, is the subject of its own comparison; for authentication the key point is that application-only authentication should default to a managed identity wherever the workload runs in Azure, because it eliminates the credential that would otherwise have to be protected.

How do I tell whether a token represents a user or an application?

Read the token’s claims. A delegated token carries a user subject and a scope claim listing delegated permissions; an application token carries the application’s identity and a roles claim listing application permissions, with no user. The presence of a scope claim versus a roles claim, and whether a real user identity is present, tells you which kind of authentication produced the token.

This check matters because the two kinds authorize very differently, and an API often accepts both while applying different logic to each. An API that sees delegated permissions should bound its actions by the user; an API that sees application permissions is dealing with an unattended caller whose authority is not bounded by any user and must be checked against the application’s granted roles. Misreading which kind of token arrived leads to either over-restriction, where a legitimate application call is treated as a user call and denied, or over-permission, where an application token is trusted as though a user had bounded it. Reading the claims to classify the token is the prerequisite for authorizing it correctly, and it is another instance of the same discipline that reading the audience claim represents: the token tells you what it is, if you look.
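The classification step can be sketched in a few lines. It assumes the delegated permissions appear in a claim named `scp` and the application permissions in a claim named `roles`, and it is a simplification: the scope claim is checked first because a user token may also carry roles assigned to that user. The function name is hypothetical.

```python
from typing import Any, Dict


def classify_token(claims: Dict[str, Any]) -> str:
    """Classify decoded access-token claims as delegated or application."""
    if claims.get("scp"):
        # A scope claim means a user consented and the call is on their behalf.
        return "delegated"
    if claims.get("roles"):
        # Roles with no scope claim: the application is acting as itself.
        return "application"
    return "unknown"
```

An API that accepts both kinds can branch on this result before authorizing, bounding delegated calls by the user and checking application calls against the granted roles.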

Authentication methods and multifactor

The primary authentication stage can be satisfied by a range of methods, and the choice of method has real security consequences. The oldest method is a password, which is also the weakest, because passwords are phishable, reusable, and guessable at scale. Entra ID supports passwords but increasingly steers tenants toward stronger methods, and understanding the spectrum is part of understanding authentication, because the method determines how much trust the primary credential actually warrants.

Multifactor authentication adds a second, independent proof to the first. The principle is that a second factor of a different kind, something you have or something you are, makes a stolen password far less useful, because the attacker would also need the phone, the key, or the biometric. In Entra ID, multifactor is most often invoked as a Conditional Access control rather than a blanket requirement, which means it is requested when policy decides the sign-in warrants it. The methods that satisfy a multifactor requirement range from a push notification approved in the Authenticator app, to a one-time code, to a phone call, to a hardware security key, and they differ substantially in how well they resist phishing.

Passwordless authentication takes the logic a step further by removing the password entirely as the primary factor. With a FIDO2 security key, Windows Hello for Business, or phone sign-in, the user proves identity with a possession factor and a local gesture such as a PIN or biometric, and there is no password to phish in the first place. From the pipeline’s perspective these methods all satisfy stage 2, but they change the risk profile of the whole sign-in: a phishing-resistant method makes the primary credential trustworthy enough that some Conditional Access policies can relax additional controls, because the strength was already established at the first factor.

Which authentication methods resist phishing and which do not?

Phishing-resistant methods bind authentication to the legitimate site or origin, so a credential proven to a fake page cannot be replayed against the real one. FIDO2 security keys, Windows Hello for Business, and certificate-based authentication are phishing-resistant. Passwords, one-time codes, and push notifications are not, because each can be captured or approved against an attacker-controlled relay even when a second factor is present.

This distinction increasingly drives policy design. A push notification that asks “approve this sign-in” can be defeated by an attacker who triggers the prompt at the moment the user expects one, or who simply fatigues the user into approving. A one-time code can be relayed through a phishing proxy that sits between the user and the real site. A FIDO2 key cannot, because the cryptographic challenge is bound to the origin and the key refuses to respond to the wrong one. Number matching and other recent improvements harden push notifications considerably, but the structural property of phishing resistance belongs to the methods that bind to the origin. When designing authentication for a sensitive application, the question is not merely “do we have MFA” but “is the method phishing-resistant,” and the pipeline treats the answer as part of the strength recorded on the session.

Conditional Access deserves its own pass because it is the stage where authentication becomes a policy decision rather than a credential check. After the primary credential is verified, Conditional Access gathers signals about the sign-in and evaluates the tenant’s policies against them. The signals include the user and their group memberships, the application being accessed, the device and whether it is managed or compliant, the location and network, and the calculated risk level of the user and the sign-in. Each policy is a set of conditions matched against these signals, and a set of controls to apply when the conditions match.

The decision Conditional Access produces is one of three kinds. It can grant the sign-in outright, allowing the pipeline to continue to token issuance. It can block the sign-in, ending it with an access-denied result regardless of how valid the credential was. Or it can grant access subject to controls, the most common being a requirement for multifactor authentication, a compliant device, an approved client app, or a combination. When controls are required, the pipeline pauses to satisfy them, and only after they are satisfied does it continue. This is why a user with a perfectly valid password can still be prompted for more, or blocked entirely: the credential check is necessary but not sufficient, and Conditional Access holds the gate.
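The three outcomes can be modeled with a toy evaluator. This is an illustrative sketch, not Entra ID's actual policy engine: the `Policy` and `Decision` types, the signal names, and the rule that every matching policy's controls are combined are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Signals = Dict[str, object]


@dataclass
class Policy:
    name: str
    matches: Callable[[Signals], bool]  # conditions tested against the signals
    block: bool = False
    required_controls: Set[str] = field(default_factory=set)


@dataclass
class Decision:
    outcome: str  # "grant", "block", or "grant_with_controls"
    controls: Set[str] = field(default_factory=set)
    matched: List[str] = field(default_factory=list)


def evaluate(policies: List[Policy], signals: Signals) -> Decision:
    matched = [p for p in policies if p.matches(signals)]
    names = [p.name for p in matched]
    # Assumption: a single matching block policy ends the sign-in.
    if any(p.block for p in matched):
        return Decision("block", set(), names)
    controls: Set[str] = set()
    for p in matched:
        controls |= p.required_controls
    if controls:
        return Decision("grant_with_controls", controls, names)
    return Decision("grant", set(), names)
```

Recording the names of the matched policies alongside the outcome mirrors what makes a real sign-in diagnosable: the decision is only useful to a troubleshooter when it says which policy produced it.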

Conditional Access also reaches beyond the initial sign-in through session controls and through continuous access evaluation. Session controls can limit what a session can do, for example restricting downloads in an unmanaged browser. Continuous access evaluation lets a resource and Entra ID communicate during the life of an access token, so that a critical event such as a disabled account or a revoked session can take effect in near real time rather than waiting for the token to expire. These mechanisms extend the policy decision past the moment of sign-in and into the life of the session, which is why thinking of Conditional Access as a one-time gate undersells it; it is closer to a standing evaluation that begins at sign-in and continues for the session’s duration.

A sign-in can be blocked even when the password is correct. Conditional Access evaluates after the password is verified, so a correct password is only the entry condition. If a policy’s conditions match, for example a sign-in from an unmanaged device or an untrusted location, Conditional Access can require additional proof or block the sign-in entirely. The credential being correct does not exempt the sign-in from policy; it is precisely what allows policy to evaluate it.

This is the source of a great deal of help-desk confusion, because the user is certain their password is right, and they are correct. The block is not a credential failure. It is policy declining to allow this otherwise-valid sign-in under the current conditions. The sign-in logs make this clear by recording the policy that applied and the control that was required or the reason for the block, which is why reading the sign-in logs is the first diagnostic step for any “I cannot get in but my password works” report. The fix is never to reset the password; it is to understand which policy matched and whether the user can satisfy its control, such as enrolling a device or completing MFA, or whether the block is intentional.

Token lifetime, session, and continuous evaluation

The pipeline does not end when tokens are issued; the life of those tokens and the session behind them is where authentication meets the passage of time, and several recurring questions live here. The design tension is straightforward. Short-lived access tokens limit the blast radius of a leak, because a stolen token expires quickly, but short lifetimes would force frequent interactive sign-ins if there were no way to renew silently. The refresh token resolves the tension by carrying the renewal burden, and the session ties the whole thing together so the user experiences continuity rather than a string of prompts.

An access token’s lifetime is deliberately short, commonly on the order of an hour, though the exact figure is configurable and subject to change, so it should be verified against the current documentation rather than assumed. During that hour the access token is the credential the application presents to the resource. As it nears expiry, the application uses the refresh token to obtain a new access token from the token endpoint without involving the user. The refresh token itself has a longer lifetime and is typically rotated on each use, so each silent renewal both issues a fresh access token and replaces the refresh token, which limits how long any single refresh token remains valid. This rolling renewal is what keeps a session alive for days while no individual credential lives very long.
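The rolling renewal can be sketched as a small cache. In practice an authentication library such as MSAL normally handles this; the version below only illustrates the mechanism. The `refresh` callable is a hypothetical stand-in for the token endpoint call, and the five-minute renewal margin is an assumed value.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical token endpoint call: takes a refresh token and returns
# (access_token, expires_at_epoch_seconds, new_refresh_token).
RefreshFn = Callable[[str], Tuple[str, float, str]]


@dataclass
class TokenCache:
    refresh: RefreshFn
    refresh_token: str
    access_token: Optional[str] = None
    expires_at: float = 0.0
    renew_margin: float = 300.0  # assumed: renew this long before expiry

    def get_access_token(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if self.access_token is None or now >= self.expires_at - self.renew_margin:
            # Silent renewal: no user interaction, and the refresh token is
            # replaced by the one returned alongside the new access token.
            self.access_token, self.expires_at, self.refresh_token = self.refresh(
                self.refresh_token
            )
        return self.access_token
```

If the `refresh` call raises because the refresh token has expired or been revoked, the exception propagates to the caller, which is the point at which an application would fall back to an interactive sign-in.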

The session is the higher-level state that single sign-on rides on and that Conditional Access can govern. A sign-in frequency control can require that the user reauthenticate after a set interval regardless of refresh tokens, which is how an organization forces periodic fresh proof for sensitive applications. A session can also be revoked outright, which invalidates the refresh tokens bound to it and forces the next renewal to fail. The interplay matters: a user reports being “logged out,” and the cause might be the refresh token expiring naturally, an administrator revoking the session, a password change invalidating refresh tokens, or a sign-in frequency policy demanding fresh proof. Each of these is a different stage-8 event, and the sign-in logs distinguish them.

Continuous access evaluation changes the timing of all this for participating resources. Without it, a revoked session or a disabled account only takes effect when the current access token expires, leaving a window during which a token that should be dead still works. With continuous access evaluation, the resource and Entra ID maintain a channel so that critical events propagate in near real time: a disabled user, a revoked session, or a detected risk can cause the resource to reject an access token mid-life rather than honoring it until expiry. This effectively shortens the window between a security decision and its enforcement from the token lifetime down to seconds, which is why it matters so much for high-value resources and why it is a meaningful part of the modern authentication posture rather than an optional extra.

Why did a user get signed out when nothing seemed to change?

A silent renewal failed. The refresh token expired, was rotated out, was revoked by an administrator or a password change, or a sign-in frequency policy required fresh proof. From the user’s view nothing changed, but the renewal that had been keeping them signed in could not complete, so the next call forced an interactive sign-in. The sign-in logs record which of these caused it.

This is one of the most common support questions, and the model answers it cleanly. Being “signed in” is not a single durable state; it is a session kept alive by repeated silent renewals, each of which can fail for a specific reason. When the renewal fails, the user is sent back to interactive sign-in, where the current Conditional Access policy gets a fresh chance to evaluate them, which is itself a security feature: forcing reauthentication is how revocation and policy changes actually reach a user who was already signed in. Rather than treating an unexpected sign-out as a glitch, read it as the renewal mechanism doing its job, and let the sign-in logs name the specific cause.

The complication: when token roles get confused

Everything above rests on keeping the token roles straight, and the most instructive way to cement the model is to look closely at what goes wrong when they get crossed. The complication this article must engage is the temptation to treat the ID token as an API credential, or the access token as proof of user identity. Both mistakes are common, both compile and often appear to work in early testing, and both are wrong in ways that surface later as security gaps or production failures.

Treating the ID token as an API credential happens when a developer, having received an ID token at sign-in, reaches for it the next time the application needs to call an API, because it is the token they have in hand. The call may even succeed against a permissive or misconfigured API in development. In a correctly configured system it fails, because the API validates the audience and finds the client ID rather than its own identifier. The deeper problem is conceptual: the ID token was never an authorization grant, and an API that accepted it would be trusting a token addressed to someone else. The correct move is to acquire an access token for the API, which the application can do silently using the refresh token, and to send that. The ID token stays where it belongs, consumed at sign-in by the application.

Treating the access token as proof of user identity happens in the other direction, when an API receives an access token and reads claims out of it to decide who the user is for business logic, beyond the authorization the token grants. The access token does carry a subject claim, and using it to identify the calling user for authorization decisions the token was issued to support is legitimate. The mistake is relying on the access token as a general identity assertion, or worse, having the client parse the access token for identity, because the access token’s contents are a contract between Entra ID and the resource, and a client that depends on them is depending on something it was told to treat as opaque. The identity proof for the client is the ID token; the authorization proof for the resource is the access token; conflating the two is exactly the confusion the id-proves-who, access-authorizes rule exists to prevent.

What is the single fastest way to catch a token-role mistake?

Read the aud claim. If a token’s audience is a client ID and it is being sent to an API, it is an ID token in the wrong place. If a client is parsing an access token whose audience is a resource, it is reading a token addressed to someone else. The audience claim names the intended recipient, and almost every token-role mistake is visible the moment you compare that recipient to where the token is actually going.

This single check resolves an outsized share of token problems because the audience claim is the token’s address, and a token in the wrong place is a token whose address does not match its destination. Building the habit of reading aud first, before reaching for logs or theories, turns most token incidents into a thirty-second diagnosis. It is the practical expression of the id-proves-who, access-authorizes rule, and it is why the pipeline map puts audience validation squarely at stage 7 where the token meets the resource.
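The audience-first habit reduces to a comparison that can be written down directly. The sketch below assumes you already know the calling application's client ID and the identifiers the API accepts as its own audience; the function name and the wording of the results are illustrative.

```python
from typing import Set


def diagnose_audience(aud: str, client_id: str, api_audiences: Set[str]) -> str:
    """Explain what a token's aud claim implies when the token is sent to an API."""
    if aud in api_audiences:
        return "ok: token is addressed to this API"
    if aud == client_id:
        # The client itself is the recipient, which is the mark of an ID token.
        return "mismatch: audience is the client, so this looks like an ID token"
    return "mismatch: token is addressed to a different resource"
```

Running this against the decoded aud value before looking at anything else separates the two most common cases, an ID token sent to an API and an access token requested for the wrong resource, in a single step.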

Real-world patterns the model explains

The value of a model is that it explains the cases you actually encounter. Each of the patterns below is a recurring situation engineers report, and each is fully accounted for by the pipeline map and the token rule, which is the point: once you hold the model, these stop being separate problems to memorize and become instances of a few principles.

The first pattern is sending an ID token to an API that needs an access token. The application has an ID token from sign-in, forwards it to its backend API, and the API rejects it. The model places this at stage 7 with an audience mismatch: the ID token’s audience is the client, not the API. The resolution is to acquire an access token for the API and send that instead, leaving the ID token to do its identity job at the client.

The second pattern is a refresh token obtaining new access tokens silently. The user has not signed in interactively for hours, yet the application keeps calling APIs successfully. The model places this at stage 8: each time an access token nears expiry, the application exchanges the refresh token for a fresh one without prompting. This is not a problem to fix; it is the silent renewal working as designed, and recognizing it prevents the false alarm of “why is this still working when the token should have expired.”

The third pattern is single sign-on across applications in a tenant. The user signs in to one application and then opens a second that also uses Entra ID, and the second app signs them in instantly. The model places this at the session check before stage 2: the second sign-in finds the existing session and short-circuits the credential prompt. Understanding this explains both why single sign-on works and why ending the session, or a sign-in frequency policy, breaks the instant experience on purpose.

The fourth pattern is multifactor prompted by Conditional Access. The user enters a correct password and is then asked to approve a notification. The model places this at stage 3 producing an MFA control that stage 4 satisfies. The prompt is policy, not a credential problem, and the sign-in logs name the policy that required it. Recognizing this stops the reflex to reset a password that was never wrong.

The fifth pattern is federation to an external identity provider. A user from a federated domain is redirected away to authenticate elsewhere and returns to a working Entra ID session. The model places this at stage 2, delegated: federation replaces the primary credential check with an external assertion, and Entra ID validates the assertion and continues. When a federated sign-in fails, the model directs attention to the external provider or the federation trust rather than to Entra ID’s own credential store.

The sixth pattern is a token audience mismatch rejected by the API. An application acquires an access token but for the wrong resource, sends it to the intended API, and the API rejects it even though the token is valid and unexpired. The model places this at stage 7: the audience names a different resource than the API expects. The resolution is to request a token for the correct resource by using the right scope, so that the audience matches. This pattern and the first one share a root cause, a token whose audience does not match its destination, which is why the audience-first habit catches both.

Do these patterns ever combine into one harder case?

Yes, and the combined cases are where the model earns its keep. A federated user hitting a Conditional Access MFA requirement, then receiving a token their application forwards to the wrong API, is three patterns at once: a delegated stage 2, an MFA control at stage 3, and an audience mismatch at stage 7. Decomposing the symptom onto the pipeline map turns one confusing failure into three independent diagnoses, each individually obvious.

This is the reason to invest in the model rather than collecting fixes. Real incidents rarely arrive as the clean textbook case; they arrive as a tangle, and the engineer who can lay the tangle across the pipeline map and say “this part is federation, this part is policy, this part is an audience mismatch” resolves it far faster than the one matching symptoms to a list of remedies. The map is a decomposition tool, and decomposition is what makes hard authentication cases tractable.

The pipeline map is most valuable under pressure, when a sign-in is failing and the symptom is a terse code or a vague message. The discipline is to map the symptom to a stage and then to a cause, rather than trying remedies in sequence. Entra ID records a result for every sign-in, and the family of the result points to the stage, which narrows the cause to a handful of possibilities. The series maintains dedicated troubleshooting articles for the specific error families; the goal here is the reasoning that places any of them correctly.

A redirect or reply-URL complaint belongs to the request stage. When the message names the reply URL or redirect URI, the application sent the user to the authorize endpoint with a return address that does not match a value registered for the application. The cause is a mismatch between what the code requested and what the registration permits, and the fix is at the registration, not in the credential or the policy. This family is self-describing once you know it sits at stage one, because the endpoint refuses to return a result to an address it has not been told to trust.
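A small helper makes the usual near-misses visible. This is a sketch only: it assumes the comparison is an exact string match, and the function name and example URLs are hypothetical.

```python
from typing import List


def explain_redirect_mismatch(requested: str, registered: List[str]) -> str:
    """Describe how a requested redirect URI relates to the registered values."""
    if requested in registered:
        return "match"
    for candidate in registered:
        # Near-miss 1: identical apart from a trailing slash.
        if candidate.rstrip("/") == requested.rstrip("/"):
            return f"differs from {candidate!r} only by a trailing slash"
        # Near-miss 2: identical apart from letter case.
        if candidate.lower() == requested.lower():
            return f"differs from {candidate!r} only by letter case"
    return "no registered value resembles the requested redirect URI"


registered = ["https://app.example.com/auth/callback"]
print(explain_redirect_mismatch("https://app.example.com/auth/callback/", registered))
```

Running this kind of comparison against the values in the registration shows quickly whether the request is almost right or pointing somewhere the registration never mentioned.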

A consent-required result belongs to the issuance stage. When a sign-in fails because consent has not been granted, the application requested a permission that neither the user nor an administrator has approved, so Entra ID cannot issue a usable result for that resource. The cause is a missing consent for the requested scope, and the resolution is to grant the appropriate consent, with administrator consent required for permissions that users cannot grant for themselves. This is distinct from a permission that was granted but is narrower than needed, which produces a clean issuance and a later refusal at the resource.

A multifactor-required or access-blocked result belongs to the Conditional Access stage. When the result names a policy or an interrupt for additional proof, the primary credential succeeded and policy then required a control or declined the sign-in. The cause is a policy condition matching the sign-in, and the resolution is to satisfy the control or to understand the deliberate block, never to reset a credential that was already correct.

A resource-principal or audience complaint belongs to the resource-call stage, where a genuine credential is refused because it names the wrong recipient, which the audience-first habit catches immediately.

How do I work out which stage a failed sign-in belongs to?

Read the sign-in log result first, because it names the stage. If it points to a redirect or reply URL, check the registration. If it points to consent, grant the right permission. If it names a policy or extra proof, that is Conditional Access. If it points to audience or resource, decode the credential and compare its recipient to where it was sent. Match the family to the stage before trying any fix.

This ordering saves enormous time because it replaces trial and error with a decision tree rooted in evidence. The sign-in log is the pipeline’s own account, so the first move is always to read it rather than to theorize, and the family of the recorded result selects the stage, which selects the small set of plausible causes. An engineer who internalizes this stops applying random remedies and instead reads, classifies, and fixes the one thing the classification points to. The dedicated troubleshooting deep dives exist for the detailed remediation of each family, but the act of placing a failure on the right stage is the skill that makes those articles findable and useful rather than a wall of unrelated error codes.
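The decision tree can be written down as a lookup. The sketch below is illustrative only: the keyword lists are assumptions about how a failure reason might be worded, not an official taxonomy, and a real implementation would more likely key on the recorded error code.

```python
from enum import Enum
from typing import Tuple


class Stage(Enum):
    REQUEST = "request"
    ISSUANCE = "issuance"
    CONDITIONAL_ACCESS = "Conditional Access"
    RESOURCE_CALL = "resource call"
    UNKNOWN = "unclassified"


# (keywords, stage, first thing to check): keyword choices are assumptions.
FAMILIES = [
    (("reply url", "redirect uri"), Stage.REQUEST,
     "check the redirect URIs on the app registration"),
    (("consent",), Stage.ISSUANCE,
     "grant user or administrator consent for the requested scope"),
    (("conditional access", "multifactor", "multi-factor", "mfa"),
     Stage.CONDITIONAL_ACCESS,
     "satisfy the control or review the policy; do not reset the password"),
    (("audience", "resource principal"), Stage.RESOURCE_CALL,
     "decode the token and compare aud to where it was sent"),
]


def place_failure(failure_reason: str) -> Tuple[Stage, str]:
    """Map a failure message to a pipeline stage and a first check."""
    text = failure_reason.lower()
    for keywords, stage, next_step in FAMILIES:
        if any(keyword in text for keyword in keywords):
            return stage, next_step
    return Stage.UNKNOWN, "read the full sign-in log entry before trying a fix"


stage, next_step = place_failure("The reply URL in the request does not match")
print(stage.value, "->", next_step)
```

The value is in the shape rather than the keywords: one classification step, one stage, one first check, and an explicit fallback that sends you back to the log instead of to a guess.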

Designing authentication with least privilege

Understanding the pipeline is the foundation; designing on top of it well is the payoff, and the organizing principle for the design is least privilege applied at every stage. Least privilege in authentication is not a single setting. It is a set of deliberate choices about how much each credential is trusted, how much each application is authorized, how strong each proof must be, and how long each session may live. Made well, these choices produce a posture that grants exactly the access needed and no more, which is both more secure and easier to reason about.

At the scope level, an application should request the narrowest permissions that let it do its job, so that its access credentials authorize only the operations it actually performs. Requesting broad permissions because they are convenient produces credentials that, if leaked, expose far more than the application ever needed, and it complicates consent because users and administrators are asked to approve more than the task warrants. Narrow scopes are the authentication expression of least privilege, and verifying granted scopes against requested ones is how you confirm the principle held through consent rather than quietly eroding.
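Comparing granted scopes with requested ones is a set difference. The sketch below assumes the access token carries its delegated permissions in a space-separated scp claim and that both sides use short scope names such as User.Read; OIDC scopes like openid would need to be left out of the requested set, since they shape the ID token rather than the access token.

```python
from typing import Dict, Set


def compare_scopes(requested: Set[str], scp_claim: str) -> Dict[str, Set[str]]:
    """Compare requested resource scopes with the scopes actually granted."""
    granted = set(scp_claim.split())
    return {
        # Requested but absent: consent was limited or never given.
        "missing": requested - granted,
        # Present but not requested here: possibly an earlier, broader grant.
        "unrequested": granted - requested,
    }


result = compare_scopes({"User.Read", "Mail.Send"}, "User.Read")
print(result["missing"])  # {'Mail.Send'}
```

A non-empty "missing" set explains an API that accepts the token and then refuses an operation, while a non-empty "unrequested" set is a prompt to review whether the application holds more than it needs.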

At the identity level, the choice between delegated and application authentication is a least-privilege decision. Where a user is present and their authority should bound the call, delegated authentication keeps the action inside what the user can already do. Where an application acts alone, application permissions grant authority unbounded by any user, so they must be scoped tightly and granted deliberately, and the credential behind them must be protected as a high-value secret, which is the argument for a managed identity over a stored secret wherever the workload runs in Azure. Reaching for an application identity where a delegated one would do is a quiet over-grant, and the discipline is to default to the bounded option.

At the proof level, the strength of the authentication method is a least-privilege lever in a different sense: a phishing-resistant method makes the primary credential trustworthy enough that the rest of the posture can rely on it, while a weak method leaves a gap that policy must compensate for. Designing sensitive applications to require phishing-resistant authentication closes the most common attack path at the first factor. At the session level, tuning lifetimes and sign-in frequency to the sensitivity of the application keeps high-value access from outliving its warrant, and continuous access evaluation narrows the gap between a revocation decision and its enforcement. Each of these is least privilege expressed at a different point in the pipeline, and together they turn the model from an explanation into a design.

What is the most common authentication over-grant to look for?

Broad application permissions where delegated permissions would suffice. An application granted tenant-wide application permissions can act on everything of that type, unbounded by any user, so a leaked credential becomes a tenant-wide exposure. Where a user is present and should bound the call, delegated permissions confine the access to what that user can already do, which is almost always the safer default.

Catching this over-grant is high-value because it converts a worst-case exposure into a contained one. The review question is simple: does this application act without any user at all, or did it reach for application permissions because they were easier to grant than working out the delegated flow. When a user is present, the delegated path keeps the user’s own authorization in force as a natural ceiling, and that ceiling is exactly the kind of bound least privilege depends on. Pairing this with a managed identity for the genuinely user-less cases removes the standing secret as well, so the design both narrows the authority and eliminates the credential that would otherwise be the thing to steal.
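During a review it helps to tell the two kinds of token apart from the claims alone. The sketch below rests on an assumption worth stating plainly: that delegated permissions appear in the scp claim and application permissions appear in roles with no scp present. Treat it as a first-pass heuristic, not a definitive test.

```python
from typing import Any, Mapping


def token_kind(claims: Mapping[str, Any]) -> str:
    """Classify decoded access token claims as delegated or app-only (heuristic)."""
    if claims.get("scp"):
        # A user is present and their authority bounds the call.
        return "delegated"
    if claims.get("roles"):
        # No user in the picture: the application acts on its own authority.
        return "app-only"
    return "unknown"


# Hypothetical claims for illustration.
print(token_kind({"scp": "User.Read"}))           # delegated
print(token_kind({"roles": ["Mail.Read"]}))       # app-only
```

Every token that comes back as app-only is a candidate for the review question above: does this workload really run without a user, or was the application path simply easier to set up?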

Verifying the authentication posture

A model is only as good as your ability to confirm it against a running system, and Entra ID exposes the evidence needed to verify every stage of the pipeline. The primary instrument is the sign-in logs, which record each sign-in with the user, the application, the result, the Conditional Access policies that applied, the authentication methods used, and the failure reason when there is one. Reading a sign-in log entry is reading the pipeline’s own account of what happened, and it should be the first step in any authentication investigation rather than the last.

For token-level verification, decoding the token is the direct method. Reading the aud, iss, scp or roles, and exp claims tells you the audience, the issuer, the granted permissions, and the expiry, which together confirm whether a token is the right kind, addressed to the right resource, with the right permissions, and still valid. This is how an audience mismatch becomes visible in seconds rather than remaining a theory. The token’s claims are the ground truth that the abstract rules describe, and confirming them against the symptom closes the loop between model and reality. VaultBook’s command library and sandbox give you a place to run a sign-in, capture the tokens, and decode them safely without exposing real credentials, which is the kind of repeatable verification that turns understanding into a reliable skill.
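Decoding needs nothing beyond the standard library. The sketch below reads the payload of a JWT without verifying its signature, so it is suitable for inspection and diagnosis only, never for deciding whether to trust a token. The sample token is built locally from hypothetical values; no real credential is involved.

```python
import base64
import json
import time
from typing import Optional


def decode_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (inspection only)."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected three dot-separated segments")
    payload = segments[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def summarize(claims: dict, now: Optional[float] = None) -> dict:
    """Pull out the four things worth reading first."""
    current = time.time() if now is None else now
    return {
        "audience": claims.get("aud"),
        "issuer": claims.get("iss"),
        "permissions": claims.get("scp", "").split() or claims.get("roles", []),
        "expired": current >= claims.get("exp", 0),
    }


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url segment (sample data only)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# A locally built, unsigned sample with hypothetical values.
sample = ".".join([
    _segment({"alg": "none"}),
    _segment({"aud": "api://orders-api", "iss": "https://issuer.example",
              "scp": "Orders.Read", "exp": 1700000000}),
    "signature",
])
print(summarize(decode_claims(sample)))
```

Read the audience line of the summary first. If it does not name the place the token was sent, the remaining claims are not yet worth your attention.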

Verification should also extend to the policy layer. Confirming which Conditional Access policies apply to a given sign-in, and what they require, is done through the policy evaluation tools and the sign-in logs together, and it answers the “why was I prompted” and “why was I blocked” questions definitively. The same discipline applies to scopes: verifying the granted scopes against the requested ones catches both over-broad requests and under-granted consent before they become production surprises.

How do I confirm what happened during a specific sign-in?

Open the sign-in log entry for that sign-in. It records the result, the authentication methods used, every Conditional Access policy that applied and whether it granted, blocked, or required a control, and the specific failure reason when the sign-in did not succeed. The log is the pipeline’s own narration, so confirming behavior is reading the entry rather than reconstructing it from theory.
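If the entry has been exported as JSON, the same reading can be done in code. The sketch below assumes the record follows the shape of the Microsoft Graph sign-in resource, with a status object and an appliedConditionalAccessPolicies list; confirm the field names against your own export, and note that the sample entry is invented.

```python
from typing import Any, Dict


def summarize_sign_in(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the result and the policies that acted on one sign-in."""
    status = entry.get("status", {})
    policies = entry.get("appliedConditionalAccessPolicies", [])
    return {
        "succeeded": status.get("errorCode") == 0,
        "failure_reason": status.get("failureReason"),
        # Keep only policies that actually evaluated to a grant or a failure.
        "policies_in_effect": [
            (p.get("displayName"), p.get("result"),
             p.get("enforcedGrantControls", []))
            for p in policies
            if p.get("result") in ("success", "failure")
        ],
    }


# Invented entry for illustration; the error code is a placeholder.
entry = {
    "status": {"errorCode": 1001, "failureReason": "Blocked by policy"},
    "appliedConditionalAccessPolicies": [
        {"displayName": "Require MFA off-network", "result": "failure",
         "enforcedGrantControls": ["Mfa"]},
        {"displayName": "Legacy auth block", "result": "notApplied"},
    ],
}
print(summarize_sign_in(entry))
```

The summary answers the two questions that matter in order: did the sign-in succeed, and if not, which policy stood in the way.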

Making this repeatable is what turns a one-time diagnosis into an auditable posture. Sign-in logs can be exported to a workspace and queried, so that patterns such as a spike in MFA prompts, a rise in audience-mismatch failures from a particular application, or a cluster of blocks from an unexpected location become visible as trends rather than individual tickets. An authentication posture you can query is one you can audit, and an auditable posture is one you can prove to a security review rather than assert. The combination of the pipeline map as the mental model and the sign-in logs as the evidence is what lets you both reason about authentication and demonstrate that your reasoning matches what the system actually did.
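Turning tickets into trends is a counting exercise once the logs are exported. The sketch below groups failed sign-ins by application and error code; it assumes records shaped like the Microsoft Graph sign-in resource (appDisplayName and status.errorCode), and the application names and codes in the sample are placeholders.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def failure_trends(
    entries: Iterable[Dict[str, Any]],
) -> List[Tuple[Tuple[Any, int], int]]:
    """Count failed sign-ins per (application, error code), most frequent first."""
    counts: Counter = Counter()
    for entry in entries:
        code = entry.get("status", {}).get("errorCode", 0)
        if code != 0:  # error code 0 is treated as success here
            counts[(entry.get("appDisplayName"), code)] += 1
    return counts.most_common()


# Placeholder records for illustration.
logs = [
    {"appDisplayName": "Orders Portal", "status": {"errorCode": 1001}},
    {"appDisplayName": "Orders Portal", "status": {"errorCode": 1001}},
    {"appDisplayName": "Orders Portal", "status": {"errorCode": 0}},
    {"appDisplayName": "HR App", "status": {"errorCode": 1002}},
]
for (app, code), count in failure_trends(logs):
    print(app, code, count)
```

A workspace query would do the same thing at scale. The point of the sketch is that the posture becomes auditable the moment failures are grouped rather than read one at a time.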

Closing verdict

Microsoft Entra ID authentication is a pipeline, not a moment. A sign-in flows from an authorization request, through primary authentication, through Conditional Access evaluation and any controls it requires, to the issuance of tokens, and on into the silent renewal that keeps a session alive. Underneath it run two open standards with a clean division of labor: OpenID Connect proves who the user is and produces the ID token, while OAuth 2.0 authorizes what the application may do and produces the access and refresh tokens. The id-proves-who, access-authorizes rule is the compass for the whole space, and reading the audience claim is the fastest way to catch the mistakes that violate it.

The reader who holds the pipeline map can place any authentication problem on a specific stage and any token problem on a specific claim. A block with a correct password is Conditional Access at stage 3. A rejected token that issued cleanly is an audience mismatch at stage 7. A session that stays alive without prompting is the refresh token at stage 8. A federated user redirected away is a delegated stage 2. None of these are separate puzzles once the model is in place; they are instances of a single, well-structured flow. Build the model, confirm it against the sign-in logs and the token claims, and authentication stops being the thing everyone assumes and becomes the thing you can actually reason about.

Frequently asked questions

How does authentication work in Microsoft Entra ID?

Authentication in Entra ID is a pipeline. An application requests sign-in, Entra ID verifies a primary credential, Conditional Access evaluates the sign-in and may require additional controls such as MFA, and then the token endpoint issues an ID token proving identity and access tokens authorizing API calls. Single sign-on rides on the session this establishes.

How do OAuth and OIDC underpin Entra authentication?

OAuth 2.0 and OpenID Connect are the protocols beneath the pipeline. OAuth handles authorization and produces access and refresh tokens addressed to resources. OIDC adds an identity layer on top and produces the ID token addressed to the application. A single Entra ID sign-in runs both at once, which is why one flow returns both an identity proof and an authorization grant.

What is the difference between ID and access tokens?

An ID token proves who the user is to the application that requested sign-in; its audience is the client. An access token authorizes a call to a protected resource; its audience is that resource. The client consumes the ID token and treats the access token as opaque, while the resource validates the access token and never sees the ID token.

What authentication methods does Entra ID support?

Entra ID supports passwords, one-time codes, push notifications through the Authenticator app, phone calls, FIDO2 security keys, Windows Hello for Business, and certificate-based authentication. Methods differ in phishing resistance: keys, Windows Hello, and certificates bind to the origin and resist phishing, while passwords, codes, and push approvals can be captured or relayed even alongside a second factor.

How do SSO and federation work in Entra ID?

Single sign-on works because the first sign-in establishes a session; later sign-ins to other applications find that session and skip the credential prompt. Federation delegates the primary credential check to an external identity provider, which authenticates the user and returns a signed assertion. Entra ID validates the assertion, then runs the rest of its own pipeline and issues its own tokens.

How does Conditional Access fit into authentication?

Conditional Access evaluates after the primary credential is verified. It matches policies against signals such as user, device, location, application, and risk, then grants the sign-in, blocks it, or requires controls like MFA or a compliant device. A correct password is necessary but not sufficient, because policy holds the gate and can challenge or block an otherwise valid sign-in.

Why does my API reject a token that Entra ID issued successfully?

Almost always because of an audience mismatch. The token’s aud claim names a recipient that differs from what the API expects, so the API correctly refuses it even though the token is genuine and unexpired. The fix is to acquire an access token for the correct resource by requesting the right scope, never to weaken the API’s validation.

How long do Entra ID tokens last and what happens when they expire?

Access tokens are short-lived, commonly around an hour, while refresh tokens last much longer. When an access token expires, the application silently exchanges the refresh token for a new access token without prompting the user. When the refresh token expires or is revoked, the silent renewal fails and the user is sent back through interactive sign-in.
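The renewal decision itself is simple arithmetic on the exp claim. The sketch below is illustrative, assuming exp is a Unix timestamp in seconds; the five-minute margin is an arbitrary choice, and in practice an authentication library would normally make this decision for you.

```python
import time
from typing import Optional


def needs_renewal(exp: int, now: Optional[float] = None,
                  margin_seconds: int = 300) -> bool:
    """Return True when the access token is expired or about to expire."""
    current = time.time() if now is None else now
    # Renew slightly early so a call in flight does not carry a stale token.
    return current >= exp - margin_seconds


issued_at = 1700000000
expires_at = issued_at + 3600  # roughly an hour, as described above

print(needs_renewal(expires_at, now=issued_at + 60))    # False: still fresh
print(needs_renewal(expires_at, now=issued_at + 3400))  # True: inside the margin
```

When this returns True, the application exchanges the refresh token silently. Only when that exchange fails does the user see an interactive prompt again.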

What is a refresh token and why is it so sensitive?

A refresh token is an opaque credential an application presents to Entra ID to obtain new access tokens without user interaction. It is sensitive because it can mint fresh access tokens, so a leaked refresh token is a durable foothold. It must be stored securely and never logged, and it is revoked on password reset, session termination, or detected risk.

How does Entra ID know who I am without asking for a password every time?

It uses a session established at your first sign-in. When you return or open another application, the authorize endpoint finds the session and issues tokens without prompting again. The session records what you already proved, including MFA, so subsequent sign-ins within its lifetime reuse that proof. Ending the session or a sign-in frequency policy forces a fresh credential check.

Can a sign-in be blocked even when the password is correct?

Yes. Conditional Access evaluates after the password is verified, so a correct password only lets policy evaluate the sign-in. If a policy’s conditions match, such as an unmanaged device or an untrusted location, it can require additional proof or block the sign-in outright. The sign-in logs name the policy that applied, which is where to look first.

What does the audience claim in a token mean?

The audience claim, written aud, names the intended recipient of the token. An ID token’s audience is the client application; an access token’s audience is the resource the token authorizes calls to. Validating the audience against your own identifier is how an API confirms a token was meant for it, and comparing aud to a token’s destination catches most token-role mistakes.

Why am I getting an MFA prompt when I did not enable MFA on my account?

MFA in Entra ID is most often required by a Conditional Access policy rather than enabled per account. When a policy’s conditions match your sign-in, it requires a second factor as a control, so the prompt comes from policy evaluating this specific sign-in. The sign-in log entry names the policy that required the prompt and the condition that matched.

How does a federated sign-in differ from a normal sign-in?

In a normal sign-in Entra ID verifies the credential itself. In a federated sign-in it redirects the user to an external identity provider, which authenticates them and returns a signed assertion that Entra ID validates in place of collecting a credential. Only the primary authentication stage changes; Conditional Access still evaluates and Entra ID still issues its own tokens.

What are scopes and how do they affect the tokens I receive?

Scopes name the permissions an application requests. OIDC scopes like openid request an ID token and identity claims; resource scopes like User.Read request authorization to specific operations and shape the access token. The granted scopes may be narrower than requested if consent was limited, which yields an access token the API accepts but that lacks permission for some operations.

Should my client application read claims out of an access token?

No. The access token’s contents are a contract between Entra ID and the resource, and the client should treat it as opaque, passing it to the API unparsed. For identity, the client uses the ID token, which is addressed to it. A client that depends on access token claims is reading a token meant for someone else and may break when the resource’s token format changes.

What is continuous access evaluation and why does it matter?

Continuous access evaluation lets a resource and Entra ID communicate during the life of an access token, so that critical events such as a disabled account or a revoked session take effect in near real time rather than waiting for the token to expire. It extends Conditional Access beyond the moment of sign-in into the session, closing the window that short token lifetimes alone would leave open.

Where should I look first when a user with a correct password cannot sign in?

The Entra ID sign-in logs. Find the user’s sign-in attempt and read the result, the Conditional Access policies that applied, and the failure reason. A correct password that still fails is almost always a policy block or an unsatisfied control, which the log names explicitly. Resetting the password wastes time when the credential was never the problem.

How can I practice and verify all of this hands-on?

Run a real sign-in flow end to end and decode each token to inspect its audience, issuer, scopes, and expiry, then compare what you see to the pipeline map. VaultBook’s hands-on Azure labs and command library provide a sandbox and tested commands for tracing sign-ins and inspecting tokens, so you can confirm the model against a running system rather than only reading about it.