A deployment that worked yesterday stops working today, and the only signal you get is a terse line in the event stream saying the image pull was unauthorized. The node tried to fetch a container image from your Azure Container Registry, the registry refused it with a 401, and now your pod sits in a back-off loop while the rest of the rollout waits. An unauthorized pull from ACR is one of the most common reasons a working container platform suddenly refuses to start a workload, and it is also one of the most misdiagnosed, because the fastest-looking fix is almost never the right one. The pull is unauthorized because the thing doing the pulling cannot prove it is allowed to read the registry, and the durable repair is to give that identity the permission it needs, not to weaken the registry until anyone can read it.

Fix Container Registry Pull Unauthorized

This article walks the whole failure from the symptom to the fix. You will learn how to read the error so you know which identity is actually being refused, how to confirm each distinct cause rather than guessing, and how to apply the least-privilege repair for the cause you actually have. By the end you should be able to look at an unauthorized pull, name the cause in one sentence, run one confirming check, and apply one fix, without ever reaching for the registry admin account as a shortcut.

What an unauthorized ACR pull actually means

When a container runtime starts a pod or a container instance, it needs the image referenced in the spec. If that image lives in a private Azure Container Registry, the runtime cannot simply download it the way it would pull an anonymous public image from a community registry. The registry demands proof of identity, and it grants the read only if the caller presents a token that maps to a principal with permission to pull. The word unauthorized in the error is precise. It does not mean the image is missing, it does not mean the tag is wrong, and it does not mean the network is down, although all three of those produce their own distinct errors. Unauthorized means the registry received a request, looked at the credential attached to it, and decided the credential does not entitle the caller to read the repository.

That distinction matters because the temptation under pressure is to treat every pull failure as the same problem and to throw the broadest possible fix at it. The broadest fix for an ACR pull is to enable the registry admin account, copy its username and password into an image pull secret, and move on. It works in the sense that the pull starts succeeding, and it is exactly the wrong instinct, because it trades a clean, auditable, per-identity permission model for a single shared credential that every workload now carries and that no audit can attribute to a person or a service. The registry admin account is disabled by default for that reason, and a well-run registry keeps it that way. The right response to an unauthorized pull is to find the identity that should have been allowed and grant it the read it was supposed to have.

To understand the failure you have to understand how a pull authenticates in the first place. A pull from a private ACR is authorized in one of a few ways. The cleanest, and the one Azure steers you toward, is a managed identity that holds the registry read permission directly, so the runtime exchanges the identity for a token and the registry honors it with no secret stored anywhere. On Azure Kubernetes Service this is the kubelet identity, the managed identity attached to the node pool that the kubelet uses when it asks the registry for an image. On Azure Container Apps and Azure Container Instances it is the managed identity you assign to the app or the group. A second path is an image pull secret, a Kubernetes secret or an equivalent credential reference that holds a username and password or a token the runtime presents on every pull. A third path is the registry admin account, the shared credential that is discouraged for production precisely because it is shared. Whichever path is in play, the registry checks the presented credential against its role assignments before it releases a byte of the image.

The role that matters for a pull is AcrPull. It is the built-in Azure role that grants exactly the permission to read and download images from a registry and nothing more, which is what least privilege means here. A principal with AcrPull can pull every repository in the registry it is scoped to and cannot push, delete, or change the registry in any way. There is a newer wrinkle worth knowing from the outset: registries that have been moved to the attribute-based access model, the mode Microsoft describes as RBAC Registry plus ABAC Repository Permissions, use a different role for image pull, the Container Registry Repository Reader role, and the AKS attach command does not wire that one up automatically. For the large majority of registries still on the classic RBAC model, AcrPull is the role, and the rest of this article uses AcrPull as the canonical pull role while flagging the ABAC case where it changes the fix. Verify which model your registry uses before you assign, because assigning the wrong role on the wrong model leaves the pull just as unauthorized as it was.
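One way to act on that advice is to read the registry's access model and derive the pull role from it. The sketch below rests on an assumption you should confirm against your own CLI version: that az acr show exposes a roleAssignmentMode property whose ABAC value is AbacRepositoryPermissions. The function name pull_role_for_mode and the registry name myregistry are illustrative, not part of any Azure tooling.

```shell
# Map a registry's role assignment mode to the role that grants image pull.
# Assumption: ABAC-enabled registries report "AbacRepositoryPermissions".
# Any other value (including an empty one) is treated here as the classic
# RBAC model, where AcrPull is the pull role.
pull_role_for_mode() {
  case "$1" in
    AbacRepositoryPermissions) echo "Container Registry Repository Reader" ;;
    *)                         echo "AcrPull" ;;
  esac
}

# Usage (requires a logged-in Azure CLI; "myregistry" is a placeholder):
#   MODE=$(az acr show --name myregistry --query "roleAssignmentMode" --output tsv)
#   pull_role_for_mode "$MODE"
```

If the property comes back empty on your CLI version, treat the result as unverified and check the registry's access settings in the portal before assigning anything.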

So the mental model to hold is simple. A pull succeeds when the identity doing the pulling holds the pull role on the registry it is pulling from, and the runtime can reach the registry over the network to present that proof. An unauthorized pull is the registry telling you that the first half of that sentence is false. Either there is no identity that should be allowed, or the identity that should be allowed has not been granted the role, or the credential the runtime is presenting has expired, or, in a slightly different failure that dresses itself up as the same thing, the runtime cannot reach the registry at all because a firewall or a private endpoint is in the way. Each of those is a real, distinct cause with its own confirming check and its own fix, and the work of fixing an unauthorized pull is the work of deciding which one you have.

How to read the error and gather the diagnostic signal

The first job is to read the error properly rather than reacting to the headline. On AKS the symptom you usually see first is a pod stuck in a state the cluster reports as waiting, and a describe on the pod shows the kubelet’s failed pull attempts. The kubelet records a Failed event with a message that the image could not be pulled, and within that message you find the registry’s own words: an unauthorized response, sometimes with a 401 status, sometimes with the registry telling you authentication is required. The cluster-side symptom of this, the looping container that never starts because the image never arrives, is the subject of its own deeper treatment in our piece on diagnosing AKS ImagePullBackOff and ErrImagePull, and the unauthorized response is one of the specific causes that piece routes here for the registry-auth detail. The describe is where you begin because it tells you three things at once: which image reference the runtime tried, which registry that reference points at, and what the registry said back.
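A describe prints everything about the pod, and the part that matters here is the Failed events. As a minimal sketch, the helper below narrows the event stream to those events for one pod. The function name pull_failure_events and the pod and namespace names are hypothetical.

```shell
# Print the kubelet's Failed events for one pod. The MESSAGE column carries
# the image reference that was tried and the registry's own response text.
# Arguments: pod name, namespace (defaults to "default").
pull_failure_events() {
  kubectl get events \
    --namespace "${2:-default}" \
    --field-selector "involvedObject.name=$1,reason=Failed" \
    --output custom-columns=TIME:.lastTimestamp,MESSAGE:.message
}

# Usage ("my-pod" and "my-namespace" are placeholders):
#   pull_failure_events my-pod my-namespace
```

Events age out of the cluster after a while, so run this soon after the failed pull or trigger a fresh attempt first.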

Read the image reference carefully, because half the unauthorized pulls that engineers chase are not unauthorized in the way they think. The reference names a registry login server, a repository, and a tag, in the shape of a fully qualified name. If the login server in the reference is not the registry you think you granted access to, no amount of role assignment on the registry you have in mind will help, because the runtime is asking a different registry entirely. This is common when an image was built and pushed to one registry in a development subscription and the manifest still references that registry while the cluster has access only to a production registry. The fix in that case is not a role at all; it is correcting the image reference so it points at the registry the workload is actually allowed to read. Confirm the login server first, every time, before you touch a single role assignment.
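Checking the login server can be done mechanically. The sketch below extracts the registry host from an image reference using the common convention that the first path component is a registry only if it contains a dot or a colon or is localhost. The function name image_login_server is hypothetical, and the docker.io fallback is an assumption about how your runtime resolves unqualified names.

```shell
# Print the registry host an image reference points at.
# References with no registry component fall back to "docker.io" here,
# which is an assumption about the runtime's default registry.
image_login_server() (
  ref=$1
  first=${ref%%/*}
  case "$ref" in
    */*)
      case "$first" in
        *.*|*:*|localhost) echo "$first" ;;
        *)                 echo "docker.io" ;;
      esac ;;
    *) echo "docker.io" ;;
  esac
)

# Usage ("my-app" is a placeholder deployment name):
#   IMAGE=$(kubectl get deployment my-app \
#     --output jsonpath='{.spec.template.spec.containers[0].image}')
#   image_login_server "$IMAGE"
```

Compare the printed host with the login server of the registry you granted access on. If they differ, correct the reference before touching any role assignment.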

Once you are sure the reference names the registry you intend, the next signal is the exact phrasing of the registry’s response, because the registry distinguishes between two adjacent failures that engineers often blur. A genuine unauthorized response means the registry could not authenticate the caller, which is the credential or identity problem this article is mostly about. A different response, a denied or forbidden message, means the registry authenticated the caller fine but the caller’s principal lacks the permission to read that repository, which is the missing-role problem. Both end with the pull not happening, and the everyday language for both is the same word, but the fix differs in emphasis: an authentication failure points at a missing identity, an expired credential, or a network block, while an authorization failure points squarely at a missing AcrPull role on a principal the registry already recognizes. Reading which of the two you got narrows the search before you have run a single command.
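The distinction above can be turned into a rough first-pass classifier. This is a heuristic sketch, not a parser for any official message format: the substrings it matches are assumptions based on commonly seen runtime messages, and the function name classify_pull_error is hypothetical.

```shell
# Classify a pull error message as a network, authentication, or
# authorization failure by substring. Network patterns are checked first
# because a node that cannot reach the registry never gets to present a
# credential at all.
classify_pull_error() (
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"no such host"*|*"i/o timeout"*|*"connection refused"*) echo "network" ;;
    *unauthorized*|*"authentication required"*)              echo "authentication" ;;
    *denied*|*forbidden*)                                    echo "authorization" ;;
    *)                                                       echo "unknown" ;;
  esac
)

# Usage:
#   classify_pull_error "unauthorized: authentication required"
```

Treat the result as a hint about where to look first. The confirming checks later in the article remain the real evidence.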

The third signal, and the most underused, is anonymous pull. When a runtime presents no credential at all, because no identity was assigned and no pull secret was configured, the registry treats the request as anonymous. A private registry refuses anonymous reads, and the message you get back can look like a plain unauthorized error even though the real cause is that nothing was ever presenting a credential in the first place. This is the failure behind a surprising number of brand-new clusters and apps that have never pulled a private image successfully: the identity path was never wired up, so the runtime is pulling anonymously and the registry is correctly saying no. Distinguishing an anonymous failure from a present-but-insufficient credential is the difference between needing to assign an identity and needing to grant a role, and the diagnostic signals above let you tell them apart before you act.

How do you tell which identity is doing the pulling?

Everything downstream depends on knowing which principal the runtime presents to the registry, because the fix is always a property of that principal, and the single most common reason a fix does not work is that it was applied to the wrong identity. On AKS the principal is the kubelet identity, which is a managed identity distinct from the cluster control plane identity and distinct from any identity your pods use for their own application calls. People assign AcrPull to the wrong one constantly, granting it to the control plane identity or to a workload identity and then wondering why the pull is still refused, because the kubelet, not those, is what fetches the image. To find the kubelet identity, query the cluster for its identity profile and read the kubelet identity’s client and object identifiers, which you then use as the assignee when you check or create the role assignment. The command to read it is a single az aks show with a query into the identityProfile.kubeletidentity field, and the object identifier it returns is the principal you must grant AcrPull to.

# Read the kubelet identity that AKS uses to pull images
az aks show \
  --resource-group aks-rg \
  --name my-aks-cluster \
  --query "identityProfile.kubeletidentity.objectId" \
  --output tsv

On Azure Container Apps and Azure Container Instances the principal is the managed identity you assigned to the app or the container group, system assigned or user assigned, and the pull is configured to use that identity by referencing it in the registry settings of the resource. If you assigned no identity, the resource has no principal to present and falls back to whatever credential you configured, or to anonymous if you configured none. On a build or release pipeline the principal is the service connection’s service principal or the pipeline’s managed identity or workload identity federation, and the same rule holds: the thing that authenticates to the registry is the principal you must grant the pull role to, and our walkthrough of using Azure Container Registry in CI/CD covers how a pipeline authenticates and pulls so a deploy step does not hit the same wall a node does. Whatever the platform, write down the exact principal before you change anything, because the entire repair is scoped to it.
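For the non-AKS runtimes, the principal can be read in much the same way as the kubelet identity. The sketch below assumes a system-assigned identity. A user-assigned identity appears under a different property, and the az containerapp commands may require the containerapp CLI extension. The function names and the resource names in the usage comments are hypothetical.

```shell
# Print the principal (object) ID of a Container App's system-assigned
# identity. Arguments: resource group, app name.
containerapp_principal_id() {
  az containerapp identity show \
    --resource-group "$1" \
    --name "$2" \
    --query "principalId" \
    --output tsv
}

# Print the same for a container group in Azure Container Instances.
# Arguments: resource group, container group name.
container_group_principal_id() {
  az container show \
    --resource-group "$1" \
    --name "$2" \
    --query "identity.principalId" \
    --output tsv
}

# Usage (placeholders throughout):
#   containerapp_principal_id apps-rg my-app
#   container_group_principal_id aci-rg my-group
```

An empty result suggests no system-assigned identity is enabled, in which case the resource is pulling with a configured credential or with none at all.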

The attach-or-role rule

Here is the central claim of this article, the one rule that, once internalized, turns most unauthorized pulls from a panic into a thirty-second fix. A pull from a private Azure Container Registry is authorized by the pull role held by the pulling identity, and on AKS the supported way to grant that role is to attach the registry to the cluster, which assigns the pull role to the kubelet identity for you. Call it the attach-or-role rule: an unauthorized pull means the pulling identity does not hold the pull role, so the fix is to attach the registry to the cluster or to assign the role to the identity directly, and it is never a reason to enable the registry admin account. The admin account looks like a fix because it makes the symptom go away, but it answers a question nobody asked. The question is not how do I get any credential that can read this registry, it is how do I let the identity that should already be allowed do what it was supposed to do.

The attach-or-role rule has a clean shape because Azure built the attach command precisely to spare you the manual role assignment. When you run the attach against an AKS cluster, Azure takes the kubelet identity, finds the registry, and creates the AcrPull role assignment scoped to that registry for that identity, using your own permissions to make the assignment. After the attach, the kubelet can pull every repository in that registry with no secret stored in the cluster and no pull secret in any manifest. That is the model Azure wants you in, and it is the model that keeps the permission auditable, because the role assignment is a first-class Azure object you can list, review, and revoke. The same outcome is available without the convenience command by assigning AcrPull to the kubelet identity by hand. That is the path you take when the attach does not apply, for instance on an ABAC-enabled registry where you assign Container Registry Repository Reader instead, or when the registry lives in another subscription and you prefer to manage the assignment explicitly.

The rule extends cleanly to the non-AKS runtimes even though they have no attach command. On Container Apps and Container Instances there is no attach, so the rule reduces to its second half: assign the pull role to the managed identity you gave the app, and reference that identity in the registry settings so the runtime presents it. The behavior of pulling an image into a Container Apps revision, and how the revision fails when the pull is refused, is covered in our explainer on how Azure Container Apps works, and the registry-auth half of that story is exactly the attach-or-role rule applied to a managed identity instead of a kubelet identity. In a pipeline the rule reduces the same way: the service principal or federated identity the pipeline authenticates as must hold the pull role on the registry, and granting it is the whole fix. Across every platform the rule is the same single sentence, and the platform only changes which identity you point it at.
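For Container Apps, the second half of the rule might look like the sketch below: grant the app's identity the pull role, then point the app's registry settings at that identity. It assumes the app already has a system-assigned identity and that the registry uses the classic RBAC model. The function name grant_containerapp_pull and the resource names are hypothetical.

```shell
# Grant a Container App's system-assigned identity AcrPull on a registry,
# then configure the app to present that identity when pulling.
# Arguments: resource group, app name, registry name.
grant_containerapp_pull() (
  rg=$1; app=$2; acr_name=$3

  principal_id=$(az containerapp identity show -g "$rg" -n "$app" \
    --query "principalId" -o tsv) || exit 1
  acr_id=$(az acr show --name "$acr_name" --query id -o tsv) || exit 1
  login_server=$(az acr show --name "$acr_name" --query loginServer -o tsv) || exit 1

  # Scope the role to the registry only, the least-privilege choice.
  az role assignment create \
    --role "AcrPull" \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --scope "$acr_id" || exit 1

  # Tell the app to authenticate to the registry with its own identity.
  az containerapp registry set -g "$rg" -n "$app" \
    --server "$login_server" --identity system
)

# Usage (placeholders): grant_containerapp_pull apps-rg my-app myregistry
```

Both steps matter. A role assignment with no registry reference leaves the app presenting nothing, and a registry reference with no role leaves it recognized but denied.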

The reason to elevate this to a named rule rather than a loose habit is that it gives you a decision you can make under pressure without thinking. When the pull is unauthorized, you do not open a menu of possible fixes and weigh them; you ask whether the pulling identity holds the pull role, and if it does not, you grant it by the supported path for the platform. The admin account is not on the menu because it is not an answer to the question the rule asks. Internalizing the rule is what stops the three-in-the-morning reflex to flip the admin switch, because the rule has already told you the admin switch is not the fix, the role assignment is.

The InsightCrunch ACR pull table

The fastest way to turn the attach-or-role rule into action is to match the symptom you have to the cause, run the one check that confirms it, and apply the one fix that addresses it. The table below, the InsightCrunch ACR pull table, maps each distinct cause of an unauthorized pull to its confirming check and the recommended least-privilege fix. Read it as a triage chart: find the row whose check comes back the way the table describes, and the fix in that row is the repair for the cause you actually have. The sections after the table walk each row in full, with the commands and the reasoning, so the table is the index and the sections are the detail.

| Cause | What the registry is telling you | Confirming check | Least-privilege fix |
| --- | --- | --- | --- |
| AKS not attached to the registry | The kubelet presents an identity the registry does not recognize as allowed, so the pull is unauthorized | az aks check-acr reports the cluster cannot authenticate to the registry; no AcrPull assignment exists for the kubelet identity | Attach the registry to the cluster with az aks update --attach-acr, which assigns the pull role to the kubelet identity |
| Pulling identity missing the pull role | The identity is recognized but lacks permission to read the repository | az role assignment list for the kubelet or app identity scoped to the registry shows no AcrPull (or, on ABAC, no Container Registry Repository Reader) | Assign AcrPull to that identity scoped to the registry, or the Repository Reader role on an ABAC-enabled registry |
| Expired token or credential | A previously working credential no longer authenticates | The image pull secret or service principal credential is past its expiry; re-running the pull with a fresh login succeeds | Rotate to a managed identity so there is no secret to expire, or refresh the credential and re-create the pull secret |
| Registry firewall or private endpoint blocking the node | Authentication never completes because the node cannot reach the registry | az aks check-acr fails on connectivity, not auth; the node subnet is not in the registry network rules or lacks a private DNS record | Add the node subnet or private endpoint to the registry network rules and ensure private DNS resolves the registry to the endpoint |
| Reliance on the disabled admin account | A pull secret built from admin credentials stops working once admin is disabled | The registry admin account is disabled; the failing pull secret holds the admin username and password | Replace the admin-based pull secret with a managed identity holding AcrPull; do not re-enable admin |
| Cross-tenant pull without access | The pulling identity lives in a different tenant from the registry and cannot be granted a role there directly | The registry and the identity are in different Entra tenants; no cross-tenant assignment or secret exists | Use an image pull secret with a token from the registry’s tenant, or a service principal granted AcrPull in the registry’s tenant |

Root cause one: AKS is not attached to the registry

The most common single cause of an unauthorized pull on AKS is the simplest: the registry was never attached to the cluster, so the kubelet identity holds no pull role and the registry refuses it. This is the default state of a freshly created cluster and a separately created registry. Nothing wires them together automatically. You create the cluster, you create the registry, you push your images, you deploy a workload that references those images, and the very first pull is unauthorized because the kubelet has nothing to present that the registry will accept. Engineers who built the cluster and the registry in separate steps, or who inherited the environment, hit this constantly, and the giveaway is that the pull has never worked, not that it stopped working.

The confirming check for this cause is the purpose-built command Azure ships for exactly this question. The check-acr command takes the cluster and a registry login server and validates, end to end, whether the cluster can authenticate to and reach that registry. It exercises the kubelet identity against the registry the way a real pull would, so a clean pass means the attach and the role assignment are in place and a network path exists, while a failure tells you which half is broken. When the cause is a missing attach, the command reports that the cluster cannot authenticate to the registry, which is distinct from the connectivity failure you get when a firewall is in the way. Run it first, because it answers the attach question and the network question in one shot and saves you from guessing which you have.

# Validate whether AKS can authenticate to and reach the registry
az aks check-acr \
  --resource-group aks-rg \
  --name my-aks-cluster \
  --acr myregistry.azurecr.io

If the check confirms the cluster cannot authenticate, the fix is the attach, and the attach is a single command. Running az aks update with the attach flag against the cluster takes the registry you name, locates the kubelet identity, and creates the AcrPull role assignment scoped to that registry for that identity, using your own permissions to make the assignment. After it completes, give the assignment a few minutes to propagate through Azure, because a freshly created role assignment is not instantly visible to every part of the platform, and then retry the deployment. The pull that was unauthorized a moment ago now succeeds, because the kubelet finally holds the role the registry was checking for.

# Attach the registry to the cluster, granting the kubelet identity the pull role
az aks update \
  --resource-group aks-rg \
  --name my-aks-cluster \
  --attach-acr myregistry

There is one important exception to fold in here, the ABAC case mentioned earlier. If the registry has been moved to the attribute-based access model, the attach command does not assign the role that grants pull on that model, because on an ABAC-enabled registry the pull permission comes from the Container Registry Repository Reader role rather than AcrPull, and the attach path does not wire that one up. On such a registry the attach can appear to succeed while pulls remain unauthorized, which is a confusing failure because the command you trusted to fix it did not. The fix on an ABAC-enabled registry is to assign the Container Registry Repository Reader role to the kubelet identity by hand, scoped to the registry, and that manual assignment is the same shape as the role-assignment fix in the next section, just with a different role name. Knowing which access model your registry uses before you attach saves a frustrating loop where the attach reports success and the pull keeps failing.

A second subtlety is the cross-subscription registry. If the registry lives in a different subscription from the cluster, the attach still works, but you must reference the registry by its full resource identifier rather than by its short name, because the short name only resolves within the cluster’s own subscription. Supplying the resource identifier tells the attach exactly which registry across the subscription boundary to assign the role on, and the rest of the behavior is identical. The pull role is assigned to the kubelet identity on the named registry, and the pull starts working once it propagates.
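A cross-subscription attach might be sketched as follows: resolve the registry's full resource identifier in its own subscription, then pass that identifier to the attach. The function name attach_acr_by_id and all resource names are hypothetical, and the caller is assumed to have permission to create role assignments on the registry.

```shell
# Attach a registry that lives in another subscription by resource ID.
# Arguments: AKS resource group, cluster name, registry subscription,
# registry name.
attach_acr_by_id() (
  aks_rg=$1; cluster=$2; acr_subscription=$3; acr_name=$4

  # Look the registry up in its own subscription to get the full ID.
  acr_id=$(az acr show \
    --subscription "$acr_subscription" \
    --name "$acr_name" \
    --query id --output tsv) || exit 1

  # The short name would not resolve across the boundary; the ID does.
  az aks update \
    --resource-group "$aks_rg" \
    --name "$cluster" \
    --attach-acr "$acr_id"
)

# Usage (placeholders):
#   attach_acr_by_id aks-rg my-aks-cluster shared-services-sub myregistry
```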

Why does az aks check-acr say the pull will fail?

When check-acr reports that a pull will fail, it is doing more than restating the symptom; it is telling you which of two very different problems you have. The command separates authentication from connectivity, and reading which one it flags is the single most efficient diagnostic step in the whole exercise. If it reports it cannot authenticate to the registry, the kubelet identity does not hold the pull role, and you are in the missing-attach or missing-role world where the fix is to attach or to assign. If instead it reports it cannot reach the registry, that it timed out or could not resolve the registry’s name, the role might be perfectly in place and the real problem is a network block, which sends you to the firewall and private-endpoint cause rather than to a role assignment. The same unauthorized symptom in a pod describe collapses both of these into one message, and check-acr is the tool that un-collapses them.

The reason this is worth a dedicated subheading is that the wrong reading sends you down a fruitless path. If check-acr is failing on connectivity and you respond by assigning AcrPull over and over, you will assign a role that is already there or that does not matter, retry, and watch the pull fail identically, because the node still cannot reach the registry to present the role you keep granting. Conversely, if it is failing on authentication and you respond by opening firewall rules, you will widen the network for no reason while the actual missing role sits ungranted. Let the command tell you which half is broken, fix that half, and run the command again to confirm the fix landed before you retry the real workload. Treating check-acr as the arbiter of auth-versus-network is what keeps the repair on the rails.

Root cause two: the pulling identity is missing the pull role

The second cause looks like the first but is subtly different, and the difference is what makes it maddening to debug if you do not separate them. Here the registry recognizes the pulling identity, but that identity does not hold the pull role, so the registry authenticates the caller and then denies the read. This happens when the attach was run against the wrong cluster or the wrong registry, when an automation script created the cluster with a custom kubelet identity and never assigned the role to it, when someone detached and reattached registries and a stale assignment got left behind, or, most often on the non-AKS runtimes, when a managed identity was assigned to the app but never granted AcrPull on the registry. The symptom is the same unauthorized pull, but the underlying state is a recognized identity with insufficient permission rather than no permission path at all.

The confirming check is to list the role assignments for the pulling identity scoped to the registry and see whether AcrPull is among them. You read the identity’s object identifier the way the earlier section described, you take the registry’s resource identifier, and you list the assignments at that scope filtered to that assignee. If AcrPull is present, the role is not your problem and you should look elsewhere, most likely at the network or at an expired credential. If it is absent, you have found the cause, and the fix is to create the assignment. The absence is unambiguous in a way the unauthorized symptom is not, which is why this check is worth running even when you are fairly sure the attach was done, because attaches go to the wrong place often enough that confirming the assignment exists is cheaper than assuming it does.

# Resolve the identity and the registry, then list pull-role assignments
KUBELET_ID=$(az aks show -g aks-rg -n my-aks-cluster \
  --query "identityProfile.kubeletidentity.objectId" -o tsv)
ACR_ID=$(az acr show -g acr-rg -n myregistry --query "id" -o tsv)

az role assignment list \
  --assignee "$KUBELET_ID" \
  --scope "$ACR_ID" \
  --query "[].{role:roleDefinitionName}" \
  --output table

When the assignment is missing, you create it directly with az role assignment create, naming AcrPull as the role, the pulling identity’s object identifier as the assignee, and the registry’s resource identifier as the scope. Scoping to the registry rather than to a resource group or a subscription is the least-privilege choice, because it grants the identity the read on exactly the registry it needs and nowhere else. On an ABAC-enabled registry you substitute Container Registry Repository Reader for AcrPull, because that is the role that confers pull on that access model, and the rest of the command is identical. The assignment takes a short moment to propagate, the same as the attach, and then the recognized-but-denied pull becomes a recognized-and-allowed pull.

# Assign the pull role to the identity, scoped to the registry only
az role assignment create \
  --role "AcrPull" \
  --assignee-object-id "$KUBELET_ID" \
  --assignee-principal-type ServicePrincipal \
  --scope "$ACR_ID"

This cause is where the choice of identity model pays off, because a managed identity that holds AcrPull never expires, never needs a secret in the cluster, and is the cleanest thing to assign and audit. If your workloads or your nodes are not yet on managed identities, this failure is a good prompt to move them there, and the mechanics of setting up and assigning managed identities so they hold exactly the roles they need is the subject of our guide on setting up managed identities the right way. Getting the identity model right once removes a whole family of unauthorized pulls permanently, because there is no longer a secret to leak, rotate, or let expire, only a role assignment that either exists or does not.

Root cause three: an expired token or credential

The third cause is the one that breaks a pull that used to work, which makes it feel different from the first two even though it lands on the same unauthorized message. When a runtime authenticates to a registry with a credential that has a lifetime, that credential expires, and an expired credential authenticates as nobody. The classic version of this is a service principal whose client secret has reached its expiry date, used either as the cluster’s service principal in older clusters that predate managed identities or as the principal behind an image pull secret. The secret had a validity window, the window closed, and the next pull that tries to use it is refused because the registry cannot authenticate an expired secret. The tell is timing: the pull worked for weeks or months and then failed on a specific day, with no deployment or configuration change to explain it, because the change was the silent passing of an expiry date.

A second version of the same cause is a token-based pull secret built from a registry token or a short-lived access token that has aged out. Registry tokens and refresh tokens have lifetimes too, and a pull secret that captured one at a moment in time stops working when that token expires, exactly like the service principal secret. In both versions the registry is doing its job correctly: it received a credential, found it expired, and declined to authenticate it. The confirming check is to look at the expiry of the credential the pull secret or the service principal is using. For a service principal you query its credential metadata and read the end date; for a pull secret you inspect what it holds and trace that back to the token or secret it was built from and check that source’s expiry. If the credential is past its end date, the cause is confirmed and you do not need to look further.
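For the pull-secret half of that check, a minimal sketch of inspecting what a secret holds is shown here. It assumes a secret of the standard docker-registry type, whose payload lives under the .dockerconfigjson key. The function name show_pull_secret and the secret and namespace names are hypothetical.

```shell
# Decode an image pull secret so you can see which registry and username
# it targets. WARNING: the output includes the stored password or token,
# so avoid running this where the terminal is logged or shared.
# Arguments: secret name, namespace (defaults to "default").
show_pull_secret() {
  kubectl get secret "$1" \
    --namespace "${2:-default}" \
    --output jsonpath='{.data.\.dockerconfigjson}' | base64 -d
}

# Usage (placeholders): show_pull_secret acr-pull-secret my-namespace
# Note: some older macOS versions of base64 use -D instead of -d.
```

The username in the output tells you which service principal or token to trace back for its expiry. A username matching the registry name usually indicates admin credentials.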

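# Check the expiry of the service principal behind a pull, if one is in use
SP_ID=$(az aks show -g aks-rg -n my-aks-cluster \
  --query "servicePrincipalProfile.clientId" -o tsv)

# Managed-identity clusters report "msi" here and have no secret to expire
if [ "$SP_ID" = "msi" ] || [ -z "$SP_ID" ]; then
  echo "Cluster uses a managed identity; no service principal secret to check"
else
  az ad sp credential list \
    --id "$SP_ID" \
    --query "[].{end:endDateTime}" \
    --output table
fi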

The narrow fix is to rotate the credential: generate a fresh secret or token, update the pull secret or the cluster’s credential with the new value, and retry. That restores the pull, and if a rotation is all you can do right now, it is a legitimate stopgap. The durable fix, though, is to remove the expiry from the equation entirely by moving to a managed identity, because a managed identity has no secret that expires. A kubelet identity or an app identity holding AcrPull authenticates by exchanging the identity for a token Azure issues fresh on demand, so there is no stored credential aging quietly toward a failure date. Every credential you rotate is a credential that will expire again, which means the rotation fix guarantees a repeat of this exact incident on a future date, while the managed-identity fix ends the recurrence. When you find an expired-credential pull failure, treat the rotation as the immediate unblock and the move to a managed identity as the actual repair.
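As a sketch of the stopgap rotation for a pull secret backed by a service principal, the helper below resets the secret and rebuilds the Kubernetes secret in place. It assumes the first argument is the service principal's application (client) ID, which also serves as the registry username. The function name rotate_pull_secret and all resource names are hypothetical.

```shell
# Rotate a service principal secret and re-create the pull secret from it.
# Arguments: application (client) ID, secret name, namespace, login server.
rotate_pull_secret() (
  sp_id=$1; secret_name=$2; namespace=$3; login_server=$4

  # By default a reset replaces the existing secret, so anything else
  # still using the old value stops working once this succeeds.
  new_password=$(az ad sp credential reset --id "$sp_id" \
    --query password --output tsv) || exit 1

  # Render the secret client-side and apply it, so an existing secret of
  # the same name is updated instead of causing an "already exists" error.
  kubectl create secret docker-registry "$secret_name" \
    --namespace "$namespace" \
    --docker-server "$login_server" \
    --docker-username "$sp_id" \
    --docker-password "$new_password" \
    --dry-run=client --output yaml | kubectl apply --filename -
)

# Usage (placeholders):
#   rotate_pull_secret <app-client-id> acr-pull-secret my-namespace myregistry.azurecr.io
```

The new secret carries its own expiry, so note the date if you take this path, and plan the move to a managed identity before it arrives.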

Root cause four: the registry firewall or private endpoint blocks the node

The fourth cause is the impostor in the set, because it produces a failure that reads like an authorization problem but is really a connectivity problem. When a registry has network rules that restrict access, or when it is reachable only through a private endpoint, a pull from a node outside the allowed network never gets far enough to authenticate. The node tries to reach the registry, the network blocks the connection or the name fails to resolve to the private endpoint, and the runtime surfaces an error that, depending on how it bubbles up, can look indistinguishable from an unauthorized response even though no credential was ever evaluated. This is why check-acr earns its place: it separates the connectivity failure from the authentication failure, and a firewall block shows up there as an inability to reach the registry rather than an inability to authenticate to it.

There are two flavors. The first is a public registry with firewall rules that only permit specific networks, where the node’s outbound address or subnet is simply not on the allowed list, so the connection is refused at the network boundary. The second is a private registry exposed only through a private endpoint, where the registry has no public access at all and is reachable solely through a private IP in a virtual network, which means the node must be in a network with a route to that endpoint and, critically, must resolve the registry’s name to the private IP rather than to the now-disabled public one. The second flavor adds a DNS dimension that trips people up: even with perfect network routing, if the node resolves the registry name to its public address, the pull fails, because the public address is closed. Private DNS that maps the registry name to the private endpoint’s IP is as much a part of the fix as the network route.

The confirming check is again check-acr, read for connectivity rather than auth, supplemented by a name resolution test from the node’s perspective. If check-acr reports it cannot reach or resolve the registry, you are in this cause, and you should verify what the registry name resolves to from inside the node’s network and confirm the node subnet is either listed in the registry’s network rules or connected to the registry’s private endpoint. The fix follows the diagnosis: for the firewall flavor, add the node subnet, or the cluster’s outbound network, to the registry’s network rules so the connection is permitted; for the private-endpoint flavor, ensure a private endpoint exists for the registry in a network the nodes can route to and that a private DNS zone resolves the registry name to that endpoint’s address. Once the node can both reach and resolve the registry, the credential it was holding all along finally gets a chance to authenticate, and the pull succeeds.

# Inspect the registry network posture: public access and default action
az acr show \
  --name myregistry \
  --query "{publicNetworkAccess:publicNetworkAccess, defaultAction:networkRuleSet.defaultAction}" \
  --output table

# List the IP and virtual network rules the registry currently allows
az acr network-rule list --name myregistry
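
The name resolution half of the check can be sketched like this. It is an illustration under stated assumptions: is_private_ipv4 and describe_resolution are hypothetical helpers, the private endpoint is assumed to use an IPv4 address from an RFC 1918 range, and the lookup has to run from inside the node’s network, because resolving the name from a workstation says nothing about what the node sees.

```shell
# Succeed when an IPv4 address falls in an RFC 1918 private range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ipv4() {
  case $1 in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Put the address the registry name resolved to into words.
describe_resolution() {
  if is_private_ipv4 "$1"; then
    echo "private: the name resolves to a private endpoint address"
  else
    echo "public: the name resolves to a public address"
  fi
}

# Usage, from a shell inside the node's network (not run here):
#   nslookup myregistry.azurecr.io
#   describe_resolution 10.240.0.12
```

A public answer for a registry whose public access is disabled is the DNS symptom described above, and no role assignment will change it.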

The thing to resist in this cause is the reflex to assume an unauthorized message always means a credential problem and to start reassigning roles when the role was never the issue. If check-acr says the registry is unreachable, no role assignment will help, because the node cannot present the role it already holds. This is the cause most likely to send an engineer in circles, granting and re-granting AcrPull while the firewall quietly refuses every attempt, and the discipline that breaks the circle is to read check-acr for connectivity before you touch a single permission. Network first when the command says network, role first when the command says auth, and never both at once on a guess.

Root cause five: reliance on the disabled admin account

The fifth cause is self-inflicted and instructive, because it is the failure mode of the very shortcut this article warns against. The registry admin account is a single shared credential, a username and password the registry can hand out, that authenticates as the registry itself rather than as any particular identity. It is disabled by default on a new registry, and many organizations enforce that state deliberately as a security baseline because a shared credential cannot be attributed, cannot be scoped below full access, and tends to get copied into more places than anyone can track. The failure arises when a workload was set up to pull using an image pull secret built from the admin account’s credentials, and then someone disables admin, either as a hardening step or because a policy started enforcing it. Every pull that depended on the admin credential now fails as unauthorized, because the credential it was using no longer authenticates.

The confirming check is to ask the registry whether admin is enabled and to inspect the failing pull secret to see whether it holds the admin username and password. If admin is disabled and the pull secret contains the admin credential, the cause is confirmed: the workload was leaning on a credential that has been turned off. It is a satisfying diagnosis to reach because it is unambiguous, and it is an uncomfortable one because it means the environment was depending on the thing it should not have been depending on. The temptation at this exact moment is overwhelming and exactly wrong: re-enable admin, the pull starts working again, the incident closes, and nothing has been fixed, because the environment is right back to depending on a shared credential that the next hardening pass will disable again.

# Confirm whether the registry admin account is enabled
az acr show \
  --name myregistry \
  --query "adminUserEnabled" \
  --output tsv
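
The second half of the check, seeing whether the failing pull secret holds the admin credential, might look like the sketch below. It relies on the fact that the admin account’s username is the registry name. Both helper names are hypothetical, and the sed expression is an assumption-laden shortcut: it expects a decoded .dockerconfigjson with a single registry entry and no escaped quotes in the username.

```shell
# Extract the username value from a decoded .dockerconfigjson document.
# This uses sed rather than a JSON parser, so it assumes a single registry
# entry and a username without escaped quotes.
pull_secret_username() {
  printf '%s\n' "$1" |
    sed -n 's/.*"username"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Succeed when the pull secret's username equals the registry name, which is
# the username the ACR admin account uses.
uses_admin_credential() {
  registry=$1
  config_json=$2
  [ "$(pull_secret_username "$config_json")" = "$registry" ]
}

# Usage (not run here); the secret name is hypothetical:
#   config=$(kubectl get secret acr-pull-secret \
#     --output jsonpath='{.data.\.dockerconfigjson}' | base64 -d)
#   uses_admin_credential myregistry "$config" && echo "secret holds the admin credential"
```

A username that is a GUID or a token name instead points at a service principal or a registry token, which sends you back to the expired-credential cause rather than this one.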

The right fix is to use the disabling of admin as the forcing function it should be and replace the admin-based pull with a managed identity holding AcrPull. Assign the pull role to the kubelet identity or the app identity as the earlier sections described, point the runtime at that identity, remove the admin-based pull secret, and leave admin disabled. The pull now authenticates as a scoped, auditable, per-resource identity that no policy will turn off, and the dependency on the shared credential is gone for good. This is the cause that most directly tests whether you believe the attach-or-role rule, because the rule’s whole point is that the admin account is never the answer, and the moment of an admin-disabled outage is precisely when that belief is hardest to hold and most worth holding.
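
On AKS, the replacement could be sketched as follows. Treat it as one possible shape rather than the definitive procedure: the function names are hypothetical wrappers, the resource names reuse the placeholders from the earlier commands, and the sketch assumes the cluster already runs with a managed kubelet identity and that you hold permission to create role assignments on the registry.

```shell
# Look up the object ID of the cluster's kubelet identity.
kubelet_object_id() {
  az aks show --resource-group "$1" --name "$2" \
    --query "identityProfile.kubeletidentity.objectId" --output tsv
}

# Look up the full resource ID of the registry, used as the assignment scope.
registry_resource_id() {
  az acr show --name "$1" --query id --output tsv
}

# Assign AcrPull to the kubelet identity, scoped to the one registry.
grant_acr_pull() {
  resource_group=$1 cluster=$2 registry=$3
  az role assignment create \
    --assignee-object-id "$(kubelet_object_id "$resource_group" "$cluster")" \
    --assignee-principal-type ServicePrincipal \
    --role AcrPull \
    --scope "$(registry_resource_id "$registry")"
}

# Usage (not run here); the secret name is hypothetical:
#   grant_acr_pull aks-rg my-aks-cluster myregistry
#   kubectl delete secret acr-admin-pull-secret --namespace default
```

Remove the matching imagePullSecrets entry from the workload spec as well, so the pull falls through to the kubelet identity instead of referencing a secret that no longer exists.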

Should you enable the ACR admin account to unblock the pull?

No, and it is worth being unambiguous about why, because this is the single most common wrong turn in the whole topic. Enabling the admin account does make the pull succeed, which is exactly what makes it dangerous, because a fix that works is far harder to argue against than one that does not. The cost is paid later and elsewhere. The admin credential is shared, so every workload that uses it is indistinguishable from every other workload in any audit, and a leaked admin password compromises the entire registry rather than one identity’s read access. The admin credential cannot be scoped below full registry access, so a workload that only needs to pull one repository ends up holding a credential that can do everything. And the admin credential is a secret that lives in pull secrets and configuration, which means it leaks the way all secrets leak, into logs, into repositories, into screenshots, into the clipboard history of whoever set it up.

Against all of that, the managed-identity fix costs one role assignment. The asymmetry is the entire argument. Enabling admin trades a clean, attributable, least-privilege model for a shared, unscoped, leakable one, and it does so to save the minute it takes to assign AcrPull to the right identity. The only defensible use of the admin account is a genuinely temporary, time-boxed unblock with a committed follow-up to remove it, and even then the honest version of that plan usually reveals that doing the role assignment now is faster than doing the admin enable now plus the cleanup later. When the pull is unauthorized and the admin switch is right there, the discipline is to ask the rule’s question instead, find the identity that should be allowed, and grant it the role.

Root cause six: the cross-tenant pull without access

The sixth cause is the specialist one, the failure that appears when the registry and the pulling identity live in different Microsoft Entra tenants. Role assignments are a within-tenant construct: you grant AcrPull to a principal, and both the principal and the registry have to be addressable within the same directory for the assignment to mean anything. When a cluster in one organization’s tenant needs to pull from a registry in another organization’s tenant, the clean managed-identity path does not directly apply, because you cannot assign a role in the registry’s tenant to an identity that lives in yours as if they were in the same directory. The pull is unauthorized not because a role is missing in the ordinary sense but because there is no in-tenant principal for the registry to grant the role to.

The confirming check is to compare the tenant of the registry with the tenant of the pulling identity, which you read from each resource’s directory context, and to confirm that no cross-tenant assignment or pull secret currently exists. If the two tenants differ and the pull is failing, you are in this cause, and the ordinary attach-or-assign fix will not work because there is no shared directory in which to make the assignment. This is the one cause where the managed-identity ideal bends, because identities do not span tenants the way a single organization’s environment assumes.
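
One way to make the tenant comparison concrete is sketched here. It assumes you can sign in to both directories and that the registry and the cluster live in separate subscriptions; the subscription names and both helper names are hypothetical.

```shell
# Print the tenant ID that owns a subscription.
subscription_tenant() {
  az account show --subscription "$1" --query tenantId --output tsv
}

# Succeed when two tenant IDs are non-empty and match, ignoring letter case.
same_tenant() {
  a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (not run here); both subscription names are hypothetical:
#   registry_tenant=$(subscription_tenant registry-subscription)
#   cluster_tenant=$(subscription_tenant cluster-subscription)
#   same_tenant "$registry_tenant" "$cluster_tenant" || echo "cross-tenant pull"
```

An empty tenant ID is treated as a mismatch on purpose, so a failed lookup never passes for a same-tenant result.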

The fix is to bridge the tenant boundary with a credential that the registry’s tenant can issue and the pulling tenant can present. The pragmatic path is an image pull secret built from a token or a service principal that exists in the registry’s tenant and holds AcrPull there, which the runtime in your tenant then presents on the pull. You are still granting the pull role, you are just granting it to a principal that lives where the registry lives and then carrying that principal’s credential across the boundary in a pull secret. The same applies on the non-AKS runtimes and in pipelines: the principal that authenticates must be one the registry’s tenant recognizes and has granted the pull role. It is the one situation where a pull secret is the right tool rather than a fallback, because the cross-tenant boundary is exactly the case managed identities do not cross, and a credential issued in the registry’s tenant is the supported way through.

Prevention: stopping the unauthorized pull from recurring

Fixing the pull in front of you is satisfying, but the more valuable work is making sure this particular incident does not return, because unauthorized pulls are almost always preventable and almost always recur in environments that treat each one as a fresh surprise. The single highest-leverage prevention is to standardize on managed identities for every pull path and to retire stored credentials entirely. A kubelet identity holding AcrPull, an app identity holding the pull role, a pipeline using workload identity federation: none of these carry a secret that can expire, none can be copied into a place it should not be, and each is a first-class assignment you can list and audit. Most of the recurring causes in this article, the expired credential, the disabled-admin dependency, even some of the missing-role confusion, simply cannot happen in an environment that has no stored pull credentials to expire or to lean on.

The second prevention is to make the role assignment part of the infrastructure definition rather than a manual step someone remembers to run. When the cluster and the registry are provisioned together in code, the attach or the AcrPull assignment lives in the same template, so a cluster is never created without its pull permission, and a new environment stood up from the same definition inherits the working relationship automatically. The failure where the cluster and registry exist but were never connected is purely an artifact of provisioning them by separate manual steps; defining them together makes that gap impossible. The same principle extends to the pipeline path, where the pull role for the deploy identity is declared alongside the pipeline rather than granted ad hoc, and our coverage of Azure Container Registry in CI/CD shows how a pipeline’s pull identity should be wired so a release never discovers its missing permission at deploy time.

The third prevention is to validate the pull path before the workload depends on it, not after it fails. The check-acr command exists precisely so you can confirm, ahead of any real deployment, that a cluster can authenticate to and reach a registry, and running it as a gate after any change to a cluster’s identity, a registry’s network rules, or a registry’s access model catches the break before a user-facing rollout does. The cheapest unauthorized pull is the one a validation step caught in a pipeline, and the most expensive is the one a customer caught in production, and the only difference between them is whether someone ran the check. Building that check into the path that changes clusters and registries turns the validation from a thing you do during an incident into a thing that prevents the incident.
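
A gate of that kind could be sketched as below. The important assumption is the success marker: the default text is a guess at the wording the command prints when the cluster can pull, not a documented contract, so confirm it against your own CLI output and override SUCCESS_MARKER if it differs. The function name is hypothetical.

```shell
# Run check-acr and succeed only if its output contains the success marker.
# ASSUMPTION: the default marker is a guess at the success wording; verify it
# against your own output and set SUCCESS_MARKER to match.
acr_pull_gate() {
  resource_group=$1 cluster=$2 login_server=$3
  marker=${SUCCESS_MARKER:-can pull images}
  output=$(az aks check-acr \
    --resource-group "$resource_group" \
    --name "$cluster" \
    --acr "$login_server" 2>&1)
  printf '%s\n' "$output"
  case $output in
    *"$marker"*) return 0 ;;
    *) echo "check-acr did not report success; blocking the rollout" >&2
       return 1 ;;
  esac
}

# Usage in a pipeline step (not run here):
#   acr_pull_gate aks-rg my-aks-cluster myregistry.azurecr.io || exit 1
```

The gate prints the full check-acr output before deciding, so a failed run still leaves the auth-versus-connectivity verdict in the pipeline log.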

The fourth prevention is auditing what holds the pull role and what carries pull credentials, on a schedule rather than only after an incident. Periodically listing the AcrPull assignments on each registry tells you which identities can pull and surfaces both the assignments that should not exist and the workloads that are pulling through a stored credential instead of an identity. Periodically checking whether the admin account is enabled, and whether anything still depends on it, surfaces the disabled-admin failure before a hardening pass triggers it. This is the kind of repeated, scenario-shaped practice that is far easier to build as a habit than to improvise under pressure. Working through reproductions of each unauthorized-pull pattern is exactly what the scenario-based troubleshooting drills on ReportMedic are built for, and the hands-on Azure labs and command library on VaultBook give you a place to stand up a cluster, attach a registry, break the pull deliberately, and watch each cause and fix behave the way this article describes. Reproducing the failure in a sandbox is what turns reading about the attach-or-role rule into being able to apply it in the dark.
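
The two audit questions can be sketched as a pair of small functions suitable for a scheduled job. The function names are hypothetical, and the sketch assumes the account running it can read role assignments on the registry.

```shell
# List every principal holding AcrPull on the registry.
list_acr_pull_assignments() {
  registry_id=$(az acr show --name "$1" --query id --output tsv)
  az role assignment list \
    --scope "$registry_id" \
    --role AcrPull \
    --query "[].{principal:principalName, type:principalType, scope:scope}" \
    --output table
}

# Report the admin account state in words. Both spellings of the boolean are
# accepted because the tsv rendering can differ between CLI versions.
admin_state() {
  enabled=$(az acr show --name "$1" --query adminUserEnabled --output tsv)
  case $enabled in
    true|True) echo "admin ENABLED on $1: find and migrate whatever depends on it" ;;
    *) echo "admin disabled on $1" ;;
  esac
}

# Usage (not run here):
#   list_acr_pull_assignments myregistry
#   admin_state myregistry
```

Diffing the assignment list between runs is a cheap way to notice a pull role that disappeared during a migration before a rollout notices it for you.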

An unauthorized pull sits in a neighborhood of similar-looking container failures, and telling it apart from its neighbors is half the diagnostic work, because the wrong neighbor sends you to the wrong fix. The closest neighbor is the manifest-not-found failure, where the image reference points at a repository or a tag that does not exist in the registry. That failure can surface in the same back-off loop and the same describe output, but its root is a typo or a missing push, not a permission, and no role assignment will conjure an image that was never pushed. The way to keep them separate is to confirm the image exists in the registry before you assume the failure is about permission; if the tag is not there, you have a manifest problem wearing the costume of an auth problem, and the fix is to push the image or correct the reference.
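
Confirming that the tag exists can be sketched in a few lines. The repository name myapp and the tag are hypothetical, tag_exists is an illustrative helper, and the sketch assumes your own account is allowed to read the registry, since the check runs as you and not as the failing workload.

```shell
# Succeed when the exact tag is present in the repository.
tag_exists() {
  registry=$1 repository=$2 tag=$3
  az acr repository show-tags \
    --name "$registry" \
    --repository "$repository" \
    --output tsv | grep -Fxq -- "$tag"
}

# Usage (not run here):
#   tag_exists myregistry myapp v1.1.0 || echo "tag missing: push it or fix the reference"
```

The match is on the whole line, so a reference to v1.1 does not pass just because v1.1.0 exists.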

The second neighbor is the broad pull back-off that the cluster reports when an image cannot be fetched for any reason, the symptom our piece on diagnosing AKS ImagePullBackOff and ErrImagePull treats as a family. Unauthorized is one member of that family, but the family also includes the manifest-not-found case, transient registry unavailability, rate limiting on public registries, and pure network timeouts that have nothing to do with credentials. When you see the back-off state, the unauthorized message inside the event is what narrows it to this article’s territory; without that specific message, you might be chasing a different member of the family entirely, and reading the event detail rather than the state name is what tells you which one you have.
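
Reading the event detail can be turned into a rough first-pass classifier. The substrings in this sketch are assumptions about typical runtime and registry wording, which varies by runtime and version, so treat the result as a hint about where to look rather than a verdict. The function name and pod name are hypothetical.

```shell
# Map the text of a failed-pull event to the family member it most likely is.
# ASSUMPTION: these substrings are typical, not guaranteed; adjust them to the
# messages your runtime actually prints.
classify_pull_failure() {
  message=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $message in
    *unauthorized*|*"authentication required"*) echo "unauthorized" ;;
    *"manifest unknown"*|*"not found"*) echo "manifest-not-found" ;;
    *toomanyrequests*|*"too many requests"*) echo "rate-limited" ;;
    *timeout*|*"no such host"*|*"connection refused"*) echo "network" ;;
    *) echo "unknown" ;;
  esac
}

# Usage (not run here): copy the message from the pod's events.
#   kubectl describe pod my-pod --namespace default
#   classify_pull_failure "failed to authorize: 401 Unauthorized"
```

Only the first result belongs to this article; the others send you to the neighboring failures described here.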

The third neighbor is the runtime-specific failure that happens after the image pulls successfully. On Azure Container Apps a revision can fail to provision for reasons that have nothing to do with the pull, a failing probe, a wrong target port, a missing secret, and it is easy to assume the revision failed because of the registry when the image actually pulled fine and the failure came later. Reading whether the image was retrieved before the failure separates a genuine unauthorized pull from a downstream revision problem, and the broader behavior of how a Container Apps revision starts and where it can fail is covered in our explainer on how Azure Container Apps works. The general discipline across all three neighbors is the same: read the exact stage and the exact message, not just the headline state, because the headline is shared and the cause is not.

A fourth confusion worth naming is the authentication-versus-authorization split inside the unauthorized symptom itself, which the diagnostic section introduced and which deserves repeating because it is so easy to blur. An authentication failure means the registry could not work out who the caller is, which points at a missing identity, an expired credential, or a network block that stopped the handshake. An authorization failure means the registry knows exactly who the caller is and has decided that caller may not read the repository, which points squarely at a missing pull role. The everyday word for both is unauthorized, and the registry’s precise response, and the check-acr verdict, are what let you tell which of the two you are actually looking at. Resolving the split correctly is what stops you from assigning roles to fix an authentication problem or rotating credentials to fix an authorization one.

How the pull authenticates under the hood

It is worth holding a slightly deeper model of the pull handshake, because the deeper model makes every cause above feel inevitable rather than arbitrary. When a runtime wants an image from a private registry, it does not present a long-lived password on the wire for each blob. Instead it performs a token exchange: it authenticates once, with whatever credential it holds, to the registry’s token service, and receives a scoped access token that authorizes the specific pull operation it is about to perform. The credential it authenticates with is the variable. For a managed identity, the runtime obtains an Entra token for the identity and exchanges that for the registry access token, with no stored secret anywhere in the path. For a service principal or an admin account, it authenticates with that credential to obtain the same kind of access token. For an anonymous request, it presents nothing and the token service issues, at most, a token scoped to whatever anonymous pull the registry allows, which on a private registry is nothing.
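
The first step of that exchange is visible from any shell: an unauthenticated request to the registry’s /v2/ endpoint is answered with a 401 and a challenge header naming the token service. The sketch below parses such a header. challenge_param is a hypothetical helper, and the parsing assumes the usual key="value" layout of a Bearer challenge.

```shell
# Pull one key="value" parameter out of a Bearer challenge header.
challenge_param() {
  header=$1 key=$2
  printf '%s\n' "$header" | sed -n "s/.*[ ,]$key=\"\([^\"]*\)\".*/\1/p"
}

# Usage (not run here): fetch the challenge, then read where the token
# service lives and which service name the token must be issued for.
#   header=$(curl -sI https://myregistry.azurecr.io/v2/ | grep -i '^www-authenticate')
#   challenge_param "$header" realm
#   challenge_param "$header" service
```

The realm is the endpoint the runtime authenticates against, which is why a credential problem and a network problem can both stop the pull before any image data moves.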

Seeing the handshake this way explains each cause cleanly. The missing-attach and missing-role causes are failures at the authorization step of the token exchange: the registry authenticates the identity but the scoped token it would issue carries no pull permission, because the identity holds no pull role, so the operation is refused. The expired-credential cause is a failure at the authentication step: the credential presented to the token service is no longer valid, so no access token is issued at all. The network cause is a failure before the handshake even begins: the runtime cannot reach the token service or the registry endpoint, so there is no exchange to succeed or fail on its merits. The admin-disabled cause is a special case of the authentication failure, where the specific shared credential the runtime was presenting has been turned off. And the cross-tenant cause is a structural failure of the authorization step, because the token service in the registry’s tenant has no record of an identity from another tenant to grant a scoped token to.

The scoped access token the handshake produces is itself short-lived by design, which is a feature rather than a limitation. Because the token authorizes only the operation it was issued for and expires quickly, a token that does leak is worth almost nothing to whoever captures it, and the long-lived thing behind it, the managed identity, never travels on the wire at all. This is the opposite of the admin-account posture, where a single long-lived shared secret travels everywhere and is worth everything to anyone who captures it. The token service is doing the heavy lifting silently on every fetch, minting a fresh, narrowly scoped, short-lived authorization each time, and the only durable secret an engineer ever has to think about is the one they choose to store, which with a managed identity is none at all. Understanding that the durable credential and the on-the-wire token are different objects is what makes the security argument for managed identities click into place, because it shows that the model is not merely tidier but structurally safer at the level of what an attacker could actually steal.

This is also why the managed-identity path is so much more robust than the alternatives, and why so much of the prevention advice points at it. A managed identity collapses the credential question into an identity question. There is no secret to authenticate with, only an identity Azure vouches for on demand, so the authentication step cannot fail on an expired secret, the credential cannot leak because there is no stored credential, and the audit trail is the identity itself rather than a shared username. Every cause in this article that is about a credential rather than a role, the expired token, the disabled admin, the leaked secret you have not noticed yet, is a cause that the managed-identity model removes at the root. The model is not just cleaner to administer; it eliminates entire categories of the failure rather than making them easier to fix.

A worked example from symptom to fix

It helps to walk a realistic case all the way through, because the abstract rule becomes concrete only when you watch it triage a real failure. Picture a deployment to a cluster that has been running fine for months. A new microservice is rolled out, and its pods immediately settle into a back-off loop. A describe on one of the pods shows the kubelet’s failed pull attempts and, inside the events, the registry’s unauthorized response naming a fully qualified image reference. The instinct under deadline is to assume the registry permissions are broken and to start reassigning roles, but the discipline is to read the signal first, and the first thing to read is the image reference itself.

Reading it reveals the first surprise: the login server in the reference is the development registry, not the production registry the cluster is attached to. The new microservice’s manifest was built in a development pipeline that pushed to the development registry, and the reference was never updated to point at production when the service was promoted. No role assignment on the production registry would ever fix this, because the kubelet is not asking the production registry for the image; it is asking the development registry, which the cluster has no access to. The fix here is not a permission at all. It is correcting the image reference, or re-tagging and pushing the image to the production registry the cluster can read, after which the pull succeeds because the kubelet is finally asking a registry it is authorized against. This is the half of unauthorized pulls that are not really permission failures, and reading the reference first is what catches them before an hour is lost reassigning roles that were never the issue.
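
Reading the reference first takes only a few lines. This sketch assumes the reference is fully qualified, meaning it starts with a registry host; the registry, repository, and pod names are hypothetical, as are the two helpers.

```shell
# Print the registry host portion of a fully qualified image reference.
image_login_server() {
  printf '%s\n' "${1%%/*}"
}

# Succeed when the reference points at the expected login server.
references_registry() {
  [ "$(image_login_server "$1")" = "$2" ]
}

# Usage (not run here):
#   kubectl get pod my-pod -o jsonpath='{.spec.containers[*].image}'
#   references_registry devregistry.azurecr.io/payments/api:1.4.2 myregistry.azurecr.io \
#     || echo "reference names a different registry"
```

If the comparison fails, stop there: the repair is to the reference or the push target, not to any role.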

Now picture a second case that looks identical at the headline but resolves differently. The reference is correct, naming the production registry the cluster is attached to, and the pull is still unauthorized. The instinct again is to reassign, but the discipline is to run check-acr. It comes back reporting that the cluster cannot authenticate to the registry. That single result rules out the network and points squarely at a missing role, so the next step is to read the kubelet identity’s object identifier and list its assignments on the registry. The list comes back empty: no AcrPull. The attach was run months ago against the registry, but a recent operation detached and reattached registries during a migration, and the reattach went to a different registry by mistake, leaving the production registry without its assignment. The fix is to attach the production registry again, or to assign AcrPull to the kubelet identity directly, scoped to that registry, then wait a minute or so for the assignment to propagate and retry, at which point the pull succeeds.

Picture a third case to round out the pattern, the one that defeats the reassign reflex entirely. The reference is correct and check-acr is run, but this time it reports that the cluster cannot reach the registry, a connectivity failure rather than an authentication one. Reassigning AcrPull here would be wasted motion, because the role is already in place and the node simply cannot get to the registry to present it. Reading the registry’s network posture shows it was recently moved behind a private endpoint as part of a security initiative, its public access disabled, and the cluster’s nodes are resolving the registry name to the now-closed public address because no private DNS zone was created to map it to the private endpoint. The fix is to create the private DNS zone, link it to the node network, and ensure a private endpoint exists with a route from the nodes, after which the nodes resolve the registry to its private address, the connection succeeds, and the AcrPull role that was there all along finally gets to do its job. Three identical-looking symptoms, three different causes, three different fixes, and the only thing that separated them was reading the signal before acting.

The lesson the three cases teach together is the order of operations, not any single command. Confirm the reference names the registry you intend. Run check-acr and read auth versus connectivity. Then, and only then, branch to the fix the confirmed cause calls for, whether that is correcting a reference, granting a role, or repairing the network. An engineer who runs that sequence resolves all three cases quickly and correctly, while an engineer who skips straight to reassigning roles resolves only the middle one and burns time fighting the other two with a tool that does not apply. The sequence is short enough to hold in your head during an incident, and holding it is what turns a stressful outage into a routine fix.

What to watch after the fix lands

Resolving the pull is not quite the end of the job, because a fix that is not verified is a fix you will be back to revisit, and there are a few things worth watching to confirm the repair is real and durable. The first is the pod or revision itself: after the role assignment propagates and you retry, watch the pull actually succeed in the events rather than assuming success because you ran the right command. The kubelet records a successful pull the same way it recorded the failures, and seeing that successful pull event is the proof that the credential and the role and the network all lined up. If the pull still fails after the propagation window, the events will tell you whether it is the same unauthorized message, which means the fix did not land where you intended, or a new message, which means you have uncovered a second cause stacked behind the first.
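
Watching for that successful pull can be sketched as follows. The pod name is hypothetical, both functions are illustrative wrappers, and the check assumes the kubelet records a successful pull under the event reason Pulled.

```shell
# Show the events recorded for one pod, oldest first.
pull_events() {
  kubectl get events \
    --namespace "${2:-default}" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Succeed once a "Pulled" event has been recorded for the pod.
pull_succeeded() {
  pull_events "$1" "$2" | grep -q 'Pulled'
}

# Usage (not run here):
#   pull_events my-pod default
#   pull_succeeded my-pod default && echo "pull confirmed in events"
```

Events age out of the cluster after a while, so run the check soon after the retry rather than the next morning.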

The second thing to watch is the scope of what you fixed, because it is easy to fix one workload’s pull and leave a sibling broken. If the cause was a missing role on a kubelet identity, every workload on that cluster pulling from that registry is fixed at once, which is the elegant property of fixing it at the identity level. But if you reached for a per-workload pull secret as a stopgap, you have fixed only the workloads that reference that secret, and the next deployment that does not reference it will fail identically. This is one more reason the managed-identity fix is the durable one: it repairs the pull for the whole cluster’s relationship to the registry rather than for one manifest, so you are not left with a patchwork of fixed and unfixed workloads that each need their own attention.

The third thing to watch is whether the fix introduced a dependency you will need to maintain. A managed identity holding AcrPull is essentially maintenance-free, which is the point, but a rotated credential or a freshly created pull secret is a future expiry waiting to happen, and a private DNS zone or a network rule you added is a piece of configuration that now has to be carried forward into every rebuild of that environment. Capturing these in the infrastructure definition, rather than leaving them as manual changes someone made during an incident, is what keeps the fix from quietly decaying. The most common way a resolved unauthorized pull returns is that the fix was applied by hand during an outage, never written into the definition the environment is rebuilt from, and silently lost the next time the environment was recreated. Writing the fix into code is the last step that makes it stick.

The fourth thing to watch, over the longer horizon, is the registry’s access model and admin state, because both can change underneath a working pull. A registry that is migrated to the attribute-based access model will start requiring the repository reader role where AcrPull used to suffice, and a working attach will silently stop covering new clusters. A registry whose admin account is disabled by a future policy will break any pull that was quietly leaning on it. Neither of these announces itself until a pull fails, so the way to stay ahead of them is the periodic audit described in the prevention section: list what holds the pull role, check the admin state, and confirm the access model, on a cadence rather than on incident. The fix you land today is correct for today’s configuration, and watching the configuration is what keeps it correct as the registry evolves.

The verdict

An unauthorized pull from Azure Container Registry is, underneath the alarming message and the stuck rollout, a small and well-bounded problem. The registry asked the caller to prove it is allowed to read, the caller could not, and the fix is to make the caller able to prove it, by the supported path for the platform it runs on. The attach-or-role rule is the whole of it: a pull is authorized by the pull role on the pulling identity, on AKS the attach grants that role to the kubelet identity, on the other runtimes you assign it to the managed identity directly, and an unauthorized pull is therefore a missing attach or a missing role and never a reason to enable the registry admin account. Hold that rule and the diagnosis becomes a short decision tree rather than a panic, and the fix becomes a single command rather than a gamble.

The discipline that separates a clean fix from a mess is reading the signal before acting. Confirm the image reference names the registry you think it does. Identify the exact identity the runtime presents, because the fix is a property of that identity and applying it to the wrong one is the most common reason a correct fix appears not to work. Run check-acr and read whether it fails on authentication or on connectivity, because those send you to entirely different repairs and the unauthorized symptom hides which one you have. Then apply the least-privilege fix for the confirmed cause: attach or assign the role for a permission problem, fix the network for a reachability problem, and move to a managed identity for a credential problem, which doubles as the prevention that stops the credential problem from ever returning. The admin account stays off, because the rule already told you it is not the answer.

The strategic verdict is that the right fix and the right prevention are the same move. Every time you resolve an unauthorized pull by assigning the pull role to a managed identity rather than by enabling a shared credential, you are not just unblocking today’s deployment; you are removing a class of future failures, because a managed identity holding a scoped role cannot expire, cannot leak a stored secret, and cannot be disabled by a hardening pass. The unauthorized pull is an invitation to do the durable thing, and the engineers who take that invitation stop seeing this error, while the ones who reach for admin keep meeting it again on the next rotation and the next policy change. Fix the cause you have, fix it with least privilege, and you will not be back here.

Frequently Asked Questions

Q: Why is my image pull from ACR unauthorized?

The registry received the pull request, looked at the credential the runtime presented, and decided it does not entitle the caller to read the repository. In practice that means one of a few things: the pulling identity holds no pull role on the registry, the registry was never attached to the cluster so the kubelet identity has nothing the registry accepts, a stored credential the runtime was using has expired, or the runtime is pulling anonymously because no identity or pull secret was ever configured. The first move is to confirm the image reference actually names the registry you think you granted access to, because a reference to the wrong registry produces the same message regardless of any role you assign. Once the reference is right, identify the exact identity the runtime presents and check whether it holds the AcrPull role on that registry. The fix is to grant that identity the pull role, by attaching the registry to the cluster on AKS or assigning the role directly elsewhere, not to enable the registry admin account.

Q: Does AKS need ACR attached to pull images from a private registry?

For a private registry, yes, in the sense that the kubelet identity the cluster uses to pull must hold the pull role on the registry, and attaching the registry is the supported way to grant that role on AKS. A freshly created cluster and a separately created registry are not connected automatically, so the very first pull of a private image is unauthorized until you attach. The attach command takes the kubelet identity, finds the registry, and creates the AcrPull role assignment scoped to that registry, using your own permissions to make the assignment, after which the kubelet can pull every repository in that registry with no stored secret. You can achieve the same result without the attach command by assigning AcrPull to the kubelet identity by hand. On a registry that uses the attribute-based access model, the manual path is the only one that works: the attach command does not wire up the pull role for that model, so you assign the Container Registry Repository Reader role to the kubelet identity yourself.

Q: Can an expired token or credential cause an unauthorized pull?

Yes, and it is the cause behind a pull that worked for weeks or months and then failed on a specific day with no deployment to explain it. A service principal client secret, a registry token, or a refresh token captured into an image pull secret all have lifetimes, and when the lifetime ends the credential authenticates as nobody, so the next pull is refused. The tell is timing: the failure correlates with an expiry date rather than with any change you made. Confirm it by reading the expiry of the credential the runtime is using, querying the service principal’s credential end date or tracing the pull secret back to the token it was built from. The narrow fix is to rotate the credential and update the pull secret, which restores the pull but guarantees a repeat on the next expiry. The durable fix is to move to a managed identity, which has no stored secret to expire, removing the cause at the root rather than resetting its clock.
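
Reading the expiry can be scripted. The sketch below assumes you have already captured the JSON list printed by az ad sp credential list for the service principal, and that each entry carries keyId and endDateTime fields in ISO 8601 form; the sample values in the usage note are made up.

```python
from datetime import datetime, timezone


def expired_credentials(credentials, now=None):
    """Return the keyId of every credential whose endDateTime has passed.

    `credentials` is the parsed JSON list of credential entries.
    """
    now = now or datetime.now(timezone.utc)
    expired = []
    for cred in credentials:
        # Older Python versions do not parse a trailing "Z", so normalise it.
        end = datetime.fromisoformat(cred["endDateTime"].replace("Z", "+00:00"))
        if end.tzinfo is None:
            end = end.replace(tzinfo=timezone.utc)  # assume UTC when unmarked
        if end <= now:
            expired.append(cred["keyId"])
    return expired
```

If the failure date lines up with an entry this returns, for example a credential ending 2024-01-01T00:00:00Z checked on a later date, you have confirmed the expiry cause without changing anything.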

Q: Does the pulling identity need the AcrPull role?

Yes. On a registry using the classic role-based access model, the principal that authenticates to the registry must hold AcrPull, the built-in role that grants exactly the permission to read and download images and nothing more. On AKS that principal is the kubelet identity, on Container Apps and Container Instances it is the managed identity you assigned to the resource, and in a pipeline it is the service connection’s principal or federated identity. The single most common reason a correct-looking fix fails is that AcrPull was granted to the wrong identity, such as the control plane identity or a workload identity rather than the kubelet identity, so confirm which principal actually does the pulling before you assign. On a registry that has been moved to the attribute-based access model, the pull role is Container Registry Repository Reader rather than AcrPull, and assigning AcrPull there leaves the pull unauthorized because that role does not confer pull on that model, so verify which access model the registry uses before assigning.
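
The role-by-model rule is small enough to encode directly. In this sketch the access-model labels "classic-rbac" and "rbac-plus-abac" are hypothetical keys invented for the example, not values returned by any Azure API; only the two role names come from the text above.

```python
# Keys are illustrative labels; map them from however you record the
# registry's access model in your own tooling.
PULL_ROLE_BY_ACCESS_MODEL = {
    "classic-rbac": "AcrPull",
    "rbac-plus-abac": "Container Registry Repository Reader",
}


def required_pull_role(access_model):
    """Return the role that confers pull under the given access model."""
    try:
        return PULL_ROLE_BY_ACCESS_MODEL[access_model]
    except KeyError:
        raise ValueError(f"unknown access model: {access_model!r}") from None


def missing_pull_role(access_model, assigned_roles):
    """Return the role that still needs assigning, or None if it is held."""
    required = required_pull_role(access_model)
    return None if required in assigned_roles else required
```

Calling missing_pull_role("rbac-plus-abac", ["AcrPull"]) returns the reader role, which is the case where a correct-looking AcrPull assignment still leaves the pull unauthorized.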

Q: Should I enable the ACR admin account or use RBAC to fix the pull?

Use role-based access, not the admin account. Enabling the admin account does make the pull succeed, which is exactly what makes it the wrong instinct, because a fix that works is harder to argue against than one that does not. The admin credential is shared, so every workload using it is indistinguishable in any audit and a single leak compromises the whole registry; it cannot be scoped below full access, so a workload that only needs to pull one repository ends up holding a credential that can do everything; and it is a stored secret that leaks the way all secrets leak. Against that, the role-based fix costs one assignment of AcrPull to the right identity, scoped to the registry only. The asymmetry is the entire argument. The only defensible use of admin is a strictly time-boxed unblock with a committed follow-up to remove it, and even then the role assignment is usually faster to do now than the admin enable plus the later cleanup.

Q: Can a registry firewall or private endpoint block the pull and look unauthorized?

Yes, and this is the impostor cause that sends engineers in circles. When a registry has network rules that permit only specific networks, or is reachable only through a private endpoint, a node outside the allowed network cannot reach the registry to authenticate, and the failure can surface in a way that reads like an unauthorized response even though no credential was ever evaluated. The way to tell is to run check-acr and read whether it fails on connectivity rather than authentication; a network failure shows up as an inability to reach or resolve the registry. For a firewall block, add the node subnet or the cluster’s outbound network to the registry’s network rules. For a private-endpoint-only registry, ensure a private endpoint exists in a network the nodes can route to and that a private DNS zone resolves the registry name to the endpoint’s private address, because resolving the name to the disabled public address fails even with perfect routing.
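
The DNS half of the private-endpoint check can be approximated from inside the cluster network. The following is a rough sketch under two assumptions: it must run somewhere that shares the nodes' DNS view (a debug pod, for instance), and the hostname you pass is your own registry's login server. The resolver parameter exists only so the logic can be exercised without a network.

```python
import ipaddress
import socket


def resolves_to_private_address(hostname, resolver=socket.gethostbyname):
    """Resolve hostname and report whether the answer is a private address.

    Returns (address, is_private). For a private-endpoint-only registry,
    a public answer suggests the private DNS zone is not in effect here.
    """
    address = resolver(hostname)
    return address, ipaddress.ip_address(address).is_private
```

A resolution error raised by the resolver is itself a finding: the node cannot resolve the registry at all, which is a connectivity problem and not a permission one.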

Q: How do I find which identity AKS uses to pull images?

Query the cluster’s identity profile for the kubelet identity, because that managed identity, not the control plane identity and not any workload identity your pods use, is what the kubelet presents to the registry when it fetches an image. Run az aks show with a query into the identityProfile.kubeletidentity field and read the object identifier it returns; that object identifier is the assignee you use when you list or create the AcrPull role assignment. Granting the role to the wrong identity is the most common reason a fix appears not to work, so confirm you have the kubelet identity specifically before you assign anything. On Container Apps and Container Instances the equivalent is the managed identity you assigned to the resource and referenced in its registry settings, and in a pipeline it is the service connection’s service principal or the federated identity the pipeline authenticates as. Whatever the platform, the identity the runtime presents is the one the entire fix is scoped to.
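
If you capture the full JSON from az aks show instead of using a query, the same field can be read in a few lines. This sketch assumes the output shape described above, with the kubelet identity under identityProfile.kubeletidentity and its objectId alongside; the identifiers in any sample you feed it are placeholders.

```python
import json


def kubelet_object_id(aks_show_json):
    """Extract the kubelet identity's object ID from `az aks show` JSON."""
    cluster = json.loads(aks_show_json)
    profile = cluster.get("identityProfile") or {}
    kubelet = profile.get("kubeletidentity")
    if not kubelet or "objectId" not in kubelet:
        raise ValueError("no kubelet identity found in identityProfile")
    return kubelet["objectId"]


def control_plane_principal_id(aks_show_json):
    """Return the control plane identity's principal ID, if present.

    Shown only so the two can be compared; this is NOT the assignee
    to use for the pull role.
    """
    cluster = json.loads(aks_show_json)
    return (cluster.get("identity") or {}).get("principalId")
```

Printing both values side by side is a cheap way to catch the wrong-identity mistake before you create an assignment.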

Q: What does az aks check-acr actually test?

It validates, end to end, whether a cluster can both authenticate to and reach a given registry, exercising the kubelet identity against the registry the way a real pull would. That dual scope is its value: it separates an authentication problem from a connectivity problem, which the unauthorized symptom in a pod describe collapses into a single message. A clean pass means the attach and role assignment are in place and a network path exists. A failure that reports it cannot authenticate points you at a missing attach or missing role, where the fix is to attach the registry or assign the pull role. A failure that reports it cannot reach or resolve the registry points you at a firewall rule or a private endpoint and DNS problem, where assigning roles will not help because the node cannot present the role it holds. Reading which half it flags before you act is the single most efficient diagnostic step in resolving an unauthorized pull, and rerunning it after a fix confirms the repair landed before you retry the workload.
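
For a validation gate, the check can be wrapped so a script runs it before and after a fix. This sketch only builds and runs the command; it deliberately does not parse the output, because the wording of the report is not something to rely on here. The resource group, cluster, and registry names are whatever you pass in, and the registry is given as its login server.

```python
import subprocess


def check_acr_command(resource_group, cluster_name, registry_login_server):
    """Build the argv for `az aks check-acr` against one cluster and registry."""
    return [
        "az", "aks", "check-acr",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--acr", registry_login_server,
    ]


def run_check_acr(resource_group, cluster_name, registry_login_server):
    """Run the check and return (exit_code, combined_output) for a human to read."""
    result = subprocess.run(
        check_acr_command(resource_group, cluster_name, registry_login_server),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

Read the returned output for which half failed, authentication or connectivity, before choosing a repair.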

Q: Why does the pull fail right after I disabled the registry admin account?

Because a workload was set up to pull using an image pull secret built from the admin account’s username and password, and disabling admin turned off the credential that pull secret depended on. The pull was never authenticating as a scoped identity; it was leaning on the shared admin credential, and the moment the credential was disabled, every pull that used it became unauthorized. Confirm it by checking that admin is now disabled and that the failing pull secret holds the admin credential. The wrong fix is to re-enable admin, which restores the pull and returns you to depending on a shared credential the next hardening pass will disable again. The right fix is to treat the disabling as the forcing function it is: assign AcrPull to a managed identity, point the runtime at that identity, remove the admin-based pull secret, and leave admin disabled. The pull then authenticates as a scoped, auditable identity that no policy will turn off.
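
Confirming that a pull secret holds the admin credential can be done by decoding it. The sketch below is a heuristic, not a definitive test: it relies on the ACR admin user name matching the registry name, and it assumes you pass the base64 value stored under the secret's .dockerconfigjson key. It never prints the password.

```python
import base64
import json


def uses_admin_credential(dockerconfigjson_b64, login_server):
    """Return True if the secret's user name for login_server looks like admin."""
    config = json.loads(base64.b64decode(dockerconfigjson_b64))
    entry = config.get("auths", {}).get(login_server)
    if entry is None:
        return False  # the secret does not cover this registry at all
    username = entry.get("username")
    if username is None and "auth" in entry:
        # "auth" is base64 of "username:password"
        username = base64.b64decode(entry["auth"]).decode().split(":", 1)[0]
    registry_name = login_server.split(".", 1)[0]
    return username == registry_name
```

A service principal shows up with an application ID as the user name instead, so a False here while pulls still fail points you back at expiry or role checks.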

Q: How do I fix an unauthorized pull when the registry is in a different tenant?

Cross-tenant is the one case where the clean managed-identity path does not directly apply, because role assignments are a within-tenant construct and you cannot assign a role in the registry’s tenant to an identity that lives in yours as if they shared a directory. Confirm the cause by comparing the registry’s tenant with the pulling identity’s tenant; if they differ and the pull is failing, this is your situation. The fix is to bridge the boundary with a credential the registry’s tenant can issue: create or use a service principal that exists in the registry’s tenant and holds AcrPull there, then present its credential through an image pull secret on the runtime in your tenant. You are still granting the pull role, just to a principal that lives where the registry lives, and then carrying that principal’s credential across the boundary. This is the one situation where an image pull secret is the right tool rather than a fallback, because the managed-identity model does not cross tenants.

Q: How do I confirm whether the AcrPull role is already assigned?

List the role assignments for the pulling identity scoped to the registry and look for AcrPull among them. Read the identity’s object identifier from the cluster’s kubelet identity profile or from the app’s assigned identity, take the registry’s resource identifier, and run az role assignment list filtered to that assignee at that scope. If AcrPull is present, the role is not your problem and you should look at the network or at an expired credential instead. If it is absent, you have found the cause and the fix is to create the assignment. This check is worth running even when you believe the attach was done, because attaches go to the wrong cluster or registry often enough that confirming the assignment exists is cheaper than assuming it does. On a registry using the attribute-based access model, look for Container Registry Repository Reader rather than AcrPull, because that is the role that confers pull on that model and AcrPull will not appear or will not help.
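
Once the listing is captured as JSON, the presence check is one line of logic. This sketch assumes the output of az role assignment list is a JSON array whose entries carry a roleDefinitionName field, and that the command was already filtered to the pulling identity and the registry scope.

```python
import json


def holds_role(assignments_json, role_name):
    """Return True if any listed assignment has the given role name."""
    assignments = json.loads(assignments_json)
    return any(a.get("roleDefinitionName") == role_name for a in assignments)


def assigned_role_names(assignments_json):
    """Return the sorted set of role names present, for a quick eyeball."""
    assignments = json.loads(assignments_json)
    return sorted({a["roleDefinitionName"] for a in assignments
                   if "roleDefinitionName" in a})
```

Pass "AcrPull" or "Container Registry Repository Reader" as role_name depending on the registry's access model; an empty list from assigned_role_names means the identity holds nothing at that scope.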

Q: Why does the attach succeed but my pull is still unauthorized?

The most likely reason is that the registry uses the attribute-based access model, where the attach command does not assign the role that grants pull. On a registry in the RBAC Registry plus ABAC Repository Permissions mode, image pull comes from the Container Registry Repository Reader role rather than AcrPull, and the attach path does not wire that role up, so the attach can report success while pulls stay unauthorized. The fix is to assign Container Registry Repository Reader to the kubelet identity by hand, scoped to the registry. A second possibility is propagation: a freshly created role assignment takes a short moment to become visible across the platform, so a pull retried immediately after an attach can still fail for a minute before the assignment propagates. A third is that the attach went to a different cluster or registry than the one the workload uses, so confirm the assignment exists on the exact identity and registry in play rather than assuming the attach landed where you intended.

Q: Is an unauthorized pull the same as ImagePullBackOff?

Not quite. ImagePullBackOff is the cluster-level state that says the kubelet has repeatedly failed to pull an image and is backing off between retries, and an unauthorized pull is one of several causes that can produce that state. The same back-off can come from a manifest that does not exist, a typo in the tag, transient registry unavailability, rate limiting, or a pure network timeout, none of which are about credentials. The unauthorized message inside the pod’s events is what narrows the back-off to this article’s territory; without that specific message you might be chasing a different cause entirely. So read the event detail rather than the state name: the state tells you the pull is failing and retrying, while the message inside tells you whether the failure is authorization, a missing image, or a network problem. Treating the unauthorized message as the signal, rather than the back-off state, is what routes you to the role-and-attach fix instead of a different repair.
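
Reading the event detail can be partly automated with a rough classifier. Treat this as an illustrative sketch: the substrings are common fragments of registry and runtime errors, but exact wording varies by runtime and version, so the patterns are assumptions to tune against the events your own cluster emits.

```python
# Ordered so that an authorization hint wins over more generic matches.
PATTERNS = [
    ("authorization", ("unauthorized", "authentication required")),
    ("missing image", ("manifest unknown", "not found")),
    ("rate limiting", ("toomanyrequests", "too many requests")),
    ("network", ("i/o timeout", "no such host", "connection refused")),
]


def classify_pull_failure(message):
    """Map a pod event message to a coarse cause, or 'unknown'."""
    text = message.lower()
    for cause, needles in PATTERNS:
        if any(needle in text for needle in needles):
            return cause
    return "unknown"
```

A bare back-off message with no detail classifies as unknown, which is the cue to read the earlier events for the actual pull error.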

Q: Should I use an image pull secret or a managed identity for ACR pulls?

Prefer a managed identity in almost every case, and reserve image pull secrets for the situations a managed identity cannot cover. A managed identity holding the pull role authenticates with no stored secret, so there is nothing to expire, nothing to leak, and a clean audit trail tied to the identity itself, which removes whole categories of the unauthorized-pull failure at the root. An image pull secret carries a credential the runtime presents on every pull, which means it can expire, can be copied into places it should not be, and shows up in audits as a credential rather than an identity. The legitimate uses for a pull secret are the cross-tenant case, where the managed-identity model does not cross the tenant boundary, and certain cross-cloud or external-registry cases. For pulls within a single tenant from an Azure registry, the managed identity is both the cleaner administration and the prevention, because it eliminates the credential problems rather than making them easier to manage.

Q: How do I make the ACR pull permission survive environment rebuilds?

Define the role assignment as part of the infrastructure rather than as a manual step. When the cluster and the registry are provisioned together in code, the attach or the AcrPull assignment lives in the same template, so a cluster is never created without its pull permission and a new environment stood up from the same definition inherits the working relationship automatically. The failure where the cluster and registry exist but were never connected is purely an artifact of provisioning them by separate manual steps, and declaring them together makes that gap impossible to reintroduce. Extend the same principle to the pipeline path by declaring the deploy identity’s pull role alongside the pipeline rather than granting it ad hoc, and add a validation gate that runs check-acr after any change to a cluster’s identity, a registry’s network rules, or a registry’s access model, so a break is caught before a user-facing rollout. The combination of declared assignments and a validation gate turns the unauthorized pull from a recurring surprise into a thing your pipeline prevents.

Q: Can a wrong image reference cause what looks like an unauthorized pull?

Yes, and it is worth ruling out first, because no role assignment fixes it. The image reference names a registry login server, a repository, and a tag, and if the login server is not the registry you granted access to, the runtime is asking a different registry entirely, which produces an unauthorized response regardless of any permission you set on the registry you had in mind. This happens when an image was built and pushed to a development-subscription registry and the manifest still references that registry while the cluster has access only to a production registry. It also happens with subtle login-server typos. The fix in these cases is not a role at all but correcting the reference so it points at the registry the workload is actually allowed to read. Confirm the login server in the reference before you touch any role assignment, every time, because a surprising share of the unauthorized pulls engineers chase are really the runtime asking the wrong registry rather than a genuine permission gap on the right one.
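
Checking the login server is mechanical enough to script. This sketch follows the usual container reference convention, where the first path component is a registry host only if it contains a dot or a colon or is "localhost", and anything else defaults to Docker Hub; the registry names in the usage note are placeholders.

```python
def login_server(image_ref):
    """Return the registry host an image reference points at."""
    first, sep, _ = image_ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first.lower()
    return "docker.io"  # no explicit registry in the reference


def targets_registry(image_ref, expected_login_server):
    """Return True if the reference names the registry you granted access to."""
    return login_server(image_ref) == expected_login_server.lower()
```

For example, targets_registry("devregistry.azurecr.io/team/app:1.4", "prodregistry.azurecr.io") is False, which tells you to fix the manifest before touching any role.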

Q: How long does an AcrPull assignment take to work after I grant it?

A role assignment is effective almost immediately but is not instantly visible to every part of the platform, so allow a short propagation window, usually about a minute and occasionally several, before you conclude that a fresh assignment did not work. A pull retried in the seconds right after an attach or an assignment can still fail simply because the assignment has not propagated to the component evaluating it yet, and the temptation in that moment is to assume the fix was wrong and to start changing other things, which muddies the diagnosis. Give it the minute, then retry, and if it still fails after propagation, look at whether the assignment landed on the correct identity and registry and whether the registry’s access model needs a different role than the one you assigned. Building the wait into your procedure, rather than reacting to the first post-assignment failure, prevents the common spiral of stacking unnecessary changes on top of a fix that was already correct and just needed a moment.
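
Building the wait into the procedure can look like a bounded retry loop. This is a minimal sketch: check is any zero-argument function you supply that returns True once the pull path works (for instance one that reruns check-acr), and the attempt count and delay are arbitrary defaults, not documented propagation times.

```python
import time


def wait_for(check, attempts=6, delay_seconds=15.0, sleep=time.sleep):
    """Retry check() until it passes; return the attempt number or None.

    Sleeps between attempts only, never after the last one.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return None
```

A None result after the full window is the point at which you go back to the identity, the registry, and the access model, and not before.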

Q: Why does my pull work from one cluster but not another against the same registry?

Because the pull permission is a property of each cluster’s kubelet identity, not of the registry, so one cluster can be attached and authorized while another against the same registry is not. The working cluster’s kubelet identity holds AcrPull on the registry; the failing cluster’s kubelet identity does not, either because it was never attached or because the attach went somewhere else. Confirm by listing the AcrPull assignments for each cluster’s kubelet identity at the registry scope and comparing; the failing cluster will be missing the assignment the working one has. The fix is to attach the registry to the failing cluster or assign the role to its kubelet identity directly. The same logic explains why a pull can work in one environment and fail in another that uses the same images: the registry is shared, but the per-cluster identity and its role assignment are not, so each cluster needs its own grant and a working cluster is no guarantee that a sibling cluster is configured the same way.