Azure ExpressRoute Deep Dive: The Working Model

Azure ExpressRoute promises something that feels almost too good when you first read the marketing: a private connection from your datacenter into Azure that never touches the public internet. Teams hear that sentence, provision a circuit, attach a gateway, and then watch their throughput sit stubbornly at a fraction of what the circuit can carry. They blame the provider, open a support case, and wait. The circuit is fine. The gateway is the bottleneck, and nobody told them the gateway has its own ceiling that has nothing to do with the bandwidth they bought from the carrier. This is the most common and most expensive misunderstanding in the entire product, and it comes from treating ExpressRoute as a single thing rather than as three cooperating parts.

The reality is that ExpressRoute performance and resilience are never a single number. They emerge from the interaction of three layers: the circuit you order from a connectivity provider, the peerings that ride on top of it, and the virtual network gateway that terminates the private path inside Azure. A throughput question is always a question about which of those three is the limit. A redundancy question is always a question about which of those three has a single point of failure. Once you can name the layer, the answer stops being a guess and becomes a calculation. That naming discipline is what this article gives you, and it is the model that the troubleshooting work later in the series leans on directly.

The circuit-peering-gateway model

Before any command or SKU, hold a picture of what ExpressRoute actually is. The product is not a cable. It is a logical service composed of three distinct layers that you provision and reason about separately, and the entire skill of designing or debugging a private connection comes down to knowing which layer a given symptom belongs to.

The first layer is the circuit. When you create an ExpressRoute resource in Azure, you are ordering a logical entity that represents a dedicated connection between your network and Microsoft, established through a connectivity provider or through ExpressRoute Direct ports you own. The circuit has a bandwidth value you choose at order time, a billing model, and a service key that you hand to your provider so they can complete the physical wiring on their side. The circuit is the only part of this picture that involves a third party, and that shared responsibility is the source of an entire category of confusion that we will untangle later. The important fact for now is that the circuit defines how much traffic can physically cross the boundary between your world and Microsoft’s.

The second layer is the set of peerings that ride on the circuit. A peering is a BGP routing relationship that determines which Azure destinations the circuit can reach. Private peering carries traffic to your virtual networks, the address space where your virtual machines and private endpoints live. Microsoft peering carries traffic to Microsoft’s public services that support it, reached over their public IP addresses rather than private ones. A single circuit can carry both peering types at the same time, and they are configured and troubleshot independently. Confusing the two, or assuming one peering being healthy says anything about the other, produces a steady stream of misdiagnoses.

The third layer is the virtual network gateway. The circuit and its private peering get traffic to the edge of Azure, but to deliver that traffic into a specific virtual network you need a gateway of type ExpressRoute deployed in that network. The gateway terminates the private path, exchanges routes with your on-premises routers over BGP, and forwards packets to the resources inside. Crucially, the gateway has its own throughput ceiling determined by its SKU, and that ceiling is independent of the bandwidth you ordered for the circuit. You can buy a ten gigabit circuit and bolt it to a gateway that tops out far below that, and the gateway, not the circuit, will be what your traffic hits first.

That is the InsightCrunch ExpressRoute model in one frame: the circuit defines the physical bandwidth at the boundary, the peerings define which destinations are reachable and over which routing relationship, and the gateway defines how fast traffic can actually enter a given virtual network. Performance is bounded by whichever of these is smallest. Resilience depends on whether each of these has redundancy. Reachability depends on whether the right peering exists and is advertising the right routes. Every real question about ExpressRoute decomposes cleanly into these three, and the rest of this article walks each one to the level of detail you need to design a connection and to debug it when it misbehaves.
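The "whichever is smallest" rule can be written as a two-line calculation. What follows is a minimal Python sketch, not an Azure API: the function name `limiting_layer` and the example figures are assumptions made for illustration, and the peerings are left out because they bound reachability rather than bandwidth.

```python
def limiting_layer(circuit_gbps, gateway_gbps):
    """Return (layer, gbps) for the layer that caps throughput into one virtual network."""
    layers = {"circuit": circuit_gbps, "gateway": gateway_gbps}
    name = min(layers, key=layers.get)  # the smallest ceiling is the effective one
    return name, layers[name]


# Hypothetical design: a 10 Gbps circuit attached to a gateway assumed to forward about 2 Gbps.
print(limiting_layer(10, 2))
# Hypothetical design: a 1 Gbps circuit attached to a gateway assumed to forward about 10 Gbps.
print(limiting_layer(1, 10))
```

The point of running the numbers before changing anything is that only the layer the function names is worth resizing.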

The claim worth naming and remembering is this. ExpressRoute behavior is the product of the circuit, the peerings, and the gateway acting together, so any throughput or redundancy question must first identify which of the three is the limit before you change anything. Engineers who skip that step resize the wrong thing, pay for capacity they cannot use, and chase the provider for a problem that lives entirely inside their own subscription.

What is a private connection that never sees the internet?

It helps to be precise about what ExpressRoute does and does not give you, because the phrase “private connection” carries more weight than people realize. A standard hybrid setup uses a VPN gateway to build an encrypted tunnel across the public internet. The traffic is protected by encryption, but the path is the ordinary internet, with all of its variable latency, its shared congestion, and its dependence on whatever routes the internet decides to use that day. ExpressRoute replaces that path. Your traffic rides a dedicated connection arranged through a carrier, crosses into Microsoft’s network at a peering location, and travels Microsoft’s backbone to the region hosting your resources. At no point does it traverse the public internet.

The consequences of that are concrete rather than marketing abstractions. Latency becomes consistent because the path is deterministic rather than subject to internet routing changes. Bandwidth becomes a property you provisioned rather than a number you hope for. And the security posture changes in a way that matters for compliance: data that must not transit the public internet for regulatory reasons has a path that satisfies that requirement, because the connection is genuinely private at the network layer. That last point is why regulated industries reach for ExpressRoute even when a VPN would carry the volume. The requirement is not bandwidth, it is the path.

What ExpressRoute does not give you, and this trips people up, is encryption by default on the private path. The connection is private in the sense that it does not use the public internet, but private peering traffic is not encrypted on the wire by the service itself. If your threat model requires encryption in addition to a private path, you layer it on, either with MACsec on ExpressRoute Direct ports or with an encrypted tunnel running over the private peering. Treating “private” as synonymous with “encrypted” is a mistake that has surfaced in more than one security review, and the model keeps you honest about it: the circuit gives you a private path, and encryption is a separate decision you make on top.

How do private peering and Microsoft peering differ?

The two peering types confuse engineers more than any other part of ExpressRoute, because both ride the same circuit and both are configured in the same portal blade, yet they do completely different jobs and fail in completely different ways. Getting the distinction crisp is worth the paragraph it takes.

Private peering is the one most teams want and the one they spend most of their time on. It connects your on-premises network to the private address space of your Azure virtual networks. Traffic over private peering reaches virtual machines by their private IP addresses, reaches private endpoints, and behaves as though your datacenter and your virtual networks were part of one routed network. Routes are exchanged over BGP: you advertise your on-premises prefixes to Azure, and the gateway advertises the virtual network address space back to you. This is the peering that delivers the experience people imagine when they buy ExpressRoute, a seamless extension of the corporate network into the cloud, and it is the peering that requires a virtual network gateway to deliver traffic into a specific network.

Microsoft peering serves a different purpose. It provides connectivity to Microsoft’s public services over the private circuit instead of over the public internet. The destinations here are public IP addresses, the public endpoints of services that support reaching them this way, and the routing involves advertising and accepting public prefixes with the appropriate route filters. Microsoft peering does not deliver traffic into your virtual networks, so it does not involve a virtual network gateway at all. A team that has Microsoft peering up and assumes it gives them access to their virtual machines is conflating the two peerings, and the fix is not to troubleshoot the circuit but to configure private peering, which is a separate relationship.

The practical takeaway is to ask, for any reachability problem, which peering should be carrying the traffic. If the destination is a private IP inside a virtual network, the answer is private peering and a gateway, and the troubleshooting lives there. If the destination is a Microsoft public service you want to reach over the circuit, the answer is Microsoft peering and its route filters, and the gateway is irrelevant to it. The series troubleshooting article on a circuit reported as down leans on exactly this separation, because a circuit with one peering healthy and the other down looks, to a careless eye, like a half-broken circuit when it is really two independent relationships in two different states.
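As a rough illustration of that first question, the Python sketch below sorts a destination address into the peering that should carry it. It rests on a simplifying assumption: virtual networks use private address space, so Python's `is_private` check stands in for "lives inside a virtual network". The function name and the sample addresses are hypothetical, and the sketch cannot tell whether a public address belongs to a Microsoft service that supports Microsoft peering.

```python
import ipaddress


def peering_for(destination):
    """Name the peering that should carry traffic to this destination IP."""
    ip = ipaddress.ip_address(destination)
    if ip.is_private:
        # Assumption: private address space means a virtual network destination.
        return "private peering + gateway"
    return "Microsoft peering + route filter"


print(peering_for("10.20.1.4"))   # sample private address
print(peering_for("13.107.6.1"))  # sample public address, chosen only for illustration
```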

How do the primary and secondary links give redundancy?

Every ExpressRoute circuit is provisioned as a pair of links, a primary and a secondary, terminating on separate Microsoft Enterprise Edge routers at the peering location. This pairing is not optional and it is not an upgrade you buy. It is built into the circuit, and it exists so that the failure of a single device, a single port, or a single cable on Microsoft’s side does not take your connection down. Each link carries its own BGP session for each peering, so a private peering relationship is in fact two BGP sessions, one over the primary link and one over the secondary, both active at the same time.

The word “active” matters. ExpressRoute redundancy is not a cold standby where a backup link sits idle until the main one fails. Both links carry traffic, and both BGP sessions advertise and accept routes simultaneously. If one link drops, BGP withdraws the routes it was carrying, the remaining link continues to carry everything, and convergence happens in the timescale of BGP rather than the timescale of a human noticing. This is exactly what you want from a connection that regulated workloads depend on, and it is why the design conversation about ExpressRoute resilience starts from a position of strength rather than from scratch.

The trap is on your side, not Microsoft’s. The redundancy that Microsoft builds into the circuit protects the segment from your provider’s edge to Azure. It does nothing for the segment from your own routers to the provider, and it does nothing for the gateway inside Azure. A team that runs a single on-premises router into a single provider handoff has a redundant ExpressRoute circuit feeding a single point of failure they built themselves. When that router reboots for a firmware update, the circuit’s dual links are irrelevant because both of them terminate, on the customer side, at one device. True end-to-end resilience means redundancy at every layer: dual on-premises routers, dual provider connections where the budget allows, the inherent dual links of the circuit, and, as we will see, attention to the gateway as well.

So when someone asks whether a single link is enough for resilience, the honest answer is that a single link is never the whole picture, because the circuit already gives you two. The real question is whether the layers you control match the resilience the circuit provides. The assumption to push back on is that buying ExpressRoute buys resilience. It buys you a redundant circuit. Whether you have a resilient connection depends on what you put on either end of it.
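One way to keep that audit honest is to count the redundant elements at each layer and flag anything with fewer than two. This is a minimal Python sketch under assumed inputs: the layer names and counts in `design` are invented for illustration and describe no real deployment.

```python
def single_points_of_failure(counts):
    """Return the layers that have fewer than two independent elements."""
    return [layer for layer, n in counts.items() if n < 2]


# Hypothetical design: the circuit is redundant, the customer side is not.
design = {
    "on-premises routers": 1,
    "provider handoffs": 1,
    "circuit links": 2,   # every circuit has a primary and a secondary link
    "gateway zones": 3,   # a zone-redundant gateway SKU
}
print(single_points_of_failure(design))
```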

How does BGP route exchange actually work over the circuit?

Routing over ExpressRoute is dynamic, and it is dynamic for a reason that becomes obvious the first time you watch a link fail and recover without anyone touching a configuration. Static routes cannot react to a link going down; BGP can. Over private peering, you and Azure run BGP as neighbors. You advertise the prefixes that represent your on-premises networks, and Azure advertises the address space of the virtual networks reachable through the gateway. The exchange happens over both the primary and secondary links, which is why the failure of one link is a BGP event rather than an outage.

There are limits on this exchange, and the limits are a frequent source of trouble that looks nothing like a routing problem on the surface. ExpressRoute private peering accepts a bounded number of route prefixes advertised from on-premises. Push more prefixes than the limit allows and the BGP session does not politely ignore the excess; it can drop, taking your connectivity with it. Teams that summarize their on-premises routing into a handful of aggregate prefixes never see this. Teams that advertise every individual subnet from a large enterprise network can blow past the ceiling and spend a frustrating afternoon convinced the circuit is broken when the actual cause is an oversized advertisement. The fix is route summarization, and the discipline is to treat the prefix count as a design constraint from the start rather than a surprise in production.
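Route summarization is easy to try offline before touching a router. The Python sketch below uses the standard library's `ipaddress.collapse_addresses` to merge contiguous subnets and then compares the prefix count against a limit. The `limit` value is a deliberately tiny placeholder, not the real ExpressRoute figure, which you should read from the current documentation. Note also that `collapse_addresses` only merges prefixes that are exactly contiguous; it never invents a wider aggregate that would cover address space you do not own.

```python
import ipaddress


def summarize(prefixes):
    """Merge contiguous prefixes into the fewest exact aggregates."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


def within_limit(prefixes, limit):
    """True when the advertisement stays at or under the prefix limit."""
    return len(prefixes) <= limit


# Hypothetical on-premises subnets advertised one by one.
subnets = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
aggregated = summarize(subnets)
print(aggregated)  # ['10.1.0.0/22']

limit = 3  # placeholder for illustration only
print(within_limit(subnets, limit), within_limit(aggregated, limit))
```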

The direction of advertisement also matters for what your on-premises network learns. The gateway advertises the virtual network’s address space, and through gateway transit and peering it can advertise the address space of peered networks as well, which is how a hub-and-spoke topology lets spokes reach on-premises without each spoke having its own gateway. That mechanism is the subject of the series article on hub-spoke network topology, and it is worth understanding because the alternative, a gateway per spoke, is both expensive and operationally heavy. The point to carry forward is that BGP is what makes the whole arrangement adapt: links fail and recover, prefixes are learned and withdrawn, and the topology stays correct without manual intervention, as long as you respect the limits.

What does the ExpressRoute gateway SKU determine?

Here is the layer that surprises the most people and costs the most money when it is gotten wrong. The virtual network gateway has a SKU, and the SKU sets a throughput ceiling that is entirely separate from the bandwidth of your circuit. You can order a circuit far larger than your gateway can serve, and your traffic will be capped by the gateway. The circuit is not the bottleneck in that scenario; the gateway is, and resizing the circuit will do nothing to help.

The traditional gateway SKUs form a tiered ladder. Standard and the zone-redundant ErGw1Az sit at the bottom, suited to most ordinary workloads, with throughput on the order of one gigabit per second. High Performance and the zone-redundant ErGw2Az sit in the middle, roughly double that, for bandwidth-intensive scenarios. Ultra Performance and the zone-redundant ErGw3Az sit at the top of the traditional ladder, with throughput on the order of ten gigabits per second, for the most demanding connections. Newer to the lineup is the scalable gateway, ErGwScale, which supports far higher bandwidth, up to forty gigabits per second, and autoscales between a minimum and maximum number of scale units that you configure, so capacity tracks utilization rather than sitting fixed. Microsoft publishes the exact throughput numbers per SKU and revises them over time, so the values to design against should always be read from the current official documentation rather than from memory or from a blog. Verify the SKU throughput before you commit a design, because this is precisely the kind of figure that moves.

Sizing the gateway is therefore a capacity-planning exercise, not a default to accept. The question is how much traffic this virtual network will actually carry over ExpressRoute, and the SKU must be chosen to serve that, with headroom. Undersizing means your expensive circuit is throttled at the door. Oversizing means you are paying for a gateway tier you cannot fill. And changing your mind later is not free: you can resize up within the same SKU family without recreating the gateway, but moving between the non-zonal family and the zone-redundant family, or downgrading, requires deleting and recreating the gateway, which is downtime. That asymmetry is a reason to choose the zone-redundant family from the start when availability zones are available in your region, because it keeps a clean upgrade path open and gives you zone resilience for the gateway itself.
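The sizing step can be sketched as a lookup: take the expected demand, add headroom, and pick the smallest tier that covers it. In the Python sketch below, the ceilings are the orders of magnitude quoted above, not authoritative figures, and the 30 percent headroom is an arbitrary assumption. Substitute the values from the current documentation before using anything like this for a real design.

```python
# Approximate ceilings in Gbps, taken from the orders of magnitude in this article.
# Placeholders only: read the current per-SKU figures from Microsoft's documentation.
APPROX_CEILING_GBPS = {"ErGw1AZ": 1, "ErGw2AZ": 2, "ErGw3AZ": 10, "ErGwScale": 40}


def smallest_sku_for(demand_gbps, headroom=0.3):
    """Return the smallest SKU whose ceiling covers demand plus headroom, or None."""
    needed = demand_gbps * (1 + headroom)
    for sku, ceiling in sorted(APPROX_CEILING_GBPS.items(), key=lambda kv: kv[1]):
        if ceiling >= needed:
            return sku
    return None  # nothing fits: split the workload or revisit the design


print(smallest_sku_for(1.2))
print(smallest_sku_for(5))
```

Only the zone-redundant names appear in the table because, as argued above, that family is the safer starting point where availability zones exist.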

There is one more wrinkle worth keeping in view. When the gateway is serving connectivity to private endpoints, the available throughput and control-plane capacity can be reduced compared to serving ordinary virtual machine traffic. If your design routes heavy private endpoint traffic over ExpressRoute, factor that reduction into the SKU you pick rather than assuming the headline number applies to every kind of destination.

The InsightCrunch ExpressRoute model at a glance

The table below is the findable artifact for this article: the three layers, with the two peering types listed separately and FastPath added as the gateway-bypass option, what each one bounds, and where to look when a symptom points at it. Keep it next to the diagram in your runbook, because most ExpressRoute incidents resolve the moment someone correctly assigns the symptom to a layer.

Layer	What it is	What it bounds	Redundancy mechanism	First symptom it explains
Circuit	Logical connection through a provider or ExpressRoute Direct	Physical bandwidth at the Azure boundary	Built-in primary and secondary links to separate edge routers	Provider-side or carrier outage; both peerings affected
Private peering	BGP relationship to your virtual network address space	Reachability of private IPs; route prefix count	Two BGP sessions, one per link	Cannot reach VM private IPs; prefix-limit session drop
Microsoft peering	BGP relationship to supported Microsoft public services	Reachability of public endpoints over the circuit	Two BGP sessions, one per link	Cannot reach a Microsoft public service over the circuit
Gateway	ExpressRoute virtual network gateway terminating the private path	Throughput into a specific virtual network	Zone-redundant SKU family; active instances	Throughput capped below circuit bandwidth
FastPath	Gateway-bypass feature on the data path	Per-flow latency and the gateway throughput ceiling	Inherits gateway and circuit redundancy	Throughput stuck at the gateway ceiling despite a large circuit

Read the table as a routing guide for your own attention. Throughput capped below what you paid for points at the gateway SKU or at the need for FastPath, not at the circuit. A specific private IP unreachable while others work points at routing and prefixes on private peering. A Microsoft public service unreachable over the circuit points at Microsoft peering and its filters. A whole-circuit outage that takes both peerings at once points at the circuit and the provider. Naming the layer first is the entire method.
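If you want the same guide in a runbook script, it reduces to a lookup. The Python sketch below is one possible encoding, with symptom strings paraphrased from the paragraph above. The dictionary keys and the function name are assumptions, not an established taxonomy.

```python
FIRST_LAYER_TO_CHECK = {
    "throughput capped below circuit bandwidth": "gateway SKU or FastPath",
    "one private IP unreachable while others work": "private peering routes and prefixes",
    "Microsoft public service unreachable over the circuit": "Microsoft peering and route filters",
    "both peerings down at once": "circuit and provider",
}


def first_layer(symptom):
    """Return the layer to inspect first, or a reminder to classify the symptom."""
    return FIRST_LAYER_TO_CHECK.get(symptom, "unclassified: name the layer before changing anything")


print(first_layer("both peerings down at once"))
```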

What is FastPath and when does it actually help?

FastPath is the answer to a specific problem: the gateway is in the data path, and the data path is exactly where you do not want a bottleneck. In the ordinary arrangement, traffic from on-premises enters Azure, reaches the virtual network gateway, and the gateway processes and forwards it to the destination virtual machine. That processing is real work, and it is bounded by the gateway SKU’s throughput. FastPath changes the picture by letting traffic bypass the gateway in the data path and go directly to the virtual machines, while the gateway continues to handle the control plane, the route exchange that keeps everything correct.

The two effects are distinct and both worth wanting. First, latency drops, because removing the gateway from the data path removes a hop and the processing that goes with it. Second, and this is the effect that resolves the throughput surprise we keep returning to, traffic is no longer capped by the gateway’s throughput ceiling. With FastPath enabled on an Ultra Performance or ErGw3Az gateway, flows can exceed the gateway’s own ten gigabit limit because they are not going through it, and with the scalable gateway the headroom is larger still. FastPath is, in effect, the official way to stop the gateway from being the bottleneck once you have already sized the circuit for the volume you need.

FastPath is not a free toggle, and the requirements are precise. The gateway must be Ultra Performance or ErGw3Az; the lower SKUs cannot run it, because they lack the capacity to handle the control-plane work that FastPath still relies on. FastPath applies to private peering, not Microsoft peering. And because traffic bypasses the gateway, some processing that lived at the gateway is bypassed too, which has historically meant care around how user-defined routes on the gateway subnet and certain inline appliances interact with bypassed flows. Microsoft has expanded FastPath support for virtual network peering and user-defined routes over time, and the supported scenarios for things like Private Link traffic have specific conditions, so the rule is the same as for SKU numbers: confirm the current FastPath requirements and supported scenarios against the official documentation before you design around them, because this feature’s boundaries have moved repeatedly.

The decision rule is straightforward once the requirements are clear. If your virtual network needs more throughput than the gateway SKU can serve, or if you are chasing the lowest achievable latency for a latency-sensitive workload, FastPath is the lever, provided you are on a qualifying gateway SKU and using private peering. If your traffic comfortably fits under the gateway’s ceiling and latency is not the constraint, FastPath adds complexity you do not need. Reaching for FastPath when the gateway was never the limit is as much a misdiagnosis as ignoring it when the gateway clearly is.
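Written as code, the rule has two gates and one trigger. The Python sketch below encodes the prerequisites exactly as this article states them (a qualifying SKU and private peering) and then asks whether demand or latency justifies the bypass. The function name, parameter names, and example numbers are assumptions, and the SKU set should be checked against the current FastPath documentation.

```python
# Qualifying SKUs as stated in this article; confirm against current documentation.
FASTPATH_SKUS = {"UltraPerformance", "ErGw3AZ"}


def consider_fastpath(sku, peering, demand_gbps, gateway_ceiling_gbps, latency_sensitive=False):
    """True when FastPath is both available and justified for this workload."""
    if sku not in FASTPATH_SKUS or peering != "private":
        return False  # prerequisites not met, so the question is moot
    return demand_gbps > gateway_ceiling_gbps or latency_sensitive


# Hypothetical workloads.
print(consider_fastpath("ErGw3AZ", "private", 15, 10))  # demand exceeds the ceiling
print(consider_fastpath("ErGw3AZ", "private", 4, 10))   # fits under the ceiling
```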

The configuration that realizes the model

It is one thing to hold the model and another to express it in resources, so here is the shape of a real deployment, with the commands that create each layer. These are illustrative of the workflow rather than a copy-paste production script, because names, address spaces, and SKUs are yours to choose, and the VaultBook labs are the place to run a full version against a sandbox. The order matters: the gateway subnet exists before the gateway, the gateway exists before the connection, and the circuit must be provisioned by the provider before the connection to it will come up.

Start with the gateway subnet. ExpressRoute, like other gateways, requires a subnet named exactly GatewaySubnet, and it should be sized generously, a /27 or larger, with /26 recommended where you may connect several circuits to the same gateway.

# Create the dedicated gateway subnet. The name must be exactly GatewaySubnet.
az network vnet subnet create \
  --resource-group rg-hybrid \
  --vnet-name vnet-hub \
  --name GatewaySubnet \
  --address-prefixes 10.10.255.0/26
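
Before running that command, it can help to sanity-check the subnet offline. The following Python sketch checks the three things the paragraph above calls out or implies: the exact name, a size of /27 or larger, and a prefix that actually sits inside the virtual network. The virtual network range `10.10.0.0/16` is an assumption for illustration, since the article does not state the address space of vnet-hub.

```python
import ipaddress


def check_gateway_subnet(name, prefix, vnet_prefix):
    """Return a list of problems with a proposed gateway subnet (empty means OK)."""
    problems = []
    subnet = ipaddress.ip_network(prefix)
    if name != "GatewaySubnet":
        problems.append("name must be exactly GatewaySubnet")
    if subnet.prefixlen > 27:  # a longer prefix means a smaller subnet
        problems.append("subnet is smaller than /27")
    if not subnet.subnet_of(ipaddress.ip_network(vnet_prefix)):
        problems.append("subnet is outside the virtual network address space")
    return problems


# Assumed virtual network range; replace with your own.
print(check_gateway_subnet("GatewaySubnet", "10.10.255.0/26", "10.10.0.0/16"))
```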

Create the ExpressRoute virtual network gateway. The gateway type is ExpressRoute, and the SKU is the capacity decision discussed above. Here the zone-redundant ErGw3Az is chosen so that FastPath remains available and the gateway itself spans availability zones.

# A public IP is required for the gateway VMs; Azure manages it for ExpressRoute.
az network public-ip create \
  --resource-group rg-hybrid \
  --name pip-ergw \
  --sku Standard \
  --allocation-method Static \
  --zone 1 2 3

# Create the gateway. Deployment commonly takes up to 45 minutes.
az network vnet-gateway create \
  --resource-group rg-hybrid \
  --name ergw-hub \
  --vnet vnet-hub \
  --gateway-type ExpressRoute \
  --sku ErGw3AZ \
  --public-ip-addresses pip-ergw \
  --no-wait

With the circuit provisioned and your provider reporting it as up on their side, connect the gateway to the circuit. The connection is the object that binds a specific gateway to a specific circuit’s private peering, and it is where FastPath is enabled.

# Fetch the circuit resource ID once the provider has completed provisioning.
CIRCUIT_ID=$(az network express-route show \
  --resource-group rg-hybrid \
  --name erc-primary \
  --query id -o tsv)

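# The gateway above was created with --no-wait, so block until it exists.
az network vnet-gateway wait \
  --resource-group rg-hybrid \
  --name ergw-hub \
  --created

# Bind the gateway to the circuit and enable FastPath in one step.
az network vpn-connection create \
  --resource-group rg-hybrid \
  --name conn-ergw-to-circuit \
  --vnet-gateway1 ergw-hub \
  --express-route-circuit2 "$CIRCUIT_ID" \
  --express-route-gateway-bypass true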

When the model misbehaves, the same resources answer questions through Azure PowerShell and the Azure CLI. Reading the learned and advertised routes is the single most useful diagnostic for a reachability problem, because it tells you directly what BGP has actually exchanged rather than what you intended it to.

# Azure PowerShell: confirm the gateway SKU, the layer that bounds throughput into this VNet.
Get-AzVirtualNetworkGateway -Name "ergw-hub" -ResourceGroupName "rg-hybrid" |
  Select-Object Name, @{n='Sku';e={$_.Sku.Name}}, GatewayType

# Azure PowerShell: inspect the circuit's provisioning state on the Microsoft side and the provider side.
Get-AzExpressRouteCircuit -Name "erc-primary" -ResourceGroupName "rg-hybrid" |
  Select-Object Name, ServiceProviderProvisioningState, CircuitProvisioningState

# Azure CLI: what prefixes has the gateway learned from on-premises over BGP?
az network vnet-gateway list-learned-routes \
  --resource-group rg-hybrid \
  --name ergw-hub \
  --output table

# Azure CLI: what is the gateway advertising back toward on-premises?
# 10.0.0.1 is a placeholder; substitute the IP address of your BGP peer.
az network vnet-gateway list-advertised-routes \
  --resource-group rg-hybrid \
  --name ergw-hub \
  --peer 10.0.0.1 \
  --output table

The commands map to the model precisely. The gateway SKU query answers the throughput-ceiling question. The circuit state query answers the provider and circuit-layer question. The learned and advertised route queries answer the peering and routing question, including whether a prefix-count problem has silently dropped a session. You are never running all of them blindly; you run the one that matches the layer the symptom points at, which is the whole reason the model exists.

The failure modes the model explains

Engineers report the same handful of ExpressRoute problems over and over, and what makes them feel mysterious is that the symptom rarely names the layer responsible. The model’s payoff is that each recurring case maps to a layer, and once you make that map the fix is usually obvious. Walking the common cases as patterns is the fastest way to internalize the method.

Why is my throughput capped well below the circuit bandwidth?

This is the signature ExpressRoute complaint, and the answer is almost always the gateway SKU. A team orders a large circuit, sees their transfers plateau far below it, and assumes the carrier is underdelivering. The circuit is fine. The traffic is hitting the gateway, and the gateway’s SKU sets a ceiling that the circuit’s bandwidth cannot override. Confirm it by reading the gateway SKU and comparing its documented throughput against what you are observing; if the plateau matches the SKU’s ceiling, you have found your limit. The fix is to size the gateway to the traffic, which may mean resizing up within the SKU family, moving to the scalable gateway, or, if you have a qualifying SKU already, enabling FastPath so traffic bypasses the gateway entirely. Resizing the circuit does nothing for this symptom, which is why naming the layer first saves both money and time.
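The "does the plateau match the ceiling" comparison is worth making explicit, because eyeballing it invites wishful thinking. Here is a minimal Python sketch of that check. The 10 percent tolerance is an arbitrary assumption, the function name is hypothetical, and the gateway ceiling you pass in should come from the documented figure for your SKU.

```python
def plateau_points_at(observed_gbps, gateway_ceiling_gbps, circuit_gbps, tolerance=0.1):
    """Name the layer whose ceiling the observed plateau sits at, if any."""
    def near(value, target):
        return abs(value - target) <= tolerance * target

    if gateway_ceiling_gbps < circuit_gbps and near(observed_gbps, gateway_ceiling_gbps):
        return "gateway"
    if near(observed_gbps, circuit_gbps):
        return "circuit"
    return "undetermined"


# Hypothetical measurement: transfers flatten at 1.9 Gbps on a 10 Gbps circuit
# behind a gateway assumed to forward about 2 Gbps.
print(plateau_points_at(1.9, 2, 10))
```

An "undetermined" result matters as much as a match: it says neither ceiling explains the plateau, so resizing either layer would be a guess.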

When does the gateway itself become the thing to remove from the path?

Sometimes the gateway is correctly sized and still in the way, because what you need is not more gateway throughput but no gateway in the data path at all. Latency-sensitive workloads feel every hop, and a workload that must move more than ten gigabits into a single virtual network exceeds what even the top traditional SKU can forward. Both cases point at FastPath. The pattern to recognize is throughput pinned exactly at the gateway ceiling on a large circuit, or a latency budget that the extra hop blows. The confirming step is to check that you are on an Ultra Performance or ErGw3Az gateway and using private peering, the prerequisites FastPath demands, and then to enable the gateway bypass on the connection. After enabling, verify with a connectivity test that traffic is reaching the destination on the faster path rather than assuming the toggle did its job.

How do I know both links are actually carrying the load?

Redundancy that you have not verified is a hope, not a design. The circuit gives you primary and secondary links with a BGP session each, but you should confirm both sessions are established rather than trust that they are. The pattern that bites teams is a secondary link that quietly failed weeks ago, leaving them running on a single link without knowing it, until the primary fails too and the supposedly redundant circuit goes dark. Confirm by inspecting the per-peering state and the BGP session status on both links; a healthy circuit shows both up. The lesson the model teaches here is that resilience lives across layers, and the circuit’s built-in redundancy only helps if you also monitor it, run redundant routers on your own side, and treat the gateway’s zone redundancy as part of the same picture rather than an afterthought.
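A monitoring check for this can be very small. The Python sketch below assumes you have already collected a BGP state per peering and per link from whatever tooling you use. The dictionary shape and the sample data are invented for illustration and are not the output of any real command. `Established` is the standard BGP state for a working session.

```python
def links_not_established(sessions):
    """Return the (peering, link) pairs whose BGP session is not Established."""
    return [key for key, state in sessions.items() if state != "Established"]


# Made-up sample: the secondary link of private peering failed quietly.
sessions = {
    ("private", "primary"): "Established",
    ("private", "secondary"): "Idle",
}
print(links_not_established(sessions))
```

Alerting on a non-empty result is what turns the circuit's built-in redundancy from something you assume into something you know.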

Why can I reach some destinations and not others over the circuit?

When part of what you expect to reach works and part does not, the split usually falls along the peering boundary. Private IPs inside virtual networks ride private peering and a gateway; Microsoft public services that support the circuit ride Microsoft peering and its route filters. A team with private peering healthy but Microsoft peering misconfigured will reach their virtual machines fine and fail to reach a public service over the circuit, and conclude the circuit is half broken when it is two independent relationships in two different states. The confirming step is to identify which peering should carry the failing destination and inspect that peering specifically. The fix lives in the peering that owns the destination, never in the other one, and never in the circuit as a whole when the circuit is plainly carrying the working traffic.

What happens when I advertise too many routes?

This failure wears a disguise. You add a batch of new on-premises subnets, advertise them all individually, and connectivity drops, but not for the new subnets alone, for everything. The cause is the prefix limit on private peering: exceed the allowed number of advertised prefixes and the BGP session can drop rather than ignore the surplus, and a dropped session takes its routes with it. The symptom looks like a circuit outage; the cause is an oversized advertisement. Confirm it by counting the prefixes you are advertising and comparing against the documented limit, and by checking whether the session state flapped at the moment you added routes. The fix is route summarization, aggregating many specific subnets into fewer broader prefixes, which is also simply good routing hygiene that keeps you comfortably under the ceiling as the network grows.

Is this the provider’s problem or mine?

The shared-responsibility seam between you, your provider, and Microsoft is where blame gets misassigned most often. The provider owns the physical connection from your premises to the Microsoft edge, including the carrier’s own equipment and the layer-2 link. Microsoft owns the edge routers and everything inside Azure. You own your on-premises routers, your BGP configuration, your prefix advertisements, your gateway, and your gateway SKU. When a circuit is down, the question is which segment failed, and ExpressRoute exposes a provisioning state for the provider’s side that you should read before escalating. If the provider’s provisioning state shows the circuit as not provisioned or deprovisioned on their side, the issue is theirs to complete. If their state shows provisioned and your gateway and routing are the problem, escalating to the carrier wastes a day. The series troubleshooting article on a down circuit walks this triage in depth, and the model’s contribution is simply to keep you asking which layer and which party owns the failure before you pick up the phone.

How does ExpressRoute interact with the rest of the network?

A circuit does not exist in isolation. It terminates in a virtual network, and that virtual network almost always sits inside a larger topology, so the design questions extend well past the three layers into how those layers plug into everything else. The dominant pattern, and the one worth designing toward, is to terminate ExpressRoute in a central hub and let workload networks reach on-premises through that single termination rather than each provisioning its own gateway.

The mechanism that makes this work is gateway transit over virtual network peering. When a spoke network is peered to the hub and gateway transit is enabled, the spoke can use the hub’s ExpressRoute gateway to reach on-premises, and on-premises learns the spoke’s address space through the routes the gateway advertises. One gateway serves many networks. The alternative, a gateway in every spoke, multiplies cost and operational surface for no benefit, because a gateway is not a cheap resource and managing a fleet of them is a needless burden. The hub-spoke topology article develops this design fully, and ExpressRoute is one of the shared services that most justifies the hub-and-spoke shape in the first place. Centralizing the circuit, the gateway, and the routing in a hub is the architecture that scales as the number of workloads grows.
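Whether gateway transit is really in effect shows up on the peering objects at both ends. The sketch below assumes the Az PowerShell module; `vnet-hub`, `vnet-spoke1`, and the resource group names are hypothetical.

```powershell
# Hub side: the peering toward each spoke should allow gateway transit.
(Get-AzVirtualNetwork -Name 'vnet-hub' -ResourceGroupName 'rg-hybrid').VirtualNetworkPeerings |
    Select-Object Name, PeeringState, AllowGatewayTransit

# Spoke side: the peering toward the hub should use the remote gateway.
(Get-AzVirtualNetwork -Name 'vnet-spoke1' -ResourceGroupName 'rg-workloads').VirtualNetworkPeerings |
    Select-Object Name, PeeringState, UseRemoteGateways
```

Both flags need to be true on their respective sides for the spoke to reach on-premises through the hub's gateway.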

There is nuance in how far the bypass features reach into this topology. The scenarios in which FastPath can send traffic directly to virtual machines in peered spoke networks have grown over time, and its interaction with user-defined routes and inline appliances in the data path follows specific rules. If your hub places a firewall or network virtual appliance in the path and you also want FastPath, you are in territory where the supported scenarios must be checked carefully, because bypassing the gateway can mean bypassing the very inspection point you put there on purpose. The design tension is real: FastPath optimizes the data path by removing hops, and a security architecture often wants a hop in the middle for inspection. Resolving that tension is a deliberate decision, not a default, and it is exactly the kind of trade-off the layered model helps you see clearly rather than stumble into.
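Before reasoning about what FastPath might bypass, it helps to know whether it is enabled on the connection at all. A minimal check, assuming the Az PowerShell module and a hypothetical connection named `cn-er-hub`:

```powershell
# Placeholder names; replace with your own connection and resource group.
Get-AzVirtualNetworkGatewayConnection -Name 'cn-er-hub' -ResourceGroupName 'rg-hybrid' |
    Select-Object Name, ConnectionType, ExpressRouteGatewayBypass
```

`ExpressRouteGatewayBypass` is the connection property that corresponds to FastPath. A value of true alongside an inline firewall in the hub is the combination to review against the currently supported scenarios.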

The other interaction worth naming is coexistence with a VPN gateway. ExpressRoute and a VPN can run side by side, with the VPN serving as a backup path for the circuit or carrying a subset of traffic. This is a legitimate resilience pattern: the private circuit is the primary path, and an encrypted tunnel over the internet stands ready if the circuit fails entirely, accepting the internet’s variability for the duration of an outage in exchange for staying connected. Designing that coexistence means thinking about routing preference, so that traffic uses the circuit when it is healthy and fails over to the VPN cleanly, which is once again a BGP and routing question rather than a magic setting.

ExpressRoute versus a VPN gateway: which path and why?

The counter-reading to engage head-on is the idea that ExpressRoute and a VPN gateway are interchangeable ways to connect on-premises to Azure, differing only in speed. They are not interchangeable, and the difference that decides between them is rarely raw bandwidth. It is the nature of the path.

A VPN gateway builds an encrypted tunnel across the public internet. The path is shared, the latency varies with internet conditions, and the connection is as resilient as the internet between your two endpoints, which is usually fine and occasionally not. A VPN gateway is fast to stand up, costs less, requires no carrier relationship, and is the right answer for a great many hybrid scenarios, especially where the volume is modest and the workload tolerates the internet’s variability. The series VPN gateway deep dive develops that side of the picture, including the route-based-by-default rule and the SKU choices that govern its own throughput, and it deserves a careful read before defaulting to ExpressRoute for a workload a VPN would serve perfectly well.

ExpressRoute earns its higher cost and operational weight when the requirement is one a VPN cannot meet by its nature. Consistent, predictable latency, because the path does not ride the variable internet. Higher sustained bandwidth than a VPN gateway comfortably provides. And, most decisively, a network path that genuinely avoids the public internet for compliance or data-handling reasons. When the requirement is the path itself, no VPN configuration changes the fact that a VPN traverses the internet, so the decision is made for you. When the requirement is throughput or latency, the choice is a calculation. When neither applies, a VPN is the lighter and cheaper tool, and reaching for ExpressRoute is over-engineering.
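The order of those questions can be written down as a small function. This is only an illustration of the reasoning, not a sizing tool: the function name, its parameters, and the default throughput threshold are all assumptions introduced here, and the threshold in particular should come from the VPN gateway SKU you would actually use.

```powershell
function Get-HybridPathRecommendation {
    param(
        [bool]$MustAvoidPublicInternet,
        [bool]$NeedsPredictableLatency,
        [double]$SustainedGbps,
        [double]$VpnComfortableGbps = 1.0   # assumed threshold, not a documented limit
    )
    # The path requirement decides on its own, before bandwidth is considered.
    if ($MustAvoidPublicInternet) {
        return 'ExpressRoute: the path itself is the requirement'
    }
    # Throughput or latency makes ExpressRoute a candidate to be costed, not a given.
    if ($NeedsPredictableLatency -or $SustainedGbps -gt $VpnComfortableGbps) {
        return 'ExpressRoute candidate: run the throughput and latency numbers'
    }
    return 'VPN gateway: the lighter and cheaper tool'
}

Get-HybridPathRecommendation -MustAvoidPublicInternet $false -NeedsPredictableLatency $false -SustainedGbps 0.2
```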

The full three-way decision, adding plain virtual network peering for the intra-Azure case, is the subject of the series article comparing peering, VPN, and ExpressRoute, which lays the options side by side on latency, bandwidth, cost, and whether traffic stays off the public internet. The short version, carried by this article’s model, is that ExpressRoute is the choice when the private path is a requirement rather than a nicety, and that once you have chosen it, the circuit-peering-gateway model is what you design and debug against.

How do I design ExpressRoute for production?

Designing for production means turning the model into decisions that hold up under load and survive failure, and the discipline is to make each layer’s choice deliberately rather than accept a default and discover its limits later. Take the layers in turn.

For the circuit, size the bandwidth to the sustained volume the connection must carry, with headroom for growth, and decide early between a provider circuit and ExpressRoute Direct based on the scale you need and whether you want to own the ports. Establish the provider relationship and the service key handoff as a tracked dependency, because the circuit cannot come up until the provider completes their side, and that handoff has lead time that surprises teams who treat the circuit as something they can spin up in an afternoon like a virtual machine.

For redundancy, design every layer to match the resilience the circuit already gives you. The circuit’s primary and secondary links are there by default; honor them with redundant routers and, where the budget and the stakes justify it, redundant provider connections from separate peering locations. A circuit feeding a single on-premises router or a single provider handoff has resilience on Microsoft’s half and a single point of failure on yours, which is the asymmetry that turns a redundant circuit into a false sense of security. Monitor both BGP sessions so a silently failed secondary link is an alert rather than a surprise discovered during the next primary outage.

For the gateway, choose the SKU from the traffic the virtual network will carry, prefer the zone-redundant SKU family where availability zones exist so the gateway itself survives a zone failure and keeps a clean upgrade path, and decide on FastPath based on whether the gateway will be the throughput limit or whether latency demands removing it from the data path. Remember the asymmetry of changing your mind: resizing up within a family is online, but switching families or downgrading means recreating the gateway and accepting downtime, so the SKU-family choice at the start carries more weight than the specific SKU within it.
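Checking which SKU and family an existing gateway sits in is a quick way to see how much room a later change of mind would have. The sketch assumes the Az PowerShell module and a hypothetical gateway named `vgw-er-hub`. The zone-redundancy column is a naming heuristic based on the `AZ` suffix used by the zone-redundant ExpressRoute gateway SKUs, not an authoritative property.

```powershell
# Placeholder names; replace with your own gateway and resource group.
$gw = Get-AzVirtualNetworkGateway -Name 'vgw-er-hub' -ResourceGroupName 'rg-hybrid'

$gw | Select-Object Name, GatewayType,
    @{n='Sku'; e={$_.Sku.Name}},
    # Heuristic: zone-redundant ExpressRoute gateway SKU names end in AZ.
    @{n='LooksZoneRedundant'; e={$_.Sku.Name -like '*AZ'}}
```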

For routing, treat the prefix advertisement as a design constraint from day one. Summarize on-premises routes into aggregates that stay well under the private peering prefix limit, plan the address spaces so summarization is clean, and use gateway transit in a hub-and-spoke layout so one gateway serves many workloads rather than scattering gateways across spokes. Where you run a backup VPN alongside the circuit, design the routing preference so failover is automatic and clean, and test that failover rather than assuming it works.

The thread through all of it is the same claim the article opened with. Every production decision is a decision about one of the three layers, and a design is sound when each layer’s capacity and resilience are chosen on purpose and matched to the others, so that no single layer is the silent bottleneck or the unguarded single point of failure that takes the whole connection down.

How does a circuit get provisioned, and what do the states mean?

The circuit is the only layer with a third party in the loop, and its provisioning lifecycle is where the most avoidable delays happen, so it pays to understand the states before you order. When you create the ExpressRoute resource in Azure, the circuit comes into existence in a state where Azure has done its part but the provider has not yet completed theirs. Azure generates a service key, a unique identifier for the circuit, and you hand that key to your connectivity provider. The provider uses it to provision their side, wiring the physical connection from their network into the Microsoft edge for your circuit.

Two states tell you where things stand. The circuit provisioning state reflects whether the Azure-side resource is enabled. The service provider provisioning state reflects whether the carrier has finished their work, moving from a not-provisioned condition to provisioned once they complete it. A circuit will not carry traffic until both align, and reading these states is the first thing to do when a brand-new circuit refuses to come up. If the provider’s state still shows the work as incomplete, the ball is in their court, and no amount of fiddling on the Azure side will help. This is the same triage the model demands everywhere: identify the layer and the owner before you act.
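The triage can be sketched as a switch over the provider state. The circuit and resource group names are placeholders, the Az PowerShell module is assumed, and the advice strings are this article's reading of each state rather than official guidance.

```powershell
# Placeholder names; replace with your own circuit and resource group.
$circuit = Get-AzExpressRouteCircuit -Name 'erc-primary' -ResourceGroupName 'rg-hybrid'

switch ($circuit.ServiceProviderProvisioningState) {
    'NotProvisioned' { 'Provider has not completed their side: send the service key and track the handoff.' }
    'Provisioning'   { 'Provider work is in progress: the ball is in their court.' }
    'Provisioned'    { 'Provider side is complete: look at peerings, gateway, and routing next.' }
    'Deprovisioning' { 'Provider is tearing the circuit down: confirm with them before anything else.' }
    default          { "Unrecognized state '$_': check current documentation." }
}

"Azure-side circuit state: $($circuit.CircuitProvisioningState)"
```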

The lead time on the provider side is the part that surprises teams treating a circuit like a resource they can conjure in minutes. Physical provisioning involves a carrier, sometimes a colocation facility, and a cross-connect that a human has to complete. For ExpressRoute Direct, where you own the ports into the Microsoft backbone directly, billing for the ports begins only after a grace period following creation, to allow time for that cross-connect work, which tells you something about how long Microsoft expects the physical step to take. Plan the circuit as a dependency with real lead time, request it early, and track the provider handoff as a milestone rather than an afterthought.

What does the Premium add-on change?

The base circuit has limits that a large enterprise can outgrow, and the Premium add-on is how you raise several of them at once. It is an add-on to the circuit rather than a separate SKU of gateway or a different kind of peering, and it changes capacity ceilings rather than the fundamental behavior of any layer.

The change that matters most for the routing discussion earlier is the prefix limit. A standard circuit accepts a documented number of advertised routes on private peering, historically four thousand, and the Premium add-on raises that ceiling substantially, historically to ten thousand. If your on-premises network advertises a large number of prefixes and you cannot summarize them below the standard ceiling, Premium buys you the headroom, though summarization remains the cleaner first move because fewer prefixes is good routing hygiene regardless of the ceiling. The exact route numbers are the sort of figure to confirm against current documentation, since they are precisely the kind of limit that gets revised, but the shape of the benefit is stable: Premium lifts the route ceiling.

Premium also unlocks global connectivity and raises the number of virtual networks a single circuit can serve. Global connectivity means a circuit created in one geopolitical region can reach Azure resources in other regions worldwide, which a standard circuit cannot. The increased virtual network link count matters in dense hub-and-spoke estates where one circuit feeds many networks, because the standard ceiling on links per circuit can become a constraint as the estate grows. Enabling Premium begins billing for the add-on immediately, and disabling it has its own rules, so treat it as a deliberate capacity decision tied to a real need rather than a checkbox to enable speculatively.

# Inspect a circuit's SKU tier and provider provisioning state in one view.
$circuit = Get-AzExpressRouteCircuit -Name "erc-primary" -ResourceGroupName "rg-hybrid"
$circuit | Select-Object Name,
  @{n='SkuTier';e={$_.Sku.Tier}},
  ServiceProviderProvisioningState,
  CircuitProvisioningState

When do I own the ports with ExpressRoute Direct?

There are two ways to get a circuit: through a connectivity provider, which is the common path, or through ExpressRoute Direct, where you connect into the Microsoft backbone at peering locations on port pairs you own. The provider model is right for most organizations, because the carrier handles the physical connection and you consume the circuit as a managed thing. ExpressRoute Direct is for the scale and the scenarios where owning the ports earns its keep.

Direct provides high-capacity port pairs into the Microsoft network, at tiers such as ten and one hundred gigabits, and the scenarios that benefit most are the demanding ones: massive data ingestion, physical isolation for regulated markets, and dedicated capacity for bursty workloads like large rendering jobs. Owning the ports means you can carve multiple circuits out of the same physical capacity and you control the underlying connection directly rather than through a carrier’s provisioning. The trade-off is that you take on more of the responsibility and the cost structure that comes with port ownership, including the port fee that begins after the grace period regardless of how much traffic you push.

There is also a Local circuit option aimed at moving large volumes into a nearby Azure region at a different cost profile, with its own constraints on where the traffic can land. The decision among provider circuits, Direct, and Local is a scale-and-cost calculation layered on top of the same model: whichever you choose, you still reason about the circuit as the bandwidth-at-the-boundary layer, the peerings as the reachability layer, and the gateway as the into-the-network layer. The provisioning path changes; the model does not.

How does Global Reach connect on-premises sites to each other?

ExpressRoute normally connects your on-premises network to Azure. Global Reach extends the idea sideways: it connects your on-premises sites to each other through your existing ExpressRoute circuits, using the Microsoft global network as the transit between them. If you have a circuit in one location and another circuit in a second location, Global Reach can link them so traffic between the two sites rides Microsoft’s backbone rather than a separate carrier link you would otherwise have to buy and manage.

The appeal is consolidation. An enterprise with multiple datacenters already paying for ExpressRoute to reach Azure can use the same investment to interconnect those datacenters, replacing or supplementing a wide-area network between sites with the Microsoft backbone. The conditions are worth knowing. Linking circuits within the same geopolitical region does not require the Premium add-on, while linking circuits across geopolitical regions does require Premium on both. Global Reach is billed separately from the base ExpressRoute service, with its own add-on fee per circuit and its own data charges, and Global Reach connections count against the virtual network connection limit of the circuit, so a heavily used circuit has to budget its connection capacity across both gateway connections and Global Reach links.

Global Reach sits cleanly inside the model as an extension of what the circuit and its peering can carry. It does not introduce a new gateway layer for site-to-site traffic, and it does not change how private or Microsoft peering work for Azure-bound traffic. It is the circuit doing double duty, and the design questions it raises are familiar: which regions, whether Premium is required for the pairing, and how the added connections fit within the circuit’s limits.

What about encryption over the private path?

The word private does a lot of work in ExpressRoute marketing, and it is precise about one thing and silent about another. It is precise that the path does not traverse the public internet. It is silent on encryption, because private peering traffic is not encrypted on the wire by the service itself. For many compliance regimes the private path is the requirement and that is sufficient. For threat models that demand encryption in addition to a private path, you add it deliberately.

There are two common ways to add it. On ExpressRoute Direct ports, MACsec encrypts traffic at the layer-2 link between your edge and the Microsoft edge, protecting the physical connection itself. Above that, you can run an encrypted tunnel over the private peering, carrying IPsec-protected traffic across the circuit so the payload is encrypted end to end even though the underlying path is already private. Each approach has its own configuration and its own performance considerations, and the choice depends on where in the stack your requirement sits, the physical link or the traffic flows.

The lesson the model reinforces is to keep private and encrypted as separate properties in your head. The circuit gives you a private path. Encryption is a layer you choose to add on top, with MACsec on Direct ports or an encrypted tunnel over private peering, and conflating the two has tripped up more than one security review that assumed a private connection was an encrypted one. Decide each on its own merits, document which requirement you are satisfying, and do not let the convenience of one word paper over two distinct guarantees.

How is ExpressRoute billed, and how do I keep the bill sane?

ExpressRoute is one of the more expensive Azure networking services, and the bill has several components that the layered model helps you anticipate rather than discover. The circuit carries the largest charge, and it comes in two plans. The metered plan charges a monthly fee tied to the circuit’s bandwidth tier, includes inbound data transfer at no charge, and bills outbound data per unit at rates that vary by region. The unlimited plan charges a single higher fixed monthly fee that includes all inbound and outbound transfer. The choice between them is a utilization calculation: heavy, sustained outbound volume favors unlimited, while modest or sporadic usage favors metered. You can move from metered to unlimited freely, but moving the other way has constraints, so when in doubt, starting on metered preserves flexibility.
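The utilization calculation is simple arithmetic once the prices are in hand. Every figure below is a made-up placeholder chosen only to show the shape of the comparison; real fees and outbound rates vary by bandwidth tier and region and must come from the current price sheet.

```powershell
# Hypothetical placeholder prices; none of these are real Azure rates.
$meteredMonthlyFee   = 1000.0     # fixed monthly fee on the metered plan
$unlimitedMonthlyFee = 5000.0     # fixed monthly fee on the unlimited plan
$outboundRatePerGB   = 0.025      # metered outbound charge per GB
$projectedOutboundGB = 120000.0   # measured or projected monthly outbound volume

# Outbound volume at which the two plans cost the same.
$breakEvenGB  = ($unlimitedMonthlyFee - $meteredMonthlyFee) / $outboundRatePerGB
$meteredTotal = $meteredMonthlyFee + ($projectedOutboundGB * $outboundRatePerGB)

$cheaperPlan = if ($meteredTotal -lt $unlimitedMonthlyFee) { 'Metered' } else { 'Unlimited' }

'Break-even outbound volume: {0:N0} GB per month' -f $breakEvenGB
'Metered total at projected volume: {0:N2}; cheaper plan: {1}' -f $meteredTotal, $cheaperPlan
```

Running the same arithmetic against a few months of measured outbound volume is a more defensible basis for the plan choice than a single projection.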

The gateway is a separate, ongoing charge billed by the hour for the SKU you chose, and a top-tier gateway is not cheap, which is one more reason to size it to the traffic rather than reflexively reaching for the largest SKU. FastPath itself does not carry a separate data charge in the way the circuit does, but the gateway SKU it requires, Ultra Performance or its zone-redundant equivalent, sits at the upper end of the gateway pricing, so the cost of FastPath is largely the cost of the gateway tier that enables it. The Premium add-on adds its own fee, and Global Reach adds an add-on fee per circuit plus its own data charges for site-to-site traffic.

Keeping the bill sane therefore follows the same layer discipline as everything else. Size the circuit bandwidth to real volume rather than aspiration, choose the data plan from measured utilization rather than a guess, size the gateway to the throughput the network actually needs, and enable Premium or Global Reach only when a concrete requirement justifies the add-on. The most common waste is not a mystery line item; it is an oversized gateway serving a fraction of its capacity, or an unlimited data plan on a circuit that barely moves data. Both are visible the moment you attribute spend to the layer that incurs it.

How do I monitor an ExpressRoute connection so failures are alerts, not surprises?

A connection you do not monitor is a connection whose redundancy you are taking on faith, and the recurring pattern of a silently failed secondary link is the proof that faith is not a strategy. Monitoring ExpressRoute means watching each layer for the signal that tells you it is healthy, and alerting on the change before a human notices a slowdown.

At the circuit, watch the BGP session state for both the primary and secondary links on each peering. Two sessions up is the healthy condition, and a transition to one session up is the early warning that you are running without redundancy even though nothing has failed visibly yet. Watch the bits-in and bits-out metrics against the circuit’s provisioned bandwidth so you see saturation approaching rather than discovering it as user complaints. At the gateway, watch throughput against the SKU ceiling, because a gateway pinned at its limit is the throughput-cap symptom announcing itself, and watch the gateway’s own health and any zone-level events if you run a zone-redundant SKU.
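The circuit-layer signals can be pulled from Azure Monitor. The sketch below assumes the Az PowerShell modules including Az.Monitor, placeholder resource names, and the circuit metric names `BgpAvailability`, `BitsInPerSecond`, and `BitsOutPerSecond`. The shape of the returned samples differs between Az.Monitor versions, so the example returns the metric objects without assuming a particular property layout.

```powershell
# Placeholder names; replace with your own circuit and resource group.
$circuit = Get-AzExpressRouteCircuit -Name 'erc-primary' -ResourceGroupName 'rg-hybrid'

# Provisioned bandwidth is the ceiling the bits-in and bits-out series are judged against.
"Provisioned bandwidth: $($circuit.ServiceProviderProperties.BandwidthInMbps) Mbps"

# Last hour of circuit metrics at five-minute granularity.
Get-AzMetric -ResourceId $circuit.Id `
    -MetricName 'BgpAvailability', 'BitsInPerSecond', 'BitsOutPerSecond' `
    -StartTime (Get-Date).AddHours(-1) -EndTime (Get-Date) `
    -TimeGrain 00:05:00 -AggregationType Average
```

In practice these metrics belong behind alert rules rather than ad hoc queries; the query is mainly useful for confirming the signals exist and look sane before wiring up the alerts.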

For end-to-end confidence, Connection Monitor tests reachability and latency along the actual path and can confirm that traffic is taking the route you expect, including verifying that FastPath is sending traffic on the bypassed path rather than through the gateway. The combination gives you the layered picture the model asks for: circuit-layer signals for the provider and the links, peering-layer signals for the BGP sessions and route counts, and gateway-layer signals for throughput and health. Alert on the leading indicators, the second BGP session dropping and throughput approaching a ceiling, and the failures that would otherwise be 2 a.m. surprises become tickets you handle on your own schedule.

How does Microsoft peering reach public services over the circuit?

Private peering gets most of the attention because it carries the experience people buy ExpressRoute for, but Microsoft peering deserves its own treatment because it works differently enough to cause trouble when treated as a variant of the private kind. Microsoft peering carries traffic to Microsoft’s public-facing services that support reaching them over the circuit, and it does so over public IP addresses rather than the private space of your virtual networks. There is no virtual network gateway in this picture, because nothing is being delivered into a private network; the destinations are public endpoints reached privately, over the circuit instead of the internet.

Because the destinations are public IP space, Microsoft peering involves route filters that select which service communities you want to receive. Rather than accepting every public prefix Microsoft could advertise, you attach a route filter that specifies the service regions and the categories of service whose routes you want, and the peering then learns only those. This is both a control and a frequent source of “why can I not reach this service” confusion: if the route filter does not include the community for the service you want, the routes for it are not learned, and the destination is unreachable over the circuit even though the peering is otherwise healthy. The fix lives in the filter, not in the circuit and not in any gateway.
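Checking the filter for a specific community takes a few lines. The sketch assumes the Az PowerShell module and a hypothetical route filter named `rf-services`. The community value shown is a placeholder; the real value for a given service and region can be looked up with `Get-AzBgpServiceCommunity`.

```powershell
$wanted = '12076:5010'   # placeholder community value; look up the one for your service

# Placeholder names; replace with your own route filter and resource group.
$filter  = Get-AzRouteFilter -Name 'rf-services' -ResourceGroupName 'rg-hybrid'

# Collect every community permitted by an Allow rule.
$allowed = @($filter.Rules |
    Where-Object { $_.Access -eq 'Allow' } |
    ForEach-Object { $_.Communities })

if ($allowed -contains $wanted) {
    "Community $wanted is allowed by the filter; look elsewhere in the peering."
} else {
    "Community $wanted is missing from the filter; its routes will not be learned."
}
```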

The misdiagnosis pattern here is the one the model is built to prevent. A team configures Microsoft peering, attaches a filter, reaches some services and not others, and starts inspecting the circuit as though it were failing. The circuit is carrying exactly what it was told to carry. The reachability question belongs to Microsoft peering and its route filter, a layer entirely separate from private peering and from the gateway. Naming that layer first turns a vague “the circuit seems broken” into a precise “the route filter is missing the community for this service,” which is a five-minute fix rather than a support case against the carrier.

Does ExpressRoute guarantee availability, and how do I reason about its SLA?

Microsoft publishes a service level agreement for ExpressRoute, and the temptation is to read the headline availability figure as a promise that your connection will be up that fraction of the time no matter what you do. That is not how an SLA works, and reasoning about it correctly is one more application of the layered model. The published figure is a commitment about the service Microsoft operates, contingent on the configuration being one that the SLA actually covers, and the configuration that earns the strongest commitment is a redundant one.

The circuit’s built-in primary and secondary links exist precisely so the connection can meet a high availability target through the failure of a single device on Microsoft’s side. But the commitment assumes you are using that redundancy rather than defeating it. A design that runs both links into a single on-premises router, or that terminates the circuit in a single non-zone-redundant gateway, has reintroduced single points of failure that the SLA does not insure you against, because they are on your side of the boundary or in choices you made. The way to actually achieve the availability the SLA describes is to build redundancy at every layer you control: dual on-premises edge devices, redundant provider connections where the stakes justify the spend, a zone-redundant gateway SKU where availability zones exist, and active monitoring so a degraded layer is repaired before a second failure compounds it.

The honest framing is that the SLA describes the service’s commitment under a redundant configuration, and your realized availability can be no better than the weakest layer in the path, including the ones you built. Verify the current SLA terms against Microsoft’s published agreement before you cite a number in a design document, because SLA terms and the conditions attached to them are revised over time and are exactly the kind of figure that should never be quoted from memory. The durable lesson is independent of the specific percentage: availability is a property of the whole path, the circuit’s redundancy is necessary but not sufficient, and the layers you own are usually where realized availability is actually lost.
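A little arithmetic makes the whole-path point concrete. The per-layer figures below are invented for illustration and are not SLA numbers. The sketch also assumes the layers fail independently and that traffic needs all of them, in which case the composite figure is the product of the layers and therefore never better than the weakest one.

```powershell
# Hypothetical per-layer availability figures; not SLA values.
$layers = [ordered]@{
    'On-premises edge' = 0.999
    'Circuit'          = 0.9995
    'Gateway'          = 0.9995
}

# The weakest single layer is an upper bound on the path.
$weakest = ($layers.Values | Measure-Object -Minimum).Minimum

# Under the independence assumption, the path is the product of its layers.
$composite = 1.0
foreach ($availability in $layers.Values) { $composite *= $availability }

'Weakest layer: {0:P3}' -f $weakest
'Whole path (assuming independent layers in series): {0:P3}' -f $composite
```

The layer with the lowest figure is where the next unit of redundancy spend does the most good, which in this made-up example is the on-premises edge.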

A worked example: sizing a connection from a real requirement

Abstract layers click into place once you push a concrete requirement through them, so consider a common one. A company is moving a data-heavy workload into a single Azure region, expects sustained transfers approaching several gigabits per second between their datacenter and that region, has a regulatory obligation that the traffic not cross the public internet, runs a large enterprise routing table on-premises, and wants the connection to survive the loss of any single component. Walk that through the model and every design choice falls out of a layer.

Start with the path requirement, because it is the one that is non-negotiable. Traffic must not traverse the public internet, which rules a VPN out by its nature and makes ExpressRoute the answer before bandwidth even enters the conversation. The requirement is the path, and only the private circuit satisfies it. That single fact justifies the cost and the operational weight that a lighter solution would have avoided, and it is worth writing down explicitly in the design so a future reviewer understands why the more expensive option was chosen.

Now the circuit layer. Sustained transfers approaching several gigabits mean a circuit bandwidth tier comfortably above that figure, with headroom for growth, ordered through a provider whose handoff lead time is tracked as a real dependency rather than assumed to be instant. Because the workload is data-heavy and sustained, the unlimited data plan likely beats metered, but that is a calculation to make from projected outbound volume rather than a reflex, and starting on metered preserves the option to switch if the projection proves high.

Next the peering and routing layer. The destination is private IP space inside a virtual network, so private peering is the relationship, and Microsoft peering is irrelevant unless the company separately wants public services over the circuit. The large enterprise routing table is a warning sign: advertising every on-premises subnet individually risks crossing the prefix ceiling and dropping the session. The design calls for route summarization into aggregates well under the limit, and if summarization genuinely cannot get there, the Premium add-on raises the ceiling, though summarization is tried first as the cleaner move.

Then the gateway layer. Several gigabits of sustained throughput into a single network exceeds the lower SKUs and lands on the upper tier. Choosing the zone-redundant ErGw3AZ gives both the throughput headroom and zone resilience for the gateway itself, and it keeps FastPath available should the workload later push past the gateway’s ceiling or develop a latency sensitivity. If the projected volume already brushes the ten gigabit ceiling, FastPath is enabled from the start so the gateway is not the bottleneck, with a connectivity test confirming traffic takes the bypassed path.

Finally the resilience requirement, which touches every layer. The circuit’s primary and secondary links are there by default; the company honors them with dual on-premises routers so the customer side is not the single point of failure the model warns about. The zone-redundant gateway handles a zone failure. Both BGP sessions are monitored so a silent secondary-link failure becomes an alert. Where the budget allows, a second provider connection from a separate peering location, or a backup VPN with a tested failover, removes the last single point of failure. The realized availability is the minimum across all of these, so the design deliberately raises the weakest layer rather than over-investing in one that was already strong.

The point of the walk-through is that no step required intuition or a lookup of someone else’s reference architecture. Each decision belonged to a layer, each layer answered one question, and the requirements mapped onto the circuit, the peerings, and the gateway with nothing left over. That is what it means to design from the model rather than from a template, and it is the habit the rest of the series reinforces: name the layer, answer its question, and let the architecture assemble itself from requirements you can defend.
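The throughput half of that walk-through reduces to finding the smaller of two ceilings. The function below is a deliberately simplified sketch: its name and parameters are introduced here for illustration, the figures in the calls are hypothetical, and it treats FastPath as removing the gateway from the data path entirely without modeling any other limit.

```powershell
function Get-EffectiveCeiling {
    param(
        [double]$CircuitGbps,
        [double]$GatewayGbps,
        [switch]$FastPath
    )
    # With FastPath, or when the circuit is the smaller figure, the circuit is the limit.
    if ($FastPath -or $CircuitGbps -le $GatewayGbps) {
        return [pscustomobject]@{ CeilingGbps = $CircuitGbps; LimitingLayer = 'Circuit' }
    }
    # Otherwise the gateway SKU caps what can enter the virtual network.
    [pscustomobject]@{ CeilingGbps = $GatewayGbps; LimitingLayer = 'Gateway' }
}

# Hypothetical figures: a 10 Gbps circuit behind a 2 Gbps gateway SKU.
Get-EffectiveCeiling -CircuitGbps 10 -GatewayGbps 2
Get-EffectiveCeiling -CircuitGbps 10 -GatewayGbps 2 -FastPath
```

The first call names the gateway as the limit, which is the plateau described at the start of the article. The second shows the same circuit with the gateway taken out of the data path.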

The verdict

ExpressRoute is not hard once you stop treating it as one thing. The circuit gives you a private path with physical bandwidth and built-in dual-link redundancy. The peerings decide what you can reach, with private peering carrying you to your virtual networks through a gateway and Microsoft peering carrying you to supported public services without one. The gateway terminates the private path into a specific network and bounds how fast traffic can enter it, with a SKU you must size deliberately and a FastPath option that removes the gateway from the data path when it would otherwise be the limit. Performance is set by the smallest of these. Resilience is set by the weakest of these. Reachability is set by whether the right peering exists and advertises the right routes within its limits.

Hold that frame and the product becomes legible. The throughput that plateaus below your circuit is the gateway, not the carrier. The destination you cannot reach while others work is a peering, not the circuit. The outage that takes everything at once is the circuit and the provider, not your routing. The redundancy you assumed is only real on the layers you actually built it into. Name the layer, run the one diagnostic that matches it, and change the one thing that owns the symptom. That is the circuit-peering-gateway model, and it is the working mental model the rest of the networking series builds on. When you are ready to put it into practice, run the hands-on Azure labs and command library on VaultBook, where you can model the circuit, the peerings, and the gateway against a sandbox and watch each layer behave exactly as the model predicts.

Frequently Asked Questions

Q: What is Azure ExpressRoute and how does it work?

Azure ExpressRoute is a service that gives you a private connection from your on-premises network into Microsoft’s cloud without traversing the public internet. It works as three cooperating layers. A circuit, ordered through a connectivity provider or owned as Direct ports, provides the physical bandwidth at the boundary between your network and Microsoft. Peerings, which are BGP routing relationships riding on the circuit, decide which destinations you can reach, with private peering serving your virtual networks and Microsoft peering serving supported public services. A virtual network gateway terminates the private path inside a specific network and forwards traffic to the resources there. Routes are exchanged dynamically over BGP across two links for resilience. The crucial mental model is that performance and reachability come from these layers acting together, so any question about ExpressRoute is really a question about which layer is the limit. Treating it as a single thing is the source of most confusion, and naming the layer is the start of every correct answer.

Q: What are the ExpressRoute peering types and what does each carry?

ExpressRoute supports two peering types on the same circuit, and they do entirely different jobs. Private peering connects your on-premises network to the private address space of your Azure virtual networks, reaching virtual machines and private endpoints by their private IPs. It requires a virtual network gateway to deliver traffic into a given network, and it carries the experience most teams buy ExpressRoute for. Microsoft peering connects you to Microsoft’s public-facing services that support the circuit, over their public IP addresses rather than private space, and it involves route filters that select which service communities you receive. Microsoft peering does not use a gateway, because nothing is being delivered into a private network. The two are configured and troubleshot independently, and one being healthy says nothing about the other. The most common mistake is assuming Microsoft peering grants access to your virtual machines, or treating a half-working circuit as broken when it is really two separate relationships in two different states. Identify which peering owns your destination first.

Q: How do the primary and secondary links provide redundancy?

Every ExpressRoute circuit comes with a pair of links, a primary and a secondary, terminating on separate Microsoft edge routers at the peering location. This pairing is built in rather than an option you buy, and it exists so the failure of a single device or cable on Microsoft’s side does not break your connection. Each link carries its own BGP session for each peering, and both are active at once rather than one sitting as a cold standby. When a link drops, BGP withdraws the affected routes, the surviving link carries everything, and convergence happens quickly without anyone intervening. The catch is that this redundancy protects only the segment from your provider’s edge to Azure. It does nothing for a single on-premises router, a single provider handoff, or a non-redundant gateway on your side. A redundant circuit feeding a single customer-side device still has a single point of failure, and it is one you built. Real resilience means matching the circuit’s built-in redundancy at every layer you control and monitoring both links so a silent failure raises an alert instead of going unnoticed.

Q: What does the ExpressRoute gateway SKU determine?

The gateway SKU sets a throughput ceiling on how much traffic can enter a specific virtual network over ExpressRoute, and that ceiling is independent of the circuit’s bandwidth. This is the single most expensive misunderstanding in the product, because a large circuit attached to a small gateway is throttled at the gateway, not at the circuit. The traditional SKUs form a ladder: Standard and the zone-redundant ErGw1Az at the bottom, High Performance and ErGw2Az in the middle, and Ultra Performance and ErGw3Az at the top of the traditional range. The newer scalable gateway supports much higher bandwidth and autoscales between scale units you configure. Microsoft publishes the exact throughput per SKU and revises it over time, so verify current numbers before designing. Sizing is a capacity exercise: choose the SKU from the traffic the network will carry, with headroom. Prefer the zone-redundant family where availability zones exist, because resizing up within a family is online while switching families or downgrading requires recreating the gateway and accepting downtime.
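Sizing with headroom can be sketched as picking the smallest SKU whose ceiling covers the expected peak plus a margin. The ceilings in this sketch are placeholders, not Microsoft's published figures, and the 30 percent headroom and the `pick_sku` helper are assumptions for illustration; substitute the current documented numbers before designing with it.

```python
# Placeholder ceilings in Gbps. These are NOT authoritative published figures;
# replace them with the current documented throughput per SKU.
PLACEHOLDER_CEILINGS_GBPS = {
    "ErGw1Az": 1.0,
    "ErGw2Az": 2.0,
    "ErGw3Az": 10.0,
}


def pick_sku(peak_gbps, headroom=0.3, ceilings=PLACEHOLDER_CEILINGS_GBPS):
    """Return the smallest SKU whose ceiling covers peak traffic plus headroom."""
    required = peak_gbps * (1 + headroom)
    for sku, ceiling in sorted(ceilings.items(), key=lambda item: item[1]):
        if ceiling >= required:
            return sku
    # Nothing on this ladder fits: look at the scalable gateway or FastPath.
    return None


print(pick_sku(0.5))
print(pick_sku(1.2))
print(pick_sku(9.0))
```

With these placeholder ceilings, a 1.2 Gbps peak lands on the middle SKU rather than the bottom one, because the headroom pushes the requirement past the first ceiling.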

Q: What is FastPath and when does it help?

FastPath lets traffic bypass the virtual network gateway in the data path and go directly to virtual machines, while the gateway continues handling the control plane and route exchange. It produces two benefits. Latency drops because a hop and its processing are removed from the path. Throughput is no longer capped by the gateway SKU, because the traffic is not passing through the gateway, which means a flow can exceed the gateway’s own ceiling. FastPath is the supported way to stop the gateway from being the bottleneck once the circuit is already sized for the volume you need. The requirements are precise. The gateway must be Ultra Performance or ErGw3Az; the lower SKUs cannot run it. FastPath also applies only to private peering, not to Microsoft peering. Supported scenarios for things like virtual network peering, user-defined routes, and Private Link traffic have grown and have specific conditions, so confirm current requirements against the documentation. Reach for FastPath when the gateway is the throughput limit or latency is the constraint; skip it when traffic fits comfortably under the gateway ceiling.
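The decision in that answer can be expressed as a short rule. This is a sketch under stated assumptions: the SKU strings follow the spelling used in this article, the `fastpath_advice` function and its return messages are hypothetical, and the gateway ceiling is a number you supply from current documentation.

```python
# SKUs the article names as FastPath-capable; spelling here is illustrative.
FASTPATH_SKUS = {"UltraPerformance", "ErGw3Az"}


def fastpath_advice(sku, peering, peak_gbps, gateway_ceiling_gbps,
                    latency_sensitive=False):
    """Apply the article's rule of thumb for when FastPath is worth enabling."""
    if peering != "private":
        return "not applicable: FastPath applies to private peering only"
    gateway_is_limit = peak_gbps >= gateway_ceiling_gbps
    if not (gateway_is_limit or latency_sensitive):
        return "skip: traffic fits under the gateway ceiling"
    if sku not in FASTPATH_SKUS:
        return "move to a qualifying SKU first, then enable FastPath"
    return "enable FastPath"


print(fastpath_advice("ErGw3Az", "private", peak_gbps=12, gateway_ceiling_gbps=10))
print(fastpath_advice("ErGw1Az", "private", peak_gbps=3, gateway_ceiling_gbps=1))
```

The order of the checks mirrors the text: peering type first, then whether there is a problem to solve at all, then whether the SKU qualifies.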

Q: When should I choose ExpressRoute over a VPN gateway?

The deciding factor is rarely raw bandwidth; it is the nature of the path. A VPN gateway builds an encrypted tunnel across the public internet, so its latency varies with internet conditions and its path is shared. ExpressRoute rides a dedicated private connection that never touches the public internet, giving consistent latency and a path that satisfies compliance requirements about data not transiting the internet. Choose ExpressRoute when the requirement is the private path itself, when you need consistent predictable latency, or when sustained bandwidth exceeds what a VPN comfortably carries. When the requirement is the path, no VPN configuration changes the fact that a VPN uses the internet, so the choice is made for you. Choose a VPN gateway when the volume is modest, the workload tolerates the internet’s variability, and you want a faster, cheaper setup with no carrier relationship. The two can also coexist, with the VPN serving as a backup path for the circuit. Reaching for ExpressRoute where a VPN would serve perfectly is over-engineering an expensive and operationally heavier solution.

Q: Why is my throughput lower than the circuit bandwidth I bought?

This is the signature ExpressRoute complaint, and the cause is almost always the gateway SKU rather than the circuit or the provider. Your traffic enters Azure and hits the virtual network gateway, whose SKU sets a throughput ceiling that the circuit’s larger bandwidth cannot override. If your transfers plateau at a number that matches the gateway SKU’s documented throughput, you have found the limit. Confirm by reading the gateway SKU and comparing its published throughput against what you observe. The fix lives at the gateway: resize up within the SKU family if you can, move to the scalable gateway for higher bandwidth, or, if you already run a qualifying SKU, enable FastPath so traffic bypasses the gateway entirely and is no longer bound by its ceiling. Resizing the circuit does nothing for this symptom, which is why the model insists on naming the layer before changing anything. A surprising amount of money is spent enlarging circuits that were never the constraint, when the gateway was the bottleneck the whole time.
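The diagnosis above is a comparison of ceilings, which a few lines can make explicit. This is a minimal sketch with hypothetical helper names and illustrative numbers; the ceilings you pass in must come from your circuit order and the currently published gateway figures.

```python
def limiting_layer(circuit_gbps, gateway_gbps, fastpath_enabled=False):
    """Return (layer, ceiling) for the smallest ceiling on the data path."""
    ceilings = {"circuit": circuit_gbps}
    if not fastpath_enabled:
        # With FastPath the gateway is out of the data path, so it cannot cap it.
        ceilings["gateway"] = gateway_gbps
    layer = min(ceilings, key=ceilings.get)
    return layer, ceilings[layer]


def plateau_matches(observed_gbps, ceiling_gbps, tolerance=0.1):
    """True when observed throughput sits within tolerance of a ceiling."""
    return abs(observed_gbps - ceiling_gbps) <= tolerance * ceiling_gbps


# Illustrative numbers: a 10 Gbps circuit behind a gateway capped at 2 Gbps.
print(limiting_layer(10, 2))
print(limiting_layer(10, 2, fastpath_enabled=True))
print(plateau_matches(1.95, 2.0))
```

If the observed plateau matches the gateway ceiling and `limiting_layer` names the gateway, enlarging the circuit would not change the result.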

Q: Does ExpressRoute encrypt my traffic by default?

No, and this distinction matters for security reviews. ExpressRoute gives you a private path, meaning your traffic does not traverse the public internet, but private peering traffic is not encrypted on the wire by the service itself. Private and encrypted are two separate properties, and conflating them has tripped up more than one architecture review that assumed a private connection was automatically an encrypted one. For many compliance regimes the private path alone is the requirement and that is sufficient. When your threat model also requires encryption, you add it deliberately in one of two ways. On ExpressRoute Direct ports, MACsec encrypts traffic at the layer-2 link between your edge and the Microsoft edge. Above that, you can run an IPsec tunnel over the private peering, so the payload is encrypted end to end even though the underlying path is already private. The choice depends on whether your requirement sits at the physical link or at the traffic flows. Decide each on its own merits and document which guarantee, private path or encryption, you are actually satisfying.

Q: What is the difference between a provider circuit and ExpressRoute Direct?

There are two ways to obtain a circuit. The provider model is the common path: a connectivity provider handles the physical connection from your premises into the Microsoft edge, and you consume the circuit as a managed resource. ExpressRoute Direct is the alternative where you connect into the Microsoft backbone at peering locations on high-capacity port pairs you own, at tiers such as ten and one hundred gigabits. Direct suits demanding scenarios: massive data ingestion, physical isolation for regulated markets, and dedicated capacity for bursty workloads like large rendering jobs. Owning the ports lets you carve multiple circuits from the same physical capacity and control the underlying connection directly rather than through a carrier’s provisioning. The trade-off is more responsibility and a port fee that begins after a grace period regardless of traffic. Most organizations are well served by the provider model. The decision is a scale-and-cost calculation layered on the same three-part model: whichever path you choose, you still reason about the circuit, the peerings, and the gateway in exactly the same way.

Q: What does the ExpressRoute Premium add-on give me?

Premium is an add-on to the circuit that raises several capacity ceilings rather than changing how any layer fundamentally behaves. The benefit most relevant to routing is a higher prefix limit on peering, historically lifting the accepted route count from four thousand to ten thousand, which helps when you cannot summarize your on-premises advertisements below the standard ceiling. Premium also unlocks global connectivity, letting a circuit created in one geopolitical region reach Azure resources in other regions worldwide, which a standard circuit cannot. And it raises the number of virtual networks a single circuit can serve, which matters in dense hub-and-spoke estates where one circuit feeds many networks. The exact route and link numbers are figures to confirm against current documentation, since they get revised, but the shape of the benefit is stable. Enabling Premium begins billing for the add-on immediately, so treat it as a deliberate capacity decision tied to a concrete need. Summarizing routes remains the cleaner first move for the prefix ceiling, with Premium as the answer when summarization genuinely is not enough.

Q: How many route prefixes can I advertise over private peering?

Private peering accepts a bounded number of advertised prefixes, historically four thousand on a standard circuit and ten thousand with the Premium add-on, though you should confirm current values against the documentation since limits get revised. The behavior at the ceiling is what makes this a trap. Exceed the limit and the BGP session does not quietly ignore the surplus; it can drop, and a dropped session takes all of its routes with it. The result is an outage that looks like a circuit failure but is actually an oversized advertisement, often triggered the moment someone adds a batch of new subnets and advertises them individually. Confirm it by counting the prefixes you advertise and checking whether the session state flapped when you added routes. The fix is route summarization, aggregating many specific subnets into fewer broader prefixes, which keeps you comfortably under the ceiling and is good routing hygiene regardless. Premium raises the ceiling if you genuinely cannot summarize enough, but summarization is the cleaner habit and the first thing to reach for.
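Route summarization is easy to try out before touching a router. The sketch below uses Python's standard `ipaddress` module to collapse contiguous subnets; the addressing is hypothetical, and the limit constant is the historical figure quoted above rather than a verified current value.

```python
import ipaddress

STANDARD_PREFIX_LIMIT = 4000  # historical figure quoted above; verify before use


def summarize(prefixes):
    """Collapse adjacent or overlapping prefixes into the fewest exact networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


def within_limit(prefixes, limit=STANDARD_PREFIX_LIMIT):
    """True when the advertisement count stays at or under the ceiling."""
    return len(prefixes) <= limit


# Sixteen contiguous /24 subnets (hypothetical) advertised individually.
subnets = [f"10.20.{i}.0/24" for i in range(16)]
summary = summarize(subnets)
print(len(subnets), "prefixes ->", summary)
```

Note that `collapse_addresses` only merges prefixes that combine exactly. In practice an aggregate is often chosen to cover address space reserved for future subnets as well, which is a design decision this sketch does not make for you.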

Q: What is ExpressRoute Global Reach used for?

Global Reach extends ExpressRoute sideways. Where the base service connects your on-premises network to Azure, Global Reach connects your on-premises sites to each other through your existing circuits, using Microsoft’s global network as the transit between them. If you have a circuit in one location and another in a second location, Global Reach links them so traffic between the two sites rides the Microsoft backbone rather than a separate carrier link you would otherwise buy and manage. The appeal is consolidation: an enterprise already paying for ExpressRoute to reach Azure can use the same investment to interconnect datacenters. The conditions matter. Linking circuits in the same geopolitical region does not require Premium, while linking across geopolitical regions requires Premium on both circuits. Global Reach is billed separately, with an add-on fee per circuit and its own data charges, and its connections count against the circuit’s virtual network connection limit. It sits cleanly inside the model as the circuit doing double duty, without introducing a new gateway layer for the site-to-site traffic.

Q: Do I need a gateway in every spoke virtual network?

No, and provisioning one per spoke is a common and costly mistake. The pattern to use instead is to terminate ExpressRoute in a central hub network and let spokes reach on-premises through that single gateway, using gateway transit over virtual network peering. When a spoke is peered to the hub with gateway transit enabled, the spoke uses the hub’s ExpressRoute gateway to reach on-premises, and on-premises learns the spoke’s address space through the routes the gateway advertises. One gateway serves many networks. A gateway is not a cheap resource and managing a fleet of them multiplies cost and operational surface for no benefit. This is one of the strongest justifications for the hub-and-spoke topology in the first place, since ExpressRoute is exactly the kind of shared service that belongs in a central hub. The design questions then become which networks peer to the hub, whether gateway transit is enabled, and how the address spaces are planned so routing stays clean. Centralizing the circuit, gateway, and routing in a hub is the architecture that scales as workloads multiply.

Q: Can I run a VPN gateway alongside ExpressRoute as a backup?

Yes, and it is a legitimate resilience pattern. ExpressRoute and a VPN gateway can coexist, with the private circuit serving as the primary path and an encrypted tunnel over the internet standing ready as a backup if the circuit fails entirely. The trade-off during a failover is that backup traffic accepts the internet’s variability for the duration of the outage in exchange for staying connected, which is usually a far better outcome than going dark. Designing the coexistence is a routing exercise: you set the routing preference so traffic uses the circuit while it is healthy and fails over to the VPN cleanly when it is not, and you test that failover rather than assuming it works. This pattern is especially worth considering when the circuit is a single point of failure you cannot otherwise eliminate, for example where a second provider connection is not feasible. The VPN does not need to match the circuit’s bandwidth to be valuable as a backup; it needs to carry enough to keep critical traffic flowing until the circuit recovers.

Q: How is ExpressRoute billed, metered or unlimited?

ExpressRoute offers two data plans for the circuit. The metered plan charges a monthly fee tied to the circuit’s bandwidth tier, includes inbound data transfer at no charge, and bills outbound data per unit at rates that vary by region. The unlimited plan charges a single higher fixed monthly fee that includes all inbound and outbound transfer. Choosing between them is a utilization calculation: heavy, sustained outbound volume favors unlimited, while modest or sporadic usage favors metered. You can switch a circuit from metered to unlimited, but the reverse change is restricted and has historically not been supported in place, so starting on metered preserves flexibility when you are unsure. Beyond the circuit, the gateway is a separate hourly charge for the SKU you chose, and a top-tier gateway is expensive, which is another reason to size it to the traffic. The Premium add-on and Global Reach each add their own fees. The most common waste is not a hidden line item but an oversized gateway serving a fraction of its capacity, or an unlimited plan on a circuit that barely moves data. Both become visible once you attribute spend to the layer that incurs it.
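The metered-versus-unlimited choice reduces to a break-even volume. The sketch below shows the arithmetic only: every price in it is a made-up placeholder, not a real Azure rate, and the function names are hypothetical.

```python
def monthly_cost_metered(outbound_gb, metered_fee, per_gb_rate):
    """Metered plan: fixed fee plus per-GB outbound charge (inbound is free)."""
    return metered_fee + outbound_gb * per_gb_rate


def break_even_gb(metered_fee, per_gb_rate, unlimited_fee):
    """Outbound volume at which the two plans cost the same."""
    return (unlimited_fee - metered_fee) / per_gb_rate


def cheaper_plan(outbound_gb, metered_fee, per_gb_rate, unlimited_fee):
    metered = monthly_cost_metered(outbound_gb, metered_fee, per_gb_rate)
    return "metered" if metered <= unlimited_fee else "unlimited"


# Placeholder prices for illustration only; look up current regional pricing.
METERED_FEE = 400.0
PER_GB_RATE = 0.025
UNLIMITED_FEE = 5000.0

print(round(break_even_gb(METERED_FEE, PER_GB_RATE, UNLIMITED_FEE)))
print(cheaper_plan(100_000, METERED_FEE, PER_GB_RATE, UNLIMITED_FEE))
print(cheaper_plan(300_000, METERED_FEE, PER_GB_RATE, UNLIMITED_FEE))
```

Plugging in real prices and a few months of observed outbound volume gives a defensible answer instead of a guess.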

Q: Which gateway SKU do I need for FastPath?

FastPath requires an Ultra Performance gateway or its zone-redundant equivalent, ErGw3Az. The lower SKUs, Standard, High Performance, and their zone-redundant counterparts, cannot run FastPath: the gateway still handles route exchange and the control plane while the data path bypasses it, and the feature is supported only on the top-tier SKUs. If you are currently on a lower SKU and want FastPath, you must move to a qualifying SKU first, and that move has the usual gateway-resize considerations: upgrading within the same SKU family can be done online, but switching between the non-zonal and zone-redundant families, or downgrading, requires recreating the gateway and accepting downtime. FastPath is most often the answer to a throughput or latency problem that appears once a workload grows. If you anticipate ever needing the bypass, it is worth choosing a FastPath-capable, zone-redundant SKU from the start, so that enabling it later is a configuration change rather than a gateway rebuild. The cost of FastPath is largely the cost of the upper-tier gateway it requires rather than a separate per-gigabyte charge.

Q: Does FastPath work with Microsoft peering?

No, FastPath applies to private peering, the relationship that carries traffic into your virtual networks, not to Microsoft peering. This follows from what FastPath does: it sends traffic directly to virtual machines in your virtual network, bypassing the gateway in the data path. Microsoft peering does not deliver traffic into a virtual network and does not use a virtual network gateway at all, so there is no gateway data path for FastPath to bypass in that case. If your goal is faster or lower-latency access to your virtual machines over the circuit, FastPath on private peering is the lever, provided you meet the gateway SKU requirement. If your goal is reaching Microsoft public services over the circuit, that is a Microsoft peering question governed by route filters rather than a FastPath question. Keeping the two peerings separate in your head prevents the misdiagnosis of trying to apply a private-peering optimization to public-service traffic. As always with FastPath, confirm the current supported scenarios against the documentation, because the boundaries of what FastPath supports have expanded over successive updates.

Q: What is the gateway subnet and how should I size it?

The gateway subnet is a dedicated subnet in your virtual network that holds the addresses the virtual network gateway uses, and it must be named exactly GatewaySubnet for Azure to recognize it. ExpressRoute requires this subnet to exist before you create the gateway, and the order of operations matters: subnet first, then gateway, then the connection to the circuit. Size it generously. A /27 is a workable minimum, but a /26 is the safer choice, especially if you may connect several circuits to the same gateway, because a too-small gateway subnet can constrain you later in a way that is awkward to fix once the gateway is deployed. Do not place other resources in the gateway subnet; it is reserved for the gateway’s own use. Treating the gateway subnet as an afterthought is a common setup snag, since an undersized or wrongly named subnet causes the gateway creation to fail or limits future expansion. Plan it as part of the address space design for the network, sized for the gateway and any circuits you reasonably expect to attach over the life of the deployment.
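The naming and sizing rules can be checked during address planning, before anything is deployed. This is a sketch using Python's standard `ipaddress` module; the `gateway_subnet_problems` helper and the example address ranges are hypothetical, and the /27 threshold is the minimum stated above.

```python
import ipaddress


def gateway_subnet_problems(name, cidr, vnet_cidr, max_prefixlen=27):
    """Return a list of planning problems for a proposed gateway subnet."""
    problems = []
    subnet = ipaddress.ip_network(cidr)
    vnet = ipaddress.ip_network(vnet_cidr)
    if name != "GatewaySubnet":
        problems.append("subnet must be named exactly GatewaySubnet")
    if subnet.prefixlen > max_prefixlen:
        problems.append(f"/{subnet.prefixlen} is smaller than /{max_prefixlen}")
    if not subnet.subnet_of(vnet):
        problems.append("subnet is outside the virtual network address space")
    return problems


# Hypothetical address plan.
print(gateway_subnet_problems("GatewaySubnet", "10.0.255.0/26", "10.0.0.0/16"))
print(gateway_subnet_problems("gatewaysubnet", "10.0.255.0/28", "10.0.0.0/16"))
```

Passing `max_prefixlen=26` applies the safer sizing recommended for gateways that may take on several circuits.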

Q: How do I monitor ExpressRoute so I catch a failed secondary link?

Monitor each layer for the signal that confirms its health and alert on the change before users notice. At the circuit, watch the BGP session state for both the primary and secondary links on each peering. Two sessions up is the healthy condition, and a transition to one session up is the early warning that you are running without redundancy even though nothing visible has failed yet. This is exactly the silently-failed-secondary-link scenario that turns a supposedly redundant circuit into a single point of failure no one knew about. Also watch bits-in and bits-out against the provisioned bandwidth so you see saturation approaching. At the gateway, watch throughput against the SKU ceiling, because a gateway pinned at its limit is the throughput-cap symptom announcing itself, and watch gateway and zone health if you run a zone-redundant SKU. For end-to-end confidence, Connection Monitor tests reachability and latency along the actual path and can verify that FastPath is sending traffic on the bypassed path. Alert on the leading indicators, especially the second BGP session dropping, so failures become tickets on your schedule rather than 2 a.m. surprises.
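The leading-indicator rule, two sessions up is healthy and one is a warning, can be sketched as follows. The input shape here is an assumption for illustration: a plain dictionary of session states that you would populate from whatever monitoring source you use, not the output format of any Azure API.

```python
def redundancy_alerts(sessions):
    """Turn per-peering BGP session states into alerts.

    `sessions` maps a peering name to {"primary": state, "secondary": state},
    where state is the string "up" or "down" (a hypothetical normalized form).
    """
    alerts = []
    for peering, links in sessions.items():
        up = sorted(link for link, state in links.items() if state == "up")
        if len(up) == 0:
            alerts.append((peering, "critical", "both BGP sessions down"))
        elif len(up) == 1:
            alerts.append(
                (peering, "warning", f"only {up[0]} session up; redundancy lost")
            )
    return alerts


observed = {
    "private": {"primary": "up", "secondary": "down"},
    "microsoft": {"primary": "up", "secondary": "up"},
}
print(redundancy_alerts(observed))
```

The warning fires while traffic is still flowing, which is the point: the silent loss of the secondary link becomes a ticket rather than a later outage.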

Q: Can I change my ExpressRoute gateway SKU without downtime?

It depends on the direction of the change and the SKU families involved. You can upgrade to a higher-capacity SKU within the same family without deleting and recreating the gateway, an online resize that keeps the gateway available. The non-zonal family covers Standard, High Performance, and Ultra Performance, while the zone-redundant family covers ErGw1Az, ErGw2Az, and ErGw3Az, and the scalable gateway sits in its own category with its own upgrade and migration paths. The operations that require recreating the gateway, with the downtime that implies, are downgrades and switching between the non-zonal and zone-redundant families. This asymmetry is the practical reason to choose the zone-redundant family from the start where availability zones exist: it gives the gateway zone resilience and keeps a clean online upgrade path open, so growing into more throughput later is a resize rather than a rebuild. If you anticipate ever needing FastPath, choosing a FastPath-capable SKU early avoids a disruptive family switch down the road. Always confirm the current supported resize and migration paths against the documentation before planning a change, since the tooling and supported transitions evolve.
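The asymmetry between online resizes and rebuilds can be captured in a lookup. This sketch encodes only the rules stated in this answer; the SKU strings use this article's spelling, the `classify_sku_change` function is hypothetical, and the scalable gateway is deliberately left out because its migration paths need to be checked against current documentation.

```python
# Families as described above, ordered from lowest to highest capacity.
NON_ZONAL = ["Standard", "HighPerformance", "UltraPerformance"]
ZONE_REDUNDANT = ["ErGw1Az", "ErGw2Az", "ErGw3Az"]


def classify_sku_change(current, target):
    """Classify a planned SKU change using the rules stated in the article."""
    if current == target:
        return "no change"
    for family in (NON_ZONAL, ZONE_REDUNDANT):
        if current in family and target in family:
            if family.index(target) > family.index(current):
                return "online resize"
            return "recreate gateway (downgrade)"
    known = set(NON_ZONAL) | set(ZONE_REDUNDANT)
    if current in known and target in known:
        return "recreate gateway (family switch)"
    return "unknown SKU: check documentation"


print(classify_sku_change("ErGw1Az", "ErGw3Az"))
print(classify_sku_change("Standard", "ErGw3Az"))
print(classify_sku_change("ErGw3Az", "ErGw2Az"))
```

Running a planned change through a check like this before a maintenance window makes it obvious when a resize is actually a rebuild.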