Hub-Spoke vs Virtual WAN Architecture

A network architect sketching out a multi-region Azure estate reaches the same fork every time. On one side sits the hand-built hub-spoke: a hub virtual network carrying shared services, spoke VNets peered into it, route tables steering traffic through a central firewall. On the other side sits Azure Virtual WAN: a managed service where Microsoft owns the hub, the routing engine, and most of the transit plumbing, and the team connects spokes and branches without wiring the mesh by hand. Both deliver hub-and-spoke connectivity. Both centralize egress through a firewall. Both can span regions. So the question that stalls the design review is rarely “which one connects VNets” and almost always “which one will this team still want to operate when the estate has thirty spokes across five regions and a dozen branch offices.”


That second framing is the one this guide answers. The choice between a manual hub-spoke and Virtual WAN is not a feature checklist where one product wins every row. It is a trade between two things an engineering organization values differently depending on its size, its skills, and its appetite for operational work. A hand-built hub-spoke gives the team total control over every route, every peering, every firewall rule, at the price of building and maintaining that machinery itself. Virtual WAN gives the team managed scale, a routing engine that handles transit and inter-hub connectivity automatically, at the price of surrendering some of that fine-grained control to Microsoft’s abstractions. The decision rule that follows from that trade is the spine of this entire article, and it is short enough to state before the first diagram: choose hub-spoke when control matters more than operational savings, and choose Virtual WAN when managed scale matters more than control.

The reason the decision feels harder than that sentence is that the two designs overlap heavily at small scale. With three spokes in one region, a manual hub-spoke and a single-hub Virtual WAN deployment look almost identical to the workloads running inside them. Traffic flows the same way, the firewall inspects the same packets, the latency is comparable. The differences only become decisive as the estate grows, as regions multiply, as branch connectivity enters the picture, and as the route tables that were once a single readable artifact balloon into a maintenance burden that consumes a meaningful slice of the platform team’s week. An honest comparison has to walk the reader from the small case, where the choice barely matters, to the large case, where it dominates the operating model. That is the journey this article takes.

The InsightCrunch topology decision table later in this guide rates the two options on the four axes that actually move the decision: scale, routing complexity, control, and cost. Before that table can mean anything, both designs need to be laid out plainly, the Azure services that realize each need to be named, and the routing model that separates them needs to be understood at the level of what a packet actually does. This guide builds that foundation first, then walks a reference design for each topology, then puts the decision rule to work against the recurring scenarios engineers report, and finally maps the migration path for teams that started on hub-spoke and have outgrown it. The Azure hub-spoke topology has its own dedicated walkthrough at hub-spoke network topology explained, and Virtual WAN has a full deep dive at Azure Virtual WAN deep dive; this article assumes neither and explains both, but those companions go deeper on each design in isolation than a comparison can.

What each topology actually is

Both designs answer the same connectivity problem, so the cleanest way to understand the difference is to define each one by what the team builds, what Azure builds, and where the boundary between those two falls. That boundary is the whole story. In a manual hub-spoke the team owns nearly everything above the VNet primitive. In Virtual WAN the team owns the spokes and the policy intent, and Microsoft owns the transit core. Everything else in this comparison is a consequence of where that line sits.

What is a hand-built hub-spoke topology?

A hand-built hub-spoke is a hub virtual network that holds shared services and connectivity components, with workload virtual networks peered to it as spokes. The hub typically carries the firewall, the VPN or ExpressRoute gateway, and shared platform services such as DNS resolvers. Spokes reach each other and the internet by routing through the hub, which the team wires explicitly.

The defining property of the manual design is that the team builds and owns the transit logic. Two spokes peered to the same hub do not talk to each other automatically, because VNet peering is not transitive. For spoke A to reach spoke B, the team must route spoke A’s outbound traffic to a next hop in the hub, usually the firewall, and route the firewall’s traffic onward to spoke B. That routing lives in user-defined route tables that the team writes, associates to subnets, and maintains as the estate changes. The hub itself is an ordinary virtual network; nothing about it is special to Azure except the gateway transit and peering settings the team enables. The control plane is entirely in the team’s hands, which is the source of both its flexibility and its operational weight.

A hub-spoke estate has a natural shape. The hub holds the shared firewall, the gateways that terminate on-premises connectivity, and shared platform services. Each spoke holds one workload or one environment, peered to the hub with gateway transit enabled so the spoke can use the hub’s gateway, and with a route table that forces traffic to the firewall as the next hop. When a second region appears, the team builds a second hub there and connects the two hubs, either through global VNet peering between the hubs or through the gateways, and then decides how cross-region spoke-to-spoke traffic should route. Each of those decisions is a deliberate act of design, written down in route tables and peering configuration that the team can read, audit, and change. That readability is the manual design’s quiet strength, and the volume of it is the manual design’s quiet cost.

What is Azure Virtual WAN?

Azure Virtual WAN is a managed networking service in which Microsoft operates the hub, called a virtual hub, and the routing engine that connects everything attached to it. The team creates a Virtual WAN resource, deploys one or more virtual hubs into chosen regions, and connects spokes, branches, and users to those hubs. Microsoft handles the transit routing between attachments, including inter-hub connectivity, without the team writing peering meshes by hand.

The defining property of Virtual WAN is that the transit core is a managed service rather than a set of primitives the team assembles. When a virtual network connects to a virtual hub, the team does not configure VNet peering explicitly; the connection establishes transit automatically. When two hubs exist in the same Virtual WAN, Microsoft connects them so that a spoke on one hub can reach a spoke on the other without the team building a hub-to-hub mesh. Branches connecting over site-to-site VPN, remote users connecting over point-to-site VPN, and on-premises sites connecting over ExpressRoute all attach to the hub and reach each other through the same managed routing fabric. The team expresses what it wants through route tables on the hub and, more powerfully, through routing intent, and Microsoft realizes that intent in the underlying route distribution.

Virtual WAN comes in two types, Basic and Standard, where Basic supports only site-to-site VPN and Standard unlocks the full set of capabilities including ExpressRoute, point-to-site VPN, inter-hub transit, and routing intent. Production designs that need any-to-any connectivity use Standard. The virtual hub in Standard Virtual WAN can host an Azure Firewall instance, which turns it into a secured hub, giving the same centralized inspection that a manual hub-spoke achieves with a firewall in the hub VNet, but with the routing to and from that firewall managed by the platform rather than expressed in hand-written route tables. The shape of a Virtual WAN estate is therefore flatter to describe: a Virtual WAN resource, a hub per region, attachments to each hub, and a routing policy that says where traffic should be inspected. The mesh that a manual design spells out explicitly is, in Virtual WAN, implied by the attachments and the policy.

Where the two designs converge and where they split

At one region and a handful of spokes the two designs converge to the point where the choice is mostly about preference and existing skills. A single manual hub with three spokes and a firewall, and a single virtual hub with three connected VNets and a secured-hub firewall, route traffic the same way from a workload’s point of view. The packet leaves the spoke, reaches the firewall, gets inspected, and continues to its destination. Latency, inspection, and reachability are equivalent. A team comfortable with route tables might find the manual design more transparent at this size, and a team that wants less to operate might prefer the managed hub. Neither choice is wrong at small scale, which is exactly why small-scale comparisons mislead.

The split opens as the estate grows along three independent dimensions. The first is the number of spokes, where the manual design’s route tables and peerings multiply and the managed design’s attachments stay flat. The second is the number of regions, where the manual design’s hub-to-hub connectivity becomes a design problem the team solves repeatedly and the managed design’s inter-hub transit is handled by the platform. The third is the variety of attachment types, where adding branch VPN, remote-user VPN, and ExpressRoute to a manual hub means standing up and operating each gateway and stitching its routes in, while in Virtual WAN those attachments plug into the same managed fabric. The decision rule turns on which of those dimensions the estate will actually exercise. An estate that will stay small on all three rarely needs Virtual WAN. An estate that will grow on any of the three starts to pay the manual design’s operational tax, and that tax is what Virtual WAN is built to remove.

The Azure services that realize each topology

Naming the components makes the trade concrete, because the difference between the two designs is largely a difference in who operates which component. The manual design uses a longer list of services the team configures directly. Virtual WAN folds several of those into the managed hub, leaving the team a shorter list to operate but a thinner set of knobs to turn.

Components of a manual hub-spoke

A manual hub-spoke is assembled from standard Azure networking primitives. The hub is a virtual network. Inside or peered to it sits a firewall, usually Azure Firewall, which inspects and controls traffic between spokes and to the internet. Connectivity to on-premises terminates on a VPN gateway for site-to-site or point-to-site tunnels, or on an ExpressRoute gateway for private circuits, both deployed in the hub. Spokes connect through VNet peering, with gateway transit enabled so spokes can use the hub’s gateway and remote networks can reach the spokes. Traffic steering is the team’s responsibility, expressed in user-defined routes inside route tables associated to subnets, typically pointing the default route at the firewall’s private address.

The operational surface of that list is wide. The team sizes and scales the firewall, patches and monitors the gateways, manages the peering relationships as spokes come and go, and most consequentially writes and maintains the route tables. Each spoke needs a route table that sends traffic to the firewall, and the firewall needs routes that reach every spoke. When the estate forces traffic through a central appliance, the route table on each spoke and the routes describing every other spoke’s address space become a living document the team edits on every change. Azure caps user-defined routes at a finite number per route table, a figure to confirm against the current subscription and service limits before designing near it, and that cap quietly sets a ceiling on how many spokes a single firewall-forced hub can serve before the design needs to split. The team also owns the address-space planning that keeps spokes non-overlapping, the DNS resolution strategy in the hub, and the diagnostics that prove traffic is taking the path the route tables intend.
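The diagnostics mentioned above can be partly scripted. The sketch below is a minimal illustration, not a complete validation tool. It assumes the effective routes of a workload NIC have been exported as tab-separated rows of address prefix, next-hop type, next-hop IP, and state, and it checks only that the active default route points at the firewall. The function name `check_default_route`, the NIC name, and the column layout are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Reads effective routes as TSV on stdin: prefix <TAB> nextHopType <TAB> nextHopIp <TAB> state
# Succeeds only if an Active 0.0.0.0/0 route uses the expected virtual appliance IP.
check_default_route() {
  expected_ip="$1"
  awk -F'\t' -v ip="$expected_ip" '
    $1 == "0.0.0.0/0" && $4 == "Active" {
      found = 1
      if ($2 == "VirtualAppliance" && $3 == ip) ok = 1
    }
    END {
      if (!found) { print "no active default route present"; exit 1 }
      if (!ok)    { print "default route does not point at " ip; exit 1 }
      print "default route -> " ip
    }'
}

# Example feed (hypothetical NIC name; the --query shape is an assumption to
# verify against the real command output before relying on it):
#   az network nic show-effective-route-table \
#     --resource-group rg-spoke-a --name nic-workload \
#     --query "value[].[addressPrefix[0], nextHopType, nextHopIpAddress[0], state]" \
#     --output tsv | check_default_route 10.0.1.4
```

Running a check like this after every topology change is one way to catch a missing or mistyped next hop before a workload owner reports it.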

Components of a Virtual WAN deployment

A Virtual WAN deployment starts with the Virtual WAN resource itself, which is the global container, and one or more virtual hubs deployed into regions. Each virtual hub is the managed equivalent of the hub VNet, but the team does not build its internals. Virtual networks connect to the hub through virtual network connections, which establish transit without explicit peering configuration. Branch sites connect through the VPN gateway that the hub hosts, remote users through the point-to-site gateway, and private circuits through the ExpressRoute gateway, all provisioned as part of the hub rather than as separate VNets the team wires in. A secured virtual hub adds an Azure Firewall managed inside the hub, giving centralized inspection without a separate firewall VNet and its route tables.

The component the team operates most deliberately in Virtual WAN is routing, but the form it takes is different. Instead of user-defined routes per spoke, the team works with hub route tables that handle association and propagation, and with routing intent, which lets the team declare that private traffic, internet traffic, or both should pass through the security solution in the hub. Microsoft’s documentation states that each virtual hub can carry at most one internet traffic routing policy and one private traffic routing policy, each pointing to a single next hop such as the hub’s Azure Firewall. That declarative model replaces the per-spoke route tables of the manual design with a per-hub statement of intent, and the platform distributes the routes that realize it. The team still owns address planning and DNS, but the transit mesh, the inter-hub connectivity, and the route propagation that the manual design hand-builds are now operated by the service.

How the component lists map onto operating cost

Reading the two lists side by side shows where the operational savings of Virtual WAN come from and what they cost. The manual design’s longer list is not inherently worse; every item on it is a knob the team can turn precisely. The firewall sizing, the gateway choice, the exact next hop for every route, the precise peering settings between every pair of networks all sit in the team’s hands. That is control in its literal sense. Virtual WAN’s shorter list is shorter because the platform absorbed the transit components, and the routing-intent model absorbed the per-spoke route tables. The savings are real and grow with scale, because the absorbed components are exactly the ones whose count rises as the estate grows. The cost of those savings is that the absorbed components are no longer the team’s to shape in fine detail. A design that depends on a routing behavior Virtual WAN’s abstractions do not express has to either accept the abstraction or stay on the manual design where that behavior is expressible. That tension, abstraction versus expressiveness, is the same scale-versus-control trade seen from the angle of the component list.

How does routing complexity compare between the two?

Routing complexity is the single biggest practical difference. A manual hub-spoke expresses transit through user-defined routes the team writes and maintains per spoke, so complexity grows with the estate. Virtual WAN expresses transit through routing intent, a declarative per-hub policy the platform realizes, so complexity stays roughly flat as the estate grows. This is the axis where the designs diverge most.

To see why, follow a single packet in each design from one spoke to another. In a manual hub-spoke, spoke A wants to reach spoke B. Because peering is not transitive, spoke A’s subnet route table must carry a route that sends traffic destined for spoke B, or more commonly the default route covering all non-local traffic, to the firewall’s private address as the next hop. The firewall, sitting in the hub, receives the packet, applies its rules, and forwards it. For the firewall’s return path and for the firewall to reach spoke B, the hub and the firewall subnet need routes that describe spoke B’s address space and resolve to the peering toward spoke B. Every spoke that should be reachable from every other spoke contributes to this picture. The route tables are readable, but their number and the cross-references between them rise with the count of spokes, and the cap on routes per table eventually forces summarization or a design split. The team carries this complexity as a maintenance task on every topology change.

In Virtual WAN, the same spoke-to-spoke flow is expressed once at the hub. The team connects spoke A and spoke B to the virtual hub and configures a private traffic routing policy that names the hub’s Azure Firewall as the next hop. From that single statement, the platform programs the routes so that traffic between the two connected VNets passes through the firewall. The team did not write a route table on spoke A pointing at the firewall, and it did not maintain a route on spoke B describing spoke A. It declared the intent, that private traffic should be inspected, and the service distributed the routes. Adding a third spoke does not add route-table maintenance; the new connection inherits the same policy. This is the structural reason Virtual WAN’s routing complexity stays flat. The intent is a constant-size artifact, and the platform scales the realization.

Why routing intent replaces user-defined routes

Routing intent is a declarative model. Rather than writing each route and choosing its next hop, the team states which classes of traffic, private and internet, should flow through the hub’s security solution, and the platform computes and distributes the routes. The advantage is that one statement covers all attachments to the hub, present and future, instead of a route table per spoke that the team edits on every change.

The mechanics are worth understanding because they explain both the power and the limits of the model. When an internet traffic routing policy is configured on a hub, Virtual WAN advertises a default route to all spokes and gateways attached to that hub, so their internet-bound traffic is drawn to the hub’s next-hop security solution, inspected, and then forwarded. When a private traffic routing policy is configured, the address prefixes of branches and virtual networks are treated as a single private space and steered through the same next hop. Microsoft’s documentation describes both branch and virtual-network prefixes being handled as one entity within the routing-intent concept, which is what lets a single policy cover the whole hub. The model’s limit is the same as its strength: one internet policy and one private policy per hub, each with one next hop. A design that needs several distinct next hops for different slices of private traffic on the same hub is pushing against the abstraction, and that is precisely the kind of fine-grained control the manual design retains. Static routes on a connection can extend the model for specific destinations, a documented pattern for sending selected prefixes to an NVA in a spoke while the rest follows the policy, but the moment a design needs many such exceptions it is signalling that it values control over the managed simplicity Virtual WAN offers.

What the route-table burden looks like at thirty spokes

The difference becomes visceral at scale. Picture an estate of thirty spokes across three regions that forces all traffic through a central firewall. In the manual design, each spoke carries a route table, and the firewall path needs routes describing the address spaces it must reach. The team maintains these by hand or, more sustainably, through infrastructure-as-code that generates them, but either way the artifact count and the cross-region hub-to-hub routing are the team’s to design, test, and keep correct. A single mistyped next hop or a missing return route produces a connectivity failure that the team diagnoses by reading route tables and effective routes. In Virtual WAN, the same thirty spokes are thirty connections across three hubs, the hubs are connected by the platform, and a routing-intent policy on each hub sends private traffic through the secured hub’s firewall. The artifact the team maintains is the set of connections and the policy, not thirty route tables and a hub-to-hub mesh. The connectivity reference in VNet peering vs VPN vs ExpressRoute covers how the underlying connection types differ, which matters because both designs ultimately rest on those same peering and gateway primitives even though Virtual WAN hides the peering behind connections.
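The artifact counts in that picture can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a measurement. It assumes one route table per spoke, a peering object on each side of every hub-spoke link, and a full mesh of hub-to-hub peerings in the manual design, against one connection per spoke and one routing intent per hub in Virtual WAN. Firewall rules, which both designs need, are left out. The function names are made up for this example.

```shell
#!/usr/bin/env bash
# Rough count of hand-maintained routing artifacts in a manual hub-spoke.
count_manual() {  # usage: count_manual <spokes> <hubs>
  spokes=$1; hubs=$2
  route_tables=$spokes                   # one user-defined route table per spoke
  peerings=$(( spokes * 2 ))             # each hub-spoke peering is configured on both sides
  hub_links=$(( hubs * (hubs - 1) ))     # assumed full hub-to-hub mesh, both directions
  echo $(( route_tables + peerings + hub_links ))
}

# Rough count of team-maintained artifacts in Virtual WAN.
count_vwan() {  # usage: count_vwan <spokes> <hubs>
  spokes=$1; hubs=$2
  echo $(( spokes + hubs ))              # one connection per spoke, one routing intent per hub
}

echo "manual hub-spoke: $(count_manual 30 3)"   # prints "manual hub-spoke: 96"
echo "virtual wan:      $(count_vwan 30 3)"     # prints "virtual wan:      33"
```

The absolute numbers matter less than the shape: the manual count carries a multiplier on the spoke count and a quadratic term on the hub count, while the Virtual WAN count is a plain sum.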

The following code shows the contrast in concrete terms, with a representative user-defined route table for one spoke in the manual design and the routing-intent configuration for a hub in Virtual WAN. The manual snippet must be repeated and adjusted per spoke; the Virtual WAN snippet covers every attachment to the hub.

```shell
# Manual hub-spoke: a route table for ONE spoke, forcing traffic to the firewall.
# This pattern repeats for every spoke, and the firewall path needs reachability
# to every spoke address space in turn.
az network route-table create \
  --resource-group rg-hubspoke \
  --name rt-spoke-a

az network route-table route create \
  --resource-group rg-hubspoke \
  --route-table-name rt-spoke-a \
  --name default-to-firewall \
  --address-prefix 0.0.0.0/0 \
  --next-hop-type VirtualAppliance \
  --next-hop-ip-address 10.0.1.4   # the hub firewall private IP

# The route table lives in rg-hubspoke while the subnet lives in rg-spoke-a, so
# pass the route table by resource ID; a bare name is resolved in the subnet's
# own resource group and would not be found.
rt_id=$(az network route-table show \
  --resource-group rg-hubspoke \
  --name rt-spoke-a \
  --query id --output tsv)

# Associate the route table to the spoke subnet. Repeat the create+associate
# cycle for spoke B, spoke C, ... and revisit on every topology change.
az network vnet subnet update \
  --resource-group rg-spoke-a \
  --vnet-name vnet-spoke-a \
  --name snet-workload \
  --route-table "$rt_id"
```

```shell
# Virtual WAN: routing intent declared ONCE on the hub covers all attachments.
# No per-spoke route tables; the platform distributes the routes.
# Requires the Azure CLI virtual-wan extension.
az network vhub routing-intent create \
  --resource-group rg-vwan \
  --vhub hub-eastus \
  --name hub-eastus-intent \
  --routing-policies '[
    {
      "name": "PrivateTrafficPolicy",
      "destinations": ["PrivateTraffic"],
      "next-hop": "/subscriptions/<sub>/resourceGroups/rg-vwan/providers/Microsoft.Network/azureFirewalls/fw-hub-eastus"
    },
    {
      "name": "InternetTrafficPolicy",
      "destinations": ["Internet"],
      "next-hop": "/subscriptions/<sub>/resourceGroups/rg-vwan/providers/Microsoft.Network/azureFirewalls/fw-hub-eastus"
    }
  ]'
```

The shapes of these two snippets are the comparison in miniature. One scales linearly with the spoke count and lives in the team’s hands. The other is a fixed artifact the platform expands. Confirm the exact command syntax and the routing-intent schema against the current Azure CLI reference before running them, since the networking command surface changes between releases.

A reference design walked through: the manual hub-spoke

To judge the trade fairly, both designs deserve a walkthrough at a realistic size rather than a toy diagram. Take a company with a production environment, a non-production environment, a shared-services need, and on-premises connectivity, growing toward a second region. This is the size where a manual hub-spoke is still comfortable and where many estates live for years.

The hub virtual network anchors the design. It carries an Azure Firewall that inspects all east-west and north-south traffic, a VPN gateway or ExpressRoute gateway terminating the on-premises connection, and shared services such as private DNS resolvers and a jumpbox subnet for administrative access. The hub’s address space is planned to leave room for these subnets without overlap, because every spoke that joins later must also avoid overlapping the hub and every sibling spoke. Address planning is unglamorous and decisive; a manual hub-spoke that did not reserve enough contiguous space early pays for it later in renumbering.

Each spoke is a workload or environment virtual network peered to the hub. The production spoke, the non-production spoke, and any application-specific spokes each peer to the hub with two settings that matter. Enabling “Allow gateway transit” on the hub side of the peering and “Use remote gateways” on the spoke side lets the spoke reach on-premises through the hub’s gateway without deploying a gateway of its own. A route table on each spoke subnet sends the default route to the firewall’s private address, so traffic leaving the spoke is inspected before it reaches another spoke or the internet. The firewall’s own subnet and the hub carry routes that let inspected traffic reach each spoke’s address space. The result is a readable topology: the team can point at the route table on the production spoke and say exactly where its traffic goes and why.
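The two peering settings described here map onto flags of `az network vnet peering create`. The sketch below is a minimal illustration under stated assumptions: the resource group and VNet names are hypothetical, `<sub>` stands in for a real subscription ID, and the `run` helper only prints each command unless `APPLY=1` is exported, so the output can be reviewed before anything is created.

```shell
#!/usr/bin/env bash
# Dry-run helper: prints each command; export APPLY=1 to execute it for real.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

SUB="${SUBSCRIPTION_ID:-<sub>}"   # placeholder until a real subscription ID is supplied

vnet_id() {  # usage: vnet_id <resource-group> <vnet-name>
  echo "/subscriptions/${SUB}/resourceGroups/$1/providers/Microsoft.Network/virtualNetworks/$2"
}

peer_spoke_to_hub() {  # usage: peer_spoke_to_hub <hub-rg> <hub-vnet> <spoke-rg> <spoke-vnet>
  # Hub side: offer the hub's gateway to the spoke.
  run az network vnet peering create \
    --resource-group "$1" --vnet-name "$2" --name "$2-to-$4" \
    --remote-vnet "$(vnet_id "$3" "$4")" \
    --allow-vnet-access --allow-forwarded-traffic --allow-gateway-transit

  # Spoke side: use the hub's gateway and accept traffic forwarded by the firewall.
  # This call is rejected if the hub does not yet have a gateway deployed.
  run az network vnet peering create \
    --resource-group "$3" --vnet-name "$4" --name "$4-to-$2" \
    --remote-vnet "$(vnet_id "$1" "$2")" \
    --allow-vnet-access --allow-forwarded-traffic --use-remote-gateways
}

# Hypothetical names; repeat once per spoke.
peer_spoke_to_hub rg-hubspoke vnet-hub rg-spoke-a vnet-spoke-a
```

Peering is created once per side, which is why every new spoke adds two peering objects in addition to its route table.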

Walking a packet through the manual design

Consider a request from a virtual machine in the production spoke to a database in the shared-services area of the hub, and a second request from production to non-production. The first request leaves the production VM, hits the spoke’s route table, and follows the default route to the firewall. The firewall applies its network and application rules, permits the flow to the database subnet, and forwards it. The return traffic follows the firewall’s routes back to the production spoke. The second request, production to non-production, takes the same shape: out to the firewall, inspected, and only if a rule permits does it continue to the non-production spoke. Because peering is not transitive, there is no path from production to non-production that bypasses the firewall unless the team deliberately creates direct peering between the spokes, which most security designs forbid precisely so that all cross-spoke traffic is inspected.

This is the manual design at its best. Every flow is explicit, every inspection is guaranteed by routing the team controls, and an auditor can trace any path by reading configuration the team wrote. When the second region arrives, the team builds a second hub there with its own firewall and gateway, connects the two hubs, and decides how a spoke in region one reaches a spoke in region two. That decision, hub-to-hub connectivity and the routes that make cross-region spoke traffic transit both firewalls or just one, is a genuine design choice the team makes and documents. At two regions and a dozen spokes this is still tractable. The walkthrough shows both the strength, total legibility, and the seed of the cost: every flow and every cross-region path is the team’s to design and maintain.

Where the manual design starts to strain

The strain appears not as a single failure but as a rising slope of effort. Adding the tenth spoke is more work than adding the third, because the route tables, the firewall rules, and the address plan all have more existing state to respect. Adding the second region multiplies the hub-to-hub routing decisions. Adding branch offices over VPN means standing up and operating gateways and threading their routes through the same firewall. None of these is hard in isolation; the cost is their accumulation and the fact that each change touches hand-maintained artifacts that must stay mutually consistent. Teams that adopt infrastructure-as-code, generating route tables and peerings from a module, push this slope outward considerably and keep the manual design viable far longer. The strain is the operational tax that the decision rule weighs against control, and it is real even when it is managed well. The landing-zone foundation that many of these estates sit inside is covered in Azure landing zones explained, and a connectivity subscription built as a manual hub-spoke is one of the most common shapes that foundation takes.

A reference design walked through: Virtual WAN

Now take the same company and the same requirements and build them on Virtual WAN, so the comparison is like for like. The differences are not in what connects to what; they are in what the team builds versus what the platform provides.

The design begins with a Virtual WAN resource of the Standard type, since the requirements include ExpressRoute or VPN to on-premises and the intent is any-to-any connectivity. Into it the team deploys a virtual hub in the first region. The hub is secured by deploying an Azure Firewall inside it, which makes it a secured virtual hub and gives the same centralized inspection the manual design achieves with a firewall in the hub VNet. The production VNet, the non-production VNet, and the shared-services VNet connect to the hub as virtual network connections. On-premises connectivity terminates on the hub’s VPN or ExpressRoute gateway, provisioned as part of the hub. Branch sites and remote users, if present, attach to the hub’s gateways as well. The team did not build a hub VNet, did not configure peering between the hub and each spoke, and did not write a route table per spoke.

The routing is expressed through routing intent. The team configures a private traffic routing policy and an internet traffic routing policy on the hub, both naming the hub’s Azure Firewall as the next hop. From those two statements, the platform programs the routes so that traffic between connected VNets, and traffic to the internet, passes through the firewall. The production-to-non-production flow is inspected because the private policy steers it through the firewall, exactly as the manual design’s route tables did, but the team declared it once rather than writing it per spoke. When the second region arrives, the team deploys a second virtual hub there, connects its VNets, and applies the same routing intent. The platform connects the two hubs automatically, so a spoke on the first hub reaches a spoke on the second without the team designing hub-to-hub routing. The inter-hub transit that was a deliberate design choice in the manual world is, in Virtual WAN, a property of attaching both hubs to the same Virtual WAN.
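The build steps for this reference design can be sketched with the Azure CLI `virtual-wan` extension. This is an illustration under stated assumptions, not a deployment script: every resource name, region, and hub address prefix below is hypothetical, the firewall and routing intent are left out, and the `run` helper only prints each command unless `APPLY=1` is exported.

```shell
#!/usr/bin/env bash
# Dry-run helper: prints each command; export APPLY=1 to execute it for real.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

SUB="${SUBSCRIPTION_ID:-<sub>}"   # placeholder until a real subscription ID is supplied

vnet_id() {  # usage: vnet_id <resource-group> <vnet-name>
  echo "/subscriptions/${SUB}/resourceGroups/$1/providers/Microsoft.Network/virtualNetworks/$2"
}

build_hub() {  # usage: build_hub <region> <hub-address-prefix> <vnet-resource-id>...
  region="$1"; prefix="$2"; shift 2
  run az network vhub create \
    --resource-group rg-vwan --name "hub-${region}" --vwan vwan-corp \
    --location "$region" --address-prefix "$prefix" --sku Standard
  # One connection per VNet; no peering or per-spoke route table is written.
  for id in "$@"; do
    run az network vhub connection create \
      --resource-group rg-vwan --vhub-name "hub-${region}" \
      --name "conn-${id##*/}" --remote-vnet "$id"
  done
}

# Standard type, since the design needs ExpressRoute or VPN plus any-to-any transit.
run az network vwan create \
  --resource-group rg-vwan --name vwan-corp --type Standard --location eastus

# First region: production, non-production, and shared services.
build_hub eastus 10.100.0.0/23 \
  "$(vnet_id rg-prod vnet-prod)" \
  "$(vnet_id rg-nonprod vnet-nonprod)" \
  "$(vnet_id rg-shared vnet-shared)"

# Second region: another hub in the same Virtual WAN, with no hub-to-hub wiring.
build_hub westeurope 10.101.0.0/23 "$(vnet_id rg-prod-weu vnet-prod-weu)"
```

The second region is a second call with different arguments, which is the flat growth the walkthrough describes.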

Walking a packet through Virtual WAN

Trace the same two requests. Production to shared services: the packet leaves the production VNet, and because a private traffic routing policy is in effect, the route the platform distributed sends it to the hub’s firewall, which inspects and forwards it to the shared-services VNet. Production to non-production: same path, drawn to the firewall by the private policy, inspected, and forwarded only if a rule allows. To the workloads, this is indistinguishable from the manual design. The packet is inspected, the rule decides, and reachability follows policy. What differs is upstream of the packet: no team member wrote the route that pulled the packet to the firewall. The routing-intent policy did, and it does so for every connection on the hub including ones added later.

Now extend to the cross-region case that strained the manual design. A production spoke in region one needs to reach a workload spoke in region two. In Virtual WAN, both hubs belong to the same Virtual WAN, the platform has connected them, and the routing intent on each hub steers private traffic through that hub’s firewall. The cross-region flow transits the managed inter-hub path and is inspected according to each hub’s policy, with no hub-to-hub mesh for the team to design. This is the scenario where scale tilts the decision toward Virtual WAN, and the walkthrough shows why: the work that grew with the manual design, hub-to-hub routing and per-spoke route tables, simply is not the team’s work here.

Where Virtual WAN asks the team to give something up

The Virtual WAN walkthrough is not free of cost; it asks for control in places the manual design grants freely. The one-internet-policy-and-one-private-policy-per-hub model is clean but constrained. A design that wants different private next hops for different workload classes on the same hub, sending some traffic to one inspection appliance and other traffic to another within a single region, fits the manual design’s per-route next hops more naturally than Virtual WAN’s per-hub policy. Static routes on connections relax this for specific prefixes, and multiple hubs can separate concerns, but the team is now working around an abstraction rather than directly expressing intent. Likewise, a team that wants a specific, unusual routing behavior between two spokes, or a firewall topology the secured-hub model does not match, finds the manual design more expressive. These are not common requirements, but when they are real they are decisive, and they are why the decision rule names control as the thing hub-spoke protects.

The InsightCrunch topology decision table

With both designs walked through, the four axes that move the decision can be rated directly. The InsightCrunch topology decision table is the reference artifact of this guide: it rates a hand-built hub-spoke and Azure Virtual WAN on scale, routing complexity, control, and cost, and names the deciding signal that should tip a real estate one way or the other. The ratings are deliberately not a scorecard that sums to a winner, because the right choice depends on which axis the estate weights most.

Axis	Hand-built hub-spoke	Azure Virtual WAN	Deciding signal
Scale across regions and attachments	Grows in operational effort with every spoke, region, and branch; viable far with infrastructure-as-code but never flat	Managed transit and inter-hub connectivity keep effort roughly flat as spokes, regions, and branches grow	Many regions or branches, or a spoke count that keeps climbing, favors Virtual WAN
Routing complexity	Per-spoke user-defined routes the team writes and maintains; cross-references and route-count caps rise with the estate	Declarative routing intent per hub; one policy covers all attachments and the platform distributes routes	A route-table burden that consumes real maintenance time favors Virtual WAN
Control and expressiveness	Total control over every route, next hop, peering, and firewall placement; any behavior is expressible	Clean abstractions with limits: one internet and one private policy per hub, single next hop each	A requirement for fine-grained, per-flow routing or unusual topology favors hub-spoke
Cost and operating model	Pay for the components the team runs; the larger cost is the engineering time to operate the machinery	Pay for the managed hub and its gateways plus the same peering and data charges; less engineering time	A small, stable estate where the managed hub’s baseline cost is not justified favors hub-spoke

The table reads as a set of conditional verdicts rather than a single ranking. On scale and routing complexity, Virtual WAN wins as the estate grows, because those are exactly the dimensions where managed transit removes work that otherwise rises with the estate. On control and the cost of a small stable estate, hub-spoke wins, because total expressiveness and the absence of a managed-hub baseline matter most when the estate is small enough that operating the machinery is cheap and a specific routing behavior is required. The deciding signal column is where a reader should look first: it names the concrete condition, not the abstract preference, that should tip the choice.

The scale-versus-control rule

The whole comparison reduces to one axis, and naming it is the point of the exercise. The scale-versus-control rule states it plainly: a hand-built hub-spoke gives control at the cost of operational work, while Virtual WAN gives managed scale at the cost of control, so the topology decision is simply which of those two matters more for this estate. Everything in the decision table is a projection of that single trade onto a specific axis. Routing complexity is the operational-work cost of control made concrete. Scale across regions is the managed-scale benefit made concrete. Cost is the same trade expressed in money and engineering time. Control is the thing hub-spoke protects and Virtual WAN spends.

The rule is useful because it stops the decision from being made on a feature comparison that will always look ambiguous. Feature by feature, both designs do hub-and-spoke connectivity, both centralize inspection, both span regions, both connect branches. Reading that list, a team can argue either way indefinitely. The rule cuts through it by asking the one question that actually differs: does this estate value control over its routing more than it values being relieved of the work of operating that routing at scale? An estate with strict, unusual routing requirements and a small, stable footprint answers control. An estate with many regions, many branches, a climbing spoke count, and a platform team that would rather spend its time elsewhere answers managed scale. The InsightCrunch topology decision table is the rule applied; the rule is the table compressed.

Using the rule without overfitting

A rule this compact invites two failure modes, and both are worth naming so the reader avoids them. The first is forcing a large estate onto a hand-built hub-spoke because the team is comfortable with route tables and undervalues the rising operational slope. The comfort is real and the slope is gentle at first, which is exactly why this mistake persists until the route-table maintenance is consuming a platform engineer’s week and the next region is a multi-day project. The second is adopting Virtual WAN for a small estate that has a genuine fine-grained control requirement, then fighting the abstractions with static routes and multiple hubs to recover behavior the manual design would have expressed directly. Both mistakes come from weighting the wrong side of the rule. The corrective is to weight the side the estate’s actual trajectory and requirements demand, not the side the team finds familiar. The scenarios later in this guide work through the common cases so the rule has concrete anchors rather than only an abstract statement.

Which scales better across many regions and branches?

Virtual WAN scales better across many regions and branches because the platform owns the transit between hubs and the connectivity for branches, so adding a region or a branch is an attachment rather than a design project. A hand-built hub-spoke scales too, but each region and each branch adds hub-to-hub routing and gateway operation that the team designs and maintains, so the effort rises with the count.

The clearest way to see the scaling difference is to separate the two senses of scale that the word hides. The first is raw capacity: can the design carry the throughput, the connection count, and the address space the estate needs. Both designs scale in this sense, bounded by Azure’s documented limits, which should always be confirmed against the current subscription and service limits page before designing near them. A manual hub-spoke is bounded by VNet peering limits: a virtual network supports a finite number of peerings, on the order of several hundred per VNet, and the route tables that force traffic through the firewall carry their own separate cap on route count, which can bind first in a large estate. Virtual WAN’s hubs and gateways carry their own documented capacities, with gateway throughput scaling through instances. Neither design hits a wall at modest scale; the capacity limits matter only for genuinely large estates and should be read from current documentation rather than memory.

The second sense of scale is the one the decision rule actually cares about: operational scale, how the effort to run the design grows with the estate. Here the designs diverge sharply. In a manual hub-spoke, each new region is a new hub with its own firewall, gateways, and a hub-to-hub connectivity decision, plus the routes that make cross-region spoke traffic transit correctly. Each new branch is a gateway connection and the routes that bring its traffic through the firewall. The effort per addition does not fall as the estate grows; if anything it rises, because each addition must respect more existing state. In Virtual WAN, each new region is a virtual hub attached to the same Virtual WAN, which the platform connects to the existing hubs automatically, and the routing intent applied to the new hub is the same policy already in use. Each new branch is an attachment to a hub’s gateway, joining the managed fabric without new route design. The effort per addition stays roughly flat, which is the scaling property that matters when the estate is genuinely large.
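The difference in slope can be made concrete with a rough count of the items the team maintains by hand. The sketch below is a simplification under stated assumptions: it assumes the manual design connects every pair of regional hubs and keeps one route table per spoke, and the function names are hypothetical. It counts artifacts, not hours.

```python
def manual_artifacts(regions: int, spokes: int) -> dict:
    """Hand-maintained items in a manual hub-spoke (simplified model)."""
    return {
        # Assumption: one hub-to-hub design decision per pair of regions.
        "hub_pairs": regions * (regions - 1) // 2,
        # Assumption: one route table per spoke.
        "spoke_route_tables": spokes,
    }


def vwan_artifacts(regions: int, spokes: int) -> dict:
    """Items the team declares in Virtual WAN (simplified model)."""
    return {
        "hub_pairs": 0,              # the platform connects hubs automatically
        "routing_intents": regions,  # one routing intent per virtual hub
    }


for regions, spokes in [(1, 3), (3, 15), (5, 30)]:
    print(regions, spokes, manual_artifacts(regions, spokes), vwan_artifacts(regions, spokes))
```

Under these assumptions the region-pair count grows quadratically while the per-hub policy count grows linearly with regions and not at all with spokes, which is the shape of the divergence described above.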

Why branches and remote users tilt the decision

Branch connectivity is where Virtual WAN’s design intent shows most clearly, because Virtual WAN was built around the branch and remote-user case. An estate connecting many branch offices over site-to-site VPN, or many remote users over point-to-site VPN, benefits from the hub hosting those gateways and the managed fabric stitching them into the same any-to-any connectivity as the VNets. In a manual hub-spoke, the same branches mean operating the gateways in the hub and routing their traffic through the firewall by hand, which is workable for a few branches and increasingly heavy for many. The deciding signal here is the count and the growth rate. A handful of stable branches does not tilt the decision. A growing fleet of branches, or a remote-user population that the estate must serve at scale, tilts firmly toward Virtual WAN because that is the case the service was designed to absorb.

When a small estate makes scale a non-argument

The scaling advantage of Virtual WAN is only an advantage if the estate will actually exercise it. A small estate, one or two regions, a handful of spokes, a stable and small branch count, never reaches the size where the manual design’s rising slope becomes painful. For that estate, the scaling argument is a non-argument; the design will live entirely in the region of the curve where the manual hub-spoke is comfortable. This is the most common reason the decision rule points to hub-spoke: not because hub-spoke scales better, it does not, but because the estate will never test the scale where Virtual WAN’s advantage materializes, and so the other axes, control and the cost of a managed-hub baseline, decide instead. Reading scale honestly means asking not which design scales better in the abstract, but whether this estate will ever reach the scale where the difference is felt.

How do they compare on cost and control?

Cost and control are the two axes where a hand-built hub-spoke can win, and they are linked: the control the manual design grants is paid for in the engineering time that is its true cost, while Virtual WAN trades some of that control for a managed-hub baseline charge and lower engineering time. Comparing them honestly means counting both the cloud bill and the human time, because the human time is usually the larger number.

On the cloud-bill side, the two designs share most of their costs. Both pay for the firewall that inspects traffic, both pay for the gateways that terminate on-premises connectivity, both pay the data-processing and peering charges for traffic crossing the network, and both pay for the VNets and the workloads. Microsoft’s documentation notes that VNets connected to a virtual hub incur peering charges, and connected across regions incur global peering charges, which mirrors the peering charges a manual hub-spoke pays for its own peerings. Where the bills differ is in the managed-hub baseline. Virtual WAN’s virtual hub carries a baseline charge for the managed infrastructure that a manual hub VNet, being an ordinary VNet, does not. For a small estate, that baseline can be a meaningful fraction of the total, which is one reason small stable estates lean toward the manual design on pure cost. For a large estate, the baseline is a small fraction of a much larger total, and the engineering-time savings dominate the comparison.

Why engineering time is the cost that usually decides

The cloud bill is the visible cost, but the engineering time to operate the design is usually the larger and the deciding one. A manual hub-spoke’s route tables, peerings, hub-to-hub connectivity, and gateway operation are all work a platform engineer does, and that work scales with the estate. At a large estate, the time to add a region, onboard a branch, or trace a connectivity failure through hand-maintained route tables is a recurring draw on a scarce skill set. Virtual WAN moves much of that work to the platform, so the same estate needs less of that engineering time, freeing it for work that differentiates the business. Costing the two designs only on the cloud bill, and ignoring the engineering time, is the most common way the comparison gets distorted, because it makes the manual design look cheaper precisely where its hidden human cost is highest. A correct cost comparison at scale counts the platform team’s time as a real and growing line item, and once it does, Virtual WAN’s managed-hub baseline is often cheaper in total than the engineering hours the manual design consumes.
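The arithmetic behind that comparison is simple enough to write down. The sketch below is illustrative only: the function name is hypothetical, and every figure in the example is a placeholder chosen to show the shape of the calculation, not an Azure price or a measured engineering effort.

```python
def total_cost(cloud_bill: float, engineering_hours: float, hourly_rate: float) -> float:
    """Monthly total: the visible cloud bill plus the cost of the time to operate it."""
    return cloud_bill + engineering_hours * hourly_rate


RATE = 100.0  # placeholder hourly cost of a platform engineer

# Placeholder figures for a large estate: Virtual WAN carries a higher bill
# (managed-hub baseline) but needs far fewer engineering hours.
large_manual = total_cost(cloud_bill=1000.0, engineering_hours=40.0, hourly_rate=RATE)
large_vwan = total_cost(cloud_bill=1500.0, engineering_hours=10.0, hourly_rate=RATE)

# Placeholder figures for a small, stable estate: little time is spent either way,
# so the baseline charge is what separates the two.
small_manual = total_cost(cloud_bill=1000.0, engineering_hours=2.0, hourly_rate=RATE)
small_vwan = total_cost(cloud_bill=1500.0, engineering_hours=1.0, hourly_rate=RATE)

print(large_manual, large_vwan)
print(small_manual, small_vwan)
```

With real inputs substituted, the same two lines of arithmetic show which side of the rule an estate sits on: the comparison flips only when the engineering-hours term is large enough to outweigh the baseline charge.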

How control turns into a real constraint

Control is the axis where the manual design wins outright, and it becomes a real constraint, not a preference, when a design needs a routing behavior Virtual WAN’s abstractions do not express. The clearest case is multiple distinct next hops for different private-traffic classes on a single hub, which the one-private-policy-per-hub model does not offer directly. Another is a firewall topology that does not match the secured-hub model, or a need to place inspection appliances in spokes with custom routing between them. In these cases the manual design simply expresses what is needed, route by route, while Virtual WAN requires workarounds: static routes on connections, multiple hubs to separate concerns, or acceptance of an inspection path the team would not have chosen. When such a requirement is genuine and central to the design, control stops being a soft preference and becomes the deciding factor, and the rule points to hub-spoke regardless of the scale argument. The honest reading is that most estates do not have such a requirement, and for them control is a preference that the scale and cost axes outweigh; but the estates that do have one should not be talked out of it by a scale argument that does not apply to their constraint.

The trade-offs and failure modes each design must handle

A topology is only as good as its behavior when something goes wrong, so the comparison has to cover the failure modes each design carries and the trade-offs each one bakes in. The failure modes are different in character: the manual design fails through human error in hand-maintained state, and Virtual WAN fails through misunderstanding the abstraction. Knowing the shape of each failure is part of choosing between them.

The manual hub-spoke’s characteristic failure is the broken route. A spoke whose route table lacks the route to the firewall, or whose default route points at a stale next hop, drops or misroutes traffic in ways that look like an application failure until the team reads the effective routes and finds the cause in configuration it wrote. Asymmetric routing is a close cousin: traffic that leaves through the firewall but returns by a different path, often because a return route is missing or a peering was added without the matching route, produces intermittent failures that are hard to reason about. The non-transitive nature of peering is the trap underneath both: engineers who expect two spokes peered to the same hub to talk to each other directly are surprised when they cannot, and the surprise reappears every time a new engineer joins. These failures are all diagnosable because the state is the team’s and is readable, but they are frequent in proportion to how much hand-maintained state the estate carries, which is why they grow with scale.
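Reading effective routes amounts to a longest-prefix-match lookup, and a small model makes the broken-route failure easy to see. This is a simplified sketch, not Azure's route selection: it ignores system routes and route-source precedence, and the function name, prefixes, and next-hop labels are hypothetical.

```python
from ipaddress import ip_address, ip_network


def effective_next_hop(routes, destination):
    """Longest-prefix match over (prefix, next_hop) pairs; None if nothing matches."""
    dest = ip_address(destination)
    matches = [
        (ip_network(prefix), hop)
        for prefix, hop in routes
        if dest in ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific matching prefix wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


healthy_spoke = [
    ("10.1.0.0/16", "local"),       # the spoke's own address space
    ("10.0.0.0/16", "hub-peering"), # the hub VNet
    ("0.0.0.0/0", "firewall"),      # the forced route the team wrote
]
broken_spoke = healthy_spoke[:2]    # same table with the firewall route missing

print(effective_next_hop(healthy_spoke, "10.2.0.4"))  # another spoke, via the firewall
print(effective_next_hop(broken_spoke, "10.2.0.4"))   # no route in this model
```

The second lookup is the non-transitive peering trap in miniature: the other spoke is not reachable through the peering alone, so without the route to the firewall there is nothing in the table that carries the packet there.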

Virtual WAN’s characteristic failure is the misread abstraction. The most common is assuming a routing behavior that the routing-intent model does not provide, such as expecting two different private next hops on one hub, and then finding traffic taking the single policy’s path. Another is overlapping address spaces, which the platform’s automatic transit cannot resolve any more than manual peering can, and which surface as connectivity failures the team traces back to address planning rather than routing. A third is forgetting that some capabilities require the Standard type or specific gateway configurations, so a design built on assumptions about what a Basic Virtual WAN or a particular gateway supports fails at the feature boundary. These failures are diagnosable too, but they require understanding the service’s model rather than reading the team’s own route tables, which is a different skill and one a team newly arrived on Virtual WAN may not yet have.
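Overlapping address spaces are cheap to catch before they become a connectivity failure. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and the example VNet names and prefixes are hypothetical.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(address_spaces):
    """Return pairs of network names whose address spaces overlap."""
    networks = {name: ip_network(cidr) for name, cidr in address_spaces.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


planned = {
    "production": "10.1.0.0/16",
    "non-production": "10.2.0.0/16",
    "shared-services": "10.1.128.0/20",  # sits inside the production range
}

print(find_overlaps(planned))
```

Running a check like this against the address plan before attaching a network applies equally to both topologies, since neither manual peering nor managed transit can route between ranges that collide.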

The trade-offs neither design escapes

Some trade-offs are inherent to centralizing traffic through a hub and apply to both designs equally. Forcing all traffic through a central firewall adds a hop and the firewall’s processing, which raises latency relative to direct spoke-to-spoke peering that bypasses inspection. Direct peering has the lowest latency, and hub transit through a firewall typically runs a few milliseconds higher, with the exact figures depending on region and firewall; the trade is the same in both designs because both centralize inspection. Both designs also concentrate a failure domain in the hub: if the hub firewall or the hub’s connectivity fails, the spokes that depend on it are affected, which is why both designs need the hub’s resilience considered deliberately, through availability zones and capacity planning. Choosing between hub-spoke and Virtual WAN does not escape these trade-offs; it only changes who operates the components that carry them. The latency hop, the centralized failure domain, and the address-planning discipline are shared costs of the hub-and-spoke pattern itself, not differentiators between its manual and managed forms.

When each topology fits and when it is overkill

The decision rule earns its keep against concrete cases, so this section works through the recurring scenarios engineers actually report, naming the deciding factor in each. These are the patterns that move real decisions, and reading them turns the abstract rule into a set of recognizable situations.

The first pattern is many regions and branches, which favors Virtual WAN. An estate that spans five regions and connects dozens of branch offices is squarely in the territory the service was built for. The managed inter-hub transit removes the hub-to-hub routing the manual design would require for each region pair, and the hub-hosted gateways absorb the branch connectivity that would otherwise be per-branch gateway operation. The deciding factor is the combination of region count and branch count: when both are high or growing, the operational savings dominate and Virtual WAN is the fit. A manual hub-spoke at this scale is not impossible, but it commits the team to operating a transit mesh and a branch fleet by hand, which is the operational tax the rule warns against.

The second pattern is routing intent replacing manual user-defined routes, which favors Virtual WAN whenever the route-table burden has become a maintenance cost in its own right. A team that finds itself maintaining dozens of route tables, regenerating them through infrastructure-as-code on every change, and still occasionally shipping a broken route, is paying the manual design’s routing tax in full. Routing intent collapses that burden to a per-hub policy. The deciding factor is whether the route tables are a meaningful and growing share of the platform team’s work; when they are, the managed model pays for itself in reduced maintenance and fewer routing failures.

When hub-spoke is the right fit, not the legacy choice

The third pattern is a small estate where hub-spoke control is fine, and it is the most common case where the manual design is the correct choice rather than a legacy holdover. An estate with one or two regions, a stable handful of spokes, and few or no branches never reaches the scale where Virtual WAN’s advantages materialize, and it pays the managed-hub baseline for capabilities it will not use. For this estate the manual hub-spoke is comfortable, fully legible, and cheaper in total. The deciding factor is the absence of growth pressure: if the estate is stable and small, the scale argument does not apply, and the control and simplicity of a hand-built hub-spoke make it the right fit. Calling this choice legacy is a mistake; it is the rule pointing correctly at control and cost for an estate that will not exercise scale.

The fourth pattern is a control requirement keeping the estate on hub-spoke, which favors the manual design regardless of size. A design that genuinely needs multiple distinct inspection next hops per region, an unusual firewall topology, or per-flow routing that Virtual WAN’s policies do not express, has a hard requirement the abstraction cannot meet without workarounds. The deciding factor is whether the control requirement is central and genuine rather than a preference; when it is central, it overrides the scale argument, because a design that cannot express what it must express is not a viable design no matter how well it would otherwise scale. The honest counter-reading is to test whether the requirement is genuinely central before letting it decide, since a soft preference dressed up as a hard requirement is the most common reason teams stay on the manual design longer than they should.
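The four patterns reduce to a short precedence order, which can be sketched as a function. This is one reading of the rule, not a substitute for judgment: the `Estate` fields and the `choose_topology` name are hypothetical, and deciding whether each field is honestly true is the hard part that the code does not do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estate:
    # A genuine, central routing requirement the per-hub policies cannot express.
    hard_control_requirement: bool
    # Many or growing regions, branches, or spokes, or a route-table burden
    # that consumes meaningful maintenance time.
    scale_pressure: bool


def choose_topology(estate: Estate) -> str:
    """Apply the scale-versus-control rule in the order the patterns give it."""
    if estate.hard_control_requirement:
        return "hub-spoke"    # fourth pattern: control overrides the scale argument
    if estate.scale_pressure:
        return "Virtual WAN"  # first and second patterns
    return "hub-spoke"        # third pattern: small, stable estate


print(choose_topology(Estate(hard_control_requirement=False, scale_pressure=True)))
print(choose_topology(Estate(hard_control_requirement=True, scale_pressure=True)))
print(choose_topology(Estate(hard_control_requirement=False, scale_pressure=False)))
```

The ordering of the two checks is the substance: a control requirement is tested first because a design that cannot express what it must express is not viable, however well it would scale.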

The two overfitting failures to avoid

Two failures recur at this decision, and both are the rule applied backwards. The first is hand-building a hub-spoke at a scale where Virtual WAN would be simpler, usually because the team is comfortable with route tables and discounts the rising operational slope until it is steep. The corrective is to weight the estate’s trajectory, not the team’s comfort, and to recognize that a climbing spoke count, multiplying regions, or a growing branch fleet are the signals that the comfortable manual design has become the expensive one. The second is adopting Virtual WAN where fine-grained control is required, then spending the next quarter working around the abstraction with static routes and extra hubs to recover behavior the manual design would have expressed directly. The corrective is to test the control requirement honestly before committing, and to choose the manual design when the requirement is real. Both failures come from applying the rule to the wrong side; both are avoided by reading the estate’s actual scale and actual control requirement rather than the team’s habits.

How to migrate from hub-spoke to Virtual WAN

Many estates do not choose between these designs once; they start on a hand-built hub-spoke when they are small and migrate to Virtual WAN when they outgrow it. Knowing the migration path matters as much as the initial choice, because the most common real-world trajectory is hub-spoke first, Virtual WAN later, and a migration done carelessly causes the outage the whole exercise was meant to avoid. The path is incremental, and the guiding principle is to move spokes one cohort at a time while keeping connectivity intact throughout.

The migration begins with a parallel-run posture rather than a cutover. The team deploys a Virtual WAN and a virtual hub in the relevant region alongside the existing manual hub, without yet moving any workload. The new hub is secured with its firewall and configured with the routing intent the design needs, so that it is ready to receive spokes. At this stage nothing has changed for production; the new hub exists and waits. This parallel posture is what makes the migration safe, because it lets the team validate the new hub’s connectivity and policy against test workloads before any production spoke depends on it.

The move itself proceeds spoke by spoke, or cohort by cohort for related spokes that must move together. For each spoke, the team establishes its connection to the virtual hub, validates that the routing intent steers its traffic correctly through the new hub’s firewall, and only then removes the spoke’s peering to the old manual hub and its old route table. The order matters: connect to the new before disconnecting from the old, so that at no point is the spoke without a path. During the transition a spoke may briefly be reachable through both hubs, which is why address planning and avoiding overlap matter throughout, and why the team validates each spoke’s flows before cutting its old peering. Cross-region and on-premises connectivity migrate the same way, by attaching the on-premises gateway to the virtual hub and validating before retiring the old gateway. The manual hub is decommissioned last, only after every spoke and every on-premises connection has moved and been validated on the new fabric.
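The ordering constraint can be expressed as a guard that refuses to cut the old path too early. The sketch below is a hypothetical model for a runbook or automation wrapper, not an Azure API: the class, its methods, and the path labels are illustrative names.

```python
class SpokeMigration:
    """Tracks one spoke's paths and enforces connect, validate, then disconnect."""

    def __init__(self, name: str):
        self.name = name
        self.paths = {"old-hub"}  # every spoke starts peered to the manual hub
        self.validated = False

    def connect_new(self) -> None:
        self.paths.add("virtual-hub")

    def validate(self, checks_pass: bool) -> None:
        if "virtual-hub" not in self.paths:
            raise RuntimeError(f"{self.name}: connect to the virtual hub before validating")
        self.validated = checks_pass

    def disconnect_old(self) -> None:
        if not self.validated:
            raise RuntimeError(f"{self.name}: refusing to remove the old peering before validation")
        self.paths.discard("old-hub")


spoke = SpokeMigration("production")
spoke.connect_new()               # briefly reachable through both hubs
spoke.validate(checks_pass=True)  # flows confirmed on the new hub
spoke.disconnect_old()            # only now is the old peering removed
print(spoke.paths)
```

Because the guard raises rather than proceeding, there is no sequence of calls in this model that leaves the spoke with an empty set of paths.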

What to validate at each migration step

Validation is the discipline that keeps the migration from becoming the outage it was meant to prevent. At each step the team confirms three things before proceeding. First, reachability: the migrated spoke can reach the destinations it must reach, on-premises, other spokes, and the internet, through the new hub. Second, inspection: traffic that must be inspected is still passing through the firewall, which under routing intent means confirming the private and internet policies are steering the migrated spoke’s traffic as intended rather than assuming the policy applies. Third, return paths: traffic returning to the migrated spoke takes the inspected path and does not produce the asymmetric routing that plagues half-migrated estates. Only when all three hold for a spoke does the team remove that spoke’s old peering. Skipping validation to move faster is the failure mode that turns a careful incremental migration into a sequence of intermittent outages, because the broken route or asymmetric path is exactly what the validation would have caught.
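The three-part gate is easy to encode so that a skipped check counts as a failure rather than a pass. This is a minimal sketch with hypothetical names; how each result is actually measured is left to the team's own tooling.

```python
REQUIRED_CHECKS = ("reachability", "inspection", "return_path")


def failing_checks(results: dict) -> list:
    """List the required checks that failed or were never run."""
    return [check for check in REQUIRED_CHECKS if not results.get(check, False)]


def may_remove_old_peering(results: dict) -> bool:
    """The old peering may go only when all three checks hold."""
    return not failing_checks(results)


complete = {"reachability": True, "inspection": True, "return_path": True}
rushed = {"reachability": True, "inspection": True}  # return path never tested

print(may_remove_old_peering(complete))
print(failing_checks(rushed))
```

Treating a missing result as a failure is the design choice that matters here, since the migration goes wrong precisely when a step is assumed rather than confirmed.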

When migration is not worth it

Not every hub-spoke estate should migrate, and recognizing when to stay is part of using the rule well. A small, stable estate that is comfortable on its manual hub-spoke, has no growth pressure, and no control requirement pushing either way, gains little from migrating and pays the cost and risk of the migration for benefits it will not use. The migration is worth it when the estate has crossed into the territory where Virtual WAN’s scale advantages are real, when the route-table burden is consuming meaningful engineering time, when new regions or branches are arriving faster than the manual design can comfortably absorb them, or when the team operating the manual machinery is stretched thin enough that the managed model’s reduced operational load is itself the goal. The deciding factor for migration is the same scale-versus-control rule applied to a moving estate: migrate when the estate has grown past the point where control is worth the operational work, and stay when it has not.

How DNS and name resolution differ between the designs

Connectivity is only half of what a network estate needs; the other half is name resolution, and the two topologies handle it differently enough to matter in a real design. Both must answer the same question, how a workload in a spoke resolves the name of a private resource, whether that resource is a private endpoint, an on-premises service, or another spoke’s workload, but the place the resolver lives and the way traffic reaches it differ.

In a manual hub-spoke, the common pattern places a DNS resolution capability in the central network, often a private DNS resolver or a forwarder, and points spoke workloads at it through their virtual network DNS settings. Spokes send their queries to the central resolver, which resolves private zones and forwards on-premises names to the on-premises resolver over the gateway. Because the team controls the route tables, it can guarantee that DNS traffic to the resolver follows the intended path, and because private DNS zones link to virtual networks, the team manages those links as part of the same hand-maintained estate. The strength is the same legibility the routing has: the team can point at exactly where a name resolves and why. The cost is the same maintenance: zone links, resolver capacity, and the forwarding rules are the team’s to operate and grow.

In Virtual WAN, the resolution pattern is similar in outcome but lives alongside the managed fabric. A resolver or forwarder deployed in a spoke connected to the managed core serves the connected networks, and private DNS zones link to those networks as before. The connectivity from a spoke to the resolver rides the same managed transit that carries the rest of the private traffic, so the team does not write the route that reaches the resolver. The trade is the familiar one: less to operate, and less direct control over the exact path DNS traffic takes, which is rarely a problem but occasionally matters when a design needs DNS traffic to follow an unusual route. The practical guidance is the same in both designs: plan name resolution as deliberately as connectivity, because a topology that connects correctly but resolves names incorrectly fails in ways that look like application errors and waste diagnostic time chasing the wrong layer.

Why private endpoints complicate both designs equally

Private endpoints, which give a private address to a platform service such as a storage account or a database, add a name-resolution requirement that both topologies must satisfy. The endpoint’s private address must be resolvable from the spokes that use it, which means the private DNS zone for that service must link to the networks that resolve it, and the connectivity to the endpoint’s address must route correctly. Neither design escapes this; both must link the zone and route the traffic. The difference is again operational: the manual design links zones and writes the routes by hand, while the managed core routes the traffic and the team links the zones. The most common private-endpoint failure, a workload that cannot reach the service because the name resolves to the public address rather than the private one, has the same cause and the same fix in both topologies, a missing or incorrect zone link, which is why private-endpoint DNS deserves the same deliberate planning as the rest of name resolution regardless of which topology carries it.
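The common failure described above can be detected by classifying the address a name resolves to. The sketch below uses Python's standard `ipaddress` module; the function name is hypothetical and the example addresses are illustrative. Note that `is_private` covers more than the RFC 1918 ranges, which is acceptable for a quick check but worth knowing.

```python
from ipaddress import ip_address


def resolves_privately(resolved_address: str) -> bool:
    """True if the resolved address falls in a private range."""
    return ip_address(resolved_address).is_private


# Illustrative results of resolving a service name from inside a spoke.
print(resolves_privately("10.1.4.5"))   # private endpoint address, zone link in place
print(resolves_privately("20.60.1.1"))  # public address, zone link missing or wrong
```

In practice the input would come from a lookup run on a workload in the spoke itself, because the answer depends on which private DNS zones are linked to the network doing the resolving.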

Observability and diagnosing connectivity in each design

A topology has to be diagnosable when a flow fails, and the two designs are diagnosed differently because the state that determines routing lives in different places. In a manual hub-spoke, the team reads its own configuration: the route tables, the effective routes on a network interface, the peering settings, and the firewall logs. The advantage is that everything that decides where a packet goes is configuration the team wrote and can read directly, so a broken flow is traced by reading the effective routes, finding the route that points at the wrong next hop or the missing return route, and correcting it. Network Watcher’s connection monitor and next-hop diagnostic let the team compare what the route tables say the packet should do with what it actually does. The skill required is the ability to read route tables and reason about next hops, which is a skill the team already exercises by operating the manual design.

In Virtual WAN, diagnosis shifts toward understanding the managed routing rather than reading hand-written tables. The effective routes on a connection still show what the platform programmed, but the cause of a misrouting is more often a misread of the routing-intent model, an overlapping address space, or a feature-tier assumption than a route the team typed wrong. Diagnosing a Virtual WAN flow means confirming that the routing intent is in effect, that the connection is associated with the route table the design expects, that the next hop is the firewall the policy names, and that no address overlap is silently breaking transit. The platform exposes the effective routes and the connectivity state, and the firewall logs show inspection as in the manual design, but the reasoning runs from the service’s model down to the observed behavior rather than from the team’s configuration. A team new to Virtual WAN often finds this harder at first precisely because the routing is not theirs to read line by line; once the model is understood, the smaller surface of hand-maintained state means fewer places a flow can break.
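The four confirmations above can be written down as a checklist that compares observed state with the design. The sketch below is only a way to organize that reasoning: the dictionary keys, the `check_connection` helper, and the sample values are hypothetical and do not correspond to any Azure API or resource schema.

```python
def check_connection(expected, observed):
    """Compare a connection's observed state with the design.

    Both arguments are plain dicts with hypothetical keys. Returns a
    list of findings; an empty list means the checks passed.
    """
    findings = []
    if not observed.get("routing_intent_enabled"):
        findings.append("routing intent is not in effect on the hub")
    if observed.get("associated_route_table") != expected["associated_route_table"]:
        findings.append("connection is associated with an unexpected route table")
    if observed.get("next_hop") != expected["next_hop"]:
        findings.append("next hop is not the firewall the policy names")
    if observed.get("overlapping_prefixes"):
        findings.append("address overlap: " + ", ".join(observed["overlapping_prefixes"]))
    return findings


# Hypothetical design and observation for one spoke connection.
expected = {"associated_route_table": "defaultRouteTable", "next_hop": "hub-firewall"}
observed = {
    "routing_intent_enabled": True,
    "associated_route_table": "defaultRouteTable",
    "next_hop": "hub-firewall",
    "overlapping_prefixes": ["10.1.0.0/16"],
}

for finding in check_connection(expected, observed):
    print(finding)
```

The order of the checks mirrors the reasoning direction the paragraph describes, from the service's model down to the observed behavior.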

What to instrument from the start in either design

Whichever topology a team runs, the instrumentation worth standing up on day one is the same: flow logs that record what traffic the firewall and the network see, connection monitoring that continuously tests the paths the design depends on, and alerting on the connectivity tests so that a broken route or a failed transit surfaces as an alert rather than as a user complaint. The manual design additionally benefits from a discipline of validating effective routes after every change, since hand-maintained route tables are where its failures live. The managed design benefits from validating that routing intent and connection associations match the design after every topology change, since misread abstractions are where its failures live. Instrumenting the path, not just the components, is what turns a topology from one that works until it silently does not into one that announces a problem the moment a flow stops taking the path the design intends.
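Alerting on the path rather than the component can be as simple as tracking probe results per path and raising an alert after a run of failures. The sketch below assumes probe results are already being collected somewhere. The path names, the `paths_to_alert` helper, and the threshold of three consecutive failures are illustrative choices, not recommendations from any product.

```python
def paths_to_alert(history, threshold=3):
    """Return the paths whose most recent `threshold` probes all failed.

    history: dict mapping a path name to a list of booleans, oldest
    first, where True means the probe succeeded.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    alerts = []
    for path, results in history.items():
        recent = results[-threshold:]
        # Require a full window so a path with too little data does not alert.
        if len(recent) == threshold and not any(recent):
            alerts.append(path)
    return sorted(alerts)


# Hypothetical probe history for three paths the design depends on.
history = {
    "spoke-a -> spoke-b via firewall": [True, True, False, False, False],
    "spoke-a -> on-premises": [True, False, True, True, True],
    "spoke-b -> internet": [False, False],
}

print(paths_to_alert(history))
```

Requiring several consecutive failures is a design choice that trades a little detection delay for fewer false alarms from a single dropped probe.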

How team size and skills should weight the decision

The decision rule names control and managed scale as the axes, but a real choice is also shaped by the team that will operate the result, and ignoring that produces designs that are correct on paper and unsustainable in practice. A topology is not just an architecture; it is an operating commitment, and the commitment lands on whatever team owns the network. Weighting that team honestly is part of choosing well.

A small platform team, or a team where networking is one responsibility among many rather than a dedicated specialty, is poorly served by an estate whose correctness depends on maintaining a growing set of hand-written route tables and peering meshes. The manual design rewards deep, continuous networking attention, and a team that cannot give it that attention will ship the broken routes and asymmetric paths that are the manual design’s characteristic failures. For such a team, the managed model’s smaller surface of hand-maintained state is not merely convenient; it is the difference between an estate that stays correct and one that drifts into intermittent failures nobody has time to chase. The managed core absorbs exactly the work the under-resourced team cannot reliably do, which is why team capacity, not just estate size, should weight the choice toward the managed design when the team is stretched.

A larger or specialized networking team changes the calculus, because it can absorb the manual design’s operational load and may value the control that load buys. A team with deep networking skill, the time to maintain infrastructure-as-code that generates route tables and peerings, and a genuine need for the fine-grained control the manual design grants, can run a large hand-built estate well and may prefer it precisely because it exposes every knob. For that team, the managed model’s abstractions can feel like a loss of the control they are equipped to exercise. The weighting here is not about which design is objectively easier but about which matches the team’s capacity and values: a team built to operate networks deeply may rationally choose the design that gives it the most to operate, while a team that wants networking to recede into a reliable substrate should choose the design that asks the least of it.

Why the operating model outlasts the architecture

The reason team weighting matters so much is that the operating model outlasts the people who chose the architecture. A design chosen because the founding engineer loved route tables becomes a liability when that engineer leaves and the estate is inherited by a team that does not share the skill or the inclination. A design chosen for managed simplicity stays operable across team changes because its correctness depends less on continuous specialist attention. Choosing a topology is therefore partly a bet on the team that will exist in two years, not just the team that exists today, and the safer bet for an organization without a durable networking specialty is the design that degrades gracefully under reduced attention rather than the one that demands constant care. This is the same scale-versus-control rule read through the lens of people: control is worth its operational cost only to a team that can pay that cost reliably over time, and managed scale is worth its loss of control to a team that cannot.

Matching the decision to a real organization

Putting the team weighting together with the technical axes gives a fuller decision than the rule alone. An organization with a small, generalist platform team, a growing multi-region estate, and no hard control requirement should choose the managed design without hesitation, because every axis and the team capacity point the same way. An organization with a deep networking team, a stable estate, and a genuine control requirement should choose or keep the manual design, because the team can pay for the control it needs. The hard cases are the mixed ones. A deep team whose estate has grown beyond even its capacity to maintain by hand should migrate to the managed design despite the team’s skill. A small team with a hard control requirement must either grow the team or accept the friction of working around the managed model. Reading both the technical axes and the team that will operate the result is what turns the rule from a slogan into a decision an organization can stand behind.

The verdict

The choice between a hand-built hub-spoke and Azure Virtual WAN is not won on features, because both deliver hub-and-spoke connectivity, centralized inspection, multi-region reach, and branch connectivity. It is won on the single axis the scale-versus-control rule names: a hand-built hub-spoke gives control at the cost of operational work, and Virtual WAN gives managed scale at the cost of control, so the decision is which of those two the estate values more. An estate that is large, multi-region, branch-heavy, or growing on any of those dimensions, and that would rather not spend a platform engineer’s week on route tables and hub-to-hub meshes, should choose Virtual WAN, because managed scale is what it needs and the control it surrenders is control it was not using. An estate that is small and stable, or that has a genuine fine-grained routing requirement Virtual WAN’s per-hub policies cannot express, should choose or stay on a hand-built hub-spoke, because control and the absence of a managed-hub baseline are what it needs and the scale advantage it forgoes is scale it will never reach.

The InsightCrunch topology decision table is the rule made specific across scale, routing complexity, control, and cost, and the deciding signal in each row is the concrete condition to check against the real estate rather than the team’s habits. The most common mistakes are the two overfitting failures: hand-building at a scale where Virtual WAN would be simpler, and adopting Virtual WAN where control was required and then fighting the abstraction. Both are avoided by reading the estate’s actual trajectory and actual requirements. For teams already on a manual hub-spoke that have outgrown it, the migration path is incremental, parallel-run, and validated spoke by spoke, and it is worth doing when the estate has crossed into Virtual WAN’s territory and not before. The right topology is the one whose trade matches what the estate values, and the rule is the fastest way to find which trade that is. To build both topologies hands-on and compare their routing behavior in a sandbox, run the hands-on Azure labs and command library on VaultBook, where the manual route tables and the routing-intent policies can be deployed side by side and watched.

Frequently asked questions

Hub-spoke versus Virtual WAN: which architecture should I choose?

Choose by the scale-versus-control rule rather than by feature count, because both deliver hub-and-spoke connectivity. Pick a hand-built hub-spoke when control matters more than operational savings, which is the case for a small, stable estate or a design with a genuine fine-grained routing requirement that Virtual WAN’s per-hub policies cannot express. Pick Virtual WAN when managed scale matters more than control, which is the case for an estate that spans many regions, connects many branches, or has a climbing spoke count that makes the manual design’s route-table and hub-to-hub maintenance a real and growing cost. The decision turns on which of those two the estate actually values, and the deciding signal is the estate’s trajectory and requirements, not the team’s familiarity with route tables.

Is managed Virtual WAN better than a manual hub-spoke?

Neither is better in the abstract; each wins on a different axis. Virtual WAN is better on operational scale, because it owns the transit between hubs and the connectivity for branches, so adding a region or a branch is an attachment rather than a design project. A manual hub-spoke is better on control and on the cost of a small stable estate, because it grants total expressiveness over every route, next hop, and peering, and it avoids the managed-hub baseline charge. The honest answer is that the better design is the one whose trade matches the estate. A large, multi-region, branch-heavy estate is better served by Virtual WAN. A small estate or one with a hard control requirement is better served by the manual design.

Which topology scales better across many regions?

Virtual WAN scales better operationally across many regions, because the platform connects the hubs to each other automatically when they belong to the same Virtual WAN, so cross-region spoke-to-spoke connectivity does not require the team to design and maintain a hub-to-hub mesh. In a manual hub-spoke, each new region is a new hub with its own firewall and gateways, plus a deliberate hub-to-hub connectivity decision and the routes that make cross-region traffic transit correctly. That effort does not fall as regions multiply; it accumulates. Both designs scale in raw capacity within Azure’s documented limits, but operational scale, the effort to run the design as regions grow, is where Virtual WAN’s managed transit gives it the clear advantage, and it is the sense of scale the topology decision actually weighs.

How does routing complexity compare between the two?

Routing complexity is the biggest practical difference. A manual hub-spoke expresses transit through user-defined routes the team writes per spoke, so the artifact count and the cross-references between route tables rise with the estate, and the cap on routes per table eventually forces summarization or a design split. Virtual WAN expresses transit through routing intent, a declarative policy stated once per hub that the platform realizes for every attachment, so complexity stays roughly flat as spokes are added. The structural reason is that the manual design’s routing is the team’s work and scales with the estate, while Virtual WAN’s routing is a fixed-size intent the platform expands. When the route-table burden becomes a meaningful share of a platform team’s time, that is the signal that routing complexity favors Virtual WAN.
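The difference in growth can be made concrete with a rough count. The sketch below assumes a deliberately simple, unsummarized manual design: one route table per spoke, holding a default route plus one route per other spoke, all pointing at the firewall. Real designs often summarize prefixes and will count differently. The Virtual WAN side uses the limit of one internet policy and one private policy per hub. Both helper functions are hypothetical.

```python
def manual_routing_artifacts(spokes):
    """Route tables and routes for an unsummarized manual design.

    Assumes one route table per spoke holding a default route plus one
    route per other spoke, all pointing at the hub firewall.
    """
    routes_per_table = 1 + max(spokes - 1, 0)
    return {
        "route_tables": spokes,
        "routes_per_table": routes_per_table,
        "total_routes": spokes * routes_per_table,
    }


def routing_intent_artifacts(hubs):
    """Policies for a Virtual WAN design: one internet and one private policy per hub."""
    return {"policies": 2 * hubs}


for spokes in (3, 10, 30):
    print(spokes, manual_routing_artifacts(spokes), routing_intent_artifacts(hubs=1))
```

Under these assumptions the manual count grows with the square of the spoke count while the policy count depends only on the number of hubs, which is the structural point the answer makes.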

How do hub-spoke and Virtual WAN compare on cost and control?

They are the two axes where the manual design can win, and they are linked. On cost, both designs share most charges (the firewall, the gateways, the peering and data charges), but Virtual WAN adds a managed-hub baseline that a manual hub VNet does not carry, which weighs on small estates and becomes negligible at large ones. The larger cost, usually decisive, is engineering time: the manual design’s route tables, peerings, and hub-to-hub meshes are work that scales with the estate, while Virtual WAN moves much of that to the platform. On control, the manual design wins outright, granting any routing behavior the design needs, while Virtual WAN’s per-hub policy model is cleaner but constrained. Count both the cloud bill and the human time to compare them fairly.

When should I migrate from hub-spoke to Virtual WAN?

Migrate when the estate has crossed into the territory where Virtual WAN’s scale advantages are real and the manual design’s costs have become a burden. The concrete signals are a route-table maintenance load that consumes meaningful engineering time, new regions or branches arriving faster than the manual design can comfortably absorb, a climbing spoke count pushing against route-count or peering limits, or a platform team stretched thin enough that reducing operational load is itself the goal. Do not migrate a small, stable estate with no growth pressure and no control requirement, because it pays the cost and risk of migration for benefits it will not use. The migration decision is the same scale-versus-control rule applied to a moving estate: migrate once the estate has grown past the point where control is worth the operational work.

Why is VNet peering not transitive in a hub-spoke, and does Virtual WAN fix it?

In a manual hub-spoke, VNet peering connects two networks directly but does not carry traffic onward to a third, so two spokes peered to the same hub cannot reach each other unless the team routes their traffic through the hub, usually the firewall, with user-defined routes. This non-transitivity is the trap that surprises engineers who expect same-hub spokes to communicate automatically. Virtual WAN effectively provides the transit that manual peering lacks: connecting two VNets to the same virtual hub establishes transit between them through the managed routing fabric, and connecting two hubs to the same Virtual WAN extends that transit across regions, all without the team building the routes. Virtual WAN does not change peering’s nature; it provides a managed transit layer above it so the team does not hand-build the transitivity.

Does Virtual WAN replace user-defined routes entirely?

Largely, for the common case, but not absolutely. Routing intent on a hub replaces the per-spoke user-defined routes that a manual design uses to force traffic through a firewall, because one policy statement covers every attachment to the hub and the platform distributes the routes. For most designs this removes the route-table maintenance entirely. The exception is when a design needs a behavior the per-hub policy does not express, such as sending a specific prefix to an appliance in a spoke while the rest follows the policy. For those cases, static routes on a connection extend the model for selected destinations. The presence of many such exceptions is itself a signal: a design that needs extensive custom routing is one that values control, and that is the territory where the manual hub-spoke is the more natural fit.

How many routing policies can a Virtual WAN hub have?

A virtual hub supports at most one internet traffic routing policy and one private traffic routing policy, each pointing to a single next hop such as the hub’s Azure Firewall. The private policy treats branch and virtual-network prefixes as one private space and steers them through the named next hop; the internet policy advertises a default route to the attachments so their internet-bound traffic is inspected before leaving. This one-policy-per-class-per-hub limit is the model’s main constraint and the reason designs that need several distinct private next hops on one hub push against the abstraction. Confirm this limit against current Microsoft documentation before designing near it, since service behavior and limits are revised over time, but the single-policy-per-class model is the structural fact that shapes how Virtual WAN routing is designed.

Can a hub-spoke handle as many spokes as Virtual WAN?

In raw capacity, a hub-spoke is bounded by VNet peering limits: a virtual network supports a finite number of peerings, on the order of several hundred, and the practical ceiling is lower still when the route tables that force traffic through the firewall run up against the per-table route cap. Virtual WAN’s hubs carry their own documented capacities. In practice, though, the binding limit on a manual hub-spoke is rarely the hard peering cap; it is the operational effort of maintaining route tables and peerings, which becomes painful well before the technical ceiling. Virtual WAN can carry a large spoke count with flat operational effort because the platform manages the transit. So the manual design can handle many spokes in theory, but the effort to operate them at that count is exactly the cost that tilts large estates toward Virtual WAN. Confirm all current limits against Azure documentation.

Does forcing traffic through a hub firewall add latency in both designs?

Yes, equally, because the latency cost comes from centralizing inspection, not from which design centralizes it. Direct spoke-to-spoke peering that bypasses a firewall has the lowest latency, typically a fraction of a millisecond within a region. Routing through a central firewall, whether in a manual hub or a Virtual WAN secured hub, adds a few milliseconds for the extra hop and the firewall’s processing. The figure depends on the region and the firewall tier and should be measured for the specific design rather than assumed. The point for the comparison is that this trade is identical in both topologies: choosing Virtual WAN over hub-spoke does not add or remove the inspection latency, it only changes who operates the firewall and how the routing to it is expressed. The latency hop is a cost of the hub-and-spoke pattern itself.

What is a secured virtual hub and how does it relate to a manual hub firewall?

A secured virtual hub is a Virtual WAN hub with an Azure Firewall deployed inside it, which gives the centralized inspection that a manual hub-spoke achieves by placing a firewall in the hub VNet. The functional outcome is the same: traffic between spokes and to the internet is inspected by a firewall in the hub. The difference is operational. In the manual design, the team writes the route tables that draw traffic to the firewall and maintains them per spoke. In a secured virtual hub, routing intent names the firewall as the next hop for private and internet traffic, and the platform distributes the routes. So the secured hub is the managed equivalent of the manual hub firewall, delivering the same inspection with the routing operated by the service rather than hand-written, which is the scale-versus-control trade applied to the firewall path specifically.

Is Virtual WAN only worth it if I have branch offices?

No, but branches are the case it most clearly excels at. Virtual WAN was designed around connecting branches over site-to-site VPN and remote users over point-to-site VPN into the same any-to-any fabric as the VNets, so a branch-heavy estate is squarely in its territory. That said, Virtual WAN also earns its place for a VNet-only estate that spans many regions, because the managed inter-hub transit removes the hub-to-hub routing the manual design requires per region pair. The deciding factor is not whether branches exist but whether the estate exercises any of the dimensions (regions, branches, or spoke count) where managed scale beats hand-built control. A multi-region VNet-only estate can favor Virtual WAN on the strength of inter-hub transit alone, with no branch in sight.

Can I run hub-spoke and Virtual WAN at the same time?

Yes, and doing so is the foundation of a safe migration. During a move from a manual hub-spoke to Virtual WAN, the team deploys the Virtual WAN and its hub alongside the existing manual hub, then moves spokes one cohort at a time, connecting each to the virtual hub and validating its routing before removing its old peering. For a period the two designs coexist, and a spoke may briefly be reachable through both, which is why address planning to avoid overlap and per-spoke validation matter throughout. Some estates also run them in parallel deliberately, keeping a specialized workload on a manual hub-spoke for a control requirement while the bulk of the estate runs on Virtual WAN. Coexistence is supported and is the normal posture during migration; the key discipline is validating each connection before cutting the old path.
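The discipline of validating each connection before cutting the old path can be sketched as a loop that stops at the first failed validation. This is an outline of the control flow only: `connect`, `validate`, and `remove_old_peering` are hypothetical callables standing in for whatever tooling the team uses, and the spoke names are invented.

```python
def migrate_cohort(spokes, connect, validate, remove_old_peering):
    """Move spokes one at a time, cutting the old path only after validation.

    Returns (migrated, failed): the spokes fully moved, and the first
    spoke whose validation failed, or None. Stops at the first failure,
    so the old peering stays in place for anything not yet validated.
    """
    migrated = []
    for spoke in spokes:
        connect(spoke)
        if not validate(spoke):
            return migrated, spoke
        remove_old_peering(spoke)
        migrated.append(spoke)
    return migrated, None


# Hypothetical run in which validation fails for the second spoke.
removed = []
migrated, failed = migrate_cohort(
    ["spoke-a", "spoke-b", "spoke-c"],
    connect=lambda spoke: None,
    validate=lambda spoke: spoke != "spoke-b",
    remove_old_peering=removed.append,
)
print(migrated, failed, removed)
```

Because the old peering is removed only after validation succeeds, a failed spoke keeps its original path, which is what makes the coexistence period safe.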

How does on-premises connectivity differ between the two designs?

In a manual hub-spoke, the VPN or ExpressRoute gateway lives in the hub VNet, the team operates it, and spokes reach on-premises through gateway transit enabled on the peering, with routes carrying the on-premises prefixes to the spokes. In Virtual WAN, the gateway is hosted by the virtual hub and provisioned as part of it, and connected VNets reach on-premises through the managed fabric without the team configuring gateway transit per spoke. The connectivity outcome is the same, on-premises reachable from the spokes, but the operation differs: the manual design’s gateway and its routes are the team’s, while Virtual WAN operates the gateway and distributes the routes. For a single on-premises connection the difference is small; for many branches or circuits it grows, which is one more reason branch-heavy estates favor Virtual WAN.

What address-planning mistakes break both designs?

Overlapping address spaces break transit in both topologies, because neither managed nor manual routing can forward correctly between two networks that claim the same prefixes. In a manual hub-spoke this surfaces as a peering that establishes but carries no usable traffic, or as routes that cannot be summarized cleanly. In Virtual WAN it surfaces as a connection that attaches but does not communicate, traced back to the overlap rather than to routing. The discipline is the same in both: plan a non-overlapping address scheme with room to grow before the first spoke joins, because renumbering a live estate is far more expensive than reserving space early. Address planning is the unglamorous foundation that both designs rest on, and getting it wrong produces failures that look like routing problems but are planning problems at root.
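Overlap is cheap to check before a spoke joins. The sketch below uses Python's standard `ipaddress` module to compare every pair of networks in a plan. The network names and prefixes are made up for illustration, and `find_overlaps` is a hypothetical helper rather than part of any Azure tooling.

```python
import ipaddress
from itertools import combinations


def find_overlaps(address_spaces):
    """Return pairs of network names whose prefixes overlap.

    address_spaces: dict mapping a network name to a list of CIDR strings.
    """
    parsed = {
        name: [ipaddress.ip_network(prefix) for prefix in prefixes]
        for name, prefixes in address_spaces.items()
    }
    overlaps = []
    for (name_a, nets_a), (name_b, nets_b) in combinations(sorted(parsed.items()), 2):
        if any(a.overlaps(b) for a in nets_a for b in nets_b):
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical address plan: spoke-b sits inside spoke-a's range.
plan = {
    "hub": ["10.0.0.0/16"],
    "spoke-a": ["10.1.0.0/16"],
    "spoke-b": ["10.1.128.0/17"],
}

print(find_overlaps(plan))
```

Running a check like this against the address plan before each new spoke or branch is attached catches the problem while it is still a planning issue rather than a renumbering project.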

Does choosing Virtual WAN lock me out of fine-grained control permanently?

Not permanently, but it constrains you to the model’s expressiveness while you are on it, and recovering specific behaviors means working around the abstraction. Static routes on a connection let you send selected prefixes to an appliance in a spoke while the rest follows routing intent, and deploying multiple hubs lets you separate concerns that one hub’s single-policy-per-class model cannot. These extend the model meaningfully, but they are workarounds rather than direct expression, and a design that needs many of them is signalling that it values control enough to question whether Virtual WAN is the right base. If a hard control requirement is central, the manual hub-spoke expresses it directly, and you can also run a specialized workload on a manual hub-spoke alongside a Virtual WAN estate. The lock-in is soft, but the friction of fighting the abstraction is real.

What is the single deciding question between the two topologies?

Ask whether the estate values control over its routing more than it values being relieved of the work of operating that routing at scale. That one question is the scale-versus-control rule, and everything else (routing complexity, multi-region scale, cost, the migration timing) is a projection of it onto a specific axis. An estate with a hard, unusual routing requirement and a small stable footprint answers control, and the hand-built hub-spoke fits. An estate with many regions, many branches, a climbing spoke count, and a platform team that would rather spend its time elsewhere answers managed scale, and Virtual WAN fits. Feature comparisons stay ambiguous because both designs do hub-and-spoke connectivity; the deciding question is the one place they genuinely differ, and answering it honestly for the real estate settles the choice faster than any feature checklist.
