Azure Virtual Network (VNet) Deep Dive

Most teams meet the Azure Virtual Network the same way: they accept a default when they create their first virtual machine, never look at it again, and then spend an afternoon six months later trying to work out why two subnets that should talk to each other suddenly cannot. The VNet is the quietest resource in a deployment and the one that decides the most. It sets the private address range every resource draws from, it owns the default routing that moves a packet from one subnet to the next, it governs whether a name resolves to a private address or a public one, and it draws the line between what is reachable inside your cloud estate and what has to cross a gateway to get in. Get the address plan wrong on day one and you inherit a renumbering project later. Misread the default routing and you will chase a connectivity ghost that no firewall rule explains. This guide treats the Azure Virtual Network as the foundational primitive it actually is, and the goal is a working mental model you can reason from rather than a settings tour you forget by lunch.


The reader who finishes this article should be able to plan an address space and a subnet layout deliberately, predict the default routing and name-resolution behavior before deploying anything, and state precisely what a VNet does and does not isolate. That last point is where most production confusion lives, so we will keep returning to it. A VNet is an address and routing boundary. It is not, by itself, a security boundary. The difference between those two statements is the difference between a network you understand and a network that surprises you.

What an Azure Virtual Network actually is

Strip away the portal blades and the VNet is a private slice of address space that Azure scopes to your subscription and a single region, inside which the resources you place can address one another over private IP without anything you configure, and out of which traffic reaches the public internet or other networks only along paths the platform either provides by default or you carve on purpose. The address space is yours within the cloud. It does not have to be globally unique the way a public address must be, which is why almost every VNet uses ranges from the private RFC 1918 blocks. It does have to be unique among any networks you intend to connect, because two networks that share an overlapping range cannot be peered or routed together without translation, and Azure will reject the peering outright.

The mental model that serves best is a building with floors. The VNet is the building, with a street address that is its address space. Each floor is a subnet, a contiguous slice of that address space where you actually place resources. A network interface card attaches a resource to a floor and gives it a specific private address on that floor. The hallways between floors are the system routes the platform installs automatically so that any resource on any floor can reach any resource on any other floor without a router you manage. The front door, the loading dock, and the service elevator are the connectivity options, the gateways and peerings and endpoints that decide how traffic enters and leaves the building. Hold that picture and most VNet questions resolve into “which floor,” “which hallway,” and “which door.”

What does an Azure Virtual Network isolate?

A VNet isolates addressing and default routing, not traffic. Resources inside it share a private address space and can reach one another over the platform’s automatic routes, while resources in a separate VNet cannot reach in without an explicit connection. The VNet does not filter packets by itself, which is why isolation and security are two different design decisions.

That answer is worth dwelling on because the word “isolation” carries a security connotation that the VNet does not earn on its own. Placing two workloads in separate VNets does isolate their address spaces and removes the default reachability between them, and in that narrow sense it is an isolation boundary. What it does not do is inspect or restrict the traffic that flows where reachability exists. Inside a single VNet, every subnet can reach every other subnet by default, with no filtering, until you attach a network security group to change that. So when an architect says “we put the database in its own subnet for isolation,” the subnet placement organizes the address space and gives you a surface to attach controls to, but the isolation only becomes real when the controls are attached. The VNet hands you the boundary; you supply the enforcement.

The address space and why it is a one-way door

The single most consequential decision you make about a VNet is its address space, and it is consequential precisely because it is so hard to change after the fact. You assign one or more CIDR ranges to the VNet at creation, and every subnet you later carve must fit inside one of those ranges. The platform lets you add address ranges to an existing VNet and, with care, remove or resize ranges that are not yet in use, but the moment resources, peerings, and gateways depend on a range, shrinking or renumbering it becomes a migration rather than an edit. This is the renumbering project nobody plans for, and it is entirely avoidable with a few minutes of arithmetic at the start.

The arithmetic that matters is reservation, not utilization. Plan the address space for the network you will have in three years, not the one you are deploying this week. A range that looks absurdly large today costs nothing while it sits unused, because Azure does not bill for address space, only for the resources that consume addresses. A range that turns out to be too small costs you a weekend of careful surgery and a maintenance window. The asymmetry is total, so the correct bias is toward generosity. A common and defensible pattern is to allocate a sizable private block to each VNet, leave deliberate gaps between subnets so a subnet can grow without colliding with its neighbor, and reserve whole ranges for subnets you have not designed yet.

Overlap is the other trap. Because a VNet’s address space must not overlap with any network you connect it to, the addresses you pick today constrain every future peering, every site-to-site tunnel to an on-premises network, and every connection to a partner. If two VNets you later want to peer both use the same popular default range, you cannot peer them, and you are back to renumbering. The discipline that prevents this is an address allocation registry maintained at the organization level: a single source of truth that records which ranges are spoken for, so no two networks are designed into a collision. Teams that skip this step almost always discover the conflict at the worst possible moment, when two business units try to connect networks that were each designed in isolation.
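The collision check a registry exists to perform is small enough to sketch. The following is a minimal illustration using Python's standard `ipaddress` module; the registry entries and network names are hypothetical, and a real registry would live in a shared data store rather than a dictionary.

```python
import ipaddress
from itertools import combinations


def find_overlaps(registry):
    """Return pairs of registry names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in registry.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical registry entries, for illustration only.
registry = {
    "hub-vnet": "10.0.0.0/16",
    "spoke-payments": "10.1.0.0/16",
    "spoke-analytics": "10.1.128.0/17",  # sits inside spoke-payments
    "onprem-dc": "192.168.0.0/20",
}

print(find_overlaps(registry))  # [('spoke-analytics', 'spoke-payments')]
```

Running a check like this before a range is handed out is what turns the registry from a spreadsheet into a guardrail.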

How should I plan VNet address space and subnets?

Start from the largest network you will plausibly need, allocate a generous non-overlapping CIDR block, then divide it into subnets with deliberate gaps so each can grow. Record every range in an organization-wide registry to prevent overlaps that block future peering. Size for three years, because expanding is cheap and renumbering is expensive.

The subnet layer sits one level down from the VNet’s address space, and it is where the design becomes concrete. A subnet is a contiguous sub-range of the VNet, and it is the unit to which most network controls attach. You associate a network security group at the subnet level, you attach a route table at the subnet level, you delegate a subnet to a managed service, and you place network interfaces into a subnet. The instinct to create one subnet and dump everything into it is the instinct to fight, because the subnet is your primary lever for applying different routing and filtering policy to different tiers of an application. A typical layout separates a public-facing tier, an application tier, a data tier, and the infrastructure subnets that gateways and bastions and firewalls require, each sized for its own growth and each a place where you can apply controls without affecting the others.
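One way to make the "deliberate gaps" idea concrete is to give each tier the first half of a block twice its size and leave the second half unallocated, so the subnet can later double without touching its neighbor. The sketch below assumes that policy; the function name, tier names, and address range are illustrative, not a prescribed layout.

```python
import ipaddress


def plan_subnets(vnet_cidr, tiers, prefix=24):
    """Assign each tier a subnet of `prefix` length with an equal-sized gap after it."""
    vnet = ipaddress.ip_network(vnet_cidr)
    # Walk the VNet in blocks one bit larger than requested. Each tier takes
    # the first half of its block; the second half stays free for growth.
    blocks = vnet.subnets(new_prefix=prefix - 1)
    plan = {}
    for tier, block in zip(tiers, blocks):
        plan[tier] = next(block.subnets(new_prefix=prefix))
    if len(plan) < len(tiers):
        raise ValueError("address space too small for the requested tiers")
    return plan


# Hypothetical tiers in a hypothetical 10.20.0.0/16 VNet.
for tier, subnet in plan_subnets("10.20.0.0/16", ["web", "app", "data"]).items():
    print(tier, subnet)
# web 10.20.0.0/24
# app 10.20.2.0/24
# data 10.20.4.0/24
```

Everything past the last tier's block remains unassigned, which is where the infrastructure subnets and the subnets you have not designed yet can go.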

Reserved addresses and the math that trips people up

Every subnet in Azure quietly loses several addresses that you might expect to be usable, and the engineer who sizes a subnet by counting hosts and forgetting the reservations will run out of addresses earlier than the math suggested. In a standard subnet, the platform reserves the first and last addresses of the range for the network and broadcast equivalents, as any IP network does, and on top of that Azure reserves the next three addresses at the start of every subnet for its own internal use: one for the default gateway, and two that map Azure’s DNS service into the virtual network’s address space. The practical consequence is that a subnet does not give you the full count of addresses its mask implies.

Why does each subnet lose several usable addresses?

Azure reserves five addresses in every subnet: the first and last for the network and broadcast roles, plus three more at the start for the default gateway and for mapping Azure’s DNS service into the VNet. A subnet’s usable address count is therefore its total size minus five, which matters most for small subnets where five is a large fraction.

This matters most at small sizes, where five reserved addresses is a punishing fraction of the total. A subnet sized with a /29 mask has only eight addresses, so after the five reservations you are left with three usable addresses, which is rarely enough for anything real once you account for scaling. This is why Azure enforces a minimum subnet size, with /29 the smallest it accepts, and why experienced engineers avoid the very small masks except for narrow, fixed-size purposes. The reservation also explains a class of confusing errors where a deployment fails to find a free address in a subnet that, by a naive count, should have room: the naive count forgot the five, and the autoscaling event or the additional network interface tipped the subnet over its real capacity. Size every subnet by subtracting five from the total number of addresses its mask provides, then leave headroom on top of that, and the surprise disappears.
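The subtraction is easy to check with a few lines of Python. This is a small sketch of the arithmetic described above, using the standard `ipaddress` module; the function names and the example range are illustrative.

```python
import ipaddress

AZURE_RESERVED = 5  # first address, the next three, and the last address


def usable_addresses(cidr):
    """Total addresses in the subnet minus the five Azure reserves."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED


def reserved_addresses(cidr):
    """The five addresses a subnet loses: .0, .1, .2, .3 and the last one."""
    net = ipaddress.ip_network(cidr)
    return [str(net[i]) for i in (0, 1, 2, 3, -1)]


for cidr in ("10.0.1.0/29", "10.0.1.0/27", "10.0.1.0/24"):
    print(cidr, usable_addresses(cidr))
# 10.0.1.0/29 3
# 10.0.1.0/27 27
# 10.0.1.0/24 251

print(reserved_addresses("10.0.1.0/24"))
# ['10.0.1.0', '10.0.1.1', '10.0.1.2', '10.0.1.3', '10.0.1.255']
```

The first usable address in any subnet is therefore the fifth one in the range, which is why the first network interface you create in 10.0.1.0/24 typically lands on 10.0.1.4.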

A second piece of the math is that certain managed services demand a dedicated subnet, sometimes with a minimum size larger than you would expect, and sometimes with a reserved name. A VPN or ExpressRoute gateway requires a subnet named exactly GatewaySubnet, an Azure Firewall requires its own subnet named AzureFirewallSubnet, a Bastion host requires its own subnet named AzureBastionSubnet, and several platform-as-a-service offerings require a delegated subnet they will not share. Each of these consumes a slice of your address space that you must plan for in advance, which is another argument for generosity at the VNet level. Designing the address space without leaving room for the infrastructure subnets is a common way to back yourself into a corner where the only fix is, again, renumbering.

How traffic actually moves: the default route table

The behavior that surprises engineers most often is that an Azure VNet routes traffic between its subnets with no configuration at all, and it does so because the platform installs a set of system routes into every subnet automatically. These default routes are invisible until you go looking for them, and understanding them is the difference between predicting your network’s behavior and being baffled by it. The platform creates a route that covers the entire VNet address space and points it at the virtual network itself, which is what lets any subnet reach any other subnet directly. It creates a default route to the internet so that, by default, outbound traffic to public addresses has a path. It creates routes that black-hole certain reserved ranges. And when you add a peering or a gateway, the platform injects additional system routes so the connected networks become reachable.

The decisive property of these system routes is that they exist whether or not you want them, and the only way to change where a given destination’s traffic goes is to install a user-defined route that is more specific or that overrides the system route for that prefix. A route table you create and attach to a subnet does not replace the system routes wholesale; it adds your routes on top, and Azure selects the route by longest-prefix match, with user-defined routes winning ties against system routes. This is the route-then-filter model in action: routing decides the path a packet takes, and it is a completely separate decision from whether a network security group permits the packet once it is on that path. Almost every connectivity problem in Azure reduces to one of two questions. Which route is this packet following? And which rule is allowing or dropping it? Keeping those two questions distinct is the core networking skill, and conflating them is the most common reason an engineer fixes the wrong thing.

How does default routing work inside a VNet?

Azure installs system routes into every subnet automatically: one covering the whole VNet address space so subnets reach each other, a default internet route, and routes injected by peerings and gateways. You change a destination’s path only by adding a user-defined route that is more specific or overrides the system route, selected by longest-prefix match.

Consider the failure mode this creates. A team deploys a network virtual appliance, a firewall instance they want all subnet traffic to pass through, and they assume that placing it in the VNet is enough to force traffic through it. It is not. The system route still sends inter-subnet traffic directly, bypassing the appliance entirely, because nothing told the platform to send traffic to the appliance first. The fix is a user-defined route that overrides the relevant prefixes and points them at the appliance’s address as the next hop. Forget that route and the appliance sits idle while traffic flows around it, which looks like the appliance is broken when the real problem is that no route directs traffic to it. The mirror-image failure is a user-defined route that points a prefix at an appliance which then fails or is removed, black-holing all traffic for that prefix because the route still says “send it here” and “here” no longer forwards. Routing is powerful precisely because it is absolute, and that is also why a wrong route is so destructive.
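The selection rule behind both failures can be modeled in a few lines. The sketch below is a simplified illustration, not Azure's implementation: it considers only system and user-defined routes (it ignores routes learned over BGP), and the `Route` type, next-hop labels, and addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # destination CIDR
    next_hop: str  # illustrative label or appliance address
    source: str    # "system" or "user"


def select_route(routes, destination):
    """Pick the route for a destination: longest prefix, user route wins ties."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen, r.source == "user"),
    )


system_routes = [
    Route("10.0.0.0/16", "VirtualNetwork", "system"),
    Route("0.0.0.0/0", "Internet", "system"),
]
# Hypothetical UDR sending data-subnet traffic to an appliance at 10.0.100.4.
udr = Route("10.0.2.0/24", "10.0.100.4", "user")

print(select_route(system_routes, "10.0.2.10").next_hop)          # VirtualNetwork
print(select_route(system_routes + [udr], "10.0.2.10").next_hop)  # 10.0.100.4
```

Without the user-defined route, traffic to 10.0.2.10 follows the system route and never touches the appliance. With it, the more specific /24 wins, and if the appliance at that next hop stops forwarding, the same route becomes the black hole described above.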

The outbound default and why it is changing

For most of the VNet’s history, a resource deployed without an explicit outbound configuration still reached the internet, because Azure provided a default outbound address automatically. This was convenient and it was also a quiet liability, because the default outbound address is shared platform infrastructure that you do not own, cannot predict, and cannot allowlist on a remote firewall. Microsoft has announced that this implicit default outbound access is being retired for newly created deployments, with the platform steering engineers toward explicit outbound methods instead, such as a NAT gateway attached to the subnet, a load balancer with outbound rules, or an explicit public IP on the resource. The exact retirement timeline and the precise conditions under which the change applies are the kind of platform detail that shifts, so confirm the current state against the official Azure networking documentation at the time you read this rather than treating any date as fixed.

The design lesson is durable regardless of the timeline. Relying on implicit outbound access was always fragile because it gave you no control over the source address your traffic presented, which made remote allowlisting impossible and made SNAT port exhaustion a real risk under load. The deliberate pattern is to attach a NAT gateway to subnets that need outbound internet access, because the NAT gateway gives you a stable, owned set of outbound addresses, a much larger pool of SNAT ports than the implicit mechanism, and a single place to reason about your egress. Treat outbound internet access as something you design rather than something you inherit, and the platform change costs you nothing because you were never depending on the default in the first place.

Peering: connecting VNets and the rule that catches everyone

A single VNet is regional and self-contained, so any realistic estate, one that spans regions or business units or follows a hub-and-spoke topology, depends on connecting VNets together, and the primary mechanism for that is VNet peering. Peering links two VNets so that resources in one can reach resources in the other over the Azure backbone using private addresses, with low latency and no public internet exposure, as if the two address spaces were part of one routable network. It works within a region and, as global VNet peering, across regions. When you peer two VNets, the platform injects system routes into both so each VNet’s address space becomes reachable from the other, which is why peered VNets simply work once the link is established, provided their address spaces do not overlap.

Is VNet peering transitive across a hub?

No. VNet peering is not transitive. If spoke A peers with a hub and spoke B peers with the same hub, A and B still cannot reach each other through the hub by peering alone, because peering only establishes reachability between the two directly peered VNets. Connecting spokes through a hub requires user-defined routes plus a forwarding appliance or gateway in the hub.

The non-transitive property is the single most misunderstood fact about peering, and it produces a predictable failure. An architect builds a hub-and-spoke design, peers each spoke to the hub, and assumes the spokes can now reach each other through the hub because they are all connected to it. They cannot. Peering establishes reachability only between the two VNets directly joined by the peering link. Spoke A’s peering to the hub makes the hub reachable from A and A reachable from the hub, and nothing more. For spoke A to reach spoke B, you need the hub to actually forward traffic between them, which means a routing appliance or a gateway in the hub and user-defined routes in each spoke that send the other spoke’s prefix to that hub appliance as the next hop. The peering carries the packets to the hub; the hub’s forwarding logic and the spokes’ route tables carry them the rest of the way. Skip that and the spokes are islands that can each see the hub but never each other. When you reach the point of weighing peering against gateways and circuits for connecting networks at scale, the trade-offs among VNet peering, VPN gateways, and ExpressRoute become the deciding factors, and the non-transitive rule is exactly why hub designs need more than peering to function.
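A toy model makes the rule hard to forget. The sketch below treats peerings as unordered pairs and models one direction of a flow; the function name, VNet names, and the shape of the `udrs` mapping are assumptions made for illustration, and a real connection also needs the matching route for the return direction.

```python
def packet_can_reach(peerings, src, dst, udrs=None, forwarding_hubs=frozenset()):
    """One-way reachability between VNets under a simplified peering model.

    peerings: set of frozenset({vnet_a, vnet_b}) direct peering links
    udrs: {(src, dst): hub} meaning src has a route sending dst's prefix to hub
    forwarding_hubs: hubs that run an appliance or gateway that forwards traffic
    """
    if src == dst or frozenset((src, dst)) in peerings:
        return True  # same VNet, or directly peered
    hub = (udrs or {}).get((src, dst))
    return (
        hub in forwarding_hubs
        and frozenset((src, hub)) in peerings
        and frozenset((hub, dst)) in peerings
    )


peerings = {frozenset(("hub", "spoke-a")), frozenset(("hub", "spoke-b"))}

print(packet_can_reach(peerings, "spoke-a", "hub"))      # True
print(packet_can_reach(peerings, "spoke-a", "spoke-b"))  # False: not transitive

udrs = {("spoke-a", "spoke-b"): "hub"}
print(packet_can_reach(peerings, "spoke-a", "spoke-b", udrs, {"hub"}))  # True
```

The third call succeeds only because both ingredients are present: a route in the spoke that points at the hub, and a hub that actually forwards. Remove either and the result is False again.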

Peering also carries a few properties worth committing to memory. It is non-transitive, as covered. It requires non-overlapping address spaces, like every other connection method. Each side of the peering can be independently configured to allow or block forwarded traffic, gateway transit, and access to the remote VNet, so a peering is really two directional configurations that you tune separately. And gateway transit, the feature that lets a spoke use a gateway that physically lives in the hub, is what makes the hub-and-spoke pattern economical, because it means you provision one expensive gateway in the hub rather than one per spoke. These knobs are where a peering goes from “connected” to “connected the way you intended,” and reading them carefully prevents the subtle cases where traffic flows but not along the path you assumed.

Name resolution: the separate concern that breaks things quietly

Routing gets a packet to an address, but the application almost never starts with an address; it starts with a name, and name resolution is a concern entirely separate from routing that engineers consistently conflate with it. A VNet provides name resolution by default through an Azure-supplied resolver, reachable from inside the VNet at the well-known virtual address 168.63.129.16, which resolves public names and provides automatic resolution of the resources within the same VNet by their hostnames. This default is enough for many simple deployments and it requires no configuration, which is exactly why it disappears from people’s attention until it is the thing that breaks.

Should a VNet use Azure-provided or custom DNS?

Use Azure-provided DNS when you only need public name resolution and automatic resolution of resources within a single VNet. Switch to custom DNS servers when you need resolution across peered VNets, integration with on-premises DNS, conditional forwarding, or private DNS zones for private endpoints. The choice is set on the VNet and applies to every resource in it.

The default resolver has real limits that push serious deployments toward a custom configuration. It does not resolve names of resources in a different VNet, even a peered one, by their short hostnames, so the moment your application spans VNets you need a name-resolution strategy that crosses them. It does not, on its own, give you the conditional forwarding you need to resolve on-premises names from Azure and Azure names from on-premises in a hybrid network. And it is not the mechanism that makes private endpoints resolve to their private addresses; that job belongs to private DNS zones linked to the VNet. So the progression is predictable. A single VNet with public dependencies uses the default and is fine. A network that spans VNets, reaches on-premises, or uses private endpoints needs either custom DNS servers configured on the VNet, the Azure DNS Private Resolver, or private DNS zones, depending on the requirement. The detail of how public zones, private zones, auto-registration, and conditional forwarding fit together is involved enough that Azure DNS and Private DNS zones deserve their own treatment, but the principle to carry here is that name resolution is its own layer, and a connectivity problem that turns out to be a resolution problem is one of the most time-consuming to diagnose precisely because engineers keep looking at routes and firewalls when the name never resolved to the right address in the first place.

A concrete and common version of this: a team stands up a private endpoint for a storage account or a database, configures the network path correctly, confirms the route and the security rules allow the traffic, and the connection still fails. The cause is almost always DNS. Without the private DNS zone linked to the VNet, the application resolves the storage account’s name to its public address rather than the private endpoint’s address, so the perfectly good private path is never used because the application is aiming at the wrong destination. The network is correct; the name is wrong. This is the canonical example of why the route-then-filter-then-resolve concerns must be kept mentally distinct, because the symptom looks like a network failure and the fix is entirely in the resolution layer.
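The mechanics of that failure can be sketched as a toy resolver. Everything here is illustrative: the account name and both IP addresses are made up, the public record chain is shortened to a single CNAME followed by an A record, and `resolve` is a stand-in for what the VNet's resolver does, not a real DNS client.

```python
# Simplified public DNS view for a hypothetical storage account.
PUBLIC_DNS = {
    "examplestore.blob.core.windows.net":
        ("CNAME", "examplestore.privatelink.blob.core.windows.net"),
    "examplestore.privatelink.blob.core.windows.net":
        ("A", "203.0.113.10"),  # stand-in for the public endpoint
}


def resolve(name, linked_zones=None):
    """Follow the CNAME chain, preferring records in private zones linked to the VNet."""
    linked_zones = linked_zones or {}
    for _ in range(8):  # bound the CNAME chase
        for zone, records in linked_zones.items():
            if name.endswith("." + zone) and name in records:
                return records[name]
        record_type, value = PUBLIC_DNS[name]
        if record_type == "A":
            return value
        name = value
    raise RuntimeError("CNAME chain too long")


# Hypothetical private zone holding the private endpoint's address.
private_zone = {
    "privatelink.blob.core.windows.net": {
        "examplestore.privatelink.blob.core.windows.net": "10.0.3.5",
    }
}

print(resolve("examplestore.blob.core.windows.net"))                # 203.0.113.10
print(resolve("examplestore.blob.core.windows.net", private_zone))  # 10.0.3.5
```

The application asks for the same name in both cases. Only the presence of the linked zone changes the answer, which is why no amount of route or rule inspection finds the fault.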

Network security groups: where filtering actually happens

Because the VNet itself does not filter traffic, the enforcement layer is the network security group, a stateful packet filter you attach to a subnet or to an individual network interface, and understanding where it sits relative to routing is what completes the route-then-filter model. A network security group holds a set of allow and deny rules evaluated in priority order, lowest number first, with the first matching rule deciding the outcome, plus a set of default rules that, among other things, permit traffic within the VNet and from the load balancer and deny inbound from the internet unless you allow it. Because the rules are stateful, allowing an inbound flow automatically permits the corresponding return traffic, so you reason about the direction the connection is initiated in rather than having to mirror every rule.

The placement choice, subnet or network interface, is a real design decision rather than a formality. A network security group on a subnet applies to every resource in that subnet, which is the natural place for tier-wide policy. A network security group on a network interface applies only to that one resource, which is the place for an exception. When both are present, traffic must pass both, so the effective policy is the intersection, and an allow on the subnet that is denied on the interface still results in a drop. The depth of how rules, priorities, default rules, service tags, and application security groups interact is substantial, and the full evaluation logic belongs in a dedicated study of Network Security Groups, but the load-bearing idea for VNet design is that the network security group is the filter and the VNet plus its route tables decide the path, and a packet has to survive both to arrive. When traffic that routes correctly still does not arrive, the network security group is the first place to look, and when traffic arrives that should not, the route is sending it somewhere your filter does not cover. The full picture of how routing and filtering compose into the path a packet takes is the subject of the broader Azure networking fundamentals treatment, which builds the packet-path map that the troubleshooting articles draw on.
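The intersection behavior is worth seeing in miniature. The sketch below is a heavily simplified model: rules match on destination port only, the default rules are collapsed into a single catch-all deny at priority 65500, and the `Rule` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int
    port: Optional[int]  # None matches any port
    action: str          # "Allow" or "Deny"


def evaluate(rules, port):
    """First matching rule in priority order (lowest number first) decides."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule.action
    return "Deny"


def inbound_allowed(subnet_nsg, nic_nsg, port):
    """With a group on both subnet and interface, the flow must pass both."""
    return evaluate(subnet_nsg, port) == "Allow" and evaluate(nic_nsg, port) == "Allow"


DENY_ALL = Rule(65500, None, "Deny")  # simplified stand-in for the default deny

subnet_nsg = [Rule(100, 443, "Allow"), DENY_ALL]
nic_blocks = [Rule(200, 443, "Deny"), DENY_ALL]
nic_allows = [Rule(200, 443, "Allow"), DENY_ALL]

print(inbound_allowed(subnet_nsg, nic_blocks, 443))  # False
print(inbound_allowed(subnet_nsg, nic_allows, 443))  # True
print(inbound_allowed(subnet_nsg, nic_allows, 22))   # False
```

The first result is the case that catches people: the subnet-level allow is real, but the interface-level deny still drops the flow.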

The limits and quotas that quietly shape a design

A VNet design runs into platform limits long before it runs into anything theoretical, and knowing where those ceilings sit prevents architecting toward a wall. There are limits on the number of VNets per subscription per region, on the number of subnets within a VNet, on the number of peerings a single VNet can hold, on the number of address ranges and route-table routes and security-group rules, and on the addresses a single VNet can carry. The specific numbers are exactly the kind of platform value that Azure raises over time and that varies by subscription type, so the discipline is never to memorize a figure but to know the dimension exists and to confirm the current ceiling against the official documentation when a design approaches it. What matters more than the numbers is the architectural lesson they teach: a single VNet does not scale infinitely, and a design that tries to put everything in one enormous VNet eventually meets a peering limit, a route limit, or a rule limit that forces a restructure.

The peering limit in particular shapes large topologies. Because a VNet can hold only so many peerings, a flat mesh where every VNet peers with every other VNet does not scale, which is one of the structural reasons hub-and-spoke and managed connectivity backbones exist. They reduce the number of peerings each VNet must hold by routing through a hub rather than connecting every pair directly. The route-table and security-rule limits teach a parallel lesson: a design that accumulates hundreds of bespoke routes or rules is usually a design that should be simplified with summarized prefixes, service tags, and application security groups rather than one that should request a limit increase. Hitting a limit is frequently a signal that the design has grown more complicated than it needs to be, and the right response is often to consolidate rather than to expand the ceiling. Treat the limits as design feedback, confirm the live numbers when you near them, and structure topologies so no single VNet has to hold an unreasonable count of anything.
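The scaling difference between the two topologies is plain arithmetic. The sketch below counts peering links, treating each link as one unit regardless of how the platform represents its two sides; the function names are illustrative.

```python
def mesh_links(vnets):
    """Links needed so every VNet peers directly with every other VNet."""
    return vnets * (vnets - 1) // 2


def hub_spoke_links(vnets):
    """Links needed when one of the VNets is a hub and the rest are spokes."""
    return vnets - 1


for n in (5, 20, 100):
    print(n, mesh_links(n), hub_spoke_links(n))
# 5 10 4
# 20 190 19
# 100 4950 99
```

In the mesh every VNet holds a link to each of the others, while in hub-and-spoke each spoke holds exactly one. The hub still carries one link per spoke, so the hub is the VNet whose peering count you watch as the estate grows.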

Are there limits on how large a VNet can be?

Yes. Azure caps the address ranges, subnets, peerings, routes, and rules a single VNet can hold, and the exact figures vary by subscription and change over time, so confirm them against current documentation when a design nears one. The architectural takeaway is that one VNet does not scale infinitely, which is why large estates use hub-and-spoke topologies rather than a single sprawling network.

How a packet from the internet reaches a backend

Inbound traffic from the internet to a workload inside a VNet follows a path worth tracing once, because understanding it clarifies where each piece of the puzzle fits and why the parts that feel redundant are not. A public address is the only thing the outside world can target, so inbound traffic arrives at a public address associated with a load balancer, an application gateway, a firewall, or directly with a resource. A load balancer at the standard tier receives that traffic on a frontend public address, matches it against its rules, and distributes it across a backend pool of resources by their private addresses, using health probes to send traffic only to instances that respond. The load balancer does not terminate the connection on the backends’ behalf; it directs the flow to a private address in a subnet, where the subnet’s routing and the resource’s security group then apply exactly as they do for any other flow.

The reason this matters for VNet design is that the inbound path threads through several of the constructs the article has built up, and seeing them in sequence dispels the sense that they overlap. The public address is the reachable target. The load balancer or gateway is the distribution and entry logic. The backend pool’s members live on private addresses in subnets. The route table on those subnets governs where the backends’ own traffic goes next, and the security groups decide whether the inbound flow is permitted to reach them. A health probe that fails takes an instance out of rotation at the load balancer level, which is a different concern from a security group dropping traffic or a route misdirecting it, and conflating a failing probe with a filtering problem is a common misdiagnosis. When inbound traffic does not reach a backend, the questions are again separable: is the public address and load balancer rule correct, is the backend healthy according to its probe, does the route carry the traffic, and does the security group permit it. Each is a distinct layer with a distinct check.

Availability zones and what a VNet does across them

A VNet is a regional construct, and a region in many cases comprises multiple availability zones that are physically separated datacenters within that region, which raises a question engineers ask and often answer wrongly: does a subnet belong to a zone? It does not. A subnet spans the zones of its region, and zonal placement is a property of the resources you put in the subnet rather than of the subnet itself. You place a virtual machine or other zonal resource into a specific zone, or you spread a set of them across zones for resilience, and they all draw addresses from the same zone-spanning subnet. This means designing for zone resilience is about how you distribute resources across zones, not about carving zone-specific subnets, and a team that creates a subnet per zone has misunderstood the model and added structure that buys nothing.

The resilience implication is that a workload survives the loss of a single zone only if its resources are spread across zones and the entry points in front of them are zone-redundant, which is where the standard tier of public addresses and load balancers earns its place, since their zone redundancy keeps the entry point alive when a zone fails. A workload concentrated in one zone, even inside a perfectly designed VNet, fails entirely when that zone does, because the VNet’s regional nature does not by itself distribute the resources. The VNet provides the address space and routing that span the zones; using that span for resilience is a deliberate distribution decision you make for each tier. Reasoning about availability zones, region pairs, and zonal versus zone-redundant resources at the depth a resilience design needs is involved enough to warrant its own treatment, but the load-bearing fact here is that the subnet is not zonal and resilience comes from spreading resources across the zones the subnet already spans.

Does a subnet belong to a single availability zone?

No. A subnet spans all the availability zones in its region, and zonal placement is a property of the resources you deploy into it, not of the subnet. You achieve zone resilience by spreading resources across zones and fronting them with zone-redundant entry points, not by creating a separate subnet per zone, which adds structure without benefit.

The InsightCrunch VNet design reference

The following reference distills the load-bearing facts of VNet design into one place, so an engineer planning or debugging a network can place any decision on it. This is the artifact to bookmark, because almost every VNet question lands on one of these rows.

Design element	What it controls	The fact that trips people up	The deliberate practice
VNet address space	The private CIDR every subnet draws from	Hard to change once resources depend on it; must not overlap any connected network	Size for three years, keep an org-wide non-overlapping registry
Subnet	The unit controls attach to (NSG, route table, delegation)	Loses five reserved addresses; some services need a dedicated or named subnet	Subtract five from the address count the mask yields, leave gaps for growth and infra subnets
System routes	Default reachability between subnets, to internet, across peerings	Exist automatically; route inter-subnet traffic directly with no filtering	Add UDRs to override; never assume an appliance is in-path without a route
User-defined routes	Overriding the path for a prefix	Selected by longest-prefix match; a stale UDR black-holes traffic	Point prefixes at the next hop you intend; audit when appliances change
Outbound internet	The source address your egress presents	Implicit default outbound is being retired; gives no control or stable address	Attach a NAT gateway for stable, owned egress and ample SNAT ports
VNet peering	Private reachability between two VNets	Non-transitive; needs non-overlapping spaces; directional knobs per side	Use a hub appliance plus UDRs to connect spokes; enable gateway transit
DNS	How names resolve to addresses	Default resolver does not cross VNets or resolve private endpoints	Use custom DNS or private DNS zones once you span VNets or use private endpoints
Network security group	Whether a permitted-path packet is allowed	The filter, not the router; subnet and NIC policies intersect	Keep filtering separate from routing in your diagnosis

The namable claim this reference advances is what we have been circling throughout, and it is worth stating plainly as the VNet boundary rule: an Azure Virtual Network is an address and routing boundary, not a security boundary, so it isolates addressing and default reachability while leaving filtering to a separate layer, which is why every VNet design and every VNet problem decomposes into the three distinct questions of which addresses, which route, and which rule. An engineer who internalizes that decomposition stops chasing the wrong layer, and a network designed around it behaves the way its diagram says it should.

Dual-stack addressing and where IPv6 fits

A VNet can run as a dual-stack network, carrying both IPv4 and IPv6 address space so that resources hold an address in each family, which matters for workloads that must serve IPv6 clients or that anticipate the long migration the wider internet is making. In a dual-stack VNet you assign an IPv6 range alongside the IPv4 range at the VNet level, carve dual-stack subnets that each hold an IPv4 and an IPv6 prefix, and resources receive an address from each family. The routing and filtering concepts carry over with their own per-family rules: routes and security-group rules apply to the address family they name, so an IPv4 allow rule does nothing for IPv6 traffic and a complete dual-stack design must cover both families deliberately rather than assuming a rule written for one protects the other.

The practical guidance is to adopt dual-stack only when there is a real requirement, because it doubles the addressing and rule surface you maintain and introduces a class of mistake where one family is secured and the other is left open. A team that enables IPv6 and writes filtering rules only for IPv4 has created an unguarded path on the IPv6 side that a security review will flag, so the rule of thumb is that any subnet carrying both families needs its security policy written for both, checked for both, and reasoned about for both. Where IPv6 is not a genuine requirement, the simpler single-stack IPv4 design is less to get wrong. Where it is required, treat the IPv6 plane as a full peer of the IPv4 plane in every routing and filtering decision rather than an afterthought bolted on, because the asymmetry of a half-configured second family is exactly where exposure hides.
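The both-families rule can be turned into a crude pre-review check. The sketch below is illustrative only: the function name, the prefixes, and the idea of representing a rule set as a list of source prefixes are all assumptions, and it ignores wildcard sources and service tags. It only answers whether a subnet carries an address family for which no rule prefix exists at all.

```python
import ipaddress


def families_missing_rules(subnet_prefixes, rule_source_prefixes):
    """Return the IP versions a subnet carries that no rule prefix covers."""
    subnet_versions = {ipaddress.ip_network(p).version for p in subnet_prefixes}
    rule_versions = {ipaddress.ip_network(p).version for p in rule_source_prefixes}
    return sorted(subnet_versions - rule_versions)


# Hypothetical dual-stack subnet whose rules were written for IPv4 only.
subnet = ["10.20.1.0/24", "fd00:20:1::/64"]
rules = ["10.20.0.0/24"]

# A non-empty result means one family has no rules written for it.
print(families_missing_rules(subnet, rules))
```

A non-empty result is a prompt to go and write the missing family's policy, not proof of exposure; an empty result says only that both families were considered, not that the rules are right.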

The operational lifecycle: dependencies, deletion order, and governance

A VNet sits at the bottom of a dependency stack, which has operational consequences that surface most sharply when you try to tear something down. You cannot delete a VNet while resources still depend on it, and you cannot delete a subnet while a network interface, a delegated service, or certain platform resources still occupy it, so the deletion order is the reverse of the creation order: the dependents come down first and the network comes down last. The frequent frustration here is an attempt to delete or resize a subnet that fails with a message about a resource still in it, and the resource is sometimes a managed service’s hidden interface or a delegation that is not obvious in the portal. The fix is to find and remove every dependent first, and the prevention is to understand that the network is foundational and everything built on it has to be cleared before the foundation can move.
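The reverse-order rule is easy to see with a small dependency graph. This is a minimal sketch, assuming a hypothetical chain of resources and using Python's standard `graphlib` module (Python 3.9 or later); it models the ordering idea only and says nothing about how Azure Resource Manager itself is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical resources, each mapped to the resources it depends on.
depends_on = {
    "vnet-core": set(),
    "snet-app": {"vnet-core"},
    "nic-app-01": {"snet-app"},
    "vm-app-01": {"nic-app-01"},
}

# A topological order lists dependencies before their dependents: creation order.
creation_order = list(TopologicalSorter(depends_on).static_order())

# Tearing down walks the same order backwards: dependents first, network last.
deletion_order = list(reversed(creation_order))

print("create:", creation_order)
print("delete:", deletion_order)
```

When a subnet deletion fails, the useful question is which node is still hanging off it in this graph, including the ones the portal does not show prominently.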

Governance of VNets follows the same foundational logic. Because the network underpins everything, it is a natural place to apply organizational controls: resource locks that prevent accidental deletion of a network many workloads depend on, policies that enforce naming and address-range standards so the organization-wide registry stays accurate, and a clear ownership model so that the team responsible for the address space is the team that approves changes to it. A network that anyone can edit is a network that drifts, and drift in the address plan or the routing is exactly the kind of change that breaks distant workloads in ways that are hard to trace back to the edit that caused them. The way the platform models these dependencies and the order in which it creates and destroys them is part of the broader resource management story, and understanding how Azure Resource Manager sequences dependent resources explains both the deletion-order behavior and why the network reliably comes up before the things that ride on it. Treat the VNet as shared infrastructure with an owner, a lock, and a policy, not as a resource any team edits at will, and the network stays as predictable as its diagram.

Why can I not delete my subnet or VNet?

A VNet or subnet cannot be deleted while resources still depend on it, including network interfaces, delegated services, and platform resources whose presence is not always obvious in the portal. Remove every dependent resource first, since the network is foundational and must come down last. The deletion order is the reverse of the creation order.

Subnet delegation and service injection

Two related mechanisms let managed services live inside your VNet, and distinguishing them prevents a class of confusion about why a subnet behaves differently than expected. Subnet delegation hands control of a subnet to a specific Azure service so that the service can deploy its managed instances into your address space and apply the policies it requires, which is how offerings such as certain database and container platforms integrate into a VNet without you managing their plumbing. A delegated subnet is dedicated to that service and will not accept arbitrary resources, so you plan it as a single-purpose slice. Service injection, the broader idea, is the pattern of placing a platform service’s compute directly into your VNet rather than reaching it over a public endpoint, which keeps the traffic on private addresses and inside your routing and filtering controls.

The design implication is that VNet integration for platform services is not free address space; each integrated service consumes a subnet, sometimes with a minimum size and specific delegation, and the cumulative demand of several integrated services is easy to underestimate. This is one more input to the address-space generosity argument. A network that looked comfortably sized for virtual machines can run short once it also hosts a delegated database subnet, a container platform subnet, a gateway subnet, a firewall subnet, and a bastion subnet, each carved from the same address space and each unavailable for general use. Account for the infrastructure and integration subnets at planning time, because discovering them one at a time as you add services is how a generous-looking address space quietly fills up.
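A quick tally makes the point about cumulative demand concrete. The sketch below assumes a hypothetical /22 VNet and a hypothetical set of subnets; the names and sizes are illustrative and are not the documented minimums for any service, which should be confirmed against the official documentation.

```python
import ipaddress
from itertools import combinations

# Hypothetical VNet and plan: names and sizes are illustrative only.
vnet = ipaddress.ip_network("10.30.0.0/22")
plan = {
    "snet-vms": "10.30.0.0/24",
    "snet-db-delegated": "10.30.1.0/24",
    "snet-containers": "10.30.2.0/24",
    "snet-firewall": "10.30.3.0/26",
    "snet-bastion": "10.30.3.64/26",
    "snet-gateway": "10.30.3.128/27",
}
subnets = {name: ipaddress.ip_network(prefix) for name, prefix in plan.items()}

# Every subnet must sit inside the VNet and must not overlap a sibling.
assert all(s.subnet_of(vnet) for s in subnets.values())
assert not any(a.overlaps(b) for a, b in combinations(subnets.values(), 2))

allocated = sum(s.num_addresses for s in subnets.values())
free = vnet.num_addresses - allocated
general_purpose = subnets["snet-vms"].num_addresses

print(f"VNet total:  {vnet.num_addresses}")
print(f"Allocated:   {allocated}")
print(f"Unallocated: {free}")
print(f"General-purpose share: {general_purpose / vnet.num_addresses:.0%}")
```

In this made-up plan only a quarter of the range is left for general-purpose virtual machines once the integration and infrastructure subnets are carved, which is the effect the paragraph above describes.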

Public and private addressing, and why the IP SKU matters

A network interface inside a VNet always carries a private address from its subnet, and it may additionally carry a public address when the resource needs to be reachable from or to initiate traffic to the internet directly. The distinction between these two address types, and the way the public address is configured, shapes both reachability and resilience in ways that catch teams off guard. A private address is what every internal flow uses, it comes from the subnet range, and it can be assigned dynamically by the platform or pinned statically when a resource needs a predictable address that survives restarts. A public address is a separately billed resource you associate with an interface, a load balancer, or a gateway, and it is the only address that the outside world can target.

The public IP resource carries a tier choice that has consequences beyond the address itself. The newer standard tier is closed to inbound traffic by default and requires an explicit security rule to permit it, it supports zone redundancy so the address survives the loss of a single availability zone, and it is the tier the modern load balancer and many current services expect. The older basic tier was open by default, lacked zone redundancy, and is on a deprecation path. The practical guidance is to use the standard tier for anything you are building today, because the default-closed posture is safer, the zone redundancy matters for resilience, and the basic tier’s retirement means anything built on it inherits a migration later. Confirm the current deprecation timeline against the official documentation, since these schedules move, but the direction is settled and building on the standard tier avoids the rework.

The static-versus-dynamic choice for private addresses deserves the same deliberate treatment. A dynamically assigned private address is fine for stateless compute that does not care what address it holds, but anything that another resource targets by address, a domain controller, a custom DNS server, an appliance that a route table points at as a next hop, needs a static private address so that a restart does not move it and silently break every reference. The classic incident here is a user-defined route pointing at an appliance whose address was dynamic; the appliance restarts, takes a different address, and the route now points at nothing while the old address black-holes traffic. Pinning the appliance’s private address removes that failure mode entirely. The rule of thumb is that any address something else depends on should be static, and only addresses nothing references can safely float.

When should a resource have a static private address?

Pin a static private address whenever another resource targets it by address: custom DNS servers, domain controllers, and any appliance a user-defined route names as its next hop. A dynamic address is fine only for stateless compute that nothing references. The common break is a route pointing at an appliance whose dynamic address changed after a restart.

Service endpoints versus private endpoints: two ways to privatize platform access

A frequent source of design confusion is that Azure offers two distinct mechanisms for reaching a platform service over a private path rather than its public endpoint, and they work so differently that picking the wrong one produces subtle problems. A service endpoint extends your VNet’s identity to the platform service so that the service recognizes traffic as originating from your subnet and can be configured to accept only that traffic, but the service still resolves to a public address and the traffic, while it stays on the Microsoft backbone, reaches the service through its public-facing front door restricted to your network. A private endpoint, by contrast, projects an actual private address from your subnet that represents the specific service instance, so the service becomes a resource on your own address space reachable over a genuinely private path, with no reliance on the public endpoint at all.

The decision between them turns on what you are trying to achieve. A service endpoint is lighter to configure, costs nothing extra, and suffices when your goal is to restrict a platform service to accept traffic only from your subnets while keeping the public endpoint as the resolution target. A private endpoint is the stronger isolation because it gives the service a private address in your space, removes the public endpoint from the path, and works across peered VNets and from on-premises, at the cost of the private endpoint resource itself and the DNS configuration it demands. That DNS configuration is the catch covered earlier and worth repeating in this context: a private endpoint only delivers its benefit if the service name resolves to the private endpoint’s address, which requires the right private DNS zone linked to the VNet, and the most common private endpoint failure is exactly the missing zone leaving the name resolving to the public address. The way private endpoints depend on resolution and routing is precisely why Azure Private Link and private endpoints reward careful study rather than copy-paste configuration.
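The missing-zone failure has a simple signature: the service name resolves to an address outside your private ranges. A minimal sketch of that test follows, assuming you have already resolved the name from a machine inside the VNet and have the resulting address as a string; the function name and the addresses are hypothetical, and the public address shown comes from a documentation-only range.

```python
import ipaddress


def resolves_privately(resolved_ip, private_prefixes):
    """True when the resolved address sits inside one of the given private ranges."""
    address = ipaddress.ip_address(resolved_ip)
    return any(address in ipaddress.ip_network(p) for p in private_prefixes)


# Hypothetical ranges: the VNet holding the private endpoint, plus any peered VNets.
private_prefixes = ["10.20.0.0/16"]

print(resolves_privately("10.20.2.5", private_prefixes))     # private endpoint address
print(resolves_privately("203.0.113.10", private_prefixes))  # stand-in for a public address
```

If the check comes back false from inside the VNet, look at the private DNS zone and its VNet link before touching routes or rules.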

What is the difference between a service endpoint and a private endpoint?

A service endpoint tags your subnet’s traffic so a platform service can accept only your network, but the service still resolves to and is reached at its public address over the backbone. A private endpoint projects a private address from your subnet for the specific service instance, removing the public endpoint from the path entirely and working across peerings and on-premises.

The choice also interacts with your broader topology in a way teams underestimate. Service endpoints are scoped to the VNet and region and do not extend their effect to peered networks or on-premises, so a hub-and-spoke that needs every spoke to reach a locked-down platform service privately is better served by private endpoints, because a private endpoint placed once becomes reachable from every spoke that can route to it and from on-premises across a gateway. Designing this early matters because retrofitting private endpoints onto a sprawling estate means revisiting DNS across every network that must resolve them, which is a larger change than it first appears. The principle is to decide, per platform service, whether subnet-restricted public access is enough or whether you need the service as a private address in your own space, and to make that call before the topology grows complex enough that changing it becomes a project.

Seeing what the network is actually doing

A network you cannot observe is a network you debug by guessing, and Azure provides a set of diagnostic capabilities that turn guesswork into evidence, which is the difference between a five-minute diagnosis and a five-hour one. The single most valuable tool is the effective route table on a network interface, already introduced, because it shows the actual routes the platform has assembled for that interface, system and user-defined together, after all peerings and gateways have injected their routes. When a flow goes somewhere unexpected, this table is the ground truth that ends the argument about which route is in play, and it should be the first thing you look at for any path problem.

Alongside routing visibility, Azure offers a connectivity verification capability that tests whether traffic can flow from a source to a destination and reports the result along with the network security group rules and routes that allowed or blocked it, which collapses the route-and-filter diagnosis into a single check that names the responsible rule. There is also an address-level flow verification that takes a source address, destination address, port, and protocol and tells you whether a packet would be permitted and, if denied, which security rule denied it, which is the fastest way to confirm whether a filter is the cause without manually tracing every rule’s priority. For traffic-volume and pattern visibility, network security group flow logs record the flows that the security groups evaluated, allowed, and denied, and feeding those logs into an analytics workspace turns them into a queryable record of what the network actually carried, which is invaluable both for security review and for understanding a workload’s real communication pattern rather than its assumed one.

# Confirm whether a packet would be allowed and which rule decides
az network watcher test-ip-flow \
  --resource-group rg-network-demo \
  --vm <vm-name> \
  --direction Inbound \
  --protocol TCP \
  --local <local-ip>:443 \
  --remote <remote-ip>:50000

# Verify end-to-end connectivity and see the routes and rules in the path
az network watcher test-connectivity \
  --resource-group rg-network-demo \
  --source-resource <source-vm> \
  --dest-address <destination> \
  --dest-port 443

The discipline these tools encourage is to gather evidence before changing anything, because the temptation under incident pressure is to start editing rules and routes on a hunch, which frequently introduces a second problem on top of the first. The address-level flow check tells you in seconds whether the filter layer is responsible, the connectivity test traces the whole path and names the blocker, and the effective route table proves which way traffic is actually heading. Running those three before touching a single rule is what separates a controlled diagnosis from a flailing one, and it is a habit worth building into any runbook for network incidents. When you want to practice this kind of evidence-first diagnosis on realistic topologies, the hands-on Azure labs and command library on VaultBook are built to let you break a network deliberately and trace the failure back through these same checks.

How do I prove whether a route or a rule is causing a connectivity failure?

Run the address-level flow verification to learn in seconds whether a security rule denies the packet and which rule it is, then read the effective route table on the source interface to confirm the path is correct. If the flow check passes and the route is right, the problem lies elsewhere, such as DNS resolving the name to the wrong address.

Global peering, bandwidth, and the cost of crossing regions

Peering two VNets in the same region is the simple case, but real estates span regions, and global VNet peering connects VNets in different regions over the Microsoft backbone with the same private reachability, no public exposure, and the same non-transitive behavior as regional peering. The difference that matters for design is that traffic crossing regions travels farther, so it carries more latency than intra-region traffic, and inter-region data transfer is billed, which makes the placement of chatty components a cost and performance decision rather than an afterthought. A workload that scatters tightly coupled, high-traffic components across regions pays for every byte that crosses the region boundary and inherits the latency of the distance, which is why components that talk to each other constantly belong in the same region unless there is a compelling resilience reason to separate them.

The peering switches gain extra weight at global scale. Gateway transit and the allowance of forwarded traffic determine whether a global peering participates in a hub topology or merely connects two networks point to point, and getting those wrong produces a peering that is established but does not carry the traffic you expected through the hub. Because peering remains non-transitive across regions exactly as it is within one, a global hub-and-spoke needs the same forwarding appliance and route-table discipline as a regional one, with the added consideration that the hub appliance now sits in one region and inter-region traffic flows through it. None of this makes global peering wrong; it makes it a deliberate choice whose latency and cost you account for rather than discover on a bill. The full comparison of when peering, a VPN gateway, or a circuit is the right reach across distance is the territory of the VNet peering, VPN, and ExpressRoute decision, and the cost of crossing regions is one of its deciding factors.

Encryption and the real security posture inside a VNet

Engineers often assume that traffic inside a VNet is automatically encrypted, and the truth is more nuanced and worth getting right because it shapes how you design for sensitive workloads. Traffic between resources in a VNet travels the Microsoft backbone rather than the public internet, which is a meaningful isolation property, and the platform applies its own protections to traffic as it crosses the physical infrastructure between datacenters. What that does not mean is that every application-layer flow is encrypted end to end in the way a security review expects, so the responsible posture is to encrypt sensitive traffic at the application layer regardless of the network’s transport, using transport-layer security between services, rather than treating the private network as a substitute for application encryption. The private path reduces exposure; it does not relieve the application of its own confidentiality obligations.

This connects back to the boundary rule that runs through the whole article. The VNet gives you a private address space and a path that stays off the public internet, which is real and valuable, but it is not a guarantee about the confidentiality or integrity of any individual flow, and it is certainly not a filter. The complete security posture for a workload is the composition of the private network path, the filtering you apply with network security groups, the encryption you apply at the application layer, and the identity controls that decide who may even establish a connection, which in Azure means the way Microsoft Entra ID governs the identities that services use to authenticate to one another. Designing security as that composition, rather than leaning on the VNet to provide it alone, is what produces a posture that survives a real review. The network layer does its part, and the parts it does not do are the application’s and the identity layer’s to cover.

A second worked example: forcing traffic through an appliance and filtering a tier

The first walkthrough showed the defaults; this one shows the two deliberate overrides that most real designs require, a route table that forces traffic through an appliance and a network security group that filters a tier, so the route-then-filter model becomes concrete in commands. The route table below creates a route that sends all internet-bound traffic from a subnet to an appliance’s address as the next hop, which is the override that puts a firewall appliance into the path that the system route would otherwise bypass.

# Create a route table and a route forcing egress through an appliance
az network route-table create \
  --resource-group rg-network-demo \
  --name rt-egress-through-fw

az network route-table route create \
  --resource-group rg-network-demo \
  --route-table-name rt-egress-through-fw \
  --name default-to-firewall \
  --address-prefix 0.0.0.0/0 \
  --next-hop-type VirtualAppliance \
  --next-hop-ip-address 10.20.3.4

# Associate the route table with the application subnet
az network vnet subnet update \
  --resource-group rg-network-demo \
  --vnet-name vnet-core \
  --name snet-app \
  --route-table rt-egress-through-fw

The next-hop type of a virtual appliance and the explicit next-hop address are what direct the traffic, and the appliance at that address must be configured to forward, with its own interface set to allow forwarding, or the traffic dies at the appliance rather than passing through it. The 0.0.0.0/0 prefix is the broadest possible match, so it captures everything not covered by a more specific route, which is exactly the behavior you want for routing all egress through a single inspection point. Pair this with a filter on the same subnet and you have the two halves of the model working together.

# Create a network security group and a rule, then attach it to the subnet
az network nsg create \
  --resource-group rg-network-demo \
  --name nsg-app-tier

az network nsg rule create \
  --resource-group rg-network-demo \
  --nsg-name nsg-app-tier \
  --name allow-https-from-web-tier \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes 10.20.0.0/24 \
  --destination-port-ranges 443

az network vnet subnet update \
  --resource-group rg-network-demo \
  --vnet-name vnet-core \
  --name snet-app \
  --network-security-group nsg-app-tier

With both attached, traffic leaving the application subnet follows the route table, which sends egress through the appliance, and traffic arriving at it must satisfy the security group, which here explicitly permits inbound HTTPS from the web tier’s range. One caveat keeps this honest: every network security group also carries default rules, including one that allows inbound traffic from anywhere in the virtual network, so the allow rule above restricts nothing by itself. Permitting HTTPS only from the web tier takes an additional deny rule, at a higher priority number than the allow, covering the rest of the VNet. The route decided the path; the rule decided survival; they are independent, and you can see each one’s effect separately with the checks from the diagnostics section. This is the entire route-then-filter model in two artifacts you can read, audit, and reason about, and building it once by hand fixes the model in a way that no diagram does. The deeper mechanics of route selection, next-hop types, and the order in which routes are evaluated are the subject of the dedicated Azure route tables and user-defined routes treatment, which is the companion to this section.
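The independence of the two decisions is easier to hold onto when each is a separate function. The sketch below is a simplified model under stated assumptions, not the platform's implementation: the route list holds one system-style route for the VNet range and the user-defined default route from the commands above, the rule list holds the allow rule from above plus a hypothetical deny rule for the rest of the VNet, and Azure's built-in default rules and its tie-breaking between routes with equal prefixes are deliberately left out.

```python
import ipaddress


def select_route(destination, routes):
    """Longest-prefix match: the most specific route containing the destination wins."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r["prefix"])]
    return max(matches, key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


def inbound_allowed(source, port, rules):
    """The first matching rule in ascending priority order decides the outcome."""
    src = ipaddress.ip_address(source)
    for rule in sorted(rules, key=lambda r: r["priority"]):
        port_matches = rule["port"] in (None, port)  # None stands for any port
        if port_matches and src in ipaddress.ip_network(rule["source"]):
            return rule["access"] == "Allow"
    return False  # simplification: the real default rules are not modeled


routes = [
    {"prefix": "10.20.0.0/16", "next_hop": "VnetLocal"},
    {"prefix": "0.0.0.0/0", "next_hop": "VirtualAppliance 10.20.3.4"},
]
rules = [
    {"priority": 100, "source": "10.20.0.0/24", "port": 443, "access": "Allow"},
    # Hypothetical extra rule: deny everything else that originates in the VNet.
    {"priority": 4000, "source": "10.20.0.0/16", "port": None, "access": "Deny"},
]

print(select_route("10.20.2.10", routes)["next_hop"])    # stays inside the VNet
print(select_route("198.51.100.7", routes)["next_hop"])  # internet-bound, via appliance
print(inbound_allowed("10.20.0.7", 443, rules))          # web tier over HTTPS
print(inbound_allowed("10.20.2.9", 443, rules))          # data tier over HTTPS
```

Neither function consults the other, which is the point: a packet can have a perfectly good route and still be denied, or be permitted by every rule and still be sent to the wrong next hop.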

When a VNet is the right tool and when to reach further

The VNet is the default substrate for almost everything in Azure that has a network presence, so the question is rarely whether to use one but how to compose it with the connectivity and security primitives that surround it. A single VNet suits a workload that lives in one region and whose components are comfortable sharing one address space and one routing domain. The moment you need a second region, a second business unit’s network, or a connection to on-premises, you move from one VNet to a topology, and the topology decision is where the surrounding services come in. Peering connects VNets directly and is the right reach for a hub-and-spoke within reasonable scale. A VPN gateway connects to on-premises or to another network over the internet with encryption when a private circuit is unnecessary or unavailable. An ExpressRoute circuit provides a private, high-bandwidth connection through a provider when the workload demands it. Private Link and private endpoints bring a specific platform service onto a private address inside your VNet when you want that service reachable without any public exposure, and the way Azure Private Link and private endpoints work hinges on the same DNS and routing concerns this article has emphasized.

The alternative to reach for, when a hand-built hub-and-spoke of peered VNets with appliances and route tables becomes too much to manage by hand, is a managed connectivity backbone that takes over the hub’s routing and scaling for you, trading some control for managed simplicity. That is a genuine fork rather than a different capability, and the right side depends on how much routing complexity you are willing to own. Within a single VNet, the levers are address planning, subnetting, routing, and filtering. Across VNets, the levers are the connection methods and the topology. Knowing which layer your problem lives in is, again, the skill that this whole article is built to install.

Putting the model to work: a deployment from first principles

A reproducible way to see the pieces fit is to build a small VNet from the command line and observe each default as it appears. The commands below create a VNet with an address space and a first subnet, then add a second subnet, and they are transcribed to run against the Azure CLI; confirm flag names and any service-specific minimums against the current official documentation, since the platform’s tooling evolves.

# Create a resource group to hold the network
az group create \
  --name rg-network-demo \
  --location eastus

# Create the VNet with a generous address space and the first subnet
az network vnet create \
  --resource-group rg-network-demo \
  --name vnet-core \
  --address-prefixes 10.20.0.0/16 \
  --subnet-name snet-app \
  --subnet-prefixes 10.20.1.0/24

# Add a second subnet directly after the first; the rest of the /16 stays unallocated
az network vnet subnet create \
  --resource-group rg-network-demo \
  --vnet-name vnet-core \
  --name snet-data \
  --address-prefixes 10.20.2.0/24

# Inspect the effective routes on a NIC to see the system routes Azure installed
az network nic show-effective-route-table \
  --resource-group rg-network-demo \
  --name <nic-name> \
  --output table

The address space here is a /16, which is deliberately large; the two /24 subnets each give 256 total addresses, 251 of them usable after the five reservations. Note that 10.20.1.0/24 and 10.20.2.0/24 are adjacent, so the room to grow comes from the unallocated remainder of the /16 rather than from a gap between the two; if you expect a subnet to expand in place, start it on a wider boundary and leave the neighboring range empty. The final command is the one worth running on any network you are trying to understand, because the effective route table shows you the actual system and user-defined routes the platform has assembled for that interface, which is the ground truth that ends most routing arguments. When two subnets cannot talk and the diagram says they should, the effective route table tells you whether a route is sending the traffic somewhere unexpected, and when it shows the expected direct VNet route, you have proven the problem is in the filter layer rather than the routing layer. Peering two VNets is a matching pair of commands.
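The subtract-five arithmetic is worth checking for a few masks before committing to one. This is a minimal sketch using Python's standard `ipaddress` module; the constant and function names are hypothetical, and the prefixes are only examples.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # addresses the platform holds back in every subnet


def usable_addresses(prefix):
    """Total addresses in the prefix minus the five Azure reserves."""
    return ipaddress.ip_network(prefix).num_addresses - AZURE_RESERVED_PER_SUBNET


for prefix in ("10.20.1.0/24", "10.20.1.0/27", "10.20.1.0/29"):
    print(prefix, usable_addresses(prefix))
```

The reservation is a fixed count, so it costs proportionally more as the subnet shrinks, which is why the smallest masks are a poor fit for anything that scales.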

# Peer vnet-core to a second VNet named vnet-spoke (both directions required)
az network vnet peering create \
  --resource-group rg-network-demo \
  --name core-to-spoke \
  --vnet-name vnet-core \
  --remote-vnet vnet-spoke \
  --allow-vnet-access

az network vnet peering create \
  --resource-group rg-network-demo \
  --name spoke-to-core \
  --vnet-name vnet-spoke \
  --remote-vnet vnet-core \
  --allow-vnet-access

The two commands are not redundant; a peering is two directional configurations, and creating only one side leaves the peering in a state where it is not fully established. Each side carries its own switches for forwarded traffic, gateway transit, and remote access, which is the granularity that lets a hub allow a spoke to use its gateway while the spoke does not extend the same courtesy in return. Building this by hand once, and reading the effective route table before and after each step, teaches the default behavior more durably than any amount of reading, which is exactly why working through the moves in a sandbox pays off. You can run the hands-on Azure labs and command library on VaultBook to build these pieces and trace a packet path end to end, watching the system routes appear, the peering routes inject, and the effective route table change as you go.
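Because peering requires non-overlapping address spaces, it is worth checking the ranges before running the peering commands. The sketch below assumes a hypothetical registry of VNet names and prefixes; the address spaces shown for vnet-spoke and vnet-legacy are invented for illustration and do not come from the walkthrough.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(registry):
    """Return the pairs of VNet names whose address spaces overlap."""
    conflicts = []
    for (name_a, prefixes_a), (name_b, prefixes_b) in combinations(registry.items(), 2):
        nets_a = [ipaddress.ip_network(p) for p in prefixes_a]
        nets_b = [ipaddress.ip_network(p) for p in prefixes_b]
        if any(a.overlaps(b) for a in nets_a for b in nets_b):
            conflicts.append((name_a, name_b))
    return conflicts


# Hypothetical organization-wide registry of VNet address spaces.
registry = {
    "vnet-core": ["10.20.0.0/16"],
    "vnet-spoke": ["10.21.0.0/16"],
    "vnet-legacy": ["10.20.128.0/17"],
}

print(overlapping_pairs(registry))
```

Any pair this reports cannot be peered as it stands, so the conflict is better found in a registry check than in a failed deployment.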

Common failure patterns and how to read them

The recurring problems engineers hit with VNets cluster into a handful of patterns, and naming them turns a confusing incident into a quick diagnosis. The first is the subnet that runs out of addresses sooner than expected, which is the five-address reservation plus an autoscaling event consuming the headroom that was never really there; the fix is to size by subtracting five and leaving margin, and the prevention is to never use the smallest masks for anything that scales. The second is the appliance that sits idle while traffic flows around it, which is the missing user-defined route; the system route sends inter-subnet traffic directly, so without a route pointing prefixes at the appliance, the appliance is in the VNet but not in the path. The third is the black hole, a user-defined route pointing at a next hop that no longer forwards, dropping all traffic for that prefix while the route insists the traffic belongs there.

The fourth pattern is the spokes that cannot reach each other through a hub, which is the non-transitive peering rule; the fix is a forwarding appliance in the hub and user-defined routes in the spokes, never another peering to the hub, because no number of hub peerings makes the path transitive. The fifth is the private endpoint that will not connect despite a correct network path, which is almost always the missing private DNS zone resolving the service name to its public address instead of the private endpoint. The sixth is intra-VNet traffic assumed to be isolated but flowing freely, which is the absence of a network security group; the default is reachability, and isolation is something you add. The seventh is asymmetric routing after a topology change, where traffic leaves along one path and tries to return along another that a stateful filter rejects, which is the kind of subtle break that follows a route-table edit. The eighth is name resolution failing while connectivity is fine, the cross-VNet or hybrid resolution gap that the default resolver does not cover. Every one of these decomposes cleanly into the address, route, and rule layers, which is why holding those three layers apart is the whole game. The detail of how to override the path deliberately and audit it lives in the dedicated treatment of Azure route tables and user-defined routes.

A ninth pattern deserves its own mention because it surfaces at the worst time, during a merger or an attempt to connect two networks that were each designed in isolation: the overlapping address space that makes peering impossible. Two teams each picked the same popular default range, each network works perfectly on its own, and the day someone tries to peer them the platform rejects the peering because the address spaces collide and there is no unambiguous way to route a packet to a destination that exists in both. There is no quick fix once both networks are in production, because the resolution is to renumber one of them, which is the migration the whole address-planning discipline exists to prevent. The only true prevention is the organization-wide address registry applied before either network is built, so that no two networks are ever designed into a collision in the first place. When the collision already exists, the interim options are network address translation between the two spaces, which adds complexity and obscures the real addresses, or accepting that the networks cannot peer and connecting them through a more elaborate path. None of these is pleasant, which is precisely why the few minutes of registry discipline at design time pays for itself many times over, and why overlap belongs on the short list of decisions that are genuinely expensive to get wrong.
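
The registry check described above is cheap to automate. What follows is a minimal sketch, assuming the registry is nothing more than a mapping of network names to CIDR ranges; the names and ranges in `REGISTRY` and the helper `find_collisions` are hypothetical, and a real registry would live somewhere more durable than a dictionary.

```python
import ipaddress

# Hypothetical registry of ranges already assigned across the organization.
REGISTRY = {
    "vnet-core": "10.20.0.0/16",
    "vnet-spoke": "10.30.0.0/16",
    "on-prem-dc": "192.168.0.0/20",
}

def find_collisions(candidate, registry):
    """Return the names of registered networks that overlap the candidate range."""
    new_net = ipaddress.ip_network(candidate)
    return sorted(
        name
        for name, cidr in registry.items()
        if new_net.overlaps(ipaddress.ip_network(cidr))
    )

print(find_collisions("10.20.128.0/17", REGISTRY))  # collides with vnet-core
print(find_collisions("10.40.0.0/16", REGISTRY))    # lands in free space
```

Running a check like this before a VNet is created is the design-time discipline the paragraph argues for: an empty result means the candidate range can be recorded and used, and anything else is a collision caught while it is still free to fix.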

The strategic verdict

The Azure Virtual Network rewards the engineer who treats it as a deliberate design rather than an accepted default, and it punishes the one who does not, usually months later and usually at the worst time. The decisions that matter are made early and are expensive to revisit: the address space that must be generous and non-overlapping because renumbering is a migration, the subnet layout that organizes both growth and the surfaces that controls attach to, and the recognition that the VNet gives you reachability and a place to enforce policy but enforces nothing on its own. Everything downstream of those decisions (the routing you override, the peerings you establish, the DNS you configure, and the filtering you apply) is comprehensible once you accept that routing chooses the path, name resolution chooses the destination, and the network security group chooses whether the packet survives, and that these are three separate decisions that a sound design keeps separate. Plan the address space as if you will regret being stingy, because you will; design the subnets around the controls you intend to apply; and reason about every connectivity question by asking which addresses, which route, and which rule. A VNet built and debugged that way behaves the way its diagram promises, and the afternoon you would have spent chasing a connectivity ghost is yours to keep. Build it deliberately once, and the network rewards you for years; accept the defaults blindly, and it bills you for the shortcut later, with interest.

Frequently Asked Questions

Q: What is an Azure Virtual Network and what does it actually do?

An Azure Virtual Network is a private slice of IP address space, scoped to one subscription and one region, inside which the resources you place can address each other over private IP automatically and out of which traffic reaches other networks only along paths the platform provides or you create. It defines the address range every resource draws from, owns the default routing that moves packets between subnets, governs how names resolve, and marks the boundary between what is reachable inside your estate and what must cross a gateway. What it does not do is filter traffic, so it is an address and routing boundary rather than a security boundary, and treating it as the latter is the most common source of production surprise. Think of it as the building, with subnets as floors and system routes as the hallways the platform builds for you.

Q: How big should my VNet address space be?

Size the address space for the network you will plausibly have in three years, not the one you are deploying this week, because Azure does not bill for unused address space but renumbering an in-use VNet is a migration with a maintenance window. Allocate a generous private CIDR block, leave deliberate gaps between subnets so each can grow without colliding with its neighbor, and reserve whole ranges for subnets you have not yet designed, including the gateway, firewall, and bastion subnets that infrastructure requires. The block must not overlap any network you intend to connect, now or later, which is why an organization-wide address registry that records every spoken-for range is essential. The asymmetry is total: a too-large range costs nothing, while a too-small one costs you a weekend, so the correct bias is always toward generosity.
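
One way to make "leave deliberate gaps" concrete is to give every subnet its own larger growth block and deploy only the first slice of it. The sketch below assumes a /16 VNet, /24 subnets, and /22 growth blocks; those sizes, the subnet names, and the helper `plan_subnets` are illustrative choices rather than anything Azure prescribes.

```python
import ipaddress

def plan_subnets(vnet_cidr, names, subnet_prefix=24, growth_prefix=22):
    """Give each named subnet the first slice of its own larger growth block."""
    vnet = ipaddress.ip_network(vnet_cidr)
    # Consecutive, non-overlapping blocks carved from the VNet address space.
    growth_blocks = vnet.subnets(new_prefix=growth_prefix)
    plan = {}
    for name, block in zip(names, growth_blocks):
        # Deploy the small subnet now; the rest of the block stays reserved for it.
        subnet = next(block.subnets(new_prefix=subnet_prefix))
        plan[name] = (subnet, block)
    return plan

for name, (subnet, block) in plan_subnets("10.20.0.0/16", ["web", "app", "data"]).items():
    print(f"{name}: deploy {subnet}, reserved for growth {block}")
```

Because each subnet sits at the start of a block nobody else may use, widening its mask later cannot collide with a neighbor, and the unused blocks at the end of the VNet remain free for the gateway, firewall, and bastion subnets you have not designed yet.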

Q: Why does my subnet have fewer usable addresses than the mask suggests?

Azure reserves five addresses in every subnet. The first and last play the network and broadcast roles that any IP network reserves, and Azure takes three more at the start of the range: one for the default gateway and two that map the Azure DNS addresses into the VNet’s space. A subnet’s usable host count is therefore its total size minus five. This matters most for small subnets, where five is a large fraction of the total, so a /29 with eight addresses leaves only three usable, which is rarely enough once scaling is considered. It also explains deployments that fail to find a free address in a subnet that, by a naive count, should have had room: the naive count forgot the five, and an autoscaling event or an extra network interface pushed the subnet over its real capacity. Always size by subtracting five and then leaving headroom on top.
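
The subtract-five rule is simple enough to check in a few lines before you commit to a mask. This is a minimal sketch using Python's standard `ipaddress` module; the helper names and the 25 percent default headroom are assumptions for illustration, not Azure guidance.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # the per-subnet reservation described above

def usable_addresses(cidr):
    """Total addresses in the subnet minus the five Azure reserves."""
    total = ipaddress.ip_network(cidr).num_addresses
    return max(total - AZURE_RESERVED_PER_SUBNET, 0)

def fits(cidr, needed, headroom=0.25):
    """True if the subnet holds `needed` addresses plus a fractional margin."""
    return usable_addresses(cidr) >= needed * (1 + headroom)

for cidr in ("10.20.1.0/24", "10.20.9.0/27", "10.20.9.32/29"):
    print(cidr, usable_addresses(cidr))
```

Calling `fits` with the peak instance count of a scale set, rather than its current count, is what keeps the autoscaling surprise out of the picture.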

Q: Is VNet peering transitive?

No, VNet peering is not transitive, and this is the single most misunderstood fact about it. Peering establishes private reachability only between the two VNets directly joined by the peering link. If spoke A peers with a hub and spoke B peers with the same hub, A and B still cannot reach each other through the hub by peering alone, because A’s peering only makes the hub reachable from A, not everything the hub connects to. To connect the spokes you need the hub to actually forward traffic between them, which means a routing appliance or gateway in the hub and user-defined routes in each spoke pointing the other spoke’s prefix at that hub next hop. The peering carries packets to the hub; the hub’s forwarding and the spokes’ route tables carry them the rest of the way. Forget this and your spokes become islands that each see the hub but never each other.
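
A toy model makes the rule hard to forget. The sketch below is not an Azure API; it assumes a hypothetical topology stored as a set of directional peerings and encodes just two facts from the text: a peering needs both directions, and peering alone never walks more than one hop.

```python
# Hypothetical topology: each tuple is one directional peering configuration.
PEERINGS = {
    ("spoke-a", "hub"), ("hub", "spoke-a"),
    ("spoke-b", "hub"), ("hub", "spoke-b"),
}

def peered(a, b, peerings=PEERINGS):
    """A peering is established only when both directions exist."""
    return (a, b) in peerings and (b, a) in peerings

def reachable_by_peering(src, dst, peerings=PEERINGS):
    """Peering alone reaches direct neighbours only; there is no multi-hop walk."""
    return src == dst or peered(src, dst, peerings)

print(reachable_by_peering("spoke-a", "hub"))      # True
print(reachable_by_peering("spoke-a", "spoke-b"))  # False
```

The second result is the island effect: both spokes see the hub, and neither sees the other until something in the hub forwards and the spokes' route tables point at it.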

Q: How does routing work inside a VNet by default?

Azure installs system routes into every subnet automatically, with no configuration from you. There is a route covering the entire VNet address space pointed at the virtual network itself, which is what lets any subnet reach any other subnet directly. There is a default route to the internet so outbound public traffic has a path. There are routes that black-hole certain reserved ranges, and when you add a peering or gateway the platform injects more system routes so the connected networks become reachable. These routes exist whether you want them or not, and the only way to change where a destination’s traffic goes is to add a user-defined route that is more specific or that overrides the system route for that prefix. Azure selects routes by longest-prefix match, with user-defined routes winning ties, so your routes layer on top of the system ones rather than replacing them wholesale.
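
The selection rule can be sketched in a few lines. This is a simplified model, not the platform's implementation: the route list and the appliance address 10.20.0.4 are hypothetical, and the tie-break order of user-defined, then BGP, then system routes is stated here as an assumption that extends the "user-defined routes win ties" rule above.

```python
import ipaddress
from typing import NamedTuple

class Route(NamedTuple):
    prefix: str
    source: str    # "user", "bgp", or "system"
    next_hop: str

# Lower number wins when two routes carry the same prefix.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}

def select_route(destination, routes):
    """Pick the longest matching prefix; break ties by route source."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen,
                       SOURCE_PRIORITY[r.source]),
    )

ROUTES = [
    Route("10.20.0.0/16", "system", "VnetLocal"),
    Route("0.0.0.0/0", "system", "Internet"),
    Route("10.20.2.0/24", "user", "VirtualAppliance 10.20.0.4"),
    Route("0.0.0.0/0", "user", "VirtualAppliance 10.20.0.4"),
]

for dest in ("10.20.1.5", "10.20.2.9", "8.8.8.8"):
    print(dest, "->", select_route(dest, ROUTES).next_hop)
```

Traffic to 10.20.1.5 still takes the system VNet route because nothing more specific covers it, traffic to 10.20.2.9 is pulled to the appliance by the more specific /24, and internet-bound traffic follows the user-defined default because it ties with the system default on prefix length and wins on source.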

Q: Why is my network virtual appliance being bypassed?

The appliance is being bypassed because nothing tells Azure to send traffic to it. Placing a firewall or routing appliance inside the VNet does not put it in the traffic path; the system route still sends inter-subnet traffic directly between subnets, flowing around the appliance entirely. To force traffic through it, you must create a user-defined route that overrides the relevant prefixes and points them at the appliance’s address as the next hop, then attach that route table to the subnets whose traffic should pass through. Until that route exists, the appliance sits idle while traffic ignores it, which looks like the appliance is broken when the real issue is purely routing. The mirror-image risk is a user-defined route pointing at an appliance that later fails or is removed, which black-holes all traffic for that prefix because the route still insists the traffic belongs there.

Q: Should I use Azure-provided DNS or custom DNS for my VNet?

Use Azure-provided DNS when your needs are simple: public name resolution and automatic resolution of resources within a single VNet by their hostnames, with no configuration required. Switch to custom DNS servers, or add private DNS zones, when you need resolution across peered VNets, integration with on-premises DNS, conditional forwarding in a hybrid network, or private endpoints that resolve to their private addresses. The DNS setting is configured on the VNet and applies by default to every resource in it, so it is a VNet-level decision rather than a per-resource one. The default resolver’s blind spots (cross-VNet names, hybrid resolution, and private endpoints) are exactly the cases that push a serious deployment toward custom DNS or private zones. A connectivity problem that is really a DNS problem is among the most time-consuming to diagnose, because engineers keep examining routes and filters while the fault sits in resolution.

Q: Why does my private endpoint connection fail when the network path is correct?

The cause is almost always DNS rather than the network. When you stand up a private endpoint and confirm the route and the security rules allow the traffic, the connection can still fail because, without the private DNS zone linked to the VNet, the application resolves the service’s name to its public address instead of the private endpoint’s address. The perfectly good private path is never used because the application is aiming at the wrong destination entirely. The fix lives in the resolution layer: link the appropriate private DNS zone to the VNet so the service name resolves to the private endpoint address. This is the canonical example of why routing, filtering, and name resolution must be kept mentally distinct, because the symptom presents as a network failure while the actual fix is purely in DNS.
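
A quick way to test the resolution layer is to ask, from inside the VNet, what address the service name actually resolves to. The sketch below uses only the standard library; the hostname, both addresses, and the stand-in resolvers are hypothetical, and the resolver is injectable so the example runs offline. Note that `is_private` covers more than the RFC 1918 blocks, so treat it as a first-pass signal.

```python
import ipaddress
import socket

def resolves_to_private(hostname, resolve=socket.gethostbyname):
    """Return (address, is_private) for the name as this machine resolves it."""
    address = resolve(hostname)
    return address, ipaddress.ip_address(address).is_private

# Stand-in resolvers for an offline demonstration (hypothetical addresses).
def with_private_zone(name):
    return "10.20.3.5"    # private DNS zone linked: private endpoint address

def without_private_zone(name):
    return "20.60.0.10"   # no zone linked: the service's public address

HOST = "mystore.blob.core.windows.net"  # hypothetical storage account name
print(resolves_to_private(HOST, with_private_zone))
print(resolves_to_private(HOST, without_private_zone))
```

Run with the default resolver on a machine inside the VNet, a public answer tells you to stop looking at routes and rules and go link the zone.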

Q: Does a VNet provide security isolation between workloads?

Not on its own. A VNet isolates addressing and default reachability, so workloads in separate VNets do not have automatic reachability to each other, and in that narrow sense separation exists. What the VNet does not do is filter the traffic that flows where reachability exists. Inside a single VNet, every subnet can reach every other subnet by default with no filtering until you attach a network security group. So placing a workload in its own subnet organizes the address space and gives you a surface to attach controls to, but the isolation only becomes real when those controls are attached. The accurate way to describe it is that the VNet hands you the boundary and you supply the enforcement, which is why security design and network design are related but separate exercises.

Q: What is the difference between attaching an NSG to a subnet versus a NIC?

A network security group attached to a subnet applies to every resource in that subnet, which makes it the natural place for tier-wide policy that should govern everything on that floor of the building. A network security group attached to an individual network interface applies only to that one resource, which makes it the place for an exception that should not affect its neighbors. When both are present, traffic must pass both, so the effective policy is their intersection: an allow on the subnet that is denied on the interface still results in a drop, and vice versa. Reason about it as two filters in series rather than one combined rule set. Most designs apply broad policy at the subnet level and reserve interface-level rules for the rare resource that genuinely needs a tighter or looser exception than its subnet peers.
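
The two-filters-in-series idea can be modeled directly. This is a deliberately simplified sketch: rules match on destination port only, the platform's built-in default rules are ignored, and an unmatched packet is treated as denied. The `Rule` type and the example rule sets are hypothetical.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    priority: int   # lower number is evaluated first
    port: int
    action: str     # "Allow" or "Deny"

def nsg_allows(rules, port):
    """First matching rule by priority decides; no match is treated as deny."""
    for rule in sorted(rules):
        if rule.port == port:
            return rule.action == "Allow"
    return False

def inbound_allowed(subnet_nsg, nic_nsg, port):
    """Traffic must pass every attached NSG; None means no NSG at that level."""
    return all(nsg is None or nsg_allows(nsg, port) for nsg in (subnet_nsg, nic_nsg))

SUBNET_NSG = [Rule(100, 443, "Allow"), Rule(110, 22, "Allow")]
NIC_NSG = [Rule(100, 443, "Allow"), Rule(200, 22, "Deny")]

print(inbound_allowed(SUBNET_NSG, NIC_NSG, 443))  # True
print(inbound_allowed(SUBNET_NSG, NIC_NSG, 22))   # False
```

Port 22 is allowed at the subnet and still dropped, because the interface-level filter sits second in the series and says no.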

Q: How do I connect a VNet to my on-premises network?

You connect to on-premises with either a VPN gateway or an ExpressRoute circuit, depending on the requirement. A VPN gateway establishes an encrypted tunnel over the public internet, which is quick to stand up and economical, and it suits workloads that tolerate internet-path latency and the bandwidth a tunnel provides. An ExpressRoute circuit provides a private connection through a connectivity provider, bypassing the public internet entirely, with higher and more consistent bandwidth at higher cost and with provider dependency. Both terminate in a gateway that lives in a dedicated gateway subnet you must plan for in the VNet’s address space. The decision turns on whether the workload needs the privacy and bandwidth of a circuit or is well served by an encrypted tunnel, and in a hub-and-spoke the gateway typically lives in the hub so spokes share it through gateway transit.

Q: What is gateway transit in a peering?

Gateway transit is the peering feature that lets a VNet use a gateway that physically lives in another peered VNet, rather than provisioning its own. In a hub-and-spoke topology this is what makes the pattern economical: you provision one VPN or ExpressRoute gateway in the hub, enable gateway transit on the hub side of each peering, and configure the spoke side to use the remote gateway, so every spoke reaches on-premises through the hub’s single gateway instead of each spoke paying for its own. It is one of the directional switches a peering exposes, set independently on each side, which is why a peering is really two configurations rather than one. Without it, every spoke that needs hybrid connectivity would require its own gateway, which is both expensive and operationally heavier than sharing the hub’s.

Q: What changed about default outbound internet access?

For most of the platform’s history, a resource deployed without explicit outbound configuration still reached the internet because Azure provided a default outbound address automatically. Microsoft has announced that this implicit default outbound access is being retired for newly created deployments, steering engineers toward explicit methods such as a NAT gateway, a load balancer with outbound rules, or an explicit public IP. The exact timeline and conditions are platform details that shift, so confirm the current state against the official Azure networking documentation when you read this. The design lesson is durable regardless: implicit outbound was always fragile because it gave you no control over the source address your traffic presented and risked SNAT port exhaustion under load. Attaching a NAT gateway gives you a stable, owned set of outbound addresses and a large SNAT port pool, so designing egress deliberately means the change costs you nothing.

Q: Why can two of my subnets not communicate when they should?

Start by separating the two questions that this symptom collapses. First, which route is the traffic following? Run the effective route table on the source network interface and confirm whether the expected direct VNet route is present or whether a user-defined route is sending the traffic somewhere unexpected, such as an appliance that black-holes it. If the route is correct and points directly across the VNet, the problem is not routing. Second, which rule is dropping it? Check the network security groups on both the source and destination, at both subnet and interface level, remembering that the effective policy is the intersection and that a deny anywhere in the path wins. The overwhelming majority of “subnets cannot talk” incidents resolve into either a route sending traffic astray or a security rule dropping it, and naming which layer is at fault before changing anything prevents fixing the wrong thing.

Q: Do I need a separate subnet for managed services?

Often yes. Several Azure services require a dedicated subnet, and some require it to be delegated to that service or named in a specific way. A VPN or ExpressRoute gateway needs a subnet named exactly GatewaySubnet, an Azure Firewall needs its own subnet named AzureFirewallSubnet, a Bastion host needs its own subnet named AzureBastionSubnet, and a number of platform-as-a-service offerings require a delegated subnet they will not share with other resources. Each of these consumes a slice of your address space that you must plan for in advance, and the cumulative demand of several such subnets is easy to underestimate. A network that looked comfortably sized for virtual machines can run short once it also hosts gateway, firewall, bastion, and delegated database or container subnets. Account for the infrastructure and integration subnets at planning time rather than discovering them one at a time as you add services.

Q: What is the difference between a VNet and a subnet?

The VNet is the whole private network: the address space, the region scope, and the routing and DNS domain that everything inside it shares. A subnet is a contiguous sub-range carved out of the VNet’s address space, and it is the unit to which most network controls attach. You associate a network security group at the subnet level, attach a route table at the subnet level, delegate a subnet to a managed service, and place network interfaces into a subnet. In the building analogy, the VNet is the building and a subnet is a floor: the building defines the street address and the hallways between floors, while each floor is where you actually place resources and apply the policy that should govern that tier. You design the VNet for total capacity and non-overlap, and you design the subnets around the distinct routing and filtering policies different application tiers need.

Q: Can I change a VNet’s address space after creation?

You can add address ranges to an existing VNet, and you can resize or remove ranges that are not yet in use, but the moment resources, peerings, and gateways depend on a range, shrinking or renumbering it becomes a migration rather than an edit. That migration involves moving resources, re-establishing peerings, and a maintenance window, which is exactly the renumbering project that careful initial sizing avoids. The practical guidance is to treat the address space as effectively permanent for any range that is in use, plan it generously at creation so you never need to shrink, and add new non-overlapping ranges if you genuinely outgrow the original. Because Azure does not bill for unused address space, the cost of over-provisioning is zero while the cost of under-provisioning is a weekend of surgery, so the decision is not close.

Q: How does the route-then-filter model help me debug faster?

The model gives you a fixed order of questions so you stop chasing the wrong layer. Routing decides the path a packet takes, and it is a completely separate decision from whether a network security group permits the packet once it is on that path. So when something will not connect, you first ask which route the traffic follows, using the effective route table to see the ground truth, and you confirm the packet is even heading toward the right place. Only once the path is proven correct do you ask which rule is allowing or dropping it, examining the security groups at both subnet and interface level. Add name resolution as a third, separate layer, because a connection can fail simply because the name resolved to the wrong address. Almost every Azure connectivity problem decomposes into which addresses, which route, and which rule, and answering them in order is what turns an afternoon of guesswork into a few minutes of diagnosis.
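
The fixed order of questions can be written down as a tiny triage routine. This sketch assumes you have already gathered the observations by hand (the resolved address, the next hop from the effective route table, and the NSG verdict); every value in `observed` is hypothetical, and the ordering of addresses, then route, then rule follows the closing sentence above.

```python
def first_failing_layer(checks):
    """Run (layer, check) pairs in order and name the first layer that fails."""
    for layer, check in checks:
        if not check():
            return layer
    return None

# Hypothetical observations gathered during an incident.
observed = {
    "resolved_address": "10.20.2.7",
    "expected_address": "10.20.2.7",
    "effective_next_hop": "VnetLocal",
    "expected_next_hop": "VnetLocal",
    "nsg_verdict": "Deny",
}

checks = [
    ("addresses", lambda: observed["resolved_address"] == observed["expected_address"]),
    ("route", lambda: observed["effective_next_hop"] == observed["expected_next_hop"]),
    ("rule", lambda: observed["nsg_verdict"] == "Allow"),
]

print(first_failing_layer(checks))  # rule
```

The value of the ordering is that a later layer is never examined until the earlier ones are proven sound, so nobody edits a security rule while the packet is still heading to the wrong place.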

Q: When should I move from peered VNets to a managed connectivity backbone?

Move when the routing complexity of a hand-built hub-and-spoke outgrows what you want to manage by hand. A topology of peered VNets with appliances in the hub and user-defined routes in every spoke is entirely workable at modest scale, but as the number of spokes and connections grows, the route tables multiply and the operational burden of keeping them correct rises. A managed connectivity backbone, which in Azure means a service such as Virtual WAN, takes over the hub’s routing and scaling for you, centralizing route policy and growing without you hand-editing route tables, in exchange for some of the control a hand-built design gives you. The decision is a genuine trade of control for managed simplicity rather than a gain of new capability, so the right answer depends on how much routing complexity your team is willing to own. Smaller estates are well served by peering; large, fast-growing ones tend to justify the managed backbone.

Q: What address ranges should I use for a VNet?

Use ranges from the private RFC 1918 blocks, because a VNet’s address space does not need to be globally unique the way a public address must, and private ranges are the conventional and expected choice. The harder constraint is non-overlap: the range you pick must not overlap any network you intend to connect, now or in the future, because overlapping address spaces cannot be peered or routed together without translation and Azure will reject the peering. This is why so many organizations get into trouble by independently choosing the same popular default range for multiple networks and then discovering they cannot connect them. The defense is an organization-wide registry that records every range already assigned, so each new VNet is designed into free space rather than into a future collision. Pick a generous block from the private space, check it against the registry, and reserve room for growth and infrastructure subnets.
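
Checking that a candidate range sits inside one of the three RFC 1918 blocks is a one-function job. The sketch below is a minimal IPv4-only example using the standard library; the helper name `rfc1918_block` is hypothetical.

```python
import ipaddress

RFC1918_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def rfc1918_block(cidr):
    """Return the RFC 1918 block containing the range, or None if outside all three."""
    net = ipaddress.ip_network(cidr)
    for block in RFC1918_BLOCKS:
        if net.subnet_of(block):
            return block
    return None

for candidate in ("10.20.0.0/16", "172.32.0.0/16", "192.168.10.0/24"):
    print(candidate, "->", rfc1918_block(candidate))
```

The middle candidate is the classic slip: 172.16.0.0/12 ends at 172.31.255.255, so 172.32.0.0/16 looks private at a glance and is not. Passing this check is only the first gate; the range still has to clear the organization-wide registry before it is safe to use.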